In [4]:
import pandas as pd
import numpy as np

Try the heuristic framework on a single example first

In [5]:
import heuristic_model as hm
heuristic_model = hm.load_model("heuristics.json")

In [8]:
recipes = pd.read_parquet('../data_sources/recipepairs/recipes.parquet')
pairs = pd.read_parquet('../data_sources/recipepairs/pairs.parquet')
# Keep only pairs whose name_iou score is above 0.7
pairs_subset = pairs[pairs['name_iou'] > 0.7]

In [13]:
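def get_recipe_by_id(recipe_id):
	"""Return the ingredient list of the recipe with the given id."""
	# recipe_id avoids shadowing the built-in id()
	return recipes.loc[recipes['id'] == recipe_id]['ingredients'].explode().tolist()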

Look at one hand-picked example

In [16]:
generated = heuristic_model(get_recipe_by_id(pairs_subset.iloc[552].base), 'dairy-free')
actual = get_recipe_by_id(pairs_subset.iloc[552].target)
print(generated)
print(actual)

['bean', 'carrot', 'vegan cheese', 'chicken', 'chicken broth', 'coriander', 'corn', 'garlic', 'onion', 'rotel', 'seasoning', 'coconut cream', 'starch', 'tortilla', 'water']
['chicken', 'chili powder', 'chipotle chile', 'coriander', 'corn kernel', 'garlic', 'lime juice', 'lime wedge', 'low sodium chicken broth', 'onion', 'salt', 'tomato', 'tortilla', 'vegetable oil']


We're going to see a number of cases like this, where the dataset's alternate recipe differs significantly from the generated one. How do we want the metrics to handle them? One option is to group the pair subsets so that each base recipe maps to all of its alternates. We'll need to check whether that meaningfully changes any metrics.

Next, let's take a look at simple accuracy, where accuracy is defined as "does the heuristic model produce exactly the target recipe in the dataset?" Ingredients are compared as sets, so order and duplicates are ignored.

In [18]:
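def is_heuristic_correct(row, category='vegan'):
	"""Return 1 if the heuristic output matches the target's ingredient set exactly, else 0."""
	# category defaults to 'vegan', the value every call in this notebook uses
	generated = heuristic_model(get_recipe_by_id(row['base']), category)
	actual = get_recipe_by_id(row['target'])
	return int(set(generated) == set(actual))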

In [19]:
pairs_vegan = pairs_subset[pairs_subset['categories'].apply(lambda x: 'vegan' in x)]
pairs_vegan.apply(is_heuristic_correct, axis=1).mean()

np.float64(0.0001630639720345288)

In [20]:
pairs_vegetarian = pairs_subset[pairs_subset['categories'].apply(lambda x: 'vegetarian' in x)]
pairs_vegetarian.apply(is_heuristic_correct, axis=1).mean()

np.float64(7.711467723651843e-06)

In [21]:
# "df" in pairs_df stands for dairy-free, not DataFrame
pairs_df = pairs_subset[pairs_subset['categories'].apply(lambda x: 'dairy_free' in x)]
pairs_df.apply(is_heuristic_correct, axis=1).mean()

np.float64(5.6284544639272354e-05)

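These numbers are extremely low: even the best case, vegan, matches exactly on only about 0.016% of pairs. Note that the scoring functions in this notebook always apply the heuristic model with 'vegan', so the vegetarian and dairy-free figures measure the vegan transformation against those targets. Let's see what happens if we compress each dataset so that every base recipe maps to the list of all its targets, and count a hit when the generated recipe matches any one of them.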
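To make the compression step concrete, here is a minimal sketch on a toy frame. The `toy_pairs` data and its ids are invented for illustration; only the `groupby(...).agg({'target': list})` call mirrors what the cells below do.

```python
import pandas as pd

# Hypothetical toy data: base recipe 1 has two targets, base recipe 2 has one
toy_pairs = pd.DataFrame({
    "base": [1, 1, 2],
    "target": [10, 11, 12],
})

# One row per base recipe, with all of its targets collected into a list
toy_compressed = toy_pairs.groupby("base", as_index=False).agg({"target": list})
print(toy_compressed)
```

Each row of the result holds a single base id and a Python list of target ids, which is why the compressed scoring function loops over `row['target']`.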

In [22]:
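def is_heuristic_correct_compressed(row, category='vegan'):
	"""Return 1 if the heuristic output exactly matches any target listed for this base, else 0."""
	# category defaults to 'vegan', the value every call in this notebook uses
	generated = set(heuristic_model(get_recipe_by_id(row['base']), category))
	for target in row['target']:
		if generated == set(get_recipe_by_id(target)):
			return 1
	return 0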

In [23]:
pairs_vegan_compressed = pairs_vegan.groupby('base', as_index=False).agg({'target': list})
pairs_vegan_compressed.apply(is_heuristic_correct_compressed, axis=1).mean()

np.float64(0.00046063238245648666)

In [24]:
pairs_vegetarian_compressed = pairs_vegetarian.groupby('base', as_index=False).agg({'target': list})
pairs_vegetarian_compressed.apply(is_heuristic_correct_compressed, axis=1).mean()

np.float64(5.076657528683115e-05)

In [25]:
pairs_df_compressed = pairs_df.groupby('base', as_index=False).agg({'target': list})
pairs_df_compressed.apply(is_heuristic_correct_compressed, axis=1).mean()

np.float64(0.0002826824388971036)

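This does help somewhat (the match rates rise roughly three- to sevenfold), but the numbers are still tiny. Let's try out scoring methods with more of a gradient, that is, ones that give partial credit when the generated and target recipes share some ingredients.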
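One candidate for such a graded score is Jaccard similarity over the ingredient sets: the size of the intersection divided by the size of the union. The sketch below is only an illustration and not necessarily the metric this analysis goes on to use. The `jaccard` helper is a hypothetical name, and the two lists are copied from the hand-picked example printed earlier.

```python
def jaccard(generated, actual):
    """Jaccard similarity of two ingredient lists, treated as sets."""
    generated, actual = set(generated), set(actual)
    union = generated | actual
    if not union:
        # Convention chosen for this sketch: two empty recipes count as identical
        return 1.0
    return len(generated & actual) / len(union)

# Ingredient lists from the hand-picked example (pairs_subset.iloc[552])
generated = ['bean', 'carrot', 'vegan cheese', 'chicken', 'chicken broth',
             'coriander', 'corn', 'garlic', 'onion', 'rotel', 'seasoning',
             'coconut cream', 'starch', 'tortilla', 'water']
actual = ['chicken', 'chili powder', 'chipotle chile', 'coriander',
          'corn kernel', 'garlic', 'lime juice', 'lime wedge',
          'low sodium chicken broth', 'onion', 'salt', 'tomato',
          'tortilla', 'vegetable oil']

# 5 shared ingredients out of 24 distinct ones
print(jaccard(generated, actual))
```

Exact-match accuracy scores this pair as 0, while Jaccard gives 5/24 (about 0.21), crediting the five shared ingredients. Near-synonyms such as 'corn' and 'corn kernel' still count as different under this measure.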