# FOOD RECIPE RECOMMENDATION ENGINE

## Part 2d: Market Basket Analysis

In [1]:
import pandas as pd
from mlxtend.frequent_patterns import association_rules, apriori
from src.market_basket import filter_ratings, map_recipes  # Project helpers used below

### Load data

In [2]:
# Load recipes
recipes = pd.read_feather("./data/recipes.feather")

# Load interactions
interactions = pd.read_feather("./data/interactions.feather")

In [3]:
# Only keep recipes with 100+ ratings and users who have rated 100+ recipes
filtered_ratings = filter_ratings(interactions, n1=100, n2=100)
# Only keep user IDs and recipe names
filtered_ratings = map_recipes(recipes, filtered_ratings)

In [4]:
# One-hot encoding
onehot = filtered_ratings.pivot_table(index="user_id", columns="recipe_name", aggfunc=len, fill_value=0)
onehot = onehot > 0 # Returns "True" if there exists a user-recipe interaction and "False" otherwise

### Create association rules

We will use the `apriori` algorithm to find frequent itemsets and then generate association rules from them. For some rules, the names of the `antecedents` and `consequents` suggest why the recipes are associated. For example, `Quick Cinnamon Rolls No Yeast` is associated with `Mean's Dutch Babies` (both are desserts), and `Paula Deen Crock Pot Macaroni And Cheese` is associated with `Slow Cooker Macaroni Cheese` (both are mac 'n cheese).

In [5]:
# Compute frequent itemsets of at most two recipes
frequent_itemsets = apriori(onehot, min_support=0.00001, max_len=2, use_colnames=True)
# Create association rules (mlxtend defaults: metric="confidence", min_threshold=0.8)
rules = association_rules(frequent_itemsets)

If a person decides to make `Bacon Lattice Tomato Muffins Rsc`, we will recommend the five recipes in the `consequents` column with the highest lift.

### Make recommendations

In [6]:
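recipe = "Bacon Lattice Tomato Muffins Rsc"
# Match the recipe as an exact member of the antecedent set, not as a regex substring of its string form
matches = rules["antecedents"].apply(lambda items: recipe in items)
rules[matches].sort_values("lift", ascending=False).head(5)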

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction |
|---|---|---|---|---|---|---|---|---|---|
| 347 | (Bacon Lattice Tomato Muffins Rsc) | (Mile High Cabbage Pie 5fix) | 0.00073 | 0.00292 | 0.00073 | 1.0 | 342.5 | 0.000728 | inf |
| 293 | (Bacon Lattice Tomato Muffins Rsc) | (Kittencal's Caramel Apples) | 0.00073 | 0.006569 | 0.00073 | 1.0 | 152.222222 | 0.000725 | inf |
| 343 | (Bacon Lattice Tomato Muffins Rsc) | (Mexican Stack Up Rsc) | 0.00073 | 0.007299 | 0.00073 | 1.0 | 137.0 | 0.000725 | inf |
| 489 | (Bacon Lattice Tomato Muffins Rsc) | (Sweet Bacon Wrapped Venison Tenderloin) | 0.00073 | 0.008029 | 0.00073 | 1.0 | 124.545455 | 0.000724 | inf |
| 357 | (Bacon Lattice Tomato Muffins Rsc) | (N Y C Corned Beef And Cabbage) | 0.00073 | 0.008759 | 0.00073 | 1.0 | 114.166667 | 0.000724 | inf |
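
The metric columns in this output can be reproduced by hand, which helps when reading them. The following is a minimal sketch in plain Python, not the `mlxtend` implementation. The `baskets` data and the `rule_metrics` helper are hypothetical and only illustrate the standard definitions of support, confidence, and lift for a rule A → B.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    `baskets` is a list of sets, one set of recipe names per user.
    """
    n = len(baskets)
    n_a = sum(1 for b in baskets if antecedent in b)
    n_c = sum(1 for b in baskets if consequent in b)
    n_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    if n_a == 0 or n_c == 0:
        raise ValueError("both recipes must appear in at least one basket")

    support = n_both / n          # share of users who rated both recipes
    confidence = n_both / n_a     # share of antecedent users who also rated the consequent
    lift = confidence / (n_c / n) # confidence relative to the consequent's base rate
    return support, confidence, lift


# Hypothetical toy data: four users and the recipes each one rated.
baskets = [
    {"Mac And Cheese", "Cinnamon Rolls"},
    {"Mac And Cheese", "Cinnamon Rolls", "Caramel Apples"},
    {"Caramel Apples"},
    {"Mac And Cheese"},
]

support, confidence, lift = rule_metrics(baskets, "Cinnamon Rolls", "Mac And Cheese")
print(support, confidence, round(lift, 4))  # 0.5 1.0 1.3333
```

The same relationship holds in the output: lift is confidence divided by consequent support, so a confidence of 1.0 and a consequent support of about 0.00292 give a lift of roughly 342.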


### Conclusion

The problem with market basket analysis for this dataset is that there is too little co-occurrence data to find meaningful frequent itemsets, even with a very small `min_support`. Most recipes are rarely rated alongside other recipes, and most recipe pairs are shared by too few users. The rules above illustrate this: all five have a support of 0.00073 and a confidence of 1.0, so each rests on very few users. However, market basket analysis is a very interpretable method that could work well with restaurant menus, where we would be more likely to discover frequent buying patterns.
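
To make "a very small `min_support`" concrete, it can help to translate the relative threshold into a number of users. The sketch below uses a hypothetical helper, `min_users_for_support`, and assumes about 1,370 users in the one-hot matrix. That figure is inferred from the support values in the output (1 / 0.00073 ≈ 1,370) and is not stated in the notebook.

```python
import math


def min_users_for_support(min_support, n_users):
    """Smallest number of users an itemset needs in order to reach `min_support`."""
    # Round first so floating-point noise does not push the ceiling up by one.
    required = math.ceil(round(min_support * n_users, 9))
    return max(1, required)


n_users = 1370  # assumption inferred from the output, not a confirmed count

print(min_users_for_support(0.00001, n_users))  # 1
print(min_users_for_support(0.01, n_users))     # 14
```

Under that assumption, a `min_support` of 0.00001 filters nothing: any pair rated by a single user qualifies.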