<head><h1 align="center">
Recipe Recommender System
</h1></head>  
  
<head><h3 align="center">Memory Management & Model Serialization Notebook</h3></head>  

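In this project I build a recipe recommender system. This notebook covers memory management and model serialization for the scikit-surprise models.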

## Setup

### Package Installations

In [1]:
# %pip install wordcloud
# %pip install surprise

### Import Statements

In [1]:
# import libraries
import boto3, re, sys, os, time, math, csv, json, pickle, sagemaker, urllib.request
from sys import getsizeof
from os import system
from math import floor
from copy import deepcopy
from collections import defaultdict # used in get_top_n() function from Surprise documentation
from sagemaker import get_execution_role
import numpy as np                                
import pandas as pd                               
import matplotlib.pyplot as plt  
import seaborn as sns
from wordcloud import WordCloud, STOPWORDS  

from surprise.similarities import cosine, msd, pearson # for Memory-based Methods (Neighborhood-based)
from surprise.prediction_algorithms import SVD, knns, KNNWithMeans, KNNBasic, KNNBaseline, KNNWithZScore, CoClustering, BaselineOnly, NormalPredictor, NMF, SVDpp, SlopeOne
from surprise.model_selection import GridSearchCV, cross_validate, train_test_split
from surprise import Reader, Dataset, accuracy, dump

from IPython.display import Image                 
from IPython.display import display               
from time import gmtime, strftime                 
from sagemaker.predictor import csv_serializer   

# Setting display options for DataFrames and plots
pd.set_option('display.max_rows', 6)
pd.set_option('display.max_columns', 20)
pd.set_option('display.max_colwidth', 200)
sns.set_theme(font_scale=1.5)
%matplotlib inline

# Define IAM role
role = get_execution_role()
prefix = 'sagemaker/knn'
containers = {'us-west-2': '174872318107.dkr.ecr.us-west-2.amazonaws.com/knn:1',
              'us-east-1': '382416733822.dkr.ecr.us-east-1.amazonaws.com/knn:1',
              'us-east-2': '404615174143.dkr.ecr.us-east-2.amazonaws.com/knn:1',
              'eu-west-1': '438346466558.dkr.ecr.eu-west-1.amazonaws.com/knn:1',
              'ap-northeast-1': '351501993468.dkr.ecr.ap-northeast-1.amazonaws.com/knn:1',
              'ap-northeast-2': '835164637446.dkr.ecr.ap-northeast-2.amazonaws.com/knn:1',
              'ap-southeast-2': '712309505854.dkr.ecr.ap-southeast-2.amazonaws.com/knn:1'}

my_region = boto3.session.Session().region_name # set the region of the instance
print("Success - the MySageMakerInstance is in the " + my_region + " region. You will use the " + containers[my_region] + " container for your SageMaker endpoint.")

Success - the MySageMakerInstance is in the us-east-2 region. You will use the 404615174143.dkr.ecr.us-east-2.amazonaws.com/knn:1 container for your SageMaker endpoint.


### Functions

In [2]:
def get_top_n(predictions, n=10):
    """Return the top-N recommendations for each user from a set of predictions.

    Args:
        predictions(list of Prediction objects): The list of predictions, as
            returned by the test method of an algorithm.
        n(int): The number of recommendations to output for each user. Default
            is 10.

    Returns:
    A dict where keys are user (raw) ids and values are lists of tuples:
        [(raw item id, rating estimation), ...] of size n.
    """

    # First map the predictions to each user.
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, est))

    # Then sort the predictions for each user and retrieve the n highest ones.
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]

    return top_n
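
As a quick illustration of how `get_top_n()` behaves, the sketch below runs it on a handful of made-up predictions. The `Prediction` namedtuple is a stand-in that mirrors the five fields of Surprise's prediction objects, and the user and recipe ids are hypothetical.

```python
from collections import defaultdict, namedtuple

# Stand-in for surprise's Prediction (same five fields, in the same order).
Prediction = namedtuple("Prediction", ["uid", "iid", "r_ui", "est", "details"])


def get_top_n(predictions, n=10):
    """Same logic as the function above, repeated so this cell runs on its own."""
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, est))
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]
    return top_n


# Hypothetical predictions for two users.
toy_predictions = [
    Prediction("u1", "r1", 4.0, 4.2, {}),
    Prediction("u1", "r2", 5.0, 4.9, {}),
    Prediction("u1", "r3", 3.0, 3.1, {}),
    Prediction("u2", "r1", 5.0, 4.6, {}),
]

toy_top_2 = get_top_n(toy_predictions, n=2)
print(dict(toy_top_2))
```

Because the function simply ranks whatever predictions it is given, passing it predictions made on `trainset.build_testset()` ranks recipes that each user has already rated.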

# Data

In [3]:
rdf = pd.read_csv('data/RAW_recipes.csv')
idf = pd.read_csv('data/RAW_interactions.csv')

## Cleaning & Data Preparation

In [4]:
# Cleaning: Dropping row that contains a NaN value for recipe name
rdf.drop(labels=721, inplace = True)

# Cleaning/FE: Creating columns for recipe's respective nutrients
rdf['kcal'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[0])
rdf['fat'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[1])
rdf['sugar'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[2])
rdf['salt'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[3])
rdf['protein'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[4])
rdf['sat_fat'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[5])
rdf['carbs'] = rdf.nutrition.apply(lambda x: x[1:-1].split(sep=', ')[6])

# Cleaning: Imputing outlier value to median
rdf['minutes'] = np.where(rdf.minutes == 2147483647,
                         rdf.minutes.median(),
                         rdf.minutes)

idf['date'] = pd.to_datetime(idf.date)

# A reader is still needed but only the rating_scale param is required.
reader = Reader(rating_scale=(1, 5))

# The columns must correspond to user id, item id and ratings (in that order).
sidf = Dataset.load_from_df(idf[['user_id', 'recipe_id', 'rating']], reader)
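
The nutrient columns created above are still strings after the split. A minimal sketch of parsing the whole `nutrition` string once and converting the values to floats is shown below. The helper `parse_nutrition` and the example string are hypothetical, and the sketch assumes every value in the column looks like a Python list literal with seven numbers.

```python
import ast

# Column names in the order used by the cleaning cell above.
NUTRIENTS = ["kcal", "fat", "sugar", "salt", "protein", "sat_fat", "carbs"]


def parse_nutrition(raw):
    """Turn a string such as '[51.5, 0.0, ...]' into a {nutrient: float} dict."""
    values = ast.literal_eval(raw)  # safely evaluates the list literal
    if len(values) != len(NUTRIENTS):
        raise ValueError(f"expected {len(NUTRIENTS)} values, got {len(values)}")
    return {name: float(value) for name, value in zip(NUTRIENTS, values)}


# Hypothetical nutrition string in the assumed format.
example = parse_nutrition("[51.5, 0.0, 13.0, 0.0, 2.0, 0.0, 4.0]")
print(example["kcal"], example["carbs"])
```

With pandas, something like `pd.DataFrame(rdf.nutrition.map(parse_nutrition).tolist(), index=rdf.index)` should give all seven numeric columns in one pass, though I have not run that against the full dataset here.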

# Model Serialization

#### Optimum BaselineOnly ALS Model

In [None]:
# Assigning the Baseline ALS algorithm, set with optimum hyperparameters, to a variable
tuned_bsl_als = BaselineOnly(bsl_options = {'method':'als',
                                            'n_epochs': 5,
                                            'reg_u': 10,
                                            'reg_i': 10}, verbose=True)

# tuned_bsl_als.predict(165623, 132263, verbose=True) # this can be used to generate the predicted rating for any user-item combination.
# Build the trainset from all ratings in sidf before fitting
trainset = sidf.build_full_trainset()
algo = tuned_bsl_als
algo.fit(trainset)

# Compute predictions of the 'original' algorithm.
als_preds = algo.test(trainset.build_testset())

# tuned_bsl_als.get_neighbors(165623, 10) # not possible with baseline als

In [10]:
# Assigning the Baseline ALS algorithm, set with optimum hyperparameters, to a variable
tuned_bsl_als = BaselineOnly(bsl_options = {'method':'als',
                                            'n_epochs': 5,
                                            'reg_u': 10,
                                            'reg_i': 10}, verbose=True)

In [58]:
tuned_bsl_als.predict(165623, 132263, verbose=True)

user: 165623     item: 132263     r_ui = None   est = 3.37   {'was_impossible': False}


Prediction(uid=165623, iid=132263, r_ui=None, est=3.369978531598852, details={'was_impossible': False})

In [None]:
trainset = sidf.build_full_trainset()

algo = tuned_bsl_als
algo.fit(trainset)

# Compute predictions of the 'original' algorithm.
predictions = algo.test(trainset.build_testset())

# Here we use scikit-surprise's dump module to save the optimized algorithm state.
file_name = 'models/ALS_bsl_tuned_model'
dump.dump(file_name, predictions=predictions, algo=algo)
_, loaded_algo = dump.load(file_name) # ... and here we reload the file...

# and now we ensure that the algo is still the same by checking the predictions.
predictions_loaded_algo = loaded_algo.test(trainset.build_testset())
assert predictions == predictions_loaded_algo
print('Predictions are the same')

Estimating biases using als...
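
The save/reload check above follows a general pattern: serialize the fitted state, load it back, and confirm the predictions are unchanged without fitting again. As far as I know, Surprise's `dump` module is a thin wrapper around `pickle`, so the sketch below uses `pickle` directly with a toy stand-in model. The functions `fit_mean_model` and `predict_all`, the file name, and the three ratings are all hypothetical.

```python
import os
import pickle
import tempfile


def fit_mean_model(ratings):
    """Stand-in for algo.fit(): the fitted state is just the global mean."""
    return {"global_mean": sum(r for _, _, r in ratings) / len(ratings)}


def predict_all(state, ratings):
    """Stand-in for algo.test(): one (uid, iid, r_ui, est) tuple per rating."""
    return [(uid, iid, r, state["global_mean"]) for uid, iid, r in ratings]


ratings = [(38094, 40893, 4.0), (38094, 16954, 5.0), (1293707, 40893, 5.0)]
state = fit_mean_model(ratings)
predictions = predict_all(state, ratings)

with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, "toy_model.pkl")
    # Save the predictions and the fitted state together.
    with open(file_name, "wb") as f:
        pickle.dump({"predictions": predictions, "algo": state}, f)
    # Reload them from disk.
    with open(file_name, "rb") as f:
        loaded = pickle.load(f)

# The reloaded state is already fitted, so no second fit is needed.
loaded_predictions = predict_all(loaded["algo"], ratings)
assert loaded_predictions == predictions
print("Predictions are the same")
```

The same idea applies to a model loaded with `dump.load()`: it already carries its fitted parameters, so calling `.fit()` on it again retrains it rather than reusing what was saved.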


#### Optimum BaselineOnly SGD Model

In [17]:
tuned_bsl_sgd = BaselineOnly(bsl_options = {'method':'sgd',
                                            'n_epochs': 20,
                                            'reg_u': .02,
                                            'reg_i': .005}, verbose=True)

In [18]:
trainset = sidf.build_full_trainset()

algo = tuned_bsl_sgd
algo.fit(trainset)

# Compute predictions of the 'original' algorithm.
predictions = algo.test(trainset.build_testset())

# Here we use scikit-surprise's dump module to save the optimized algorithm state.
file_name = 'models/SGD_bsl_tuned_model'
dump.dump(file_name, algo=algo)
_, loaded_algo = dump.load(file_name) # ... and here we reload the file...

# and now we ensure that the algo is still the same by checking the predictions.
predictions_loaded_algo = loaded_algo.test(trainset.build_testset())
assert predictions == predictions_loaded_algo
print('Predictions are the same')

Estimating biases using sgd...
Predictions are the same


#### Optimum SVD Model

In [None]:
trainset = sidf.build_full_trainset()

# gs_svd must already hold the tuned SVD algorithm; it is not defined in this notebook
algo = gs_svd
algo.fit(trainset)

# Compute predictions of the 'original' algorithm.
predictions = algo.test(trainset.build_testset())

# Here we use scikit-surprise's dump module to save the optimized algorithm state.
file_name = 'models/serialized_optimum_svd_model'
dump.dump(file_name, algo=algo)
_, loaded_algo = dump.load(file_name) # ... and here we reload the file...

# and now we ensure that the algo is still the same by checking the predictions.
predictions_loaded_algo = loaded_algo.test(trainset.build_testset())
assert predictions == predictions_loaded_algo
print('Predictions are the same')

# Evaluation

We can either use the hyperparameter-tuned algorithm variables (such as `tuned_bsl_als`) instantiated above in the ***Model Serialization*** section, or load the models we saved with Surprise's `dump` module.  
  
For each algorithm in the subsections below, I have included the code for both methods (the alternative is commented out) for ease of use and to make each section easier to follow.

### Build Trainset & Testset

Below I build the trainset from the surprise dataset object (`sidf`) that we've been working with throughout evaluation. The trainset and testset are used for each algorithm to compare predicted user-item recommendations.

In [5]:
trainset = sidf.build_full_trainset()

In [6]:
testset = trainset.build_testset()

## ALS Recommender Model Predictions

#### From ALS Variable

In [16]:
# Assigning hyperparameter-tuned ALS Baseline algorithm to a variable
tuned_bsl_als = BaselineOnly(bsl_options = {'method':'als',
                                            'n_epochs': 5,
                                            'reg_u': 10,
                                            'reg_i': 10}, verbose=True)
# Fitting the ALS algorithm to the trainset
tuned_bsl_als.fit(trainset)

# And finally, assigning the algorithm's predictions on the testset to a variable
als_predictions = tuned_bsl_als.test(testset)

Estimating biases using als...


In [17]:
# r_ui: the True rating of the user for each respective recipe
# est: the predicted rating of the user for each respective recipe
als_predictions 

[Prediction(uid=38094, iid=40893, r_ui=4.0, est=4.717762948133046, details={'was_impossible': False}),
 Prediction(uid=38094, iid=16954, r_ui=5.0, est=4.720104295625348, details={'was_impossible': False}),
 Prediction(uid=38094, iid=40753, r_ui=5.0, est=4.965711302120515, details={'was_impossible': False}),
 Prediction(uid=38094, iid=34513, r_ui=5.0, est=4.78439695346031, details={'was_impossible': False}),
 Prediction(uid=38094, iid=69545, r_ui=5.0, est=4.847718336466992, details={'was_impossible': False}),
 Prediction(uid=38094, iid=49064, r_ui=4.0, est=4.567575332072738, details={'was_impossible': False}),
 Prediction(uid=38094, iid=80044, r_ui=5.0, est=4.530482925844968, details={'was_impossible': False}),
 Prediction(uid=38094, iid=30565, r_ui=5.0, est=4.762056899841326, details={'was_impossible': False}),
 Prediction(uid=38094, iid=29493, r_ui=5.0, est=4.631227317798427, details={'was_impossible': False}),
 Prediction(uid=38094, iid=34509, r_ui=5.0, est=4.798115836448116, details

In [18]:
# 1,132,367 — confirming the length of the list of predictions 
# to ensure that each user-recipe interaction is accounted for
len(als_predictions)

1132367

In [20]:
# using the get_top_n() function to see the top recommended recipes for each user
als_top_5 = get_top_n(als_predictions, 5)

In [21]:
als_top_5

defaultdict(list,
            {38094: [(40753, 4.965711302120515),
              (20500, 4.921787132391187),
              (4764, 4.877474116841598),
              (69545, 4.847718336466992),
              (35302, 4.832634597686826)],
             1293707: [(52282, 5),
              (134316, 4.975509579676866),
              (219563, 4.852369435142123),
              (7404, 4.821341897450531),
              (376391, 4.812859943898884)],
             8937: [(38031, 4.529608600941765),
              (20128, 4.463915020866052),
              (55392, 4.454353032814657),
              (41596, 4.425487486967377),
              (59635, 4.393057743079915)],
             126440: [(379639, 5),
              (45539, 5),
              (53594, 5),
              (143719, 5),
              (88290, 5)],
             57222: [(146201, 4.738707012396724),
              (75737, 4.728293921738709),
              (170114, 4.7165674784873515),
              (60560, 4.715434538327113),
              (26772, 4

In [20]:
idf[idf.user_id == 38094].rating.value_counts() # this user rated 29 recipes with 5, and 6 recipes with 4

5    29
4     6
Name: rating, dtype: int64

#### Alternative: From Imported Saved ALS Model

In [26]:
# file_name = 'models/ALS_bsl_tuned_model'
# als_model = dump.load(file_name)[1]
# file_name = None
# als_fit_train = als_model.fit(trainset)
# als_preds = als_fit_train.test(testset)
# als_preds # predicted user recipe ratings

In [31]:
# top_n = get_top_n(als_preds, n=5)

# # Print the recommended items for each user
# for uid, user_ratings in top_n.items():
#     print(uid, [iid for (iid, _) in user_ratings])

## Baseline Model with SGD — Predicted Recommendations

#### From `tuned_bsl_sgd` Variable

In [7]:
# Assigning hyperparameter-tuned Baseline Stochastic Gradient Descent algorithm to a variable
tuned_bsl_sgd = BaselineOnly(bsl_options = {'method':'sgd',
                                            'n_epochs': 20,
                                            'reg_u': .02,
                                            'reg_i': .005}, verbose=True)
# Fitting the SGD algorithm to the trainset
tuned_bsl_sgd.fit(trainset)

# And finally, assigning the algorithm's predictions on the testset to a variable
sgd_predictions = tuned_bsl_sgd.test(testset)

Estimating biases using sgd...


In [8]:
# r_ui: the True rating of the user for each respective recipe
# est: the predicted rating of the user for each respective recipe
sgd_predictions 

[Prediction(uid=38094, iid=40893, r_ui=4.0, est=4.799321217777709, details={'was_impossible': False}),
 Prediction(uid=38094, iid=16954, r_ui=5.0, est=4.797712235032227, details={'was_impossible': False}),
 Prediction(uid=38094, iid=40753, r_ui=5.0, est=5, details={'was_impossible': False}),
 Prediction(uid=38094, iid=34513, r_ui=5.0, est=4.870552565674106, details={'was_impossible': False}),
 Prediction(uid=38094, iid=69545, r_ui=5.0, est=4.9501615496990565, details={'was_impossible': False}),
 Prediction(uid=38094, iid=49064, r_ui=4.0, est=4.608937478584497, details={'was_impossible': False}),
 Prediction(uid=38094, iid=80044, r_ui=5.0, est=4.554244839312158, details={'was_impossible': False}),
 Prediction(uid=38094, iid=30565, r_ui=5.0, est=4.829816273250924, details={'was_impossible': False}),
 Prediction(uid=38094, iid=29493, r_ui=5.0, est=4.628568109932685, details={'was_impossible': False}),
 Prediction(uid=38094, iid=34509, r_ui=5.0, est=4.874499530454643, details={'was_impossi

In [11]:
# 1,132,367 — confirming the length of the list of predictions 
# to ensure that each user-recipe interaction is accounted for
len(sgd_predictions)

1132367

In [14]:
# using the get_top_n() function to see the top recommended recipes for each user
sgd_top_5 = get_top_n(sgd_predictions, 5)

In [15]:
sgd_top_5

defaultdict(list,
            {38094: [(40753, 5),
              (20500, 5),
              (4764, 4.9868731410801335),
              (69545, 4.9501615496990565),
              (35302, 4.922203972863365)],
             1293707: [(134316, 5),
              (52282, 5),
              (219563, 4.956070382485434),
              (7404, 4.92934390192857),
              (376391, 4.915376559541175)],
             8937: [(38031, 4.542060568996136),
              (20128, 4.4695416268470565),
              (55392, 4.4366124744479425),
              (41596, 4.4115541596155206),
              (10620, 4.373107058901768)],
             126440: [(379639, 5),
              (45539, 5),
              (53594, 5),
              (315187, 5),
              (143719, 5)],
             57222: [(146201, 4.826654883616256),
              (170114, 4.79880092615459),
              (26772, 4.784103646047251),
              (75737, 4.767082151917793),
              (61195, 4.754183656242576)],
             52282: [(131

In [20]:
# idf[idf.user_id == ????].rating.value_counts() # input a user_id in place of ???? to see how many times they have given each rating


#### Alternative: From Imported Saved Baseline with SGD Model

In [None]:
# file_name = 'models/SGD_bsl_tuned_model'
# sgd_model = dump.load(file_name)[1]
# file_name = None
# sgd_fit_train = sgd_model.fit(trainset)
# sgd_preds = sgd_fit_train.test(testset)
# sgd_preds # predicted user recipe ratings

In [9]:
# top_n = get_top_n(sgd_preds, n=10)

# # Print the recommended items for each user
# for uid, user_ratings in top_n.items():
#     print(uid, [iid for (iid, _) in user_ratings])

## Singular Value Decomposition Model — Predicted Recommendations

#### Alternative: From Imported Saved Funk-SVD Model

In [None]:
# file_name = 'models/serialized_optimum_svd_model'
# svd_model = dump.load(file_name)[1]
# file_name = None
# svd_fit_train = svd_model.fit(trainset)
# svd_preds = svd_fit_train.test(testset)
# svd_preds # predicted user recipe ratings

In [9]:
# top_n = get_top_n(svd_preds, n=10)

# # Print the recommended items for each user
# for uid, user_ratings in top_n.items():
#     print(uid, [iid for (iid, _) in user_ratings])

### Further Prediction Examples

In [72]:
tuned_bsl_als.predict(165623, 132263, verbose=True)

user: 165623     item: 132263     r_ui = None   est = 3.37   {'was_impossible': False}


Prediction(uid=165623, iid=132263, r_ui=None, est=3.3745201993261214, details={'was_impossible': False})

#### Example User ID Reference

In [71]:
idf.user_id.value_counts(sort=True, ascending=False)[500:550] # Useful code for finding users that have rated many recipes

165623     277
173880     277
2549237    276
          ... 
28087      252
153188     252
256795     252
Name: user_id, Length: 50, dtype: int64

In [73]:
tuned_bsl_als.predict(165623, 132263, verbose=True)

user: 165623     item: 132263     r_ui = None   est = 3.37   {'was_impossible': False}


Prediction(uid=165623, iid=132263, r_ui=None, est=3.3745201993261214, details={'was_impossible': False})

In [74]:
idf[idf.user_id == 165623]

Unnamed: 0,user_id,recipe_id,date,rating,review
9947,165623,365179,2009-11-29,5,This was a tasty little sandwich - I liked the addition red onion but used mayo instead of miracle whip and smoky cheddar instead of American.
12558,165623,213767,2007-03-16,4,"I am a big germie so I couldn't bring myself to add a stone to the soup but we did use a potato. I used green bell pepper instead of red, also omitted the corn and yellow squash. I didn't use the ..."
12735,165623,146129,2009-11-29,5,Very good - especially with leftovers. I had to make a few changes based on the picky eaters here. I cooked it as directed but before adding the cheese I used the immersion blender and whipped it ...
...,...,...,...,...,...
1118276,165623,349157,2009-03-24,4,This is a pretty basic recipe in ease and flavor. It is like comfort food without all the hassle. I would make this again but add some veggies like broccoli to it. Thanks for an easy dinner.
1121537,165623,95798,2007-03-19,3,"We had this for lunch today and thought it was just ok. The only change I made was I didn't have a 14 3/4 oz can of salmon but did have 3 - 6oz cans, so I used a little extra salmon."
1128405,165623,198021,2007-03-05,5,"We had this over ice cream last night ... GREAT! And over pancakes this morning ... EVEN BETTER!! The next day we took 1 cup of this sauce with 1 1/2 scoops vanilla icecream, 1 cup of milk and 7 i..."


#### Example Recipe ID Reference

In [None]:
# Meats, Fried & Barbecue
# 63986 — chicken lickin good pork chops # 19

# Vegetable & Vegan
# 132263 — 5 minute vegan pancakes # 482

# Desserts
# 52035 — oreo balls # 354
# 46877 — uncle bill s whipped shortbread cookies # 315
# 42198 — better than sex strawberries # 5
# 52804 — jiffy extra moist carrot cake # 8

# Pasta/Italian
# 22176 — classic baked ziti # 316

# Asian
# 48760 — szechuan noodles with spicy beef sauce # 315
# 103215 — panda express orange chicken # 347

# Fish
# 53914 — mama s supper club tilapia permesan # 472

# Fried
# 108364 — southern fried chicken # 323

## Understanding Methods of Building `testset`

Note: reduce the number of recipes/users in `sidf` / `idf` to keep memory usage manageable.

Now let's import our saved models and have a look at the predictions.

### Inspecting build_full_trainset and build_testset Methods/Attributes

In [9]:
trainset = sidf.build_full_trainset()

In [10]:
getsizeof(trainset)

56

In [11]:
testset = trainset.build_testset()

In [12]:
getsizeof(testset) # 9,784,696 bytes (~9.78 MB); this is the shallow size of the list object only

9784696
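
One caveat about the sizes above: `sys.getsizeof` reports only the shallow size of an object, so for a list it counts the list's own slots but not the tuples stored in it, and for `trainset` it counts only the bare instance. A minimal sketch of a recursive size estimate is given below. The helper `deep_getsizeof` and the miniature testset are hypothetical, and the helper only follows built-in containers.

```python
from sys import getsizeof


def deep_getsizeof(obj, seen=None):
    """Recursively sum getsizeof over lists, tuples, sets, and dicts."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size


# Hypothetical miniature testset with the same (user, item, rating) layout.
toy_testset = [(user, item, 5.0) for user in range(100) for item in range(10)]

shallow = getsizeof(toy_testset)
deep = deep_getsizeof(toy_testset)
print(shallow, deep)
```

Applied to the full `testset`, a deep count like this would also include the 1,132,367 tuples themselves, which the 9,784,696-byte figure leaves out.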

In [45]:
predictions = algo.test(testset)

In [46]:
predictions

[Prediction(uid=38094, iid=40893, r_ui=4.0, est=4.665102491701689, details={'was_impossible': False}),
 Prediction(uid=38094, iid=16954, r_ui=5.0, est=4.6234436594111195, details={'was_impossible': False}),
 Prediction(uid=38094, iid=40753, r_ui=5.0, est=4.8826154836441304, details={'was_impossible': False}),
 Prediction(uid=38094, iid=34513, r_ui=5.0, est=4.761318744163554, details={'was_impossible': False}),
 Prediction(uid=38094, iid=69545, r_ui=5.0, est=4.795950347809112, details={'was_impossible': False}),
 Prediction(uid=38094, iid=49064, r_ui=4.0, est=4.814548290117393, details={'was_impossible': False}),
 Prediction(uid=38094, iid=80044, r_ui=5.0, est=4.410186915119732, details={'was_impossible': False}),
 Prediction(uid=38094, iid=30565, r_ui=5.0, est=4.715618538653177, details={'was_impossible': False}),
 Prediction(uid=38094, iid=29493, r_ui=5.0, est=4.58675508819778, details={'was_impossible': False}),
 Prediction(uid=38094, iid=34509, r_ui=5.0, est=4.769320072378886, detai

In [48]:
als_top_5 = get_top_n(predictions, 5)

In [19]:
idf[(idf.user_id == 126440) & (idf.recipe_id == 408734)]

Unnamed: 0,user_id,recipe_id,date,rating,review
105271,126440,408734,2011-06-11,5,Great spicy sauce for the tender chicken. I made half a recipe with three thighs and one breast.<br/>Only used 1/2 t. each of cayenne and pepper flakes and it was perfect for us.


In [14]:
testset

[(38094, 40893, 4.0),
 (38094, 16954, 5.0),
 (38094, 40753, 5.0),
 (38094, 34513, 5.0),
 (38094, 69545, 5.0),
 (38094, 49064, 4.0),
 (38094, 80044, 5.0),
 (38094, 30565, 5.0),
 (38094, 29493, 5.0),
 (38094, 34509, 5.0),
 (38094, 20500, 5.0),
 (38094, 38104, 5.0),
 (38094, 39907, 4.0),
 (38094, 77818, 5.0),
 (38094, 18007, 5.0),
 (38094, 4764, 5.0),
 (38094, 49387, 5.0),
 (38094, 71569, 5.0),
 (38094, 72929, 5.0),
 (38094, 81845, 5.0),
 (38094, 69501, 5.0),
 (38094, 44244, 5.0),
 (38094, 36653, 4.0),
 (38094, 40923, 5.0),
 (38094, 79964, 5.0),
 (38094, 40852, 5.0),
 (38094, 36381, 4.0),
 (38094, 16391, 5.0),
 (38094, 31503, 4.0),
 (38094, 26259, 5.0),
 (38094, 71499, 5.0),
 (38094, 63965, 5.0),
 (38094, 32377, 5.0),
 (38094, 35302, 5.0),
 (38094, 22319, 5.0),
 (1293707, 40893, 5.0),
 (1293707, 134316, 5.0),
 (1293707, 39446, 5.0),
 (1293707, 253891, 5.0),
 (1293707, 204257, 0.0),
 (1293707, 99564, 4.0),
 (1293707, 115110, 5.0),
 (1293707, 219563, 5.0),
 (1293707, 178556, 5.0),
 (1293707

In [25]:
# First train an (als) algorithm on the recipe interactions dataset.
trainset = sidf.build_full_trainset()
algo = tuned_bsl_als # variable assigned above in section 5: Model Serialization/ Optimum BaselineOnly ALS Model
algo.fit(trainset)

Estimating biases using als...


<surprise.prediction_algorithms.baseline_only.BaselineOnly at 0x7fbcb1a21128>

In [None]:
# Then predict ratings for all pairs (u, i) that are NOT in the training set.
testset = trainset.build_anti_testset()

In [26]:

predictions = algo.test(testset)


MemoryError: 

In [17]:



top_n = get_top_n(predictions, n=10)

# Print the recommended items for each user
for uid, user_ratings in top_n.items():
    print(uid, [iid for (iid, _) in user_ratings])

### ANTI: Inspecting build_full_trainset and build_anti_testset Methods/Attributes

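The `.build_anti_testset()` method builds a testset of every user-item pair that is *not* in the trainset, so that ratings can be predicted for recipes a user has not yet rated.  

Note: **this method will likely require trimming the dataset to avoid memory issues**, because the number of such pairs grows with the number of users multiplied by the number of recipes.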
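
To make the memory problem concrete, the sketch below counts how many pairs an anti-testset would contain and, as one possible workaround, generates the unrated pairs lazily for a single user instead of for everyone at once. Both helpers (`anti_testset_size`, `anti_testset_for_user`) and the five toy ratings are hypothetical. The real method works on Surprise's `Trainset`, whereas this sketch works on plain `(user_id, recipe_id, rating)` tuples.

```python
def anti_testset_size(ratings):
    """Number of (user, item) pairs that have no rating."""
    users = {uid for uid, _, _ in ratings}
    items = {iid for _, iid, _ in ratings}
    rated_pairs = {(uid, iid) for uid, iid, _ in ratings}
    return len(users) * len(items) - len(rated_pairs)


def anti_testset_for_user(ratings, target_uid, fill):
    """Yield (uid, iid, fill) for every item the target user has not rated."""
    all_items = {iid for _, iid, _ in ratings}
    rated = {iid for uid, iid, _ in ratings if uid == target_uid}
    for iid in sorted(all_items - rated):
        yield (target_uid, iid, fill)


# Hypothetical miniature interaction list: (user_id, recipe_id, rating).
toy_ratings = [
    (38094, 40893, 4.0),
    (38094, 16954, 5.0),
    (1293707, 40893, 5.0),
    (1293707, 134316, 5.0),
    (8937, 38031, 4.0),
]
# The global mean is used as a placeholder "true" rating for unrated pairs.
global_mean = sum(r for _, _, r in toy_ratings) / len(toy_ratings)

print(anti_testset_size(toy_ratings))
print(list(anti_testset_for_user(toy_ratings, 8937, global_mean)))
```

With 3 users and 4 recipes the toy data has 12 possible pairs, 5 of which are rated, leaving 7. On the full dataset the same product is what exhausts memory, so building the list one user (or one batch of users) at a time and passing each batch to `algo.test()` may be a workable alternative.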

In [5]:
train_experiment = sidf.build_full_trainset()

In [6]:
getsizeof(train_experiment)

56

In [11]:
testset_exp = train_experiment.build_anti_testset()

MemoryError: 

In [None]:
idf.user_id.unique()

### Inspecting Train Test Split Objects

To clarify the different ways of constructing trainset/testset objects in scikit-surprise:
- Building a `testset` from a `trainset` constructed with **`sidf.build_full_trainset()`** produces a testset as large as (equal in length to) `sidf`, whereas...
- Performing **`train_test_split(sidf, test_size=.25)`** actually splits `sidf` into two, with `tts_test` (the testset in this case) holding $\frac{1}{4}$ of the ratings in `sidf`.

In [39]:
len(tts_test), len(testset) # tts_test is assigned in the next cell; testset was assigned above in "Inspecting build_full_trainset and build_testset Methods/Attributes"

(283092, 1132367)
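
The two lengths above are consistent with a 25% split of the full ratings list. The check below reproduces the smaller number, assuming that a fractional `test_size` is rounded up to a whole number of ratings. That rounding rule is my assumption, although it matches the output shown.

```python
from math import ceil

n_ratings = 1132367  # len(testset), from the output above
test_size = 0.25

# Assumption: a fractional test_size is rounded up to a whole number of ratings.
expected_test_len = ceil(test_size * n_ratings)
print(expected_test_len)  # 283092
```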

In [27]:
tts_train, tts_test = train_test_split(sidf, test_size=.25)

# We'll use our tuned Baseline ALS algorithm.
algo = tuned_bsl_als

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(tts_train)
predictions = algo.test(tts_test)

# Then compute RMSE
accuracy.rmse(predictions)

Estimating biases using als...
RMSE: 1.2164


1.2163772843400735
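
For reference, RMSE is the square root of the mean squared difference between the true ratings (`r_ui`) and the estimates (`est`). The sketch below computes it by hand on three made-up pairs. The helper `rmse` and the numbers are hypothetical and are not taken from the model output.

```python
from math import sqrt


def rmse(pairs):
    """Root mean squared error over (true rating, estimated rating) pairs."""
    return sqrt(sum((r_ui - est) ** 2 for r_ui, est in pairs) / len(pairs))


# Hypothetical (r_ui, est) pairs, chosen to keep the arithmetic easy to follow.
toy_pairs = [(5.0, 4.0), (3.0, 5.0), (4.0, 4.0)]
print(rmse(toy_pairs))  # sqrt((1 + 4 + 0) / 3)
```

Read this way, an RMSE of about 1.22 means the held-out predictions are typically off by a little more than one point on the rating scale.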

In [37]:
idf[idf.user_id == 177443] # what user 177,443 actually thought of various recipes

Unnamed: 0,user_id,recipe_id,date,rating,review
5041,177443,378910,2013-04-18,5,"After seeing Ina Garten make this savory coeur a la creme on her tv show &quot;The Barefoot Contessa&quot;, I had to make it and was thrilled to find the recipe right here on food dot com. Wonderf..."
7361,177443,361942,2009-08-25,5,"Fantastic! I made these poetical egg salad tea sandwiches today for luncheon, and they were enjoyed immensely. Made the butter with fresh mint from my herb garden, which added a very lovely accen..."
10790,177443,23044,2009-07-09,5,"Absolutely fantastic!!! Made the pasta with spinach and pine nuts, plus cherry tomatoes from the garden. I served it as a room temperature salad, and it was devoured---this pasta dish is so delici..."
...,...,...,...,...,...
1128201,177443,87153,2009-05-11,5,"Oh this was good!!! Loved the combination of tart, sweet creamy and spicy all in one refreshing and healthy dessert! I used fresh pineapple, and did not wait but gobbled this right up. Am thinking..."
1129236,177443,270685,2009-01-08,5,Lovely drink! Had this one in honor of my granddaughter's birth on January 6 (her name is Victoria Rose). Cheers !!!
1131877,177443,396620,2011-11-07,5,"This is such a simple and tasty way to prepare sugar snap peas--and has quickly become one of our favorites! Will make often. Thank you so much for posting, Lori Mama!"


In [40]:
tts_test_preds = get_top_n(predictions, 5) # using our function to produce the top 5 recipes for all users in tts_test
len(tts_test_preds) # 81,660: the number of users we have generated the top 5 recommended recipes for 

81660

In [33]:
algo.predict(203111, 73087, verbose=True) # est is 4.37; the true rating in tts_test is 2.0

user: 203111     item: 73087      r_ui = None   est = 4.37   {'was_impossible': False}


Prediction(uid=203111, iid=73087, r_ui=None, est=4.373626823190885, details={'was_impossible': False})

In [35]:
idf[idf.user_id == 203111].rating.mean()

4.089285714285714

In [29]:
getsizeof(tts_train), getsizeof(tts_test)

(56, 2380488)
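
A caveat on these numbers: `sys.getsizeof` reports only the shallow size of an object. For a list that covers the list's own pointer array but not the tuples it holds, and for a plain object it most likely covers just the object header, not the data stored in its attributes. The sketch below walks containers and instance attributes to get a rougher but deeper estimate. `deep_getsizeof` is a hypothetical helper written for this note, and it ignores things like `__slots__` and NumPy buffers.

```python
import sys

def deep_getsizeof(obj, _seen=None):
    """Approximate total size in bytes of obj plus everything it references."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    elif hasattr(obj, "__dict__"):
        size += deep_getsizeof(vars(obj), seen)
    return size

# Toy stand-in for a list of (user, item, rating) tuples
toy_test = [(177443, 64235, 5.0), (386585, 286395, 4.0)]
shallow = sys.getsizeof(toy_test)
deep = deep_getsizeof(toy_test)
```

On the toy list, `deep` is larger than `shallow` because it also counts the tuples and the numbers inside them.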

In [30]:
tts_test

[(177443, 64235, 5.0),
 (386585, 286395, 4.0),
 (160497, 70173, 5.0),
 (56003, 14089, 5.0),
 (946853, 32062, 0.0),
 (2796729, 499768, 5.0),
 (125388, 216193, 5.0),
 (470351, 70147, 5.0),
 (199020, 264500, 5.0),
 (47510, 379434, 5.0),
 (195879, 265751, 4.0),
 (173579, 95533, 5.0),
 (18745, 66241, 5.0),
 (335469, 137253, 5.0),
 (135377, 217970, 0.0),
 (461724, 88290, 5.0),
 (1706426, 365793, 5.0),
 (296809, 326191, 5.0),
 (2000423164, 11763, 5.0),
 (2692620, 495271, 5.0),
 (1666435, 107786, 5.0),
 (397284, 51550, 5.0),
 (445492, 232069, 5.0),
 (1368446, 312881, 5.0),
 (439797, 102678, 5.0),
 (1802488208, 140486, 3.0),
 (171854, 227052, 5.0),
 (223854, 381896, 5.0),
 (142200, 93016, 5.0),
 (75449, 820, 4.0),
 (145830, 81853, 5.0),
 (614389, 23779, 3.0),
 (762159, 73825, 3.0),
 (303700, 346344, 5.0),
 (743724, 217973, 0.0),
 (118268, 46982, 4.0),
 (126004, 19896, 5.0),
 (203111, 73087, 2.0),
 (601528, 246791, 5.0),
 (30490, 22516, 0.0),
 (125109, 120969, 5.0),
 (357222, 283312, 5.0),
 (116

# Reference

### Unused Data Preparation

In [17]:
# ui and sidf both report 56 bytes from sys.getsizeof(), which measures only the shallow object size, so this comparison gave no sign that using ui in place of sidf would help the memory issues
ui_reader = Reader(line_format='user item rating', sep=',', skip_lines=1)
ui = Dataset.load_from_file('data/ui_ratings.csv', ui_reader)

In [18]:
# drop the references so the unused reader and dataset can be garbage-collected
ui_reader = None
ui = None

### Recipe Reference

In [11]:
# als_preds.get_neighbors(67888, 10) # backyard style barbecued ribs

In [None]:
ingID = pd.read_pickle('../data/ingr_map.pkl')

In [55]:
pd.set_option('display.max_rows', 50)

In [None]:
# Meats, Fried & Barbecue
# 63986 — chicken lickin good pork chops # 19

# Vegetable & Vegan
# 132263 — 5 minute vegan pancakes # 482

# Desserts
# 52035 — oreo balls # 354
# 46877 — uncle bill s whipped shortbread cookies # 315
# 42198 — better than sex strawberries # 5
# 52804 — jiffy extra moist carrot cake # 8

# Pasta/Italian
# 22176 — classic baked ziti # 316

# Asian
# 48760 — szechuan noodles with spicy beef sauce # 315
# 103215 — panda express orange chicken # 347

# Fish
# 53914 — mama s supper club tilapia parmesan # 472

# Fried
# 108364 — southern fried chicken # 323

In [56]:
idf.recipe_id.value_counts(sort=True, ascending=False)[50:100]

132263    482
76930     473
53914     472
15411     465
8701      461
73825     455
90674     441
15242     431
32844     428
80156     428
261889    427
200296    424
63828     411
31128     410
71373     404
70165     400
33921     396
15865     389
26370     388
77585     381
34382     378
33489     368
9836      368
27144     367
114392    366
27210     360
106251    357
66596     357
52035     354
43023     349
69630     349
63786     349
205890    348
103215    347
128956    344
82925     344
26217     344
46922     333
95569     330
47195     325
15072     325
349246    325
16531     324
108364    323
8782      323
3470      322
107997    321
22176     316
48760     315
46877     315
Name: recipe_id, dtype: int64

In [65]:
rdf[rdf.id == 108364]

Unnamed: 0,name,id,minutes,contributor_id,submitted,tags,nutrition,n_steps,steps,description,ingredients,n_ingredients,kcal,fat,sugar,salt,protein,sat_fat,carbs
192373,southern fried chicken,108364,40.0,25455,2005-01-15,"['60-minutes-or-less', 'time-to-make', 'main-ingredient', 'cuisine', 'preparation', 'north-american', 'poultry', 'american', 'southern-united-states', 'chicken', 'deep-fry', 'stove-top', 'dietary'...","[797.0, 61.0, 4.0, 125.0, 110.0, 57.0, 16.0]",18,"['heat peanut oil in a large deep pot to 350f', 'for sauce mixture: in a medium-sized bowl , beat the eggs with the water', 'add hot sauce and whisk together well', 'pour this mixture into a large...","thanks paula deen (foodtv) for the best fried chicken i have ever made! it has a nice, savory flavor and a coating that has just the right amount of crispiness. it was a good thing i had an extra ...","['chicken', 'eggs', 'water', 'hot sauce', 'salt', 'fresh ground black pepper', 'garlic powder', 'all-purpose flour', 'baking powder']",9,797.0,61.0,4.0,125.0,110.0,57.0,16.0


In [66]:
idf[idf.recipe_id == 108364]

Unnamed: 0,user_id,recipe_id,date,rating,review
472347,33588,108364,2005-02-16,5,"Great chicken,very tasty!It wasn't hot,but nice & crispy.Thanks Linda"
472348,9579,108364,2005-05-15,5,OH MY GOODNESS! What a wonderful recipe. Leave it to Paula to come up with this. I couldn't quit eating chicken. This will be the only recipe I use from now on. Thanks for the recipe. :)
472349,221044,108364,2005-06-20,5,"*Perfect* Sharlene! I only wish that my Philips 6161 deep fryer had more finite temperature control settings...The options were 320 or 360 >8-(. Was a *BIT* too dark @ 10 minutes, but am confide..."
472350,191050,108364,2005-09-09,5,This was so great! This was only my second time frying chicken and all I had was a bag of tenderloins and they came out excellent! DH said I could make em' anytime. They were almost like little...
472351,254076,108364,2005-10-21,5,I have officially replaced my fried chicken recipe. I made this last night for dinner and it was delicious!!! From now on when I make fried chicken I will be using this recipe. I only altered o...
...,...,...,...,...,...
472826,2000533380,108364,2018-02-10,0,I am thinking about trying this recipe tomorrow. I am hoping it turns out like home fried chicken from my childhood. Wish me luck.
472827,2001996561,108364,2018-02-17,0,"Almost perfect, but I would definitely use less salt."
472828,2002043471,108364,2018-03-11,5,Made this tonight and it was delish. This is a keeper. The chicken was crispy on the outside and moist inside.
472829,2002079430,108364,2018-03-30,5,So good and so easy!


In [43]:
idf[idf.recipe_id == 137739]

Unnamed: 0,user_id,recipe_id,date,rating,review
927061,4470,137739,2006-02-18,5,"I used an acorn squash and recipe#137681 Sweet Mexican spice blend. Only used 1 tsp honey & 1 tsp butter between both halves,, sprinkled the squash liberally with the spice mix. Baked covered for..."
927062,593927,137739,2010-08-21,5,This was a nice change. I used butternut squash and the sweet option using a good local honey and unsalted butter. I did not add salt. We ate this on top of recipe#322603 with Balkan yogurt. I may...
927063,178427,137739,2011-12-05,5,Excellent recipe! I used butternut squash and the sweet option. The mexican spice mix put this over the top. Thanks for sharing.


In [39]:
idf[idf.recipe_id == 63986] # chicken lickin good pork chops # 19

Unnamed: 0,user_id,recipe_id,date,rating,review
594378,60992,63986,2003-06-19,5,I made this for dinner tonight and the chops were tasty and fork tender! If you are salt sensitive (I am not) I would cut back on the salt and look for low sodium soup. I couldn't resist adding a ...
594379,95743,63986,2004-08-06,5,"A big hit with the meat-and-potatoes guy in the house. Served the chops with mashed potatoes. Be sure you don't overcook the chops if yours aren't 1"" thick. Thanks, Chuck."
594380,37471,63986,2004-10-02,5,Wow....great crockpot recipe!! The chops were very tasty and so tender! Many thanks!
...,...,...,...,...,...
594394,475397,63986,2010-05-01,5,Love this recipe... easy and the pork chops fall apart!! Added sliced carrots to the bottom of the crock pot and served over mashed potatoes. I also added more dry mustard and ground pepper to t...
594395,451605,63986,2012-02-17,5,"THANK YOU!! This recipe is to die for! MANY years ago I came across this recipe (for my crockpot) and prepared it frequently as it was such a hit. A thousand moves later & with recipes lost, I've ..."
594417,2838993,63986,2018-07-29,0,"One of the best old crockpot recipes! I always layer the browned/floured and seasoned chops with thinly sliced potatoes and onions. Also, you can substitute the soup with any kind of chicken broth..."


In [37]:
idf[idf.recipe_id == 42198] # better than sex strawberries

Unnamed: 0,user_id,recipe_id,date,rating,review
455924,88342,42198,2003-08-04,0,"You shouldn't be recommending eating raw eggs without a serious warning. The volumn of the ""BOX"" of sugar wuld be helpfull too."
455925,37229,42198,2004-07-07,5,"Very yummy! I didn't realize my 8x8 pan was dirty so I had to use a slightly smaller dish and less strawberries and cream as a result, but it turned out wonderful! I took it to my inlaws for 4th o..."
455926,35526,42198,2004-08-02,4,I used a half a box of sugar and it was plenty sweet. Not sure what happened but alot of the vanilla wafer crumbs never got wet enough to be a part of the dessert. Loved the taste though as did ...
455927,195984,42198,2005-02-15,5,I have looked for years for this recipe. My aunt made it when I was a child and I loved it. I thought it was called French Strawberry Pudding. I tried to make it from the memory of watching her. I...
455928,545717,42198,2008-05-17,5,"Delicious! The only change I made was that I used egg substitute in place of the eggs, it worked fine. Thank you for sharing this refreshing dessert."


In [35]:
idf[idf.recipe_id == 52804] # jiffy extra moist carrot cake

Unnamed: 0,user_id,recipe_id,date,rating,review
1023556,64642,52804,2003-03-23,5,"A lovely, light little cake just perfect for the two of us. I used the juice from the pineapple for the water. Very easy and quick to make. Will be making it again."
1023557,45999,52804,2004-06-09,5,"A very easy and quick cake to make. I, too, used the juice from the pineapple in place of the water. This cake was extremely moist and enjoyed by all. I iced it with a cream cheese icing."
1023558,38218,52804,2004-07-11,5,"Lovely cake! I also used the pineapple juice. I haven't seen a Jiffy cake in years. Corn bread, yes, but the stores in my area don't stock Jiffy CAKES. I used half a regular Betty Crocker mix...."
...,...,...,...,...,...
1023561,469437,52804,2007-10-22,5,We made this in the rice cooker and it was great! It took about 1.5 cycles of the cooker to get the doneness we wanted. I also used coconut cream pudding mix because we didn't have vanilla. We top...
1023562,422893,52804,2012-04-09,5,"Yum, yum, this is definitely lovely & moist & tastes great, really easy to make too. I got 6 cupcakes & one round cake from this mix, subbed flaked almonds for the pecans and left out the nutmeg o..."
1023563,2676919,52804,2013-02-04,5,Thanks for the Carrot cake rice cooker recipe! I've been looking for it for a long time and I will definitely try it. :)


### Package Installations

In [2]:
# import boto3, re, sys, os, time, math, csv, json, pickle, sagemaker, urllib.request
# from os import system
# %conda install --yes --prefix {sys.prefix} -c conda-forge scikit-surprise

# Note: In AWS SageMaker, this occasionally throws an 'EnvironmentLocationNotFound' error.
# This occurs when 'sys' has not yet been imported, so {sys.prefix} reaches conda unexpanded

#   > full error:
#     EnvironmentLocationNotFound: Not a conda environment: /home/ec2-user/SageMaker/recipe-book/{sys.prefix}
#
#   > anaconda3 envs path located at:
#     /home/ec2-user/anaconda3/envs/ 
#        * (might be preceded by '~')
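
The literal `{sys.prefix}` in the error path above suggests the placeholder was never filled in. One way to avoid relying on the notebook to expand it is to build the command in plain Python. This is only a sketch: `conda_install_command` is a hypothetical helper, and actually running the command assumes `conda` is on the PATH.

```python
import sys

def conda_install_command(package, channel="conda-forge"):
    """Build a conda install command that targets the running kernel's environment."""
    return ["conda", "install", "--yes", "--prefix", sys.prefix, "-c", channel, package]

cmd = conda_install_command("scikit-surprise")

# To run it (assumes `conda` is on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Because `sys.prefix` is read inside the function, the path is always the environment of the interpreter that is executing the cell.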

### Dataset Filenames in `data/`:  

ingr_map.pkl  
  
interactions_test.csv  
interactions_train.csv  
interactions_validation.csv  
  
PP_recipes.csv  
PP_users.csv  
  
RAW_interactions.csv  
RAW_recipes.csv  

test.json  
train.json  

In [None]:
# rdf = pd.read_csv('data/RAW_recipes.csv')
# idf = pd.read_csv('data/RAW_interactions.csv')

In [8]:
# pp_rdf = pd.read_csv('data/PP_recipes.csv')
# pp_idf = pd.read_csv('data/PP_users.csv')

# ingID = pd.read_pickle('data/ingr_map.pkl')

# itrain_df = pd.read_csv('data/interactions_train.csv')
# ival_df = pd.read_csv('data/interactions_validation.csv')

I used the code below to download the large CSV files into this notebook instance's `data/` folder.  
Running it again should not be necessary; do so only if loading the data into a DataFrame below raises an error.

In [6]:
# try:
#     urllib.request.urlretrieve ('https://sagemaker-studio-t1ems8mtnoj.s3.us-east-2.amazonaws.com/RAW_interactions.csv', 'data/RAW_interactions.csv')
#     print('Success: downloaded RAW_interactions.csv.')
# except Exception as e:
#     print('Data load error: ',e)

# try:
#     idf = pd.read_csv('data/RAW_interactions.csv',index_col=0)
#     print('Success: Data loaded into dataframe.')
# except Exception as e:
#     print('Data load error: ',e)

# try:
#     urllib.request.urlretrieve ('https://sagemaker-studio-t1ems8mtnoj.s3.us-east-2.amazonaws.com/PP_recipes.csv', 'data/PP_recipes.csv')
#     print('Success: downloaded PP_recipes.csv.')
# except Exception as e:
#     print('Data load error: ',e)
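
Since the downloads only need to happen once, a small guard can make the cell safe to re-run. The sketch below skips the download when the destination file already exists. `fetch_if_missing` is a hypothetical helper, not something used elsewhere in this notebook.

```python
import os
import urllib.request

def fetch_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(dest):
        return False
    parent = os.path.dirname(dest)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create data/ on first use
    urllib.request.urlretrieve(url, dest)
    return True
```

It would be called with the same URL and destination pairs as the commented-out cell above, for example the `RAW_interactions.csv` URL with `'data/RAW_interactions.csv'`.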

Some of my favorite plots are kept below for reference.

In [29]:
### REFERENCE — My PLOT USING HUE from King County Project


# x = hue_df.date.value_counts().sort_index()
# sns.set_palette('Pastel1')#, 1) 
# fig, ax = plt.subplots(figsize=(15,6))
# sns.lineplot(x=hue_df.date, 
#              y=hue_df.date.value_counts(), 
#              hue=hue_df['five']);



# fig, ax = plt.subplots(figsize=(15,6))
# ax.set_title('Density of Properties at Different Prices \n Waterfront vs. Non-waterfront', size = 18)

# sns.lineplot(data=df, x=x.index, y=x[x.five ==1].values, 
#              color = 'y', label = 'Rating = 5', kde=True, stat='density', ax = ax)
# sns.lineplot(data=df, x=x.index, y=x[x.five ==0].values, 
#              color = 'b', label = 'Ratings\nrange 0-4', bins = 30, kde=True, stat='density', ax = ax)
# ax.legend(labels=['Rating = 5', 'Ratings\nrange 0-4']);

# Resources

## *Food.com Recipes and Interactions* — Full Description


### Source

> [Originally sourced on Kaggle](https://www.kaggle.com/shuyangli94/food-com-recipes-and-user-interactions/discussion/121778), provided by [Shuyang Li](https://www.kaggle.com/shuyangli94)

### Context

This dataset consists of 180K+ recipes and 700K+ recipe reviews covering 18 years of user interactions and uploads on *Food.com* (formerly *GeniusKitchen*). It was used in the paper listed below under **Acknowledgements**.

### Content

This dataset contains three sets of data from *Food.com*:

**Interaction splits**

- `interactions_test.csv`
- `interactions_validation.csv`
- `interactions_train.csv`

**Raw data**

- `RAW_interactions.csv`
- `RAW_recipes.csv`
  
**Preprocessed data for result reproduction**  
In this format, the recipe text metadata is tokenized via the GPT subword tokenizer, with special tokens such as start-of-step added.

- `PP_recipes.csv`
- `PP_users.csv`
  
  To convert these files into the `pickle` format required to run our code off-the-shelf, load each CSV with `pandas.read_csv` and write it back out with `pandas.to_pickle`.
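
The conversion described above can be sketched in a few lines. `csv_to_pickle` is a hypothetical helper name, and the output path in the usage note is only an example, not a filename the dataset authors specify.

```python
import pandas as pd

def csv_to_pickle(csv_path, pickle_path):
    """Read a CSV into a DataFrame and write that DataFrame out as a pickle."""
    df = pd.read_csv(csv_path)
    df.to_pickle(pickle_path)
    return df
```

For example, `csv_to_pickle('data/PP_recipes.csv', 'data/PP_recipes.pkl')` would write a pickle next to the original CSV, which can then be loaded back with `pd.read_pickle`.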

### Acknowledgements

>*Generating Personalized Recipes from Historical User Preferences*  
Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley  
EMNLP, 2019  
https://www.aclweb.org/anthology/D19-1613/

### Citation

> Shuyang Li, *Food.com Recipes and Interactions* (2019),  
doi:10.34740/KAGGLE/DSV/783630.