# Recipe Recommender System Based on the Input Food's Ingredients

The Recipe Recommender is a system that takes the name of a food the user has previously eaten or prefers and presents other foods that are similar to it in their ingredients.
_________________

## Overview 

The recommender system uses content-based filtering. Collaborative filtering was the other option, but it was not selected because this project does not store user preferences or personally identifiable data. A more complete system would integrate both approaches for better recommendations.

source: https://www.kaggle.com/code/yyzz1010/content-based-filtering-recipe-recommender 
source: https://www.kaggle.com/code/sagarbapodara/movie-recommendation-system-web-app 
source: https://www.youtube.com/watch?v=ijtxuF_5kEU&ab_channel=AISciences
source: https://www.geeksforgeeks.org/sklearn-feature-extraction-with-tf-idf/

## Importing libraries

In [2]:
import os
import numpy as np
import pandas as pd
import pickle
import sklearn
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics.pairwise import linear_kernel
from sklearn.feature_extraction.text import TfidfVectorizer
import warnings
import ast
import nltk 
from nltk.stem.porter import PorterStemmer
import json

## Importing dataset

In [3]:
foodData = pd.read_csv('Food Ingredients and Recipe Dataset with Image Name Mapping.csv')
foodData.info()  # overview of data types and non-null counts
foodData  # display the first and last rows to confirm the data loaded properly

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13501 entries, 0 to 13500
Data columns (total 6 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   Unnamed: 0           13501 non-null  int64 
 1   Title                13496 non-null  object
 2   Ingredients          13501 non-null  object
 3   Instructions         13493 non-null  object
 4   Image_Name           13501 non-null  object
 5   Cleaned_Ingredients  13501 non-null  object
dtypes: int64(1), object(5)
memory usage: 633.0+ KB


,Unnamed: 0,Title,Ingredients,Instructions,Image_Name,Cleaned_Ingredients
0,0,Miso-Butter Roast Chicken With Acorn Squash Pa...,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...","Pat chicken dry with paper towels, season all ...",miso-butter-roast-chicken-acorn-squash-panzanella,"['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher..."
1,1,Crispy Salt and Pepper Potatoes,"['2 large egg whites', '1 pound new potatoes (...",Preheat oven to 400°F and line a rimmed baking...,crispy-salt-and-pepper-potatoes-dan-kluger,"['2 large egg whites', '1 pound new potatoes (..."
2,2,Thanksgiving Mac and Cheese,"['1 cup evaporated milk', '1 cup whole milk', ...",Place a rack in middle of oven; preheat to 400...,thanksgiving-mac-and-cheese-erick-williams,"['1 cup evaporated milk', '1 cup whole milk', ..."
3,3,Italian Sausage and Bread Stuffing,"['1 (¾- to 1-pound) round Italian loaf, cut in...",Preheat oven to 350°F with rack in middle. Gen...,italian-sausage-and-bread-stuffing-240559,"['1 (¾- to 1-pound) round Italian loaf, cut in..."
4,4,Newton's Law,"['1 teaspoon dark brown sugar', '1 teaspoon ho...",Stir together brown sugar and hot water in a c...,newtons-law-apple-bourbon-cocktail,"['1 teaspoon dark brown sugar', '1 teaspoon ho..."
...,...,...,...,...,...,...
13496,13496,Brownie Pudding Cake,"['1 cup all-purpose flour', '2/3 cup unsweeten...",Preheat the oven to 350°F. Into a bowl sift to...,brownie-pudding-cake-14408,"['1 cup all-purpose flour', '2/3 cup unsweeten..."
13497,13497,Israeli Couscous with Roasted Butternut Squash...,"['1 preserved lemon', '1 1/2 pound butternut s...",Preheat oven to 475°F.\nHalve lemons and scoop...,israeli-couscous-with-roasted-butternut-squash...,"['1 preserved lemon', '1 1/2 pound butternut s..."
13498,13498,Rice with Soy-Glazed Bonito Flakes and Sesame ...,['Leftover katsuo bushi (dried bonito flakes) ...,"If using katsuo bushi flakes from package, moi...",rice-with-soy-glazed-bonito-flakes-and-sesame-...,['Leftover katsuo bushi (dried bonito flakes) ...
13499,13499,Spanakopita,['1 stick (1/2 cup) plus 1 tablespoon unsalted...,Melt 1 tablespoon butter in a 12-inch heavy sk...,spanakopita-107344,['1 stick (1/2 cup) plus 1 tablespoon unsalted...


## Pre-processing of dataset

The overview shows that the ingredient lists are stored as strings (stringified lists) and that the dataset contains an unnecessary column, Unnamed: 0,
which has to be dropped before the data is fed to the ML algorithm in order to get consistent recommendations.
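
As an illustration of what a stringified list means here, the following minimal sketch parses one such value with `ast.literal_eval` from the standard library. This is an alternative to the string-splitting used in the notebook, not the notebook's own method; the helper name `parse_ingredients` and the sample string are hypothetical.

```python
import ast

def parse_ingredients(raw):
    """Parse a stringified Python list such as "['a', 'b']" into a real list."""
    parsed = ast.literal_eval(raw)  # safely evaluates literals only
    return [item.strip() for item in parsed]

# hypothetical sample in the same shape as the dataset's ingredient columns
raw = "['2 large egg whites', '1 cup peeled, seeded watermelon']"
items = parse_ingredients(raw)
joined = ", ".join(items)  # plain comma-separated string for the vectorizer

print(items)
print(joined)
```

Parsing the literal keeps ingredients that contain commas as single items, which a plain split on `,` cannot do.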

In [4]:
foodData.isnull().sum() # get count of missing values

Unnamed: 0             0
Title                  5
Ingredients            0
Instructions           8
Image_Name             0
Cleaned_Ingredients    0
dtype: int64

In [5]:
# Drop all rows that contain missing values
foodData.dropna(inplace=True)

# Drop the unnecessary column
foodData.pop('Unnamed: 0')

# Define a lambda function that turns the stringified list into a plain comma-separated string
clean_ingredients = lambda x: ', '.join([ingredient.strip("' ") for ingredient in x.strip('[]').split(',')])

# Apply the lambda function to the 'Cleaned_Ingredients' column
foodData['Cleaned_Ingredients'] = foodData['Cleaned_Ingredients'].apply(clean_ingredients)

# Reorder the columns
columns = foodData.columns.tolist()
columns = columns[:1] + columns[-1:] + columns[1:-1]
foodData = foodData[columns]

# Print the modified DataFrame
print(foodData.head(5))

                                               Title  \
0  Miso-Butter Roast Chicken With Acorn Squash Pa...   
1                    Crispy Salt and Pepper Potatoes   
2                        Thanksgiving Mac and Cheese   
3                 Italian Sausage and Bread Stuffing   
4                                       Newton's Law   

                                 Cleaned_Ingredients  \
0  1 (3½–4-lb.) whole chicken, 2¾ tsp. kosher sal...   
1  2 large egg whites, 1 pound new potatoes (abou...   
2  1 cup evaporated milk, 1 cup whole milk, 1 tsp...   
3  1 (¾- to 1-pound) round Italian loaf, cut into...   
4  1 teaspoon dark brown sugar, 1 teaspoon hot wa...   

                                         Ingredients  \
0  ['1 (3½–4-lb.) whole chicken', '2¾ tsp. kosher...   
1  ['2 large egg whites', '1 pound new potatoes (...   
2  ['1 cup evaporated milk', '1 cup whole milk', ...   
3  ['1 (¾- to 1-pound) round Italian loaf, cut in...   
4  ['1 teaspoon dark brown sugar', '1 teaspoon

In [6]:
ps = PorterStemmer()  # reduces words to their root form, so plurals, verb forms
                      # and other variants of the same word are treated the same

def stem(text):
    # stem every whitespace-separated token and rebuild the string
    return " ".join(ps.stem(word) for word in text.split())

In [7]:
tfidf = TfidfVectorizer(stop_words="english")  # ignore common English filler (stop) words
foodData['Cleaned_Ingredients'] = foodData['Cleaned_Ingredients'].apply(stem)
# build the TF-IDF matrix (one row vector per recipe) used for the cosine similarity
tfidf_matrix = tfidf.fit_transform(foodData['Cleaned_Ingredients'])
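
To make the TF-IDF step more concrete, here is a minimal pure-Python sketch of the weighting on a toy corpus. It assumes raw term counts, the smoothed inverse document frequency ln((1 + n) / (1 + df)) + 1, and L2 row normalization, which match the documented defaults of `TfidfVectorizer`. Tokenization is simplified to whitespace splitting with no stop-word removal, so it is not a drop-in replacement. The function name `tfidf_vectors` and the three sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF rows) for whitespace-tokenised docs."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # document frequency: number of documents that contain each term
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # smoothed inverse document frequency
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenised:
        counts = Counter(toks)
        row = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in row)) or 1.0
        rows.append([w / norm for w in row])  # unit-length row
    return vocab, rows

# hypothetical toy "ingredient" documents
docs = ["sugar lime mint", "sugar flour cocoa", "lime mint watermelon"]
vocab, rows = tfidf_vectors(docs)
```

In the second document, `flour` appears in only one document and therefore receives a larger weight than `sugar`, which appears in two.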

In [8]:
foodData.pop('Cleaned_Ingredients')
# foodData.pop('Ingredients')
# foodData = foodData.rename(columns={"Cleaned_Ingredients": "Ingredients"})

0        1 (3½–4-lb.) whole chicken, 2¾ tsp. kosher sal...
1        2 larg egg whites, 1 pound new potato (about 1...
2        1 cup evapor milk, 1 cup whole milk, 1 tsp. ga...
3        1 (¾- to 1-pound) round italian loaf, cut into...
4        1 teaspoon dark brown sugar, 1 teaspoon hot wa...
                               ...                        
13496    1 cup all-purpos flour, 2/3 cup unsweeten coco...
13497    1 preserv lemon, 1 1/2 pound butternut squash,...
13498    leftov katsuo bushi (dri bonito flakes) from m...
13499    1 stick (1/2 cup) plu 1 tablespoon unsalt butt...
13500    12 medium to larg fresh poblano chile (2 1/4 l...
Name: Cleaned_Ingredients, Length: 13493, dtype: object

In [9]:
# pairwise similarity of every recipe against every other (equals cosine similarity for L2-normalized TF-IDF rows)
similarityCos = linear_kernel(tfidf_matrix, tfidf_matrix)
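
The reason a linear kernel (a plain dot product) can stand in for cosine similarity here is that the TF-IDF rows have unit length by default. A small sketch with two hypothetical vectors shows the equivalence; the helper names are illustrative only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # cosine similarity: dot product divided by the product of the lengths
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalise(v):
    # scale a vector to unit (L2) length
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

a = normalise([1.0, 2.0, 0.0])
b = normalise([2.0, 1.0, 1.0])

# for unit-length vectors the two measures agree up to rounding error
difference = abs(dot(a, b) - cosine(a, b))
print(difference < 1e-12)
```

If the vectors were not normalized, the dot product would also grow with vector length, and the two measures would no longer match.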

In [10]:
indices = pd.Series(foodData.index, index=foodData['Title'])  # map each title to its index label in the dataset
print(indices)

Title
Miso-Butter Roast Chicken With Acorn Squash Panzanella                     0
Crispy Salt and Pepper Potatoes                                            1
Thanksgiving Mac and Cheese                                                2
Italian Sausage and Bread Stuffing                                         3
Newton's Law                                                               4
                                                                       ...  
Brownie Pudding Cake                                                   13496
Israeli Couscous with Roasted Butternut Squash and Preserved Lemon     13497
Rice with Soy-Glazed Bonito Flakes and Sesame Seeds                    13498
Spanakopita                                                            13499
Mexican Poblano, Spinach, and Black Bean "Lasagne" with Goat Cheese    13500
Length: 13493, dtype: int64
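
Note that the output above lists 13493 entries while the labels run up to 13500: `indices` holds index labels, whereas the similarity matrix and `iloc` address rows by position. After rows are dropped, a label can differ from its position, so a lookup may return a shifted row. The pure-Python sketch below illustrates the distinction with hypothetical data. In pandas, one way to keep labels and positions aligned would be to call `reset_index(drop=True)` after `dropna`.

```python
# (label, title) pairs as they would look after dropping the row labelled 1
rows = [(0, "Soup"), (2, "Salad"), (3, "Stew")]

# positional 3x3 similarity matrix: row i belongs to rows[i]
similarity = [
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.1],
    [0.7, 0.1, 1.0],
]

title_to_label = {title: label for label, title in rows}
title_to_position = {title: pos for pos, (_, title) in enumerate(rows)}

def most_similar(title):
    """Look up by position, then return the title of the best other match."""
    pos = title_to_position[title]
    scores = similarity[pos]
    best = max((i for i in range(len(rows)) if i != pos), key=lambda i: scores[i])
    return rows[best][1]

print(title_to_label["Stew"], title_to_position["Stew"])
print(most_similar("Stew"))
```

Here the label of "Stew" is 3, which is out of range for the 3x3 matrix, while its position 2 addresses the right row.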


In [31]:
def recommendation(title, similarity=similarityCos):
    # index of the food name
    idx = indices[title]
    # pair each row position with its similarity score
    similarity_scores = list(enumerate(similarity[idx]))
    # sort by similarity score, in descending order
    sorted_similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    # keep the top 6 entries: normally the queried food itself followed by its 5 most similar items
    top_similar_indices = [score[0] for score in sorted_similarity_scores[0:6]]
    # get the recommended food data
    recommended_food_data = foodData.iloc[top_similar_indices]  
    # convert DataFrame to a list of dictionaries
    recommended_food_data_list = recommended_food_data.to_dict('records')
    return recommended_food_data_list

recommendation('Watermelon-Mint Agua Fresca')

[{'Title': 'Watermelon-Mint Agua Fresca',
  'Ingredients': "['1/4 cup (packed) fresh mint leaves', '1/4 cup sugar or agave syrup', '5 cups peeled, seeded, coarsely chopped watermelon (from about a 2 1/2-pound watermelon)', '1/4 cup fresh lime juice', 'Mint sprigs (for serving)']",
  'Instructions': 'Combine mint leaves, sugar, and 1/4 cup water in a small pot. Bring to a boil and stir until sugar has dissolved. Transfer mixture to a heatproof container and chill, uncovered, until cool, about 30 minutes.\nStrain mint syrup into a blender; discard mint leaves. Add watermelon and lime juice and blend until very smooth. Using a fine-mesh sieve, strain into a pitcher; discard solids. Add 2 cups water and stir well to combine. Serve with mint sprigs.\nAqua fresca can be stored in an airtight container and chilled for up to 1 day.',
  'Image_Name': 'watermelon-mint-agua-fresca-56389829'},
 {'Title': 'Watermelon Granita with Blueberries',
  'Ingredients': "['6 cups (about 2 pounds) seedless wa
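
As the output shows, the first entry returned is the queried recipe itself. If only other recipes are wanted, the selection step could skip the query position. The following is a minimal sketch of that idea using `heapq.nlargest` from the standard library, which also avoids sorting the full score list. The function name `top_k_similar` and the sample scores are hypothetical.

```python
import heapq

def top_k_similar(scores, query_pos, k=5):
    """Return the positions of the k highest scores, skipping the query itself."""
    candidates = ((score, pos) for pos, score in enumerate(scores) if pos != query_pos)
    return [pos for score, pos in heapq.nlargest(k, candidates)]

# hypothetical similarity row for the item at position 0
scores = [1.0, 0.3, 0.9, 0.0, 0.5]
result = top_k_similar(scores, query_pos=0, k=3)
print(result)
```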

In [11]:
with open('similarity.pkl', 'wb') as f:
    pickle.dump(similarityCos, f)
with open('foodData.pkl', 'wb') as f:
    pickle.dump(foodData, f)
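
The saved files can later be read back with `pickle.load`. The sketch below shows a round trip on a small hypothetical object written to a temporary directory, so it does not touch the files above. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.

```python
import os
import pickle
import tempfile

# hypothetical stand-in for the objects saved by the notebook
data = {"titles": ["Soup", "Salad"], "similarity": [[1.0, 0.2], [0.2, 1.0]]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.pkl")
    with open(path, "wb") as f:
        pickle.dump(data, f)  # write in binary mode
    with open(path, "rb") as f:
        restored = pickle.load(f)  # read back in binary mode

print(restored == data)
```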