## Personalized Food Recommendations

### Introduction

Personalize food recommendations with a content-based recommender: for existing users, display foods similar to the ones they liked in the past; for new users, display the top-selling items.


### Dataset Preparation

Capture each customer's past behaviour and use it to recommend foods the customer is likely to buy. Collect the food details and descriptions needed to calculate the similarity between items.

### Importing the libraries

In [1]:
import pandas as pd
import numpy as np

In [2]:
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics.pairwise import linear_kernel, cosine_similarity

### Import Dataset

In [3]:
#fd = pd.read_csv('C:\\Users\\admin\\Desktop\\newdata.csv')

fd = pd.read_csv('C:\\Users\\admin\\Desktop\\latest.csv',encoding = "ISO-8859-1")
fd.head()

Unnamed: 0,User ID,Food Name,Overview,Rating,Total Quantity Sales Per day
0,1000,Idli,savoury rice cake South Indian dish made with ...,4.2,1200
1,1001,Dosa,type of pancake South Indian dish made with Ri...,3.5,1000
2,1002,Granola Bars,Deserts made with grains,4.0,100
3,1003,Pasta,staple food of Italian cuisine,3.0,200
4,1004,South Full Meals,South Indian meals for perfect lunch,5.0,4000


In [4]:
fd.columns

Index(['User ID', 'Food Name', 'Overview', 'Rating',
       'Total Quantity Sales Per day'],
      dtype='object')

### New User : Display Top Sellers

In [5]:
#Top sellers
fd.nlargest(5,'Total Quantity Sales Per day')

Unnamed: 0,User ID,Food Name,Overview,Rating,Total Quantity Sales Per day
13,1013,Tea,aromatic beverage,5.0,5000
14,1014,Coffee,Beverage best for refreshing,1.0,5000
4,1004,South Full Meals,South Indian meals for perfect lunch,5.0,4000
11,1011,Chicken Briyani,delicious savory rice dish that is loaded with...,5.0,4000
25,1025,Butter Naan,"Famous Indian Butter Naan, great with curries",1.0,1600
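
In the output above, Coffee and Butter Naan rank among the top sellers despite a rating of 1.0. One possible refinement, sketched here as an assumption and not part of the original notebook, is to filter on a minimum rating and break sales ties by rating. The helper `top_sellers` and its `min_rating` parameter are hypothetical; the rows are copied from the output above.

```python
import pandas as pd

# The five rows shown in the top-sellers output above.
top_rows = pd.DataFrame({
    "Food Name": ["Tea", "Coffee", "South Full Meals", "Chicken Briyani", "Butter Naan"],
    "Rating": [5.0, 1.0, 5.0, 5.0, 1.0],
    "Total Quantity Sales Per day": [5000, 5000, 4000, 4000, 1600],
})


def top_sellers(df, n=5, min_rating=0.0):
    """Return the n best sellers rated at least min_rating (hypothetical helper)."""
    eligible = df[df["Rating"] >= min_rating]
    # Sort by sales first; ties on sales are broken by the higher rating.
    return eligible.nlargest(n, ["Total Quantity Sales Per day", "Rating"])


well_rated = top_sellers(top_rows, n=3, min_rating=3.0)
print(well_rated["Food Name"].tolist())
```

With `min_rating=0.0` the function keeps every row, so it only differs from the original call in how ties on sales are ordered.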


### Existing User : Recommend foods that are similar to the ones liked by the user in the past 

Build a content-based recommender from the food descriptions and a cosine similarity matrix.

1. Term Frequency (TF) and Inverse Document Frequency (IDF) are used to determine the relative importance of a word in a document

In [7]:
# min_df=1 (the default) keeps every term; recent scikit-learn releases reject the integer 0
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix = tf.fit_transform(fd['Overview'])

In [8]:
tfidf_matrix.shape

(27, 199)
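
To make the weighting concrete, here is a minimal pure-Python sketch of TF-IDF on three made-up descriptions. It assumes whitespace tokenization, unigrams only, no stop-word removal, and the smoothed IDF `ln((1 + n) / (1 + df)) + 1` followed by L2 normalization, which is what scikit-learn documents as its default. The function `tfidf_vectors` is a hypothetical name used only for illustration.

```python
import math
from collections import Counter

# Made-up toy descriptions, not rows from the dataset.
docs = [
    "south indian rice cake",
    "south indian rice pancake",
    "italian staple food",
]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for whitespace-tokenized docs."""
    tokenized = [d.split() for d in docs]
    n = len(tokenized)
    vocab = sorted({t for doc in tokenized for t in doc})
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        vectors.append([v / norm for v in vec])
    return vocab, vectors


vocab, vectors = tfidf_vectors(docs)
weights = dict(zip(vocab, vectors[0]))
print({term: round(w, 3) for term, w in weights.items() if w > 0})
```

In the first description, "cake" occurs in only one document and therefore receives a larger weight than "rice", which occurs in two.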

2. Cosine similarity yields a numeric score that denotes how similar two food items are

In [9]:
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)

In [10]:
cosine_sim[0]

array([1.        , 0.50693264, 0.        , 0.        , 0.13568055,
       0.06888957, 0.31028129, 0.        , 0.18743304, 0.0532044 ,
       0.13568055, 0.09894157, 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.02636735, 0.        , 0.02636735, 0.        ,
       0.02657496, 0.        ])

Now we have a pairwise cosine similarity matrix for all the food items in our dataset.
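
The cell above calls `linear_kernel`, which computes plain dot products. These coincide with cosine similarities here because `TfidfVectorizer` L2-normalizes each row under its default `norm='l2'` setting. A quick check on made-up descriptions (the `toy_*` names are illustrative only):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Made-up descriptions, loosely modelled on the Overview column.
toy_overviews = [
    "savoury rice cake south indian dish",
    "type of pancake south indian dish made with rice",
    "staple food of italian cuisine",
]

toy_matrix = TfidfVectorizer(stop_words="english").fit_transform(toy_overviews)

dot_products = linear_kernel(toy_matrix, toy_matrix)
cosines = cosine_similarity(toy_matrix, toy_matrix)

# Rows already have unit length, so both matrices agree.
kernels_match = np.allclose(dot_products, cosines)
print(kernels_match)
```

If the vectorizer were created with `norm=None`, the two functions would generally disagree and `cosine_similarity` would be the one to use.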

In [11]:
fd = fd.reset_index()
titles = fd['Food Name']
indices = pd.Series(fd.index, index=fd['Food Name'])

In [12]:
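def get_recommendations(title):
    # Look up the row position of the requested food (raises KeyError if unknown)
    idx = indices[title]
    # Pair every food's position with its similarity to the requested food
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # Skip the first entry (normally the food itself, similarity 1) and keep the next nine
    sim_scores = sim_scores[1:10]
    food_indices = [i[0] for i in sim_scores]
    return titles.iloc[food_indices]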
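
`get_recommendations` fails with a `KeyError` for a food name that is not in `indices`, and it relies on the queried food sorting first. A more defensive variant might look like the sketch below. `recommend_similar`, its parameters, and the toy similarity values are hypothetical and not part of the notebook.

```python
import numpy as np
import pandas as pd


def recommend_similar(title, names, sim_matrix, top_n=5):
    """Return up to top_n names most similar to title; empty Series if title is unknown."""
    names = pd.Series(list(names))
    matches = names[names == title].index
    if len(matches) == 0:
        return pd.Series(dtype=object)
    idx = matches[0]
    scores = pd.Series(np.asarray(sim_matrix)[idx], index=names.index)
    # Drop the queried item by position instead of assuming it sorts first.
    scores = scores.drop(idx)
    top = scores.sort_values(ascending=False).head(top_n)
    return names.loc[top.index]


# Made-up similarity matrix for four foods.
toy_names = ["Idli", "Dosa", "Pasta", "Tea"]
toy_sim = [
    [1.0, 0.5, 0.0, 0.1],
    [0.5, 1.0, 0.2, 0.0],
    [0.0, 0.2, 1.0, 0.3],
    [0.1, 0.0, 0.3, 1.0],
]

print(recommend_similar("Idli", toy_names, toy_sim, top_n=2).tolist())
print(len(recommend_similar("Unknown Dish", toy_names, toy_sim)))
```

Passing the similarity matrix as an argument also avoids depending on a global `cosine_sim` that later cells reassign.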

### Top-5 approach : Recommend the Top 5 similar products

In [13]:
get_recommendations('Chapati').head(5)

23       Kulcha
22    Roti Dhal
1          Dosa
3         Pasta
24         Naan
Name: Food Name, dtype: object

In [14]:
get_recommendations('Gulab Jamun').head(5)

2     Granola Bars
18      Kaju Katli
26         Basundi
0             Idli
1             Dosa
Name: Food Name, dtype: object

### Use a Count Vectorizer to create count matrix

In [15]:
# min_df=1 (the default) keeps every term; recent scikit-learn releases reject the integer 0
count = CountVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
count_matrix = count.fit_transform(fd['Overview'])

In [16]:
cosine_sim = cosine_similarity(count_matrix, count_matrix)

In [17]:
cosine_sim[0]

array([1.        , 0.63628476, 0.        , 0.        , 0.22941573,
       0.13834289, 0.48420012, 0.        , 0.34585723, 0.11128298,
       0.22941573, 0.17770466, 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.06362848, 0.        , 0.06362848, 0.        ,
       0.06917145, 0.        ])
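
Raw counts are not length-normalized, which is presumably why this variant switches from `linear_kernel` to `cosine_similarity`. The sketch below, on made-up descriptions, shows that the plain dot product of count vectors is not bounded by 1, while the cosine similarity is.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Made-up descriptions; "rice" is repeated to give the first row a larger norm.
toy_docs = ["rice rice cake", "rice pancake", "italian pasta"]

toy_counts = CountVectorizer().fit_transform(toy_docs)

raw_dots = linear_kernel(toy_counts, toy_counts)    # unnormalized dot products
cos_sims = cosine_similarity(toy_counts, toy_counts)  # dot products divided by the row norms

print(raw_dots[0, 0], cos_sims[0, 0])
print(raw_dots[0, 1], round(cos_sims[0, 1], 3))
```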

In [18]:
# Same logic as get_recommendations; cosine_sim now holds the count-based matrix from In [16]
def get_recommendations1(title):
    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    sim_scores = sim_scores[1:10]
    food_indices = [i[0] for i in sim_scores]
    return titles.iloc[food_indices]

### Top-5 approach : Recommend the Top 5 similar products (Count Vectorizer)

In [19]:
get_recommendations1('Chapati').head(5)

23       Kulcha
1          Dosa
22    Roti Dhal
3         Pasta
24         Naan
Name: Food Name, dtype: object

In [20]:
get_recommendations1('Gulab Jamun').head(5)

18      Kaju Katli
2     Granola Bars
26         Basundi
0             Idli
1             Dosa
Name: Food Name, dtype: object

### Limitations

This model relies on the descriptions of the food products, so the dataset must be carefully curated in order for the model to work well.

### Enhancements

This model can be enhanced by collecting user feedback on food items and taking ratings into account.
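
As one possible way to take ratings into account, the similarity scores could be blended with ratings before ranking. The sketch below is an assumption, not part of the notebook: `rerank_with_rating`, the 0.7 weight, and all similarity and rating values are invented for illustration.

```python
import pandas as pd


def rerank_with_rating(sim_scores, ratings, weight=0.7, max_rating=5.0):
    """Blend similarity (0..1) with rating rescaled to 0..1; weight is a hypothetical choice."""
    blended = weight * sim_scores + (1 - weight) * (ratings / max_rating)
    return blended.sort_values(ascending=False)


# Invented similarity scores and ratings for three candidate foods.
sims = pd.Series({"Kulcha": 0.40, "Dosa": 0.35, "Naan": 0.30})
ratings = pd.Series({"Kulcha": 2.0, "Dosa": 3.5, "Naan": 5.0})

ranked = rerank_with_rating(sims, ratings)
print(ranked.index.tolist())
```

With these invented numbers the highest-rated item moves to the top even though it has the lowest similarity; a larger `weight` would keep the ranking closer to pure similarity.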