# **`Zomato Recommendation System`**

## *Background*


Recommendation systems are crucial for online food apps as they enhance user experience and drive customer engagement. With a vast selection of food options, recommendation systems help users discover new restaurants and cuisines tailored to their preferences, saving them time and effort. By analyzing user behavior and preferences, recommendation systems can provide personalized recommendations, increasing user satisfaction and fostering customer loyalty. Additionally, recommendation systems can help online food apps optimize their operations by promoting specific restaurants or dishes based on popularity and user demand.

## *Library*

In [1]:
# Library
import numpy as np
import pandas as pd
import seaborn as sns 
import matplotlib.pyplot as plt

# Recommendation System
from sklearn.metrics import jaccard_score
from scipy.spatial.distance import pdist, squareform

In [2]:
# Column Setting

# pd.set_option('display.max_rows', None)
pd.reset_option('display.max_rows')

## *Load Dataset*

In [4]:
# DF

df = pd.read_csv("zomato.csv",encoding='latin-1')
df.head()

Unnamed: 0,url,address,name,online_order,book_table,rate,votes,phone,location,rest_type,dish_liked,cuisines,approx_cost(for two people),reviews_list,menu_item,listed_in(type),listed_in(city)
0,https://www.zomato.com/bangalore/jalsa-banasha...,"942, 21st Main Road, 2nd Stage, Banashankari, ...",Jalsa,Yes,Yes,4.1/5,775,080 42297555\r\n+91 9743772233,Banashankari,Casual Dining,"Pasta, Lunch Buffet, Masala Papad, Paneer Laja...","North Indian, Mughlai, Chinese",800,"[('Rated 4.0', 'RATED\n A beautiful place to ...",[],Buffet,Banashankari
1,https://www.zomato.com/bangalore/spice-elephan...,"2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ...",Spice Elephant,Yes,No,4.1/5,787,080 41714161,Banashankari,Casual Dining,"Momos, Lunch Buffet, Chocolate Nirvana, Thai G...","Chinese, North Indian, Thai",800,"[('Rated 4.0', 'RATED\n Had been here for din...",[],Buffet,Banashankari
2,https://www.zomato.com/SanchurroBangalore?cont...,"1112, Next to KIMS Medical College, 17th Cross...",San Churro Cafe,Yes,No,3.8/5,918,+91 9663487993,Banashankari,"Cafe, Casual Dining","Churros, Cannelloni, Minestrone Soup, Hot Choc...","Cafe, Mexican, Italian",800,"[('Rated 3.0', ""RATED\n Ambience is not that ...",[],Buffet,Banashankari
3,https://www.zomato.com/bangalore/addhuri-udupi...,"1st Floor, Annakuteera, 3rd Stage, Banashankar...",Addhuri Udupi Bhojana,No,No,3.7/5,88,+91 9620009302,Banashankari,Quick Bites,Masala Dosa,"South Indian, North Indian",300,"[('Rated 4.0', ""RATED\n Great food and proper...",[],Buffet,Banashankari
4,https://www.zomato.com/bangalore/grand-village...,"10, 3rd Floor, Lakshmi Associates, Gandhi Baza...",Grand Village,No,No,3.8/5,166,+91 8026612447\r\n+91 9901210005,Basavanagudi,Casual Dining,"Panipuri, Gol Gappe","North Indian, Rajasthani",600,"[('Rated 4.0', 'RATED\n Very good restaurant ...",[],Buffet,Banashankari


To build a simple recommendation system we only need the Restaurant ID, Restaurant Name, Cuisines, Aggregate rating, and Votes columns, so we drop the rest.

In [4]:
df.columns

Index(['Restaurant ID', 'Restaurant Name', 'Country Code', 'City', 'Address',
       'Locality', 'Locality Verbose', 'Longitude', 'Latitude', 'Cuisines',
       'Average Cost for two', 'Currency', 'Has Table booking',
       'Has Online delivery', 'Is delivering now', 'Switch to order menu',
       'Price range', 'Aggregate rating', 'Rating color', 'Rating text',
       'Votes'],
      dtype='object')

In [5]:
dfRS = df[['Restaurant ID','Restaurant Name','Cuisines','Aggregate rating','Votes']]
dfRS

Unnamed: 0,Restaurant ID,Restaurant Name,Cuisines,Aggregate rating,Votes
0,6317637,Le Petit Souffle,"French, Japanese, Desserts",4.8,314
1,6304287,Izakaya Kikufuji,Japanese,4.5,591
2,6300002,Heat - Edsa Shangri-La,"Seafood, Asian, Filipino, Indian",4.4,270
3,6318506,Ooma,"Japanese, Sushi",4.9,365
4,6314302,Sambo Kojin,"Japanese, Korean",4.8,229
...,...,...,...,...,...
9546,5915730,NamlÛ± Gurme,Turkish,4.1,788
9547,5908749,Ceviz AÛôacÛ±,"World Cuisine, Patisserie, Cafe",4.2,1034
9548,5915807,Huqqa,"Italian, World Cuisine",3.7,661
9549,5916112,Aôôk Kahve,Restaurant Cafe,4.0,901


## *Data Cleaning*

Gather information about every column

In [10]:
# Columns Description
def dataDesc():
    listItem = []
    for col in dfRS.columns :
        listItem.append(
            [col, 
            dfRS[col].dtype,
            dfRS[col].isna().sum(),
            round(dfRS[col].isna().sum()/len(dfRS)*100,2),
            dfRS[col].nunique(),
            list(dfRS[col].drop_duplicates().sample(2).values)]
        )
    descData = pd.DataFrame(data = listItem,
                            columns = ['Column','Data Type', 'Missing Value',
                                        'Pct Missing Value', 'Num Unique', 'Unique Sample'])
    return descData
    
dataDesc()

We detected missing values in the Cuisines column. We simply drop those rows, because a valid cuisine value is required to recommend by cuisine.

In [7]:
# Drop Missing Value

dfRS = dfRS.dropna()
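
Since only Cuisines has missing values, the drop can also be restricted to that column. The following is a minimal sketch on a toy frame; the names `toy` and `toy_clean` and all values are hypothetical, not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values) with one missing cuisine
toy = pd.DataFrame({
    'Restaurant Name': ['A', 'B', 'C'],
    'Cuisines': ['Japanese', np.nan, 'Cafe, Desserts'],
    'Votes': [10, 20, 30],
})

# Drop only the rows where 'Cuisines' is missing
toy_clean = toy.dropna(subset=['Cuisines'])
print(toy_clean['Restaurant Name'].tolist())
```

Restricting `subset` keeps rows that might have gaps in columns we do not rely on, which may matter if more columns are added later.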

In [8]:
dfRS

Unnamed: 0,Restaurant ID,Restaurant Name,Cuisines,Aggregate rating,Votes
0,6317637,Le Petit Souffle,"French, Japanese, Desserts",4.8,314
1,6304287,Izakaya Kikufuji,Japanese,4.5,591
2,6300002,Heat - Edsa Shangri-La,"Seafood, Asian, Filipino, Indian",4.4,270
3,6318506,Ooma,"Japanese, Sushi",4.9,365
4,6314302,Sambo Kojin,"Japanese, Korean",4.8,229
...,...,...,...,...,...
9546,5915730,NamlÛ± Gurme,Turkish,4.1,788
9547,5908749,Ceviz AÛôacÛ±,"World Cuisine, Patisserie, Cafe",4.2,1034
9548,5915807,Huqqa,"Italian, World Cuisine",3.7,661
9549,5916112,Aôôk Kahve,Restaurant Cafe,4.0,901


In [9]:
# Clean column names

dfRS = dfRS.rename(columns={
    'Restaurant ID': 'restaurant_id',
    'Restaurant Name': 'restaurant_name',
    'Cuisines': 'cuisines',
    'Aggregate rating': 'aggregate_rating',
    'Votes': 'votes',
})
dfRS

Unnamed: 0,restaurant_id,restaurant_name,cuisines,aggregate_rating,votes
0,6317637,Le Petit Souffle,"French, Japanese, Desserts",4.8,314
1,6304287,Izakaya Kikufuji,Japanese,4.5,591
2,6300002,Heat - Edsa Shangri-La,"Seafood, Asian, Filipino, Indian",4.4,270
3,6318506,Ooma,"Japanese, Sushi",4.9,365
4,6314302,Sambo Kojin,"Japanese, Korean",4.8,229
...,...,...,...,...,...
9546,5915730,NamlÛ± Gurme,Turkish,4.1,788
9547,5908749,Ceviz AÛôacÛ±,"World Cuisine, Patisserie, Cafe",4.2,1034
9548,5915807,Huqqa,"Italian, World Cuisine",3.7,661
9549,5916112,Aôôk Kahve,Restaurant Cafe,4.0,901


In this step we clean and standardize the column names (lowercase with underscores) using `rename`

In [10]:
# Check duplicate rows in General

dfRS.duplicated().sum()

0

No duplicated rows detected

In [11]:
# Check duplicate 'restaurant_name'

dfRS['restaurant_name'].duplicated().sum()

2105

In [12]:
dfRS['restaurant_name'].value_counts()

Cafe Coffee Day             83
Domino's Pizza              79
Subway                      63
Green Chick Chop            51
McDonald's                  48
                            ..
The Town House Cafe          1
The G.T. Road                1
The Darzi Bar & Kitchen      1
Smoke On Water               1
Walter's Coffee Roastery     1
Name: restaurant_name, Length: 7437, dtype: int64

Checking the restaurant names reveals 2105 duplicates. This occurs because the raw data lists the same restaurant name at multiple locations. Since we do not consider location here, we keep only one row per name: first we sort by rating, then we drop the duplicates.

In [13]:
dfRS = dfRS.sort_values(by=['restaurant_name','aggregate_rating'],ascending=False)
dfRS[dfRS['restaurant_name']=="Domino's Pizza"].head()

Unnamed: 0,restaurant_id,restaurant_name,cuisines,aggregate_rating,votes
3031,143,Domino's Pizza,"Pizza, Fast Food",3.7,336
1844,5065,Domino's Pizza,"Pizza, Fast Food",3.6,146
2448,15078,Domino's Pizza,"Pizza, Fast Food",3.6,86
7618,18263236,Domino's Pizza,"Pizza, Fast Food",3.6,24
8437,384,Domino's Pizza,"Pizza, Fast Food",3.6,547


We can see the duplicated names are sorted by rating. Now we keep only the first row for each name and drop the rest.

In [14]:
dfRS = dfRS.drop_duplicates('restaurant_name',keep='first')
dfRS

Unnamed: 0,restaurant_id,restaurant_name,cuisines,aggregate_rating,votes
9523,6000871,íukuraÛôa SofrasÛ±,"Kebab, Izgara",4.4,296
3120,18222559,{Niche} - Cafe & Bar,"North Indian, Chinese, Italian, Continental",4.1,492
9334,7100938,wagamama,"Japanese, Asian",3.7,131
9454,6401789,tashas,"Cafe, Mediterranean",4.1,374
4659,18361747,t Lounge by Dilmah,"Cafe, Tea, Desserts",3.6,34
...,...,...,...,...,...
6998,18336489,#OFF Campus,"Cafe, Continental, Italian, Fast Food",3.7,216
2613,18311951,#InstaFreeze,Ice Cream,0.0,2
9148,18378803,#Dilliwaala6,North Indian,3.7,124
2459,3100446,#45,Cafe,3.6,209


In [15]:
dfRS['restaurant_name'].value_counts()

íukuraÛôa SofrasÛ±         1
French Omelette             1
Four Queens Dairy Cream     1
Fourteen Eleven Tea Cafe    1
Fozzie's Pizzaiolo          1
                           ..
Pizza Street                1
Pizza Treat                 1
Pizza Yum                   1
Pizza ÛÁl Forno             1
 Let's Burrrp               1
Name: restaurant_name, Length: 7437, dtype: int64

Every name now appears only once, so we can proceed to filtering.
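
The sort-then-deduplicate step above can be sketched on a toy frame. The restaurant names and ratings below are made up for illustration; only the `sort_values` / `drop_duplicates` pattern mirrors the notebook.

```python
import pandas as pd

# Hypothetical data: one chain listed three times with different ratings
toy = pd.DataFrame({
    'restaurant_name': ['Pizza Hub', 'Pizza Hub', 'Tea Room', 'Pizza Hub'],
    'aggregate_rating': [3.6, 3.9, 4.2, 3.1],
})

# Sort so the highest rating of each name comes first, then keep that row
best = (toy.sort_values(['restaurant_name', 'aggregate_rating'], ascending=False)
           .drop_duplicates('restaurant_name', keep='first')
           .reset_index(drop=True))
print(best)
```

Because the rating is sorted in descending order within each name, `keep='first'` retains the best-rated row of every duplicated restaurant.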

### Filter Rating
___

In this case we want to highlight and recommend restaurants that are rated 4.0 and above

In [16]:
dfRS = dfRS[dfRS['aggregate_rating']>=4.0]
dfRS

Unnamed: 0,restaurant_id,restaurant_name,cuisines,aggregate_rating,votes
9523,6000871,íukuraÛôa SofrasÛ±,"Kebab, Izgara",4.4,296
3120,18222559,{Niche} - Cafe & Bar,"North Indian, Chinese, Italian, Continental",4.1,492
9454,6401789,tashas,"Cafe, Mediterranean",4.1,374
9385,6113857,sketch Gallery,"British, Contemporary",4.5,148
1837,18418247,feel ALIVE,"North Indian, American, Asian, Biryani",4.7,69
...,...,...,...,...,...
1468,18408054,19 Flavours Biryani,"Mughlai, Hyderabadi",4.1,84
2484,18233317,145 Kala Ghoda,"Fast Food, Beverages, Desserts",4.2,1606
2292,2100784,11th Avenue Cafe Bistro,"Cafe, American, Italian, Continental",4.1,377
751,2600031,10 Downing Street,"North Indian, Chinese",4.0,257


As a result, the dataframe is reduced to 1236 rows

## *Recommendation System*

In [17]:
# Split Cuisines into list

# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
dfRS = dfRS.assign(cuisines=dfRS['cuisines'].str.split(', '))
dfRS

Unnamed: 0,restaurant_id,restaurant_name,cuisines,aggregate_rating,votes
9523,6000871,íukuraÛôa SofrasÛ±,"[Kebab, Izgara]",4.4,296
3120,18222559,{Niche} - Cafe & Bar,"[North Indian, Chinese, Italian, Continental]",4.1,492
9454,6401789,tashas,"[Cafe, Mediterranean]",4.1,374
9385,6113857,sketch Gallery,"[British, Contemporary]",4.5,148
1837,18418247,feel ALIVE,"[North Indian, American, Asian, Biryani]",4.7,69
...,...,...,...,...,...
1468,18408054,19 Flavours Biryani,"[Mughlai, Hyderabadi]",4.1,84
2484,18233317,145 Kala Ghoda,"[Fast Food, Beverages, Desserts]",4.2,1606
2292,2100784,11th Avenue Cafe Bistro,"[Cafe, American, Italian, Continental]",4.1,377
751,2600031,10 Downing Street,"[North Indian, Chinese]",4.0,257


We can see that the 'cuisines' column now holds lists of values. Next we explode it to separate each value into its own row.

In [18]:
# Exploding 'cuisines' 

dfRS = dfRS.explode('cuisines',ignore_index=True)
dfRS

Unnamed: 0,restaurant_id,restaurant_name,cuisines,aggregate_rating,votes
0,6000871,íukuraÛôa SofrasÛ±,Kebab,4.4,296
1,6000871,íukuraÛôa SofrasÛ±,Izgara,4.4,296
2,18222559,{Niche} - Cafe & Bar,North Indian,4.1,492
3,18222559,{Niche} - Cafe & Bar,Chinese,4.1,492
4,18222559,{Niche} - Cafe & Bar,Italian,4.1,492
...,...,...,...,...,...
2966,2100784,11th Avenue Cafe Bistro,Italian,4.1,377
2967,2100784,11th Avenue Cafe Bistro,Continental,4.1,377
2968,2600031,10 Downing Street,North Indian,4.0,257
2969,2600031,10 Downing Street,Chinese,4.0,257


Every cuisine is now separated into its own row, resulting in 2971 rows

In [19]:
# Cuisines Check

dfRS['cuisines'].value_counts()

North Indian    270
Italian         237
Chinese         200
Continental     199
Cafe            177
               ... 
Pub Food          1
Durban            1
Irish             1
Persian           1
Sunda             1
Name: cuisines, Length: 128, dtype: int64

Now the data has been split so that each row holds a single cuisine
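
The split-and-explode steps above can be illustrated on a two-row toy frame. The names `toy` and `toy_long` and the restaurant values are hypothetical.

```python
import pandas as pd

# Hypothetical restaurants with comma-separated cuisines
toy = pd.DataFrame({
    'restaurant_name': ['Resto A', 'Resto B'],
    'cuisines': ['Japanese, Sushi', 'Cafe'],
})

# Split the string into a list, then give each list element its own row
toy_long = (toy.assign(cuisines=toy['cuisines'].str.split(', '))
               .explode('cuisines', ignore_index=True))
print(toy_long)
```

Each restaurant is repeated once per cuisine, which is the long format that the cross tabulation in the next step expects.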

In [20]:
# Cross Tabulate Restaurant Name and Cuisines

xTabRestoCuisines = pd.crosstab(dfRS['restaurant_name'],
                                dfRS['cuisines'])
xTabRestoCuisines

cuisines,Afghani,African,American,Andhra,Arabian,Argentine,Asian,Asian Fusion,Australian,Awadhi,...,Teriyaki,Tex-Mex,Thai,Tibetan,Turkish,Turkish Pizza,Vegetarian,Vietnamese,Western,World Cuisine
restaurant_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
'Ohana,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
10 Downing Street,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
11th Avenue Cafe Bistro,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
145 Kala Ghoda,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
19 Flavours Biryani,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
feel ALIVE,0,0,1,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
sketch Gallery,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
tashas,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
{Niche} - Cafe & Bar,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


We cross-tabulate restaurant_name and cuisines to capture the raw values for similarity scoring. This process tags each restaurant name with '1' in its related cuisines.
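
As a small illustration of how the cross tabulation produces one binary row per restaurant, here is a sketch on hypothetical long-format data (`toy_long` and `toy_xtab` are illustrative names).

```python
import pandas as pd

# Hypothetical long-format data: one row per (restaurant, cuisine) pair
toy_long = pd.DataFrame({
    'restaurant_name': ['Resto A', 'Resto A', 'Resto B', 'Resto C', 'Resto C'],
    'cuisines': ['Japanese', 'Sushi', 'Cafe', 'Japanese', 'Cafe'],
})

# Rows are restaurants, columns are cuisines, cells count the occurrences
toy_xtab = pd.crosstab(toy_long['restaurant_name'], toy_long['cuisines'])
print(toy_xtab)
```

Since each (restaurant, cuisine) pair appears at most once after deduplication, the counts are only 0 or 1, so each row can be read as a binary cuisine vector.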

In [21]:
# Checking on restaurant name value
xTabRestoCuisines.loc['feel ALIVE'].values

array([0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Here is an example of crosstab value for 'feel ALIVE' restaurant name

In [22]:
# Resto Names Sample

dfRS['restaurant_name'].sample(20, random_state=101)

1308                    Mrs. Wilkes' Dining Room
2784                                    Baltazar
888                                    Rose Cafe
2713                         Big City Bread Cafe
1162                                Olive Bistro
221                            Transmetropolitan
1403                          Maxims Pastry Shop
1381                                      Meraki
1363                            Mimi's Bakehouse
2466                            Cappuccino Blast
1169                               Oh So Stoned!
1671                       Karakí_y Gí_llí_oÛôlu
147                                    Via Delhi
209                  Tu-Do Vietnamese Restaurant
258     Tian - Asian Cuisine Studio - ITC Maurya
2649                           Boise Fry Company
247                           Ting's Red Lantern
1170                                Odeon Social
319                                   The Sizzle
690                              Sree Annapoorna
Name: restaurant_name, dtype: object

We sample some restaurant names to use for testing similarity

In [23]:
# Measure Similarity

print(jaccard_score(xTabRestoCuisines.loc["Olive Bistro"].values,
                    xTabRestoCuisines.loc["Rose Cafe"].values))

0.3333333333333333


The Jaccard score shows a similarity of 0.33 between these two restaurants
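
For intuition, the Jaccard similarity of two cuisine sets is the size of their intersection divided by the size of their union. The sketch below computes it by hand; the helper `jaccard_similarity` and the two cuisine lists are hypothetical, chosen only so that the score also comes out to one third.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)

# Hypothetical cuisine lists: one shared cuisine out of three distinct ones
resto_x = ['Cafe', 'Italian']
resto_y = ['Cafe', 'Bakery']
score = jaccard_similarity(resto_x, resto_y)
print(round(score, 2))
```

On binary cuisine vectors this is the same quantity that `jaccard_score` reports: the cuisines both restaurants share, relative to all cuisines either one offers.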

In [24]:
# Create Similarity Value DF

jaccardDist = pdist(xTabRestoCuisines.values, metric='jaccard')
jaccardMatrix = squareform(jaccardDist)
jaccardSim = 1 - jaccardMatrix
dfJaccard = pd.DataFrame(
    jaccardSim, 
    index=xTabRestoCuisines.index,
    columns=xTabRestoCuisines.index)

dfJaccard

restaurant_name,'Ohana,10 Downing Street,11th Avenue Cafe Bistro,145 Kala Ghoda,19 Flavours Biryani,1918 Bistro & Grill,2 Dog,22nd Parallel,3 Wise Monkeys,38 Barracks,...,Zoeys Pizzeria,Zolocrust - Hotel Clarks Amer,Zombie Burger + Drink Lab,Zuka Choco-la,Zunzi's,feel ALIVE,sketch Gallery,tashas,{Niche} - Cafe & Bar,íukuraÛôa SofrasÛ±
restaurant_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
'Ohana,1.0,0.0,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.000000,0.00,0.000000,0.0,0.0,0.000000,0.0
10 Downing Street,0.0,1.0,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.200000,...,0.0,0.0,0.0,0.000000,0.00,0.200000,0.0,0.0,0.500000,0.0
11th Avenue Cafe Bistro,0.0,0.0,1.000000,0.0,0.0,0.0,0.166667,0.0,0.0,0.333333,...,0.0,0.4,0.0,0.000000,0.00,0.142857,0.0,0.2,0.333333,0.0
145 Kala Ghoda,0.0,0.0,0.000000,1.0,0.0,0.0,0.000000,0.0,0.0,0.000000,...,0.0,0.0,0.2,0.333333,0.00,0.000000,0.0,0.0,0.000000,0.0
19 Flavours Biryani,0.0,0.0,0.000000,0.0,1.0,0.0,0.000000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.000000,0.00,0.000000,0.0,0.0,0.000000,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
feel ALIVE,0.0,0.2,0.142857,0.0,0.0,0.0,0.166667,0.0,0.0,0.600000,...,0.0,0.0,0.0,0.000000,0.00,1.000000,0.0,0.0,0.142857,0.0
sketch Gallery,0.0,0.0,0.000000,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.000000,0.00,0.000000,1.0,0.0,0.000000,0.0
tashas,0.0,0.0,0.200000,0.0,0.0,0.0,0.000000,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.000000,0.25,0.000000,0.0,1.0,0.000000,0.0
{Niche} - Cafe & Bar,0.0,0.5,0.333333,0.0,0.0,0.0,0.000000,0.0,0.0,0.333333,...,0.0,0.4,0.0,0.000000,0.00,0.142857,0.0,0.0,1.000000,0.0


Here is the full dataframe containing the similarity value between every pair of restaurants. From this dataframe we can create a recommendation tool that takes one restaurant name and shows the other names with the highest similarity scores.
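
The distance-to-similarity conversion above can be checked on a tiny matrix. This is a sketch with a hypothetical three-restaurant crosstab (`toy_xtab`); the cast to `bool` is an extra precaution added here, not something the notebook does.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Hypothetical binary crosstab: rows are restaurants, columns are cuisines
toy_xtab = pd.DataFrame(
    [[0, 1, 1], [1, 0, 0], [1, 1, 0]],
    index=['Resto A', 'Resto B', 'Resto C'],
    columns=['Cafe', 'Japanese', 'Sushi'],
)

# pdist returns condensed pairwise distances; squareform expands to a full matrix
dist = pdist(toy_xtab.values.astype(bool), metric='jaccard')
toy_sim = pd.DataFrame(1 - squareform(dist),
                       index=toy_xtab.index, columns=toy_xtab.index)
print(toy_sim)
```

`squareform` puts zeros on the diagonal, so after subtracting from 1 every restaurant has similarity 1.0 with itself, as in the matrix above.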

In [25]:
# Resto Names Sample

dfRS['restaurant_name'].sample(20)

2017    Flatbread Neapolitan Pizzeria
2281      Culture Club - Bar De Tapas
2523                    Cafe Southall
2143                       Eat Street
2856                        Ardor 2.1
2647                    Bombay Brunch
1834                     Hickory Park
894                              Roka
2248                 Delhi Club House
2217                      Dharma Blue
1092                        Paper Fig
573                  Terraí_o Itíçlia
1611     Kuremal Mohan Lal Kulfi Wale
2715                        Big Chill
2757                         Barcelos
2025               Fisherman's Corner
2323                       Cool Point
452           The Darzi Bar & Kitchen
383               The Hangout by 1861
2543                    Cafe LazyMojo
Name: restaurant_name, dtype: object

# **Final Recommendation System**

We take the restaurants most similar to the input restaurant and combine them with their ratings, so that the final recommendation favors the best-rated matches

In [26]:
# Make Recommendation

# Input Initial Restaurant Name
resto = 'Ooma'

sim = dfJaccard.loc[resto].sort_values(ascending=False)

sim = pd.DataFrame({'restaurant_name': sim.index, 'simScore': sim.values})
sim = sim[(sim['restaurant_name']!= resto) & (sim['simScore']>=0.7)].head(5)

# Merge The Rating

RestoRec = pd.merge(sim,dfRS[['restaurant_name','aggregate_rating']],how='inner',on='restaurant_name')
FinalRestoRec = RestoRec.sort_values('aggregate_rating',ascending=False).drop_duplicates('restaurant_name',keep='first')
FinalRestoRec

Unnamed: 0,restaurant_name,simScore,aggregate_rating
0,Sushi Masa,1.0,4.9
2,Nobu,1.0,4.4
4,Ichiban,1.0,4.3
8,Osaka,1.0,4.2
6,Guppy,1.0,4.1


Given the input restaurant, the table above shows up to five of the most similar restaurants, ordered by rating. Because the data was filtered to ratings of 4.0 and above, every recommendation is also well rated. For better recommendations we could use other features that lead to more accurate results, for example restaurant location, operating hours, or even ongoing promos.<br>
This kind of recommendation system can also be implemented in other fields, such as e-commerce for product recommendation, streaming platforms for movie/music recommendation, telemedicine for doctor suggestions, etc.<br><br>

This notebook was arranged by *Bizary Algadri*
Reach me on:
linkedin https://www.linkedin.com/in/bizary/
github https://github.com/bizary03