# Zomato Recommendation System

## Introduction

Bengaluru, India's IT capital, is a food lover's paradise with approximately 12,000 restaurants offering cuisines from around the world. The city's food scene spans delivery services, dine-out options, pubs, bars, buffets, and dessert spots. New restaurants keep opening despite this density, and they face stiff competition, high costs, and staffing problems.

## Purpose of the System

The Zomato Recommendation System aims to:

1. Analyze the local food culture and demographics
2. Assist new restaurants in making informed decisions about their concept, menu, and pricing
3. Identify similarities between Bengaluru neighborhoods based on food preferences
4. Provide overall ratings for restaurants using customer reviews

## What is a Recommendation System?

A recommendation system is an information filtering tool that improves search results by suggesting items relevant to a user's interests or search history. These systems are widely used for recommending movies, articles, restaurants, travel destinations, and products.

## Types of Recommendation Systems

1. **Demographic Filtering**: Offers generalized recommendations based on user demographics
2. **Content-Based Filtering**: Suggests similar items based on specific item characteristics
3. **Collaborative Filtering**: Matches users with similar interests and provides recommendations accordingly

This project utilizes Content-Based Filtering, which recommends items similar to those a user has liked in the past or is currently viewing.

## Project Breakdown

1. Data Loading: Import necessary libraries and load the dataset
2. Data Cleaning: Remove redundant columns, rename columns, drop duplicates, and handle missing values
3. Data Transformation: Perform necessary transformations on the data
4. Text Preprocessing: Clean reviews by removing unnecessary words, links, symbols, and other irrelevant content
5. Recommendation System Development: Create a system that recommends restaurants based on similar reviews and sorts them by highest rating

By analyzing Zomato's Bengaluru restaurant data, this recommendation system aims to provide valuable insights for both new and established restaurants, as well as enhance the dining experience for food enthusiasts in the city.

## Import Libraries

In [4]:
# Importing libraries
import re
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from nltk.corpus import stopwords
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import r2_score
from sklearn.metrics.pairwise import linear_kernel
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer

# Silence library warnings for the rest of the notebook
warnings.filterwarnings('ignore')

## Load dataset

In [5]:
# read csv
zomato=pd.read_csv("/kaggle/input/zomato-bangalore-restaurants/zomato.csv")
zomato.head() 

Unnamed: 0,url,address,name,online_order,book_table,rate,votes,phone,location,rest_type,dish_liked,cuisines,approx_cost(for two people),reviews_list,menu_item,listed_in(type),listed_in(city)
0,https://www.zomato.com/bangalore/jalsa-banasha...,"942, 21st Main Road, 2nd Stage, Banashankari, ...",Jalsa,Yes,Yes,4.1/5,775,080 42297555\r\n+91 9743772233,Banashankari,Casual Dining,"Pasta, Lunch Buffet, Masala Papad, Paneer Laja...","North Indian, Mughlai, Chinese",800,"[('Rated 4.0', 'RATED\n A beautiful place to ...",[],Buffet,Banashankari
1,https://www.zomato.com/bangalore/spice-elephan...,"2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ...",Spice Elephant,Yes,No,4.1/5,787,080 41714161,Banashankari,Casual Dining,"Momos, Lunch Buffet, Chocolate Nirvana, Thai G...","Chinese, North Indian, Thai",800,"[('Rated 4.0', 'RATED\n Had been here for din...",[],Buffet,Banashankari
2,https://www.zomato.com/SanchurroBangalore?cont...,"1112, Next to KIMS Medical College, 17th Cross...",San Churro Cafe,Yes,No,3.8/5,918,+91 9663487993,Banashankari,"Cafe, Casual Dining","Churros, Cannelloni, Minestrone Soup, Hot Choc...","Cafe, Mexican, Italian",800,"[('Rated 3.0', ""RATED\n Ambience is not that ...",[],Buffet,Banashankari
3,https://www.zomato.com/bangalore/addhuri-udupi...,"1st Floor, Annakuteera, 3rd Stage, Banashankar...",Addhuri Udupi Bhojana,No,No,3.7/5,88,+91 9620009302,Banashankari,Quick Bites,Masala Dosa,"South Indian, North Indian",300,"[('Rated 4.0', ""RATED\n Great food and proper...",[],Buffet,Banashankari
4,https://www.zomato.com/bangalore/grand-village...,"10, 3rd Floor, Lakshmi Associates, Gandhi Baza...",Grand Village,No,No,3.8/5,166,+91 8026612447\r\n+91 9901210005,Basavanagudi,Casual Dining,"Panipuri, Gol Gappe","North Indian, Rajasthani",600,"[('Rated 4.0', 'RATED\n Very good restaurant ...",[],Buffet,Banashankari


In [4]:
zomato.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51717 entries, 0 to 51716
Data columns (total 17 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   url                          51717 non-null  object
 1   address                      51717 non-null  object
 2   name                         51717 non-null  object
 3   online_order                 51717 non-null  object
 4   book_table                   51717 non-null  object
 5   rate                         43942 non-null  object
 6   votes                        51717 non-null  int64 
 7   phone                        50509 non-null  object
 8   location                     51696 non-null  object
 9   rest_type                    51490 non-null  object
 10  dish_liked                   23639 non-null  object
 11  cuisines                     51672 non-null  object
 12  approx_cost(for two people)  51371 non-null  object
 13  reviews_list                 51

## Data Cleaning and Feature Engineering

In [6]:
# Delete unnecessary columns
zomato = zomato.drop(['url', 'dish_liked', 'phone'], axis=1)

In [7]:
# Remove duplicates
zomato.duplicated().sum()
zomato.drop_duplicates(inplace=True)

In [27]:
zomato.info()

<class 'pandas.core.frame.DataFrame'>
Index: 51674 entries, 0 to 51716
Data columns (total 14 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   address                      51674 non-null  object
 1   name                         51674 non-null  object
 2   online_order                 51674 non-null  object
 3   book_table                   51674 non-null  object
 4   rate                         43907 non-null  object
 5   votes                        51674 non-null  int64 
 6   location                     51653 non-null  object
 7   rest_type                    51447 non-null  object
 8   cuisines                     51629 non-null  object
 9   approx_cost(for two people)  51329 non-null  object
 10  reviews_list                 51674 non-null  object
 11  menu_item                    51674 non-null  object
 12  listed_in(type)              51674 non-null  object
 13  listed_in(city)              51674 n

In [8]:
# Remove the NaN values 
zomato.isnull().sum()
zomato.dropna(how='any',inplace=True)
zomato.info() 

<class 'pandas.core.frame.DataFrame'>
Index: 43499 entries, 0 to 51716
Data columns (total 14 columns):
 #   Column                       Non-Null Count  Dtype 
---  ------                       --------------  ----- 
 0   address                      43499 non-null  object
 1   name                         43499 non-null  object
 2   online_order                 43499 non-null  object
 3   book_table                   43499 non-null  object
 4   rate                         43499 non-null  object
 5   votes                        43499 non-null  int64 
 6   location                     43499 non-null  object
 7   rest_type                    43499 non-null  object
 8   cuisines                     43499 non-null  object
 9   approx_cost(for two people)  43499 non-null  object
 10  reviews_list                 43499 non-null  object
 11  menu_item                    43499 non-null  object
 12  listed_in(type)              43499 non-null  object
 13  listed_in(city)              43499 n

In [12]:
zomato.columns

Index(['address', 'name', 'online_order', 'book_table', 'rate', 'votes',
       'location', 'rest_type', 'cuisines', 'approx_cost(for two people)',
       'reviews_list', 'menu_item', 'listed_in(type)', 'listed_in(city)'],
      dtype='object')

In [9]:
# renaming columns
zomato = zomato.rename(columns={'approx_cost(for two people)':'cost','listed_in(type)':'type',
                                  'listed_in(city)':'city'})
zomato.columns

Index(['address', 'name', 'online_order', 'book_table', 'rate', 'votes',
       'location', 'rest_type', 'cuisines', 'cost', 'reviews_list',
       'menu_item', 'type', 'city'],
      dtype='object')

In [10]:
# Convert cost to float
# Note: replacing ',' with '.' turns a value like '1,200' into 1.2, not 1200
zomato['cost'] = zomato['cost'].astype(str)
zomato['cost'] = zomato['cost'].apply(lambda x: x.replace(',', '.'))
zomato['cost'] = zomato['cost'].astype(float)
zomato.info()

<class 'pandas.core.frame.DataFrame'>
Index: 43499 entries, 0 to 51716
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   address       43499 non-null  object 
 1   name          43499 non-null  object 
 2   online_order  43499 non-null  object 
 3   book_table    43499 non-null  object 
 4   rate          43499 non-null  object 
 5   votes         43499 non-null  int64  
 6   location      43499 non-null  object 
 7   rest_type     43499 non-null  object 
 8   cuisines      43499 non-null  object 
 9   cost          43499 non-null  float64
 10  reviews_list  43499 non-null  object 
 11  menu_item     43499 non-null  object 
 12  type          43499 non-null  object 
 13  city          43499 non-null  object 
dtypes: float64(1), int64(1), object(12)
memory usage: 5.0+ MB
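The values such as 1.2 and 1.35 that appear in later `cost` outputs suggest the raw strings used a comma as a thousands separator (for example `1,200`). A minimal sketch of an alternative parser that strips the separator instead of turning it into a decimal point is shown here. `parse_cost` is a hypothetical helper and is not part of the notebook.

```python
def parse_cost(raw: str) -> float:
    """Convert a cost string such as '1,200' or '800' to a float."""
    # Remove the thousands separator rather than replacing it with '.'.
    return float(raw.replace(",", "").strip())


print(parse_cost("800"))    # 800.0
print(parse_cost("1,200"))  # 1200.0
```

Applied to the `cost` column, this would keep four-digit prices on the same scale as the three-digit ones.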


In [31]:
# look at rate column
zomato['rate'].unique()

array(['4.1/5', '3.8/5', '3.7/5', '3.6/5', '4.6/5', '4.0/5', '4.2/5',
       '3.9/5', '3.1/5', '3.0/5', '3.2/5', '3.3/5', '2.8/5', '4.4/5',
       '4.3/5', 'NEW', '2.9/5', '3.5/5', '2.6/5', '3.8 /5', '3.4/5',
       '4.5/5', '2.5/5', '2.7/5', '4.7/5', '2.4/5', '2.2/5', '2.3/5',
       '3.4 /5', '-', '3.6 /5', '4.8/5', '3.9 /5', '4.2 /5', '4.0 /5',
       '4.1 /5', '3.7 /5', '3.1 /5', '2.9 /5', '3.3 /5', '2.8 /5',
       '3.5 /5', '2.7 /5', '2.5 /5', '3.2 /5', '2.6 /5', '4.5 /5',
       '4.3 /5', '4.4 /5', '4.9/5', '2.1/5', '2.0/5', '1.8/5', '4.6 /5',
       '4.9 /5', '3.0 /5', '4.8 /5', '2.3 /5', '4.7 /5', '2.4 /5',
       '2.1 /5', '2.2 /5', '2.0 /5', '1.8 /5'], dtype=object)

In [11]:
# Drop the 'NEW' and '-' placeholders, then strip the '/5' denominator from the rate column
zomato = zomato.loc[zomato['rate'] != 'NEW']
zomato = zomato.loc[zomato['rate'] != '-'].reset_index(drop=True)
remove_slash = lambda x: x.replace('/5', '') if isinstance(x, str) else x
zomato['rate'] = zomato['rate'].apply(remove_slash).str.strip().astype('float')
zomato['rate'].head()

0    4.1
1    4.1
2    3.8
3    3.7
4    3.8
Name: rate, dtype: float64
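The `rate` column mixes three kinds of values: ratings written as `4.1/5`, ratings with a stray space such as `3.8 /5`, and the placeholders `NEW` and `-`. The sketch below handles all three in one function. `parse_rate` is a hypothetical name used only for illustration.

```python
from typing import Optional


def parse_rate(raw: str) -> Optional[float]:
    """Return the numeric part of a rating like '3.8 /5', or None for placeholders."""
    value = raw.replace("/5", "").strip()
    try:
        return float(value)
    except ValueError:
        # 'NEW' and '-' carry no numeric rating.
        return None


for raw in ["4.1/5", "3.8 /5", "NEW", "-"]:
    print(repr(raw), "->", parse_rate(raw))
```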

In [12]:
# Title-case restaurant names and convert online_order / book_table to booleans
zomato['name'] = zomato['name'].str.title()
zomato['online_order'] = zomato['online_order'].map({'Yes': True, 'No': False})
zomato['book_table'] = zomato['book_table'].map({'Yes': True, 'No': False})
zomato.cost.unique()

array([800.  , 300.  , 600.  , 700.  , 550.  , 500.  , 450.  , 650.  ,
       400.  , 900.  , 200.  , 750.  , 150.  , 850.  , 100.  ,   1.2 ,
       350.  , 250.  , 950.  ,   1.  ,   1.5 ,   1.3 , 199.  ,   1.1 ,
         1.6 , 230.  , 130.  ,   1.7 ,   1.35,   2.2 ,   1.4 ,   2.  ,
         1.8 ,   1.9 , 180.  , 330.  ,   2.5 ,   2.1 ,   3.  ,   2.8 ,
         3.4 ,  50.  ,  40.  ,   1.25,   3.5 ,   4.  ,   2.4 ,   2.6 ,
         1.45,  70.  ,   3.2 , 240.  ,   6.  ,   1.05,   2.3 ,   4.1 ,
       120.  ,   5.  ,   3.7 ,   1.65,   2.7 ,   4.5 ,  80.  ])

In [20]:
zomato.head()

Unnamed: 0,address,name,online_order,book_table,rate,votes,location,rest_type,cuisines,cost,reviews_list,menu_item,type,city
0,"942, 21st Main Road, 2nd Stage, Banashankari, ...",Jalsa,True,True,4.1,775,Banashankari,Casual Dining,"North Indian, Mughlai, Chinese",800.0,"[('Rated 4.0', 'RATED\n A beautiful place to ...",[],Buffet,Banashankari
1,"2nd Floor, 80 Feet Road, Near Big Bazaar, 6th ...",Spice Elephant,True,False,4.1,787,Banashankari,Casual Dining,"Chinese, North Indian, Thai",800.0,"[('Rated 4.0', 'RATED\n Had been here for din...",[],Buffet,Banashankari
2,"1112, Next to KIMS Medical College, 17th Cross...",San Churro Cafe,True,False,3.8,918,Banashankari,"Cafe, Casual Dining","Cafe, Mexican, Italian",800.0,"[('Rated 3.0', ""RATED\n Ambience is not that ...",[],Buffet,Banashankari
3,"1st Floor, Annakuteera, 3rd Stage, Banashankar...",Addhuri Udupi Bhojana,False,False,3.7,88,Banashankari,Quick Bites,"South Indian, North Indian",300.0,"[('Rated 4.0', ""RATED\n Great food and proper...",[],Buffet,Banashankari
4,"10, 3rd Floor, Lakshmi Associates, Gandhi Baza...",Grand Village,False,False,3.8,166,Basavanagudi,Casual Dining,"North Indian, Rajasthani",600.0,"[('Rated 4.0', 'RATED\n Very good restaurant ...",[],Buffet,Banashankari


In [21]:
zomato['city'].unique()

array(['Banashankari', 'Bannerghatta Road', 'Basavanagudi', 'Bellandur',
       'Brigade Road', 'Brookefield', 'BTM', 'Church Street',
       'Electronic City', 'Frazer Town', 'HSR', 'Indiranagar',
       'Jayanagar', 'JP Nagar', 'Kalyan Nagar', 'Kammanahalli',
       'Koramangala 4th Block', 'Koramangala 5th Block',
       'Koramangala 6th Block', 'Koramangala 7th Block', 'Lavelle Road',
       'Malleshwaram', 'Marathahalli', 'MG Road', 'New BEL Road',
       'Old Airport Road', 'Rajajinagar', 'Residency Road',
       'Sarjapur Road', 'Whitefield'], dtype=object)

In [13]:
## Check for Null 
zomato.isnull().sum()

address         0
name            0
online_order    0
book_table      0
rate            0
votes           0
location        0
rest_type       0
cuisines        0
cost            0
reviews_list    0
menu_item       0
type            0
city            0
dtype: int64

In [14]:
## Add a Mean Rating column: the average rate over all rows that share a restaurant name
zomato['Mean Rating'] = zomato.groupby('name')['rate'].transform('mean')

In [15]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range = (1,5))

zomato[['Mean Rating']] = scaler.fit_transform(zomato[['Mean Rating']]).round(2)

zomato.sample(3)

Unnamed: 0,address,name,online_order,book_table,rate,votes,location,rest_type,cuisines,cost,reviews_list,menu_item,type,city,Mean Rating
10275,"#10 B, Ground floor , Devatha Plaza, Residency...",The Bao Belly,False,False,3.3,9,Residency Road,Cafe,Cafe,400.0,"[('Rated 3.0', 'RATED\n The Bao Belly was my ...",[],Dine-out,Church Street,2.94
37989,"No.46/A, 1st floor - front building , lalbagh ...",Desi Inn,True,False,3.5,13,Shanti Nagar,Quick Bites,North Indian,300.0,"[('Rated 4.0', 'RATED\n Starting with praisin...",[],Delivery,Residency Road,3.19
20038,"403, Mariappa Road, Off Kammanahalli Main Road...",Oki,True,True,4.5,418,Kammanahalli,Casual Dining,"Asian, European, Italian, Korean, Malaysian, T...",1.2,"[('Rated 4.0', 'RATED\n ValentineÃ\x83Ã\x83...",['Bangkok Style Raw Papaya Salad with Green Ap...,Delivery,Kammanahalli,4.48
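`MinMaxScaler(feature_range=(1, 5))` rescales the column linearly so that its minimum maps to 1 and its maximum to 5, which is why `Mean Rating` can differ from `rate` even for a restaurant with a single row. A minimal pure-Python sketch of the same formula follows. The sample values 1.8 and 4.9 are assumed extremes chosen for illustration.

```python
def min_max_scale(values, low=1.0, high=5.0):
    """Linearly rescale values so min(values) maps to low and max(values) to high."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("all values are identical; scaling is undefined")
    return [round(low + (v - v_min) * (high - low) / span, 2) for v in values]


print(min_max_scale([1.8, 3.3, 4.9]))  # [1.0, 2.94, 5.0]
```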


## Text Preprocessing

Some common text preprocessing and cleaning steps:

* convert to lower case
* remove punctuation
* remove stopwords
* remove URLs
* correct spelling

In [16]:
## Lower Case
zomato["reviews_list"] = zomato["reviews_list"].str.lower()

In [17]:
import string
print(string.punctuation)

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~


In [48]:
str.maketrans.__doc__

'Return a translation table usable for str.translate().\n\nIf there is only one argument, it must be a dictionary mapping Unicode\nordinals (integers) or characters to Unicode ordinals, strings or None.\nCharacter keys will be then converted to ordinals.\nIf there are two arguments, they must be strings of equal length, and\nin the resulting dictionary, each character in x will be mapped to the\ncharacter at the same position in y. If there is a third argument, it\nmust be a string, whose characters will be mapped to None in the result.'

In [18]:
## Remove punctuation
PUNCT_TO_REMOVE = string.punctuation
def remove_punctuation(text):
    """Custom function to remove punctuation."""
    return text.translate(str.maketrans('', '', PUNCT_TO_REMOVE))

zomato["reviews_list"] = zomato["reviews_list"].apply(remove_punctuation)

In [19]:
from nltk.corpus import stopwords
print(set(stopwords.words('english')))

{'an', 'will', "you've", 'nor', 'very', 'my', 'because', "hasn't", 'having', 'until', 'was', 'y', 'above', "hadn't", 'and', "she's", 'where', 'll', 'were', 'd', 'no', 'their', 'then', 'from', 'off', 'how', 'it', 'only', 'won', "doesn't", 'between', "didn't", 'some', 'once', 'should', 'aren', 'are', 'yours', 'through', 'down', 'if', 'just', 'wasn', 'over', 'of', 'itself', 'ain', "you'll", 'your', 'shouldn', 'can', 'does', "you'd", 'mustn', 'her', 'wouldn', 'when', 'than', 'too', 'hadn', "shan't", 'why', 'its', 'about', 'by', 'any', 'before', 'we', 'again', 'there', 't', 'at', "should've", 'him', 'didn', 'doing', 'do', 're', 'ours', 've', 'needn', 'have', 'with', "you're", 'himself', 'who', 'not', 'both', 'same', 'but', 'during', 'ma', "mustn't", 'a', 'shan', "isn't", 'here', 'into', 'o', 'them', 'ourselves', 'while', 'in', 'our', 'he', 'few', 'this', 'those', 'don', 'yourselves', 'couldn', 'what', 'after', 'weren', 'under', 'which', 'you', 'am', 'did', "won't", 'haven', 'on', 'that', "a

In [20]:
## Remove Stopwords
from nltk.corpus import stopwords
STOPWORDS = set(stopwords.words('english'))
def remove_stopwords(text):
    """custom function to remove the stopwords"""
    return " ".join([word for word in str(text).split() if word not in STOPWORDS])

zomato["reviews_list"] = zomato["reviews_list"].apply(lambda text: remove_stopwords(text))

In [21]:
## Remove URLs
URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')  # compiled once, reused for every row
def remove_urls(text):
    """Custom function to remove URLs."""
    return URL_PATTERN.sub('', text)

zomato["reviews_list"] = zomato["reviews_list"].apply(remove_urls)
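The order of the cleaning steps matters. Once punctuation has been stripped, a URL such as `https://example.com/menu` has already collapsed into `httpsexamplecommenu`, so a pattern that looks for `://` or `www.` can no longer match it. The sketch below shows one possible ordering that removes URLs first. The tiny `STOPWORDS` set is a hand-written stand-in for NLTK's list, and `clean_review` is a hypothetical helper.

```python
import re
import string

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)
# Small stand-in for nltk's English stop-word list (assumption, illustration only).
STOPWORDS = {"the", "a", "is", "at", "and", "was"}


def clean_review(text: str) -> str:
    text = text.lower()
    text = URL_PATTERN.sub("", text)    # URLs first, while '://' and '.' are intact
    text = text.translate(PUNCT_TABLE)  # then punctuation
    return " ".join(word for word in text.split() if word not in STOPWORDS)


print(clean_review("The biryani at https://example.com/menu was GREAT!"))
# biryani great
```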

In [22]:
# Restaurant names
restaurant_names = list(zomato['name'].unique())
len(restaurant_names)

6572

In [23]:
zomato=zomato.drop(['address','rest_type', 'type', 'menu_item', 'votes'],axis=1)

In [24]:
# Randomly sample 50% of the data (no random_state is set, so the sample differs on each run)
df_percent = zomato.sample(frac=0.5)

In [57]:
df_percent

Unnamed: 0,name,online_order,book_table,rate,location,cuisines,cost,reviews_list,city,Mean Rating
38405,Meghana Foods,True,False,4.3,Residency Road,"Biryani, Andhra, North Indian, Seafood",600.0,rated 50 ratedn one best biriyani places banga...,Residency Road,4.28
13817,Truffles,False,False,4.6,Indiranagar,"Cafe, American, Burger, Steak",900.0,rated 50 ratedn went tuesday evening much crow...,Indiranagar,4.61
14838,Give Me 5 Cafe,False,False,3.4,Indiranagar,"Cafe, Continental",500.0,rated 50 ratedn nice place terms ambience serv...,Indiranagar,3.00
20832,Cafeteria Al - Falah,False,False,3.3,Banaswadi,"Fast Food, North Indian",300.0,,Kammanahalli,2.94
9685,Qissa Khawani,True,False,4.4,Church Street,"North Indian, Mughlai",800.0,rated 50 ratedn little holeinthewall place ser...,Church Street,4.28
...,...,...,...,...,...,...,...,...,...,...
976,Kakal-Kai Ruchi,True,False,3.7,JP Nagar,"North Indian, South Indian, Chinese",500.0,rated 40 ratedn nice place breakfast great sou...,Bannerghatta Road,3.49
26467,Stoked!,False,False,4.3,Koramangala 4th Block,"Cafe, Italian, Pizza",850.0,rated 40 ratedn small classy cafe situated kor...,Koramangala 6th Block,4.23
29421,Maven Foods,False,False,3.6,Koramangala 4th Block,"Continental, North Indian, South Indian",650.0,rated 50 ratedn love food expect something hig...,Koramangala 7th Block,3.32
29837,Sutra Spice-The Resto Bar,False,False,3.8,Koramangala 4th Block,"South Indian, Andhra",1.0,rated 30 ratedn 35ngood place decent ambiance ...,Koramangala 7th Block,3.66


## Term Frequency-Inverse Document Frequency (TF-IDF)

### What is TF-IDF?

TF-IDF is a statistical method used to evaluate the importance of a word in a document within a collection of documents. It combines two metrics:

1. Term Frequency (TF)
2. Inverse Document Frequency (IDF)

### Term Frequency (TF)

TF measures how often a term appears in a document. It's calculated as:

TF = (Number of times a term appears in a document) / (Total number of terms in the document)

### Inverse Document Frequency (IDF)

IDF measures the importance of a term across the entire collection of documents. It's calculated as:

IDF = log(Total number of documents / Number of documents containing the term)

### TF-IDF Calculation

The TF-IDF score for a term in a document is the product of its TF and IDF:

TF-IDF = TF * IDF
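As a worked example of the formulas above, the sketch below computes TF-IDF by hand for a three-document toy corpus. The corpus is invented for illustration, the natural logarithm is an assumption (the formula does not fix a base), and each queried term is assumed to occur in at least one document.

```python
import math

docs = [
    "great biryani great service".split(),
    "great pizza".split(),
    "slow service".split(),
]


def tf(term, doc):
    """Share of the document's tokens that are this term."""
    return doc.count(term) / len(doc)


def idf(term, docs):
    """Log of (number of documents / number of documents containing the term)."""
    containing = sum(1 for d in docs if term in d)
    return math.log(len(docs) / containing)


def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)


print(round(tf_idf("great", docs[0], docs), 3))    # 0.203
print(round(tf_idf("biryani", docs[0], docs), 3))  # 0.275
```

"great" occurs twice in the first document but also appears in a second one, so it scores lower than "biryani", which occurs once and nowhere else.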

### Key Concepts

- TF-IDF assigns higher weights to terms that are frequent in a specific document but rare across the entire collection.
- Common words that appear in many documents receive lower weights.
- Rare words that are important to a specific document receive higher weights.

### Application in Recommendation Systems

In the context of a restaurant recommendation system:

1. Each restaurant's reviews or descriptions form a "document"
2. The collection of all restaurant reviews/descriptions forms the "corpus"
3. TF-IDF helps identify key terms that are uniquely important to each restaurant

### Implementation with scikit-learn

Scikit-learn provides a TfidfVectorizer class that simplifies the process of creating a TF-IDF matrix:

1. It tokenizes the text
2. Calculates the TF-IDF scores
3. Produces a matrix where:
   - Rows represent documents (restaurants)
   - Columns represent terms in the vocabulary
   - Cell values are the TF-IDF scores

This matrix can then be used to compare restaurants based on the similarity of their most important terms, forming the basis of a content-based recommendation system.

In [25]:
df_percent.set_index('name', inplace=True)

In [26]:
indices = pd.Series(df_percent.index)

In [27]:
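# Create the TF-IDF matrix from unigrams and bigrams
# min_df=1 keeps every term, as min_df=0 did; recent scikit-learn releases may reject an integer 0
tfidf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix = tfidf.fit_transform(df_percent['reviews_list'])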

In [28]:
cosine_similarities = linear_kernel(tfidf_matrix, tfidf_matrix)
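`linear_kernel` computes plain dot products. Because `TfidfVectorizer` L2-normalises each row by default, the dot product of two rows equals their cosine similarity, which is presumably why the result is named `cosine_similarities`. The sketch below checks that identity on two toy vectors chosen for illustration.

```python
import numpy as np

# Two toy vectors standing in for un-normalised term-weight rows.
a = np.array([3.0, 4.0, 0.0])
b = np.array([0.0, 4.0, 3.0])

# Cosine similarity from its definition.
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Dot product after scaling each vector to unit length.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
dot_of_units = float(a_unit @ b_unit)

print(round(cosine, 4), round(dot_of_units, 4))  # 0.64 0.64
```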

In [38]:
def recommend(name, cosine_similarities = cosine_similarities):

    # Position of the first row for this restaurant in df_percent
    idx = indices[indices == name].index[0]

    # Similarity of every row to that row, sorted from most to least similar
    score_series = pd.Series(cosine_similarities[idx]).sort_values(ascending=False)

    # Positions of the 31 most similar rows (the query row itself normally ranks first)
    top_indexes = list(score_series.iloc[0:31].index)

    # Restaurant names at those positions
    recommend_restaurant = [df_percent.index[each] for each in top_indexes]

    # For each name, pick one of its rows at random and keep three columns
    rows = [df_percent.loc[df_percent.index == each, ['cuisines', 'Mean Rating', 'cost']].sample()
            for each in recommend_restaurant]
    df_new = pd.concat(rows, ignore_index=True)

    # Drop every row whose (cuisines, Mean Rating, cost) combination occurs more than once,
    # then keep the 10 rows with the highest Mean Rating
    df_new = df_new.drop_duplicates(subset=['cuisines', 'Mean Rating', 'cost'], keep=False)
    df_new = df_new.sort_values(by='Mean Rating', ascending=False).head(10)

    print(f'TOP {len(df_new)} RESTAURANTS LIKE {name} WITH SIMILAR REVIEWS: ')

    return df_new
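The function ranks in two stages: it first shortlists rows by review similarity, then orders the shortlist by rating. The final order therefore reflects rating rather than similarity. The toy sketch below isolates that idea. `shortlist_then_rank` and the sample numbers are hypothetical and only illustrate the ranking logic.

```python
def shortlist_then_rank(similarities, ratings, shortlist_size=3, top_n=2):
    """Pick the most similar items, then order that shortlist by rating."""
    # Stage 1: item indices sorted by similarity, highest first.
    by_similarity = sorted(
        range(len(similarities)), key=lambda i: similarities[i], reverse=True
    )
    shortlist = by_similarity[:shortlist_size]
    # Stage 2: re-order the shortlist by rating and keep the best few.
    return sorted(shortlist, key=lambda i: ratings[i], reverse=True)[:top_n]


similarities = [0.9, 0.2, 0.7, 0.8, 0.1]
ratings = [3.0, 4.9, 4.5, 3.8, 4.7]
print(shortlist_then_rank(similarities, ratings))  # [2, 3]
```

Item 1 has the highest rating but never reaches the output, because it is not similar enough to make the shortlist.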

In [39]:
# Pick an example restaurant and look at its rows:
df_percent[df_percent.index == 'Pai Vihar'].head()

name,online_order,book_table,rate,location,cuisines,cost,reviews_list,city,Mean Rating
Pai Vihar,True,False,2.8,Vasanth Nagar,"South Indian, Street Food, Chinese, Fast Food",400.0,rated 30 ratedn 12 rate hereãx83ãx83ãx82ãx...,MG Road,2.48
Pai Vihar,False,False,3.2,City Market,"South Indian, Street Food, Chinese, Fast Food",400.0,rated 20 ratedn food dry bland dont understand...,Church Street,2.48
Pai Vihar,True,False,2.8,Vasanth Nagar,"South Indian, Street Food, Chinese, Fast Food",400.0,rated 30 ratedn 12 rate hereãx83ãx83ãx82ãx...,Malleshwaram,2.48
Pai Vihar,False,False,3.2,City Market,"South Indian, Street Food, Chinese, Fast Food",400.0,rated 20 ratedn food dry bland dont understand...,Brigade Road,2.48
Pai Vihar,False,False,3.3,City Market,"South Indian, Street Food, Chinese, Fast Food",400.0,rated 20 ratedn food dry bland dont understand...,Residency Road,2.48


In [40]:
recommend('Pai Vihar')

TOP 10 RESTAURANTS LIKE Pai Vihar WITH SIMILAR REVIEWS: 


Unnamed: 0,cuisines,Mean Rating,cost
22,"Asian, Burmese",4.74,1.5
30,Pizza,4.13,600.0
21,"American, Burger, Fast Food",4.11,400.0
27,"Hyderabadi, Biryani, North Indian, Chinese",3.84,700.0
6,"Rolls, Kebab",3.84,250.0
20,"Biryani, North Indian, Kebab",3.72,600.0
19,"Seafood, South Indian, Chinese, Kerala",3.65,600.0
17,"North Indian, South Indian, Chinese",3.45,600.0
23,"North Indian, Fast Food, Street Food",3.43,300.0
3,Pizza,3.32,500.0
