# Restaurant Recommendation System


#### A content-based recommender system that recommends the top 4 nearby restaurants similar to the restaurant the user searched for

### Overview
Restaurants are recommended based on the content of the restaurant the user enters or selects. Most people here depend mainly on restaurant food because they do not have time to cook for themselves. With such overwhelming demand for restaurants, it has become important to study the demography of a location. The main parameters considered for the recommendations are restaurant category, location, restaurant rating, price and review count.

#### Methodology used - Similarity Score:
How does the system decide which item is most similar to the item the user likes (or selects, in our case)? This is where similarity scores come in.
A similarity score is a numerical value between zero and one that indicates how similar two items are. It is obtained by measuring the similarity between the text details of the two items, and here it is computed using cosine similarity.
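
As a minimal sketch of the idea, the snippet below computes cosine similarity for small hand-made word-count vectors. The vectors, the three-word vocabulary and the helper name `cosine_sim_pair` are illustrative assumptions, not part of the project data. Because word counts are never negative, the score stays between 0 and 1.

```python
from math import sqrt

def cosine_sim_pair(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical word counts over the vocabulary ["south", "indian", "pizza"]
restaurant_a = [1, 1, 0]   # "south indian"
restaurant_b = [1, 1, 0]   # "south indian"
restaurant_c = [0, 0, 1]   # "pizza"
restaurant_d = [0, 1, 1]   # "indian pizza"

sim_ab = cosine_sim_pair(restaurant_a, restaurant_b)  # identical text
sim_ac = cosine_sim_pair(restaurant_a, restaurant_c)  # no shared words
sim_ad = cosine_sim_pair(restaurant_a, restaurant_d)  # one shared word
print(sim_ab, sim_ac, sim_ad)
```

Identical descriptions score close to 1, descriptions with no shared words score 0, and partial overlap falls in between.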

#### Datasets used 
This project uses a Zomato dataset covering the cities of Pune and Bangalore.

In [1]:
# importing the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import re
from nltk.corpus import stopwords
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MinMaxScaler

In [2]:
# reading dataset from csv 
df_main = pd.read_csv("Pune_Bang.csv")
df_main.head()
df_main["city"] = df_main["Locality"] 
df_main["city"] = df_main["city"].apply(lambda s : s.split(",")[-1])
df_main.head()

Unnamed: 0,Restaurant_Name,Category,Pricing_for_2,Locality,Dining_Rating,Dining_Review_Count,Delivery_Rating,Delivery_Rating_Count,Website,Address,Phone_No,Latitude,Longitude,city
0,Burma Burma,"Asian, Burmese, Bubble Tea, Salad, Tea, Desser...",1500,"Indiranagar, Bangalore",4.9,2790.0,4.5,838.0,https://www.zomato.com/bangalore/burma-burma-i...,"607, Ground Floor, 12th Main, Hal 2nd Stage, I...",918000000000.0,12.970394,77.644713,Bangalore
1,Windmills Craftworks,"Continental, Fast Food, Kebab, Beverages, Ital...",2500,"Windmills Craftworks, Bangalore",4.9,6543.0,4.2,524.0,https://www.zomato.com/bangalore/windmills-cra...,"78, Immaine Epip Industrial Area, Whitefield B...",919000000000.0,12.982413,77.721979,Bangalore
2,CTR Shri Sagar,South Indian,150,"Malleshwaram, Bangalore",4.9,4837.0,4.3,22100.0,https://www.zomato.com/bangalore/ctr-shri-saga...,"7th Cross, Margosa Road, Malleshwaram, Bangalore",918000000000.0,12.99827,77.569455,Bangalore
3,Brahmin's Coffee Bar,South Indian,100,"Basavanagudi, Bangalore",4.9,2975.0,4.4,372.0,https://www.zomato.com/bangalore/brahmins-coff...,"Ranga Rao Road, Near Basavanagudi, Bangalore",920000000000.0,12.954043,77.568865,Bangalore
4,Milano Ice Cream,"Desserts, Ice Cream, Beverages",400,"Indiranagar, Bangalore",4.9,2575.0,4.4,1180.0,https://www.zomato.com/bangalore/milano-ice-cr...,"460, 2nd Cross, Krishna Temple Road, Indiranag...",918000000000.0,12.979121,77.644039,Bangalore


#### Data Analysis 

In [3]:
#checking unique values in city column
df_main.city.unique()

array([' Bangalore', ' Pune'], dtype=object)

In [4]:
#checking duplicates and removing if present
print("shape of dataframe : ", df_main.shape)
print("No. duplicates present : ",df_main.duplicated().sum())
df_main.drop_duplicates(inplace = True)
print("No. duplicates present : ",df_main.duplicated().sum())
print("shape of dataframe after removing : ", df_main.shape)

shape of dataframe :  (9906, 14)
No. duplicates present :  106
No. duplicates present :  0
shape of dataframe after removing :  (9800, 14)


In [5]:
# checking null values in data 
df = df_main.copy()
df.isnull().sum()

Restaurant_Name             0
Category                    0
Pricing_for_2               0
Locality                    0
Dining_Rating               8
Dining_Review_Count         8
Delivery_Rating          2957
Delivery_Rating_Count       8
Website                     0
Address                     0
Phone_No                    0
Latitude                    0
Longitude                   0
city                        0
dtype: int64

In [6]:
# dropping the rows with null values in Dining_Rating, Dining_Review_Count and Delivery_Rating_Count, since only 8 rows are affected
print("shape before dropping : ", df.shape)
df.dropna(subset = "Dining_Rating",inplace = True)
print("shape after dropping : ", df.shape)
df.isnull().sum()

shape before dropping :  (9800, 14)
shape after dropping :  (9792, 14)


Restaurant_Name             0
Category                    0
Pricing_for_2               0
Locality                    0
Dining_Rating               0
Dining_Review_Count         0
Delivery_Rating          2951
Delivery_Rating_Count       0
Website                     0
Address                     0
Phone_No                    0
Latitude                    0
Longitude                   0
city                        0
dtype: int64

In [7]:
# checking the fraction of null values in Delivery_Rating
print("penetration of Delivery_Rating NA Values : " ,df.Delivery_Rating.isna().sum()/df.shape[0])

penetration of Delivery_Rating NA Values :  0.30136846405228757


In [8]:
# Dropping 2951 records would lose too much data, so impute the missing Delivery_Rating using Delivery_Rating_Count
def replace_function(rating, count):
    # only missing ratings are imputed; existing ratings are returned unchanged
    if pd.isna(rating):
        if int(count) == 0:
            return 0.0
        elif int(count) < 50:
            return 2.5
    return rating

df["Delivery_Rating"] = df.apply(lambda row : replace_function(row["Delivery_Rating"], row["Delivery_Rating_Count"]), axis=1)
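
The imputation rule described in the comment above (a missing rating becomes 0.0 when there are no reviews and 2.5 when there are fewer than 50) can also be written without a row-wise `apply`. The following is a sketch on a small hypothetical frame; only the column names come from the dataset, and the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; column names follow the dataset
toy = pd.DataFrame({
    "Delivery_Rating":       [4.2, np.nan, np.nan, np.nan, 3.1],
    "Delivery_Rating_Count": [120, 0, 12, 300, 5],
})

missing = toy["Delivery_Rating"].isna()
count = toy["Delivery_Rating_Count"]

# np.select uses the first matching condition per row;
# rows matching no condition keep their current rating
toy["Delivery_Rating"] = np.select(
    [missing & (count == 0), missing & (count < 50)],
    [0.0, 2.5],
    default=toy["Delivery_Rating"],
)
print(toy)
```

Note that a missing rating with 50 or more reviews is left missing under this rule, so it is worth re-checking the null counts afterwards.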

In [9]:
# checking null values
df.isnull().sum()

Restaurant_Name          0
Category                 0
Pricing_for_2            0
Locality                 0
Dining_Rating            0
Dining_Review_Count      0
Delivery_Rating          0
Delivery_Rating_Count    0
Website                  0
Address                  0
Phone_No                 0
Latitude                 0
Longitude                0
city                     0
dtype: int64

In [10]:
df.reset_index(inplace = True,drop=True)
df2 = df.copy()

df2["Restaurant_Name"] = df2["Restaurant_Name"].str.lower().str.strip()
df2["Category"] = df2["Category"].str.lower().str.strip()
df2["city"] = df2["city"].str.lower().str.strip()
df2["Category_2"] = df2["Category"] + ", " + df2["city"]

#df.set_index("Restaurant_Name",inplace = True)
df2.reset_index(inplace = True,drop=True)
df_final =  df2.copy()

print(df_final.shape)
df_final.head()

(9792, 15)


Unnamed: 0,Restaurant_Name,Category,Pricing_for_2,Locality,Dining_Rating,Dining_Review_Count,Delivery_Rating,Delivery_Rating_Count,Website,Address,Phone_No,Latitude,Longitude,city,Category_2
0,burma burma,"asian, burmese, bubble tea, salad, tea, desser...",1500,"Indiranagar, Bangalore",4.9,2790.0,4.5,838.0,https://www.zomato.com/bangalore/burma-burma-i...,"607, Ground Floor, 12th Main, Hal 2nd Stage, I...",918000000000.0,12.970394,77.644713,bangalore,"asian, burmese, bubble tea, salad, tea, desser..."
1,windmills craftworks,"continental, fast food, kebab, beverages, ital...",2500,"Windmills Craftworks, Bangalore",4.9,6543.0,4.2,524.0,https://www.zomato.com/bangalore/windmills-cra...,"78, Immaine Epip Industrial Area, Whitefield B...",919000000000.0,12.982413,77.721979,bangalore,"continental, fast food, kebab, beverages, ital..."
2,ctr shri sagar,south indian,150,"Malleshwaram, Bangalore",4.9,4837.0,4.3,22100.0,https://www.zomato.com/bangalore/ctr-shri-saga...,"7th Cross, Margosa Road, Malleshwaram, Bangalore",918000000000.0,12.99827,77.569455,bangalore,"south indian, bangalore"
3,brahmin's coffee bar,south indian,100,"Basavanagudi, Bangalore",4.9,2975.0,4.4,372.0,https://www.zomato.com/bangalore/brahmins-coff...,"Ranga Rao Road, Near Basavanagudi, Bangalore",920000000000.0,12.954043,77.568865,bangalore,"south indian, bangalore"
4,milano ice cream,"desserts, ice cream, beverages",400,"Indiranagar, Bangalore",4.9,2575.0,4.4,1180.0,https://www.zomato.com/bangalore/milano-ice-cr...,"460, 2nd Cross, Krishna Temple Road, Indiranag...",918000000000.0,12.979121,77.644039,bangalore,"desserts, ice cream, beverages, bangalore"


##### Vectorizing Category using the CountVectorizer technique
CountVectorizer transforms a given text into a vector based on the frequency (count) of each word that occurs in the text.
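
To make this concrete, here is a small sketch that vectorizes three hypothetical category strings and compares them. The strings and the `toy_` variable names are assumptions for illustration; the real notebook applies the same calls to the `Category_2` column.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical category strings, in the same style as the Category_2 column
docs = [
    "south indian bangalore",
    "south indian pune",
    "desserts ice cream bangalore",
]

toy_vect = CountVectorizer()
toy_counts = toy_vect.fit_transform(docs)   # sparse matrix, one row per string

vocab = sorted(toy_vect.vocabulary_)        # words in column order
dense = toy_counts.toarray()                # word counts per string
toy_sim = cosine_similarity(toy_counts)     # 3 x 3 pairwise similarity matrix

print(vocab)
print(dense)
print(toy_sim.round(2))
```

The two "south indian" strings share two of their three words, so they score higher against each other than either does against the dessert string, which only shares the city.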

In [11]:
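# removing English stopwords from the comma-separated category tokens
stop_words = set(stopwords.words('english'))
df_final["Category_2"] = df_final["Category_2"].apply(lambda text : " ".join([w.strip() for w in text.split(",") if w.strip() not in stop_words]))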

# Initializing CountVectorizer
vect = CountVectorizer()
feature= vect.fit_transform(df_final.Category_2)

print("vector  : \n",list(feature[0].toarray()))
print("\nvocabulary : \n" ,vect.vocabulary_)

vector  : 
 [array([0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0,
       0, 0], dtype=int64)]

vocabulary : 
 {'asian': 5, 'burmese': 21, 'bubble': 19, 'tea': 104, 'salad': 94, 'desserts': 31, 'ice': 47, 'cream': 30, 'beverages': 14, 'bangalore': 9, 'continental': 29, 'fast': 35, 'food': 37, 'kebab': 57, 'italian': 52, 'south': 99, 'indian': 48, 'cafe': 22, 'american': 2, 'coffee': 28, 'steak': 100, 'european': 34, 'pizza': 87, 'pasta': 86, 'mediterranean': 68, 'bbq': 11, 'chinese': 27, 'north': 80, 'finger': 36, 'mangalorean': 66, 'street': 101, 'japanese': 54, 'sushi': 102, 'thai': 106, 'momos': 75, 'seafood': 96, 'bar': 10, 'burger': 20, 'sandwich': 95, 'mexican': 7

In [12]:
# scaler = MinMaxScaler()
# data = scaler.fit_transform(feature)
cosine_sim = cosine_similarity(feature)

In [13]:
### Calculating distance from the searched restaurant using the Haversine formula
from math import radians, cos, sin, asin, sqrt
def Calculate_distance(lat1, lat2, lon1, lon2):

    # Converting degrees to radians.
    lon1 = radians(lon1)
    lon2 = radians(lon2)
    lat1 = radians(lat1)
    lat2 = radians(lat2)

    # Haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2

    # r - radius of the Earth in kilometers.
    r = 6371
    # great-circle distance in kilometers
    c = 2 * r * asin(sqrt(a))
    return c
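
As a sanity check on the Haversine formula, the sketch below is a NumPy variant that accepts arrays, so distances from one restaurant to many candidates can be computed in a single call. The function name `haversine_km` and its argument order (latitude and longitude of each point together) are assumptions and differ from `Calculate_distance` above; the coordinates are taken from the sample rows shown earlier.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees, scalars or arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Origin: Burma Burma; targets: the same point and Milano Ice Cream
origin_lat, origin_lon = 12.970394, 77.644713
target_lats = np.array([12.970394, 12.979121])
target_lons = np.array([77.644713, 77.644039])

distances = haversine_km(origin_lat, origin_lon, target_lats, target_lons)
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)   # one degree of latitude
print(distances, one_degree)
```

A point compared with itself gives a distance of zero, and one degree of latitude comes out at roughly 111 km, which is a quick way to confirm the radius and the radian conversion are right.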

#### Developing recommendation model

In [14]:
def recommend(restaurant_name, city, cosine_sim, df_final):

    # Getting index of the searched restaurant (a boolean mask also handles names that contain quotes)
    idx = df_final[(df_final["Restaurant_Name"] == restaurant_name) & (df_final["city"] == city)].index[0]

    # Getting similarity scores of the searched restaurant against all restaurants
    get_similarity_list = cosine_sim[idx]

    # Filtering the result: 50 most similar restaurants, excluding the searched one
    similarity_list = enumerate(get_similarity_list)
    similarity_list = sorted(similarity_list, key = lambda x:x[1], reverse = True)
    top_50 = similarity_list[0:50]
    top_50_index = [int(sl[0]) for sl in top_50 if int(sl[0]) != idx]

    # Computing the distance of each of these restaurants from the searched restaurant
    origin = df_final.loc[idx]
    df_50 = df_final.loc[top_50_index].copy()
    df_50["Relative_distance"] = df_50.apply(
        lambda row: Calculate_distance(origin.Latitude, row.Latitude, origin.Longitude, row.Longitude), axis = 1)

    # Narrowing down to 12 restaurants using Relative_distance, Dining_Review_Count, Dining_Rating, Delivery_Rating_Count and Delivery_Rating
    df_return = df_50.nsmallest(25,"Relative_distance").nlargest(25,"Dining_Review_Count").nlargest(20,"Dining_Rating").nlargest(15,"Delivery_Rating_Count").nlargest(12,"Delivery_Rating")

    # returning 4 random restaurants from the shortlist
    return df_return.sample(4)
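
The chained `nsmallest` / `nlargest` calls act as a funnel: each stage keeps only the best rows of the previous stage by one criterion. The sketch below shows a two-stage version on a hypothetical candidate table (the values are made up; only the column names follow the dataset). It also shows that passing `random_state` to `sample` makes the final random pick repeatable, which the notebook's function does not do.

```python
import pandas as pd

# Hypothetical candidates; column names follow the dataset
candidates = pd.DataFrame({
    "Restaurant_Name":   ["a", "b", "c", "d", "e", "f"],
    "Relative_distance": [0.5, 1.2, 3.0, 0.8, 9.5, 2.1],
    "Dining_Rating":     [4.1, 3.2, 4.8, 4.5, 4.9, 3.9],
})

# Stage 1: keep the 4 nearest; stage 2: of those, keep the 2 best rated
nearest = candidates.nsmallest(4, "Relative_distance")
shortlist = nearest.nlargest(2, "Dining_Rating")

# A fixed random_state makes the final random pick repeatable
pick_1 = shortlist.sample(1, random_state=0)
pick_2 = shortlist.sample(1, random_state=0)
print(shortlist)
```

Restaurant "e" has the highest rating but is dropped in the first stage because it is too far away, so the order of the stages decides which criterion dominates.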
    
    


### User Menu


In [19]:
# Taking input from user
city_dict = {"p" : "Pune", "b" : "Bangalore"}
out_col = ["Restaurant_Name","Pricing_for_2","Dining_Rating","Delivery_Rating","Locality","Website"]

while True:
    user_input = input("Enter 99 to Exit or Press enter to continue ").strip()
    if user_input == "99":
        break

    user_input = input("Enter your current city code (P - Pune , B - Bangalore) : ").strip()
    if user_input == "99":
        break
    if user_input.lower() not in city_dict:
        print("Invalid city code, please re-enter the city code")
        continue
    city = city_dict[user_input.lower()]

    user_input = input(f"Search restaurant in {city}: ").strip()
    if user_input == "99":
        break
    restaurant_name = user_input.lower()

    # restaurant names available in the selected city
    known_names = df_final.loc[df_final["city"] == city.lower(), "Restaurant_Name"].to_list()
    if restaurant_name not in known_names:
        print("Sorry: restaurant name not found in our database, please re-enter your details")
        continue

    df_print = recommend(restaurant_name, city.lower(), cosine_sim, df_final.copy())
    df_print.Restaurant_Name = df_print.Restaurant_Name.str.capitalize()

    print("\n#### Recommendation #### ")
    display(df_print[out_col].reset_index(drop=True))
    print("\n\n")

Enter 99 to Exit or Press enter to continue 
Enter your current city code (P - Pune , B - Bangalore) : p
Search restaurant in Pune: Anna Vada

#### Recommendation #### 


Unnamed: 0,Restaurant_Name,Pricing_for_2,Dining_Rating,Delivery_Rating,Locality,Website
0,Special south dosa center,200,3.7,3.9,"Hadapsar, Pune",https://www.zomato.com/pune/special-south-dosa...
1,Yumma swami,400,4.3,3.7,"Camp Area, Pune",https://www.zomato.com/pune/yumma-swami-camp-area
2,Appa,300,4.0,4.3,"Deccan Gymkhana, Pune",https://www.zomato.com/pune/appa-deccan-gymkhana
3,Nashtaram,300,3.6,2.5,"Aundh, Pune",https://www.zomato.com/pune/nashtaram-aundh





Enter 99 to Exit or Press enter to continue 
Enter your current city code (P - Pune , B - Bangalore) : p
Search restaurant in Pune: Anna Vada

#### Recommendation #### 


Unnamed: 0,Restaurant_Name,Pricing_for_2,Dining_Rating,Delivery_Rating,Locality,Website
0,Nrh-badshahi,200,3.6,4.2,"Tilak Road, Pune",https://www.zomato.com/pune/nrh-badshahi-tilak...
1,Amruth vilas,100,3.5,3.8,"Kharadi, Pune",https://www.zomato.com/pune/amruth-vilas-kharadi
2,Iddos,350,3.8,3.4,"Kothrud, Pune",https://www.zomato.com/pune/iddos-kothrud
3,Yenna dosa,350,4.0,4.2,"Bibvewadi, Pune",https://www.zomato.com/pune/yenna-dosa-bibvewadi





Enter 99 to Exit or Press enter to continue 
Enter your current city code (P - Pune , B - Bangalore) : b
Search restaurant in Bangalore: Real Fresh Dosa Corner

#### Recommendation #### 


Unnamed: 0,Restaurant_Name,Pricing_for_2,Dining_Rating,Delivery_Rating,Locality,Website
0,Sgs non veg - gundu pulav,500,4.2,4.5,"City Market, Bangalore",https://www.zomato.com/bangalore/sgs-non-veg-g...
1,Sree krishna kafe,200,4.5,4.0,"Koramangala 5th Block, Bangalore",https://www.zomato.com/bangalore/sree-krishna-...
2,Raghavendra tiffin,150,4.4,4.1,"HSR, Bangalore",https://www.zomato.com/bangalore/raghavendra-t...
3,Chikkanna tiffin room,200,4.3,4.4,"City Market, Bangalore",https://www.zomato.com/bangalore/chikkanna-tif...





Enter 99 to Exit or Press enter to continue b
Enter your current city code (P - Pune , B - Bangalore) : b
Search restaurant in Bangalore: Punjabi Dhaba

#### Recommendation #### 


Unnamed: 0,Restaurant_Name,Pricing_for_2,Dining_Rating,Delivery_Rating,Locality,Website
0,Kabab palace,400,3.6,4.0,"Rajajinagar, Bangalore",https://www.zomato.com/bangalore/kabab-palace-...
1,Kabab korner,650,3.7,4.1,"St. Marks Road, Bangalore",https://www.zomato.com/bangalore/kabab-korner-...
2,Funky punjab,500,4.1,3.8,"JP Nagar, Bangalore",https://www.zomato.com/bangalore/funky-punjab-...
3,Bathinda junction,500,4.2,4.0,"Koramangala 7th Block, Bangalore",https://www.zomato.com/bangalore/bathinda-junc...





Enter 99 to Exit or Press enter to continue 
Enter your current city code (P - Pune , B - Bangalore) : p
Search restaurant in Pune: Punjabi Dhaba

#### Recommendation #### 


Unnamed: 0,Restaurant_Name,Pricing_for_2,Dining_Rating,Delivery_Rating,Locality,Website
0,Cafe sai pure veg,300,3.9,3.8,"Yerawada, Pune",https://www.zomato.com/pune/cafe-sai-pure-veg-...
1,Amritsari kulcha,250,3.8,3.9,"Destination Centre, Magarpatta, Pune",https://www.zomato.com/pune/amritsari-kulcha-h...
2,Sagy's healthy bites,200,3.9,4.1,"Kothrud, Pune",https://www.zomato.com/pune/sagys-healthy-bite...
3,Punjabi grill,500,3.9,3.8,"Hadapsar, Pune",https://www.zomato.com/pune/punjabi-grill-hada...





Enter 99 to Exit or Press enter to continue 99


In [16]:
# Sample restaurant names [Pune]
# Anna Vada
# Sai Sagar Family Restaurant
# Sanskruti
# Blue Moon Cafe
# Madhura Restaurant
# Punjabi Dhaba

# Sample restaurant names [Bangalore]
# Real Fresh Dosa Corner
# Punjabi Dhaba
# Arogya Ahaara
# Smoke House Deli
# Venessa