Task-2: Restaurant Recommendation

The goal is to build a restaurant recommendation system that suggests restaurants to a user based on their preferences, using content-based filtering.

Step-1: Preprocess the dataset
 
 1. Load and understand the dataset
 2. Handle missing values
 3. Drop Unnecessary Columns
 4. Encode Categorical Data

In [3]:
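import pandas as pd

# Load the dataset (the path is specific to the author's machine)
df = pd.read_csv("C:/Users/KAVYA/Documents/Dataset .csv")
df.info()      # column names, non-null counts, and dtypes
df.describe()  # summary statistics (only shown if it is the cell's last expression)
df.head()      # first five rows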

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9551 entries, 0 to 9550
Data columns (total 21 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Restaurant ID         9551 non-null   int64  
 1   Restaurant Name       9551 non-null   object 
 2   Country Code          9551 non-null   int64  
 3   City                  9551 non-null   object 
 4   Address               9551 non-null   object 
 5   Locality              9551 non-null   object 
 6   Locality Verbose      9551 non-null   object 
 7   Longitude             9551 non-null   float64
 8   Latitude              9551 non-null   float64
 9   Cuisines              9542 non-null   object 
 10  Average Cost for two  9551 non-null   int64  
 11  Currency              9551 non-null   object 
 12  Has Table booking     9551 non-null   object 
 13  Has Online delivery   9551 non-null   object 
 14  Is delivering now     9551 non-null   object 
 15  Switch to order menu 

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,Botswana Pula(P),Yes,No,No,No,3,4.8,Dark Green,Excellent,314
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,Botswana Pula(P),Yes,No,No,No,3,4.5,Dark Green,Excellent,591
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,Botswana Pula(P),Yes,No,No,No,4,4.4,Green,Very Good,270
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",...,Botswana Pula(P),No,No,No,No,4,4.9,Dark Green,Excellent,365
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",...,Botswana Pula(P),Yes,No,No,No,4,4.8,Dark Green,Excellent,229


In [5]:
# Fill missing cuisines with the most frequent value and missing costs with the median
df['Cuisines'] = df['Cuisines'].fillna(df['Cuisines'].mode()[0])
df['Average Cost for two'] = df['Average Cost for two'].fillna(df['Average Cost for two'].median())
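
A quick way to confirm that the imputation did what was intended is to count missing values before and after. The following is a minimal sketch on a toy frame; the `toy` DataFrame and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy frame standing in for the real dataset (values are illustrative only)
toy = pd.DataFrame({
    "Cuisines": ["North Indian", None, "Chinese", "North Indian"],
    "Average Cost for two": [500, 300, None, 700],
})

# Count missing values per column before imputing
missing_before = toy.isnull().sum()

# Same strategy as the cell above: mode for text, median for numbers
toy["Cuisines"] = toy["Cuisines"].fillna(toy["Cuisines"].mode()[0])
toy["Average Cost for two"] = toy["Average Cost for two"].fillna(
    toy["Average Cost for two"].median()
)

# Count again to verify nothing is left unfilled
missing_after = toy.isnull().sum()
print(missing_before)
print(missing_after)
```

On the real data the same two `isnull().sum()` calls can be run directly on `df`.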

In [6]:
# Drop identifier and address columns that are not used for recommendations
df.drop(['Restaurant ID', 'Address', 'Locality', 'Locality Verbose'], axis=1, inplace=True)

In [7]:
# Encode the Yes/No columns as 1/0
yes_no = {'Yes': 1, 'No': 0}
for col in ['Has Table booking', 'Has Online delivery', 'Is delivering now', 'Switch to order menu']:
    df[col] = df[col].map(yes_no)
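
One thing to watch with `Series.map` and a dictionary is that any value missing from the dictionary becomes NaN rather than raising an error. The sketch below uses a toy column with a deliberately odd lowercase "yes" (an assumption for illustration, not something observed in the dataset) to show how such values could be detected.

```python
import pandas as pd

# Toy column; the lowercase "yes" is a deliberate oddity for illustration
toy = pd.DataFrame({"Has Online delivery": ["Yes", "No", "yes"]})

# Values not present in the mapping are silently turned into NaN
encoded = toy["Has Online delivery"].map({"Yes": 1, "No": 0})

# List the original values that failed to map
unmapped = toy.loc[encoded.isna(), "Has Online delivery"].unique()
print(unmapped)
```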

Step-2: Defining Recommendation Criteria

a. Cuisines
b. Average Cost for two
c. Has Online delivery
d. City

Feature Engineering for similarity

1. Create a combined feature column (the code below joins Cuisines, City, and Currency)
2. Apply TF-IDF Vectorization

In [9]:
# Join the text features into one string per restaurant
df['combined_features'] = df['Cuisines'] + ' ' + df['City'] + ' ' + df['Currency']

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer

# Convert each combined string into a TF-IDF vector, ignoring English stop words
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(df['combined_features'])
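
To see what the vectorizer does with a combined string, it can help to inspect the vocabulary on a tiny input. The sketch below uses two made-up strings in the same shape as `combined_features`; the names `toy_features`, `toy_tfidf`, and `vocab` are illustrative. With the default tokenizer, text is lowercased and split into word tokens of two or more characters, so a multi-word cuisine such as "North Indian" is not kept as a single term.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Two strings shaped like combined_features (illustrative only)
toy_features = [
    "North Indian New Delhi Indian Rupees(Rs.)",
    "Japanese, Sushi Mandaluyong City Botswana Pula(P)",
]

toy_tfidf = TfidfVectorizer(stop_words="english")
toy_matrix = toy_tfidf.fit_transform(toy_features)

# Sorted list of the tokens the vectorizer learned
vocab = sorted(toy_tfidf.vocabulary_)
print(vocab)
print(toy_matrix.shape)
```

Note that "Indian" from the cuisine and "Indian" from "Indian Rupees" end up as the same `indian` token, and the single-character "P" in "Pula(P)" is dropped.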

Step-3: Build Recommendation System

1. Compute Cosine Similarity
2. Build a function to recommend restaurants using content-based filtering

In [12]:
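from sklearn.metrics.pairwise import cosine_similarity

# Pairwise similarity between all restaurants
# (recommend_restaurants below scores the user query directly and does not use this matrix)
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)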
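
Because `TfidfVectorizer` L2-normalizes each row by default, the cosine similarity of two TF-IDF rows equals their dot product. The sketch below checks this on three made-up strings; `docs`, `cos`, and `dot` are illustrative names, and `linear_kernel` is scikit-learn's plain dot-product kernel.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Three toy documents (illustrative only)
docs = ["North Indian New Delhi", "Chinese New Delhi", "Japanese Makati City"]
matrix = TfidfVectorizer().fit_transform(docs)

cos = cosine_similarity(matrix)  # normalizes rows, then takes dot products
dot = linear_kernel(matrix)      # dot products only

# The two agree because the TF-IDF rows already have unit length
print(np.allclose(cos, dot))
print(np.round(cos, 3))
```

Documents that share no tokens, such as the first and third here, get a similarity of zero.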

In [13]:
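def recommend_restaurants(user_cuisine, user_city, user_currency, top_n=5):
    """Return the top_n restaurants most similar to the user's preferences."""
    # Build the query in the same order as combined_features
    input_str = user_cuisine + ' ' + user_city + ' ' + user_currency
    # Project the query into the fitted TF-IDF space
    input_vec = tfidf.transform([input_str])
    sim_scores = cosine_similarity(input_vec, tfidf_matrix).flatten()
    # Indices of the top_n highest scores, best match first
    indices = sim_scores.argsort()[-top_n:][::-1]
    return df.iloc[indices][['Restaurant Name', 'Cuisines', 'City', 'Aggregate rating']]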
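
When judging the results, it can be useful to see the similarity score next to each recommendation. The following is a minimal sketch of that idea on a toy frame; `recommend_with_scores`, `toy_df`, and the `similarity` column are hypothetical names introduced here and are not part of the notebook above.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy data standing in for the real DataFrame (illustrative only)
toy_df = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C"],
    "combined_features": [
        "Italian Pizza New Delhi",
        "Chinese New Delhi",
        "Japanese Makati City",
    ],
})
toy_tfidf = TfidfVectorizer(stop_words="english")
toy_matrix = toy_tfidf.fit_transform(toy_df["combined_features"])


def recommend_with_scores(query, top_n=2):
    """Hypothetical variant that also returns each match's similarity score."""
    query_vec = toy_tfidf.transform([query])
    scores = cosine_similarity(query_vec, toy_matrix).flatten()
    # Sort descending and keep the first top_n positions
    order = scores.argsort()[::-1][:top_n]
    result = toy_df.iloc[order][["Restaurant Name"]].copy()
    result["similarity"] = scores[order]
    return result


result = recommend_with_scores("Italian New Delhi")
print(result)
```

A score near zero for every row would suggest that the query shares few or no tokens with the fitted vocabulary.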

Step-4: Testing the Recommendation System

In [14]:
recommend_restaurants("Italians","New Delhi","Indian Rupees(Rs.)")

Unnamed: 0,Restaurant Name,Cuisines,City,Aggregate rating
6803,Mahipal Dhaba,North Indian,New Delhi,2.8
5359,The Test,North Indian,New Delhi,2.9
2605,PM 2 AM Food Bank,North Indian,New Delhi,2.5
2606,Punjabi Chaap Corner,North Indian,New Delhi,2.9
5351,Muradabadi Chicken Biryani,North Indian,New Delhi,3.3
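
One plausible reason the query above returns only North Indian restaurants is the spelling "Italians": if the fitted vocabulary contains `italian` but not `italians`, the cuisine term is ignored and only the city and currency terms contribute to the score. This has not been verified against the real vocabulary; the sketch below only demonstrates the general behavior on a toy vectorizer, where `toy_tfidf`, `singular`, and `plural` are illustrative names.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Fit on two toy strings (illustrative only)
toy_tfidf = TfidfVectorizer(stop_words="english")
toy_tfidf.fit(["Italian New Delhi", "North Indian New Delhi"])

# A token seen during fitting produces a non-zero entry
singular = toy_tfidf.transform(["Italian"])
# A token never seen during fitting is dropped, leaving an all-zero vector
plural = toy_tfidf.transform(["Italians"])

print(singular.nnz)  # number of non-zero entries for "Italian"
print(plural.nnz)    # number of non-zero entries for "Italians"
```

Checking `'italians' in tfidf.vocabulary_` on the fitted vectorizer would confirm whether this applies to the real data.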


In [15]:
recommend_restaurants("Chinese","Bangalore","Indian Rupees(Rs.)")

Unnamed: 0,Restaurant Name,Cuisines,City,Aggregate rating
742,Communiti,Continental,Bangalore,4.2
726,Sultans of Spice,"North Indian, Mughlai",Bangalore,4.1
727,The Fatty Bao - Asian Gastro Bar,Asian,Bangalore,4.7
744,Hoot,"Continental, Italian, North Indian",Bangalore,3.9
738,Koramangala Social,"Continental, American",Bangalore,4.5


In [16]:
recommend_restaurants("American","New Delhi","Indian Rupees(Rs.)")

Unnamed: 0,Restaurant Name,Cuisines,City,Aggregate rating
6878,Smoke Trailer Grill,American,New Delhi,0.0
3907,Big Fat Sandwich,American,New Delhi,2.9
2626,Big Fat Sandwich,American,New Delhi,3.5
7621,Pat 'N' Harry,American,New Delhi,2.3
4321,Tourist Janpath,"North Indian, American, Chinese",New Delhi,4.1
