## 1. Importing Libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.metrics.pairwise import cosine_similarity

## 2. Loading the Dataset

In [2]:
df = pd.read_csv('C:/Users/abc/OneDrive/Desktop/File Transfer/Projects/Cognifyz Internship/Dataset .csv')
df.head()

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,Botswana Pula(P),Yes,No,No,No,3,4.8,Dark Green,Excellent,314
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,Botswana Pula(P),Yes,No,No,No,3,4.5,Dark Green,Excellent,591
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,Botswana Pula(P),Yes,No,No,No,4,4.4,Green,Very Good,270
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",...,Botswana Pula(P),No,No,No,No,4,4.9,Dark Green,Excellent,365
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",...,Botswana Pula(P),Yes,No,No,No,4,4.8,Dark Green,Excellent,229


Creating a DataFrame (`pd.read_csv` already returns a DataFrame, so this step leaves the contents of `df` unchanged)

In [3]:
df = pd.DataFrame(df)

## 3. Preprocessing the Data

In this section, I handled missing values: missing numerical values (`Price range`, `Aggregate rating`) were filled with the column median, and missing categorical values (`Cuisines`, `City`) were filled with the most frequent value. I also included `City` as a feature so that recommendations can be narrowed down to the user's location.

In [4]:
num_imputer = SimpleImputer(strategy='median')
df[['Price range', 'Aggregate rating']] = num_imputer.fit_transform(df[['Price range', 'Aggregate rating']])

cat_imputer = SimpleImputer(strategy='most_frequent')
df[['Cuisines', 'City']] = cat_imputer.fit_transform(df[['Cuisines', 'City']])
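To make the two imputation strategies concrete, here is a minimal sketch on a small, made-up frame (the `toy` data is hypothetical and only reuses the column names from the dataset). It shows the median filling numeric gaps and the most frequent value filling categorical gaps.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy rows with one gap per column.
toy = pd.DataFrame({
    "Price range": [1.0, np.nan, 3.0, 4.0],
    "Aggregate rating": [4.5, 3.0, np.nan, 4.0],
    "Cuisines": ["Japanese", np.nan, "Japanese", "French"],
    "City": ["New Delhi", "New Delhi", np.nan, "Makati City"],
})

num_cols = ["Price range", "Aggregate rating"]
cat_cols = ["Cuisines", "City"]

# Count the gaps before imputing so the effect is visible.
missing_before = int(toy.isna().sum().sum())

# Median for numeric columns, most frequent value for categorical columns.
toy[num_cols] = SimpleImputer(strategy="median").fit_transform(toy[num_cols])
toy[cat_cols] = SimpleImputer(strategy="most_frequent").fit_transform(toy[cat_cols])

missing_after = int(toy.isna().sum().sum())
print(missing_before, missing_after)
print(toy)
```

Running `df.isna().sum()` on the real dataset before the imputation cell is a quick way to see which of the four columns actually contain gaps.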

## 4. Categorical Feature Encoding

In [5]:
encoder = OneHotEncoder(sparse_output=False)
encoded_features = encoder.fit_transform(df[['Cuisines', 'City']])
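One property of this encoding is worth seeing on a small example: a one-hot encoder treats each distinct string as its own category, so a value such as "Japanese, Sushi" gets a different column from "Japanese". The sketch below uses hypothetical toy rows to contrast that with a per-cuisine encoding built with pandas' `Series.str.get_dummies`. The per-cuisine variant is an alternative for comparison, not what the notebook does.

```python
import pandas as pd

# Hypothetical toy rows using cuisine strings in the dataset's format.
toy = pd.DataFrame({
    "Cuisines": ["French, Japanese, Desserts", "Japanese", "Japanese, Sushi"],
})

# One column per whole string: "Japanese" and "Japanese, Sushi" share no column.
whole_string = pd.get_dummies(toy["Cuisines"])

# One column per individual cuisine: rows overlap wherever a cuisine is shared.
per_cuisine = toy["Cuisines"].str.get_dummies(sep=", ")

print(list(whole_string.columns))
print(list(per_cuisine.columns))
print(per_cuisine)
```

With the whole-string encoding, a user preference of "Japanese" matches only restaurants whose cuisine field is exactly "Japanese"; with the per-cuisine encoding it would also overlap with the other two rows.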

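Next, I selected the features to use (the encoded `Cuisines` and `City` columns, `Price range`, and `Aggregate rating`) and combined them into a single feature matrix.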

In [6]:
price_range = df[['Price range']].values
features = np.hstack((encoded_features, price_range, df[['Aggregate rating']].values))
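The combined matrix mixes one-hot entries, which are 0 or 1, with raw numeric values such as a price range of 3 or a rating of 4.8, so the numeric columns can carry more weight in a similarity score. The sketch below shows one possible adjustment, rescaling the numeric columns to [0, 1] with `MinMaxScaler` before stacking. This is an optional variation on made-up values, not a step from the notebook.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy values: columns are Price range and Aggregate rating.
numeric = np.array([[1.0, 2.5],
                    [2.0, 4.0],
                    [4.0, 5.0]])

# Rescale each numeric column to the [0, 1] interval.
scaled_numeric = MinMaxScaler().fit_transform(numeric)

# Stand-in for the one-hot block (3 restaurants, 2 categories).
one_hot = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 0.0]])

# Stack categorical and numeric parts side by side, as in the cell above.
toy_features = np.hstack((one_hot, scaled_numeric))
print(toy_features)
```

If scaling is used, the same fitted scaler would also have to be applied to the user's numeric preferences so that both sides stay comparable.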

For the testing of the recommendation system, I provided sample user preferences based on the given dataset.

In [7]:
user_pref_cuisine = "North Indian"
user_pref_price = 2  
user_pref_locality = "New Delhi"

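After this, I placed the user preferences in a one-row DataFrame and refit the encoder on the dataset together with that row, so that the encoder has seen every category, including the user's. The user supplies no rating, so `Aggregate rating` is set to 0 in that row. Note that the restaurant `features` matrix was built before this refit, so the two stay aligned only if the user's cuisine and city already occur in the dataset.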

In [8]:
user_pref_df = pd.DataFrame([[user_pref_cuisine, user_pref_locality, user_pref_price, 0]], columns=['Cuisines', 'City', 'Price range', 'Aggregate rating'])

In [9]:
encoder.fit(pd.concat([df[['Cuisines', 'City']], user_pref_df[['Cuisines', 'City']]]))

I then transformed the user's categorical preferences with the encoder and combined the encoded columns with the numerical preferences.

In [10]:
user_pref_encoded_cat = encoder.transform(user_pref_df[['Cuisines', 'City']])

user_pref_features = np.hstack((user_pref_encoded_cat, user_pref_df[['Price range', 'Aggregate rating']].values))

## 5. Calculating Similarity Scores

In [11]:
similarity = cosine_similarity(user_pref_features, features)
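Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures direction rather than magnitude. The following sketch, on made-up vectors, computes it by hand and checks the result against `cosine_similarity`; the helper `cosine_by_hand` is a hypothetical name used only here.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy vectors: one user row and three restaurant rows.
user_vec = np.array([[1.0, 0.0, 2.0]])
restaurants = np.array([[1.0, 0.0, 2.0],
                        [0.0, 1.0, 2.0],
                        [0.0, 1.0, 0.0]])

def cosine_by_hand(a, b):
    """Dot product divided by the product of the two vector norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

manual = np.array([cosine_by_hand(user_vec[0], r) for r in restaurants])

# The library call returns a (1, n_restaurants) matrix; take its only row.
library = cosine_similarity(user_vec, restaurants)[0]

print(manual)
print(library)
```

The result has one row per user vector, which is why the ranking step indexes it with `similarity[0]`.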

## 6. Evaluation of the Recommendation System

Before evaluating, I sorted the restaurants by similarity score in descending order.

In [14]:
sorted_indices = similarity[0].argsort()[::-1]
sorted_restaurants = df["Restaurant Name"].iloc[sorted_indices].tolist()

print(f"Top 5 Restaurant Recommendations for {user_pref_cuisine} (price range {user_pref_price}) in {user_pref_locality}:")
print(sorted_restaurants[:5])

Top 5 Restaurant Recommendations for North Indian (price range 2) in New Delhi:
['Friends Cafe', 'Lemon Chick', 'Syall Kotian Da Dhaba', 'Recipe Tadka', 'Rasoi']
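When inspecting results, it can help to see the similarity score next to each name. The sketch below wraps the ranking step in a small helper that returns both; `top_n` and the toy inputs are hypothetical and not part of the notebook.

```python
import numpy as np
import pandas as pd

def top_n(names, scores, n=5):
    """Return the n highest-scoring names with their scores, best first."""
    order = np.argsort(scores)[::-1][:n]
    return pd.DataFrame({
        "Restaurant Name": names.iloc[order].to_numpy(),
        "similarity": scores[order],
    })

# Hypothetical toy inputs standing in for df["Restaurant Name"] and similarity[0].
toy_names = pd.Series(["A", "B", "C", "D"])
toy_scores = np.array([0.2, 0.9, 0.5, 0.7])

ranked = top_n(toy_names, toy_scores, n=3)
print(ranked)
```

With the notebook's variables, the equivalent call would be `top_n(df["Restaurant Name"], similarity[0])`.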


In conclusion, the restaurant recommendation system works end to end: given a preferred cuisine, price range, and city, it returns the restaurants whose features are most similar to those preferences.