# **TASK 2: Restaurant Recommendation System**

### **1. Introduction**

With the rapid growth of online food platforms, recommending suitable restaurants to users has become essential. A Restaurant Recommendation System helps users discover restaurants based on their preferences such as cuisine type, price range, and ratings.

This project implements a content-based recommendation system using restaurant data. The system recommends restaurants that are similar to a user’s preferences or similar to a selected restaurant.


In [1]:
# Import Libraries
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics.pairwise import cosine_similarity

### **2. Objective**

The main objectives of this project are:

* To preprocess a restaurant dataset
* To encode categorical features
* To apply content-based filtering
* To recommend restaurants based on similarity
* To validate the recommendation results

In [2]:
# Load Dataset
r_data = pd.read_csv("/content/Dataset.csv")
r_data.head(3)

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,Botswana Pula(P),Yes,No,No,No,3,4.8,Dark Green,Excellent,314
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,Botswana Pula(P),Yes,No,No,No,3,4.5,Dark Green,Excellent,591
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,Botswana Pula(P),Yes,No,No,No,4,4.4,Green,Very Good,270


### **3. Dataset Description**

The dataset contains the following important columns:

* Restaurant Name – Name of the restaurant
* Cuisines – Types of food served
* Price range – Cost category of the restaurant
* Aggregate rating – Overall customer rating

Cuisines, Price range, and Aggregate rating are used to calculate similarity between restaurants; Restaurant Name identifies the recommended results.

### **4. Methodology**

**4.1 Data Preprocessing**

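* Missing values are filled to keep the data consistent: missing cuisines become the placeholder "No Cuisine", missing price ranges take the column median, and missing ratings and vote counts take the column mean.
* Only the features relevant to recommendation are selected.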
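
Before filling, it can help to see how many values are actually missing in each column. The following is a minimal sketch on a made-up three-row frame (the `toy` data is hypothetical; only the column names come from the dataset) that counts the gaps, applies the same fill rules, and confirms nothing is left missing.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with gaps, reusing the dataset's column names
toy = pd.DataFrame({
    "Cuisines": ["Seafood", None, "Japanese"],
    "Price range": [4, np.nan, 2],
    "Aggregate rating": [4.4, 3.9, np.nan],
})

# Missing count per column before filling
print(toy.isna().sum())

# Same fill rules as the notebook: placeholder text, median, mean
toy["Cuisines"] = toy["Cuisines"].fillna("No Cuisine")
toy["Price range"] = toy["Price range"].fillna(toy["Price range"].median())
toy["Aggregate rating"] = toy["Aggregate rating"].fillna(toy["Aggregate rating"].mean())

# Total number of missing cells after filling
remaining = int(toy.isna().sum().sum())
print("Missing after fill:", remaining)
```

Running the same `isna().sum()` check on `r_data` would show whether the fills in the next cell change anything for a given column.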

In [3]:
# Handle Missing Values
r_data["Cuisines"] = r_data["Cuisines"].fillna("No Cuisine")
r_data["Price range"] = r_data["Price range"].fillna(r_data["Price range"].median())
r_data["Aggregate rating"] = r_data["Aggregate rating"].fillna(r_data["Aggregate rating"].mean())
r_data["Votes"] = r_data["Votes"].fillna(r_data["Votes"].mean())

In [4]:
# Select features ("Votes" is kept here but is not added to the final feature matrix)
features = r_data[["Cuisines", "Price range", "Aggregate rating", "Votes"]]

**4.2 Feature Encoding**

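* The Cuisines column is categorical.
* One-Hot Encoding converts each distinct Cuisines value into its own binary (0/1) column. The full string is treated as one category, so "French, Japanese, Desserts" and "Japanese" get separate columns.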
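
The encoded column names in the feature matrix (for example "Cuisines_Afghani, Mughlai, Chinese") show that each full cuisine string is one category, so two restaurants that share only some cuisines get no credit for the overlap. One possible alternative, not the approach used in this notebook, is a multi-hot encoding with one column per individual cuisine. The sketch below uses pandas' `Series.str.get_dummies` on made-up strings and assumes the cuisines are separated by a comma and a space.

```python
import pandas as pd

# Hypothetical cuisine strings in the same comma-separated style as the "Cuisines" column
cuisines = pd.Series(["French, Japanese, Desserts", "Japanese", "Seafood, Asian"])

# One 0/1 column per individual cuisine instead of one column per full string
multi_hot = cuisines.str.get_dummies(sep=", ")
print(multi_hot)
```

With this layout, the first two rows both have a 1 in the `Japanese` column, so a similarity measure can pick up the shared cuisine.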

In [5]:
# Encode Categorical Variable
encoder = OneHotEncoder(sparse_output=False)
cuisine_encoded = encoder.fit_transform(features[["Cuisines"]])

# Convert the encoded features to a DataFrame with readable column names
cuisine_encoded_data = pd.DataFrame(cuisine_encoded, columns=encoder.get_feature_names_out(["Cuisines"]))

In [6]:
# Final Feature Matrix
feature_matrix = pd.concat([cuisine_encoded_data, features[["Price range","Aggregate rating"]]], axis=1)
feature_matrix.head(5)

Unnamed: 0,Cuisines_Afghani,"Cuisines_Afghani, Mughlai, Chinese","Cuisines_Afghani, North Indian","Cuisines_Afghani, North Indian, Pakistani, Arabian",Cuisines_African,"Cuisines_African, Portuguese",Cuisines_American,"Cuisines_American, Asian, Burger","Cuisines_American, Asian, European, Seafood","Cuisines_American, Asian, Italian, Seafood",...,"Cuisines_Turkish, Mediterranean, Middle Eastern",Cuisines_Vietnamese,"Cuisines_Vietnamese, Fish and Chips","Cuisines_Western, Asian, Cafe","Cuisines_Western, Fusion, Fast Food",Cuisines_World Cuisine,"Cuisines_World Cuisine, Mexican, Italian","Cuisines_World Cuisine, Patisserie, Cafe",Price range,Aggregate rating
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3,4.8
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3,4.5
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4,4.4
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4,4.9
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4,4.8


**4.3 Similarity Calculation**

* Cosine similarity is applied to measure similarity between restaurants.
* Restaurants with higher similarity scores are considered more relevant.
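
To make the measure concrete: the cosine similarity of two vectors is their dot product divided by the product of their lengths. The sketch below computes it with NumPy on three made-up vectors laid out like rows of the feature matrix (two cuisine indicator columns, then price range and rating). The helper `cosine` and all values are illustrative assumptions, not taken from the dataset.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the vector norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical rows: [is_seafood, is_japanese, price_range, rating]
seafood_a = [1, 0, 4, 4.4]
seafood_b = [1, 0, 3, 4.0]
japanese = [0, 1, 4, 4.4]

same_cuisine = cosine(seafood_a, seafood_b)
different_cuisine = cosine(seafood_a, japanese)
print(round(same_cuisine, 3), round(different_cuisine, 3))
```

In this toy case the same-cuisine pair scores higher, but both scores are close to 1 because the unscaled price and rating values account for most of each vector's length. If cuisine should weigh more, rescaling the numeric columns before computing similarity is one option to consider.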

In [7]:
# Compute Cosine Similarity
similarity_matrix = cosine_similarity(feature_matrix)

**4.4 Recommendation Strategy**

Two recommendation approaches are implemented:

* Restaurant-based recommendation
* User preference-based recommendation

**1. Restaurant-Based Recommendation**

In [8]:
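def recommend_restaurant(restaurant_name, top_n=2):
    # Find the row whose name matches exactly (assumes r_data keeps its default 0..n-1 index)
    matches = r_data.index[r_data["Restaurant Name"] == restaurant_name]
    if len(matches) == 0:
        raise ValueError(f"Restaurant not found: {restaurant_name!r}")
    index = matches[0]

    # Pair every other restaurant with its similarity to the query, highest first
    scores = [(i, s) for i, s in enumerate(similarity_matrix[index]) if i != index]
    scores.sort(key=lambda x: x[1], reverse=True)

    # Return the names of the top_n most similar restaurants
    return [r_data.iloc[i]["Restaurant Name"] for i, _ in scores[:top_n]]


recommend_restaurant("Le Petit Souffle")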

['Silantro Fil-Mex', "Rae's Coastal Cafe"]

**Validation of Results**

The correctness of the recommendation is validated by:

* Comparing the similarity scores
* Manually verifying cuisine, price range, and rating
* Ensuring the top recommendation has the highest similarity score among all other restaurants

Recommendations vary with the user's preferences and the input restaurant.
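
Comparing similarity scores is easier when the recommender returns them alongside the names. The following is a minimal sketch of that idea on a made-up four-row feature matrix; `names`, `X`, and `recommend_with_scores` are hypothetical and stand in for `r_data`, `feature_matrix`, and the notebook's function.

```python
import numpy as np

names = ["A", "B", "C", "D"]  # hypothetical restaurant names
# Hypothetical rows: [is_seafood, is_japanese, price_range, rating]
X = np.array([
    [1, 0, 4, 4.4],
    [1, 0, 3, 4.0],
    [0, 1, 4, 4.4],
    [0, 1, 1, 2.5],
], dtype=float)

# Row-normalise so that a matrix product yields all pairwise cosine similarities
unit = X / np.linalg.norm(X, axis=1, keepdims=True)
sim = unit @ unit.T

def recommend_with_scores(query, top_n=2):
    """Return (name, score) pairs for the top_n rows most similar to `query`."""
    q = names.index(query)
    # Sort by descending similarity and drop the query row itself
    order = [i for i in np.argsort(-sim[q]) if i != q]
    return [(names[i], float(sim[q, i])) for i in order[:top_n]]

result = recommend_with_scores("A")
for name, score in result:
    print(name, round(score, 3))
```

Seeing the scores next to the names makes it straightforward to confirm that the list is in descending order and to judge how close the runner-up is to the top match.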

**2. User Preference Input**

In [9]:
user_input = {
    "Cuisines": "Seafood",
    "Price range": 4,
    "Aggregate rating": 4.0,
}

# Encode User preference
user_data = pd.DataFrame({"Cuisines": [user_input["Cuisines"]]})

user_cuisine_encoded = encoder.transform(user_data)

# User feature vector
# Append price range and rating to the encoded cuisine, in the same column order as feature_matrix
# ('Votes' is not part of feature_matrix, so it is not added here)
user_feature_vector = np.concatenate([user_cuisine_encoded[0], [user_input["Price range"], user_input["Aggregate rating"]]]).reshape(1, -1)

In [10]:
# Final Recommendation
user_similarity = cosine_similarity(user_feature_vector, feature_matrix)

best_match_index = user_similarity.argmax()

print("Recommended Restaurant Based on (Cuisines,Price Range,Aggregate rating):", r_data.iloc[best_match_index]["Restaurant Name"])

# View recommended restaurant details
print("\n Details of the Restaurant:\n", r_data.iloc[best_match_index])


Recommended Restaurant Based on (Cuisines,Price Range,Aggregate rating): Blue Point Grill

 Details of the Restaurant:
 Restaurant ID                                     17211719
Restaurant Name                           Blue Point Grill
Country Code                                           216
City                                             Princeton
Address                 258 Nassau St, Princeton, NJ 08542
Locality                                         Princeton
Locality Verbose                      Princeton, Princeton
Longitude                                       -74.651139
Latitude                                         40.352385
Cuisines                                           Seafood
Average Cost for two                                    70
Currency                                         Dollar($)
Has Table booking                                       No
Has Online delivery                                     No
Is delivering now                                     
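
The `argmax` call in the cell above returns only the single best match. If several suggestions are wanted, the scores can be ranked instead. The sketch below shows this with `np.argsort` on a made-up three-row matrix; `names`, `X`, and `user` are hypothetical stand-ins for `r_data`, `feature_matrix`, and `user_feature_vector`.

```python
import numpy as np

names = ["A", "B", "C"]  # hypothetical restaurant names
# Hypothetical rows: [is_seafood, is_japanese, price_range, rating]
X = np.array([
    [1, 0, 4, 4.4],
    [0, 1, 4, 4.4],
    [1, 0, 2, 3.0],
], dtype=float)

# Hypothetical user preference: Seafood, price range 4, rating 4.0
user = np.array([1, 0, 4, 4.0])

# Cosine similarity of the user vector against every row
scores = X @ user / (np.linalg.norm(X, axis=1) * np.linalg.norm(user))

# Indices of the two highest scores, best first
top_k = np.argsort(-scores)[:2]
ranked = [(names[i], round(float(scores[i]), 3)) for i in top_k]
print(ranked)
```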

### **5. Conclusion**

This project demonstrates a content-based restaurant recommendation system. Because recommendations are computed from restaurant attributes alone, the system is explainable and does not require user history; the results were checked by manually comparing cuisine, price range, and rating. It can be extended further using location data, user reviews, or advanced text-based similarity techniques.