# 🍽️ Restaurant Recommendation System (Content-Based)

## 📌 Project Overview

This project aims to build a **content-based restaurant recommendation system** that suggests restaurants to users based on their preferences. The system uses restaurant metadata (such as cuisine types, cost, ratings, and delivery options) to compute similarities between restaurants and user-defined preferences.

---

## 🎯 Objective

- Recommend restaurants that closely match a user's specified preferences.
- Use content-based filtering (no collaborative data is required).

---

## 🛠️ Key Steps

1. **Data Preprocessing**  
   - Encode categorical variables (e.g., price range, table booking, online delivery).  
   - Standardize numeric features (e.g., aggregate rating).  

2. **Feature Engineering**  
   - Vectorize the comma-separated cuisine lists with TF-IDF.  
   - Combine multiple features into a single vector for each restaurant.  

3. **Content-Based Filtering**  
   - Create a user preference vector based on selected criteria.  
   - Compute cosine similarity between user vector and all restaurant vectors.  
   - Recommend the top N most similar restaurants.  

4. **Evaluation**  
   - Test with sample user preferences.  
   - Manually assess the relevance of recommendations.
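
The ranking step in the list above can be illustrated with a minimal sketch. The toy feature layout and the helper name `top_n_by_cosine` are assumptions made for illustration; the notebook itself builds its vectors with a scikit-learn pipeline.

```python
import numpy as np

def top_n_by_cosine(user_vec, item_matrix, n=3):
    """Return (indices, scores) of the n rows most similar to user_vec."""
    user_vec = np.asarray(user_vec, dtype=float)
    item_matrix = np.asarray(item_matrix, dtype=float)
    # Cosine similarity = dot product divided by the product of the norms.
    norms = np.linalg.norm(item_matrix, axis=1) * np.linalg.norm(user_vec)
    sims = item_matrix @ user_vec / np.where(norms == 0, 1.0, norms)
    # Sort in descending order of similarity and keep the first n positions.
    order = np.argsort(-sims)[:n]
    return order, sims[order]

# Hypothetical toy features: [is_french, is_japanese, is_high_price, scaled_rating]
restaurants = np.array([
    [1, 0, 1, 0.9],
    [0, 1, 1, 0.8],
    [1, 1, 0, 0.2],
    [0, 0, 0, 0.1],
])
user = [1, 0, 1, 1.0]

indices, scores = top_n_by_cosine(user, restaurants, n=2)
print(indices, scores)
```

Because cosine similarity compares direction rather than magnitude, a restaurant whose features point the same way as the preference vector ranks highly even if its raw values are smaller.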

---

## 📦 Outcome

A functional restaurant recommender system that returns relevant suggestions based solely on content similarity with user preferences.

## Import Packages

In [94]:
from sklearn.preprocessing import OneHotEncoder, StandardScaler, FunctionTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

## Load Dataset

In [95]:
data = pd.read_csv('../data/raw/Dataset.csv')
data.head()

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,Botswana Pula(P),Yes,No,No,No,3,4.8,Dark Green,Excellent,314
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,Botswana Pula(P),Yes,No,No,No,3,4.5,Dark Green,Excellent,591
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,Botswana Pula(P),Yes,No,No,No,4,4.4,Green,Very Good,270
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",...,Botswana Pula(P),No,No,No,No,4,4.9,Dark Green,Excellent,365
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",...,Botswana Pula(P),Yes,No,No,No,4,4.8,Dark Green,Excellent,229


## Define Input & Output Features

In [96]:
input_features = ['Cuisines', 'Price range', 'Aggregate rating', 'Has Table booking', 'Has Online delivery']
output_features = ['Restaurant Name', 'Country Code', 'City', 'Address', 'Locality Verbose', 'Cuisines', 'Currency', 
                   'Has Table booking', 'Has Online delivery', 'Price range', 'Aggregate rating', 'Votes']

## Data Preprocessing

Drop Nulls

In [97]:
data = data.dropna()
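
`dropna()` with no arguments removes a row if any column is missing. A possible alternative, sketched here on a toy frame, is to drop only rows that lack a value in a column the recommender actually uses. The toy data and the `required` list are assumptions for illustration.

```python
import pandas as pd

# Toy frame standing in for the restaurant data.
toy = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C"],
    "Cuisines": ["French", None, "Japanese, Sushi"],
    "Votes": [10, 20, None],
})

# Drop only rows that are missing a value in a required column.
required = ["Cuisines"]
cleaned = toy.dropna(subset=required).reset_index(drop=True)
print(cleaned)
```

Resetting the index afterwards keeps row labels equal to row positions, so label-based and position-based lookups agree later on.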

Drop unnecessary features

In [98]:
data = data.drop(columns=['Longitude', 'Latitude', 'Restaurant ID', 'Locality', 'Is delivering now', 'Switch to order menu', 'Rating color', 'Rating text'])
data.head()

Unnamed: 0,Restaurant Name,Country Code,City,Address,Locality Verbose,Cuisines,Average Cost for two,Currency,Has Table booking,Has Online delivery,Price range,Aggregate rating,Votes
0,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City, Mak...","French, Japanese, Desserts",1100,Botswana Pula(P),Yes,No,3,4.8,314
1,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City, Ma...",Japanese,1200,Botswana Pula(P),Yes,No,3,4.5,591
2,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...","Seafood, Asian, Filipino, Indian",4000,Botswana Pula(P),Yes,No,4,4.4,270
3,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City, Mandal...","Japanese, Sushi",1500,Botswana Pula(P),No,No,4,4.9,365
4,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City, Mandal...","Japanese, Korean",1500,Botswana Pula(P),Yes,No,4,4.8,229


Map Numeric Category Codes to Text Labels

In [99]:
# Country code -> country name (e.g., 162 covers Makati City and Mandaluyong City)
country_code_mapping = {
    162: 'Philippines',
    30: 'Brazil',
    216: 'United States',
    14: 'Australia',
    37: 'Canada',
    184: 'Singapore',
    214: 'UAE',
    1: 'India',
    94: 'Indonesia',
    148: 'New Zealand',
    215: 'United Kingdom',
    166: 'Qatar',
    189: 'South Africa',
    191: 'Sri Lanka',
    208: 'Turkey'
}

data['Country'] = data['Country Code'].map(country_code_mapping)
data.drop(columns='Country Code', inplace=True)

In [100]:
price_range_mappings = {
    1:'low',
    2:'medium',
    3:'high',
    4:'very high',
}
data['Price range'] = data['Price range'].map(price_range_mappings)
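
`Series.map` with a dictionary returns `NaN` for any value that has no key, so a gap in a mapping passes silently. A small check like the following sketch can surface such gaps; the toy `codes` series is an assumption for illustration.

```python
import pandas as pd

price_labels = {1: "low", 2: "medium", 3: "high", 4: "very high"}

# Toy codes; 5 is deliberately absent from the mapping.
codes = pd.Series([1, 3, 5, 2], name="Price range")

mapped = codes.map(price_labels)
# Collect the original codes that turned into NaN after mapping.
unmapped = sorted(codes[mapped.isna()].unique().tolist())
print(unmapped)
```

The same check applies to the country-code mapping.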

Columns Reordering

In [101]:
new_order = ['Restaurant Name', 'Country', 'City', 'Address', 'Locality Verbose', 'Cuisines', 'Average Cost for two', 'Currency', 'Price range', 
         'Has Table booking', 'Has Online delivery', 'Aggregate rating', 'Votes']
output_data = data[new_order]
output_data.head()

Unnamed: 0,Restaurant Name,Country,City,Address,Locality Verbose,Cuisines,Average Cost for two,Currency,Price range,Has Table booking,Has Online delivery,Aggregate rating,Votes
0,Le Petit Souffle,Philippines,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City, Mak...","French, Japanese, Desserts",1100,Botswana Pula(P),high,Yes,No,4.8,314
1,Izakaya Kikufuji,Philippines,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City, Ma...",Japanese,1200,Botswana Pula(P),high,Yes,No,4.5,591
2,Heat - Edsa Shangri-La,Philippines,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...","Seafood, Asian, Filipino, Indian",4000,Botswana Pula(P),very high,Yes,No,4.4,270
3,Ooma,Philippines,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City, Mandal...","Japanese, Sushi",1500,Botswana Pula(P),very high,No,No,4.9,365
4,Sambo Kojin,Philippines,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City, Mandal...","Japanese, Korean",1500,Botswana Pula(P),very high,Yes,No,4.8,229


Select the features used for vectorization and matching

In [102]:
input_data = output_data[input_features]
input_data.head()

Unnamed: 0,Cuisines,Price range,Aggregate rating,Has Table booking,Has Online delivery
0,"French, Japanese, Desserts",high,4.8,Yes,No
1,Japanese,high,4.5,Yes,No
2,"Seafood, Asian, Filipino, Indian",very high,4.4,Yes,No
3,"Japanese, Sushi",very high,4.9,No,No
4,"Japanese, Korean",very high,4.8,Yes,No


Vectorizing Pipeline

In [103]:
# Preprocessing for cuisines: turn the comma-separated list into space-separated text for TF-IDF
def preprocess_cuisines(frame):
    # ColumnTransformer passes a one-column DataFrame; TF-IDF expects a 1-D series of strings
    return frame['Cuisines'].str.replace(', ', ' ', regex=False)

In [None]:
# Create the column transformer with FunctionTransformer
preprocessor = ColumnTransformer(
    transformers=[
        # TF-IDF for cuisines
        ('cuisines_tfidf', 
         Pipeline([
             # A named function (unlike a lambda) keeps the fitted pipeline picklable
             ('preprocess', FunctionTransformer(preprocess_cuisines, validate=False)),
             ('tfidf', TfidfVectorizer(max_features=100))
         ]), 
         ['Cuisines']),
        
        # One-hot encoding for price range
        ('price_ohe', OneHotEncoder(handle_unknown='ignore'), ['Price range']),
        
        # Standard scaling for numerical features
        ('num_scaling', StandardScaler(), ['Aggregate rating']),
        
        # Binary encoding for yes/no features
        ('binary', OneHotEncoder(drop='if_binary'), ['Has Table booking', 'Has Online delivery'])
    ],
    remainder='drop'
)
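
One consequence of replacing `', '` with a space is that the default TF-IDF tokenizer works on single words, so a multi-word cuisine such as "North Indian" contributes the separate terms "north" and "indian". The sketch below contrasts that with a comma-based tokenizer. The helper `split_on_commas` and the toy cuisine strings are assumptions for illustration, not part of the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

cuisines = ["North Indian, Chinese", "Indian, Tex-Mex", "French, Japanese"]

# Word-level tokens: what the default tokenizer sees after ", " becomes " ".
word_level = TfidfVectorizer()
word_level.fit([c.replace(", ", " ") for c in cuisines])
word_vocab = sorted(word_level.vocabulary_)

def split_on_commas(text):
    """Hypothetical tokenizer: one token per comma-separated cuisine."""
    return [part.strip().lower() for part in text.split(",") if part.strip()]

# Cuisine-level tokens: each full cuisine name is a single term.
cuisine_level = TfidfVectorizer(
    tokenizer=split_on_commas, lowercase=False, token_pattern=None
)
cuisine_level.fit(cuisines)
cuisine_vocab = sorted(cuisine_level.vocabulary_)

print(word_vocab)
print(cuisine_vocab)
```

With word-level tokens, a query for "Indian" also partially matches "North Indian". Whether that is desirable depends on how loosely cuisines should match.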

In [105]:
# Create the full pipeline
pipeline = Pipeline([
    ('preprocessor', preprocessor)
])

In [106]:
# Fit and transform the data
transformed_data = pipeline.fit_transform(data)
print(f"Transformed data shape: {transformed_data.shape}")

Transformed data shape: (9542, 107)
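
The 107 columns are consistent with the transformer settings: up to 100 TF-IDF terms, 4 price-range categories, 1 scaled rating, and 1 column for each of the two Yes/No features. The toy example below shows the same arithmetic on a smaller scale. The data is made up, and it selects `Cuisines` by a plain string so that the vectorizer receives a 1-D column without a `FunctionTransformer`.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows with cuisines already space-separated.
toy = pd.DataFrame({
    "Cuisines": ["French Japanese", "Japanese", "Italian French"],
    "Price range": ["high", "low", "high"],
    "Aggregate rating": [4.8, 3.9, 4.1],
    "Has Table booking": ["Yes", "No", "Yes"],
})

pre = ColumnTransformer([
    ("tfidf", TfidfVectorizer(), "Cuisines"),            # 3 distinct terms
    ("price", OneHotEncoder(handle_unknown="ignore"), ["Price range"]),  # 2 categories
    ("rating", StandardScaler(), ["Aggregate rating"]),  # 1 numeric column
    ("binary", OneHotEncoder(drop="if_binary"), ["Has Table booking"]),  # 1 column
])

matrix = pre.fit_transform(toy)
print(matrix.shape)  # 3 + 2 + 1 + 1 = 7 columns
```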


User Preferences

In [None]:
def get_user_input():
    print("Please enter restaurant preferences:")
    cuisines = input("Cuisines (comma separated): ")
    price = input("Price range (low/medium/high/very high): ")
    min_rating = float(input("Minimum rating (0-5): "))
    table_booking = input("Require table booking? (Yes/No): ")
    online_delivery = input("Require online delivery? (Yes/No): ")
    
    return pd.DataFrame([{
        'Cuisines': cuisines,
        'Price range': price,
        'Aggregate rating': min_rating,
        'Has Table booking': table_booking,
        'Has Online delivery': online_delivery
    }])

user_input = get_user_input()
user_input

Please enter restaurant preferences:


Unnamed: 0,Cuisines,Price range,Aggregate rating,Has Table booking,Has Online delivery
0,French,high,4.5,Yes,Yes
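
The `binary` encoder in the pipeline above is created without `handle_unknown`, so an answer such as `yes` or `Y` that was not seen during fitting would raise an error at transform time. One possible safeguard is to normalize the answers before building the DataFrame. The function `normalize_preferences` and the set `VALID_PRICES` are hypothetical names introduced for this sketch.

```python
VALID_PRICES = {"low", "medium", "high", "very high"}

def normalize_preferences(cuisines, price, min_rating, table_booking, online_delivery):
    """Validate raw answers and return a dict with the notebook's column names."""
    price = price.strip().lower()
    if price not in VALID_PRICES:
        raise ValueError(f"Unknown price range: {price!r}")

    rating = float(min_rating)
    if not 0 <= rating <= 5:
        raise ValueError("Rating must be between 0 and 5")

    def yes_no(value):
        # Map common spellings onto the 'Yes'/'No' labels used in the dataset.
        value = value.strip().lower()
        if value in {"yes", "y"}:
            return "Yes"
        if value in {"no", "n"}:
            return "No"
        raise ValueError(f"Expected Yes or No, got {value!r}")

    # Re-join cuisines with ', ' so they match the dataset's separator.
    cleaned = ", ".join(p.strip() for p in cuisines.split(",") if p.strip())

    return {
        "Cuisines": cleaned,
        "Price range": price,
        "Aggregate rating": rating,
        "Has Table booking": yes_no(table_booking),
        "Has Online delivery": yes_no(online_delivery),
    }

prefs = normalize_preferences(" french ,japanese", "HIGH", "4.5", "yes", "N")
print(prefs)
```

The returned dict can be wrapped in `pd.DataFrame([prefs])` to obtain the same one-row frame that `get_user_input` produces.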


Matching Process

In [108]:
def find_similar_restaurants(user_input, top_n=10):
    # Transform user input
    user_vector = pipeline.transform(user_input)
    
    # Calculate cosine similarity with all restaurants
    similarities = cosine_similarity(user_vector, transformed_data)
    
    # Get top N most similar restaurants
    similar_indices = similarities.argsort()[0][-top_n:][::-1]
    
    # Return the matching restaurants
    return output_data.iloc[similar_indices]

find_similar_restaurants(user_input)

Unnamed: 0,Restaurant Name,Country,City,Address,Locality Verbose,Cuisines,Average Cost for two,Currency,Price range,Has Table booking,Has Online delivery,Aggregate rating,Votes
3257,Bonne Bouche,India,New Delhi,"Shop 16, 1st Floor, Defence Colony Market, Def...","Defence Colony, New Delhi","Italian, French",1750,Indian Rupees(Rs.),high,Yes,Yes,4.1,711
0,Le Petit Souffle,Philippines,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City, Mak...","French, Japanese, Desserts",1100,Botswana Pula(P),high,Yes,No,4.8,314
809,Chili's,India,Chennai,"49 & 50 L, Express Avenue Mall, White's Road, ...","Express Avenue Mall, Royapettah, Chennai","Mexican, American, Tex-Mex, Burger",1700,Indian Rupees(Rs.),high,Yes,Yes,4.8,1262
2483,The Fusion Kitchen,India,Mumbai,"Shop 1, Opposite Veda Building, Near Bhavdevi ...","Borivali West, Mumbai","North Indian, Italian, Chinese, Mexican",1000,Indian Rupees(Rs.),high,Yes,Yes,4.7,2083
2302,Chili's,India,Hyderabad,"Flat 48, Ground Floor, Opposite Vengal Rao Par...","Banjara Hills, Hyderabad","Mexican, American, Tex-Mex, Burger",1800,Indian Rupees(Rs.),high,Yes,Yes,4.7,1932
3014,Zabardast Indian Kitchen,India,New Delhi,"E-13/29, Ground Floor, Middle Circle, Connaugh...","Connaught Place, New Delhi",North Indian,1800,Indian Rupees(Rs.),high,Yes,Yes,4.7,242
814,Bombay Brasserie,India,Chennai,"3, College Lane, Nungambakkam, Chennai","Nungambakkam, Chennai",North Indian,1200,Indian Rupees(Rs.),high,Yes,Yes,4.6,1753
820,Basil With A Twist,India,Chennai,"58-A, Habibullah Road, T. Nagar, Chennai","T. Nagar, Chennai","Continental, Cafe, Spanish, Italian, European,...",1500,Indian Rupees(Rs.),high,Yes,Yes,4.5,1210
3981,Coast Cafe,India,New Delhi,"H-2, 2nd & 3rd Floor, Hauz Khas Village, New D...","Hauz Khas Village, New Delhi","Continental, Kerala",1400,Indian Rupees(Rs.),high,Yes,Yes,4.5,1033
2099,Indian Grill Room,India,Gurgaon,"315, 3rd Floor, Suncity Business Tower, Golf C...","Suncity Business Tower, Golf Course Road, Gurgaon","North Indian, Mughlai",1800,Indian Rupees(Rs.),high,Yes,Yes,4.5,1262
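
The result table lists the matches but not how close each one is. A small variation, sketched below on toy data, returns the similarity score alongside each row so that weak matches are easier to spot. The function name `rank_with_scores` and the toy vectors are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

def rank_with_scores(user_vector, item_vectors, items, top_n=3):
    """Return the top_n rows of items with a 'similarity' column attached."""
    scores = cosine_similarity(user_vector, item_vectors)[0]
    # argsort is ascending, so reverse it and keep the first top_n positions.
    order = np.argsort(scores)[::-1][:top_n]
    # iloc is positional, matching the row order of item_vectors.
    return items.iloc[order].assign(similarity=scores[order])

# Toy data with non-contiguous labels, as left behind by dropna().
items = pd.DataFrame({"Restaurant Name": ["A", "B", "C"]}, index=[10, 20, 30])
item_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
user_vector = np.array([[1.0, 0.2]])

ranked = rank_with_scores(user_vector, item_vectors, items, top_n=2)
print(ranked)
```

Positional indexing with `iloc` matters here because the vector matrix has no row labels, only row positions.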
