
# **Restaurant recommendation**

**Objective:** Create a restaurant recommendation
system based on user preferences.

# Importing Requirements

In [1]:
import pandas as pd
import numpy as np
import warnings
import plotly.express as px
warnings.filterwarnings('ignore')

**Loading the restaurant dataset**

In [2]:
df=pd.read_csv('/content/sample_data/Dataset .csv')

# Data Exploration

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9551 entries, 0 to 9550
Data columns (total 21 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Restaurant ID         9551 non-null   int64  
 1   Restaurant Name       9551 non-null   object 
 2   Country Code          9551 non-null   int64  
 3   City                  9551 non-null   object 
 4   Address               9551 non-null   object 
 5   Locality              9551 non-null   object 
 6   Locality Verbose      9551 non-null   object 
 7   Longitude             9551 non-null   float64
 8   Latitude              9551 non-null   float64
 9   Cuisines              9542 non-null   object 
 10  Average Cost for two  9551 non-null   int64  
 11  Currency              9551 non-null   object 
 12  Has Table booking     9551 non-null   object 
 13  Has Online delivery   9551 non-null   object 
 14  Is delivering now     9551 non-null   object 
 15  Switch to order menu 

In [4]:
df.isnull().sum() # checking missing values

Restaurant ID           0
Restaurant Name         0
Country Code            0
City                    0
Address                 0
Locality                0
Locality Verbose        0
Longitude               0
Latitude                0
Cuisines                9
Average Cost for two    0
Currency                0
Has Table booking       0
Has Online delivery     0
Is delivering now       0
Switch to order menu    0
Price range             0
Aggregate rating        0
Rating color            0
Rating text             0
Votes                   0
dtype: int64

In [5]:
df.fillna('Other',inplace=True) # handling missing values

In [6]:
df.isnull().sum()

Restaurant ID           0
Restaurant Name         0
Country Code            0
City                    0
Address                 0
Locality                0
Locality Verbose        0
Longitude               0
Latitude                0
Cuisines                0
Average Cost for two    0
Currency                0
Has Table booking       0
Has Online delivery     0
Is delivering now       0
Switch to order menu    0
Price range             0
Aggregate rating        0
Rating color            0
Rating text             0
Votes                   0
dtype: int64

In [7]:
df.columns

Index(['Restaurant ID', 'Restaurant Name', 'Country Code', 'City', 'Address',
       'Locality', 'Locality Verbose', 'Longitude', 'Latitude', 'Cuisines',
       'Average Cost for two', 'Currency', 'Has Table booking',
       'Has Online delivery', 'Is delivering now', 'Switch to order menu',
       'Price range', 'Aggregate rating', 'Rating color', 'Rating text',
       'Votes'],
      dtype='object')

In [8]:
df['Cuisines']=df['Cuisines'].str.split(',') # Split the cuisines by comma
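
If the raw strings put a space after each comma (an assumption; the displayed lists do not show it either way), splitting on a bare comma leaves leading whitespace on every entry after the first. A minimal sketch on a made-up two-row frame (`toy` is a hypothetical name, not part of the notebook) that splits and strips in one step:

```python
import pandas as pd

# Hypothetical toy data standing in for the 'Cuisines' column.
toy = pd.DataFrame({"Cuisines": ["French, Japanese, Desserts", "Japanese"]})

# Split on commas, then strip surrounding whitespace from each entry.
toy["Cuisines"] = toy["Cuisines"].apply(
    lambda s: [c.strip() for c in s.split(",")]
)

print(toy["Cuisines"].tolist())
```

Stripped entries matter later because `' '.join(...)` and exact comparisons would otherwise treat `' Japanese'` and `'Japanese'` as different strings.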

# Let's check and select the features required to build the recommendation system.

In [9]:
# Take an explicit copy so later column assignments do not act on a slice of df
Restaurant_data=df[['Restaurant Name','City','Cuisines','Locality','Price range']].copy()

In [10]:
Restaurant_data.head()[['Cuisines','City','Price range']]

| | Cuisines | City | Price range |
|---|---|---|---|
| 0 | [French, Japanese, Desserts] | Makati City | 3 |
| 1 | [Japanese] | Makati City | 3 |
| 2 | [Seafood, Asian, Filipino, Indian] | Mandaluyong City | 4 |
| 3 | [Japanese, Sushi] | Mandaluyong City | 4 |
| 4 | [Japanese, Korean] | Mandaluyong City | 4 |


In [11]:
fig=px.histogram(Restaurant_data,x='Price range',text_auto='')
fig.show()

In [12]:
fig=px.histogram(Restaurant_data,x='City',color='City',title='Restaurants per City')
fig.show()

The graph above shows that New Delhi, Noida, and Gurgaon have the most restaurants in the dataset.
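
The city ranking read off the chart can also be checked numerically with `value_counts`. A minimal sketch on made-up rows (the `toy` frame and its values are hypothetical; in the notebook the same call would run on `Restaurant_data['City']`):

```python
import pandas as pd

# Hypothetical toy frame; values are invented for illustration only.
toy = pd.DataFrame(
    {"City": ["New Delhi", "New Delhi", "Noida", "Gurgaon", "New Delhi", "Noida"]}
)

# Count restaurants per city, sorted from most to fewest.
city_counts = toy["City"].value_counts()

print(city_counts.head(3))
```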

## Train/Test Split

In [13]:
X=Restaurant_data.drop('Restaurant Name',axis=1)

In [14]:
y=Restaurant_data['Restaurant Name']

In [15]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,
                                               random_state=42)

# Vectorization

In [16]:
from sklearn.feature_extraction.text import TfidfVectorizer,CountVectorizer

In [17]:
Tf_Idf=TfidfVectorizer(stop_words='english')

In [18]:
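# Note: fitting on the whole DataFrame iterates over its column labels,
# not the cuisine text, so the vocabulary holds only the column-name tokens
Tf_Idf.fit(X_train)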

In [19]:
# Convert lists of cuisines to strings
X_train['Cuisines'] = X_train['Cuisines'].apply(lambda x: ' '.join(x))
X_test['Cuisines'] = X_test['Cuisines'].apply(lambda x: ' '.join(x))

Tf_Idf_Xtrain = Tf_Idf.transform(X_train['Cuisines'])
Tf_Idf_Xtest = Tf_Idf.transform(X_test['Cuisines'])

In [20]:
Tf_Idf_Xtrain

<7640x5 sparse matrix of type '<class 'numpy.float64'>'
	with 0 stored elements in Compressed Sparse Row format>

In [21]:
Tf_Idf_Xtest

<1911x5 sparse matrix of type '<class 'numpy.float64'>'
	with 0 stored elements in Compressed Sparse Row format>
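
The 5 columns and 0 stored elements in these matrices follow from how the vectorizer was fitted: iterating over a DataFrame yields its column labels, so the vocabulary comes from the column names rather than from the cuisines. A minimal sketch on made-up data (the `toy` frame and the names `by_columns` and `by_text` are hypothetical) contrasts that with fitting on the cuisine strings:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy frame with the same columns as X_train.
toy = pd.DataFrame({
    "City": ["Makati City", "Mandaluyong City"],
    "Cuisines": ["French Japanese Desserts", "Japanese Sushi"],
    "Locality": ["Century City Mall", "Ortigas"],
    "Price range": [3, 4],
})

# Fitting on the frame: the "documents" are the four column labels.
by_columns = TfidfVectorizer(stop_words="english").fit(toy)
print(sorted(by_columns.vocabulary_))

# Fitting on the text column: the vocabulary comes from the cuisines.
by_text = TfidfVectorizer(stop_words="english").fit(toy["Cuisines"])
print(sorted(by_text.vocabulary_))

# The cuisine-based matrix now has non-zero entries.
print(by_text.transform(toy["Cuisines"]).nnz)
```

With an all-zero feature matrix, every classifier sees identical inputs for every row, so fitting on the cuisine column is a precondition for any meaningful comparison between models.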

In [22]:
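# Note: fit_transform refits the vectorizer on each call, so the train and
# test matrices below are built from different vocabularies
Tf_Idf_ytrain=Tf_Idf.fit_transform(y_train)
Tf_Idf_ytest=Tf_Idf.fit_transform(y_test)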

In [23]:
Tf_Idf_ytrain

<7640x4849 sparse matrix of type '<class 'numpy.float64'>'
	with 17340 stored elements in Compressed Sparse Row format>

In [24]:
Tf_Idf_ytest

<1911x1884 sparse matrix of type '<class 'numpy.float64'>'
	with 4403 stored elements in Compressed Sparse Row format>

## Model Comparisons - Naive Bayes, Logistic Regression, LinearSVC

In [25]:
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score,classification_report,confusion_matrix

In [26]:
naive_bayes=MultinomialNB()


In [27]:
X_train.shape

(7640, 4)

In [28]:
y_train.shape

(7640,)

In [29]:
naive_bayes.fit(Tf_Idf_Xtrain.toarray(),y_train)

In [30]:
log_reg=LogisticRegression(max_iter=1000)
log_reg.fit(Tf_Idf_Xtrain.toarray(),y_train)

In [31]:
svc=LinearSVC()
svc.fit(Tf_Idf_Xtrain.toarray(),y_train)

# Performance Evaluation

In [32]:
def report(model):
  y_pred=model.predict(Tf_Idf_Xtest.toarray())
  print(classification_report(y_test,y_pred))
  print(confusion_matrix(y_test,y_pred))

In [33]:
print('Naive Bayes')
report(naive_bayes)

Naive Bayes
                                                        precision    recall  f1-score   support

                                           #OFF Campus       0.00      0.00      0.00         1
                                   19 Flavours Biryani       0.00      0.00      0.00         1
                                         21 Gun Salute       0.00      0.00      0.00         1
                                         22nd Parallel       0.00      0.00      0.00         1
                                       23 On Hazelwood       0.00      0.00      0.00         1
                24/7 Pastry Shop - The Lalit New Delhi       0.00      0.00      0.00         1
                 24/7 Restaurant - The Lalit New Delhi       0.00      0.00      0.00         1
                                       3 Squares Diner       0.00      0.00      0.00         1
                                  34, Chowringhee Lane       0.00      0.00      0.00         4
                           

In [34]:
print('Logistic Regression')
report(log_reg)

Logistic Regression
                                                        precision    recall  f1-score   support

                                           #OFF Campus       0.00      0.00      0.00         1
                                   19 Flavours Biryani       0.00      0.00      0.00         1
                                         21 Gun Salute       0.00      0.00      0.00         1
                                         22nd Parallel       0.00      0.00      0.00         1
                                       23 On Hazelwood       0.00      0.00      0.00         1
                24/7 Pastry Shop - The Lalit New Delhi       0.00      0.00      0.00         1
                 24/7 Restaurant - The Lalit New Delhi       0.00      0.00      0.00         1
                                       3 Squares Diner       0.00      0.00      0.00         1
                                  34, Chowringhee Lane       0.00      0.00      0.00         4
                   

In [35]:
print('Linear SVC')
report(svc)

Linear SVC
                                                        precision    recall  f1-score   support

                                           #OFF Campus       0.00      0.00      0.00         1
                                   19 Flavours Biryani       0.00      0.00      0.00         1
                                         21 Gun Salute       0.00      0.00      0.00         1
                                         22nd Parallel       0.00      0.00      0.00         1
                                       23 On Hazelwood       0.00      0.00      0.00         1
                24/7 Pastry Shop - The Lalit New Delhi       0.00      0.00      0.00         1
                 24/7 Restaurant - The Lalit New Delhi       0.00      0.00      0.00         1
                                       3 Squares Diner       0.00      0.00      0.00         1
                                  34, Chowringhee Lane       0.00      0.00      0.00         4
                            

In [36]:
# Finalize a pipeline for deployment on new cuisines
from sklearn.pipeline import Pipeline

In [37]:
pipe=Pipeline([('tfidf',TfidfVectorizer()),('svc',LinearSVC())])

In [38]:
# Convert lists in the 'Cuisines' column to space-separated strings
Restaurant_data['Cuisines'] = Restaurant_data['Cuisines'].apply(lambda x: ' '.join(x) if isinstance(x, list) else x)

# Fit the pipeline defined in the previous cell on the full dataset
pipe.fit(Restaurant_data['Cuisines'], Restaurant_data['Restaurant Name'])
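
Once fitted, a `Pipeline` accepts raw text directly: calling `predict` on it runs the TF-IDF transform and the classifier in one step. A small sketch on made-up cuisines and restaurant names (all values and the name `toy_pipe` are hypothetical) shows that this matches the two-step form that accesses the steps by name:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data: cuisine text -> restaurant name.
cuisines = ["Japanese Sushi", "Italian Pizza", "Andhra Biryani"]
names = ["Sushi Place", "Pizza Corner", "Spice House"]

toy_pipe = Pipeline([("tfidf", TfidfVectorizer()), ("svc", LinearSVC())])
toy_pipe.fit(cuisines, names)

# One call vectorizes the raw string and predicts a single label.
one_step = toy_pipe.predict(["sushi"])

# Equivalent two-step form, accessing the pipeline steps by name.
two_step = toy_pipe["svc"].predict(toy_pipe["tfidf"].transform(["sushi"]))

print(one_step, two_step)
```

Either form returns exactly one label per input string, which is worth keeping in mind when the goal is a list of several recommendations.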

Here we prepare a recommendation system that recommends restaurants along with their city, locality, and price range.

In [39]:
def recommend():
  """
  Prompts for preferred cuisines and returns matching restaurants.

  The pipeline predicts a single restaurant name for the input text, so the
  result holds the rows of Restaurant_data with that name (at most 10).

  Returns:
    A DataFrame of recommended restaurants, or None if an error occurs.
  """
  try:
    user_cuisines =  input("Enter your preferred cuisines : ")
    # Handle user input variations (lowercase, uppercase, etc.)
    user_cuisines = user_cuisines.lower()

    # Transform the user preferences using the TfidfVectorizer
    user_cuisines_tfidf = pipe['tfidf'].transform([user_cuisines])

    # Predict using the LinearSVC model
    recommendations = pipe['svc'].predict(user_cuisines_tfidf)

    # Find matching restaurants
    top_10_recommendations = Restaurant_data[Restaurant_data['Restaurant Name'].isin(recommendations)].head(10)

    return top_10_recommendations

  except Exception as e:
    print("An error occurred:", e)
    return None

In [40]:
recommend()

Enter your preferred cuisines : fruits


| | Restaurant Name | City | Cuisines | Locality | Price range |
|---|---|---|---|---|---|
| 353 | Raglan Road Irish Pub and Restaurant | Orlando | Irish | Disney: Downtown Disney | 4 |


In [41]:
recommend()

Enter your preferred cuisines : burgers


| | Restaurant Name | City | Cuisines | Locality | Price range |
|---|---|---|---|---|---|
| 353 | Raglan Road Irish Pub and Restaurant | Orlando | Irish | Disney: Downtown Disney | 4 |


In [43]:
recommend()

Enter your preferred cuisines : andhra


| | Restaurant Name | City | Cuisines | Locality | Price range |
|---|---|---|---|---|---|
| 2500 | Hotel RRR Mysore | Mysore | Andhra | Chamrajpura | 1 |


In [44]:
recommend()

Enter your preferred cuisines : kiwi


| | Restaurant Name | City | Cuisines | Locality | Price range |
|---|---|---|---|---|---|
| 9317 | Maranui Cafe | Wellington City | Cafe Kiwi | Lyall Bay | 3 |
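
Because `predict` returns one label per input, each call above yields rows for a single restaurant name. One alternative, swapped in here for illustration and not the notebook's method, ranks every restaurant by the cosine similarity between its cuisine text and the query. The toy data and the `recommend_top_n` helper are hypothetical:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data with the same kind of columns as Restaurant_data.
toy = pd.DataFrame({
    "Restaurant Name": ["Sushi Place", "Tokyo Grill", "Pizza Corner"],
    "City": ["Makati City", "Makati City", "Noida"],
    "Cuisines": ["Japanese Sushi", "Japanese Korean", "Italian Pizza"],
    "Price range": [3, 4, 2],
})

vectorizer = TfidfVectorizer()
cuisine_matrix = vectorizer.fit_transform(toy["Cuisines"])

def recommend_top_n(query, n=10):
    """Return up to n rows whose cuisines are most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), cuisine_matrix)[0]
    ranked = toy.assign(score=scores).sort_values("score", ascending=False)
    # Keep only rows that share at least one term with the query.
    return ranked[ranked["score"] > 0].head(n)

print(recommend_top_n("japanese"))
```

A query with no term in the vocabulary produces an empty result in this sketch, which makes unknown inputs visible instead of mapping them to an arbitrary restaurant.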


# **Conclusion:**

  * Our restaurant recommendation system suggests restaurants from user-entered cuisine preferences. In the runs above, "andhra" and "kiwi" return restaurants serving those cuisines, while "fruits" and "burgers" both return the same Irish pub.
  
  * The system applies machine learning to cuisine text and reports each recommended restaurant's city, locality, and price range.
  
  * The deployed pipeline combines TfidfVectorizer and LinearSVC to map cuisine text to restaurant names.
