#### A restaurant recommendation system is an application that recommends similar restaurants to a customer according to the customer’s taste. To build a restaurant recommender system using Python, I have collected data from Kaggle. 

In [32]:
import numpy as np
import pandas as pd
from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.feature_extraction.text import TfidfVectorizer

In [43]:
data = pd.read_csv("TripAdvisor_RestauarantRecommendation.csv")
print(data.head())

                            Name       Street Address  \
0  Betty Lou's Seafood and Grill     318 Columbus Ave   
1              Coach House Diner        55 State Rt 4   
2               Table Talk Diner  2521 South Rd Ste C   
3                    Sixty Vines     3701 Dallas Pkwy   
4                   The Clam Bar    3914 Brewerton Rd   

                       Location                                          Type  \
0  San Francisco, CA 94133-3908   Seafood, Vegetarian Friendly, Vegan Options   
1     Hackensack, NJ 07601-6337          Diner, American, Vegetarian Friendly   
2   Poughkeepsie, NY 12601-5476          American, Diner, Vegetarian Friendly   
3          Plano, TX 75093-7777       American, Wine Bar, Vegetarian Friendly   
4            Syracuse, NY 13212                        American, Bar, Seafood   

            Reviews No of Reviews  \
0  4.5 of 5 bubbles   243 reviews   
1    4 of 5 bubbles    84 reviews   
2    4 of 5 bubbles   256 reviews   
3  4.5 of 5 bubbles   

In [44]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3062 entries, 0 to 3061
Data columns (total 11 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   Name              3062 non-null   object
 1   Street Address    3062 non-null   object
 2   Location          3062 non-null   object
 3   Type              3049 non-null   object
 4   Reviews           3062 non-null   object
 5   No of Reviews     3062 non-null   object
 6   Comments          2447 non-null   object
 7   Contact Number    3062 non-null   object
 8   Trip_advisor Url  3062 non-null   object
 9   Menu              3062 non-null   object
 10  Price_Range       3062 non-null   object
dtypes: object(11)
memory usage: 263.3+ KB


### I will select  only two columns from the dataset for the rest of the task (Name, Type):



In [45]:
data = data[["Name", "Type"]]
print(data.head())

                            Name                                          Type
0  Betty Lou's Seafood and Grill   Seafood, Vegetarian Friendly, Vegan Options
1              Coach House Diner          Diner, American, Vegetarian Friendly
2               Table Talk Diner          American, Diner, Vegetarian Friendly
3                    Sixty Vines       American, Wine Bar, Vegetarian Friendly
4                   The Clam Bar                        American, Bar, Seafood


In [35]:
print(data.isnull().sum())

Name     0
Type    13
dtype: int64


In [36]:
data = data.dropna()

In [37]:
tfidf_vectorizer = TfidfVectorizer(stop_words='english')

### The type of restaurant is a valuable feature in the data to build a recommendation system. The type column here represents the category of restaurants. For example, if a customer likes vegetarian-friendly restaurants, he will only look at the recommendations if they are vegetarian friendly too. So I will use the Type column as the feature to recommend similar restaurants to the customer:

In [38]:
tfidf_matrix = tfidf_vectorizer.fit_transform(data['Type'])



In [39]:
similarity = cosine_similarity(tfidf_matrix)

#### The cosine_similarity function is a measure that calculates the cosine of the angle between two non-zero vectors in an inner product space. It's commonly used to measure how similar two sets of data are. The cosine similarity measure returns a value between -1 and 1, 
### where: 1 means the data sets are identical 0 means the data sets are orthogonal (or not similar) -1 means the data sets are diametrically opposite

### Now I will set the name of the restaurant as an index so that we can find similar restaurants by giving the name of the restaurant as an input:


In [40]:
indices = pd.Series(data.index, index=data['Name']).drop_duplicates()

In [41]:
def restaurant_recommendation(name, similarity = similarity):
    index = indices[name]
    similarity_scores = list(enumerate(similarity[index]))
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    similarity_scores = similarity_scores[0:10]
    restaurantindices = [i[0] for i in similarity_scores]
    return data['Name'].iloc[restaurantindices]

print(restaurant_recommendation("Market Grill"))

23                   The Lion's Share
154                        Houlihan's
518            Midgley's Public House
568                 Aspen Creek Grill
770              Pete's Sunset Grille
1190     Paul Martin's American Grill
1581                   Aviation Grill
1872                   Aviation Grill
2193                Crest Bar & Grill
2612    Tahoe Joe's Famous Steakhouse
Name: Name, dtype: object
