## <span style="color:#730101">Classification of Review Stars</span>

In this part, we build the recommendation system.

#### Import Modules

In [1]:
import pandas as pd
import numpy as np
import os
import pickle
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model

In [2]:
pd.options.mode.chained_assignment = None

In [3]:
self_trained_model_cnn = keras.models.load_model(os.path.join(os.getcwd(), 'Self_Trained_Model_CNN'))

In [4]:
# Select the best model and bind it to `model` for the rest of the notebook
model = self_trained_model_cnn

### Extract embeddings

In [5]:
model.layers

[<tensorflow.python.keras.engine.input_layer.InputLayer at 0x24fc3a56b88>,
 <tensorflow.python.keras.layers.embeddings.Embedding at 0x24fc3a911c8>,
 <tensorflow.python.keras.layers.convolutional.Conv1D at 0x24fc3aa1788>,
 <tensorflow.python.keras.layers.pooling.GlobalMaxPooling1D at 0x24fc3ab4908>,
 <tensorflow.python.keras.layers.core.Dropout at 0x24fc3abb208>,
 <tensorflow.python.keras.layers.core.Dense at 0x24fc3abbe48>]

You can think of the embedding layer as a lookup table in which the nth row holds the word vector of the nth word. Unlike a static lookup table, however, the embedding layer is trainable.
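The lookup-table view can be illustrated with a small NumPy sketch. The matrix values and token ids below are made up for illustration and are not taken from the trained model; the point is only that an embedding lookup is row indexing, which is equivalent to multiplying a one-hot encoding by the matrix.

```python
import numpy as np

# Toy embedding matrix: 5 vocabulary entries, 3 dimensions each (made-up values).
embedding_matrix = np.arange(15, dtype=np.float32).reshape(5, 3)

# A tokenized review is a sequence of integer word ids (hypothetical ids).
token_ids = np.array([2, 0, 4])

# The embedding "lookup" is plain row indexing: one row per token id.
vectors = embedding_matrix[token_ids]

# Equivalent formulation: one-hot encode the ids and multiply by the matrix.
one_hot = np.eye(embedding_matrix.shape[0], dtype=np.float32)[token_ids]
vectors_via_matmul = one_hot @ embedding_matrix

print(vectors.shape)
print(np.array_equal(vectors, vectors_via_matmul))
```

During training, the rows of this matrix are updated by backpropagation like any other weights, which is what distinguishes the layer from a fixed table.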

In [6]:
# Import Dataset
with open('Nevada.pkl', 'rb') as nevada:
    Nevada = pickle.load(nevada)

In [7]:
Nevada = Nevada.drop(columns=['attributes', 'categories', 'LemaText']).reset_index(drop=True)
Nevada.head(2)

Unnamed: 0,review_id,user_id,business_id,review_stars,text,date,business_name,address,city,state,postal_code,latitude,longitude,rating,user_name,average_stars,cuisines,input_text
0,izOSwMP2js_ptjDQZsynig,KriIEvoyWwhoswBoqqUpzA,faPVqws-x-5k2CQKDNtHxw,5,We just recently returned from Las Vegas and h...,2019-12-13 15:34:17,Yardbird Southern Table & Bar,3355 Las Vegas Blvd S,LAS VEGAS,NV,89109,36.122328,-115.170112,4.5,Christine,5.0,american,return las pleasure stop restaurant lunch seat...
1,qUxCvPEkl7xrmY-n1szciA,D3XxyNOy8b_1484Oi1eYOg,VrGI7_nRjXpn0415S3coGQ,5,I've been here three times now one of my cowor...,2019-12-13 15:32:07,Vegas Noodle House,3516 Wynn Rd,LAS VEGAS,NV,89103,36.125887,-115.194425,4.0,Aaron,3.88,thai,time place staff friendly restaurant clean sou...


In [8]:
Nevada.shape

(1013794, 18)

### Recommendation System

#### Step 1: Get the user's ID and keep their reviews

In [12]:
# The user has to enter their user_id
input_userid = input("Give me your id: ")
# A user_id with many reviews, useful for testing: iW6YSCu3YVI-SNPNi0I-xg
# A user_id with only one review: f38DGhAQOxQu07P-4JMpTw

Give me your id: iW6YSCu3YVI-SNPNi0I-xg


In [13]:
# Keep the data for the user of interest
user_data = Nevada[Nevada.user_id==input_userid]
user_data

Unnamed: 0,review_id,user_id,business_id,review_stars,text,date,business_name,address,city,state,postal_code,latitude,longitude,rating,user_name,average_stars,cuisines,input_text
4,bwswkgcVgsv_IPzqMdA4ow,iW6YSCu3YVI-SNPNi0I-xg,xfWdUmrz2ha3rcigyITV0g,5,Second time here and boy they haven't changed....,2019-12-13 14:45:29,Gordon Ramsay Burger,3667 S Las Vegas Blvd,LAS VEGAS,NV,89109,36.110611,-115.172268,4.0,Glenn,3.37,american,time boy best onion not bad american
5,YS4AaE3zFjgZEQ0bBlP-_A,iW6YSCu3YVI-SNPNi0I-xg,DESv2ys6SjBKA4SyDtJvxw,5,We have visited Ferraro's many times and it st...,2019-12-13 14:43:39,Ferraro's Italian Restaurant & Wine Bar,4480 Paradise Rd,LAS VEGAS,NV,89169,36.109198,-115.151955,4.5,Glenn,3.37,italian,time food outstanding perfect server amaze pao...
1783,NduseQt_-TriCjiOgHPsXA,iW6YSCu3YVI-SNPNi0I-xg,QwxtJF1Dw9hEGdGQciD9Ng,3,When we first arrived the Host & Hostess were ...,2019-12-09 15:29:28,CATCH,3730 S Las Vegas Blvd,LAS VEGAS,NV,89158,36.107349,-115.176584,3.5,Glenn,3.37,asian,host hostess not friendly tell bar no place si...
194405,Gi0stcG3Jc2T21QpNu7Ymg,iW6YSCu3YVI-SNPNi0I-xg,N7yuiiu8jhQ-Fl9Npflreg,5,First time here and it was great!. The ambianc...,2018-12-06 12:44:09,Lemongrass,3730 Las Vegas Blvd S,LAS VEGAS,NV,89109,36.106864,-115.177399,3.5,Glenn,3.37,thai,time great nice food outstanding best nice dow...
194661,hmAg1eDA0u8Rd0CennPkAQ,iW6YSCu3YVI-SNPNi0I-xg,G-5kEa6E6PD5fkBRuA7k9Q,5,The ambiance was great. The food and \nRocco ...,2018-12-05 21:33:07,Giada,3595 Las Vegas Blvd South,LAS VEGAS,NV,89109,36.115059,-115.172109,3.5,Glenn,3.37,italian,great food server outstanding highly recommend...
195603,uJgVjMtp90qSB0DhEx90xQ,iW6YSCu3YVI-SNPNi0I-xg,VsewHMsfj1Mgsl2i_hio7w,2,Visited Lavo last night with some friends and ...,2018-12-03 15:08:59,LAVO Italian Restaurant & Lounge,3255 S Las Vegas Blvd,LAS VEGAS,NV,89109,36.124478,-115.169283,3.5,Glenn,3.37,greek mediterranean italian,night service best thing food disappoint best ...
372178,-J_vYUIyogj0lp-T8RQvaQ,iW6YSCu3YVI-SNPNi0I-xg,bjSC_jbrypke0l-bXXBmwQ,5,Everything about Vic & Anthony's is great. The...,2017-12-22 13:52:12,Vic & Anthony's Steakhouse,129 E Fremont St,LAS VEGAS,NV,89101,36.170107,-115.144791,4.5,Glenn,3.37,american,great service food fantastic best spinach loca...
663597,AmADGmQgLgoQ53J7Lh93wg,iW6YSCu3YVI-SNPNi0I-xg,zU9w_xRlQSRIYXxGo-HSOA,4,We had gone to the one in New York and could n...,2015-12-23 13:47:51,Carnegie,3400 Las Vegas Blvd S,LAS VEGAS,NV,89109,36.120584,-115.173643,3.0,Glenn,3.37,american,york not pass hot pastrami rye best american
663598,1J5IFUn2Dp46jQYCfKBc1w,iW6YSCu3YVI-SNPNi0I-xg,vZLAmpN-hTI3GCT9hlHVUA,4,"The ambiance, the food and the assistant waite...",2015-12-23 13:38:48,Carbone,3730 Las Vegas Blvd S,LAS VEGAS,NV,89109,36.107199,-115.176987,4.0,Glenn,3.37,italian,food assistant waiter great main waiter red co...
667608,sKXgzv9y9xMeMjwKthGvCA,iW6YSCu3YVI-SNPNi0I-xg,Xj7DIGRHEchJ-VVdISazQQ,4,Just stopped in for drinks. Great ambiance dri...,2015-12-08 16:56:03,Javier's,3730 Las Vegas Blvd S,LAS VEGAS,NV,89109,36.107325,-115.176579,4.0,Glenn,3.37,mexican,stop great service well fantastic dinner time ...


In [14]:
user_data.shape

(11, 18)

In [15]:
# Get User's Name
name = user_data.user_name.unique()
print(f'Hello {name[0]}.\n')

Hello Glenn.



In [16]:
# Sort by review_stars and date (both descending) and take the first review,
# i.e. the most recent of the user's highest-rated reviews.
user_data = user_data.sort_values(['review_stars', 'date'], ascending=[False, False]).reset_index(drop=True)
user_review = user_data.input_text[0]
user_review

'time boy best onion not bad american'

#### Step 2: Run the Recommendation Function

In [17]:
# Extract embeddings
text_weights = model.layers[1].get_weights()[0]
print("The shape of embedded weights: ", text_weights.shape)
print("The length of embedded weights: ", len(text_weights))

The shape of embedded weights:  (33434, 300)
The length of embedded weights:  33434


In [18]:
# We need to normalize the embeddings so that the dot product between two embeddings becomes the cosine similarity.
text_weights = text_weights / np.linalg.norm(text_weights, axis = 1).reshape((-1, 1))
round(np.sum(np.square(text_weights[0])))

1.0
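The claim that the dot product of unit-length vectors equals the cosine similarity can be checked on toy data. The sketch below uses random vectors rather than the model's weights; `cosine` is a helper defined here only for the comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))  # toy stand-in for an embedding matrix

# Normalize each row to unit length; keepdims=True keeps shape (4, 1) for broadcasting.
unit = weights / np.linalg.norm(weights, axis=1, keepdims=True)

def cosine(a, b):
    """Cosine similarity computed directly from its definition."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

dot_after_norm = float(np.dot(unit[0], unit[1]))
cosine_before_norm = cosine(weights[0], weights[1])

print(np.isclose(dot_after_norm, cosine_before_norm))
```

Note that a row consisting entirely of zeros has norm 0, so dividing by the norm would produce NaN values for that row.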

In [19]:
texts = Nevada['input_text']

review_index = {link: idx for idx, link in enumerate(texts)}
index_review = {idx: link for link, idx in review_index.items()}

review_index[user_review]

4
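Note that `text_weights` has 33,434 rows, one per vocabulary entry of the embedding layer, whereas `review_index` assigns positions over the 1,013,794 reviews in `Nevada`. Row `k` of `text_weights` is therefore the vector of token `k`, not of review `k`. If a vector per review is wanted, one common option is to average the word vectors of the review's tokens. The sketch below illustrates that idea on toy data; `word_index`, `embedding_matrix` and `review_vector` are hypothetical names that do not exist in this notebook, and this is not the method used in the cells here.

```python
import numpy as np

# Hypothetical toy vocabulary and embedding matrix (made-up values).
# Id 0 is assumed to be reserved for padding/unknown words.
word_index = {"great": 1, "food": 2, "slow": 3, "service": 4}
embedding_matrix = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.8, 0.6],
    [-1.0, 0.0],
    [0.0, 1.0],
])

def review_vector(text, word_index, embedding_matrix):
    """Average the vectors of the known words in `text`, then scale to unit length."""
    ids = [word_index[w] for w in text.split() if w in word_index]
    if not ids:
        return np.zeros(embedding_matrix.shape[1])
    mean = embedding_matrix[ids].mean(axis=0)
    norm = np.linalg.norm(mean)
    return mean / norm if norm > 0 else mean

reviews = ["great food", "slow service", "great service"]
review_matrix = np.vstack([review_vector(r, word_index, embedding_matrix) for r in reviews])

# Cosine similarity of every review to the first one.
similarities = review_matrix @ review_matrix[0]
print(similarities)
```

With one row per review, the positions in a review index and the rows of the matrix line up, so a dot product against `review_matrix` compares reviews with reviews.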

* The function below is based on this [article](https://www.kaggle.com/willkoehrsen/neural-network-embedding-recommendation-system)

In [20]:
# The function below takes a user review and a set of embeddings, and returns the n items most similar to the
# review. It does this by computing the dot product between the review's embedding and all embeddings. Because we
# normalized the embeddings, the dot product is the cosine similarity between two vectors. Once we have the dot
# products, we can sort the results to find the closest entities in the embedding space. With cosine similarity,
# higher numbers indicate entities that are closer together, with -1 the furthest apart and +1 the closest together.

def find_similar(name, weights, index_name = 'review', n = 20, return_dist = False):
    """Find the n items most similar to `name` based on embeddings.

    Returns a DataFrame of reviews and similarities, or (dists, closest) if return_dist is True.
    """
    
    # Select index and reverse index
    if index_name == 'review':
        index = review_index
        rindex = index_review
    else:
        raise ValueError(f'Unknown index_name: {index_name}')
    
    # Check to make sure `name` is in index
    try:
        # Calculate dot product between review and all others
        dists = np.dot(weights, weights[index[name]])
    except KeyError:
        print(f'{name} Not Found.')
        return
    
    # Sort distance indexes from smallest to largest
    sorted_dists = np.argsort(dists)
    
    # Take the last n sorted distances
    closest = sorted_dists[-n:]

    # Need distances later on
    if return_dist:
        return dists, closest

    print(f'{index_name.capitalize()}s closest to {name}.\n')
    
    review_text=[]
    similarity=[]
    # Collect the most similar reviews and their similarities, most similar first
    for c in reversed(closest):
        review_text.append(rindex[c])
        similarity.append(round(dists[c],3))
    data_similar = pd.DataFrame(np.column_stack([review_text, similarity]),
                                columns=['Review', 'Similarity'])
    
    return data_similar
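The retrieval step (dot products followed by a sort) can be tried in isolation on toy vectors. The sketch below is a variation, not the function above: `top_n_similar` is a hypothetical helper that uses `np.argpartition` to select the n best scores before sorting only those, which avoids sorting the full score array.

```python
import numpy as np

def top_n_similar(unit_vectors, query_row, n=3):
    """Return (indices, scores) of the n rows most similar to row `query_row`, best first."""
    scores = unit_vectors @ unit_vectors[query_row]
    n = min(n, len(scores))
    # argpartition places the n largest scores in the last n positions, in no particular order
    candidates = np.argpartition(scores, -n)[-n:]
    # sort only those n candidates, highest score first
    order = candidates[np.argsort(scores[candidates])[::-1]]
    return order, scores[order]

# Toy vectors, normalized to unit length so that dot product = cosine similarity.
vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

indices, scores = top_n_similar(unit, query_row=0, n=2)
print(indices, scores)
```

As in `find_similar`, the query itself comes back first with similarity 1, so callers that want only other items should skip the first result.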

In [21]:
# Print the top 20 most similar Reviews
pd.set_option('max_colwidth', 300)
data_similar = find_similar(user_review, text_weights)
data_similar

Reviews closest to time boy best onion not bad american.



Unnamed: 0,Review,Similarity
0,time boy best onion not bad american,1.0
1,food absolutely perfect hit mark meat delicious side perfect american,0.661
2,eaten time great picky food notch service friendly quick finally decent food thai,0.6
3,great food affordable service good drag portion restaurant slow service occasion portion restaurant experience good food portion opposite mi japanese,0.584
4,problem place close live fish awesome chip good order thing time eat comment menu service spot enjoy game pretty good good favorite dislike repeat time mexican american,0.574
5,delicious favorite night love happy hour delicious highly recommend roll roll food atmosphere pleasant amaze come mode regret japanese,0.574
6,amaze piece advice pay extra honestly regret great food great price atmosphere comfort wait husband dinner night great friend meat delicious recommend place wait round japanese,0.567
7,large sign post restaurant bloody mary seat waitress late finally mary hour no apology no duty mexican,0.563
8,late lunch not busy surprise lot selection soft shell crab favorite price excellent japanese,0.56
9,best la beat meat town love place steak huge soft asian korean,0.545


#### Step 3: Recommend Restaurant

In [22]:
restaurant_name=[]
restaurant_cuisine=[]
restaurant_stars=[]
user=[]

for i in range(len(data_similar)):
    if i==0:
        # Row 0 is the query review itself (similarity 1.0), so skip it
        continue
    # Look up the reviews in Nevada whose text matches the similar review and use the first match
    match = Nevada[Nevada.input_text==data_similar.Review[i]].reset_index(drop=True)
    # Restaurant's name
    restaurant_name.append(match.business_name[0])
    # Restaurant's cuisine
    restaurant_cuisine.append(match.cuisines[0])
    # Note: `average_stars` is the reviewer's average rating across their reviews;
    # the restaurant's own rating is stored in the `rating` column.
    restaurant_stars.append(round(match.average_stars[0],3))
    # user_id of the author of the matched review
    user.append(match.user_id[0])

# Create the dataframe that holds the recommendations
recommendations = pd.DataFrame(np.column_stack([restaurant_name, restaurant_cuisine, restaurant_stars,user]),
                               columns=['Name', 'Cuisine', 'Stars', 'User'])

In [23]:
recommendations = recommendations[recommendations.User!=input_userid]
recommendations

Unnamed: 0,Name,Cuisine,Stars,User
0,Big B's Texas BBQ,american,4.04,QFwSaMCndJydXVxzErpEzw
1,Thai Spoon Las Vegas,thai,3.67,GmzwysGSE-gh0-RgG1_QNA
2,Su Casa,japanese,4.1,kLc-KHMaHBhPw6MVOibiOw
3,Nacho Daddy,mexican american,3.97,_5m99dbfiPa0qnf8tKWqRQ
4,Kobe Sushi Bar,japanese,3.67,O12IUTiitzYcvQfDYzhnww
5,Cafe Sanuki,japanese,5.0,3bgZCU_tHkn005u019frLg
6,Carlos'n Charlie's,mexican,3.2,EhVbz35G6vjdDwcoOgkxRg
7,Hayashi Japanese Cuisine,japanese,4.0,lFk5FBqxNuvDv6b90EJ1gw
8,Captain6 Korean BBQ,asian korean,5.0,RdyyZFs1bBdFG55Rtvuz5Q
9,Rice & Company,asian chinese,5.0,6iHdhJs6EAdkTW67CN2iHw
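The row-by-row lookup in Step 3 filters the full dataset once per similar review. As a sketch of an alternative, the same metadata can be attached in one pass with a pandas merge. The toy frames below stand in for `Nevada` and `data_similar`, and their values are made up; only the column names are borrowed from the notebook.

```python
import pandas as pd

# Toy stand-in for the review dataset (made-up rows).
reviews = pd.DataFrame({
    "input_text": ["good burger american", "fresh roll japanese", "good burger american"],
    "business_name": ["Burger Place", "Sushi Place", "Other Burger Place"],
    "cuisines": ["american", "japanese", "american"],
    "user_id": ["u1", "u2", "u3"],
})

# Toy stand-in for the similar-review table.
similar = pd.DataFrame({
    "Review": ["fresh roll japanese", "good burger american"],
    "Similarity": [0.9, 0.8],
})

# Keep the first row per review text, mirroring the `[0]` lookups in the loop.
first_match = reviews.drop_duplicates(subset="input_text", keep="first")

# A left merge keeps the order of `similar`, i.e. most similar first.
recommendations = (
    similar.merge(first_match, left_on="Review", right_on="input_text", how="left")
           [["business_name", "cuisines", "user_id", "Similarity"]]
)
print(recommendations)
```

A merge also keeps numeric columns numeric, whereas stacking mixed lists with `np.column_stack` converts every value to a string.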


In [24]:
input_cuisine = input("What cuisine would you like to try today? ").lower() #put as input: american or mexican

What cuisine would you like to try today? american


In [25]:
# Get the name of the chosen restaurant
Name = recommendations[recommendations.Cuisine.str.contains(input_cuisine)].reset_index(drop=True).Name[0]
# Get the location of the chosen restaurant
Location = Nevada[Nevada.business_name==Name].reset_index(drop=True).address[0]
print(f'You should try {Name} at {Location}.\n')
print(f'Have a nice meal!\n')

You should try Big B's Texas BBQ at 3019 St Rose Pkwy, Ste 130.

Have a nice meal!
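If none of the recommendations matches the requested cuisine, indexing the filtered result with `[0]` raises a `KeyError`. A guarded lookup could look like the sketch below; `pick_restaurant` is a hypothetical helper, and the two-row table only reuses names from the output above for illustration.

```python
import pandas as pd

def pick_restaurant(recommendations, cuisine):
    """Return the first recommended restaurant name for `cuisine`, or None if nothing matches."""
    # regex=False treats the input as plain text, so characters like '(' or '+' are safe
    mask = recommendations["Cuisine"].str.contains(cuisine.strip().lower(), regex=False, na=False)
    matches = recommendations.loc[mask, "Name"]
    return matches.iloc[0] if not matches.empty else None

toy = pd.DataFrame({
    "Name": ["Big B's Texas BBQ", "Thai Spoon Las Vegas"],
    "Cuisine": ["american", "thai"],
})

print(pick_restaurant(toy, "American "))
print(pick_restaurant(toy, "french"))
```

Returning `None` lets the caller decide how to respond, for example by asking for another cuisine.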



## Restaurant Recommendation

In [26]:
# Extract embeddings
text_weights = model.layers[1].get_weights()[0]
print("The shape of embedded weights: ", text_weights.shape)
print("The length of embedded weights: ", len(text_weights))

# We need to normalize the embeddings so that the dot product between two embeddings becomes the cosine similarity.
text_weights = text_weights / np.linalg.norm(text_weights, axis = 1).reshape((-1, 1))
round(np.sum(np.square(text_weights[0])),1)

The shape of embedded weights:  (33434, 300)
The length of embedded weights:  33434


1.0

In [27]:
texts = Nevada['input_text']

review_index = {link: idx for idx, link in enumerate(texts)}
index_review = {idx: link for link, idx in review_index.items()}

In [28]:
def RestaurantRecommendation(weights, review_index,index_review,index_name = 'review',n=20 ):
    
    input_userid = input("Give me your id: ")
    
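    user_data = Nevada[Nevada.user_id==input_userid]
    
    # Stop early if the id is unknown; otherwise name[0] below raises an IndexError
    if user_data.empty:
        print(f'{input_userid} Not Found.')
        return
    
    # Get User's Name
    name = user_data.user_name.unique()
    print(f'\nHello {name[0]}.\n')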
    
    # Sort the dataset by review_star and date of review and get the first review.
    user_data = user_data.sort_values(['review_stars', 'date'], ascending=[False, False]).reset_index(drop=True)
    user_review = user_data.input_text[0]

    # Select index and reverse index
    if index_name == 'review':
        index = review_index
        rindex = index_review
    
    # Check to make sure `user_review` is in index
    try:
        # Calculate dot product between review and all others
        dists = np.dot(weights, weights[index[user_review]])
    except KeyError:
        print(f'{user_review} Not Found.')
        return
    
    # Sort distance indexes from smallest to largest
    sorted_dists = np.argsort(dists)
    
    # Take the last n sorted distances
    closest = sorted_dists[-n:]


    review_text=[]
    similarity=[]
    # Collect the most similar reviews and their similarities, most similar first
    for c in reversed(closest):
        review_text.append(rindex[c])
        similarity.append(dists[c])
    data_similar = pd.DataFrame(np.column_stack([review_text, similarity]),
                                columns=['Review', 'Similarity'])
    
    restaurant_name=[]
    restaurant_cuisine=[]
    restaurant_stars=[]
    user=[]
    for i in range(len(data_similar)):
        if i==0:
            # Row 0 is the query review itself, so skip it
            continue
        # Filter Nevada once per similar review and use the first match
        match = Nevada[Nevada.input_text==data_similar.Review[i]].reset_index(drop=True)
        restaurant_name.append(match.business_name[0])
        restaurant_cuisine.append(match.cuisines[0])
        restaurant_stars.append(round(match.average_stars[0],3))
        user.append(match.user_id[0])

    recommendations = pd.DataFrame(np.column_stack([restaurant_name, restaurant_cuisine, restaurant_stars,user]),
                               columns=['Name', 'Cuisine', 'Stars', 'User'])
    recommendations = recommendations[recommendations.User!=input_userid]
    input_cuisine = input("What cuisine would you like to try today? ").lower()
    
    Name = recommendations[recommendations.Cuisine.str.contains(input_cuisine)].reset_index(drop=True).Name[0]
    Location = Nevada[Nevada.business_name==Name].reset_index(drop=True).address[0]
    print(f'You should try {Name} at {Location}.\n')
    print(f'Have a nice meal!\n')

In [29]:
RestaurantRecommendation(text_weights, review_index, index_review)

Give me your id: iW6YSCu3YVI-SNPNi0I-xg

Hello Glenn.

What cuisine would you like to try today? american
You should try Big B's Texas BBQ at 3019 St Rose Pkwy, Ste 130.

Have a nice meal!

