**Recommender system:**<br>
* Module 1 - non-personalized keyword-filtering recommender:<br>
build keyword search-based restaurant recommender module to filter by keyword. Keywords could include, for instance, location-based information (zip code, longitude, latitude)  and restaurant feature-based information (cuisine, style). 
The restaurant inventory will be filtered by keywords first, then ranked by its average rating or weighted smart rating taking into consideration the popularity (depending on user’s choice). The top-k restaurants from the list will be returned as the top-k recommendations.<br>
* Module 2 - personalized content-based recommender:<br>
With the user ID and the restaurants’ metadata, build a content-based recommender module that recommends restaurants similar to the user’s preference, as inferred from the user’s past ratings. More specifically, each restaurant is represented by a feature vector extracted with CountVectorizer or TfidfVectorizer, similarity scores (e.g. cosine similarity) are computed from these vectors, and restaurants are recommended according to the ranking of the weighted similarity score. The important restaurant metadata to consider include categories, attributes and location.<br>
* Module 3 - personalized collaborative recommender:<br>
With the user_id x restaurant_id rating matrix, build a collaborative recommender module. Recall that the dataset has a total of 1,518,169 users, 188,593 businesses and 5,996,995 reviews. The user_id x business_id matrix is therefore very sparse: 5,996,995 reviews spread over 1,518,169 x 188,593 cells fill at most about 0.002% of the matrix. Matrix factorization algorithms will be used to complete this highly sparse matrix and generate recommendations.<br>
* Metrics chosen for evaluating and optimizing the ‘goodness’ of the algorithms:<br>
a) measure prediction accuracy: RMSE(root mean squared error) <br>
b) measure ranking effectiveness: NDCG(Normalized Discounted Cumulative Gain) at top-k<br>
* Integration - combine the above modules to build a hybrid recommendation engine:<br>
To combine the above modules, a few simple interactive questions will be added:<br>
a) “Want customized recommendations based on your user history by providing your user ID?”  If no, activate the simple recommender module to provide base-case recommendations using location information and/or optional keywords<br>
b) If yes, prompt to ask follow up question: “do you want to try something new based on people like you?” If yes, activate the collaborative filtering module to recommend new restaurants based on similar peers; otherwise, use content filter module to recommend similar restaurants. <br>
* Other improvements:<br>
Optimize restaurant ranking by weighting the average rating by the total number of ratings (popularity), weighting individual ratings by their recency, etc. A quick interactive question, “want smart rating instead?”, lets the user switch to the alternative ranking based on these weighted scores instead of the simple average rating.<br>
* Potential caveats - cold start problem:<br>
a) new restaurant → content-based recommendation module will be able to use the features (metadata) of the new restaurant and include it when generating recommendations.<br>
b) new user → will be treated as if no user ID were available (in both cases there is no user history), and the simple recommender module will be used to recommend restaurants based on location, keywords, popularity, etc. 
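
The two evaluation metrics named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the evaluation code used for this project: the function names are hypothetical, and the NDCG variant shown uses the raw rating as the gain (some implementations use `2**rating - 1` instead).

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k relevances (linear gain, an assumption)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(true_ratings, predicted_scores, k):
    """NDCG at k: DCG of the predicted ordering divided by DCG of the ideal ordering."""
    # order items by predicted score, best first
    order = sorted(range(len(predicted_scores)), key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_ratings[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(true_ratings, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# toy data (made up for illustration): one user's true ratings for four restaurants
true_ratings = [5, 3, 4, 1]
good_scores = [4.8, 3.1, 4.2, 1.5]   # same ordering as the true ratings
bad_scores = [1.0, 2.0, 3.0, 4.0]    # poor ordering

print(rmse(true_ratings, good_scores))
print(ndcg_at_k(true_ratings, good_scores, k=3))
print(ndcg_at_k(true_ratings, bad_scores, k=3))
```

RMSE rewards predictions that are numerically close to the true ratings, while NDCG only looks at the order of the top-k items, so a model can do well on one and poorly on the other.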

**Note:**<br> 
Only a subset of Yelp restaurants from a few selected states is available in this dataset. Among them, only Arizona, Nevada, Ohio, North Carolina and Pennsylvania have a rich catalog of over 5,000 restaurants. Only the top two states, Arizona and Nevada, have over 10,000 restaurants. 

# 1. All necessary imports and functions

In [7]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np

from geopy.geocoders import Nominatim
from geopy.exc import GeocoderTimedOut
import pickle
import os.path
from sklearn.metrics.pairwise import linear_kernel

In [2]:
# import all necessary dataset to power the recommender modules
business = pd.read_csv('business_clean.csv')  # contains business data including location data, attributes and categories
business['postal_code'] = business.postal_code.astype(str) # update the data type of the 'postal_code' column to string
review = pd.read_csv('review_clean.csv') # contains full review text data including the user_id that wrote the review and the business_id the review is written for
# extract a subset of reviews related to restaurants, since we are only interested in restaurant-type business
review_s = review[review.business_id.isin(business.business_id.unique())] 

In [3]:
def great_circle_mile(lat1, lon1, lat2, lon2):
    """
    Compute geodesic distances (great-circle distance) of two points on the globe given their coordinates. 
    The function returns the distance in miles. 
    Note: 1. Calculation uses the earth's mean radius of 6371.009 km, 
    2. The central subtended angle is calculated by the formula: 
    alpha = arccos[sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2)*cos(lon1-lon2)]
    """
    
    from math import sin, cos, acos, radians
    
    lat1, lon1, lat2, lon2 = radians(lat1), radians(lon1), radians(lat2), radians(lon2) # convert degrees to radians
    earth_radius = 6371.009  # use earth's mean radius in kilometers
    cos_alpha = sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2)*cos(lon1-lon2)
    alpha = acos(max(-1.0, min(1.0, cos_alpha))) # clamp to [-1, 1] to avoid a domain error from rounding; alpha is in radians
    dis_km = alpha * earth_radius
    dis_mile = dis_km * 0.621371   # convert kilometer to mile
    
    return dis_mile
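
Because the distance is later computed row by row with `DataFrame.apply`, a vectorized variant may be worth considering. The sketch below is an alternative, not the function used in this notebook: it swaps in the haversine form of the great-circle distance (better conditioned for nearby points than the arccos form) and operates on whole NumPy arrays or pandas columns at once. The name `haversine_miles` and the sample coordinates are illustrative assumptions.

```python
import numpy as np

# same mean earth radius and km-to-mile factor as great_circle_mile
EARTH_RADIUS_MILES = 6371.009 * 0.621371

def haversine_miles(lat, lon, lat0, lon0):
    """Great-circle distance in miles from each (lat, lon) to the point (lat0, lon0).

    lat and lon may be scalars, NumPy arrays or pandas Series, in degrees.
    """
    lat, lon = np.radians(lat), np.radians(lon)
    lat0, lon0 = np.radians(lat0), np.radians(lon0)
    a = np.sin((lat - lat0) / 2) ** 2 + np.cos(lat) * np.cos(lat0) * np.sin((lon - lon0) / 2) ** 2
    # clip guards against rounding pushing a slightly outside [0, 1]
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

# approximate coordinates of downtown Phoenix and Las Vegas (illustrative values)
lats = np.array([33.4484, 36.1699])
lons = np.array([-112.0740, -115.1398])
dist = haversine_miles(lats, lons, 33.4484, -112.0740)
print(dist)  # first entry is the distance from Phoenix to itself
```

With a dataframe, the call would look like `haversine_miles(df.latitude, df.longitude, lat0, lon0)`, which avoids the Python-level loop of `apply(..., axis=1)`.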

In [4]:
# adding 'adjusted_score' to the 'business' dataset, which adjusts the restaurant average star rating by the number of ratings it has

globe_mean = ((business.stars * business.review_count).sum())/(business.review_count.sum())
k = 22 # set strength k to 22, which is the 50% quantile of the review counts for all businesses
business['adjusted_score'] = (business.review_count * business.stars + k * globe_mean)/(business.review_count + k)
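
To make the effect of the adjustment concrete, the formula can be applied to a few hand-picked cases. This is a small sketch: the function `adjusted_score` is a hypothetical standalone version of the expression above, and the global mean of 3.73 is an assumed value for illustration only.

```python
def adjusted_score(stars, review_count, global_mean, k=22):
    """Shrink a restaurant's average rating toward the global mean.

    k acts like k 'virtual' reviews at the global mean: the fewer real
    reviews a restaurant has, the closer its score stays to global_mean.
    """
    return (review_count * stars + k * global_mean) / (review_count + k)

GLOBAL_MEAN = 3.73  # assumed value for illustration

for count in (0, 5, 22, 200, 1746):
    score = adjusted_score(5.0, count, GLOBAL_MEAN)
    print("5.0 stars from {:>4} reviews -> adjusted score {:.3f}".format(count, score))
```

A 5-star restaurant with no reviews gets exactly the global mean, one with `k` reviews lands halfway between its own rating and the global mean, and one with many reviews keeps almost its full rating.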

# 2. Building hybrid recommendation engine

## 2.1 Implementation

In [19]:
class Recommender:
    
    def __init__(self, n=5, original_score=False):
        """initiate a Recommender object.
        ---
        Optional keyword arguments to be passed are:
        1. the desired number of recommendations to make ('n'), the default number is 5.
        2. the score for ranking the recommendations ('original_score'): by default, the adjusted score will be used for ranking; 
            To rank by the original average rating of the restaurant, pass original_score=True
        ---
        In addition, a few instance attributes will be initiated upon creation for internal use:        
        1. the attribute '.personalized' is used to keep track of whether a personalized recommendation has been generated or not.
            it only takes one of the following values with the default being 0
            0: no personalization yet
            1: a personalized recommendation has been computed using the collaborative module
            2: a personalized recommendation has been computed using the content-based module
        2. the attribute '.column_to_display' is used to keep track of a list of column names to display in the recommendation results.
            the list will be updated based on the modules being called.
        3. the attribute '.recomm' is used to store the current list of recommendations
        """
        
        self.n = n # number of recommendations to make, default is 5
        self.original_score = original_score # boolean indicating whether the original average rating or the adjusted score is used
        self.personalized = 0 # variable indicating if a personalized recommendation has been generated or not, default is 0
        self.column_to_display = ['state','city','name','address','attributes.RestaurantsPriceRange2','cuisine',\
                                  'style','review_count','stars','adjusted_score'] # initiate a list of columns to display in the recommendation results
        
        # upon class creation, initiate the recommendation to be all the open restaurants from the entire catalog of 'business' dataset sorted by the score of interest
        if self.original_score:  # set sorting criteria to the original star rating
            score = 'stars'
        else:  # set sorting criteria to the adjusted score
            score = 'adjusted_score'
        self.recomm = business[business.is_open == 1].sort_values(score, ascending=False)
        
    def _filter_by_location(self):
        """Filter and update the dataframe of recommendations by the matching location of interest.
        A combination of state, city and zipcode is used as the location information, partially missing information can be handled. 
        Matching restaurant is defined as the restaurant within the acceptable distance (max_distance) of the location of interest.
        note: this hidden method should only be called within the method 'keyword'
        """       
        geolocator = Nominatim(user_agent="yelp_recommender") # use geopy.geocoders to make geolocation queries
        address = [self.city, self.state, self.zipcode]
        address = ",".join([str(i) for i in address if i != None])
        # use geolocate query to find the coordinate for the location of interest
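        try:
            location = geolocator.geocode(address, timeout=10) 
        except GeocoderTimedOut as e:
            print("Error: geocode failed to locate the address of interest {} with message {}".format(address, e))
            location = None
        if location is None:  # the lookup timed out or the address could not be resolved
            self.recomm = self.recomm.iloc[0:0]  # empty result, so that 'keyword' reports no matching location
            return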

        # calculate the geodesic distance between each restaurant and the location of interest and add as a new column 'distance_to_interest'
        self.recomm['distance_to_interest'] = self.recomm.apply(lambda row: great_circle_mile(row.latitude, row.longitude, location.latitude, location.longitude), axis=1)
        # add the new column 'distance_to_interest' to the list of columns to display in the recommendation result
        self.column_to_display.insert(0, 'distance_to_interest')
        # filter by the desired distance
        self.recomm = self.recomm[self.recomm.distance_to_interest <= self.max_distance]

    def _filter_by_state(self):
        """ Filter and update the dataframe of recommendations by the matching state.
        note: this hidden method should only be called within the method 'keyword'
        """
        self.recomm = self.recomm[self.recomm.state == self.state.upper()]
    
    def _filter_by_cuisine(self):
        """ Filter and update the dataframe of recommendations by the matching cuisine of interest. 
        note: this hidden method should only be called within the method 'keyword'
        """                         
        idx = []
        for i in self.recomm.index: 
            if pd.notna(self.recomm.loc[i,'cuisine']):  # skip restaurants without cuisine information
                entries = [e.strip() for e in self.recomm.loc[i,'cuisine'].split(',')]  # tolerate "a, b" as well as "a,b"
                if self.cuisine in entries:
                    idx.append(i)
        self.recomm = self.recomm.loc[idx]
    
    def _filter_by_style(self):  
        """ Filter and update the dataframe of recommendations by the matching style of interest. 
        note: this hidden method should only be called within the method 'keyword'
        """
        idx = []
        for i in self.recomm.index: 
            if pd.notna(self.recomm.loc[i,'style']):  # skip restaurants without style information
                entries = [e.strip() for e in self.recomm.loc[i,'style'].split(',')]  # tolerate "a, b" as well as "a,b"
                if self.style in entries:
                    idx.append(i)
        self.recomm = self.recomm.loc[idx]
        
    def _filter_by_price(self):
        """Filter and update the dataframe of recommendations by the matching price range of interest. 
        note: this hidden method should only be called within the method 'keyword'
        """
        self.recomm = self.recomm[self.recomm['attributes.RestaurantsPriceRange2'].isin(self.price)]
    
    def display_recommendation(self):
        """ Display the list of top n recommended restaurants
        """
        if len(self.recomm) == 0:
            print("Sorry, there is no matching recommendations.")
        elif self.n < len(self.recomm):  # display only the top n from the recommendation list
            print("Below is a list of the top {} recommended restaurants for you: ".format(self.n))
            print(self.recomm.iloc[:self.n][self.column_to_display])
        else:  # display all if # of recommendations is less than self.n
            print("Below is a list of the top {} recommended restaurants for you: ".format(len(self.recomm)))
            print(self.recomm[self.column_to_display])
     
    #---------------------------------------------------------------
    # non-personalized keyword filtering-based recommender module
    def keyword(self, df=business[business.is_open == 1], zipcode=None, city=None, state=None, max_distance=10, cuisine=None, style=None, price=None, personalized=False):
        """Non-personalized recommendation by keyword filtering: 
        Support filtering by the distance and location (zipcode, city, state) of interest, 
        by the desired cuisine, by the desired style, and by the desired price range. 
        The module supports multiple price range inputs separated by comma.
        ---
        Note:
        df: the default restaurant catalog is all the open restaurants in the 'business' dataframe, 
            if a subset is preferred, e.g. a previously filtered result, the subset can be passed to df
        state: needs to be the upper case of the state abbreviation, e.g.: 'NV', 'CA'
        max_distance: the max acceptable distance between the restaurant and the location of interest, unit is in miles, default is 10
        ---
        """
    
        # re-initiate the following variables every time the module is called so that the recommendation starts fresh
        self.recomm = df.copy() # start with a copy of the desired restaurant catalog so that the caller's dataframe is not modified
        self.recomm['distance_to_interest'] = np.nan # reset the distance between each restaurant and the location of interest
        self.column_to_display = ['state','city','name','address','attributes.RestaurantsPriceRange2','cuisine','style','review_count','stars','adjusted_score'] # reset the columns to display
        
        # assign variables based on user's keyword inputs
        self.zipcode = zipcode
        self.city = city
        self.state = state 
        self.max_distance = max_distance
        self.cuisine = cuisine
        self.style = style
        self.price = price
        
        # check self.personalized and the column names to see whether a personalized score is available for ranking and displaying personalized recommendations
        if personalized:
            if (self.personalized == 0) or ('predicted_stars' not in self.recomm.columns and 'similarity_score' not in self.recomm.columns):
                print("no personalized list of recommendations is generated yet!")
                print("please first run the collaborative recommender module or content-based recommender module for personalized recommendations.")
                return None
        
        # filter by restaurant location
        if (self.zipcode != None) or (self.city != None) or (self.state != None):      
            if (self.zipcode != None) or (self.city != None): # use zipcode and/or city whenever available
                self._filter_by_location()
            else: # filter by state if state is the only location information available 
                self._filter_by_state()
            if len(self.recomm) == 0:
                print("no restaurant found for the matching location of interest.")
                return None
        
        # filter by restaurant 'cuisine'
        if self.cuisine != None:
            self._filter_by_cuisine()
            if len(self.recomm) == 0:
                print("no restaurant found for the matching cuisine of {}".format(self.cuisine))
                return None
    
        # filter by restaurant 'style'
        if self.style != None:
            self._filter_by_style() 
            if len(self.recomm) == 0:
                print("no restaurant found for the matching style of {}".format(self.style))
                return None
        
        # filter by restaurant price range
        if self.price != None:
            self.price = [float(i.strip()) for i in price.split(',')] # extract multiple price ranges; cast to float to match the numeric price column
            self._filter_by_price()
            if len(self.recomm) == 0:
                print("no restaurant found for the matching price of {}".format(self.price))
                return None
        
        # sort the matching list of restaurants by the score of interest
        if personalized:
            if self.personalized == 1:
                score = 'predicted_stars'
                self.column_to_display.insert(0, 'predicted_stars')  # add 'predicted_stars' to the list of columns to display
            elif self.personalized == 2:
                score = 'similarity_score'
                self.column_to_display.insert(0, 'similarity_score')  # add 'similarity_score' to the list of columns to display
        elif self.original_score:  # set sorting criteria to the original star rating
            score = 'stars'
        else:  # set sorting criteria to the adjusted score
            score = 'adjusted_score'
        self.recomm = self.recomm.sort_values(score, ascending=False)
        
        # display the list of top n recommendations
        self.display_recommendation()
        
        return self.recomm
    
    #------------------------------------------------------------
    # personalized collaborative recommender module
    def collaborative(self, user_id=None):
        """Passing of user_id is required if personalized recommendation is desired.
        """
        
        self.user_id = user_id # user_id for personalized recommendation using collaborative filtering 
        if self.user_id is None:
            print("no user_id is provided!")
            return None
        if len(self.user_id) != 22:
            print("invalid user id!")
            return None
        
        # initiate every time the module is called
        self.recomm = business[business.is_open ==1] # start with all open restaurants from the entire 'business' catalog
        self.column_to_display = ['state','city','name','address','attributes.RestaurantsPriceRange2',\
                                  'cuisine','style','review_count','stars','adjusted_score'] # reset the columns to display
        if 'predicted_stars' in self.recomm.columns:
            self.recomm.drop('predicted_stars', axis=1, inplace=True) # delete the column of 'predicted_stars' if already present
        
        # load and extract the necessary info from the trained matrix factorization algorithm
        with open('svd_trained_info.pkl', 'rb') as f:
            svd_trained_info = pickle.load(f)
        user_latent = svd_trained_info['user_latent']
        item_latent = svd_trained_info['item_latent']
        user_bias = svd_trained_info['user_bias']
        item_bias = svd_trained_info['item_bias']
        r_mean = svd_trained_info['mean_rating'] # global mean of all ratings
        userid_to_idx = svd_trained_info['userid_to_index']
        itemid_to_idx = svd_trained_info['itemid_to_index']
        
        # predict personalized restaurant ratings for the user_id of interest
        if self.user_id in userid_to_idx:
            u_idx = userid_to_idx[self.user_id]
            pred = r_mean + user_bias[u_idx] + item_bias + np.dot(user_latent[u_idx,:],item_latent.T)
        else: 
            print("sorry, no personal data available for this user_id yet!")
            print("Here is the generic recommendation computed from all the users in our database:")
            pred = r_mean + item_bias
        
        # pairing the predicted ratings with the business_id by matching the corresponding matrix indices of the business_id
        prediction = pd.DataFrame(data=pred, index=itemid_to_idx.values(), columns=['predicted_stars']) 
        prediction.index.name = 'matrix_item_indice'
        assert len(prediction) == len(pred)
        prediction['business_id'] = list(itemid_to_idx.keys())
        
        # filter to unrated business_id only by the user_id of interest if a personal history is available
        if self.user_id in userid_to_idx:       
            busi_rated = review[review.user_id == self.user_id].business_id.unique()
            prediction = prediction[~prediction.business_id.isin(busi_rated)]
        
        # inner-join the prediction dataframe with the recommendation catalog on 'business_id' to retrieve all relevant business information
        # note: the .merge step needs to be performed prior to extracting the top n
        # because many of the 'business_id' in the review dataframe are not restaurant-related, therefore not present in the 'business' catalog
        self.recomm = self.recomm.merge(prediction, on='business_id', how='inner') 
        
        # sort the prediction by the predicted ratings in descending order
        self.recomm = self.recomm.sort_values('predicted_stars', ascending=False).reset_index(drop=True)
        
        # add 'predicted_stars' to the list of columns to display and update self.personalized to 1
        self.column_to_display.insert(0, 'predicted_stars') 
        self.personalized = 1
        
        # display the list of top n recommendations
        self.display_recommendation()
        
        return self.recomm
    
    
    #------------------------------------------------------------
    # personalized content-based recommender module
    def content(self, user_id=None):
        """Passing of user_id is required if personalized recommendation is desired.
        """
        
        self.user_id = user_id # user_id for personalized recommendation using content-based filtering 
        if self.user_id is None:
            print("no user_id is provided!")
            return None
        if len(self.user_id) != 22:
            print("invalid user id!")
            return None
        if self.user_id not in review_s.user_id.unique(): # check if previous restaurant rating/review history is available for the user_id of interest
            print("sorry, no personal data available for this user_id yet!")
            return None
        
        # initiate every time the module is called
        self.recomm = business[business.is_open ==1] # start with all open restaurants from the entire 'business' catalog
        self.column_to_display = ['state','city','name','address','attributes.RestaurantsPriceRange2',\
                                  'cuisine','style','review_count','stars','adjusted_score'] # reset the columns to display
        if 'similarity_score' in self.recomm.columns:
            self.recomm.drop('similarity_score', axis=1, inplace=True) # delete the column of 'similarity_score' if already present
        
        # load the saved restaurant pca feature vectors
        with open('rest_pcafeature_all.pkl', 'rb') as f:
            rest_pcafeature = pickle.load(f)
            
        # load the saved user pca feature vectors
        max_bytes = 2**31 - 1
        bytes_in = bytearray(0)
        input_size = os.path.getsize('user_pcafeature_all.pkl')
        with open('user_pcafeature_all.pkl','rb') as f: 
            for _ in range(0, input_size, max_bytes):
                bytes_in += f.read(max_bytes)
            user_pcafeature = pickle.loads(bytes_in)
        
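        # predict personalized similarity scores for the user_id of interest
        # note: linear_kernel returns dot products, which equal cosine similarities only if the feature vectors are L2-normalized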
        sim_matrix = linear_kernel(user_pcafeature.loc[user_id].values.reshape(1, -1), rest_pcafeature)
        sim_matrix = sim_matrix.flatten()
        sim_matrix = pd.Series(sim_matrix, index = rest_pcafeature.index)
        sim_matrix.name = 'similarity_score'
        
        # pairing the computed cosine similarity score with the business_id by matching the corresponding matrix indices of the business_id
        self.recomm = pd.concat([sim_matrix, self.recomm.set_index('business_id')], axis=1, join='inner').reset_index()
        
        # filter to unrated business_id only by the user_id of interest if a personal history is available      
        busi_rated = review_s[review_s.user_id == self.user_id].business_id.unique()
        self.recomm = self.recomm[~self.recomm.business_id.isin(busi_rated)]
               
        # sort the recommendation by the cosine similarity score in descending order
        self.recomm = self.recomm.sort_values('similarity_score', ascending=False).reset_index(drop=True)
           
        # add 'similarity_score' to the list of columns to display and update self.personalized to 2
        self.column_to_display.insert(0, 'similarity_score') 
        self.personalized = 2
        
        # display the list of top n recommendations
        self.display_recommendation()
        
        return self.recomm
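
One detail of the `content` module is worth illustrating: the scores are described as cosine similarities but are computed with `linear_kernel`, which returns plain dot products. The two agree only when the vectors have unit length, and whether the saved PCA feature vectors are normalized is not shown here. The sketch below uses small made-up vectors (not the saved pickles) to show how the ranking can differ.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel
from sklearn.preprocessing import normalize

# hypothetical feature vectors, one row per restaurant (illustrative values only)
rest_features = np.array([
    [3.0, 0.0],   # restaurant 0: long vector pointing in a different direction
    [0.6, 0.8],   # restaurant 1: unit vector, same direction as the user
    [0.0, 0.5],   # restaurant 2: short vector
])
user_profile = np.array([[0.6, 0.8]])  # hypothetical user preference vector

dot_scores = linear_kernel(user_profile, rest_features).ravel()
cos_scores = cosine_similarity(user_profile, rest_features).ravel()

# L2-normalizing both sides first makes the dot product equal to the cosine
unit_scores = linear_kernel(normalize(user_profile), normalize(rest_features)).ravel()

print("dot product:", dot_scores, "-> best match:", dot_scores.argmax())
print("cosine     :", cos_scores, "-> best match:", cos_scores.argmax())
print("normalized dot product matches cosine:", np.allclose(unit_scores, cos_scores))
```

With unnormalized vectors, the dot product favors restaurants whose feature vectors are simply larger in magnitude; normalizing the rows once before saving them would keep `linear_kernel` fast while giving true cosine scores.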

## 2.2 Testing

### 2.2.1 Testing of the non-personalized keyword filtering recommender module

In [6]:
%%time
# initiate a Recommender object
kw = Recommender(n=3)

# test0: display only (same as no keywords)
print("------\nresult from test0 (display only): ")
kw.display_recommendation()

# test1: no keywords
print("------\nresult from test1 (no keywords): ")
kw.keyword();

# test 2: a combination of city, state and zipcode
print("------\nresult from test2 (a combination of city, state and zipcode): ")
kw.keyword(city='Phoenix', state='AZ', zipcode='85023');

# test 3: a combination of cuisine and style
print("------\nresult from test3 (a combination of cuisine and style): ")
kw.keyword(cuisine='barbeque', style='restaurants');

# test 4: a combination of state, cuisine and style
print("------\nresult from test4 (a combination of state, cuisine and style): ")
kw.keyword(state='NV', cuisine='desserts', style='restaurants');

# test 5: no matching location
print("------\nresult from test5 (no matching location): ")
kw.keyword(city='milpitas', zipcode='95035');

# test 6: no matching 'cuisine'
print("------\nresult from test6 (no matching cuisine): ")
kw.keyword(cuisine='abc');

# test 7: no matching 'style'
print("------\nresult from test7 (no matching style): ")
kw.keyword(style='abc');

# test 8: a combination of location, cuisine and style
print("------\nresult from test8 (a combination of location, cuisine and style): ")
kw.keyword(city='Phoenix', zipcode='85023',cuisine='barbeque', style='restaurants');

# test 9: a combination of price range, cuisine and style
print("------\nresult from test9 (a combination of price range, cuisine and style): ")
kw.keyword(price='1', cuisine='barbeque', style='restaurants');

# test 10: a combination of two price ranges, location, cuisine and style
print("------\nresult from test10 (a combination of two price ranges, location, cuisine and style): ")
kw.keyword(price='1, 2', zipcode='85023',cuisine='barbeque', style='restaurants');

# test 11: use the original average rating and return top 10 recommendations
print("------\nresult from test11 (top 10 recommendations ranked by original average rating): ")
kw2 = Recommender(n=10, original_score=True)
kw2.keyword(city='Phoenix', zipcode='85023',cuisine='barbeque', style='restaurants');

------
result from test0 (display only): 
Below is a list of the top 3 recommended restaurants for you: 
      state       city             name                       address  \
7464     AZ    Phoenix  Little Miss BBQ          4301 E University Dr   
31910    NV  Las Vegas     Brew Tea Bar  7380 S Rainbow Blvd, Ste 101   
45401    NV  Las Vegas       Gelatology  7910 S Rainbow Blvd, Ste 110   

       attributes.RestaurantsPriceRange2                              cuisine  \
7464                                 2.0                             barbeque   
31910                                1.0                 desserts, bubble tea   
45401                                1.0  ice cream & frozen yogurt, desserts   

                    style  review_count  stars  adjusted_score  
7464          restaurants          1746    5.0        4.984169  
31910  cafes, restaurants          1380    5.0        4.980037  
45401                 NaN           547    5.0        4.950811  
------
result fro

Below is a list of the top 10 recommended restaurants for you: 
       distance_to_interest state     city  \
13651              2.566000    AZ  Phoenix   
44933              1.077608    AZ  Phoenix   
9236               4.550027    AZ  Phoenix   
23589              9.583306    AZ  Phoenix   
25730              7.104974    AZ  Phoenix   
15502              6.417437    AZ  Phoenix   
15693              9.044423    AZ  Phoenix   
44683              6.068879    AZ  Phoenix   
41810              6.005243    AZ  Phoenix   
213                6.576795    AZ   Peoria   

                                         name                        address  \
13651                      Big Cuz's Catering  428 E Thunderbird Rd, Ste 424   
44933                 Pork on a Fork Catering                 1732 W Bell Rd   
9236                                  Bobby Q                8501 N 27th Ave   
23589                         Reathrey Sekong        1312 E Indian School Rd   
25730                   Papa 

As shown, 11 queries (tests 1-11; test 0 only displays the initial list) are performed with a total CPU time of 10 seconds and an elapsed time of 15 seconds. This averages to roughly 1-2 seconds per query, which is reasonable in practice.

### 2.2.2 Testing of the personalized collaborative recommender module

In [20]:
%%time

# initiate a Recommender object
col = Recommender(n=5)

# test0: display only (same as no keywords)
print("------\nresult from test0 (display only): ")
col.display_recommendation()

# test1: no user id input
print("------\nresult from test1 (no user id input): ")
col.collaborative();

# test 2: invalid user id input
print("------\nresult from test2 (invalid user id input): ")
col.collaborative(user_id='928402');

------
result from test0 (display only): 
Below is a list of the top 5 recommended restaurants for you: 
      state             city                name  \
7464     AZ          Phoenix     Little Miss BBQ   
31910    NV        Las Vegas        Brew Tea Bar   
45401    NV        Las Vegas          Gelatology   
7784     NV  North Las Vegas        Poke Express   
28162    NV        Las Vegas  Meráki Greek Grill   

                            address  attributes.RestaurantsPriceRange2  \
7464           4301 E University Dr                                2.0   
31910  7380 S Rainbow Blvd, Ste 101                                1.0   
45401  7910 S Rainbow Blvd, Ste 110                                1.0   
7784        655 W Craig Rd, Ste 118                                2.0   
28162  4950 S Rainbow Blvd, Ste 160                                2.0   

                                   cuisine               style  review_count  \
7464                              barbeque         restau

In [21]:
%%time

# test 3: valid user id (no user data)
print("------\nresult from test3 (valid user id --- no user review data): ")
col.collaborative(user_id='-NzChtoNOw706kps82x0Kg');

------
result from test3 (valid user id --- no user review data): 
sorry, no personal data available for this user_id yet!
Here is the generic recommendation computed from all the users in our database:
Below is a list of the top 5 recommended restaurants for you: 
   predicted_stars state       city                              name  \
0         4.972518    AZ      Tempe  Affordable Party & Event Rentals   
1         4.952788    NV  Henderson                        Party Pros   
2         4.939452    NV  Henderson                    Firelight Barn   
3         4.937603    AZ    Phoenix         La Parilla Villa Catering   
4         4.930186    WI    Madison           The Conscious Carnivore   

                         address  attributes.RestaurantsPriceRange2  \
0         510 S 52nd St, Ste 105                                NaN   
1              1153 Enchanted Ct                                NaN   
2  133 W Lake Mead Pkwy, Ste 140                                2.0   
3          

In [10]:
%%time

# test 4: valid user id (user has only one review)
print("------\nresult from test4 (valid user id --- user has only one review): ")
col.collaborative(user_id='---89pEy_h9PvHwcHNbpyg');

------
result from test4 (valid user id --- user has only one review): 
Below is a list of the top 5 recommended restaurants for you: 
   predicted_stars state        city                              name  \
0         5.162014    AZ  Scottsdale                    Aloha Cakes AZ   
1         5.158219    AZ       Tempe  Affordable Party & Event Rentals   
2         5.157192    NV   Las Vegas                  CHEFit Meal Prep   
3         5.152490    NV   Henderson                    Firelight Barn   
4         5.144190    NV   Henderson                        Party Pros   

                         address  attributes.RestaurantsPriceRange2  \
0                            NaN                                2.0   
1         510 S 52nd St, Ste 105                                NaN   
2                6235 S Pecos Rd                                2.0   
3  133 W Lake Mead Pkwy, Ste 140                                2.0   
4              1153 Enchanted Ct                                N

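As shown, the personalized ranking is returned in only 2 seconds. However, because the user's preference history is limited to a single review, the result stays close to the generic recommendation for unseen users: three of the five restaurants (Affordable Party & Event Rentals, Party Pros and Firelight Barn) also appear in the test3 list.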
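
The similarity between the two lists can be quantified with a simple overlap measure. The sketch below is illustrative only: the restaurant names are copied from the test3 and test4 outputs above, while the helper `top_k_overlap` is a hypothetical function that is not part of the Recommender class.

```python
# Top-5 names copied from the test3 (generic) and test4 (one-review user) outputs.
generic_top5 = [
    "Affordable Party & Event Rentals",
    "Party Pros",
    "Firelight Barn",
    "La Parilla Villa Catering",
    "The Conscious Carnivore",
]
one_review_top5 = [
    "Aloha Cakes AZ",
    "Affordable Party & Event Rentals",
    "CHEFit Meal Prep",
    "Firelight Barn",
    "Party Pros",
]


def top_k_overlap(list_a, list_b):
    """Return the shared items (sorted) and the Jaccard similarity of two lists."""
    set_a, set_b = set(list_a), set(list_b)
    shared = set_a & set_b
    union = set_a | set_b
    jaccard = len(shared) / len(union) if union else 0.0
    return sorted(shared), jaccard


shared, jaccard = top_k_overlap(generic_top5, one_review_top5)
print(shared)
print(f"{len(shared)} of 5 shared, Jaccard = {jaccard:.2f}")
```

A higher overlap means the personalized list has moved less far away from the generic one.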

In [12]:
%%time

# test 5: valid user id (user has over 100 reviews)
print("------\nresult from test5 (valid user id --- user has over 100 reviews): ")
col.collaborative(user_id='---1lKK3aKOuomHnwAkAow');

------
result from test5 (valid user id --- user has over 100 reviews): 
Below is a list of the top 5 recommended restaurants for you: 
   predicted_stars state             city                     name  \
0         6.505740    NV        Las Vegas             Kabob N More   
1         6.396645    NV        Las Vegas              Tasty Grill   
2         6.306595    NV        Las Vegas           Tacos N' Ritas   
3         6.051560    NV        Las Vegas  KUMI by Chef Akira Back   
4         6.046227    NV  North Las Vegas  Amazing Thai Restaurant   

                          address  attributes.RestaurantsPriceRange2  \
0           3049 S Las Vegas Blvd                                2.0   
1               4140 S Durango Dr                                1.0   
2  MGM Grand, 3799 Las Vegas Blvd                                2.0   
3           3950 Las Vegas Blvd S                                3.0   
4          3000 W Ann Rd, Ste 109                                2.0   

          

As shown, even for a user with a longer review history, where the module must first remove all restaurants the user has already rated from the candidate list, the personalized ranking is still returned in only 2 seconds. Thanks to the rich preference history, the recommendations are clearly personalized: in this case they suggest that the user prefers popular restaurants with a large number of reviews, reasonable-to-good ratings (3.5-4.5) and a lower price range (\$-\$\$).
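
The filtering step described above can be sketched as follows. This is a minimal illustration under assumed data structures (a dict of predicted stars and a set of already-rated ids), not the module's actual implementation; `recommend_unseen` and the ids `r1` to `r4` are made up for the example.

```python
def recommend_unseen(predictions, rated_ids, k=5):
    """Drop already-rated restaurants, then return the k highest predictions.

    predictions: dict mapping business_id -> predicted stars (assumed layout)
    rated_ids:   set of business_ids the user has already rated
    """
    unseen = {b: s for b, s in predictions.items() if b not in rated_ids}
    ranked = sorted(unseen.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Toy data: the user has already rated r1 and r3.
predictions = {"r1": 4.8, "r2": 4.6, "r3": 4.5, "r4": 4.1}
rated = {"r1", "r3"}
top = recommend_unseen(predictions, rated, k=2)
print(top)
```

Because the lookup uses a set, the cost of the filter grows with the number of candidate restaurants rather than with the length of the user's history.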

In [22]:
%%time

# test 6: valid user id (user has over 100 reviews)
print("------\nresult from test6 (valid user id --- user has over 100 reviews): ")
recomm = col.collaborative(user_id='---1lKK3aKOuomHnwAkAow');

# filter the personalized recommendation with keywords
print("------\nfurther filtering the personalized recommendations by keywords:")
recomm = col.keyword(df=recomm, city='Phoenix', personalized=True)

------
result from test6 (valid user id --- user has over 100 reviews): 
Below is a list of the top 5 recommended restaurants for you: 
   predicted_stars state             city                     name  \
0         6.505740    NV        Las Vegas             Kabob N More   
1         6.396645    NV        Las Vegas              Tasty Grill   
2         6.306595    NV        Las Vegas           Tacos N' Ritas   
3         6.051560    NV        Las Vegas  KUMI by Chef Akira Back   
4         6.046227    NV  North Las Vegas  Amazing Thai Restaurant   

                          address  attributes.RestaurantsPriceRange2  \
0           3049 S Las Vegas Blvd                                2.0   
1               4140 S Durango Dr                                1.0   
2  MGM Grand, 3799 Las Vegas Blvd                                2.0   
3           3950 Las Vegas Blvd S                                3.0   
4          3000 W Ann Rd, Ste 109                                2.0   

          

In [23]:
%%time

# test 7: try to run keyword filtering on personalized recommendation directly
print("------\nresult from test7 (run keyword filtering on personalized recommendations directly):")
col.keyword(city='Phoenix', personalized=True)

------
result from test7 (run keyword filtering on personalized recommendations directly):
no personalized list of recommendations is generated yet!
please first run the collaborative recommender module or content-based recommender module for a personalized recommendations.
CPU times: user 33.5 ms, sys: 4.68 ms, total: 38.2 ms
Wall time: 37.8 ms
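
The behaviour in test7 amounts to a guard: keyword filtering on personalized results is refused until a personalized list exists. How the original class tracks this is not shown here, so the sketch below is only one possible way to do it. `MiniRecommender`, its `personalized_list` attribute and the restaurant entries are hypothetical.

```python
class MiniRecommender:
    """Hypothetical stand-in for the notebook's Recommender class."""

    def __init__(self):
        self.personalized_list = None  # filled in by a personalized module

    def collaborative(self, user_id):
        # Placeholder ranking; a real module would score restaurants for user_id.
        self.personalized_list = [
            {"name": "Restaurant A", "city": "Las Vegas"},
            {"name": "Restaurant B", "city": "Phoenix"},
        ]
        return self.personalized_list

    def keyword(self, city, personalized=False):
        if not personalized:
            # The sketch only covers the personalized path.
            raise NotImplementedError("non-personalized path is not sketched here")
        if self.personalized_list is None:
            # Nothing to filter yet: report it instead of guessing.
            print("no personalized list of recommendations is generated yet!")
            return None
        return [r for r in self.personalized_list if r["city"] == city]


rec = MiniRecommender()
first_try = rec.keyword(city="Phoenix", personalized=True)   # no list yet
rec.collaborative(user_id="some_user")
second_try = rec.keyword(city="Phoenix", personalized=True)  # filters the stored list
print(second_try)
```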


### 2.2.3 Testing of the personalized content-based recommender module

In [24]:
%%time

# initiate a Recommender object
con = Recommender(n=10)

# test0: display only (same as no keywords)
print("------\nresult from test0 (display only): ")
con.display_recommendation()

# test1: no user id input
print("------\nresult from test1 (no user id input): ")
con.content();

# test 2: invalid user id input
print("------\nresult from test2 (invalid user id input): ")
con.content(user_id='928402');

# test 3: valid user id (no user data)
print("------\nresult from test3 (valid user id --- no user review data): ")
con.content(user_id='-NzChtoNOw706kps82x0Kg');

------
result from test0 (display only): 
Below is a list of the top 10 recommended restaurants for you: 
      state             city                                  name  \
7464     AZ          Phoenix                       Little Miss BBQ   
31910    NV        Las Vegas                          Brew Tea Bar   
45401    NV        Las Vegas                            Gelatology   
7784     NV  North Las Vegas                          Poke Express   
28162    NV        Las Vegas                    Meráki Greek Grill   
2684     AZ             Mesa                        Worth Takeaway   
14567    NV        Las Vegas                Free Vegas Club Passes   
11521    NV        Las Vegas  Paranormal - Mind Reading Magic Show   
30972    NV        Las Vegas           Desert Wind Coffee Roasters   
46284    NV        Henderson                                HUMMUS   

                            address  attributes.RestaurantsPriceRange2  \
7464           4301 E University Dr              

In [25]:
%%time

# test 4: valid user id (user has only one review)
print("------\nresult from test4 (valid user id --- user has only one review): ")
con.content(user_id='---89pEy_h9PvHwcHNbpyg');

------
result from test4 (valid user id --- user has only one review): 
Below is a list of the top 10 recommended restaurants for you: 
   similarity_score state       city                               name  \
0          0.920511    NV  Henderson      The Bar At Bermuda & St. Rose   
1          0.913473    NV  Las Vegas  The Bar @ Las Vegas Blvd & Wigwam   
2          0.912803    NV  Las Vegas      The Bar @ Tropicana & Durango   
3          0.859365    NV  Las Vegas            The Bar @Trails Village   
4          0.848265    NV  Las Vegas              Distill - A Local Bar   
5          0.847307    NV  Las Vegas              Distill - A Local Bar   
6          0.841570    NV  Henderson                    Remedy's Tavern   
7          0.814201    NV  Las Vegas                 Sunrise Casablanca   
8          0.813073    NV  Las Vegas                   Aces Bar & Grill   
9          0.803431    NV  Las Vegas                Cactus Jacks Saloon   

                          address  att

The best total time to return the recommendation is around 25 seconds, of which 20 seconds is spent loading the feature vectors. Therefore, one way to speed up the recommender's response time is to load the restaurant and user feature vectors once, when the recommender object is initialized (in the \_\_init\_\_ method).
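
A minimal sketch of this load-once idea is shown below. It is an assumption about how the change could look, not the project's code: `CachedRecommender` and `fake_load_features` are hypothetical names, and the loader returns toy vectors instead of reading the real feature files.

```python
class CachedRecommender:
    """Hypothetical stand-in for the notebook's Recommender class."""

    def __init__(self, load_features):
        # Pay the expensive loading cost once, at construction time.
        self.features = load_features()

    def content(self, user_id):
        # Reuse the cached vectors; nothing is reloaded per request.
        return self.features.get(user_id, [])


load_calls = []  # one entry is appended per loader call


def fake_load_features():
    """Placeholder loader; records each call instead of reading from disk."""
    load_calls.append(1)
    return {"user_a": [0.1, 0.9], "user_b": [0.7, 0.3]}


rec = CachedRecommender(fake_load_features)
rec.content("user_a")
rec.content("user_b")
print(len(load_calls))  # number of times the loader ran
```

The trade-off is a slower object construction and a larger memory footprint in exchange for faster individual requests.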

As shown, the personalized recommendations feature highly rated, mid-price-range bars in and around Las Vegas. They are strongly personalized: the user's only review is of a 4-star nightlife bar in the mid-price range located in Las Vegas, which the user gave 5 stars with strongly positive wording, a clear indication of his/her preference.
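
To illustrate why a single review of a bar pulls other bars to the top, here is a toy cosine-similarity ranking. It is a sketch under simplifying assumptions: the four hand-made features, the candidate names and the choice of using the reviewed bar's vector directly as the user profile are all made up for the example and do not reproduce the module's real feature vectors or weighting.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy feature order: [bar, nightlife, pizza, mid_price]
candidates = {
    "bar_a": [1, 1, 0, 1],
    "bar_b": [1, 1, 0, 0],
    "pizza_c": [0, 0, 1, 1],
}

# One-review user: the profile is simply the feature vector of the reviewed bar.
user_profile = [1, 1, 0, 1]

ranked = sorted(
    ((cosine(user_profile, vec), name) for name, vec in candidates.items()),
    reverse=True,
)
for score, name in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy setup the two bars rank above the pizza place because they share more features with the single reviewed restaurant.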

In [26]:
%%time

# test 5: valid user id (user has over 100 reviews)
print("------\nresult from test5 (valid user id --- user has over 100 reviews): ")
con.content(user_id='Ox89nMY8HpT0vxfKGqDPdA');

------
result from test5 (valid user id --- user has over 100 reviews): 
Below is a list of the top 10 recommended restaurants for you: 
   similarity_score state       city                          name  \
0          0.656317    AZ    Gilbert              Joe's Farm Grill   
1          0.613937    AZ       Mesa                  Orchard Eats   
2          0.583544    AZ    Phoenix                 Welcome Diner   
3          0.575226    AZ    Phoenix  Wally's American Pub N Grill   
4          0.574281    AZ    Phoenix    Phoenix Public Market Cafe   
5          0.572199    AZ      Tempe      Tempe Public Market Cafe   
6          0.568653    AZ    Phoenix       Switch Restaurant & Bar   
7          0.568363    AZ    Phoenix                           FEZ   
8          0.568114    NV  Henderson        Henry's American Grill   
9          0.565331    AZ    Phoenix        St. Francis Restaurant   

                     address  attributes.RestaurantsPriceRange2  \
0      3000 E Ray Rd, Bld

As shown, the personalized recommendation list features popular restaurants (over 150 reviews) in the low-to-mid price range that serve American-style cuisine (pizza, burgers, sandwiches) and are located mostly in Arizona. These are strongly personalized recommendations based on the user's history of 120 restaurant reviews.

In [29]:
%%time

# test 6: valid user id (user has over 100 reviews)
print("------\nresult from test6 (valid user id --- user has over 100 reviews): ")
recomm = con.content(user_id='---1lKK3aKOuomHnwAkAow');

# filter the personalized recommendation with keywords
print("------\nfurther filtering the personalized recommendations by keywords:")
recomm = con.keyword(df=recomm, city='Phoenix', personalized=True)

------
result from test6 (valid user id --- user has over 100 reviews): 
Below is a list of the top 10 recommended restaurants for you: 
   similarity_score state       city                      name  \
0          0.642374    NV  Las Vegas                   Firefly   
1          0.616575    NV  Las Vegas      Julian Serrano Tapas   
2          0.610763    NV  Las Vegas                   Sinatra   
3          0.604054    NV  Las Vegas  Trevi Italian Restaurant   
4          0.602714    NV  Las Vegas       Eatt Gourmet Bistro   
5          0.601211    NV  Las Vegas   Panevino Italian Grille   
6          0.599102    NV  Las Vegas                     Jaleo   
7          0.596687    NV  Las Vegas    Harvest by Roy Ellamar   
8          0.595337    NV  Las Vegas                     Crush   
9          0.594966    NV  Las Vegas                 Canaletto   

                            address  attributes.RestaurantsPriceRange2  \
0                  3824 Paradise Rd                           

In [30]:
%%time

# test 7: try to run keyword filtering on personalized recommendation directly
print("------\nresult from test7 (run keyword filtering on personalized recommendations directly):")
con.keyword(city='Phoenix', personalized=True)

------
result from test7 (run keyword filtering on personalized recommendations directly):
no personalized list of recommendations is generated yet!
please first run the collaborative recommender module or content-based recommender module for a personalized recommendations.
CPU times: user 45.4 ms, sys: 96.4 ms, total: 142 ms
Wall time: 177 ms


# 3. Functions to control API interfaces