<h1 align='center'>I Don't Know - Where Do You Want to Eat?</h1>

<h1 align='center'>🍽️🤷‍♀️🤷‍♀️🤷‍♀️🥢</h1>

## Section 1) Introduction 📗

### It's a tough day at the office, and lunch time is here.  Have you ever had the following exchange with your co-workers and/or friends?

#### 🤔: Where do you want to eat?
#### 🤨: I don't know.  Where do you want to eat?
#### 🙄: I don't know.  Where do you want to eat?

### And on and on and on 🤯.  Before you know it, lunch time is over, and it's time to get back to the grindstone.  You've spent most (if not all) of your lunch time simply trying to figure out where to eat.

#### We live in a world where we are faced with unprecedented choice in our options for entertainment, food, and habitation.  Being spoiled for choice is a double-edged sword though.  In the face of so many options, people are often unhappy with the choices they make, experiencing either buyer's remorse (fear of missing out, aka FOMO) or being so overwhelmed that they fall into analysis paralysis (AP).

#### My project (I Don't Know - Where Do You Want to Eat) aims to help lunch buddies solve the paradox of choice by implementing a hybrid recommender system (both content and collaborative filtering) that finds restaurants matching the group's dining preferences and offers them 3 options close to their central location (the lunch buddies don't have to be at the same location).  The locations of the participants will be used to create a centroid (central location) between them, which will be used in conjunction with queries to the FourSquare API to return candidate restaurants fitting the top criteria of the diners.  I will then build a sparse utilization matrix and feed it to a LightFM model, which uses matrix factorization to fill in any blanks, so that the top 3 recommendations can be returned and visualized.

## Section 2) Data 💾

### I will be using the following data sources to solve this problem:
#### - FourSquare API: To retrieve venues using the <a href='https://developer.foursquare.com/docs/api/venues/search'>`venues/search`</a> and <a href='https://developer.foursquare.com/docs/api/venues/details'>`venues/VENUE_ID`</a> endpoints.  This will be the basis for the item matrix that will help generate the utilization matrix, and it will also provide crucial information for visualizing the final venue selections and their attributes (ratings, addresses, categories, etc.).  I will also call the <a href='https://developer.foursquare.com/docs/api/venues/categories'>`venues/categories`</a> endpoint to query all food venue category ids.  The users will also use this category list to construct their corresponding profiles.
#### - My dear friends and colleagues: My friends and colleagues have generously agreed to create user profiles along with their respective lunch time locations and dining preferences.  I will use this information along with the FourSquare categories to create the user matrix.

In [1]:
# First things first
! conda install -c conda-forge folium --yes
! conda install -c conda-forge pandas-profiling --yes
! conda install -c conda-forge wordcloud --yes
! pip install lightfm
! pip install geocoder

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2019.6.16  |       hecc5488_0         145 KB  conda-forge
    branca-0.3.1               |             py_0          25 KB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    folium-0.10.0              |             py_0          59 KB  conda-forge
    openssl-1.1.1c             |       h516909a_0         2.1 MB  conda-forge
    certifi-2019.6.16          |           py36_1         149 KB  conda-forge
    altair-3.2.0               |           py36_0         770 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         3.3 MB

The following NEW packages will be INSTAL

In [2]:
# Let's get into it!

# standard weapons of choice
import pandas as pd
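try:
    from pandas import json_normalize  # pandas 1.0 and later
except ImportError:
    from pandas.io.json import json_normalize  # older pandas releases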
import numpy as np
import scipy
from scipy.sparse import csr_matrix
from sklearn.model_selection import RandomizedSearchCV
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import pandas_profiling

# for geolocation assistance
import geocoder

# for requesting
import requests

# all the viz
import folium
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud, STOPWORDS

# for the hybrid recommender system
from lightfm import LightFM

%matplotlib inline

In [3]:
# The code was removed by Watson Studio for sharing.

In [4]:
# Creating the url for the categories API call
category_url = f'https://api.foursquare.com/v2/venues/categories?client_id={CLIENT_ID}&client_secret={CLIENT_SECRET}&v={VERSION}'

# Creating the nub of the Food category dataframe with the parent cat name and id
json_response = requests.get(category_url).json()
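# Looking the Food category up by name rather than relying on its position in the response
food_category = next(c for c in json_response['response']['categories'] if c['name'] == 'Food')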
food_id_dataframe = json_normalize(food_category)
food_id_dataframe = food_id_dataframe[['id', 'name']]

# Filling out the rest of the Food category dataframe with all the children and their respective name and ids
food_categories = food_category['categories']
for i, cat in enumerate(food_categories, 1):
    food_id_dataframe.loc[i] = [cat['id'], cat['name']]

food_id_dataframe.set_index('name', inplace=True)

# Here is the completed food_id_dataframe
food_id_dataframe

| name | id |
|---|---|
| Food | 4d4b7105d754a06374d81259 |
| Afghan Restaurant | 503288ae91d4c4b30a586d67 |
| African Restaurant | 4bf58dd8d48988d1c8941735 |
| American Restaurant | 4bf58dd8d48988d14e941735 |
| Asian Restaurant | 4bf58dd8d48988d142941735 |
| Australian Restaurant | 4bf58dd8d48988d169941735 |
| Austrian Restaurant | 52e81612bcbc57f1066b7a01 |
| BBQ Joint | 4bf58dd8d48988d1df931735 |
| Bagel Shop | 4bf58dd8d48988d179941735 |
| Bakery | 4bf58dd8d48988d16a941735 |
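
The loop in the cell above collects only the direct children of the Food category. If deeper levels are ever needed, a recursive walk is one way to gather them. The following is a minimal sketch, assuming every level of the tree nests its children under a `categories` key, as the top level does in the cell above. The sample tree, its names, and its ids are made up for illustration, and `flatten_categories` is a hypothetical helper.

```python
def flatten_categories(category):
    """Yield (name, id) for a category and all of its descendants, depth first."""
    yield category['name'], category['id']
    for child in category.get('categories', []):
        yield from flatten_categories(child)


# Hand-written sample shaped like the category tree; names and ids are placeholders
sample_tree = {
    'id': 'id-0', 'name': 'Example Root',
    'categories': [
        {'id': 'id-1', 'name': 'Example Parent',
         'categories': [{'id': 'id-2', 'name': 'Example Child', 'categories': []}]},
        {'id': 'id-3', 'name': 'Example Sibling', 'categories': []},
    ],
}

# Mapping from category name to category id, in depth-first order
name_to_id = dict(flatten_categories(sample_tree))
print(name_to_id)
```

The resulting dictionary could be turned into a dataframe with `pd.Series(name_to_id, name='id')` if the same name-indexed layout is wanted.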


In [5]:
# ! pip install openpyxl # You may need to install this in your environment if it's not already included in your Anaconda distro

# Now to construct the user_feature matrix
user_category_df = food_id_dataframe.T
user_category_df.drop('id', axis=0, inplace=True)
user_category_df

# I need my friends and colleagues to add their user preferences for each restaurant category to the dataframe, so in order to collect their inputs I need an .xlsx version of the user_feature dataframe
user_category_df.to_excel('request_inputs.xlsx')

In [6]:
# Through the magic of time, my friends and colleagues have input their preferences for each cuisine.  I'll add them now.
# I gave my friends and colleagues instructions to fill out the user_category matrix with a rank of 1 - 5 (1 being lowest and 5 being highest).
# If a friend didn't have a preference or experience with a particular cuisine, I instructed them to leave the field blank.  The recommender system will take care of that.
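
One way the blank fields described above could be carried into a sparse matrix is sketched here. This is only an illustration under stated assumptions: the `ratings` array is invented, with `np.nan` standing in for a blank cell, and the matrix is built with SciPy's `coo_matrix` so that blanks are simply absent rather than stored as zeros.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical 1-5 ratings: rows are users, columns are cuisine categories,
# and np.nan marks a field that was left blank
ratings = np.array([
    [5.0, np.nan, 3.0],
    [np.nan, 4.0, np.nan],
    [1.0, np.nan, np.nan],
])

# Keep only the cells that were actually filled in
rows, cols = np.nonzero(~np.isnan(ratings))
interactions = coo_matrix((ratings[rows, cols], (rows, cols)), shape=ratings.shape)

print(interactions.shape, interactions.nnz)
```

Only the observed ratings are stored, so the model is free to predict values for the missing user and cuisine pairs.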


In [7]:
# I also have the preferences of the users who submitted profiles

user_pref_column_names = ['user_name', 'loc_name', 'loc_address', 'pref_1', 'pref_2', 'pref_3', 'pref_4', 'pref_5']

user_loc_pref_df = pd.DataFrame(columns=user_pref_column_names)

def add_user_pref(username, locname, locaddress, pref1, pref2, pref3, pref4, pref5):
    '''
    Adds individual users to the user_loc_pref_df
    '''
    user_dict = {user_pref_column_names[0]: username, 
                 user_pref_column_names[1]: locname, 
                 user_pref_column_names[2]: locaddress, 
                 user_pref_column_names[3]: pref1, 
                 user_pref_column_names[4]: pref2, 
                 user_pref_column_names[5]: pref3, 
                 user_pref_column_names[6]: pref4, 
                 user_pref_column_names[7]: pref5}
    user_loc_pref_df.loc[len(user_loc_pref_df)] = user_dict

add_user_pref('Jessica', 'Amazon Doppler', '2021 7th Ave, Seattle, WA 98121', 'Asian Restaurant', 'Food Truck', 'Greek Restaurant', 'Indian Restaurant', 'Mediterranean Restaurant')
add_user_pref('Candy', 'The Park Place Building', '1200 6th Ave, Seattle, WA 98101', 'American Restaurant', 'Dumpling Restaurant', 'Italian Restaurant', 'Pizza Place', 'Steakhouse')
add_user_pref('Jessie', 'The IBM Building', '1200 5th Ave, Seattle, WA 98101', 'Greek Restaurant', 'Indian Restaurant', 'Mediterranean Restaurant', 'Middle Eastern Restaurant', 'Vegetarian / Vegan Restaurant')
add_user_pref('Elizabeth', 'Amazon Doppler', '2021 7th Ave, Seattle, WA 98121', 'Asian Restaurant', 'Indian Restaurant', 'Middle Eastern Restaurant', 'Soup Place', 'Vegetarian / Vegan Restaurant')
add_user_pref('Tracey', 'Amazon Kumo', '1915 Terry Ave, Seattle, WA 98101', '', '', '', '', '')
add_user_pref('David', 'Amazon re:Invent', '2121 8th Ave, Seattle, WA 98121', '', '', '', '', '')
add_user_pref('Anna', 'The Walt Disney Company', '925 4th Ave, Seattle, WA 98104', 'Indian Restaurant', 'Mediterranean Restaurant', 'Molecular Gastronomy Restaurant', 'Tea Room', 'Turkish Restaurant')
add_user_pref('Rosemary', 'Westlake Center', '400 Pine St, Seattle, WA 98101', 'African Restaurant', 'Bubble Tea Shop', 'Dumpling Restaurant', 'Seafood Restaurant', 'Vegetarian / Vegan Restaurant')

user_loc_pref_df.set_index('user_name', inplace=True)
user_loc_pref_df

| user_name | loc_name | loc_address | pref_1 | pref_2 | pref_3 | pref_4 | pref_5 |
|---|---|---|---|---|---|---|---|
| Jessica | Amazon Doppler | 2021 7th Ave, Seattle, WA 98121 | Asian Restaurant | Food Truck | Greek Restaurant | Indian Restaurant | Mediterranean Restaurant |
| Candy | The Park Place Building | 1200 6th Ave, Seattle, WA 98101 | American Restaurant | Dumpling Restaurant | Italian Restaurant | Pizza Place | Steakhouse |
| Jessie | The IBM Building | 1200 5th Ave, Seattle, WA 98101 | Greek Restaurant | Indian Restaurant | Mediterranean Restaurant | Middle Eastern Restaurant | Vegetarian / Vegan Restaurant |
| Elizabeth | Amazon Doppler | 2021 7th Ave, Seattle, WA 98121 | Asian Restaurant | Indian Restaurant | Middle Eastern Restaurant | Soup Place | Vegetarian / Vegan Restaurant |
| Tracey | Amazon Kumo | 1915 Terry Ave, Seattle, WA 98101 | | | | | |
| David | Amazon re:Invent | 2121 8th Ave, Seattle, WA 98121 | | | | | |
| Anna | The Walt Disney Company | 925 4th Ave, Seattle, WA 98104 | Indian Restaurant | Mediterranean Restaurant | Molecular Gastronomy Restaurant | Tea Room | Turkish Restaurant |
| Rosemary | Westlake Center | 400 Pine St, Seattle, WA 98101 | African Restaurant | Bubble Tea Shop | Dumpling Restaurant | Seafood Restaurant | Vegetarian / Vegan Restaurant |


In [8]:
# Now to add the latitude and longitude of each of the users' locations
for user, address in zip(user_loc_pref_df.index, user_loc_pref_df['loc_address']):
    location = geocoder.arcgis(address)
    user_loc_pref_df.at[user, 'loc_lat'] = location.latlng[0]
    user_loc_pref_df.at[user, 'loc_lng'] = location.latlng[1]

user_loc_pref_df

| user_name | loc_name | loc_address | pref_1 | pref_2 | pref_3 | pref_4 | pref_5 | loc_lat | loc_lng |
|---|---|---|---|---|---|---|---|---|---|
| Jessica | Amazon Doppler | 2021 7th Ave, Seattle, WA 98121 | Asian Restaurant | Food Truck | Greek Restaurant | Indian Restaurant | Mediterranean Restaurant | 47.615461 | -122.338238 |
| Candy | The Park Place Building | 1200 6th Ave, Seattle, WA 98101 | American Restaurant | Dumpling Restaurant | Italian Restaurant | Pizza Place | Steakhouse | 47.608968 | -122.332448 |
| Jessie | The IBM Building | 1200 5th Ave, Seattle, WA 98101 | Greek Restaurant | Indian Restaurant | Mediterranean Restaurant | Middle Eastern Restaurant | Vegetarian / Vegan Restaurant | 47.608288 | -122.333385 |
| Elizabeth | Amazon Doppler | 2021 7th Ave, Seattle, WA 98121 | Asian Restaurant | Indian Restaurant | Middle Eastern Restaurant | Soup Place | Vegetarian / Vegan Restaurant | 47.615461 | -122.338238 |
| Tracey | Amazon Kumo | 1915 Terry Ave, Seattle, WA 98101 | | | | | | 47.616665 | -122.334295 |
| David | Amazon re:Invent | 2121 8th Ave, Seattle, WA 98121 | | | | | | 47.616852 | -122.338556 |
| Anna | The Walt Disney Company | 925 4th Ave, Seattle, WA 98104 | Indian Restaurant | Mediterranean Restaurant | Molecular Gastronomy Restaurant | Tea Room | Turkish Restaurant | 47.605826 | -122.332736 |
| Rosemary | Westlake Center | 400 Pine St, Seattle, WA 98101 | African Restaurant | Bubble Tea Shop | Dumpling Restaurant | Seafood Restaurant | Vegetarian / Vegan Restaurant | 47.611338 | -122.337339 |


In [9]:
# Let's find the central location of the lunch buddies.  This will be done by getting the mean latitude and longitude of all the users.
central_lat = user_loc_pref_df['loc_lat'].mean()
central_lng = user_loc_pref_df['loc_lng'].mean()

print(central_lat, central_lng)

47.61235729895692 -122.33565420862787
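
Averaging latitudes and longitudes treats the map as flat, which is a reasonable approximation over a few city blocks. A quick way to sanity-check the result is to measure how far each person is from the computed center. The sketch below is standalone and uses three of the geocoded coordinates from the table above. `centroid` and `haversine_m` are hypothetical helper names, and the Earth radius of 6,371 km is the usual mean-radius approximation.

```python
from math import radians, sin, cos, asin, sqrt


def centroid(points):
    """Arithmetic mean of (lat, lng) pairs."""
    lats, lngs = zip(*points)
    return sum(lats) / len(lats), sum(lngs) / len(lngs)


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(h))


# Three of the geocoded user locations from the table above
points = [(47.615461, -122.338238), (47.608968, -122.332448), (47.605826, -122.332736)]
center = centroid(points)

# Distance from the center to the person who would have to walk the farthest
farthest = max(haversine_m(center, p) for p in points)
print(center, round(farthest))
```

If the farthest walk turns out to be uncomfortably long, that is a hint that the plain mean may not be the fairest meeting point for the group.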


In [16]:
# What to do about people at the same location?  Let's group them for mapping purposes
users_at_same_loc = user_loc_pref_df.index[user_loc_pref_df.duplicated(subset=['loc_lat', 'loc_lng'], keep=False)].tolist()
users_at_same_loc

['Jessica', 'Elizabeth']
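
Knowing who shares a location is useful for mapping, since markers drawn at identical coordinates cover one another. One possible approach, sketched here on a small stand-in for `user_loc_pref_df`, is to group by coordinates and join the names so that each location needs only one marker. The stand-in frame `df` and the result `grouped` are illustrative names and not part of the notebook.

```python
import pandas as pd

# Small stand-in for user_loc_pref_df, using coordinates from the table above
df = pd.DataFrame(
    {'loc_name': ['Amazon Doppler', 'The IBM Building', 'Amazon Doppler'],
     'loc_lat': [47.615461, 47.608288, 47.615461],
     'loc_lng': [-122.338238, -122.333385, -122.338238]},
    index=pd.Index(['Jessica', 'Jessie', 'Elizabeth'], name='user_name'),
)

# One row per distinct location, with the names of everyone there joined together
grouped = (
    df.reset_index()
      .groupby(['loc_lat', 'loc_lng', 'loc_name'], sort=False)['user_name']
      .agg(', '.join)
      .reset_index()
)

for row in grouped.itertuples(index=False):
    print(f'{row.user_name} at {row.loc_name}')
```

Each row of `grouped` could then feed a single map marker whose popup lists everyone at that address.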

In [10]:
# Let's map out Seattle, WA USA
seattle_wa = geocoder.arcgis('Seattle, WA')

seattle_lat, seattle_lng = seattle_wa.latlng[0], seattle_wa.latlng[1] 
seattle_map = folium.Map(location=(seattle_lat, seattle_lng), zoom_start=14)

for lat, lng, uname, locname in zip(user_loc_pref_df['loc_lat'], user_loc_pref_df['loc_lng'], user_loc_pref_df.index, user_loc_pref_df['loc_name']):
    folium.CircleMarker(location=(lat, lng), popup=f'{uname} is located at {locname}').add_to(seattle_map)

icon = folium.features.CustomIcon('https://ss3.4sqi.net/img/categories_v2/food/breakfast_bg_32.png', icon_size=(32, 32))
folium.Marker(location=(central_lat, central_lng), popup='This is the central location', icon=icon).add_to(seattle_map)


seattle_map

In [11]:
# Now that we have the central location, we can make calls to the FourSquare API to explore which restaurants are in the area around the central location
radius = 100 # in meters
limit = 10

cat_id = food_id_dataframe[food_id_dataframe.index == 'Food']['id'].values[0]

# Creating the url for the venues search API call
food_venues_url = f'https://api.foursquare.com/v2/venues/search?client_id={CLIENT_ID}&client_secret={CLIENT_SECRET}&v={VERSION}&ll={central_lat},{central_lng}&radius={radius}&categoryId={cat_id}&limit={limit}'

# Requesting the food venues around the central location and printing the key fields of each one
food_response = requests.get(food_venues_url).json()
venues = food_response['response']['venues']
for venue in venues:
    categories = venue['categories']
    icon_urls = [c['icon']['prefix'] + 'bg_32' + c['icon']['suffix'] for c in categories]
    print(venue['id'], venue['name'], venue['location']['lat'], venue['location']['lng'],
          venue['location'].get('address'),  # None when a venue has no address
          [c['id'] for c in categories], [c['name'] for c in categories],
          icon_urls[0] if icon_urls else None)  # None when a venue has no category
# venues_dataframe = json_normalize(cat['id'], cat['name'], cat['location']['lat'], cat['location']['lng'], cat['location']['address'], [c['id'] for c in cat['categories']], [c['name'] for c in cat['categories']], [c['icon']['prefix']+'bg_44'+c['icon']['suffix'] for c in cat['categories']][0])
# venues_dataframe = food_id_dataframe[['id', 'name']]



4b67331cf964a520fe402be3 Starbucks 47.61251678 -122.3355006 600 Pine St ['4bf58dd8d48988d1e0931735'] ['Coffee Shop'] https://ss3.4sqi.net/img/categories_v2/food/coffeeshop_bg_32.png
573b2c3038faeff59fd2c4c3 Ebar 47.6123401969812 -122.33605742454529 500 Pine St ['4bf58dd8d48988d16d941735'] ['Café'] https://ss3.4sqi.net/img/categories_v2/food/cafe_bg_32.png
582f57c70162176af65dcf1a Din Tai Fung Dumpling House 47.61267083634607 -122.33507292554181 600 Pine St Ste 403 ['4bf58dd8d48988d108941735'] ['Dumpling Restaurant'] https://ss3.4sqi.net/img/categories_v2/food/dumplings_bg_32.png
5cda2dfea22db7002c226bac Matcha Café Maiko 47.612109 -122.337242 400 Pine St ['4bf58dd8d48988d1c9941735'] ['Ice Cream Shop'] https://ss3.4sqi.net/img/categories_v2/food/icecream_bg_32.png
4ac6b78af964a520f9b520e3 Il Fornaio Seattle 47.613027479982605 -122.33630175061217 600 Pine St ['4bf58dd8d48988d110941735'] ['Italian Restaurant'] https://ss3.4sqi.net/img/categories_v2/food/italian_bg_32.png
5a19f01db1ec13338
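
The search URL above is assembled by hand in an f-string. An alternative that keeps each parameter visible, and handles escaping, is to build the query string from a dictionary. This is a minimal sketch using only the standard library. `build_search_url` is a hypothetical helper, the credentials are placeholders, and the version string `20190801` is an assumed value, since the real `VERSION` is not shown in the notebook.

```python
from urllib.parse import urlencode


def build_search_url(lat, lng, category_id, client_id, client_secret, version,
                     radius=100, limit=10):
    """Assemble a venues/search URL from named parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': f'{lat},{lng}',
        'radius': radius,          # in meters
        'categoryId': category_id,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params)


# Placeholder credentials and an assumed version string
url = build_search_url(47.612357, -122.335654, '4d4b7105d754a06374d81259',
                       client_id='YOUR_ID', client_secret='YOUR_SECRET',
                       version='20190801')
print(url)
```

Note that `urlencode` percent-encodes the comma in `ll` as `%2C`, which is the standard escaped form of the same value.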

## Section 3) Methodology 🔬

### I will be utilizing the LightFM module to fit the utilization matrix into a hybrid recommender system.  This will lay the foundation for introducing new users and new categories into the lunchtime ecosystem, so that recommendations can be predicted from the profiles of the existing users.  I will use stochastic gradient descent to minimize the Weighted Approximate-Rank Pairwise (WARP) loss function.  Additionally, I'll tune my hyperparameters with RandomizedSearchCV to help select the optimal parameters for the LightFM model.
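
To make the idea of filling in blanks with stochastic gradient descent concrete, here is a deliberately simplified sketch. It is not LightFM and does not use the WARP loss: it is a plain matrix factorization trained on squared error with NumPy, swapped in because it fits in a few lines. The ratings, the `factorize` helper, and every hyperparameter value are assumptions chosen for illustration.

```python
import numpy as np


def factorize(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit user and item factors to the observed (non-NaN) ratings with SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    users = rng.normal(scale=0.1, size=(n_users, n_factors))
    items = rng.normal(scale=0.1, size=(n_items, n_factors))
    observed = np.argwhere(~np.isnan(ratings))
    for _ in range(epochs):
        rng.shuffle(observed)  # visit the observed cells in a new order each epoch
        for u, i in observed:
            err = ratings[u, i] - users[u] @ items[i]
            u_old = users[u].copy()
            # Gradient step on squared error with L2 regularization
            users[u] += lr * (err * items[i] - reg * users[u])
            items[i] += lr * (err * u_old - reg * items[i])
    return users, items


# Hypothetical 1-5 ratings; np.nan marks cuisines a user left blank
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 1.0, 4.0, 5.0],
])
users, items = factorize(ratings)
predicted = users @ items.T  # dense matrix, including the previously blank cells
print(np.round(predicted, 1))
```

The cells that were `np.nan` in `ratings` now hold predicted scores, and ranking those per user is the step a recommender builds on. A ranking loss such as WARP optimizes the ordering of items directly instead of the squared error used here.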

## Section 4) Results 📊

## Section 5) Observations 💡

## Section 6) Conclusions ✨