## Restaurant Recommender For Groups
**Business Problem**: It is 7 pm on a Friday night. You and two of your colleagues would like to eat out somewhere. You exchange restaurant ideas for 30 minutes but cannot agree on a place. All of you are now frustrated and hangry, so you decide to just grab Chipotle and call it a night.

**Proposed Solution**: Provide a Slack chatbot that offers restaurant recommendations based on the group’s common cuisine preference. This IPython notebook documents the methodology for determining a common cuisine preference among a group of Foursquare users on Slack. The Slack chatbot will also let users schedule events with each other.

**Additional Detail**: Please see the accompanying slides at https://docs.google.com/presentation/d/1zsz_M1aGIyk_L-ti0r-xGCCZW7WBQUczMG-rQIC8Q7I/pub?start=false&loop=false&delayms=3000

In [689]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
 
# function for transforming JSON into a pandas dataframe
from pandas.io.json import json_normalize

# Download neighborhood names and coordinates; Original data source: https://geo.nyu.edu/catalog/nyu_2451_34572
!wget -O nyc_neighborhoods.csv --quiet https://ibm.box.com/shared/static/vgmf4nauors62vyzv7mg2ul4fpzvomen.csv

# read csv
nyc = pd.read_csv("nyc_neighborhoods.csv")

In [690]:
# assign client id, client secret, and version
CLIENT_ID = "LJ2K2AQQ3SBCETKDTRLCETGRXJO2MZXFXBCTGURVNPBRUKQY"
CLIENT_SECRET = "B5I2ASTKHFSTAH5P1UPWCM5K1CQEWCVYTX3PYSIOXTEVRVFT"
VERSION = "20170514"

In [691]:
# create url
url="https://api.foursquare.com/v2/venues/categories?client_id={}&client_secret={}&v={}".format(CLIENT_ID, CLIENT_SECRET, VERSION)
url

'https://api.foursquare.com/v2/venues/categories?client_id=LJ2K2AQQ3SBCETKDTRLCETGRXJO2MZXFXBCTGURVNPBRUKQY&client_secret=B5I2ASTKHFSTAH5P1UPWCM5K1CQEWCVYTX3PYSIOXTEVRVFT&v=20170514'

In [692]:
# load json results
results = requests.get(url).json()

In [693]:
# assign relevant part of JSON to foodcat
foodcat = results["response"]["categories"][3]["categories"]

# transform foodcat into a dataframe, keeping only shortName
foodcat = json_normalize(foodcat)
foodcat = foodcat['shortName']
foodcat.head()

0        Afghan
1       African
2      American
3         Asian
4    Australian
Name: shortName, dtype: object
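
The cell above picks the food categories by position (`[3]`), which only works while the food group stays fourth in the response. As a minimal sketch in Python 3, assuming the top-level category is named "Food" and that the payload has the shape shown in the made-up `sample` below, the same list could be selected by name instead. `food_short_names` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical, trimmed-down stand-in for the categories payload (not real API output).
sample = {
    "response": {
        "categories": [
            {"name": "Arts & Entertainment", "categories": []},
            {"name": "Food", "categories": [
                {"name": "Afghan Restaurant", "shortName": "Afghan"},
                {"name": "African Restaurant", "shortName": "African"},
            ]},
        ]
    }
}


def food_short_names(payload, top_level_name="Food"):
    """Return the shortName of every subcategory under the named top-level category."""
    for category in payload["response"]["categories"]:
        if category.get("name") == top_level_name:
            return [sub["shortName"] for sub in category.get("categories", [])]
    # Fail loudly instead of silently using the wrong category group.
    raise KeyError("no top-level category named %r" % top_level_name)


print(food_short_names(sample))
```

Looking the group up by name fails with a clear error if the response layout changes, rather than quietly returning a different category list.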

In [694]:
# create dummy users with a randomized count of venue likes (0 to 24) for each food category
df = pd.DataFrame(np.random.randint(0,25,size=(5, len(foodcat))), index=['Alex', 'Bella', 'Carlos', 'Denise', 'Elise'], columns=foodcat)

# insert total likes by user
total_likes = df.sum(axis = 1)
df.insert(0, 'Likes', total_likes)

# insert each user's z-score: how many standard deviations their total likes sit from the group mean
df.insert(1, 'Multiplier', (df.Likes - df.Likes.mean())/df.Likes.std())
df

shortName,Likes,Multiplier,Afghan,African,American,Asian,Australian,Austrian,BBQ,Bagels,...,Sri Lankan,Steakhouse,Swiss,Tea Room,Theme Restaurant,Truck Stop,Turkish,Ukrainian,Vegetarian / Vegan,Wings
Alex,1114,0.47006,5,11,21,23,5,5,23,0,...,13,4,24,20,12,24,22,6,2,3
Bella,1117,0.517702,16,16,7,13,24,6,9,18,...,13,22,6,14,4,18,17,2,20,15
Carlos,1135,0.803549,2,15,13,1,20,7,19,15,...,12,8,17,16,17,2,22,24,16,5
Denise,978,-1.689677,17,17,10,23,14,8,4,10,...,12,14,17,8,14,21,21,6,9,22
Elise,1078,-0.101635,1,6,5,14,3,4,21,0,...,7,1,12,4,22,19,4,17,14,2
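
As a check on the `Multiplier` column, the z-scores can be recomputed from the `Likes` totals in the table above. This is a small Python 3 sketch using only the standard library. It assumes the sample standard deviation (n - 1 in the denominator), which is the default for pandas `Series.std()`.

```python
import statistics

# Total likes per user, copied from the table above.
likes = {"Alex": 1114, "Bella": 1117, "Carlos": 1135, "Denise": 978, "Elise": 1078}

mean = statistics.mean(likes.values())
std = statistics.stdev(likes.values())  # sample standard deviation (n - 1)

# z-score: distance from the group mean, measured in standard deviations
multiplier = {name: (total - mean) / std for name, total in likes.items()}

for name, z in multiplier.items():
    print(name, round(z, 6))
```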


In [695]:
# convert counts to each person's share of their total likes (this also drops the Likes and Multiplier columns)
df = df.loc[:,"Afghan":"Wings"].div(df["Likes"], axis=0)
df

shortName,Afghan,African,American,Asian,Australian,Austrian,BBQ,Bagels,Bakery,Belgian,...,Sri Lankan,Steakhouse,Swiss,Tea Room,Theme Restaurant,Truck Stop,Turkish,Ukrainian,Vegetarian / Vegan,Wings
Alex,0.004488,0.009874,0.018851,0.020646,0.004488,0.004488,0.020646,0.0,0.017056,0.017953,...,0.01167,0.003591,0.021544,0.017953,0.010772,0.021544,0.019749,0.005386,0.001795,0.002693
Bella,0.014324,0.014324,0.006267,0.011638,0.021486,0.005372,0.008057,0.016115,0.006267,0.003581,...,0.011638,0.019696,0.005372,0.012534,0.003581,0.016115,0.015219,0.001791,0.017905,0.013429
Carlos,0.001762,0.013216,0.011454,0.000881,0.017621,0.006167,0.01674,0.013216,0.008811,0.014097,...,0.010573,0.007048,0.014978,0.014097,0.014978,0.001762,0.019383,0.021145,0.014097,0.004405
Denise,0.017382,0.017382,0.010225,0.023517,0.014315,0.00818,0.00409,0.010225,0.022495,0.010225,...,0.01227,0.014315,0.017382,0.00818,0.014315,0.021472,0.021472,0.006135,0.009202,0.022495
Elise,0.000928,0.005566,0.004638,0.012987,0.002783,0.003711,0.019481,0.0,0.013915,0.017625,...,0.006494,0.000928,0.011132,0.003711,0.020408,0.017625,0.003711,0.01577,0.012987,0.001855


In [696]:
# get sum of scores for each food category
total_score = df.sum(axis=0).sort_values(ascending = False)

# determine group cuisine preference
topcat = total_score.head(1).index[0].encode('ascii','ignore')
topcat

'Kebab'
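
The effect of dividing by each person's total before summing is easier to see on a small example. The sketch below (Python 3, made-up users and counts, hypothetical helper `top_category`) compares the normalized score with a plain sum of raw counts. One very active user dominates the raw sum but not the normalized one.

```python
# Made-up like counts; Bella has ten times as many likes as the others.
likes = {
    "Alex":   {"Kebab": 6,  "Sushi": 2,  "Pizza": 2},
    "Bella":  {"Kebab": 10, "Sushi": 10, "Pizza": 80},
    "Carlos": {"Kebab": 5,  "Sushi": 4,  "Pizza": 1},
}


def top_category(likes_by_user, normalize=True):
    """Sum per-cuisine scores across users and return the highest-scoring cuisine."""
    totals = {}
    for counts in likes_by_user.values():
        user_total = sum(counts.values()) if normalize else 1
        if user_total == 0:
            continue  # a user with no likes contributes nothing
        for cuisine, n in counts.items():
            totals[cuisine] = totals.get(cuisine, 0.0) + n / user_total
    return max(totals, key=totals.get)


print(top_category(likes))                   # each user's likes converted to shares first
print(top_category(likes, normalize=False))  # raw counts summed directly
```

With shares, every user carries the same weight in the group decision regardless of how many venues they have liked.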

In [697]:
# user will provide a neighborhood name; for demonstration purposes, we choose Tribeca
hood = "Tribeca" # this response will vary based on user input
hood_lat = nyc.loc[nyc['neighborhood'] == hood].latitude.item()
hood_lng = nyc.loc[nyc['neighborhood'] == hood].longitude.item()
radius = 1000
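
Because the neighborhood name will come from user input, `.item()` raises an error whenever the name matches zero rows or more than one row. A minimal sketch of a more forgiving lookup, in Python 3 with the standard `csv` module, is shown below. The column names and the Tribeca coordinates come from this notebook; the second row and the `lookup` helper are placeholders for illustration.

```python
import csv
import io

# Stand-in for nyc_neighborhoods.csv; "Exampleville" is a made-up placeholder row.
sample_csv = io.StringIO(
    "neighborhood,latitude,longitude\n"
    "Tribeca,40.72152197,-74.01068329\n"
    "Exampleville,0.0,0.0\n"
)


def lookup(rows, hood):
    """Return (latitude, longitude) for a neighborhood, ignoring case and outer spaces."""
    wanted = hood.strip().lower()
    matches = [r for r in rows if r["neighborhood"].strip().lower() == wanted]
    if len(matches) != 1:
        raise ValueError("expected exactly one match for %r, found %d" % (hood, len(matches)))
    return float(matches[0]["latitude"]), float(matches[0]["longitude"])


rows = list(csv.DictReader(sample_csv))
print(lookup(rows, " tribeca "))
```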

In [698]:
# Get recommended restaurants based on group cuisine preference
url="https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&radius={}&ll={},{}&query={}".format(CLIENT_ID, CLIENT_SECRET, VERSION, radius, hood_lat, hood_lng, topcat)
url

'https://api.foursquare.com/v2/venues/explore?client_id=LJ2K2AQQ3SBCETKDTRLCETGRXJO2MZXFXBCTGURVNPBRUKQY&client_secret=B5I2ASTKHFSTAH5P1UPWCM5K1CQEWCVYTX3PYSIOXTEVRVFT&v=20170514&radius=1000&ll=40.72152197,-74.01068329&query=Kebab'
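
Some category names, such as "Vegetarian / Vegan", contain spaces and slashes, so it is safer to percent-encode the query string than to paste values into the URL with `format`. The sketch below uses `urllib.parse.urlencode` from the Python 3 standard library. The `explore_url` helper and the placeholder credentials are assumptions for illustration.

```python
from urllib.parse import urlencode


def explore_url(client_id, client_secret, version, lat, lng, radius, query):
    """Build the venues/explore URL with every parameter percent-encoded."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "radius": radius,
        "ll": "{},{}".format(lat, lng),
        "query": query,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)


# Placeholder credentials; substitute real values at run time.
example_url = explore_url(
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "20170514",
    40.72152197, -74.01068329, 1000, "Vegetarian / Vegan",
)
print(example_url)
```

Passing the same dictionary as `params=` to `requests.get` would have a similar effect, since `requests` encodes the query string itself.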

In [699]:
# get JSON
results = requests.get(url).json()

In [700]:
# assign list of recommended items from json, looked up by key because the order of dict.values() is not guaranteed
toprest_json = results["response"]["groups"][0]["items"]

In [701]:
# structure restaurant data (i.e. restaurant name, checkinCount, whether the restaurant is open right now, address, and a tip)
restaurant = []
checkinCount = []
isOpen = []
address = []
tip = []
for item in toprest_json:
    # each item holds the restaurant under 'venue' and, when available, a list under 'tips';
    # reading them by key avoids depending on the order of dict.values()
    venue = item['venue']
    restaurant.append(venue['name'].encode('ascii','ignore'))
    checkinCount.append(venue['stats']['checkinsCount'])
    address.append(venue['location']['address'].encode('ascii','ignore'))
    try:
        isOpen.append(venue['hours']['isOpen'])
    except KeyError:
        # opening hours are not published for every venue
        isOpen.append("NA")
    try:
        tip.append(item['tips'][0]['text'].encode('ascii','ignore'))
    except (KeyError, IndexError):
        # no tip was returned for this venue
        tip.append("NA")
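
The extraction step can be exercised without calling the API by running it on a hand-built item. The Python 3 sketch below assumes each item carries a `venue` dictionary and an optional `tips` list. The first sample item reuses values from this notebook's results, the second is made up to show the fallbacks, and `parse_item` is a hypothetical helper.

```python
def parse_item(item):
    """Flatten one recommendation item into the fields used for rec_df."""
    venue = item.get("venue", {})
    tips = item.get("tips") or []
    return {
        "Restaurant": venue.get("name", "NA"),
        "CheckInCount": venue.get("stats", {}).get("checkinsCount", 0),
        "OpenNow": venue.get("hours", {}).get("isOpen", "NA"),
        "Address": venue.get("location", {}).get("address", "NA"),
        "Tip": tips[0].get("text", "NA") if tips else "NA",
    }


sample_items = [
    {
        "venue": {
            "name": "12 Chairs",
            "stats": {"checkinsCount": 6354},
            "hours": {"isOpen": True},
            "location": {"address": "56 Macdougal St"},
        },
        "tips": [{"text": "Lamb mini kebabs"}],
    },
    # Made-up item with no hours, no address, and no tips.
    {"venue": {"name": "Hypothetical Grill", "stats": {"checkinsCount": 12}}},
]

parsed = [parse_item(item) for item in sample_items]
for row in parsed:
    print(row)
```

Using `.get` with defaults keeps one incomplete venue from stopping the whole loop, and the resulting list of dictionaries can be passed straight to `pd.DataFrame`.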

In [702]:
# concatenate lists into dataframe rec_df
rec_df = pd.DataFrame(
    {'Restaurant': restaurant,
     'CheckInCount': checkinCount,
     'OpenNow': isOpen,
     'Address': address,
     'Tip': tip
    })

# reorder rec_df columns
rec_df = rec_df[['Restaurant', 'CheckInCount', 'OpenNow', 'Address', 'Tip']]

# filter rec_df by OpenNow = True
rec_df = rec_df[(rec_df.OpenNow == True)]

In [703]:
# return list of restaurants ordered by CheckInCount
rec_df = rec_df.sort_values(by = "CheckInCount", ascending = False)
rec_df.head()

,Restaurant,CheckInCount,OpenNow,Address,Tip
1,12 Chairs,6354,True,56 Macdougal St,Lamb mini kebabs
2,Antique Garage,5967,True,41 Mercer St,Live jazz and piano soloist most nights. Upmar...
0,Souk & Sandwich,1006,True,117 Avenue of the Americas,"O sanduche de kebab delicioso! Muito gostoso,..."
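
The final filter and ranking can also be written without pandas, which may be convenient inside a chatbot handler. The Python 3 sketch below reuses the three restaurants from the table above and adds two made-up records to show that closed venues and venues with unknown hours are dropped. `recommend` is a hypothetical helper.

```python
records = [
    {"Restaurant": "Souk & Sandwich", "CheckInCount": 1006, "OpenNow": True},
    {"Restaurant": "12 Chairs", "CheckInCount": 6354, "OpenNow": True},
    {"Restaurant": "Antique Garage", "CheckInCount": 5967, "OpenNow": True},
    # Made-up records: one closed venue and one with unknown hours.
    {"Restaurant": "Hypothetical Closed Grill", "CheckInCount": 9999, "OpenNow": False},
    {"Restaurant": "Hypothetical Unknown Hours", "CheckInCount": 8888, "OpenNow": "NA"},
]


def recommend(rows, limit=5):
    """Keep venues known to be open, then rank them by check-in count, highest first."""
    open_now = [r for r in rows if r["OpenNow"] is True]
    return sorted(open_now, key=lambda r: r["CheckInCount"], reverse=True)[:limit]


for row in recommend(records):
    print(row["Restaurant"], row["CheckInCount"])
```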
