# Lunchanator: The lunch deciding utility

### Introduction

The goal of this project is to solve the problem of deciding where a group of people should go eat together. Coworkers and friends often want to eat together but must first agree on where. To speed up the process and lower the probability of an argument, we automate this decision with a recommendation based on individual preferences and Foursquare data.

The target audience is anyone who eats in large groups, whether for pleasure, work, or both. Selecting a dining establishment that accommodates several individuals' preferences at once, with minimal friction, should be of interest to most people.

### Data

In order to complete this project, we will need two major sources of data:

1. A database containing user profiles with their restaurant preferences. For now, users will have a name, a list of restaurants they enjoy, and a list of restaurants they wish to avoid. (A future version could eventually allow for a more complex profile.)

2. The Foursquare location data to help find a restaurant. (A future version could make use of a restaurant's rating to help select a location.)

    In a future version, a third dataset would be useful for New York users:

3. The NYC Open Data webpage contains a listing of restaurant grades given by the city. This could be used to select restaurants with a minimum grade score.

A very basic example of user data:

| Field | Value |
| ------------- |:-------------:|
| User | James |
| Likes | Taco House 39, Chen's Garden, Pho Viet |
| Dislikes | Pizza Hut, La Sorrentina |

### Methodology

Since no database currently contains our desired dataset on user restaurant preferences, we created one for 20 randomly generated users each having a profile with 15 local restaurants that they enjoy and 15 that they do not.

In order to make a recommendation, a group of users that are going to eat together needs to be selected from the overall user dataset. Once a group is selected, Foursquare is queried for a list of restaurants that are near the group's location. The result of the query is stored as a list.

We now begin constructing a dictionary where the keys are the restaurants from our Foursquare query and the values are the sum of the user preferences for that restaurant (a user contributes a 1 if they like it, -1 if they do not, and 0 if the restaurant does not appear anywhere in the user profile).

Once the dictionary is constructed, it can be converted into a dataframe. Restaurants are then ordered by the sum of the user preferences, and one restaurant is selected at random from the top 8 choices. The random draw introduces variety: always taking the top-ranked restaurant would return the same result for as long as the group and its preferences stay the same.
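
The tally-and-pick procedure described above can be sketched in plain Python. This is a minimal illustration, not the notebook's implementation: the function names `tally_votes` and `pick_restaurant` and the profile layout (dictionaries with `likes` and `dislikes` keys) are assumptions made for this example, and the toy data reuses names from the sample profile above.

```python
import random

def tally_votes(candidates, group):
    """Sum +1 for each like and -1 for each dislike across the group."""
    votes = dict.fromkeys(candidates, 0)
    for profile in group:
        for name in profile["likes"]:
            if name in votes:
                votes[name] += 1
        for name in profile["dislikes"]:
            if name in votes:
                votes[name] -= 1
    return votes

def pick_restaurant(votes, top_n=8, rng=random):
    """Choose uniformly at random among the top_n highest-scoring restaurants."""
    ranked = sorted(votes, key=votes.get, reverse=True)
    return rng.choice(ranked[:top_n])

# Hypothetical toy data for illustration only
candidates = ["Taco House 39", "Chen's Garden", "Pho Viet", "Pizza Hut"]
group = [
    {"likes": ["Taco House 39", "Pho Viet"], "dislikes": ["Pizza Hut"]},
    {"likes": ["Taco House 39"], "dislikes": ["Chen's Garden"]},
]
votes = tally_votes(candidates, group)
choice = pick_restaurant(votes, top_n=2, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a draw repeatable, which may help when testing; omitting it falls back to the module-level generator.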

We also have a method for determining the compatibility of two users. We collect the restaurants that have been rated by both users, take the product of the two users' preferences for each such restaurant, add up the products, and divide by the number of shared restaurants. This is a normalized dot product of the two preference vectors, so the result lies between -1 and 1. Values close to 1 indicate that both users have similar preferences (they rated most shared restaurants the same way). A user's closest matches can then be used to recommend restaurants to that user, which is an example of collaborative filtering.
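
As a small worked example of the compatibility score, the sketch below computes it for two toy profiles. The helper names `preference` and `compatibility` and the dictionary layout are assumptions for this illustration; the notebook itself works on dataframe columns.

```python
def preference(name, likes, dislikes):
    """Return +1 for a liked restaurant, -1 for a disliked one, 0 otherwise."""
    if name in likes:
        return 1
    if name in dislikes:
        return -1
    return 0

def compatibility(user_a, user_b):
    """Average product of preferences over restaurants rated by both users."""
    rated_a = set(user_a["likes"]) | set(user_a["dislikes"])
    rated_b = set(user_b["likes"]) | set(user_b["dislikes"])
    shared = rated_a & rated_b
    if not shared:
        return 0.0  # no overlap, so no evidence either way
    total = sum(
        preference(r, user_a["likes"], user_a["dislikes"])
        * preference(r, user_b["likes"], user_b["dislikes"])
        for r in shared
    )
    return total / len(shared)

# Hypothetical toy profiles for illustration only
a = {"likes": ["Taco House 39", "Pho Viet"], "dislikes": ["Pizza Hut"]}
b = {"likes": ["Taco House 39", "Pizza Hut"], "dislikes": ["Pho Viet", "Chen's Garden"]}
score = compatibility(a, b)
```

Here the two users share three rated restaurants. They agree on one (product 1) and disagree on two (product -1 each), so the score is (1 - 1 - 1) / 3, or about -0.33.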

### Results and Discussion

The output of any run in our utility is a single restaurant. This restaurant has been selected at random from a short list of the restaurants that are the most popular among the currently selected lunch group. While the output is simply a single item, the discussion in the Methodology section shows that this output has been produced by carefully considering the preferences of all users in a lunch group. 

In our examples below we also compute the compatibility between <code>user0</code> and all other users. This can be used to make a recommendation for a restaurant that <code>user0</code> may like/dislike based on other user preferences.

### Conclusion

Arrow's Theorem suggests that unless a lunch group is choosing between only two options, there is no perfect voting system for selecting an overall winner. Our goal here has been to introduce a system with a simple yet reasonable way to select a restaurant that will please as many people as possible in a lunch group. Clearly this will become more difficult as the group size increases, but we feel that this method is effective for small groups (that is, no more than 6 people).

In a future version, users should be able to create more complicated profiles, perhaps describing their preferences based on type of cuisine, how much they like/dislike a restaurant (a ranking from -5 to 5, say), past attendance (in case a user has already attended a restaurant in the past week), restaurant grade (in New York) or rating, etc. One should therefore consider this utility as a work in progress.

### Examples

Below we show some examples of our utility at work.

#### Recommendation for a lunch group

Importing needed packages

In [1]:
import requests
import json
import geocoder
import numpy as np
import pandas as pd

We used the Bing geocoder because the Google limit for the month was reached via other projects.

In [2]:
g = geocoder.bing('Brooklyn, NY', key='')

Foursquare request for restaurants in the Brooklyn, NY area.

In [3]:
url = 'https://api.foursquare.com/v2/venues/explore'

params = dict(
  client_id = '', # your Foursquare ID
  client_secret = '', # your Foursquare Secret
  v = '20180605', # Foursquare API version
  ll = str(g.json['lat']) + ',' + str(g.json['lng']),
  query = 'restaurant',
  radius = 1000,
  limit = 50,
)
resp = requests.get(url=url, params=params)
data = json.loads(resp.text)

Parsing results into a list containing restaurant names.

In [4]:
restaurants = []
for m in data['response']['groups'][0]['items']:
    restaurants.append(m['venue']['name'])
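
Foursquare can return several venues that share a name (for example, two branches of a chain), and a request can also fail or come back empty. The sketch below shows one possible defensive version of the parsing step that skips missing fields and keeps each name once. The function name `extract_unique_names` is an assumption for this example, and the sample dictionary is a hand-written stand-in for a response, not real API output.

```python
def extract_unique_names(data):
    """Return venue names from the first result group, without repeats."""
    names = []
    seen = set()
    groups = data.get("response", {}).get("groups", [])
    for group in groups[:1]:  # the notebook reads only the first group
        for item in group.get("items", []):
            name = item.get("venue", {}).get("name")
            if name and name not in seen:
                seen.add(name)
                names.append(name)
    return names

# Hand-written stand-in for a response, for illustration only
sample = {
    "response": {
        "groups": [
            {
                "items": [
                    {"venue": {"name": "Five Guys"}},
                    {"venue": {"name": "Hibino"}},
                    {"venue": {"name": "Five Guys"}},
                ]
            }
        ]
    }
}
unique_names = extract_unique_names(sample)
```

Deduplicating by name treats all branches of a chain as one choice, which may or may not be what a lunch group wants.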

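This block generates a pretend dataset of 20 users, each with 15 "good" and 15 "bad" restaurants. Because the 30 entries are drawn without replacement, no entry is assigned to both categories. However, the Foursquare results can include several venues with the same name, so a name can still land in both lists (Five Guys does for <code>user1</code> in the output below).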

In [5]:
d = {}
d['users'] = ['user'+str(i) for i in range(20)]
d['good restaurants'] = []
d['bad restaurants'] = []
for i in range(1,21):
    restaurant_list = np.random.choice(restaurants, size=30, replace=False)
    d['good restaurants'].append(restaurant_list[:15])
    d['bad restaurants'].append(restaurant_list[15:])

Transforming the above generated data into a dataframe.

In [6]:
user_df = pd.DataFrame.from_dict(d)

In [7]:
user_df = user_df[['users','good restaurants','bad restaurants']]

An example of user data.

In [8]:
user_df.head()

| | users | good restaurants | bad restaurants |
| --- | --- | --- | --- |
| 0 | user0 | [Wild Ginger, Five Guys, Korilla BBQ, Pio Bage... | [Iron Chef House, La Vara, Lassen & Hennigs, B... |
| 1 | user1 | [Yaso Tangbao, Five Guys, Dellarocco's, French... | [Iron Chef House, Five Guys, Chipotle Mexican ... |
| 2 | user2 | [Iron Chef House, DeKalb Market Hall, Bareburg... | [Korilla BBQ, Luzzo's BK, Hibino, two8two Bar ... |
| 3 | user3 | [Pio Bagel, Gregory's Coffee, Korilla BBQ, Mil... | [Damascus Bread & Pastry Shop, Shelsky's of Br... |
| 4 | user4 | [two8two Bar & Burger, Court Street Bagels, Mi... | [La Bagel Delight, The Gumbo Bros, Fast and Fr... |


Creating a group of 6 people at random that wish to have lunch together.

In [9]:
df_lunch_group = user_df.sample(n=6)

In [10]:
df_lunch_group = df_lunch_group.sort_index()

The lunch group.

In [11]:
df_lunch_group

| | users | good restaurants | bad restaurants |
| --- | --- | --- | --- |
| 4 | user4 | [two8two Bar & Burger, Court Street Bagels, Mi... | [La Bagel Delight, The Gumbo Bros, Fast and Fr... |
| 11 | user11 | [Panera Bread, Dellarocco's, Grand Army, La Ba... | [The Atlantic ChipShop, Chez Moi, Pio Bagel, F... |
| 12 | user12 | [Damascus Bread & Pastry Shop, Ki Sushi, Frenc... | [One Girl Cookies, Gregory's Coffee, The Atlan... |
| 13 | user13 | [Damascus Bread & Pastry Shop, Yaso Tangbao, F... | [Wild Ginger, Hibino, Rocco's Tacos and Tequil... |
| 15 | user15 | [Five Guys, Iron Chef House, Hibino, Doner Keb... | [Five Guys, Luzzo's BK, Court Street Bagels, F... |
| 19 | user19 | [Five Guys, Panera Bread, Mile End Delicatesse... | [Yaso Tangbao, Yemen Cafe, Shelsky's of Brookl... |


Creating the list of restaurants that will be voted on.

In [12]:
lunch_votes = [(restaurant,0) for restaurant in restaurants]

In [13]:
lunch_dict = dict(lunch_votes)

Tallying votes.

In [14]:
for rest in lunch_dict.keys():
    for good_choice in df_lunch_group['good restaurants']:
        if rest in good_choice:
            lunch_dict[rest] += 1
    for bad_choice in df_lunch_group['bad restaurants']:
        if rest in bad_choice:
            lunch_dict[rest] -= 1

Conversion to properly formatted dataframe.

In [15]:
d = {'restaurants': [], 'votes': []}
for k,v in lunch_dict.items():
    d['restaurants'].append(k)
    d['votes'].append(v)

In [16]:
df_lunch_votes = pd.DataFrame.from_dict(d)

Sorting the list of restaurants by vote count.

In [17]:
df_lunch_votes_sorted = df_lunch_votes.sort_values('votes', ascending = False)

The sorted list.

In [18]:
df_lunch_votes_sorted

| | restaurants | votes |
| --- | --- | --- |
| 41 | DeKalb Market Hall | 4 |
| 14 | Panera Bread | 3 |
| 10 | Colonie | 3 |
| 39 | Korilla BBQ | 3 |
| 5 | The Gumbo Bros | 3 |
| 23 | two8two Bar & Burger | 2 |
| 31 | Mile End Delicatessen | 2 |
| 28 | Maison Kayser | 2 |
| 2 | La Bagel Delight | 2 |
| 6 | Sophies Cuban Cuisine | 2 |


A restaurant is selected at random from the top 8.

In [19]:
df_lunch_votes_sorted['restaurants'][:8].sample()

39    Korilla BBQ
Name: restaurants, dtype: object

#### Determining user compatibilities for <code>user0</code>

We create a dictionary that maps every other user to a compatibility score with <code>user0</code>, a number from -1 to 1. The closer the value is to 1, the more compatible the two users' preferences. Values close to -1 mean the users differ significantly.

In [20]:
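comp = {}
# user0's lists never change, so look them up once outside the loop
user0_good = user_df['good restaurants'][user_df['users']=='user0'].tolist()[0]
user0_bad = user_df['bad restaurants'][user_df['users']=='user0'].tolist()[0]
for user in user_df[user_df['users'] != 'user0'].users:
    other_good = user_df['good restaurants'][user_df['users']==user].tolist()[0]
    other_bad = user_df['bad restaurants'][user_df['users']==user].tolist()[0]
    mySum = 0  # running sum of preference products
    count = 0  # number of restaurants rated by both users
    for restaurant in restaurants:
        # a and b hold the preferences of user0 and the other user: 1, -1, or 0
        a = 0
        b = 0
        if restaurant in user0_good:
            a = 1
        elif restaurant in user0_bad:
            a = -1
        if restaurant in other_good:
            b = 1
        elif restaurant in other_bad:
            b = -1
        mySum += a*b
        count += abs(a*b)
    # normalize by the number of co-rated restaurants; 0 if there is no overlap
    if count:
        comp[user] = round(mySum/count,2)
    else:
        comp[user] = 0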

We now extract the users with the highest compatibility score with <code>user0</code>.

In [21]:
compMax = max(comp.values())

In [22]:
for k,v in comp.items():
    if v == compMax:
        print(k)

user2
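
Printing only the maximum hides how close the runners-up are. A ranked list of the most compatible users may be more useful when several neighbors should contribute recommendations. The sketch below assumes a dictionary shaped like `comp`; the function name `rank_by_compatibility` and the scores are hypothetical.

```python
def rank_by_compatibility(comp, top_k=3):
    """Return the top_k (user, score) pairs, highest score first."""
    return sorted(comp.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Hypothetical scores for illustration only
comp_example = {"user1": -0.2, "user2": 0.5, "user3": 0.1, "user4": 0.5}
top = rank_by_compatibility(comp_example, top_k=2)
```

Because Python's sort is stable, tied users keep their original dictionary order.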


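Now that we know that <code>user2</code> has the most similar preferences to <code>user0</code>, we extract the restaurants that <code>user2</code> likes and that are not already in <code>user0</code>'s list of liked restaurants to make recommendations. Note that this set difference does not remove restaurants that <code>user0</code> dislikes: Iron Chef House appears in the output below even though it is in <code>user0</code>'s bad list.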

In [24]:
A = set(user_df['good restaurants'][user_df['users']=='user0'].tolist()[0])
B = set(user_df['good restaurants'][user_df['users']=='user2'].tolist()[0])
B-A

{'Bar Tabac',
 'Bareburger',
 'DeKalb Market Hall',
 'Iron Chef House',
 'Maison Kayser',
 'Panera Bread',
 "Rocco's Tacos and Tequila Bar Brooklyn",
 'The Gumbo Bros'}
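
The set difference above removes only the restaurants <code>user0</code> already likes. A stricter variant, sketched below, also removes everything <code>user0</code> has rated as bad, so that nothing already disliked is recommended. The function name `recommend` and the dictionary layout are assumptions for this example, and the profiles are abbreviated toy versions built from names that appear in the output above.

```python
def recommend(target, neighbor):
    """Return the neighbor's liked restaurants that the target has not rated."""
    already_rated = set(target["likes"]) | set(target["dislikes"])
    return sorted(set(neighbor["likes"]) - already_rated)

# Abbreviated toy profiles for illustration only
target = {
    "likes": ["Wild Ginger", "Five Guys"],
    "dislikes": ["Iron Chef House", "La Vara"],
}
neighbor = {
    "likes": ["Iron Chef House", "DeKalb Market Hall", "Bareburger", "Five Guys"],
    "dislikes": ["Korilla BBQ"],
}
suggestions = recommend(target, neighbor)
```

With these toy profiles, Iron Chef House is dropped because the target dislikes it and Five Guys is dropped because the target already likes it.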