# Restaurants around Georgia Tech, Atlanta

*Venkatavaradan Sunderarajan*

## Introduction/Business Problem - *week 4 submission*

Georgia Institute of Technology is a public institution founded in 1885, with a total student enrollment of over 32,000. The campus spans about 400 acres in the heart of Atlanta. This provides a large base of loyal customers in search of diverse and trending food options in and around the area.

Despite the many options available, students often tire of the existing choices and are constantly on the lookout for new, affordable, and healthy places to eat around campus. The evolving tastes and diversity of the student population create a great opportunity for restaurants to open in this area.

This project will focus on:
1. Mapping existing dining locations in and around GT campus
2. Classifying them based on cuisine, distance, popularity
3. Identifying opportunities for new restaurants and/or cafes.

The target audience is active entrepreneurs in the catering and food industry looking for opportunities to serve a young and diverse student population.

## Data - *week 4 submission*

The data for this study will be obtained primarily from the Foursquare API. The locations of restaurants and cafes, as well as dining options listed among the popular spots, will be studied. Taking into account user ratings, distance to campus, and classifications based on cuisine, pricing, and opening hours should help identify gaps in the existing offering.
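Distance to campus is one of the attributes mentioned above. A minimal sketch of how it could be computed from latitude/longitude pairs uses the haversine formula. The function name `haversine_km` and the sample coordinates are illustrative assumptions, not part of the notebook.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Approximate, assumed coordinates for illustration only
campus = (33.7756, -84.3963)
venue = (33.7816, -84.3880)
print(round(haversine_km(*campus, *venue), 2))
```

Applied to each venue row, such a function would give a distance column that can be compared against the 5 km search radius used later.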


--- **End of week 4 submission** ---

---

## Code

In [None]:
import pandas as pd
from geopy.geocoders import Nominatim
import folium
import requests
from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated

### Location and API access setup

In [None]:
address = 'Georgia Tech'

geolocator = Nominatim(user_agent="gt_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Georgia Tech are {}, {}.'.format(latitude, longitude))

In [None]:
import os

# Read the credentials from environment variables (variable names are assumed) rather than hard-coding them
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

### Restaurants

In [None]:
LIMIT = 50
radius = 5000 #5km - 3 miles approx.
search_query = 'restaurant'

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)

results = requests.get(url).json()
results
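The same search URL is assembled again for cafes further down. One possible way to avoid repeating the long format string is a small helper that builds the query string from a dictionary. This is only a sketch: `build_search_url` and `SEARCH_ENDPOINT` are hypothetical names, and the parameter set simply mirrors the URL used above.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, lat, lng, query,
                     radius=5000, limit=50, version='20180605'):
    """Assemble a venue search URL with properly encoded query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),  # latitude,longitude pair
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    return SEARCH_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials and coordinates, for illustration only
example_url = build_search_url('my-id', 'my-secret', 33.0, -84.0, 'cafe')
```

The resulting string could then be passed to `requests.get` in the same way as `url` above.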

In [None]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()

In [None]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
df1 = dataframe.loc[:, filtered_columns].copy()  # copy so later column assignment does not hit a view

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
df1['categories'] = df1.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df1.columns = [column.split('.')[-1] for column in df1.columns]

df1

In [None]:
df1.head()

In [None]:
df1.shape

In [None]:
# Map with Restaurants around GT
map_gt = folium.Map(location=[latitude, longitude], zoom_start=14)

# add GT as a red circle mark
folium.CircleMarker(
    [latitude, longitude],
    radius=10,
    popup='Georgia Tech',
    fill=True,
    color='red',
    fill_color='red',
    fill_opacity=0.6
    ).add_to(map_gt)

# add circle of interest
folium.Circle([latitude, longitude], 
              radius=5000, popup=None, 
              tooltip=None, 
              color = 'black', 
              fill=False  # boolean; the string 'False' is truthy and would fill the circle
             ).add_to(map_gt)

# add markers to map
for lat, lng, name in zip(df1['lat'], df1['lng'], df1['name']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_gt)  
    
# map_gt

### Cafe

In [None]:
search_query2 = 'cafe'

url2 = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query2, radius, LIMIT)

results2 = requests.get(url2).json()
results2

In [None]:
# assign relevant part of JSON to venues
venues2 = results2['response']['venues']

# transform venues into a dataframe
dataframe2 = json_normalize(venues2)
dataframe2.head()

In [None]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns2 = ['name', 'categories'] + [col for col in dataframe2.columns if col.startswith('location.')] + ['id']
df2 = dataframe2.loc[:, filtered_columns2].copy()  # copy so later column assignment does not hit a view

# filter the category for each row
df2['categories'] = df2.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df2.columns = [column.split('.')[-1] for column in df2.columns]

df2

In [None]:
df2.head()

In [None]:
df2.shape

In [None]:
for lat, lng, name in zip(df2['lat'], df2['lng'], df2['name']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='orange',
        fill_opacity=0.7,
        parse_html=False).add_to(map_gt) 

In [None]:
map_gt

In [None]:
df = pd.concat([df1, df2])  # DataFrame.append was removed in pandas 2.0

### Places of interest

In [None]:
#popular spots - explore
LIMIT = 200
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)
explore = requests.get(url).json()
expitems = explore['response']['groups'][0]['items']

In [None]:
dfexp = json_normalize(expitems) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories'] + [col for col in dfexp.columns if col.startswith('venue.location.')] + ['venue.id']
dataframe_filtered = dfexp.loc[:, filtered_columns].copy()  # copy so later column assignment does not hit a view

# filter the category for each row
dataframe_filtered['venue.categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean columns
dataframe_filtered.columns = [col.split('.')[-1] for col in dataframe_filtered.columns]

dataframe_filtered.head(10)

In [None]:
dfcats = dataframe_filtered.groupby(['categories']).count()
catslist = dfcats.index

In [None]:
catslist

In [None]:
filterlist = ['Asian Restaurant', 'Bakery', 'Bar', 'Breakfast Spot', 'Brewery', 'Burger Joint', 'Burrito Place', 'Café', 'Coffee Shop', 'Cuban Restaurant', 'Dive Bar', 'Donut Shop', 'Fast Food Restaurant', 'Gastropub', 'Gourmet Shop', 'Ice Cream Shop', 'Indian Restaurant', 'Italian Restaurant','Japanese Restaurant', 'Juice Bar', 'Korean Restaurant','Mediterranean Restaurant', 'Mexican Restaurant','New American Restaurant','Pizza Place', 'Salad Place','Sandwich Place', 'Seafood Restaurant', 'Soup Place', 'Southern / Soul Food Restaurant', 'Spanish Restaurant','Steakhouse', 'Taco Place', 'Tapas Restaurant', 'Vietnamese Restaurant']

In [None]:
df3 = dataframe_filtered[dataframe_filtered.categories.isin(filterlist)]

In [None]:
df3.head()

In [None]:
df3.shape

In [None]:
# add popular spots to the map as turquoise circle markers

for lat, lng, name in zip(df3['lat'], df3['lng'], df3['name']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='Green',
        fill=True,
        fill_color='turquoise',
        fill_opacity=0.7,
        parse_html=False).add_to(map_gt) 
    
    
# display map
# map_gt

In [None]:
df.shape

In [None]:
df = pd.concat([df, df3])  # DataFrame.append was removed in pandas 2.0

In [None]:
df.shape

### Data Cleansing and Sorting

In [None]:
df = df.drop_duplicates(subset='id', keep = 'first')

In [None]:
# drop location columns that are not needed; errors='ignore' skips any the API did not return
df = df.drop(['labeledLatLngs', 'cc','city','state','country','formattedAddress','crossStreet', 'neighborhood'], axis=1, errors='ignore')

# remove non-food venues by name; a boolean mask is used because index labels
# repeat after the dataframes are combined, so dropping by index would remove unrelated rows
non_food = df['name'].astype(str).str.contains('Equipment|Depot|Chiropractic')
df = df[~non_food]

In [None]:
df.sort_values(by='categories', inplace = True)

df.reset_index(inplace=True, drop = True)

In [None]:
df.head()

In [None]:
df.shape

In [None]:
df.to_csv('df.csv')
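With the cleaned table saved, a natural next step toward spotting gaps is counting venues per category. The sketch below uses plain Python on a hypothetical list of labels standing in for `df['categories']`; in the notebook itself, `df['categories'].value_counts()` would give the equivalent result.

```python
from collections import Counter

# Hypothetical category labels standing in for df['categories']
sample_categories = [
    'Pizza Place', 'Coffee Shop', 'Pizza Place',
    'Indian Restaurant', 'Coffee Shop', 'Pizza Place', None,
]

def category_counts(categories):
    """Return (category, count) pairs, most common first, skipping missing labels."""
    return Counter(c for c in categories if c is not None).most_common()

counts = category_counts(sample_categories)
for category, n in counts:
    print(category, n)
```

Categories that appear rarely or not at all in such a count are candidates for further study, subject to the ratings and pricing data discussed in the Data section.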