# Battle of the Neighbourhoods

Calgary, Alberta, is a major city in Canada. In this project, we identify which neighbourhoods would be good hosts for a food truck festival, based on which neighbourhoods can be considered food districts.

## Business problem

The local food truck association is planning a food truck festival. To showcase the quality of the city's food trucks, the committee has decided that the best place to host it is a neighbourhood with a high number of restaurants, or one whose top venues are restaurants, so that it can be characterised as a food district. This lets the food trucks serve people who are going out for supper and entertainment, creates opportunities for additional sales and for collaborations between food trucks and restaurants, and should draw a large crowd to the food district during the festival.

## The Data

The data for the neighbourhoods in Calgary was scraped from the City of Calgary's open data website. It includes the name of each community, the quadrant of the city that the community is in, and its latitude and longitude, among other information.

See: https://data.calgary.ca/Base-Maps/Community-Points/j9ps-fyst

In [12]:
import pandas as pd
df = pd.read_csv('Community_Points.csv')
df.head()

Unnamed: 0,CLASS,CLASS_CODE,COMM_CODE,NAME,SECTOR,SRG,COMM_STRUCTURE,longitude,latitude,location
0,Residential,1,SPR,SPRUCE CLIFF,WEST,BUILT-OUT,1950s,-114.136468,51.048198,"(51.0481976609441, -114.136468233901)"
1,Industrial,2,MER,MERIDIAN,NORTHEAST,,EMPLOYMENT,-113.996513,51.056587,"(51.056586672185, -113.996512939974)"
2,Residential,1,LKV,LAKEVIEW,WEST,BUILT-OUT,1960s/1970s,-114.129634,50.999781,"(50.9997806571053, -114.129634105238)"
3,Major Park,3,GPK,GLENMORE PARK,SOUTH,,PARKS,-114.131536,50.990456,"(50.9904556913229, -114.131535599359)"
4,Residential,1,DRN,DEER RUN,SOUTH,BUILT-OUT,1980s/1990s,-114.009089,50.925408,"(50.9254084299019, -114.009088593916)"


In [13]:
import folium
latitude = 51.0486
longitude = -114.0708
map_calgary = folium.Map(location=[latitude, longitude], zoom_start=10)
for lat, lng, community in zip(df['latitude'], df['longitude'],df['NAME']):
    label='{}'.format(community)
    label=folium.Popup(label,parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_calgary)
map_calgary

## Methodology

The following methodology will be used to perform the study:
- Filter the data (for example, remove 'industrial communities')
- Use the Foursquare API to explore the neighbourhoods
- Get the most common venue categories in each community
- Group the communities by the most common venue
- Provide a list of communities that fall in the cluster where restaurants are the most common venues

K-means clustering will be used to perform the segmentation, and folium will be used to visualize the results.
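The steps listed above can be sketched end to end on a toy example. The block below is a minimal, pure-Python stand-in, not the notebook's actual implementation: the community names and venue lists are invented for illustration, `category_frequencies` and `kmeans` are hypothetical helpers, and the centroids are seeded with the first `k` points so the run is deterministic. In practice a library implementation such as scikit-learn's `KMeans` would likely be used instead.

```python
from collections import Counter


def category_frequencies(venues, categories):
    """Turn each community's venue categories into a frequency vector."""
    vectors = {}
    for community, cats in venues.items():
        counts = Counter(cats)
        total = sum(counts.values())
        vectors[community] = [counts[c] / total for c in categories]
    return vectors


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical venue categories per community, invented for illustration.
sample_venues = {
    "COMMUNITY A": ["Restaurant", "Restaurant", "Café"],
    "COMMUNITY B": ["Park", "Gym", "Park"],
    "COMMUNITY C": ["Restaurant", "Restaurant", "Restaurant"],
    "COMMUNITY D": ["Gym", "Park"],
}

categories = sorted({c for cats in sample_venues.values() for c in cats})
vectors = category_frequencies(sample_venues, categories)
names = list(vectors)
labels, centroids = kmeans([vectors[n] for n in names], k=2)
clusters = dict(zip(names, labels))

# The "food district" cluster is the one whose centroid puts the most
# weight on the restaurant category.
restaurant_idx = categories.index("Restaurant")
food_cluster = max(range(len(centroids)), key=lambda j: centroids[j][restaurant_idx])
food_districts = [n for n in names if clusters[n] == food_cluster]
print(food_districts)
```

Using frequency vectors rather than raw counts keeps communities with many venues from dominating the distance calculation, which is one reasonable design choice among several.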

## Filter Data

In [14]:
df_calgary=df[['CLASS','NAME','longitude', 'latitude']]

In [15]:
df_calgary.head()

Unnamed: 0,CLASS,NAME,longitude,latitude
0,Residential,SPRUCE CLIFF,-114.136468,51.048198
1,Industrial,MERIDIAN,-113.996513,51.056587
2,Residential,LAKEVIEW,-114.129634,50.999781
3,Major Park,GLENMORE PARK,-114.131536,50.990456
4,Residential,DEER RUN,-114.009089,50.925408


In [16]:
#Remove rows where 'Industrial'
df_calgary=df_calgary[df_calgary['CLASS'] != 'Industrial']
df_calgary.shape

(261, 4)

In [17]:
latitude = 51.0486
longitude = -114.0708
map_calgary = folium.Map(location=[latitude, longitude], zoom_start=10)
for lat, lng, community in zip(df_calgary['latitude'], df_calgary['longitude'],df_calgary['NAME']):
    label='{}'.format(community)
    label=folium.Popup(label,parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_calgary)
map_calgary

## Get Foursquare Data

In [18]:
import requests # library to handle HTTP requests
from pandas import json_normalize # flattens nested JSON into a pandas DataFrame

### Foursquare credentials

In [22]:
# Credentials (CLIENT_ID, CLIENT_SECRET, VERSION) removed for privacy
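Since the credentials cell is blanked out, one possible way to supply `CLIENT_ID`, `CLIENT_SECRET` and `VERSION` without committing them to the notebook is to read them from environment variables. This is only a sketch: the helper `load_foursquare_credentials` and the variable names `FOURSQUARE_CLIENT_ID`, `FOURSQUARE_CLIENT_SECRET` and `FOURSQUARE_VERSION` are assumptions made for illustration, not something the original notebook defines. The version value is the date string (YYYYMMDD) passed as the `v` parameter of the request.

```python
import os

# Hypothetical environment variable names, chosen for this example only.
REQUIRED_KEYS = (
    "FOURSQUARE_CLIENT_ID",
    "FOURSQUARE_CLIENT_SECRET",
    "FOURSQUARE_VERSION",
)


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret, version) read from a mapping.

    Raises KeyError naming every variable that is missing or empty.
    """
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[key] for key in REQUIRED_KEYS)


# Typical use in the notebook (left commented out so the cell does not fail
# when the variables are not set):
# CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials()
```

Passing the mapping as an argument keeps the helper easy to try out with a plain dictionary instead of the real environment.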

In [20]:
# Explore the Foursquare data for the first community
neighborhood_latitude = df_calgary.loc[0, 'latitude'] # neighborhood latitude value
neighborhood_longitude = df_calgary.loc[0, 'longitude'] # neighborhood longitude value
neighborhood_name = df_calgary.loc[0, 'NAME'] # neighborhood name

# Get the top 100 venues within a 500-metre radius of the first community
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # search radius in metres
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

# Send the GET request and examine the results
results = requests.get(url).json()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

#clean the json and structure it into a pandas dataframe
venues = results['response']['groups'][0]['items']    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
print('{} venues were returned by Foursquare for the first community.'.format(nearby_venues.shape[0]))
nearby_venues.head()

4 venues were returned by Foursquare for the first community.


Unnamed: 0,name,categories,lat,lng
0,Sushi Ichiban,Japanese Restaurant,51.044677,-114.140869
1,The Pie Junkie,Café,51.048179,-114.136459
2,Wolf Willow Studio,Art Gallery,51.048144,-114.136596
3,Spa Lady,Gym,51.044392,-114.140195


We can repeat this for all of the communities in Calgary. As previously mentioned, we can then group the communities together; those that have restaurants as their top venues will be considered food districts, and a list of them will be provided to the association.
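Repeating the single-community request for every community could look roughly like the sketch below. It is an assumption-laden outline rather than the notebook's own code: `build_explore_url` and `collect_venues` are hypothetical helpers, and the HTTP call is passed in as `fetch_json` so the loop can be exercised without network access or real credentials. The response layout (`response` → `groups` → `items` → `venue`) mirrors what the first-community cell above reads.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret, version,
                      radius=500, limit=100):
    """Build the explore URL for one community's coordinates."""
    query = urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    })
    return EXPLORE_URL + "?" + query


def collect_venues(communities, fetch_json, **credentials):
    """Return (community, venue name, first category) rows for all communities.

    `communities` yields (name, latitude, longitude) tuples and
    `fetch_json` maps a URL to the decoded JSON response.
    """
    rows = []
    for name, lat, lng in communities:
        results = fetch_json(build_explore_url(lat, lng, **credentials))
        groups = results.get("response", {}).get("groups", [])
        items = groups[0]["items"] if groups else []
        for item in items:
            venue = item["venue"]
            venue_categories = venue.get("categories", [])
            category = venue_categories[0]["name"] if venue_categories else None
            rows.append((name, venue["name"], category))
    return rows
```

In the notebook, `communities` could be `zip(df_calgary['NAME'], df_calgary['latitude'], df_calgary['longitude'])` and `fetch_json` could be `lambda url: requests.get(url).json()`, with the credentials passed as `client_id=CLIENT_ID, client_secret=CLIENT_SECRET, version=VERSION`.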