# Applied Data Science Capstone
This notebook will be used for my capstone project on Coursera.

In [1]:
import pandas as pd 
import numpy as np 
import folium
import requests
from pandas import json_normalize  # the pandas.io.json import path is deprecated since pandas 1.0

print("Hello Capstone Project Course!")

Hello Capstone Project Course!


## Introduction
Let's imagine that in the past year, my family and I opened a restaurant in Columbus, OH, specifically in the Northwest Columbus community. We've achieved a fair amount of success and want to expand. Since we know the Northwest community enjoys our food, we should aim for a neighborhood that is similar to the Northwest. In this notebook, we will analyze and cluster the neighborhoods of Columbus to identify candidate neighborhoods for our next place of business.

## Data
The list of Columbus neighborhoods was obtained from [this website](http://opendata.columbus.gov/datasets/c4b483507f374e62bd705450e116e017_25/data). The data also includes the area of each neighborhood in square feet, which I used to approximate the radius of each neighborhood by assuming each one is circular. This is not wholly accurate, but it is a good enough approximation. To find the coordinates at the center of each neighborhood, I used [this map](https://www.arcgis.com/home/webmap/viewer.html?layers=c4b483507f374e62bd705450e116e017): I estimated the center of each neighborhood by eye, then copied and pasted its coordinates into a spreadsheet, which was exported as the [Columbus_Communities.csv](https://github.com/alexanderWhile/Coursera_Capstone/blob/master/Columbus_Communities.csv) file found in this repository.
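
The circular approximation described above amounts to solving A = πr² for r. The following is a minimal sketch of that step, not the exact spreadsheet calculation used for the CSV. It assumes the result is wanted in meters (the unit the Foursquare `radius` parameter expects), and the function name and example area are hypothetical.

```python
import math

FEET_TO_METERS = 0.3048  # exact, by definition of the international foot


def radius_from_area_sqft(area_sqft: float) -> float:
    """Radius in meters of a circle whose area is `area_sqft` square feet."""
    radius_ft = math.sqrt(area_sqft / math.pi)  # A = pi * r^2  =>  r = sqrt(A / pi)
    return radius_ft * FEET_TO_METERS


# Hypothetical area for illustration only; it is not taken from the dataset.
example_area_sqft = 50_000_000
print(round(radius_from_area_sqft(example_area_sqft)), "m")
```

Rounding the result to the nearest hundred meters would give values of the same form as the `Radius` column, but that rounding is a guess about how the column was produced.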

Let's import our data into a dataframe and preview the information found in it. 

In [2]:
COLUMBUS_COMMUNITIES = pd.read_csv("Columbus_Communities.csv")
COLUMBUS_COMMUNITIES.head()

| | Community | Latitude | Longitude | Radius |
|---|---|---|---|---|
| 0 | Airport | 39.996795 | -82.889889 | 1800 |
| 1 | Brewery District | 39.947067 | -83.003872 | 700 |
| 2 | Clintonville | 40.047406 | -83.013828 | 2200 |
| 3 | Downtown | 39.963515 | -82.999752 | 1400 |
| 4 | Dublin Road Corridor | 39.97233 | -83.036144 | 700 |


We will start by making a map of the centers of all the communities in Columbus. 

In [3]:
COLUMBUS_LATITUDE = 39.9612
COLUMBUS_LONGITUDE = -82.9988

COLUMBUS_MAP = folium.Map(
    location = [COLUMBUS_LATITUDE, COLUMBUS_LONGITUDE],
    zoom_start = 10,
)

# Three columns are zipped, so three names are unpacked; the loop header needs a trailing colon.
for lat, lng, label in zip(COLUMBUS_COMMUNITIES.Latitude, COLUMBUS_COMMUNITIES.Longitude, COLUMBUS_COMMUNITIES.Community):
    folium.vector_layers.CircleMarker(
        [lat, lng],
        radius=5,  # marker size in pixels, not the community radius
        color='blue',
        popup=label,
        fill=True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(COLUMBUS_MAP)

COLUMBUS_MAP

*Note: GitHub will not render any folium maps. To see them, follow the link [here](https://nbviewer.jupyter.org/github/alexanderWhile/Coursera_Capstone/blob/master/notebook.ipynb)*

Now let's import our Foursquare credentials to begin utilizing the API and looking up venues. 

In [4]:
CLIENT_ID = 'IA4SDU5HX0UHCL4VSZJDAHBXWJHJY4HPTFNBLWHG4YHYSLWH'
CLIENT_SECRET = '21PID34DCUTLYIWC2RRRRWMBKIE1ZUUXQKE2ZEAASQ4VIWX5'
VERSION = '20200416'

LIMIT = 100

print("Client ID:",CLIENT_ID)
print("Client Secret:", CLIENT_SECRET)
print("Version:", VERSION)
print("Limit:", LIMIT)

Client ID: IA4SDU5HX0UHCL4VSZJDAHBXWJHJY4HPTFNBLWHG4YHYSLWH
Client Secret: 21PID34DCUTLYIWC2RRRRWMBKIE1ZUUXQKE2ZEAASQ4VIWX5
Version: 20200416
Limit: 100
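
The cell above hard-codes the credentials and prints them, which exposes them to anyone who can read the notebook. One common alternative is to read them from environment variables and print only a masked form. This is a sketch under assumptions: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical, not part of the original notebook.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


def mask(secret: str, visible: int = 4) -> str:
    """Keep the first few characters so the value can be checked but not copied."""
    return secret[:visible] + "*" * (len(secret) - visible)


# Demonstration with placeholder values passed in explicitly.
demo_env = {"FOURSQUARE_CLIENT_ID": "EXAMPLEID123", "FOURSQUARE_CLIENT_SECRET": "EXAMPLESECRET"}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("Client ID:", mask(demo_id))
print("Client Secret:", mask(demo_secret))
```

In the notebook itself, calling `load_foursquare_credentials()` with no argument would read from the real environment.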


Now we will preview our API calls by making a map of the venues in the community familiar to us, Northwest Columbus. First we will make a folium map centered on the community.

In [5]:
NORTHWEST = COLUMBUS_COMMUNITIES[COLUMBUS_COMMUNITIES.Community == 'Northwest'].reset_index()

NORTHWEST_MAP = folium.Map(
    location = [NORTHWEST.loc[0,'Latitude'], NORTHWEST.loc[0,'Longitude']],
    zoom_start=12
)

folium.vector_layers.CircleMarker(
    [NORTHWEST.loc[0,'Latitude'], NORTHWEST.loc[0,'Longitude']],
    radius=5,
    color = 'red',
    popup = NORTHWEST.loc[0,'Community'],
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(NORTHWEST_MAP)

NORTHWEST_MAP

Next we will make our API call.

In [6]:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    NORTHWEST.loc[0,'Latitude'],
    NORTHWEST.loc[0,'Longitude'],
    NORTHWEST.loc[0,'Radius'],
    LIMIT)

results = requests.get(url).json()
print("Success")

Success
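
The format string above works, but it is easy to misplace a `?` or `&` when building a URL by hand. A possible alternative is to collect the query parameters in a dictionary and let the standard library encode them. The helper name `build_explore_url` is hypothetical, and the credentials below are placeholders. The coordinates and radius are the Downtown row from the preview table.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Assemble the explore URL with properly encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",  # latitude,longitude as a single parameter
        "radius": radius,      # meters
        "limit": limit,
    }
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"


# Placeholder credentials; real ones should never be printed.
example_url = build_explore_url("ID", "SECRET", "20200416", 39.963515, -82.999752, 1400, 100)
print(example_url)
```

The resulting string can be passed to `requests.get` as before. Calling `raise_for_status()` on the response before `.json()` would make the "Success" message reflect an actual successful request rather than just the absence of a network error.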


Here we define a function to get the category of each venue from the JSON response.

In [7]:
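def get_category_type(row):
    """Return the name of the venue's first category, or None if it has none."""
    # The column is 'categories' or 'venue.categories' depending on how the data was flattened.
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']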

print("Function Defined!")

Function Defined!
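
To see what the function does, it can be tried on a few hand-made rows. The rows below are plain dictionaries invented for illustration, standing in for data frame rows. The function is repeated, with the lookup failure caught as `KeyError`, so that the snippet runs on its own.

```python
def get_category_type(row):
    """Return the name of the first category, or None when the list is empty."""
    try:
        categories_list = row["categories"]
    except KeyError:
        categories_list = row["venue.categories"]
    if len(categories_list) == 0:
        return None
    return categories_list[0]["name"]


# Hand-made rows that mimic the two key layouts the function accepts.
flat_row = {"categories": [{"name": "Taco Place"}]}
nested_row = {"venue.categories": [{"name": "Ice Cream Shop"}, {"name": "Dessert Shop"}]}
empty_row = {"venue.categories": []}

print(get_category_type(flat_row))    # Taco Place
print(get_category_type(nested_row))  # Ice Cream Shop (only the first category is kept)
print(get_category_type(empty_row))   # None
```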


We convert the results into a well-formatted data frame.

In [8]:
NORTHWEST_VENUES = results['response']['groups'][0]['items']
NORTHWEST_VENUES = json_normalize(NORTHWEST_VENUES)

FILTERED_COLUMNS = ['venue.name','venue.categories','venue.location.lat','venue.location.lng']
NORTHWEST_VENUES = NORTHWEST_VENUES.loc[:,FILTERED_COLUMNS]

NORTHWEST_VENUES['venue.categories'] = NORTHWEST_VENUES.apply(get_category_type,axis = 1)

NORTHWEST_VENUES.columns = [col.split(".")[-1] for col in NORTHWEST_VENUES.columns]

NORTHWEST_VENUES.head()

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Los Guachos Taqueria | Taco Place | 40.064524 | -83.057044 |
| 1 | Graeter's Ice Cream | Ice Cream Shop | 40.06499 | -83.075559 |
| 2 | City Egg | Breakfast Spot | 40.064127 | -83.058756 |
| 3 | Texas Roadhouse | Steakhouse | 40.064117 | -83.060436 |
| 4 | Planet Fitness - Temporarily Closed | Gym / Fitness Center | 40.065112 | -83.072097 |
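
It can help to see the same flattening done by hand, without pandas, to understand which nested fields end up in which column. The sample below is hand-made to mimic the shape of `results['response']['groups'][0]['items']`. It contains only the fields this notebook uses, the second entry is hypothetical, and `flatten_item` is an illustrative helper rather than part of the notebook.

```python
# Hand-made sample; a real response carries many more fields per item.
sample_items = [
    {"venue": {"name": "Los Guachos Taqueria",
               "categories": [{"name": "Taco Place"}],
               "location": {"lat": 40.064524, "lng": -83.057044}}},
    {"venue": {"name": "Uncategorized Venue",  # hypothetical entry with no category
               "categories": [],
               "location": {"lat": 40.0, "lng": -83.0}}},
]


def flatten_item(item):
    """Pick out the four fields kept in the venues data frame."""
    venue = item["venue"]
    categories = venue["categories"]
    return {
        "name": venue["name"],
        "categories": categories[0]["name"] if categories else None,
        "lat": venue["location"]["lat"],
        "lng": venue["location"]["lng"],
    }


rows = [flatten_item(item) for item in sample_items]
for row in rows:
    print(row)
```

Each dictionary corresponds to one row of the table above, with the `venue.` and `location.` prefixes dropped in the same way as the column-renaming step.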


Check the number of venues.

In [9]:
print("There are", NORTHWEST_VENUES.shape[0], "venues nearby.")

There are 100 venues nearby.


And finally add the venues to our folium map.

In [10]:
for lat, lng, label in zip(NORTHWEST_VENUES.lat, NORTHWEST_VENUES.lng, NORTHWEST_VENUES.name):
    folium.vector_layers.CircleMarker(
        [lat, lng],
        radius = 5,
        color = 'blue',
        popup = label,
    ).add_to(NORTHWEST_MAP)

NORTHWEST_MAP

As a sanity check, we will repeat the process to map the venues returned for Downtown Columbus (at most `LIMIT` of them).

In [11]:
DOWNTOWN = COLUMBUS_COMMUNITIES[COLUMBUS_COMMUNITIES.Community == 'Downtown'].reset_index()

DOWNTOWN_MAP = folium.Map(
    location = [DOWNTOWN.loc[0,'Latitude'], DOWNTOWN.loc[0,'Longitude']],
    zoom_start=13
)

folium.vector_layers.CircleMarker(
    [DOWNTOWN.loc[0,'Latitude'], DOWNTOWN.loc[0,'Longitude']],
    radius=5,
    color = 'red',
    popup = DOWNTOWN.loc[0,'Community'],
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(DOWNTOWN_MAP)

url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    DOWNTOWN.loc[0,'Latitude'],
    DOWNTOWN.loc[0,'Longitude'],
    DOWNTOWN.loc[0,'Radius'],
    LIMIT)

results = requests.get(url).json()

DOWNTOWN_VENUES = results['response']['groups'][0]['items']
DOWNTOWN_VENUES = json_normalize(DOWNTOWN_VENUES)

DOWNTOWN_VENUES = DOWNTOWN_VENUES.loc[:,FILTERED_COLUMNS]

DOWNTOWN_VENUES['venue.categories'] = DOWNTOWN_VENUES.apply(get_category_type,axis = 1)

DOWNTOWN_VENUES.columns = [col.split(".")[-1] for col in DOWNTOWN_VENUES.columns]

for lat, lng, label in zip(DOWNTOWN_VENUES.lat, DOWNTOWN_VENUES.lng, DOWNTOWN_VENUES.name):
    folium.vector_layers.CircleMarker(
        [lat, lng],
        radius = 5,
        color = 'blue',
        popup = label,
    ).add_to(DOWNTOWN_MAP)

DOWNTOWN_MAP
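
The visual sanity check could also be made numeric by measuring how far each venue lies from the community center. The sketch below uses the haversine formula and assumes the `Radius` column is in meters. The function names are hypothetical, and the venue list is a small hand-picked sample (one point near Downtown, one venue from the Northwest table), not an actual API result.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def venues_outside(center_lat, center_lng, radius_m, venues):
    """Return the names of venues farther than radius_m from the center."""
    return [name for name, lat, lng in venues
            if haversine_m(center_lat, center_lng, lat, lng) > radius_m]


# Downtown center and radius from the communities table.
downtown = (39.963515, -82.999752, 1400)
sample_venues = [
    ("Columbus map center", 39.9612, -82.9988),        # a few hundred meters away
    ("Los Guachos Taqueria", 40.064524, -83.057044),   # a Northwest venue, far outside
]
print(venues_outside(*downtown, sample_venues))
```

With real data, the `(name, lat, lng)` triples could come from `zip(DOWNTOWN_VENUES.name, DOWNTOWN_VENUES.lat, DOWNTOWN_VENUES.lng)`. An empty result would suggest the API respected the requested radius.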