<center> <p style="font-size:30px"> The Battle of the Neighborhoods - Part 2 </p> </center>

# Let's import the libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

### Let's obtain the data as a dataframe. After that, we'll clean it by dropping the rows whose borough is "Not assigned" and by merging the postal codes that appear more than once with different neighborhoods. This concludes the pre-processing part.

In [2]:
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
df = df[0]
dff = df.drop( df[(df['Borough']=='Not assigned')].index)
dff = dff.reset_index(drop=True) #Let's reset the index 

In [8]:
dff.columns=(['Postal Code', 'Borough','Neighborhood'])
dff.head()
df3= dff.groupby(['Postal Code', 'Borough'])['Neighborhood'].apply(list).apply(lambda x: ", ".join(x)).to_frame()
df3 = df3.reset_index() #Let's add the index to each row
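
To make the cleaning step easier to follow, here is a minimal sketch on a tiny hand-made table. The rows in `raw` are hypothetical and only mimic the shape of the Wikipedia table; the grouping uses `agg(', '.join)`, which should give the same result as the `apply(list)` plus `lambda` chain above.

```python
import pandas as pd

# Hypothetical sample rows in the same shape as the postal code table.
raw = pd.DataFrame({
    'Postal Code': ['M1A', 'M5A', 'M5A', 'M6A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', 'North York'],
    'Neighborhood': ['Not assigned', 'Regent Park', 'Harbourfront', 'Lawrence Manor'],
})

# Drop the rows whose borough is not assigned.
assigned = raw[raw['Borough'] != 'Not assigned']

# Keep one row per postal code and join its neighborhood names with commas.
merged = (assigned
          .groupby(['Postal Code', 'Borough'])['Neighborhood']
          .agg(', '.join)
          .reset_index())

print(merged)
```

The two `M5A` rows collapse into a single row whose `Neighborhood` value lists both names.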

## It's time to merge the coordinates (latitude and longitude) into the dataframe df3

In [10]:
coords = pd.read_csv('https://cocl.us/Geospatial_data')
DF = df3.merge(coords, on="Postal Code")
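
Since `merge` performs an inner join by default, any postal code without coordinates would silently disappear from `DF`. The sketch below shows one way to check for that with `how='left'` and `indicator=True`. The two small tables, including the `M9Z` code and the coordinate values, are made up for illustration.

```python
import pandas as pd

# Hypothetical postal codes; 'M9Z' is invented and has no coordinates below.
codes = pd.DataFrame({
    'Postal Code': ['M4E', 'M5A', 'M9Z'],
    'Borough': ['East Toronto', 'Downtown Toronto', 'Etobicoke'],
})

# Hypothetical coordinates table (values are placeholders, not real data).
sample_coords = pd.DataFrame({
    'Postal Code': ['M4E', 'M5A'],
    'Latitude': [43.0, 43.1],
    'Longitude': [-79.0, -79.1],
})

# A left join keeps every postal code; '_merge' records where each row came from.
checked = codes.merge(sample_coords, on='Postal Code', how='left', indicator=True)

# Postal codes that found no match in the coordinates table.
missing = checked.loc[checked['_merge'] == 'left_only', 'Postal Code'].tolist()
print(missing)
```

An empty `missing` list would mean the plain inner merge loses nothing.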

## Creating a map.
### Now we will reduce the dataframe and consider only the boroughs whose name contains the word 'Toronto'.

In [11]:
address = 'Toronto' 

geolocator = Nominatim(user_agent="TO_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographic coordinates of Toronto are {}, {}.'.format(latitude, longitude))

DFF = DF[DF['Borough'].str.contains("Toronto")].reset_index(drop=True) #Reduced dataframe

toronto_map2 = folium.Map(location=[latitude, longitude],zoom_start=13)
# Now let's add some markers to map
for lat, lng, borough, neighborhood in zip(DFF['Latitude'], DFF['Longitude'], DFF['Borough'], DFF['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2.5,
        popup=label,
        color='red',
        fill=False,
        fill_opacity=0.9,
        parse_html=False).add_to(toronto_map2)  
toronto_map2

The geographic coordinates of Toronto are 43.6534817, -79.3839347.


In [12]:
# Foursquare Credentials
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' #  Foursquare ID (placeholder, use your own)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' #  Foursquare Secret (placeholder, keep the real value private)
VERSION = '20180605' # Foursquare API version


In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):  
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)       
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)  
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
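
The list comprehension above indexes `v['venue']['categories'][0]`, so it would raise an `IndexError` if a venue ever came back with an empty category list. As a hedged sketch, the parsing can be pulled into a small helper that skips such venues. The name `extract_venue_rows` and the `sample_items` data are hypothetical; only the field names are taken from the function above.

```python
def extract_venue_rows(name, lat, lng, items):
    """Turn the 'items' list of one explore response into flat tuples."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        if not categories:
            # No category to report, so skip instead of raising IndexError.
            continue
        rows.append((
            name, lat, lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            categories[0]['name'],
        ))
    return rows


# Hypothetical items, shaped like the fields that getNearbyVenues reads.
sample_items = [
    {'venue': {'name': 'Sample Café',
               'location': {'lat': 43.65, 'lng': -79.38},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Uncategorized Spot',
               'location': {'lat': 43.66, 'lng': -79.39},
               'categories': []}},
]

rows = extract_venue_rows('Sample Neighborhood', 43.65, -79.38, sample_items)
print(rows)
```

Keeping the parsing separate from the HTTP request also makes it possible to test it without calling the API.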

In [14]:
LIMIT=50
toronto_venues = getNearbyVenues(names=DFF['Neighborhood'],
                                   latitudes=DFF['Latitude'],
                                   longitudes=DFF['Longitude']
                                  )


The Beaches
The Danforth West, Riverdale
India Bazaar, The Beaches West
Studio District
Lawrence Park
Davisville North
North Toronto West, Lawrence Park
Davisville
Moore Park, Summerhill East
Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park
Rosedale
St. James Town, Cabbagetown
Church and Wellesley
Regent Park, Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
Roselawn
Forest Hill North & West, Forest Hill Road Park
The Annex, North Midtown, Yorkville
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Stn A PO Boxes
First Canadian Place, Underground city
Christie
Dufferin, Dovercourt Village
Little Portugal, Trinity
Brockton, Parkdale Village, Exhibition Place
High 
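
The next cell uses `toronto_onehot`, whose construction is not shown here. The sketch below is one common way to build such a table with `pd.get_dummies`, and it is an assumption rather than the exact code that was run. `toronto_venues_sample` is a hypothetical stand-in for `toronto_venues`.

```python
import pandas as pd

# Hypothetical stand-in for toronto_venues with only the columns needed here.
toronto_venues_sample = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Gym', 'Café', 'Café'],
})

# One column per venue category, with 1 where the venue belongs to it.
onehot = pd.get_dummies(toronto_venues_sample['Venue Category'], dtype=int)

# Put the neighborhood name back as the first column.
onehot.insert(0, 'Neighborhood', toronto_venues_sample['Neighborhood'])

# The mean of each column is the share of venues in that category.
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

In this toy table, half of the venues in neighborhood `A` are gyms, so its `Gym` value is 0.5, while neighborhood `B` gets 0.0.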

In [30]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 20

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ind >= 3: fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(10)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Berczy Park,Coffee Shop,Cheese Shop,Café,Cocktail Bar,Beer Bar,Seafood Restaurant,Farmers Market,Bakery,Restaurant,Pharmacy,Hotel,Breakfast Spot,Bistro,Indian Restaurant,Department Store,Beach,Basketball Stadium,Creperie,Jazz Club,Italian Restaurant
1,"Brockton, Parkdale Village, Exhibition Place",Café,Breakfast Spot,Bakery,Coffee Shop,Furniture / Home Store,Convenience Store,Bar,Stadium,Nightclub,Intersection,Restaurant,Climbing Gym,Gym,Grocery Store,Performing Arts Venue,Italian Restaurant,Burrito Place,Office,Pet Store,Deli / Bodega
2,"Business reply mail Processing Centre, South C...",Light Rail Station,Yoga Studio,Farmers Market,Butcher,Spa,Auto Workshop,Recording Studio,Pizza Place,Burrito Place,Gym / Fitness Center,Restaurant,Skate Park,Park,Comic Shop,Brewery,Garden,Garden Center,Fast Food Restaurant,Department Store,Deli / Bodega
3,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Service,Airport Lounge,Airport Terminal,Coffee Shop,Harbor / Marina,Plane,Rental Car Location,Sculpture Garden,Boutique,Bar,Boat or Ferry,Airport Gate,Airport Food Court,Airport,Garden,Deli / Bodega,Donut Shop,Doner Restaurant,Dog Run,Distribution Center
4,Central Bay Street,Coffee Shop,Italian Restaurant,Sandwich Place,Bubble Tea Shop,Burger Joint,Café,Wine Bar,Modern European Restaurant,Japanese Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Office,New American Restaurant,Ice Cream Shop,Park,Pizza Place,Poke Place,Indian Restaurant,Gastropub,Hotel
5,Christie,Grocery Store,Café,Park,Baby Store,Coffee Shop,Candy Store,Restaurant,Diner,Italian Restaurant,Nightclub,Fried Chicken Joint,Garden Center,Dessert Shop,Department Store,Deli / Bodega,Dance Studio,Gaming Cafe,Cupcake Shop,Garden,Cuban Restaurant
6,Church and Wellesley,Coffee Shop,Japanese Restaurant,Yoga Studio,Gay Bar,Restaurant,Sushi Restaurant,Men's Store,Theme Restaurant,Indian Restaurant,Hobby Shop,Martial Arts Dojo,Diner,Sake Bar,Theater,Ice Cream Shop,Breakfast Spot,Ramen Restaurant,Pub,Bubble Tea Shop,Beer Bar
7,"Commerce Court, Victoria Hotel",Hotel,Coffee Shop,Café,Restaurant,Gym,American Restaurant,Japanese Restaurant,Gastropub,Seafood Restaurant,Deli / Bodega,Park,Beer Bar,Gym / Fitness Center,Sandwich Place,Ice Cream Shop,Salad Place,Bookstore,Bakery,Pub,Tailor Shop
8,Davisville,Pizza Place,Sandwich Place,Dessert Shop,Sushi Restaurant,Coffee Shop,Italian Restaurant,Café,Gym,Seafood Restaurant,Japanese Restaurant,Restaurant,Brewery,Indian Restaurant,Farmers Market,Pharmacy,Diner,Discount Store,Thai Restaurant,Gourmet Shop,Park
9,Davisville North,Park,Breakfast Spot,Pizza Place,Sandwich Place,Gym / Fitness Center,Department Store,Dance Studio,Hotel,Food & Drink Shop,Cuban Restaurant,Distribution Center,Discount Store,Diner,Dessert Shop,Deli / Bodega,Cupcake Shop,Creperie,Doner Restaurant,Coworking Space,Cosmetics Shop


<p style='font-size:20px'> The previous dataframe shows the 20 most common venue categories of each neighborhood. Which neighborhood is the best one if we want to open a gym? At first, the answer is simple: one that doesn't have a gym among its 20 most common venues. The answer can be refined a little, though.</p>

## Can we take into account more features in order to pick the best place? Yes!

<p style='font-size:20px'> We can look for the neighborhoods that don't have many gyms or fitness centers but do have hotels or cafés.</p>

In [45]:
toronto_possible_1 = neighborhoods_venues_sorted[(neighborhoods_venues_sorted['1st Most Common Venue']=='Hotel') | (neighborhoods_venues_sorted['2nd Most Common Venue']== 'Café') ].reset_index(drop=True)
toronto_possible_2 = neighborhoods_venues_sorted[(neighborhoods_venues_sorted['1st Most Common Venue']=='Café') | (neighborhoods_venues_sorted['2nd Most Common Venue']== 'Hotel') ].reset_index(drop=True)


<p style='font-size:20px'> toronto_possible_1 contains the neighborhoods whose most common venue is a hotel or whose 2nd most common venue is a café. toronto_possible_2 contains the neighborhoods whose most common venue is a café or whose 2nd most common venue is a hotel.</p>

In [50]:
toronto_possible_1

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Christie,Grocery Store,Café,Park,Baby Store,Coffee Shop,Candy Store,Restaurant,Diner,Italian Restaurant,Nightclub,Fried Chicken Joint,Garden Center,Dessert Shop,Department Store,Deli / Bodega,Dance Studio,Gaming Cafe,Cupcake Shop,Garden,Cuban Restaurant
1,"Commerce Court, Victoria Hotel",Hotel,Coffee Shop,Café,Restaurant,Gym,American Restaurant,Japanese Restaurant,Gastropub,Seafood Restaurant,Deli / Bodega,Park,Beer Bar,Gym / Fitness Center,Sandwich Place,Ice Cream Shop,Salad Place,Bookstore,Bakery,Pub,Tailor Shop
2,"Garden District, Ryerson",Coffee Shop,Café,Clothing Store,Cosmetics Shop,Ramen Restaurant,Tea Room,Theater,Bookstore,Fast Food Restaurant,Japanese Restaurant,Plaza,Burger Joint,Hotel,Burrito Place,Pizza Place,Italian Restaurant,Sandwich Place,New American Restaurant,Shopping Mall,Lake
3,"High Park, The Junction South",Mexican Restaurant,Café,Thai Restaurant,Park,Discount Store,Diner,Bookstore,Italian Restaurant,Music Venue,Cajun / Creole Restaurant,Fast Food Restaurant,Speakeasy,Bar,Bakery,Flea Market,Furniture / Home Store,Gastropub,Fried Chicken Joint,Antique Shop,Arts & Crafts Store
4,"Little Portugal, Trinity",Bar,Café,Restaurant,Coffee Shop,Men's Store,Vietnamese Restaurant,Asian Restaurant,Vegetarian / Vegan Restaurant,Park,New American Restaurant,Miscellaneous Shop,Mexican Restaurant,Greek Restaurant,Korean Restaurant,Juice Bar,Japanese Restaurant,Italian Restaurant,Pizza Place,Ice Cream Shop,Wine Bar
5,"Richmond, Adelaide, King",Coffee Shop,Café,Steakhouse,Concert Hall,Hotel,Restaurant,American Restaurant,Pizza Place,New American Restaurant,Department Store,Gym / Fitness Center,Seafood Restaurant,Salon / Barbershop,Bookstore,Brazilian Restaurant,Plaza,Deli / Bodega,Burrito Place,Gym,Japanese Restaurant
6,"The Annex, North Midtown, Yorkville",Sandwich Place,Café,Coffee Shop,Pub,History Museum,Liquor Store,BBQ Joint,Indian Restaurant,Cheese Shop,Convenience Store,Pizza Place,Pharmacy,Middle Eastern Restaurant,Burger Joint,Park,Donut Shop,Vegetarian / Vegan Restaurant,Dance Studio,Deli / Bodega,Department Store
7,"Toronto Dominion Centre, Design Exchange",Coffee Shop,Café,Seafood Restaurant,Hotel,Japanese Restaurant,Restaurant,Beer Bar,Bakery,Gym / Fitness Center,Deli / Bodega,Basketball Stadium,Wine Bar,Pizza Place,Pub,Sandwich Place,Salad Place,Bookstore,Plaza,Ice Cream Shop,Speakeasy


In [49]:
toronto_possible_2

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Brockton, Parkdale Village, Exhibition Place",Café,Breakfast Spot,Bakery,Coffee Shop,Furniture / Home Store,Convenience Store,Bar,Stadium,Nightclub,Intersection,Restaurant,Climbing Gym,Gym,Grocery Store,Performing Arts Venue,Italian Restaurant,Burrito Place,Office,Pet Store,Deli / Bodega
1,"First Canadian Place, Underground city",Café,Coffee Shop,Restaurant,American Restaurant,Seafood Restaurant,Deli / Bodega,Concert Hall,Hotel,Gym,Tea Room,Bar,Bakery,Speakeasy,Gym / Fitness Center,New American Restaurant,Bookstore,Salad Place,Beer Bar,Greek Restaurant,Gluten-free Restaurant
2,"Kensington Market, Chinatown, Grange Park",Café,Mexican Restaurant,Vegetarian / Vegan Restaurant,Coffee Shop,Burger Joint,Vietnamese Restaurant,Bar,Pizza Place,Dessert Shop,Record Shop,Caribbean Restaurant,Cheese Shop,Park,Organic Grocery,Noodle House,Comfort Food Restaurant,Wine Bar,Belgian Restaurant,Doner Restaurant,Donut Shop
3,"Runnymede, Swansea",Café,Coffee Shop,Pizza Place,Pub,Sushi Restaurant,Italian Restaurant,Yoga Studio,Bookstore,Bar,Smoothie Shop,IT Services,Sandwich Place,Restaurant,Boutique,Falafel Restaurant,Diner,Burrito Place,Dessert Shop,Latin American Restaurant,Bank
4,St. James Town,Café,Coffee Shop,Cosmetics Shop,Creperie,Farmers Market,Gastropub,Restaurant,Seafood Restaurant,Hotel,German Restaurant,Middle Eastern Restaurant,Park,Beer Bar,Ice Cream Shop,Furniture / Home Store,Bookstore,Department Store,Camera Store,Breakfast Spot,Italian Restaurant
5,Stn A PO Boxes,Café,Restaurant,Beer Bar,Cocktail Bar,Coffee Shop,Creperie,Farmers Market,Seafood Restaurant,Cheese Shop,Hotel,Liquor Store,Park,Basketball Stadium,Bistro,Comfort Food Restaurant,Department Store,Breakfast Spot,Irish Pub,Italian Restaurant,Japanese Restaurant
6,Studio District,Café,Coffee Shop,American Restaurant,Bakery,Brewery,Gastropub,Wine Bar,Fish Market,Pet Store,Park,Middle Eastern Restaurant,Latin American Restaurant,Italian Restaurant,Ice Cream Shop,Gym / Fitness Center,Gay Bar,Convenience Store,Diner,Coworking Space,Seafood Restaurant
7,"University of Toronto, Harbord",Café,Restaurant,Bar,Bookstore,Sandwich Place,Japanese Restaurant,Bakery,Yoga Studio,Beer Bar,Beer Store,Italian Restaurant,Dessert Shop,Pub,Chinese Restaurant,Noodle House,Nightclub,College Arts Building,Comfort Food Restaurant,Moving Target,Bank


<p style="font-size:20px"> Taking a look at these two dataframes, we can see that some neighborhoods have few or no gyms among their 20 most common venues but do have many cafés and some hotels.</p>
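
The inspection by eye can also be done programmatically. The sketch below flags the rows in which any "Most Common Venue" column mentions a gym. The helper `has_gym` and the `sample` table are hypothetical, and the table is shortened to three venue columns. Note that a substring match on `'Gym'` also catches categories such as `Climbing Gym` and `Gym / Fitness Center`.

```python
import pandas as pd

def has_gym(row, keywords=('Gym',)):
    """Return True if any 'Most Common Venue' column contains a keyword."""
    venue_columns = [c for c in row.index if c.endswith('Most Common Venue')]
    return any(k in str(row[c]) for c in venue_columns for k in keywords)


# Hypothetical, shortened table with the same column naming as above.
sample = pd.DataFrame({
    'Neighborhood': ['Neighborhood A', 'Neighborhood B'],
    '1st Most Common Venue': ['Café', 'Hotel'],
    '2nd Most Common Venue': ['Coffee Shop', 'Coffee Shop'],
    '3rd Most Common Venue': ['Pizza Place', 'Gym / Fitness Center'],
})

# Keep only the neighborhoods without any gym among their top venues.
no_gym = sample[~sample.apply(has_gym, axis=1)].reset_index(drop=True)
print(no_gym['Neighborhood'].tolist())
```

Passing something like `keywords=('Gym', 'Yoga Studio')` would widen the notion of a competing venue.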

In [68]:
TO1 = toronto_possible_1[(toronto_possible_1['Neighborhood']=='Toronto Dominion Centre, Design Exchange') | 
                   (toronto_possible_1['Neighborhood']== 'Garden District, Ryerson')].reset_index(drop=True)


In [67]:
TO2 = toronto_possible_2[(toronto_possible_2['Neighborhood']== 'Kensington Market, Chinatown, Grange Park')  |
                   (toronto_possible_2['Neighborhood']=='Runnymede, Swansea')].reset_index(drop=True)

In [74]:
T=[TO1,TO2]
T=pd.concat(T).reset_index(drop=True)
T

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Garden District, Ryerson",Coffee Shop,Café,Clothing Store,Cosmetics Shop,Ramen Restaurant,Tea Room,Theater,Bookstore,Fast Food Restaurant,Japanese Restaurant,Plaza,Burger Joint,Hotel,Burrito Place,Pizza Place,Italian Restaurant,Sandwich Place,New American Restaurant,Shopping Mall,Lake
1,"Toronto Dominion Centre, Design Exchange",Coffee Shop,Café,Seafood Restaurant,Hotel,Japanese Restaurant,Restaurant,Beer Bar,Bakery,Gym / Fitness Center,Deli / Bodega,Basketball Stadium,Wine Bar,Pizza Place,Pub,Sandwich Place,Salad Place,Bookstore,Plaza,Ice Cream Shop,Speakeasy
2,"Kensington Market, Chinatown, Grange Park",Café,Mexican Restaurant,Vegetarian / Vegan Restaurant,Coffee Shop,Burger Joint,Vietnamese Restaurant,Bar,Pizza Place,Dessert Shop,Record Shop,Caribbean Restaurant,Cheese Shop,Park,Organic Grocery,Noodle House,Comfort Food Restaurant,Wine Bar,Belgian Restaurant,Doner Restaurant,Donut Shop
3,"Runnymede, Swansea",Café,Coffee Shop,Pizza Place,Pub,Sushi Restaurant,Italian Restaurant,Yoga Studio,Bookstore,Bar,Smoothie Shop,IT Services,Sandwich Place,Restaurant,Boutique,Falafel Restaurant,Diner,Burrito Place,Dessert Shop,Latin American Restaurant,Bank


<p style="font-size:20px"> As can be observed in the T dataframe, there are some neighborhoods where it could be a good idea to open a gym or a fitness center. </p>

In [78]:
#Let's make a map with the zones where it is suitable to put a gym
TOO1 = DFF[(DFF['Neighborhood']=='Toronto Dominion Centre, Design Exchange') | 
                   (DFF['Neighborhood']== 'Garden District, Ryerson') |
                   (DFF['Neighborhood']== 'Kensington Market, Chinatown, Grange Park')  |
                   (DFF['Neighborhood']=='Runnymede, Swansea') ].reset_index(drop=True)


In [83]:
tormap = folium.Map(location=[latitude, longitude],zoom_start=13)
# Now let's add some markers to map
for lat, lng, borough, neighborhood in zip(TOO1['Latitude'], TOO1['Longitude'], TOO1['Borough'], TOO1['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=label,
        color='red',
        fill=False,
        fill_opacity=0.9,
        parse_html=False).add_to(tormap)  
tormap

<p style="font-size:20px">  So there are four potential places to open a fitness center. For a further analysis, it could be useful to take into account other features, such as the average income and the average age of each neighborhood. But, since this is a first project, I think that choosing the places based on the number of similar venues nearby and on the number of cafés or hotels is a good starting point.</p>