# Capstone Project - Libraries as Economic and Social Stimuli for Toronto Neighborhoods
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Social Impact](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Results](#results)
* [Discussion](#discussion)
* [Conclusion](#conclusion)

## Introduction: Economic and Social Impacts of Libraries in the Neighborhood<a name="introduction"></a>

This project measures the popularity of Toronto's public libraries using foot traffic, approximated with Foursquare check-ins. The stakeholders are the people who live in or around these neighborhoods, so this is a social issue rather than a business problem. To help increase foot traffic to the libraries, we also study the top venues near each library to see whether specific kinds of venues contribute to a library's popularity as a place people visit.

According to several studies, the presence of "3rd places" strengthens communities. The home is the first place, the workplace or school is the second place, and 3rd places such as libraries, churches, and coffee shops provide socially accepting and safe environments. The library can be considered a place where people meet and form social relationships. Neighborhoods with well-known 3rd places like libraries tend to gain economic value and safety.

At the end of this project, we will know whether visits to the libraries are increasing. If they are not, the findings can be used to encourage the local government, and the kinds of private establishments that contribute to visits at successful libraries, to set up shop near the libraries (or inside them, if local ordinances permit) and bring more people into the libraries.

## Data <a name="data"></a>

In [1]:
#Verification of Foursquare Credentials
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

#ID and Secret removed for security reasons

In [3]:
#Import modules for mapping and DataFrame processing

import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# convert an address into latitude and longitude values
from geopy.geocoders import Nominatim 

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import matplotlib.colors as colors

# map rendering library
!conda install -c conda-forge folium=0.5.0 --yes
import folium 

# library to handle requests
import requests 
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# import k-means from clustering stage
from sklearn.cluster import KMeans


Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following NEW packages will be INSTALLED:

    altair:  2.2.2-py35_1 conda-forge
    branca:  0.3.1-py_0   conda-forge
    folium:  0.5.0-py_0   conda-forge
    vincent: 0.4.4-py_1   conda-forge

altair-2.2.2-p 100% |################################| Time: 0:00:00  47.23 MB/s
branca-0.3.1-p 100% |################################| Time: 0:00:00  28.45 MB/s
vincent-0.4.4- 100% |################################| Time: 0:00:00  34.57 MB/s
folium-0.5.0-p 100% |################################| Time: 0:00:00  47.61 MB/s


In [4]:
#Creating initial map of Toronto, Ontario.
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="Tor_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)


The geographical coordinates of Toronto are 43.653963, -79.387207.


In [5]:
#Get Latitude and longitude from Foursquare of Toronto Public Libraries

radius = 5000
LIMIT = 20

query = 'Toronto Public library'

url_Lib = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}&query={}'.format(
   CLIENT_ID,
   CLIENT_SECRET,
   latitude,
   longitude,
   VERSION,
   radius,
   LIMIT,
    query,)
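
As a hedged aside, the same query string can be assembled with the standard library's `urllib.parse.urlencode`, which percent-encodes values such as the spaces in the query. The sketch below only builds the URL and sends nothing. `build_search_url` and the placeholder credentials are illustrative names, not part of the notebook.

```python
from urllib.parse import urlencode

def build_search_url(client_id, client_secret, lat, lng, version, radius, limit, query):
    """Build a Foursquare v2 venue-search URL (hypothetical helper, for illustration only)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'radius': radius,
        'limit': limit,
        'query': query,
    }
    # urlencode escapes spaces and other reserved characters in each value
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params)

# Placeholder credentials; the coordinates are the Toronto values printed earlier
example_url = build_search_url('ID', 'SECRET', 43.653963, -79.387207,
                               '20180605', 5000, 20, 'Toronto Public library')
print(example_url)
```

Keeping the parameters in a dictionary makes it harder to misorder positional arguments than a long `format` call does.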


In [6]:
#Send the request to Foursquare
results = requests.get(url_Lib).json()

In [8]:
#Function to flatten json result to see correct path

from itertools import chain, starmap

def flatten_json(dictionary):
    """Flatten a nested json file"""

    def unpack(parent_key, parent_value):
        """Unpack one level of nesting in json file"""
        # Unpack one level only!!!
        
        if isinstance(parent_value, dict):
            for key, value in parent_value.items():
                temp1 = parent_key + '_' + key
                yield temp1, value
        elif isinstance(parent_value, list):
            i = 0 
            for value in parent_value:
                temp2 = parent_key + '_'+str(i) 
                i += 1
                yield temp2, value
        else:
            yield parent_key, parent_value    

            
    # Keep iterating until the termination condition is satisfied
    while True:
        # Keep unpacking the json file until all values are atomic elements (not dictionary or list)
        dictionary = dict(chain.from_iterable(starmap(unpack, dictionary.items())))
        # Terminate condition: not any value in the json file is dictionary or list
        if not any(isinstance(value, dict) for value in dictionary.values()) and \
           not any(isinstance(value, list) for value in dictionary.values()):
            break

    return dictionary
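
For comparison, the same flattening idea can be written recursively. This is a minimal sketch rather than a replacement for the function above. The name `flatten` and the sample dictionary are assumptions made for illustration.

```python
def flatten(value, prefix=''):
    """Recursively flatten nested dicts/lists into a single dict with '_'-joined keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Atomic value: emit it under the accumulated key
        return {prefix: value}
    flat = {}
    for key, child in items:
        child_prefix = '{}_{}'.format(prefix, key) if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat

# Hypothetical response fragment shaped like a Foursquare search result
sample = {'response': {'venues': [{'name': 'Library A', 'location': {'lat': 43.65}}]}}
print(flatten(sample))
```

The flattened keys (for example `response_venues_0_location_lat`) spell out the path needed to index into the nested response.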

In [None]:
#To optionally view json path
flatten_json(results)

In [9]:
#Get the Library Names, IDs, and Latitude / Longitude

TorPubLib = []
TorPubLibName = []

# Iterate over the venues actually returned instead of assuming exactly 20
venues = results['response']['venues']

for venue in venues:
    # Use a list, not a set: a set is unordered, so the two coordinates could swap places.
    # Longitude comes first to match the column names assigned after the merge.
    TorPubLib.append([venue['location']['lng'], venue['location']['lat']])

for venue in venues:
    TorPubLibName.append(venue['name'])
    TorPubLibName.append(venue['id'])

TorPubLib_df = pd.DataFrame(TorPubLib)
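
One hedged alternative to building separate lists and merging them later is to collect one dictionary per venue, so every value stays attached to a named column. The sample venues below are hypothetical stand-ins with the same shape as `results['response']['venues']`.

```python
import pandas as pd

# Hypothetical venues with the same nesting as the Foursquare search response
sample_venues = [
    {'id': 'id-1', 'name': 'Library A', 'location': {'lat': 43.65, 'lng': -79.38}},
    {'id': 'id-2', 'name': 'Library B', 'location': {'lat': 43.67, 'lng': -79.39}},
]

# One dict per venue keeps every value attached to a named column
records = [
    {
        'Name': v['name'],
        'Venue ID': v['id'],
        'Latitude': v['location']['lat'],
        'Longitude': v['location']['lng'],
    }
    for v in sample_venues
]

libraries_df = pd.DataFrame(records, columns=['Name', 'Venue ID', 'Latitude', 'Longitude'])
print(libraries_df)
```

Because the columns are named at creation, there is no later step where latitude and longitude can be mislabeled.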



In [10]:
#Reshape to get the correct Dataframe

TorPubLibName_df = pd.DataFrame(np.array(TorPubLibName).reshape(-1,2), columns = list("ab"))
print(TorPubLibName_df)

                                                    a  \
0                              Toronto Public Library   
1   Toronto Public Library - Toronto Reference Lib...   
2    Toronto Public Library - Lillian H. Smith Branch   
3        Toronto Public Library - St. Lawrence Branch   
4             Toronto Public Library (St. James Town)   
5          Toronto Public Library - Palmerston Branch   
6        Toronto Public Library - College/Shaw Branch   
7     Toronto Public Library - Bloor Gladstone Branch   
8           Toronto Public Library (Fort York Branch)   
9            Toronto Public Library - Parkdale Branch   
10  Toronto Public Library - Northern District Branch   
11      Toronto Public Library - Pape/Danforth Branch   
12          Toronto Public Library - Deer Park Branch   
13          Toronto Public Library (Sanderson Branch)   
14                  Toronto Public Library Bookmobile   
15         Toronto Public Library (Merril Collection)   
16            Toronto Public Li

In [13]:
#Merge the dataframes

TorPubLib1 = pd.merge(TorPubLibName_df, TorPubLib_df, left_index=True, right_index=True)
TorPubLib1.columns = ["Name","Venue ID","Longitude","Latitude"]

columns=list(TorPubLib1.columns)

In [14]:
TorPubLib1

Unnamed: 0,Name,Venue ID,Longitude,Latitude
0,Toronto Public Library,4c8938c8944e224b52e72285,-79.383295,43.652631
1,Toronto Public Library - Toronto Reference Lib...,4b5f2e80f964a52088ab29e3,-79.386944,43.671795
2,Toronto Public Library - Lillian H. Smith Branch,4ae6010ff964a520f7a321e3,-79.398372,43.658137
3,Toronto Public Library - St. Lawrence Branch,4b51e5aff964a5203c5a27e3,-79.36833,43.650048
4,Toronto Public Library (St. James Town),4b807beef964a5209d7630e3,-79.374998,43.66879
5,Toronto Public Library - Palmerston Branch,4b26b348f964a520b97f24e3,-79.413978,43.665074
6,Toronto Public Library - College/Shaw Branch,4d9b3f1bb4fa37044e7b980d,-79.420167,43.654941
7,Toronto Public Library - Bloor Gladstone Branch,4b80365bf964a520e25c30e3,-79.434173,43.660097
8,Toronto Public Library (Fort York Branch),53879eb7498ee9aa7b6fe3d8,-79.400445,43.639172
9,Toronto Public Library - Parkdale Branch,4b51e4f4f964a5201b5a27e3,-79.432714,43.641248


In [15]:
#Get Library Rating

lib_rate_all = []

for vid in TorPubLib1['Venue ID']:
    url_rate = 'https://api.foursquare.com/v2/venues/{}?&client_id={}&client_secret={}&v={}'.format(
       vid, 
       CLIENT_ID,
       CLIENT_SECRET,
        VERSION)
    lib_rate = requests.get(url_rate).json()
    lib_rate_all.append(lib_rate)


In [16]:
library_rating_all = []

for lib_rate in lib_rate_all:
    # Venues without enough Foursquare feedback have no 'rating' key; record 0 as a placeholder
    library_rating = lib_rate.get('response', {}).get('venue', {}).get('rating', 0)
    library_rating_all.append(library_rating)

library_rating_all_df=pd.DataFrame(library_rating_all)
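
Recording 0 for a missing rating makes "unrated" look like "rated very low". A hedged alternative is to keep missing ratings as `None`, which pandas shows as `NaN` in a numeric column. The helper name `extract_rating` and the two sample responses are assumptions for illustration.

```python
def extract_rating(venue_response, default=None):
    """Return the venue rating from a details response, or `default` when it is absent."""
    return venue_response.get('response', {}).get('venue', {}).get('rating', default)

# Hypothetical responses shaped like the entries of lib_rate_all
rated = {'response': {'venue': {'name': 'Library A', 'rating': 9.1}}}
unrated = {'response': {'venue': {'name': 'Library B'}}}

ratings = [extract_rating(r) for r in (rated, unrated)]
print(ratings)  # [9.1, None]
```

Whether to use 0 or a missing value is a design choice. This notebook uses 0 so that unrated libraries remain in the tables.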


In [17]:
#Merge Ratings with the original Library DataFrame

TorPubLib1Rate_df = pd.merge(TorPubLib1, library_rating_all_df, left_index=True, right_index=True)
TorPubLib1Rate_df.rename(columns={0:'Rating'}, inplace=True)
TorPubLib1Rate_df

Unnamed: 0,Name,Venue ID,Longitude,Latitude,Rating
0,Toronto Public Library,4c8938c8944e224b52e72285,-79.383295,43.652631,6.4
1,Toronto Public Library - Toronto Reference Lib...,4b5f2e80f964a52088ab29e3,-79.386944,43.671795,9.1
2,Toronto Public Library - Lillian H. Smith Branch,4ae6010ff964a520f7a321e3,-79.398372,43.658137,8.2
3,Toronto Public Library - St. Lawrence Branch,4b51e5aff964a5203c5a27e3,-79.36833,43.650048,5.5
4,Toronto Public Library (St. James Town),4b807beef964a5209d7630e3,-79.374998,43.66879,7.2
5,Toronto Public Library - Palmerston Branch,4b26b348f964a520b97f24e3,-79.413978,43.665074,0.0
6,Toronto Public Library - College/Shaw Branch,4d9b3f1bb4fa37044e7b980d,-79.420167,43.654941,5.8
7,Toronto Public Library - Bloor Gladstone Branch,4b80365bf964a520e25c30e3,-79.434173,43.660097,8.4
8,Toronto Public Library (Fort York Branch),53879eb7498ee9aa7b6fe3d8,-79.400445,43.639172,8.3
9,Toronto Public Library - Parkdale Branch,4b51e4f4f964a5201b5a27e3,-79.432714,43.641248,7.2


In [18]:
#Get top venues around the libraries

radius = 500
limit = 10

results_top_all = []

for index, row in TorPubLib1.iterrows():
    url_top = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
       CLIENT_ID,
       CLIENT_SECRET,
       row["Latitude"],
       row["Longitude"],
       VERSION,
       radius,
       limit)
    results_top = requests.get(url_top).json()
    results_top_all.append(results_top)
    #print(results_top)

print(len(results_top_all)) #Just to quickly verify result
   
    

20


In [19]:
#Just to check the result
results_top_all[19]

{'meta': {'code': 200, 'requestId': '5c8a925c4434b9374106507b'},
 'response': {'groups': [{'items': [{'reasons': {'count': 0,
       'items': [{'reasonName': 'globalInteractionReason',
         'summary': 'This spot is popular',
         'type': 'general'}]},
      'referralId': 'e-0-5557fce2498e63ff8af4ce9f-0',
      'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/icecream_',
          'suffix': '.png'},
         'id': '4bf58dd8d48988d1c9941735',
         'name': 'Ice Cream Shop',
         'pluralName': 'Ice Cream Shops',
         'primary': True,
         'shortName': 'Ice Cream'}],
       'id': '5557fce2498e63ff8af4ce9f',
       'location': {'address': '16 Vaughan Rd.',
        'cc': 'CA',
        'country': 'Canada',
        'distance': 42,
        'formattedAddress': ['16 Vaughan Rd.', 'Canada'],
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.681743588143384,
          'lng': -79.41801062262026}],
        'lat': 43.681

In [20]:
#View flattened json
flatten_json(results_top_all[0])

{'meta_code': 200,
 'meta_requestId': '5c8a9256f594df4685411b33',
 'response_groups_0_items_0_reasons_count': 0,
 'response_groups_0_items_0_reasons_items_0_reasonName': 'globalInteractionReason',
 'response_groups_0_items_0_reasons_items_0_summary': 'This spot is popular',
 'response_groups_0_items_0_reasons_items_0_type': 'general',
 'response_groups_0_items_0_referralId': 'e-0-5227bb01498e17bf485e6202-0',
 'response_groups_0_items_0_venue_categories_0_icon_prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/neighborhood_',
 'response_groups_0_items_0_venue_categories_0_icon_suffix': '.png',
 'response_groups_0_items_0_venue_categories_0_id': '4f2a25ac4b909258e854f55f',
 'response_groups_0_items_0_venue_categories_0_name': 'Neighborhood',
 'response_groups_0_items_0_venue_categories_0_pluralName': 'Neighborhoods',
 'response_groups_0_items_0_venue_categories_0_primary': True,
 'response_groups_0_items_0_venue_categories_0_shortName': 'Neighborhood',
 'response_groups_0_it

In [21]:
#Retrieve details for top venues

venueTop = []

for e in range(0,20):
    items = results_top_all[e]['response']['groups'][0]['items']
    for i in range(0,10):
        venue = items[i]['venue']
        # Append the five fields in a fixed order; the flat list is reshaped into rows later
        venueTop.append(venue['name'])
        venueTop.append(venue['categories'][0]['name'])
        venueTop.append(venue['id'])
        venueTop.append(venue['location']['lat'])
        venueTop.append(venue['location']['lng'])

In [22]:
#Convert to DataFrame (np.array casts the mixed values to strings, so the coordinates are stored as text)
venueTop_df = pd.DataFrame(np.array(venueTop).reshape(-1,5))
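
Passing a list that mixes text and numbers through `np.array` converts every element to a string. That is why the coordinates in the resulting table print with their full text representation. The sketch below shows the effect on a hypothetical flat list, and one way to keep the original types by slicing the list into rows instead.

```python
import numpy as np
import pandas as pd

# Hypothetical flat list: name, category, id, latitude, longitude for two venues
flat = ['Venue A', 'Plaza', 'id-1', 43.65, -79.38,
        'Venue B', 'Bookstore', 'id-2', 43.66, -79.39]

as_array = np.array(flat)
print(as_array.dtype.kind)  # 'U': every element, including the coordinates, is now a string

# Slicing the list into rows keeps the original Python types instead
rows = [flat[i:i + 5] for i in range(0, len(flat), 5)]
venues_df = pd.DataFrame(rows, columns=['Venue Name', 'Category', 'Venue id',
                                        'Venue Latitude', 'Venue Longitude'])
print(venues_df.dtypes)
```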

In [23]:
venueTop_df.head()

Unnamed: 0,0,1,2,3,4
0,Downtown Toronto,Neighborhood,5227bb01498e17bf485e6202,43.65323167517444,-79.38529600606677
1,Nathan Phillips Square,Plaza,4ad4c05ef964a520a6f620e3,43.65227047322295,-79.38351631164551
2,Eggspectation Bell Trinity Square,Breakfast Spot,537773d1498e74a75bb75c1e,43.65314383888587,-79.38198016678167
3,Old City Hall,Monument / Landmark,4ad4c05ef964a5208ef620e3,43.652008800876125,-79.3817442232328
4,Indigo,Bookstore,4b2a6eb8f964a52012a924e3,43.65351471121164,-79.38069591056922


In [24]:
#Add Library and Library Rating as well

venueTop_df.insert(0,'Library Name','')
venueTop_df.insert(1,'Library Rating','')

# Rows are stored in blocks of 10 venues per library, in the same order as TorPubLib1Rate_df
venues_per_library = 10

for tpr in range(0,20):
    start = tpr * venues_per_library
    for rtp in range(start, start + venues_per_library):
        venueTop_df.loc[venueTop_df.index[rtp], 'Library Name'] = TorPubLib1Rate_df['Name'][tpr]
        venueTop_df.loc[venueTop_df.index[rtp], 'Library Rating'] = TorPubLib1Rate_df['Rating'][tpr]
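
The block-wise assignment above amounts to repeating each library's value once per venue row. A minimal sketch of that idea with `np.repeat` follows. The names and the block size of 3 are hypothetical; the notebook uses 10.

```python
import numpy as np

library_names = ['Library A', 'Library B']   # hypothetical names
venues_per_library = 3                       # the notebook uses 10

# Each name is repeated once per venue row, preserving library order
repeated = np.repeat(library_names, venues_per_library)
print(repeated.tolist())
```

Assigning such an array as a column gives the column a single dtype. For ratings that would be a numeric dtype, which changes how the column behaves in later numeric aggregations.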

In [25]:
venueTop_df.head(7)

Unnamed: 0,Library Name,Library Rating,0,1,2,3,4
0,Toronto Public Library,6.4,Downtown Toronto,Neighborhood,5227bb01498e17bf485e6202,43.65323167517444,-79.38529600606677
1,Toronto Public Library,6.4,Nathan Phillips Square,Plaza,4ad4c05ef964a520a6f620e3,43.65227047322295,-79.38351631164551
2,Toronto Public Library,6.4,Eggspectation Bell Trinity Square,Breakfast Spot,537773d1498e74a75bb75c1e,43.65314383888587,-79.38198016678167
3,Toronto Public Library,6.4,Old City Hall,Monument / Landmark,4ad4c05ef964a5208ef620e3,43.652008800876125,-79.3817442232328
4,Toronto Public Library,6.4,Indigo,Bookstore,4b2a6eb8f964a52012a924e3,43.65351471121164,-79.38069591056922
5,Toronto Public Library,6.4,M Square Coffee Co,Coffee Shop,54132b3b498ee9ca9332e189,43.65121797253777,-79.38355459932247
6,Toronto Public Library,6.4,Apple Eaton Centre,Electronics Store,4ad788c8f964a520e40b21e3,43.652823,-79.380615


In [26]:
#Renaming Columns

venueTop_df.columns=['Library Name','Library Rating','Venue Name','Category','Venue id','Venue Latitude','Venue Longitude']
venueTop_df.head()

Unnamed: 0,Library Name,Library Rating,Venue Name,Category,Venue id,Venue Latitude,Venue Longitude
0,Toronto Public Library,6.4,Downtown Toronto,Neighborhood,5227bb01498e17bf485e6202,43.65323167517444,-79.38529600606677
1,Toronto Public Library,6.4,Nathan Phillips Square,Plaza,4ad4c05ef964a520a6f620e3,43.65227047322295,-79.38351631164551
2,Toronto Public Library,6.4,Eggspectation Bell Trinity Square,Breakfast Spot,537773d1498e74a75bb75c1e,43.65314383888587,-79.38198016678167
3,Toronto Public Library,6.4,Old City Hall,Monument / Landmark,4ad4c05ef964a5208ef620e3,43.652008800876125,-79.3817442232328
4,Toronto Public Library,6.4,Indigo,Bookstore,4b2a6eb8f964a52012a924e3,43.65351471121164,-79.38069591056922


In [27]:
# one hot encoding for the column = Category
library_onehot_df = pd.get_dummies(venueTop_df['Category'])

# Drop column as it is now encoded 
venueTop_df = venueTop_df.drop('Category',axis = 1)

# Join the encoded df 
venueTop_df = venueTop_df.join(library_onehot_df) 

venueTop_df.head()

Unnamed: 0,Library Name,Library Rating,Venue Name,Venue id,Venue Latitude,Venue Longitude,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beer Bar,Beer Store,Bistro,Bookstore,Brazilian Restaurant,Breakfast Spot,Café,Caribbean Restaurant,Chiropractor,Cocktail Bar,Coffee Shop,Comedy Club,Comic Shop,Concert Hall,Cosmetics Shop,Deli / Bodega,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Farmers Market,Filipino Restaurant,Food & Drink Shop,Food Court,French Restaurant,Furniture / Home Store,Gaming Cafe,Gastropub,Gourmet Shop,Greek Restaurant,Grocery Store,Gym / Fitness Center,Hawaiian Restaurant,Health Food Store,High School,Hotel,Ice Cream Shop,Italian Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Library,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Movie Theater,Music Store,Neighborhood,Nightclub,Park,Pharmacy,Pizza Place,Plaza,Poutine Place,Ramen Restaurant,Restaurant,Sandwich Place,Smoothie Shop,Spa,Speakeasy,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Yoga Studio
0,Toronto Public Library,6.4,Downtown Toronto,5227bb01498e17bf485e6202,43.65323167517444,-79.38529600606677,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Toronto Public Library,6.4,Nathan Phillips Square,4ad4c05ef964a520a6f620e3,43.65227047322295,-79.38351631164551,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Toronto Public Library,6.4,Eggspectation Bell Trinity Square,537773d1498e74a75bb75c1e,43.65314383888587,-79.38198016678167,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Toronto Public Library,6.4,Old City Hall,4ad4c05ef964a5208ef620e3,43.652008800876125,-79.3817442232328,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Toronto Public Library,6.4,Indigo,4b2a6eb8f964a52012a924e3,43.65351471121164,-79.38069591056922,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [28]:
venueTop_df.shape

(200, 95)

In [30]:
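# Average only the one-hot category columns; text columns such as the venue name cannot be averaged
library_grouped = venueTop_df.groupby('Library Name')[list(library_onehot_df.columns)].mean().reset_index()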
library_grouped.head()

Unnamed: 0,Library Name,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Beer Bar,Beer Store,Bistro,Bookstore,Brazilian Restaurant,Breakfast Spot,Café,Caribbean Restaurant,Chiropractor,Cocktail Bar,Coffee Shop,Comedy Club,Comic Shop,Concert Hall,Cosmetics Shop,Deli / Bodega,Dessert Shop,Diner,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Farmers Market,Filipino Restaurant,Food & Drink Shop,Food Court,French Restaurant,Furniture / Home Store,Gaming Cafe,Gastropub,Gourmet Shop,Greek Restaurant,Grocery Store,Gym / Fitness Center,Hawaiian Restaurant,Health Food Store,High School,Hotel,Ice Cream Shop,Italian Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Library,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Monument / Landmark,Movie Theater,Music Store,Neighborhood,Nightclub,Park,Pharmacy,Pizza Place,Plaza,Poutine Place,Ramen Restaurant,Restaurant,Sandwich Place,Smoothie Shop,Spa,Speakeasy,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Thrift / Vintage Store,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Yoga Studio
0,Toronto Public Library,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Toronto Public Library (Annette Street),0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Toronto Public Library (Fort York Branch),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.1,0.0,0.0,0.3,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1
3,Toronto Public Library (High Park Branch),0.1,0.0,0.0,0.0,0.1,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Toronto Public Library (Merril Collection),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0
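
To make the values in this table concrete: the mean of a one-hot column within a group is the share of that group's venues in that category. With ten venues per library, 0.1 means one venue and 0.3 means three. The toy data below is hypothetical and only illustrates the mechanics.

```python
import pandas as pd

# Hypothetical venues: two libraries with two nearby venues each
toy = pd.DataFrame({
    'Library Name': ['Library A', 'Library A', 'Library B', 'Library B'],
    'Category': ['Café', 'Bar', 'Café', 'Café'],
})

onehot = pd.get_dummies(toy['Category'], dtype=float)  # one indicator column per category
onehot.insert(0, 'Library Name', toy['Library Name'])

# The mean of an indicator column is the share of that library's venues in the category
freq = onehot.groupby('Library Name').mean().reset_index()
print(freq)
```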


In [31]:
num_top_venues = 5

for library in library_grouped['Library Name']:
    print("----"+library+"----")
    temp = library_grouped[library_grouped['Library Name'] == library].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Toronto Public Library----
               venue  freq
0     Breakfast Spot   0.1
1         Restaurant   0.1
2   Asian Restaurant   0.1
3  Electronics Store   0.1
4     Cosmetics Shop   0.1


----Toronto Public Library (Annette Street)----
                    venue  freq
0                     Bar   0.2
1      Italian Restaurant   0.1
2               Speakeasy   0.1
3  Furniture / Home Store   0.1
4      Mexican Restaurant   0.1


----Toronto Public Library (Fort York Branch)----
         venue  freq
0  Coffee Shop   0.3
1  Yoga Studio   0.1
2         Café   0.1
3         Park   0.1
4        Diner   0.1


----Toronto Public Library (High Park Branch)----
                 venue  freq
0               Bakery   0.2
1  American Restaurant   0.1
2          Coffee Shop   0.1
3            Juice Bar   0.1
4    Food & Drink Shop   0.1


----Toronto Public Library (Merril Collection)----
                           venue  freq
0            Dumpling Restaurant   0.1
1                  Smoothie Sh

In [32]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [33]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Library']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
library_venues_sorted = pd.DataFrame(columns=columns)
library_venues_sorted['Library'] = library_grouped['Library Name']

for ind in np.arange(library_grouped.shape[0]):
    library_venues_sorted.iloc[ind, 1:] = return_most_common_venues(library_grouped.iloc[ind, :], num_top_venues)

library_venues_sorted.head()

Unnamed: 0,Library,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Toronto Public Library,Neighborhood,Electronics Store,Cosmetics Shop,Asian Restaurant,Monument / Landmark
1,Toronto Public Library (Annette Street),Bar,Café,Grocery Store,Arts & Crafts Store,Italian Restaurant
2,Toronto Public Library (Fort York Branch),Coffee Shop,Yoga Studio,Park,Diner,Ramen Restaurant
3,Toronto Public Library (High Park Branch),Bakery,American Restaurant,Pizza Place,Ice Cream Shop,BBQ Joint
4,Toronto Public Library (Merril Collection),Smoothie Shop,Coffee Shop,Vegetarian / Vegan Restaurant,Dessert Shop,Gaming Cafe
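
The ranking step can be sketched on a single row with `Series.nlargest`. The frequencies below are hypothetical. When many categories tie (for example at 0.1), which ones make the top five depends on how the sort breaks ties. That is one plausible reason the printed listing and the table above differ for the same library.

```python
import pandas as pd

# Hypothetical category frequencies for one library
row = pd.Series({'Library Name': 'Library A', 'Bar': 0.2, 'Café': 0.5, 'Park': 0.3})

# Drop the name, convert to numbers, and keep the n largest frequencies
freqs = row.iloc[1:].astype(float)
top_two = freqs.nlargest(2).index.tolist()
print(top_two)  # ['Café', 'Park']
```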


In [34]:
# set number of clusters
kclusters = 5

library_grouped_clustering = library_grouped.drop('Library Name', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(library_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 3, 2, 4, 1, 1, 2, 3, 1, 3], dtype=int32)
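
As a small illustration of what k-means does with category-frequency rows, the sketch below clusters four hypothetical libraries whose nearby venues are dominated by one of two categories. The data, the column meanings, and the choice of two clusters are assumptions. `n_init` is set explicitly so the call behaves the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical category-frequency rows; the columns could be Café, Bar, Bakery
toy_freq = np.array([
    [0.9, 0.1, 0.0],   # café-heavy
    [0.8, 0.2, 0.0],   # café-heavy
    [0.0, 0.1, 0.9],   # bakery-heavy
    [0.1, 0.0, 0.9],   # bakery-heavy
])

toy_kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(toy_freq)
print(toy_kmeans.labels_)
```

The label numbers themselves are arbitrary; only which rows share a label is meaningful.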

In [37]:
# add clustering labels

library_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
library_merged = TorPubLib1

# join the cluster labels and top venues onto the library table, which holds latitude/longitude for each library
library_merged = library_merged.join(library_venues_sorted.set_index('Library'), on='Name')

library_merged.head() 

Unnamed: 0,Name,Venue ID,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Toronto Public Library,4c8938c8944e224b52e72285,-79.383295,43.652631,2,Neighborhood,Electronics Store,Cosmetics Shop,Asian Restaurant,Monument / Landmark
1,Toronto Public Library - Toronto Reference Lib...,4b5f2e80f964a52088ab29e3,-79.386944,43.671795,1,Gourmet Shop,Hotel,Comic Shop,Italian Restaurant,Thai Restaurant
2,Toronto Public Library - Lillian H. Smith Branch,4ae6010ff964a520f7a321e3,-79.398372,43.658137,1,Smoothie Shop,Coffee Shop,Vegetarian / Vegan Restaurant,Dessert Shop,Gaming Cafe
3,Toronto Public Library - St. Lawrence Branch,4b51e5aff964a5203c5a27e3,-79.36833,43.650048,2,Coffee Shop,Café,Grocery Store,Bar,Restaurant
4,Toronto Public Library (St. James Town),4b807beef964a5209d7630e3,-79.374998,43.66879,2,Coffee Shop,Market,Diner,Restaurant,Caribbean Restaurant


In [38]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(library_merged['Latitude'], library_merged['Longitude'], library_merged['Name'], library_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [39]:
library_merged.loc[library_merged['Cluster Labels'] == 0, library_merged.columns[[0] + list(range(5, library_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
10,Toronto Public Library - Northern District Branch,Italian Restaurant,Bookstore,Deli / Bodega,Vietnamese Restaurant,Wine Bar
18,Toronto Public Library - Dufferin/St. Clair Br...,Italian Restaurant,Breakfast Spot,Brazilian Restaurant,Coffee Shop,Mexican Restaurant


In [40]:
library_merged.loc[library_merged['Cluster Labels'] == 1, library_merged.columns[[0] + list(range(5, library_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Toronto Public Library - Toronto Reference Lib...,Gourmet Shop,Hotel,Comic Shop,Italian Restaurant,Thai Restaurant
2,Toronto Public Library - Lillian H. Smith Branch,Smoothie Shop,Coffee Shop,Vegetarian / Vegan Restaurant,Dessert Shop,Gaming Cafe
5,Toronto Public Library - Palmerston Branch,Korean Restaurant,Spa,Health Food Store,Vegetarian / Vegan Restaurant,Dessert Shop
6,Toronto Public Library - College/Shaw Branch,Music Store,Breakfast Spot,Whisky Bar,Vegetarian / Vegan Restaurant,Nightclub
13,Toronto Public Library (Sanderson Branch),Filipino Restaurant,Art Gallery,Vegetarian / Vegan Restaurant,Park,Italian Restaurant
14,Toronto Public Library Bookmobile,Theater,Gym / Fitness Center,Movie Theater,Comedy Club,Mediterranean Restaurant
15,Toronto Public Library (Merril Collection),Smoothie Shop,Coffee Shop,Vegetarian / Vegan Restaurant,Dessert Shop,Gaming Cafe


In [41]:
library_merged.loc[library_merged['Cluster Labels'] == 2, library_merged.columns[[0] + list(range(5, library_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Toronto Public Library,Neighborhood,Electronics Store,Cosmetics Shop,Asian Restaurant,Monument / Landmark
3,Toronto Public Library - St. Lawrence Branch,Coffee Shop,Café,Grocery Store,Bar,Restaurant
4,Toronto Public Library (St. James Town),Coffee Shop,Market,Diner,Restaurant,Caribbean Restaurant
8,Toronto Public Library (Fort York Branch),Coffee Shop,Yoga Studio,Park,Diner,Ramen Restaurant


In [42]:
library_merged.loc[library_merged['Cluster Labels'] == 3, library_merged.columns[[0] + list(range(5, library_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
7,Toronto Public Library - Bloor Gladstone Branch,Café,Bar,Vegetarian / Vegan Restaurant,Mexican Restaurant,Coffee Shop
9,Toronto Public Library - Parkdale Branch,French Restaurant,Café,Arts & Crafts Store,Hawaiian Restaurant,BBQ Joint
12,Toronto Public Library - Deer Park Branch,Café,Yoga Studio,Tea Room,Bagel Shop,Chiropractor
16,Toronto Public Library (Annette Street),Bar,Café,Grocery Store,Arts & Crafts Store,Italian Restaurant


In [43]:
library_merged.loc[library_merged['Cluster Labels'] == 4, library_merged.columns[[0] + list(range(5, library_merged.shape[1]))]]

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
11,Toronto Public Library - Pape/Danforth Branch,Pizza Place,Greek Restaurant,Diner,Turkish Restaurant,BBQ Joint
17,Toronto Public Library (High Park Branch),Bakery,American Restaurant,Pizza Place,Ice Cream Shop,BBQ Joint
19,Toronto Public Library - Wychwood Branch,Dessert Shop,Ice Cream Shop,Yoga Studio,Pharmacy,Caribbean Restaurant


## Methodology <a name="methodology"></a>

As mentioned earlier, the objective is to determine which libraries need more foot traffic and which venues could help boost visits to them. Several factors are necessary for libraries to succeed as economic and social stimuli for their communities, so we deal mostly with objective data and with some subjective judgment. The key venues should span different categories so that they attract people with varied interests, which is an essential factor in a library's success.

The study was limited to a 5 km radius from the geographical center of Toronto and to 20 libraries. About 30% of the libraries have no rating, which may suggest that people are not enthusiastic about or interested in those libraries. In Foursquare terms, there is not enough sentiment data to calculate a rating. Rather than exclude these libraries, I included them and classified them as low foot traffic (they appear with a rating of 0 in the data section). We review the other 70% of libraries, which have ratings, and from these identify the establishments that could help.

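We used k-means clustering with five clusters to identify similarities among the libraries, based on how often each venue category appears among the top venues near each library.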


## Results <a name="results"></a>

Around 30% of the libraries have no rating on Foursquare. The results in the data section show the categories of popular venues within 500 meters of each library. Among libraries with ratings, a large share of the nearby establishments are cafés or coffee shops, followed by restaurants. Italian restaurants appear frequently around these libraries. It is also interesting that bars are found near these libraries.

Furthermore, the top venues for libraries with no rating are much the same. The top categories are nearly identical for libraries with and without ratings (Café, Italian Restaurant, Restaurant, and Bar).

K-means clustering also reinforced this commonality among libraries. The clusters it produced include both rated and unrated libraries.
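
One way to check how rated and unrated libraries are distributed across clusters is a cross-tabulation. The sketch below uses hypothetical labels and ratings, not the notebook's results. It treats a rating of 0 as "no rating", following the convention in the data section.

```python
import pandas as pd

# Hypothetical cluster labels and ratings (0 stands for "no rating")
toy = pd.DataFrame({
    'Cluster Labels': [0, 0, 1, 1, 1, 2],
    'Rating': [8.2, 0.0, 7.2, 0.0, 9.1, 5.5],
})
toy['Status'] = toy['Rating'].gt(0).map({True: 'rated', False: 'unrated'})

# Rows are clusters; columns count rated and unrated libraries in each cluster
mix = pd.crosstab(toy['Cluster Labels'], toy['Status'])
print(mix)

# Share of libraries without a rating
unrated_share = (toy['Rating'] == 0).mean()
print(unrated_share)
```

A cluster whose row has counts in both columns mixes rated and unrated libraries, which is the pattern described above.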


## Discussion <a name="discussion"></a>

There are venue similarities among these libraries regardless of whether they are rated. This was shown by the generated results and by the k-means clustering. Only one cluster stood apart, and it still showed restaurants, although with more emphasis on foreign cuisines. It may also be that these libraries are reasonably close to one another and therefore share the same top venues. This “colocation” may explain why some of them have the same top venues.

As a side note, I reran the clustering, so the results in this notebook are not exactly the same as those in the presentation and the report. The overall idea is still the same.
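
The colocation idea can be checked with a great-circle distance between two libraries. Two 500-meter search circles can only share venues when their centers are less than 1000 meters apart. The sketch below uses the haversine formula with coordinates taken from the library table in the data section. The function name `haversine_m` and the variable names are illustrative.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    earth_radius_m = 6371000  # mean Earth radius
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

# Coordinates from the library table in the data section
tpl_entry = (43.652631, -79.383295)        # "Toronto Public Library"
lillian_smith = (43.658137, -79.398372)    # Lillian H. Smith Branch

distance = haversine_m(*tpl_entry, *lillian_smith)
search_radius_m = 500

# The search circles overlap only if the centers are under twice the radius apart
print(round(distance), distance < 2 * search_radius_m)
```

Running this over every pair of libraries would show which ones are close enough to draw from the same pool of venues.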


## Conclusion <a name="conclusion"></a>

Based on these observations, I conclude that foot traffic to libraries is not greatly influenced by the establishments around them.

The study started out with the idea of finding which venues would help establish libraries as a 3rd place: venues that can help stimulate economic growth and, at the same time, make the library a social center that strengthens the community.

The results of the study indicate that the venues near the libraries do not contribute extensively to their foot traffic. Other factors would then be what encourages foot traffic to libraries. These could include proximity to schools, transportation stations, or government institutions, but that would be a subject for another study.

