<h1 style="color:red;"> Capstone Project: Battle of Neighborhoods</h1>

### Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">
<ol>
    <li><a href="#1.-Introduction---Business-Problem">Introduction - Business Problem</a>        
    <li><a href="#2.Data">Data</a>        
    <li><a href="#3.-Methodology">Methodology</a>
</ol>
</div>

<h2 style="color:blue;">1. <u>Introduction - Business Problem</u> </h2>
<h3>Opening a Yoga Studio</h3>

<i>I want to open a new Yoga Studio in Toronto. Where should I open it? </i>
This is one of the questions that an entrepreneur would need to answer before starting his or her business. Location is key for a new business and one that requires careful research. In this capstone project, I will attempt to solve this problem through data analysis

<h2 style="color:blue;">2.<u>Data</u></h2>

To solve the problem, we will need data from various sources: 
<ul>
    <li><b>Foursquare</b>: we will be leveraging the existing venues for each neighborhood in order to analyze how many yoga studios are in each neighborhood. This will be used to assess the competition in each neighborhood</li>
    <li><b>Demographics</b>: age and sex. Since yoga studios tend to be popular among women, we will be taking this into consideration in our analysis. We will be looking at the number of female population for each neighborhood. This data will be retrieved from the Open Data section on the website of the City of Toronto. </li>
    <li><b>Average income per neighborhood</b>: It's important to know the purchasing power of the residents of each neighborhood before establishing a business. This will also be retrieved from the City of Toronto website.</li?
    
</ul>
Contains information licensed under the Open Government Licence – Toronto. Link: https://open.toronto.ca/open-data-license/

<h2 style="color:blue;">3. <u>Methodology</u></h2>

### Neighborhood Data Profile
In this section we will be building the Neighborhood data profile which is composed of Demographics (Age, sex), number of population in each neighborhood and the average income. 

#### Data Cleaning and Wrangling

In [62]:
import pandas as pd
import numpy as np
import wget
import requests
import csv
import geocoder
import geopy
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium
import json 
print("Libraries imported")

Libraries imported


Let's import our source file: 

In [2]:
csvfile = "C:/Users/nirin/OneDrive/Documents/Capstone Data/neighbourhood-profiles-2016.csv"
dfnp = pd.read_csv(csvfile,
                 header=0,
                 delimiter=',',                 
                 quotechar='"',
                 error_bad_lines=False,
                 engine='python')
dfnp.shape

(2383, 146)

In [3]:
dfnp.tail()

Unnamed: 0,_id,Category,Topic,Data Source,Characteristic,City of Toronto,Agincourt North,Agincourt South-Malvern West,Alderwood,Annex,...,Willowdale West,Willowridge-Martingrove-Richview,Woburn,Woodbine Corridor,Woodbine-Lumsden,Wychwood,Yonge-Eglinton,Yonge-St.Clair,York University Heights,Yorkdale-Glen Park
2378,2379,Mobility,Mobility status - Place of residence 5 years ago,Census Profile 98-316-X2016001,Migrants,400950,3170,3145,925,6390,...,3765,2270,7260,985,620,1350,2425,2310,4965,1345
2379,2380,Mobility,Mobility status - Place of residence 5 years ago,Census Profile 98-316-X2016001,Internal migrants,184120,880,980,680,3930,...,1545,1110,1720,610,395,780,1260,1355,1700,580
2380,2381,Mobility,Mobility status - Place of residence 5 years ago,Census Profile 98-316-X2016001,Intraprovincial migrants,141135,735,760,615,2630,...,1070,960,1400,350,320,570,970,1025,1490,445
2381,2382,Mobility,Mobility status - Place of residence 5 years ago,Census Profile 98-316-X2016001,Interprovincial migrants,42985,135,220,70,1310,...,475,150,335,250,85,210,290,325,195,135
2382,2383,Mobility,Mobility status - Place of residence 5 years ago,Census Profile 98-316-X2016001,External migrants,216835,2280,2170,245,2460,...,2220,1175,5540,395,220,575,1160,955,3285,775


Let's drop the rows based on the value of the Topic and Categories since not all of the information will be useful

In [4]:
dfnp = dfnp.drop(dfnp[(dfnp["Category"].isin(['Aboriginal peoples',
                              'Ethnic origin',
                              'Housing',
                              'Immigration and citizenship',
                              'Journey to work',
                              'Language',
                              'Mobility',
                              'Neighbourhood Information',
                              'Visible Minority',
                              'Language of work'
                             ]))].index)
dfnp.shape

(709, 146)

In [5]:
dfnp = dfnp.drop(dfnp[(dfnp["Topic"].isin([
    'Income sources',
    'Household and dwelling characteristics',
    'Family characteristics',
    'Household type',
    'Family characteristics of adults',
    'Income of households in 2015',
    'Income of economic families in 2015',
    'Low income in 2015',
    'Major field of study - Classification of Instructional Programs (CIP) 2016',
    'Location of study compared with province or territory of residence with countries outside Canada',
    'Work activity during the reference year',
    'Class of worker',
    'Occupation - National Occupational Classification (NOC) 2016',
    'Industry - North American Industry Classification System (NAICS) 2012',
    'Place of work status',
    'Income taxes',
    'Highest certificate, diploma or degree',
    'Population and dwellings',
    'Visible minority population'
        
]))].index)
dfnp.shape

(161, 146)

In [6]:
dfnp = dfnp.drop(dfnp[(dfnp["Characteristic"].isin([
    'Children (0-14 years)',
'Youth (15-24 years)',
'Working Age (25-54 years)',
'Pre-retirement (55-64 years)',
'Seniors (65+ years)',
'Older Seniors (85+ years)',
'Marital status for the population aged 15 years and over',
'Total - Income statistics in 2015 for the population aged 15 years and over in private households',
'Number of total income recipients aged 15 years and over in private households',
'Median total income in 2015 among recipients ($)',
'Number of after-tax income recipients aged 15 years and over in private households - 100% data',
'Median after-tax income in 2015 among recipients ($)',
'Number of market income recipients aged 15 years and over in private households - 100% data',
'Median market income in 2015 among recipients ($)',
'Number of government transfers recipients aged 15 years and over in private households - 100% data',
'Median government transfers in 2015 among recipients ($)',
'Number of employment income recipients aged 15 years and over in private households - 100% data',
'Median employment income in 2015 among recipients ($)',
'Total - Income statistics in 2015 for the population aged 15 years and over in private households - 25% sample data',
'Number of total income recipients aged 15 years and over in private households - 25% sample data',
'Average total income in 2015 among recipients ($)',
'Number of after-tax income recipients aged 15 years and over in private households - 25% sample data',
'Average after-tax income in 2015 among recipients ($)',
'Number of market income recipients aged 15 years and over in private households - 25% sample data',
'Average market income in 2015 among recipients ($)',
'Number of government transfers recipients aged 15 years and over in private households - 25% sample data',
'Average government transfers in 2015 among recipients ($)',
'Number of employment income recipients aged 15 years and over in private households - 25% sample data',
'Average employment income in 2015 among recipients ($)',
'Total - Employment income statistics for the population aged 15 years and over in private households - 25% sample data',
'Number of employment income recipients aged 15 years and over in private households who worked full year full time in 2015 - 25% sample data',
'Median employment income in 2015 for full-year full-time workers ($)',
'Average employment income in 2015 for full-year full-time workers ($)',
'Composition of total income in 2015 of the population aged 15 years and over in private households (%) - 100% data',
'Market income (%)',
'Employment income (%)',
'Government transfers (%)',
'Total - Population aged 15 years and over by Labour force status - 25% sample data'    
]))].index)
                      
dfnp.shape

(146, 146)

In [7]:
dfnp.dropna(subset = ["Annex"], inplace=True)
dfnp = dfnp.drop(dfnp[(dfnp["_id"].isin(['946','947','948','949','950','952','953','954','959','960','961','962','963','964','966','967','968','969','975','976','977','978','979','980','981','982','983','984','985','986','987','988','989','990','991','992','993','994','995','996','997','998','999','1000','1001','1002','1003','1004','1005','1006','1007','1008','1009','1010','1011','1012','1013','1014','1016','1017','1892','1893','1894','1895','1896','1897','1899','1900','1901','1902','1903','1904','1905','1907'
]))].index)
dfnp = dfnp.drop(labels=['Data Source','Category','City of Toronto','_id'],axis=1)


Now that our data is clean, let's create data frames based on the Topic and Visualize them

### Age - Sex

In [8]:
df_age = dfnp[(dfnp["Topic"]=="Age characteristics")]
df_age=df_age.drop(labels = ['Topic'],axis = 1)
df_age.set_index('Characteristic', inplace=True)
df_age = df_age.transpose()
df_age.head()

Characteristic,Male: 0 to 04 years,Male: 05 to 09 years,Male: 10 to 14 years,Male: 15 to 19 years,Male: 20 to 24 years,Male: 25 to 29 years,Male: 30 to 34 years,Male: 35 to 39 years,Male: 40 to 44 years,Male: 45 to 49 years,...,Female: 55 to 59 years,Female: 60 to 64 years,Female: 65 to 69 years,Female: 70 to 74 years,Female: 75 to 79 years,Female: 80 to 84 years,Female: 85 to 89 years,Female: 90 to 94 years,Female: 95 to 99 years,Female: 100 years and over
Agincourt North,660,695,660,840,1015,1015,835,680,760,890,...,1165,1070,985,690,575,485,350,160,60,10
Agincourt South-Malvern West,575,540,460,780,1000,1045,820,625,610,760,...,915,795,690,450,405,350,205,100,20,0
Alderwood,360,270,225,285,355,355,410,455,420,440,...,485,400,325,210,180,210,130,70,5,5
Annex,445,365,325,465,1215,2080,1610,1055,835,850,...,915,940,950,700,565,425,345,260,90,25
Banbury-Don Mills,570,660,675,715,700,645,735,735,815,1010,...,1005,895,955,790,730,650,615,360,105,20


Let's store the values below as the percentage of total. 

In [9]:
df_age[["Male: 0 to 04 years","Male: 05 to 09 years","Male: 10 to 14 years","Male: 15 to 19 years","Male: 20 to 24 years","Male: 25 to 29 years","Male: 30 to 34 years","Male: 35 to 39 years","Male: 40 to 44 years","Male: 45 to 49 years","Male: 50 to 54 years","Male: 55 to 59 years","Male: 60 to 64 years","Male: 65 to 69 years","Male: 70 to 74 years","Male: 75 to 79 years","Female: 10 to 14 years","Male: 80 to 84 years","Male: 85 to 89 years","Male: 90 to 94 years","Male: 95 to 99 years","Male: 100 years and over","Female: 0 to 04 years","Female: 05 to 09 years","Female: 15 to 19 years","Female: 20 to 24 years","Female: 25 to 29 years","Female: 30 to 34 years","Female: 35 to 39 years","Female: 40 to 44 years","Female: 45 to 49 years","Female: 50 to 54 years","Female: 55 to 59 years","Female: 60 to 64 years","Female: 65 to 69 years","Female: 70 to 74 years","Female: 75 to 79 years","Female: 80 to 84 years","Female: 85 to 89 years","Female: 90 to 94 years","Female: 95 to 99 years","Female: 100 years and over"]]= df_age[["Male: 0 to 04 years","Male: 05 to 09 years","Male: 10 to 14 years","Male: 15 to 19 years","Male: 20 to 24 years","Male: 25 to 29 years","Male: 30 to 34 years","Male: 35 to 39 years","Male: 40 to 44 years","Male: 45 to 49 years","Male: 50 to 54 years","Male: 55 to 59 years","Male: 60 to 64 years","Male: 65 to 69 years","Male: 70 to 74 years","Male: 75 to 79 years","Female: 10 to 14 years","Male: 80 to 84 years","Male: 85 to 89 years","Male: 90 to 94 years","Male: 95 to 99 years","Male: 100 years and over","Female: 0 to 04 years","Female: 05 to 09 years","Female: 15 to 19 years","Female: 20 to 24 years","Female: 25 to 29 years","Female: 30 to 34 years","Female: 35 to 39 years","Female: 40 to 44 years","Female: 45 to 49 years","Female: 50 to 54 years","Female: 55 to 59 years","Female: 60 to 64 years","Female: 65 to 69 years","Female: 70 to 74 years","Female: 75 to 79 years","Female: 80 to 84 years","Female: 85 to 89 years","Female: 90 to 94 years","Female: 95 to 99 years","Female: 100 years and over"]].astype("float")

In [10]:
ageSum = df_age.to_numpy().sum()
print("The total of all population is{}", ageSum)

The total of all population is{} 2731015.0


Let's create a new dataframe that has the total of male and female of age 15 to 50 years old. These are our target of interest. 

In [11]:
newcol = ['Active Male', 'Active Female']
activePop = pd.DataFrame(columns=newcol)
activePop.shape

(0, 2)

In [12]:
activePop["Active Male"] = df_age["Male: 15 to 19 years"] + df_age["Male: 20 to 24 years"] + df_age["Male: 25 to 29 years"] + df_age["Male: 30 to 34 years"] + df_age["Male: 35 to 39 years"] + df_age["Male: 40 to 44 years"] + df_age["Male: 45 to 49 years"]

In [13]:
activePop["Active Female"] = df_age["Female: 15 to 19 years"] + df_age["Female: 20 to 24 years"] + df_age["Female: 25 to 29 years"] + df_age["Female: 30 to 34 years"] + df_age["Female: 35 to 39 years"] + df_age["Female: 40 to 44 years"] + df_age["Female: 45 to 49 years"]

In [14]:
activeSum = activePop.to_numpy().sum()
print("The total of all population is{}",activeSum)

The total of all population is{} 1367135.0


In [15]:
#Getting the total of all active population
activePop = activePop/1367135

#resetting the index and renaming the old index 
activePop.reset_index(level = 0, inplace=True)
activePop = activePop.rename(columns = {'index': 'Neighborhood'})
activePop.head()

Unnamed: 0,Neighborhood,Active Male,Active Female
0,Agincourt North,0.004414,0.004773
1,Agincourt South-Malvern West,0.004125,0.004199
2,Alderwood,0.00199,0.001979
3,Annex,0.005932,0.006451
4,Banbury-Don Mills,0.003917,0.004356


In [16]:
activePop.to_csv('PopgenderbyNeighborhood.csv')

### Income in each Neighborhood

In [17]:
df_income = dfnp[(dfnp["Topic"]=="Income of individuals in 2015")]
df_income=df_income.drop(labels = ['Topic'],axis = 1)
df_income.set_index('Characteristic', inplace=True)
df_income = df_income.transpose()

In [18]:
cols = [5,6,9]
df_income = df_income.drop(df_income.columns[cols],axis=1)


In [19]:
df_income.rename(columns={'    $60,000 to $69,999':'$60k to $69,999',
                          '    $70,000 to $79,999':'$70k to $79,999',
                          '    $80,000 to $89,999':'$80k to $89,999',
                          '    $90,000 to $99,999':'$90k to $99,999',
                          '      $150,000 and over':'$150k and over',
                          '      $100,000 to $149,999':'$100k to $149,999',
                          '    $30,000 to $39,999':'$30k to $39,999',
                          '    $40,000 to $49,999':'$40k to $49,999',
                          '    $50,000 to $59,999':'$50k to $59,999'},inplace=True)
df_income.shape

(140, 9)

In [20]:
df_income['$60k to $69,999']= df_income['$60k to $69,999'].str.replace(',', '')
df_income['$70k to $79,999']= df_income['$70k to $79,999'].str.replace(',', '')
df_income['$80k to $89,999']= df_income['$80k to $89,999'].str.replace(',', '')
df_income['$90k to $99,999']= df_income['$90k to $99,999'].str.replace(',', '')
df_income['$150k and over'] = df_income['$150k and over'].str.replace(',', '')
df_income['$100k to $149,999']= df_income['$100k to $149,999'].str.replace(',', '')
df_income['$30k to $39,999']= df_income['$30k to $39,999'].str.replace(',', '')
df_income['$40k to $49,999']= df_income['$40k to $49,999'].str.replace(',', '')
df_income['$50k to $59,999']= df_income['$50k to $59,999'].str.replace(',', '')

In [21]:
df_income[["$60k to $69,999","$70k to $79,999","$80k to $89,999","$90k to $99,999","$150k and over","$100k to $149,999","$30k to $39,999","$40k to $49,999","$50k to $59,999"]] = df_income[["$60k to $69,999","$70k to $79,999","$80k to $89,999","$90k to $99,999","$150k and over","$100k to $149,999","$30k to $39,999","$40k to $49,999","$50k to $59,999"]].astype("float")

Let's get the total row for each neighborhood and store the income brackets as percentages of that total

In [22]:
#get the total of each neighborhood
df_income['RowTotal'] = df_income.sum(axis=1)

#Divide each neighborhood bracket by each row total
df_income2 = df_income.div(df_income['RowTotal'],axis = 0)
df_income2.reset_index(inplace=True)

#renaming the old index
df_income2 = df_income2.rename(columns = {'index' : 'Neighborhood'})
df_income2.head()

Characteristic,Neighborhood,"$60k to $69,999","$70k to $79,999","$80k to $89,999","$90k to $99,999",$150k and over,"$100k to $149,999","$30k to $39,999","$40k to $49,999","$50k to $59,999",RowTotal
0,Agincourt North,0.100465,0.076074,0.050523,0.042393,0.015679,0.061556,0.286295,0.220093,0.146922,1.0
1,Agincourt South-Malvern West,0.109416,0.075597,0.057692,0.041777,0.021883,0.069629,0.267905,0.206897,0.149204,1.0
2,Alderwood,0.121053,0.092982,0.069298,0.064912,0.039474,0.108772,0.192105,0.166667,0.144737,1.0
3,Annex,0.0938,0.082878,0.064247,0.053325,0.196274,0.1407,0.13813,0.124317,0.106328,1.0
4,Banbury-Don Mills,0.104357,0.089345,0.070304,0.060051,0.119736,0.14903,0.145002,0.140242,0.121933,1.0


In [23]:
df_income2.to_csv('IncomebyNeighborhood.csv')

### Adding the geolocation data

We will use the data from the previous assignment to retrieve the zip codes and then the geolocation data for each neighborhood. We will not use Borough in this section

In [24]:
#let's open the csv file
csvfile = "C:/Users/nirin/Downloads/output.csv"
neighpostal = pd.read_csv(csvfile,
                 header=0,
                 delimiter='  ',                 
                 quotechar='"',
                 error_bad_lines=False,
                 engine='python')
neighpostal.shape

(180, 3)

In [25]:
neighpostal.drop(neighpostal[neighpostal['Borough'] =='Not assigned'].index, inplace = True)
print("removed rows with Not assigned Borough")

removed rows with Not assigned Borough


In [26]:
neighpostal.shape

(103, 3)

In [27]:
neighpostal.tail(50)

Unnamed: 0,Postal Code,Borough,Neighbourhood
83,M3M,North York,"Downsview,Downsview-Roding-CFB"
84,M4M,East Toronto,Studio District
85,M5M,North York,"Bedford Park-Nortown,Lawrence Manor East"
86,M6M,York,"Del Ray,Mount Dennis,Keelsdale and Silverthorn..."
89,M9M,North York,"Humberlea,Emery,Humbermede,Pelmo Park-Humberlea"
90,M1N,Scarborough,Birchcliffe-Cliffside
91,M2N,North York,"Willowdale,Willowdale East,Lansing-Westgate"
92,M3N,North York,Downsview
93,M4N,Central Toronto,"Lawrence Park North,Lawrence Park South"
94,M5N,Central Toronto,Roselawn


In [28]:
#Importing the geographical data
geodf = pd.read_csv("C:/Users/nirin/Geospatial_Coordinates.csv",                   
                 header=0,
                 delimiter=",",                 
                 error_bad_lines=False,
                 engine='python'
                   )
geodf.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [29]:
dfmerge = neighpostal.merge(geodf)
dfmerge.tail(5)

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
98,M8X,Etobicoke,"The Kingsway,Montgomery Road,Old Mill North,Ki...",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
100,M7Y,East Toronto,"Business reply mail Processing Centre,South Ce...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South,King's Mill Park,Sunnylea,Humbe...",43.636258,-79.498509
102,M8Z,Etobicoke,"Mimico NW,The Queensway West,South of Bloor,Ki...",43.628841,-79.520999


In [30]:
dfmerge.rename(columns = {'Neighbourhood' : 'Neighborhood' }, inplace = True)

In [31]:
dfmerge.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,"Parkwoods,Parkwoods-Donalda",43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park,Harbourfront,Moss Park,Mount Pleas...",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor,Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park,Ontario Provincial Government",43.662301,-79.389494


In [32]:
dfmerge['Neighborhood'] = dfmerge['Neighborhood'].str.split(',')

In [33]:
dfmerge = dfmerge.explode('Neighborhood')
dfmerge.shape

Let's combine the demographics data with the geographical data

In [36]:
activePop.shape

(140, 3)

In [37]:
activePopGeo = activePop.merge(dfmerge)
activePopGeo.shape

In [123]:
activePopGeo.tail(15)

Unnamed: 0,Neighborhood,Active Male,Active Female,Postal Code,Borough,Latitude,Longitude
123,Westminster-Branson,0.00421,0.004765,M3H,North York,43.754328,-79.442259
124,Weston,0.003003,0.003262,M9N,York,43.706876,-79.518188
125,Weston-Pelham Park,0.002129,0.00211,M6N,York,43.673185,-79.487262
126,Wexford/Maryvale,0.004648,0.00463,M1R,Scarborough,43.750072,-79.295849
127,Willowdale East,0.010153,0.011316,M2N,North York,43.77012,-79.408493
128,Willowdale West,0.003105,0.003164,M2R,North York,43.782736,-79.442259
129,Willowridge-Martingrove-Richview,0.003328,0.003398,M9R,Etobicoke,43.688905,-79.554724
130,Woburn,0.009297,0.009604,M1G,Scarborough,43.770992,-79.216917
131,Woodbine Corridor,0.002187,0.002341,M4C,East York,43.695344,-79.318389
132,Woodbine-Lumsden,0.00135,0.001404,M4C,East York,43.695344,-79.318389


## Using Foursquare API

In this section, let's use the Foursquare API to get a list of the yoga studios around toronto for each neighborhood

In [93]:
import random # library for random number generation
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize

In [127]:
CLIENT_ID = 'NGBR4FWG0IAT0DHXIBKYXI2E1JI2LEJMEL3L1YA1RW4CSLXG'
CLIENT_SECRET = '23BDE00ECXWL0IEIDFLXDMQIK0PERQTG1MSKSKUOGBPAQSRJ' 
VERSION = '20180323'
#VERSION = '20190425'
LIMIT = 20
print('My credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

My credentials:
CLIENT_ID: NGBR4FWG0IAT0DHXIBKYXI2E1JI2LEJMEL3L1YA1RW4CSLXG
CLIENT_SECRET:23BDE00ECXWL0IEIDFLXDMQIK0PERQTG1MSKSKUOGBPAQSRJ


In [124]:
search_query = 'Yoga'
radius = 500
neighb = 'Yonge-Eglinton'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(neighb)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

43.7067479 -79.3983271


In [125]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, latitude, longitude, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=NGBR4FWG0IAT0DHXIBKYXI2E1JI2LEJMEL3L1YA1RW4CSLXG&client_secret=23BDE00ECXWL0IEIDFLXDMQIK0PERQTG1MSKSKUOGBPAQSRJ&v=20180323&ll=43.7067479,-79.3983271&query=Yoga&radius=500&limit=10'

In [160]:
results = requests.get(url).json()
print(results['response'].keys())
results['response']['venues']

dict_keys(['venues'])


[{'id': '4efa4c786c25c411edcaf286',
  'name': 'Yoga Tree Midtown',
  'contact': {},
  'location': {'address': '40 Eglinton Ave. E',
   'crossStreet': 'at Yonge St.',
   'lat': 43.70764167668336,
   'lng': -79.3974716381371,
   'labeledLatLngs': [{'label': 'display',
     'lat': 43.70764167668336,
     'lng': -79.3974716381371}],
   'distance': 120,
   'postalCode': 'M4P 3A2',
   'cc': 'CA',
   'city': 'Toronto',
   'state': 'ON',
   'country': 'Canada',
   'formattedAddress': ['40 Eglinton Ave. E (at Yonge St.)',
    'Toronto ON M4P 3A2',
    'Canada']},
  'categories': [{'id': '4bf58dd8d48988d102941735',
    'name': 'Yoga Studio',
    'pluralName': 'Yoga Studios',
    'shortName': 'Yoga Studio',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/gym_yogastudio_',
     'suffix': '.png'},
    'primary': True}],
  'verified': True,
  'stats': {'tipCount': 0,
   'usersCount': 0,
   'checkinsCount': 0,
   'visitsCount': 0},
  'beenHere': {'count': 0,
   'lastCheckinExpire

In [179]:
  yoga_venues = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']

In [184]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]

    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}query=yoga&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()['response'][0]
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venues']['name'], 
            v['venues']['location']['lat'], 
            v['venues']['location']['lng'],  
            v['venues']['categories'][0]['name']) for v in results])
    
    yoga_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    yoga_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(yoga_venues)

In [185]:
yoga_venues = getNearbyVenues(names=dfmerge['Neighborhood'],
                                  latitudes=dfmerge['Latitude'],
                                   longitudes=dfmerge['Longitude']
                                  )

Parkwoods


KeyError: 0

In [183]:
yoga_venues.head()

AttributeError: 'list' object has no attribute 'head'

<h2 style="color:blue;">4. <u>Results</u></h2
    

<h2 style="color:blue;">5. <u>Discussion</u></h2>

<h2 style="color:blue;">6. <u>Conclusion</u><h2>