# Capstone Project

# Table of Contents
1. [Introduction](#Introduction)
2. [Data](#Data)
3. [Methodology](#Methodology)
4. [Results](#Results)
5. [Discussion](#Discussion)
6. [Conclusion](#Conclusion)
7. [Coding](#Coding)

## Introduction

#### São Paulo (SP) is the largest city in Latin America, with 12,252,023 inhabitants according to Wikipedia. SP is also the heart of the Brazilian economy, hosting key parts of the country's financial and industrial systems. It is one of the most international hubs of the continent, shaped by successive waves of immigrants of different nationalities and cultures that enrich the city's ethos, and it is the main international entry point to Latin America.
#### Altogether, this makes SP one of the most important cities in Brazil, but also one of the most susceptible to the contagion of international crises, such as financial crashes and health emergencies. In the current COVID-19 pandemic, SP is the epicentre of transmission in Brazil and, as a result, the country's most acute health risk. Estimates point to several hundred thousand people infected and tens of thousands in need of hospital services.
#### SP had around 29,000 hospital beds as of 2018 (municipality records), an average of 2.4688 beds per 1,000 inhabitants. Even though this is only slightly lower than the target, these facilities are not evenly distributed across the city, and understanding where the needs are is critical for COVID-19 preparedness efforts.
#### As a result, the city has been making various preparations for the epidemic peak expected in the next month, including the installation of temporary medical facilities to support the existing system. This project will support these decisions by:
#### 1. Identifying the profile of existing hospitals in SP,
#### 2. Identifying the profile of the subprefectures,
#### 3. Comparing both with socioeconomic data per region of the city,
#### 4. Suggesting the locations with the greatest need for temporary augmentation.



## Data

#### Several datasets are needed. SP is divided into subprefectures, the main internal administrative division of the city, which can be loosely compared with Toronto's boroughs. Several datasets are published according to this structure, and SP maintains open-access databases and indicators that can be used for this exercise. The following sources will be used:
#### 1. The subprefecture population and area were collected from the municipality:
#### https://www.prefeitura.sp.gov.br/cidade/secretarias/subprefeituras/subprefeituras/dados_demograficos/index.php?p=12758
#### 2. The list of health units and their number of beds was collected from the municipal health secretariat of SP:
#### http://dados.prefeitura.sp.gov.br/dataset/f944b957-2193-48a4-9068-72a26e6ee577/resource/fd72d932-fc65-43cc-a74f-f309225f74e8/download/deinfosacadsau2014.csv
#### 3. The subprefecture shapefile was collected from the municipality:
#### http://geosampa.prefeitura.sp.gov.br/PaginasPublicas/downloadIfr.aspx?orig=DownloadCamadas&arq=01_Limites%20Administrativos%5C%5CSubprefeituras%5C%5CShapefile%5C%5CSIRGAS_SHP_subprefeitura&arqTipo=Shapefile
#### 4. The socioeconomic data was collected from the municipality; the notebook loads it from a copy hosted at:
#### https://raw.githubusercontent.com/otcosta/Coursera_Capstone/master/DEINFO_IDH_UDH_2000_2010_Dados.csv



## Methodology

## Results

## Discussion

## Coding

## Package Installation

In [2]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library
import random
print('Libraries imported.')

In [3]:
from IPython.display import Image 
from IPython.core.display import HTML 
import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML pages

## Data Preparation

In [4]:
# Add subprefecture data
req = requests.get("https://www.prefeitura.sp.gov.br/cidade/secretarias/subprefeituras/subprefeituras/dados_demograficos/index.php?p=12758")

soup = BeautifulSoup(req.content,'lxml')

table = soup.find_all('table')[0]

df = pd.read_html(str(table))

sub_pop=pd.DataFrame(df[0])

sub_pop.head(94)

Unnamed: 0,Subprefeituras,Distritos,Área (km²),População (2010),Densidade Demográfica (Hab/km²)
0,Aricanduva,Aricanduva,660.0,89.622,13.579
1,Aricanduva,Carrão,750.0,83.281,11.104
2,Aricanduva,Vila Formosa,740.0,94.799,12.811
3,Aricanduva,TOTAL,2150.0,267.702,12.451
4,Butantã,Butantã,1250.0,54.196,4.336
5,Butantã,Morumbi,1140.0,46.957,4.119
6,Butantã,Raposo Tavares,1260.0,100.164,7.95
7,Butantã,Rio Pequeno,970.0,118.459,12.212
8,Butantã,Vila Sônia,990.0,108.441,10.954
9,Butantã,TOTAL,5610.0,428.217,7.633
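The values above look like they were parsed with the wrong number locale: Brazilian sources write `267.702` for 267,702 and `21,50` for 21.50, so the population appears a thousand times too small and the area a hundred times too large. The sketch below uses plain Python to show the intended conversion and checks it against the density column. The raw strings are assumed source formatting, not values read from the page, and `parse_br_number` is a hypothetical helper. In the notebook itself, `pd.read_html` accepts `thousands` and `decimal` arguments that may achieve the same result.

```python
def parse_br_number(text):
    """Convert a Brazilian-formatted number string to float.

    '.' is treated as the thousands separator and ',' as the decimal mark.
    """
    cleaned = text.strip().replace('.', '').replace(',', '.')
    return float(cleaned)


# Assumed raw cell strings for the Aricanduva TOTAL row (hypothetical formatting).
area_km2 = parse_br_number('21,50')
population = parse_br_number('267.702')
density = parse_br_number('12.451')

# Consistency check: the published density should be close to population / area.
computed_density = population / area_km2
print(area_km2, population, round(computed_density))
```

If the computed density matches the published one, the locale interpretation is probably right, and the same conversion would apply to every numeric column of the table.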


In [5]:
print(sub_pop.shape)
sub_pop['Subprefeituras'].unique()

(126, 5)


array(['Aricanduva', 'Butantã', 'Campo Limpo', 'Capela do Socorro',
       'Casa Verde', 'Cidade Ademar', 'Cidade Tiradentes',
       'Ermelino Matarazzo', 'Freguesia do Ó', 'Guaianases', 'Ipiranga',
       'Itaim Paulista', 'Itaquera', 'Jabaquara', 'Jaçanã', 'Lapa',
       "M'Boi Mirim", 'Mooca', 'Parelheiros', 'Penha', 'Perus',
       'Pinheiros', 'Pirituba', 'Santana', 'Santo Amaro', 'São Mateus',
       'São Miguel', 'Sapopemba', 'Sé', 'Vila Maria/Vila Guilherme',
       'Vila Mariana', 'Vila Prudente'], dtype=object)

In [6]:
sub_pop.dtypes

Subprefeituras                      object
Distritos                           object
Área (km²)                         float64
População (2010)                   float64
Densidade Demográfica (Hab/km²)    float64
dtype: object

In [7]:
# The source table includes a TOTAL row per subprefecture; keep only those rows.
# Three subprefectures (Cidade Tiradentes, Sapopemba, Jabaquara) have no TOTAL row:
# their only district row carries the subprefecture's own name, so it is kept instead.
single_district = ['Cidade Tiradentes', 'Sapopemba', 'Jabaquara']
sub_pop = sub_pop[(sub_pop['Distritos'] == 'TOTAL') | sub_pop['Distritos'].isin(single_district)].copy()
sub_pop = sub_pop.sort_values(by=['Subprefeituras'])
print(sub_pop.count())
sub_pop.head(32)

Subprefeituras                     32
Distritos                          32
Área (km²)                         32
População (2010)                   32
Densidade Demográfica (Hab/km²)    32
dtype: int64


Unnamed: 0,Subprefeituras,Distritos,Área (km²),População (2010),Densidade Demográfica (Hab/km²)
3,Aricanduva,TOTAL,2150.0,267.702,12.451
9,Butantã,TOTAL,5610.0,428.217,7.633
13,Campo Limpo,TOTAL,3670.0,607.105,16.542
17,Capela do Socorro,TOTAL,13420.0,594.93,4.433
21,Casa Verde,TOTAL,2670.0,309.376,11.587
24,Cidade Ademar,TOTAL,3070.0,410.998,13.388
25,Cidade Tiradentes,Cidade Tiradentes,1500.0,211.501,14.1
28,Ermelino Matarazzo,TOTAL,1510.0,207.509,13.742
31,Freguesia do Ó,TOTAL,3150.0,407.245,12.928
34,Guaianases,TOTAL,1780.0,268.508,15.085


In [8]:
# The district label is no longer necessary.
sub_pop.drop(columns=['Distritos'], inplace=True)
sub_pop.head()

Unnamed: 0,Subprefeituras,Área (km²),População (2010),Densidade Demográfica (Hab/km²)
3,Aricanduva,2150.0,267.702,12.451
9,Butantã,5610.0,428.217,7.633
13,Campo Limpo,3670.0,607.105,16.542
17,Capela do Socorro,13420.0,594.93,4.433
21,Casa Verde,2670.0,309.376,11.587


In [9]:
# Rename columns to English to facilitate discussion.
# Note: 'TOTAL AREA SQM' comes from the 'Área (km²)' column, so it is not in square metres.
sub_pop = sub_pop.rename(columns={'Subprefeituras': 'SUBPREF', 'Área (km²)':'TOTAL AREA SQM', 'População (2010)':'POPULATION 2010', 'Densidade Demográfica (Hab/km²)':'POP DENSITY'})
sub_pop.head()

Unnamed: 0,SUBPREF,TOTAL AREA SQM,POPULATION 2010,POP DENSITY
3,Aricanduva,2150.0,267.702,12.451
9,Butantã,5610.0,428.217,7.633
13,Campo Limpo,3670.0,607.105,16.542
17,Capela do Socorro,13420.0,594.93,4.433
21,Casa Verde,2670.0,309.376,11.587


In [10]:
# Ready for descriptive statistics
sub_pop.describe()

Unnamed: 0,TOTAL AREA SQM,POPULATION 2010,POP DENSITY
count,32.0,32.0,32.0
mean,4654.36875,352.234469,1295.079094
std,6078.184792,120.972136,4070.0429
min,19.8,139.441,2.553
25%,2365.0,268.3065,9.12275
50%,3320.0,334.3975,12.4095
75%,4792.5,428.93925,15.37225
max,35350.0,607.105,16454.0


In [12]:
# Hospital List
file_name='http://dados.prefeitura.sp.gov.br/dataset/f944b957-2193-48a4-9068-72a26e6ee577/resource/fd72d932-fc65-43cc-a74f-f309225f74e8/download/deinfosacadsau2014.csv'
hosp_sp=pd.read_csv(file_name, encoding='latin1', index_col=None)
hosp_sp.head()

Unnamed: 0,ID,LONG,LAT,SETCENS,AREAP,CODDIST,DISTRITO,CODSUBPREF,SUBPREF,REGIAO5,REGIAO8,ESTABELECI,ENDERECO,BAIRRO,TELEFONE,CEP,CNES,SA_DEPADM,DEPADM,SA_TIPO,TIPO,SA_CLASSE,CLASSE,LEITOS
0,1,-46490063,-23522787,355030864000052,3550308005143,65,PONTE RASA,22,ERMELINO MATARAZZO,Leste,Leste 2,BURGO PAULISTA -AMA ESPECIALIDADES,"JOSE SILVA ALCANTARA FILHO,R,1031",BURGO PAULISTA,22800080.0,3680000.0,6393608.0,1,Municipal,49,AMA ESPECIALIDADES,1,AMBULATORIOS ESPECIALIZADOS,0
1,2,-46773393,-23673297,355030819000016,3550308005232,19,CAPAO REDONDO,17,CAMPO LIMPO,Sul,Sul 2,CAPAO REDONDO - AMA ESPECIALIDADES,"SANTANA,COM,AV,774",JD. BOA ESPERANCA,58742846.0,5666000.0,6194974.0,1,Municipal,49,AMA ESPECIALIDADES,1,AMBULATORIOS ESPECIALIZADOS,0
2,3,-46651898,-23531575,355030869000001,3550308005027,70,SANTA CECILIA,9,SE,Centro,Centro,"HUMBERTO PASCALI,DR STA CECILIA - AMA ESPECIAL...","VITORINO CARMILO,R,599",CAMPOS ELISEOS,38260096.0,1153000.0,6138314.0,1,Municipal,49,AMA ESPECIALIDADES,1,AMBULATORIOS ESPECIALIZADOS,0
3,4,-46454974,-23538945,355030837000019,3550308005202,36,ITAQUERA,27,ITAQUERA,Leste,Leste 2,ITAQUERA-AMA ESPECIALIDADES,"AMERICO SALVADOR NOVELLI,R,265",ITAQUERA,62860015.0,8210090.0,6394558.0,1,Municipal,49,AMA ESPECIALIDADES,1,AMBULATORIOS ESPECIALIZADOS,0
4,5,-46539564,-23599363,355030872000050,3550308005158,74,SAO LUCAS,29,VILA PRUDENTE,Leste,Leste 1,JD GUAIRACA - AMA ESPECIALIDADES,"ERVA IMPERIAL,R,501",CID CONTINENTAL,,3244030.0,6759998.0,1,Municipal,49,AMA ESPECIALIDADES,1,AMBULATORIOS ESPECIALIZADOS,0


In [13]:
hosp_sp.dtypes

ID              int64
LONG            int64
LAT             int64
SETCENS         int64
AREAP           int64
CODDIST         int64
DISTRITO       object
CODSUBPREF      int64
SUBPREF        object
REGIAO5        object
REGIAO8        object
ESTABELECI     object
ENDERECO       object
BAIRRO         object
TELEFONE      float64
CEP           float64
CNES           object
SA_DEPADM       int64
DEPADM         object
SA_TIPO         int64
TIPO           object
SA_CLASSE       int64
CLASSE         object
LEITOS          int64
dtype: object
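`LONG` and `LAT` are stored as `int64` (for example `-46490063` and `-23522787`), so they cannot be passed to a map library as they are. Since São Paulo lies at roughly 23.5° S, 46.6° W, the values look like decimal degrees multiplied by one million. The sketch below converts them under that assumption; the scale factor and the helper name `to_decimal_degrees` are not taken from the dataset documentation.

```python
def to_decimal_degrees(value, scale=1_000_000):
    """Convert an integer-encoded coordinate to decimal degrees (assumed scale: 1e6)."""
    return value / scale


# First row of the hospital table shown above.
lng = to_decimal_degrees(-46490063)
lat = to_decimal_degrees(-23522787)

# Basic sanity check: the result must be a valid latitude/longitude pair.
if not (-90 <= lat <= 90 and -180 <= lng <= 180):
    raise ValueError('Converted coordinates are out of range; check the scale factor.')

print(lat, lng)
```

Applied to the dataframe, the same division could be done column-wise before plotting the hospitals.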

In [14]:
# Keep only the coordinates, subprefecture, facility name, type, class and bed count.
hosp_sp.drop(columns=['ID','SETCENS','AREAP','CODDIST','DISTRITO','REGIAO5','REGIAO8','ENDERECO','BAIRRO','TELEFONE','CEP','CNES','SA_DEPADM','DEPADM','SA_TIPO','SA_CLASSE'], inplace=True)
hosp_sp.head()

Unnamed: 0,LONG,LAT,CODSUBPREF,SUBPREF,ESTABELECI,TIPO,CLASSE,LEITOS
0,-46490063,-23522787,22,ERMELINO MATARAZZO,BURGO PAULISTA -AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
1,-46773393,-23673297,17,CAMPO LIMPO,CAPAO REDONDO - AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
2,-46651898,-23531575,9,SE,"HUMBERTO PASCALI,DR STA CECILIA - AMA ESPECIAL...",AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
3,-46454974,-23538945,27,ITAQUERA,ITAQUERA-AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
4,-46539564,-23599363,29,VILA PRUDENTE,JD GUAIRACA - AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0


In [15]:
# Rename columns to English to facilitate discussion.
hosp_sp = hosp_sp.rename(columns={'LONG': 'LNG', 'CODSUBPREF':'SUBPREFECTURE CODE', 'ESTABELECI':'HOSP NAME','CLASSE':'CLASS','LEITOS':'BEDS'})
hosp_sp.head()

Unnamed: 0,LNG,LAT,SUBPREFECTURE CODE,SUBPREF,HOSP NAME,TIPO,CLASS,BEDS
0,-46490063,-23522787,22,ERMELINO MATARAZZO,BURGO PAULISTA -AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
1,-46773393,-23673297,17,CAMPO LIMPO,CAPAO REDONDO - AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
2,-46651898,-23531575,9,SE,"HUMBERTO PASCALI,DR STA CECILIA - AMA ESPECIAL...",AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
3,-46454974,-23538945,27,ITAQUERA,ITAQUERA-AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0
4,-46539564,-23599363,29,VILA PRUDENTE,JD GUAIRACA - AMA ESPECIALIDADES,AMA ESPECIALIDADES,AMBULATORIOS ESPECIALIZADOS,0


In [17]:
# Total number of hospital beds in the dataset
hosp_sp['BEDS'].sum()

17380
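The bed total can be put on the same footing as the "beds per 1,000 inhabitants" indicator quoted in the Introduction. The sketch below is only an illustration: it combines the 17,380 beds counted in this dataset with the 12,252,023 inhabitants cited in the Introduction, two figures that come from different sources and years, so the result is not comparable with the official rate. `beds_per_thousand` is a hypothetical helper.

```python
def beds_per_thousand(beds, population):
    """Return the number of hospital beds per 1,000 inhabitants."""
    if population <= 0:
        raise ValueError('population must be positive')
    return beds / population * 1000


# Figures taken from this notebook: bed total above, city population from the Introduction.
rate = beds_per_thousand(17380, 12_252_023)
print(round(rate, 2))
```

The same function could be applied per subprefecture once the hospital and population tables share harmonized names.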

In [36]:
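# Add socioeconomic data
# Note: accented names appear garbled in the output (e.g. 'SÃ£o Paulo'), which suggests the file is UTF-8 encoded rather than latin1.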
file_name2='https://raw.githubusercontent.com/otcosta/Coursera_Capstone/master/DEINFO_IDH_UDH_2000_2010_Dados.csv'
hdi_sp=pd.read_csv(file_name2, encoding='latin1')
hdi_sp.head()

Unnamed: 0,COD_ID,NOME_UDH,CODMUN6,NOME_MUN,CODUF,NOME_UF,CODRM,NOME_RM,ANO,ESPVIDA,FECTOT,MORT1,MORT5,SOBRE40,SOBRE60,RAZDEP,T_ENV,E_ANOSESTUDO,T_ANALF11A14,T_ANALF15A17,T_ANALF15M,T_ANALF18A24,T_ANALF18M,T_ANALF25A29,T_ANALF25M,T_ATRASO_2_BASICO,T_ATRASO_2_FUND,T_FBBAS,T_FBFUND,T_FBMED,T_FBSUPER,T_FLBAS,T_FLFUND,T_FLMED,T_FLSUPER,T_FREQ0A5,T_FREQ15A17,T_FREQ18A24,T_FREQ25A29,T_FREQ5A6,T_FREQ6A14,T_FREQ6A17,T_FREQFUND1517,T_FREQFUND1824,T_FREQMED1824,T_FUND11A13,T_FUND15A17,T_FUND18A24,T_FUND18M,T_FUND25M,T_MED18A20,T_MED18A24,T_MED18M,T_MED25M,T_SUPER25M,CORTE1,CORTE2,CORTE3,CORTE4,CORTE9,GINI,PIND,PINDCRI,PMPOB,PMPOBCRI,PPOB,PPOBCRI,PREN10RICOS,PREN20,PREN20RICOS,PREN40,PREN60,PREN80,PRENTRAB,R1040,R2040,RDPC,RDPC1,RDPC10,RDPC2,RDPC3,RDPC4,RDPC5,RDPCT,RIND,RMPOB,RPOB,THEIL,P_AGRO,P_COM,P_CONSTR,P_EXTR,P_SERV,P_SIUP,P_TRANSF,CPR,EMP,P_FORMAL,TRABCC,TRABPUB,TRABSC,P_FUND,P_MED,P_SUPER,REN0,REN1,REN2,REN3,REN5,RENOCUP,THEILtrab,T_ATIV,T_ATIV1014,T_ATIV1517,T_ATIV1824,T_ATIV18M,T_ATIV2529,T_DES,T_DES1014,T_DES1517,T_DES1824,T_DES18M,T_DES2529,T_AGUA,T_BANAGUA,T_DENS,T_LIXO,T_LUZ,AGUA_ESGOTO,PAREDE,T_CRIFUNDIN_TODOS,T_FORA0A5,T_FORA6A14,T_FUNDIN_TODOS,T_FUNDIN_TODOS_MMEIO,T_FUNDIN18MINF,T_M10A17CF,T_MULCHEFEFIF014,T_NESTUDA_NTRAB_MMEIO,T_OCUPDESLOC_1,T_RMAXIDOSO,T_SLUZ,T_VULNERA_NESTUDA_NTRAB_MMEIO,T_VULNERA_MULCHEFE,T_VULNERA_RMAXIDOSO,T_VULNERA_OCUPDESLOC_13,HOMEM0A4,HOMEM10A14,HOMEM15A19,HOMEM20A24,HOMEM25A29,HOMEM30A34,HOMEM35A39,HOMEM40A44,HOMEM45A49,HOMEM50A54,HOMEM55A59,HOMEM5A9,HOMEM60A64,HOMEM65A69,HOMEM70A74,HOMEM75A79,HOMENS80,HOMEMTOT,MULH0A4,MULH10A14,MULH15A19,MULH20A24,MULH25A29,MULH30A34,MULH35A39,MULH40A44,MULH45A49,MULH50A54,MULH55A59,MULH5A9,MULH60A64,MULH65A69,MULH70A74,MULH75A79,MULHER80,MULHERTOT,PEA,PEA1014,PEA1517,PEA18M,PESO1,PESO1114,PESO1113,PESO1214,PESO13,PESO15,PESO1517,PESO1524,PESO1618,PESO18,PESO1820,PESO1824,PESO1921,PESO25,PESO4,PESO5,PESO6,PESO610,PESO617,PESO65,PESOM1014,PESOM1517,PESOM15M,PESOM25M,PESORUR,PESOTOT,PESOURB,PIA,PIA1014,PIA1517,PIA18M,POP,T_FBFUND_tudo,T_FBMED_tudo,T_FBBAS_tudo,T_FLFUND_tudo,T_FLMED_tudo,T_FLBAS_tudo,T_FUND11A13_tudo,I_ESCOLARIDADE,I_FREQ_PROP,IDHM,IDHM_E,IDHM_L,IDHM_R,CODDIST,DISTRITO,REGIAO8,PREFREG
0,1355030000000.0,Jardim AnÃ¡lia Franco / Vila Formosa : Hospita...,355030,SÃ£o Paulo (SP),35,SP,63500,RM SÃ£o Paulo (SP),2000,76.57,1.31,12.3,14.31,94.77,85.8,42.02,10.87,11.49,0.25,0.66,2.35,0.37,2.46,0.8,2.86,5.63,2.91,104.75,104.0,106.73,37.85,94.29,96.25,72.57,29.14,49.9,90.08,41.57,8.33,91.21,98.68,96.31,13.61,1.72,8.66,92.35,76.03,86.61,65.91,61.88,61.31,67.5,49.74,46.29,17.0,488.02,847.21,1335.23,2337.63,3416.16,0.47,0.21,0.65,0.64,1.63,4.01,7.2,33.57,4.22,51.53,12.53,26.43,48.47,85.52,10.72,8.22,1562.09,329.29,5244.36,649.43,1085.64,1721.35,4024.75,1562.09,39.04,79.05,165.76,0.39,0.14,24.73,4.16,0.1,54.47,0.59,15.12,27.42,10.67,66.06,38.37,6.19,15.34,78.71,62.3,21.71,2.02,9.4,29.63,41.97,63.89,2949.6,0.46,,1.39,33.48,82.65,66.32,83.4,,72.6,24.09,16.64,11.16,12.57,100.0,99.7,20.64,100.0,100.0,0.0,,7.04,50.1,1.32,12.01,1.28,27.35,0.8,29.23,1.11,,1.53,0.0,47.17,1.9,100.0,,578,722,873,846,765,762,792,727,615,537,416,579,359,301,273,172,105,9419,550,722,813,909,851,797,892,819,742,625,531,606,461,445,387,267,231,10644,,20.0,330.0,10161.0,213,1188,880,904,667,16308,987,3440,1044,15321,1042,2454,1075,12868,248,248,196,1192,3366,2180,722,461,8767,7045,0,20062,20062,17752.0,1444.0,987.0,15321.0,20013.0,108.6,115.5,110.49,99.02,75.18,96.55,96.92,0.659,0.802,0.818,0.751,0.86,0.848,85,Vila Formosa,Leste 1,Aricanduva/Formosa/CarrÃ£o
1,1355030000000.0,Vila CalifÃ³rnia,355030,SÃ£o Paulo (SP),35,SP,63500,RM SÃ£o Paulo (SP),2000,76.78,1.32,12.0,13.97,94.89,86.09,43.84,11.97,11.52,0.74,0.63,2.51,0.63,2.62,1.45,2.97,7.12,2.23,101.93,106.8,89.23,41.99,93.13,96.08,63.85,25.99,45.46,91.88,43.79,11.1,91.07,99.26,97.22,19.23,2.51,8.26,87.0,70.65,79.94,65.65,63.2,49.02,57.73,50.44,49.19,20.56,531.4,902.84,1399.65,2342.51,3448.69,0.47,,,0.36,0.27,4.14,9.06,34.47,4.42,51.65,12.8,26.54,48.35,78.44,10.77,8.07,1651.79,364.84,5693.41,692.37,1134.35,1801.66,4265.73,1651.79,,84.85,177.96,0.38,1.18,21.45,2.89,0.0,55.71,0.25,15.21,27.3,8.56,67.8,40.8,7.79,14.35,79.86,65.03,26.16,1.2,5.71,28.78,41.08,64.26,2953.98,0.42,,0.86,32.98,79.25,63.24,86.52,,100.0,40.28,20.32,11.16,13.16,100.0,99.07,23.34,100.0,100.0,0.0,,10.11,54.54,0.74,12.57,2.46,28.12,0.0,19.52,1.22,,0.77,0.0,31.0,6.78,63.75,,579,665,801,771,696,735,766,686,581,523,414,599,403,333,265,171,144,9132,551,686,773,840,821,769,852,846,715,643,520,565,533,439,406,301,297,10557,,12.0,313.0,9546.0,224,1086,815,830,680,16044,949,3185,938,15095,951,2236,965,12859,226,234,241,1195,3230,2356,686,477,8755,7142,0,19689,19689,17395.0,1351.0,949.0,15095.0,19545.0,114.27,110.57,113.24,99.26,69.35,97.65,92.11,0.657,0.744,0.808,0.714,0.863,0.857,20,CarrÃ£o,Leste 1,Aricanduva/Formosa/CarrÃ£o
2,1355030000000.0,Vila CarrÃ£o / Vila Formosa : CemitÃ©rio Vila ...,355030,SÃ£o Paulo (SP),35,SP,63500,RM SÃ£o Paulo (SP),2000,72.9,1.84,18.0,20.96,92.48,80.38,44.07,11.77,10.8,1.07,0.72,3.55,1.02,3.74,1.24,4.25,10.76,5.32,108.31,113.64,94.76,14.88,93.36,95.66,56.7,8.4,30.58,89.7,29.94,10.22,84.61,98.34,95.9,28.54,4.18,11.5,90.04,63.05,71.66,51.19,45.99,37.55,40.41,31.46,29.18,5.25,278.17,430.24,624.67,976.05,1439.67,0.42,0.93,1.48,4.5,8.27,17.45,25.54,30.8,5.2,47.75,15.1,29.81,52.25,83.44,8.16,6.32,710.57,184.91,2188.54,351.73,522.56,797.06,1696.59,710.57,51.49,98.67,172.34,0.31,0.0,18.81,3.94,0.0,50.84,0.55,21.04,21.78,1.71,59.16,46.57,5.53,23.47,60.54,39.47,6.1,0.94,12.84,54.56,67.89,85.68,1457.06,0.32,,4.02,42.73,82.46,69.11,88.0,,55.89,49.88,28.04,16.35,15.04,99.99,99.12,46.48,100.0,100.0,0.0,,20.46,69.42,1.66,17.93,5.71,37.22,1.49,47.52,5.48,,1.0,0.0,35.37,13.99,34.87,,312,409,477,489,406,409,400,407,341,264,220,367,247,218,164,93,69,5292,308,407,495,465,480,427,495,451,430,337,272,318,309,283,228,147,124,5976,,33.0,238.0,5937.0,122,653,489,484,381,9147,557,1926,571,8590,639,1369,625,7221,117,130,120,718,1928,1326,407,292,4943,3983,0,11268,11268,9963.0,816.0,557.0,8590.0,11166.0,121.52,108.65,117.89,98.34,59.07,96.04,94.69,0.512,0.688,0.71,0.623,0.798,0.721,20,CarrÃ£o,Leste 1,Aricanduva/Formosa/CarrÃ£o
3,1355030000000.0,Vila Formosa : Escola Municipal de Ensino Fund...,355030,SÃ£o Paulo (SP),35,SP,63500,RM SÃ£o Paulo (SP),2000,74.85,1.51,14.8,17.24,93.75,83.33,44.47,10.93,11.48,0.86,1.11,3.77,0.85,3.94,1.25,4.58,8.23,3.63,105.34,104.49,107.39,29.5,91.42,90.39,61.86,18.98,38.57,94.24,39.8,10.02,88.41,97.75,96.72,22.7,2.47,12.14,83.07,68.02,80.0,54.64,49.57,46.13,52.38,36.37,33.17,9.28,357.88,585.63,878.44,1394.35,1952.09,0.43,0.36,1.36,2.44,4.71,9.72,15.38,32.42,4.99,48.74,14.57,29.14,51.26,78.53,8.9,6.69,996.25,248.38,3229.9,477.52,725.62,1101.96,2427.78,996.25,36.85,98.63,180.24,0.33,0.08,20.94,5.5,0.0,53.81,0.31,18.66,22.88,3.62,66.31,49.5,4.62,18.33,67.94,47.01,12.26,1.05,10.6,46.07,61.18,82.22,1823.08,0.37,,2.42,43.44,86.81,66.79,89.73,,67.94,57.33,25.36,14.69,9.64,99.99,99.01,33.4,99.87,100.0,0.0,,16.04,61.43,2.25,18.34,3.37,35.84,1.25,40.35,2.64,,1.78,0.0,26.72,14.38,80.94,,898,1067,1198,1270,1050,991,1073,965,843,667,540,891,573,441,417,242,199,13320,843,1078,1273,1315,1070,1136,1245,1166,976,840,735,860,732,618,528,358,302,15072,,52.0,606.0,14268.0,304,1749,1303,1317,1061,22757,1394,5056,1445,21363,1622,3662,1632,17702,376,366,338,1781,4923,3104,1078,731,12292,9704,0,28392,28392,24902.0,2145.0,1394.0,21363.0,28306.0,116.98,122.56,118.61,97.75,66.74,97.23,92.39,0.546,0.714,0.749,0.653,0.831,0.775,85,Vila Formosa,Leste 1,Aricanduva/Formosa/CarrÃ£o
4,1355030000000.0,Aricanduva : Centro de EducaÃ§Ã£o Infantil Cor...,355030,SÃ£o Paulo (SP),35,SP,63500,RM SÃ£o Paulo (SP),2000,75.02,1.5,14.6,16.95,93.85,83.58,41.22,9.56,10.96,1.13,0.52,3.78,1.11,4.0,1.81,4.6,12.31,9.83,105.48,104.81,107.23,23.77,88.57,91.84,57.55,13.52,30.93,84.42,31.04,9.78,72.67,95.18,92.21,18.8,3.17,12.13,80.9,67.69,84.36,61.82,56.6,44.56,53.11,41.22,38.47,9.88,325.35,566.5,836.61,1399.0,2025.29,0.47,1.34,2.97,4.53,9.07,14.94,25.28,35.06,3.85,51.91,12.65,26.52,48.09,81.31,11.08,8.2,1009.63,194.49,3540.03,444.35,700.0,1088.67,2620.62,1009.63,61.65,95.48,162.1,0.41,0.44,22.17,4.52,0.0,55.93,0.27,13.61,20.92,4.46,65.99,47.79,5.82,20.58,72.98,53.13,12.92,0.43,10.64,42.34,55.23,79.59,1956.39,0.38,,6.13,40.37,83.17,67.46,86.17,,61.39,53.18,31.31,15.71,13.53,99.99,98.08,33.76,100.0,100.0,0.0,,12.83,69.07,4.82,12.53,4.3,30.81,2.21,36.41,5.99,,1.88,0.0,44.68,13.54,64.46,,224,270,315,319,285,264,267,254,220,202,170,234,143,137,78,51,32,3460,226,273,352,343,296,306,274,292,262,251,214,221,201,144,119,67,80,3916,,33.0,155.0,3741.0,94,444,333,348,259,5929,385,1328,408,5545,412,943,420,4602,97,102,79,451,1280,705,273,202,3197,2503,0,7375,7375,6472.0,543.0,385.0,5545.0,7343.0,110.82,118.29,112.88,95.18,62.45,92.84,90.77,0.618,0.665,0.749,0.649,0.834,0.777,4,Aricanduva,Leste 1,Aricanduva/Formosa/CarrÃ£o


In [37]:
# To verify the # of subprefectures in SP...
hdi_sp['PREFREG'].unique()

array(['Aricanduva/Formosa/CarrÃ£o', 'ButantÃ£', 'Campo Limpo',
       'Capela do Socorro', 'Casa Verde/Cachoeirinha', 'Cidade Ademar',
       'Cidade Tiradentes', 'Ermelino Matarazzo',
       'Freguesia/BrasilÃ¢ndia', 'Guaianases', 'Ipiranga',
       'Itaim Paulista', 'Itaquera', 'Jabaquara', 'JaÃ§anÃ£/TremembÃ©',
       'Lapa', "M'Boi Mirim", 'Mooca', 'Parelheiros', 'Penha', 'Perus',
       'Pinheiros', 'Pirituba/JaraguÃ¡', 'Santana/Tucuruvi',
       'Santo Amaro', 'SÃ£o Mateus', 'SÃ£o Miguel', 'SÃ©',
       'Vila Maria/Vila Guilherme', 'Vila Mariana', 'Vila Prudente',
       'Sapopemba'], dtype=object)
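The names above contain sequences such as `SÃ£o` and `ButantÃ£`, the typical result of reading UTF-8 bytes as latin1. Re-reading the file with `encoding='utf-8'` is probably the cleaner fix. As an alternative, the sketch below repairs strings that were already loaded, assuming the damage is exactly that latin1/UTF-8 mix-up; `repair_mojibake` is a hypothetical helper.

```python
def repair_mojibake(text):
    """Undo a latin1-decoding of UTF-8 bytes; return the text unchanged if that does not apply."""
    try:
        return text.encode('latin1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by this particular mix-up, so leave it alone.
        return text


for name in ['SÃ£o Mateus', 'ButantÃ£', 'Lapa']:
    print(repair_mojibake(name))
```

On the dataframe this could be applied to each text column, for instance with `hdi_sp['PREFREG'].map(repair_mojibake)`.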

In [38]:
hdi_sp.dtypes

COD_ID                           float64
NOME_UDH                          object
CODMUN6                            int64
NOME_MUN                          object
CODUF                              int64
NOME_UF                           object
CODRM                              int64
NOME_RM                           object
ANO                                int64
ESPVIDA                          float64
FECTOT                           float64
MORT1                            float64
MORT5                            float64
SOBRE40                          float64
SOBRE60                          float64
RAZDEP                           float64
T_ENV                            float64
E_ANOSESTUDO                     float64
T_ANALF11A14                     float64
T_ANALF15A17                     float64
T_ANALF15M                       float64
T_ANALF18A24                     float64
T_ANALF18M                       float64
T_ANALF25A29                     float64
T_ANALF25M      

In [39]:
hdi_sp = hdi_sp[['T_ENV','GINI', 'PIND','RDPC','IDHM','DISTRITO','REGIAO8','PREFREG']]
hdi_sp.head()

Unnamed: 0,T_ENV,GINI,PIND,RDPC,IDHM,DISTRITO,REGIAO8,PREFREG
0,10.87,0.47,0.21,1562.09,0.818,Vila Formosa,Leste 1,Aricanduva/Formosa/CarrÃ£o
1,11.97,0.47,,1651.79,0.808,CarrÃ£o,Leste 1,Aricanduva/Formosa/CarrÃ£o
2,11.77,0.42,0.93,710.57,0.71,CarrÃ£o,Leste 1,Aricanduva/Formosa/CarrÃ£o
3,10.93,0.43,0.36,996.25,0.749,Vila Formosa,Leste 1,Aricanduva/Formosa/CarrÃ£o
4,9.56,0.47,1.34,1009.63,0.749,Aricanduva,Leste 1,Aricanduva/Formosa/CarrÃ£o


In [43]:
# Rename columns to English to facilitate discussion.
hdi_sp = hdi_sp.rename(columns={'T_ENV': 'AGING RATE', 'PIND':'EXTREME POVERTY RATE', 'RDPC':'INCOME PER CAPITA','DISTRITO':'DISTRICT','REGIAO8':'REGION', 'PREFREG':'SUBPREF'})
hdi_sp.head()

Unnamed: 0,AGING RATE,GINI,EXTREME POVERTY RATE,INCOME PER CAPITA,IDHM,DISTRICT,REGION,SUBPREF
0,10.87,0.47,0.21,1562.09,0.818,Vila Formosa,Leste 1,Aricanduva/Formosa/CarrÃ£o
1,11.97,0.47,,1651.79,0.808,CarrÃ£o,Leste 1,Aricanduva/Formosa/CarrÃ£o
2,11.77,0.42,0.93,710.57,0.71,CarrÃ£o,Leste 1,Aricanduva/Formosa/CarrÃ£o
3,10.93,0.43,0.36,996.25,0.749,Vila Formosa,Leste 1,Aricanduva/Formosa/CarrÃ£o
4,9.56,0.47,1.34,1009.63,0.749,Aricanduva,Leste 1,Aricanduva/Formosa/CarrÃ£o


In [None]:
# Harmonize names: list the subprefecture names used by each dataset
print('HDI names are:', hdi_sp['SUBPREF'].unique())
print('Sub Data names are:', sub_pop['SUBPREF'].unique())
print('Hospital names are:', hosp_sp['SUBPREF'].unique())
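The three datasets spell subprefecture names differently: the population table uses accented mixed case (`Sé`), the hospital table uses unaccented upper case (`SE`), and the HDI table uses composite names (`Aricanduva/Formosa/Carrão`). One possible way to build a common join key is sketched below. The alias table is an assumption inferred from the two name lists printed earlier in this notebook, and `normalize_name` and `harmonize` are hypothetical helpers; the hospital names should be checked against the result before merging.

```python
import unicodedata


def normalize_name(name):
    """Strip accents, upper-case and collapse whitespace."""
    decomposed = unicodedata.normalize('NFKD', name)
    without_accents = ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
    return ' '.join(without_accents.upper().split())


# Assumed mapping from composite HDI names to the names used in the population table.
HDI_ALIASES = {
    'ARICANDUVA/FORMOSA/CARRAO': 'ARICANDUVA',
    'CASA VERDE/CACHOEIRINHA': 'CASA VERDE',
    'FREGUESIA/BRASILANDIA': 'FREGUESIA DO O',
    'JACANA/TREMEMBE': 'JACANA',
    'PIRITUBA/JARAGUA': 'PIRITUBA',
    'SANTANA/TUCURUVI': 'SANTANA',
}


def harmonize(name):
    """Return the common key for a subprefecture name from any of the three datasets."""
    key = normalize_name(name)
    return HDI_ALIASES.get(key, key)


print(harmonize('Sé'), '|', harmonize('Jaçanã/Tremembé'), '|', harmonize('Freguesia do Ó'))
```

Each dataframe could then receive a key column built with `harmonize`, and the merges would be done on that column instead of on the raw names.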

### All NaN values on neighborhood were eliminated by dropping the boroughs assigned as 'Not assigned'

## Foursquare data

In [10]:
!wget -q -O 'Geospatial_Coordinates.csv' http://cocl.us/Geospatial_data
print('Data downloaded!')

Data downloaded!


In [11]:
gpsloc=pd.read_csv('Geospatial_Coordinates.csv')
gpsloc.rename(columns={'Postal Code' : 'Postal code'}, inplace=True)
gpsloc.head()

Unnamed: 0,Postal code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [12]:
nhoodg = pd.merge(nhood, gpsloc, on='Postal code', how='left')
nhoodg.head()

Unnamed: 0,Postal code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,Regent Park / Harbourfront,43.65426,-79.360636
3,M6A,North York,Lawrence Manor / Lawrence Heights,43.718518,-79.464763
4,M7A,Downtown Toronto,Queen's Park / Ontario Provincial Government,43.662301,-79.389494


## The 'Postal Code' label in the GPS CSV file was renamed to 'Postal code' because it contained an extra uppercase letter

In [13]:
address = 'Toronto'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
tor_lat = location.latitude
tor_lon = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(tor_lat, tor_lon))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [14]:
map_toronto = folium.Map(location=[tor_lat, tor_lon], zoom_start=10)



for lat, lng, borough, neighborhood in zip(nhoodg['Latitude'], nhoodg['Longitude'], nhoodg['Borough'], nhoodg['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)
    

map_toronto

## Question 2

In [15]:
borough_names = list(nhoodg.Borough.unique())

with_toronto = []

for x in borough_names:
    if "toronto" in x.lower():
        with_toronto.append(x)
        
with_toronto

['Downtown Toronto', 'East Toronto', 'West Toronto', 'Central Toronto']

In [16]:
toronto_in_name = nhoodg[nhoodg['Borough'].isin(with_toronto)].reset_index(drop=True)
toronto_in_name.head()

Unnamed: 0,Postal code,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,Regent Park / Harbourfront,43.65426,-79.360636
1,M7A,Downtown Toronto,Queen's Park / Ontario Provincial Government,43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M4E,East Toronto,The Beaches,43.676357,-79.293031


In [17]:
map_with_toronto = folium.Map(location=[tor_lat, tor_lon], zoom_start=10)


for lat, lng, borough, neighborhood in zip(toronto_in_name['Latitude'], toronto_in_name['Longitude'], toronto_in_name['Borough'], toronto_in_name['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_with_toronto)
    

map_with_toronto

## Question 3

In [18]:
# define Foursquare credentials and version
CLIENT_ID = '' # add your Foursquare client ID
CLIENT_SECRET = '' # add your Foursquare client secret
VERSION = '20180605'
# The credentials are deliberately not printed, so they do not end up in the saved notebook output.
print('Foursquare credentials set.')

Foursquare credentials set.
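API keys are easier to keep out of a shared notebook when they are read from the environment instead of being typed into a cell. The sketch below shows one way to do that. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical, and a demo dictionary stands in for `os.environ` so the example runs without real credentials.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from environment-style variables (names are assumptions)."""
    client_id = env.get('FOURSQUARE_CLIENT_ID', '')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET', '')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.')
    return client_id, client_secret


def mask(secret, visible=4):
    """Show only the first characters of a secret, for logging."""
    if len(secret) <= visible:
        return '*' * len(secret)
    return secret[:visible] + '*' * (len(secret) - visible)


# Demo values only; in the notebook, call load_foursquare_credentials() with no argument.
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id', 'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials(demo_env)
print('CLIENT_SECRET:', mask(CLIENT_SECRET))
```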


In [85]:
toronto_in_name.loc[0, 'Neighborhood']

'Regent Park / Harbourfront'

In [86]:
neighborhood_latitude = toronto_in_name.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = toronto_in_name.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = toronto_in_name.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Regent Park / Harbourfront are 43.6542599, -79.3606359.


In [88]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # search radius in meters

# build the request URL (not displayed, because it contains the credentials)
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
results = requests.get(url).json()
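Formatting the query string by hand works, but it is easy to misplace a separator or forget to escape a value. A possible alternative is to let the standard library build the query, as sketched below. The endpoint and parameter names are the ones used in this notebook, `build_explore_url` is a hypothetical helper, and the credentials are placeholders.

```python
from urllib.parse import urlencode, urlparse, parse_qs

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Build the venues/explore request URL with properly escaped query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)


# Placeholder credentials; the coordinates are those of the first neighborhood above.
demo_url = build_explore_url('demo-id', 'demo-secret', '20180605', 43.6542599, -79.3606359)

# Parse the URL back to confirm that each parameter survived the encoding.
query = parse_qs(urlparse(demo_url).query)
print(query['ll'][0], query['radius'][0], query['limit'][0])
```

With `requests`, the same dictionary could instead be passed through the `params` argument of `requests.get`, which encodes it in the same way.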

In [89]:
def get_category_type(row):
    """Return the name of the first category of a venue row, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [90]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Roselle Desserts | Bakery | 43.653447 | -79.362017 |
| 1 | Tandem Coffee | Coffee Shop | 43.653559 | -79.361809 |
| 2 | Cooper Koo Family YMCA | Distribution Center | 43.653249 | -79.358008 |
| 3 | Body Blitz Spa East | Spa | 43.654735 | -79.359874 |
| 4 | Morning Glory Cafe | Breakfast Spot | 43.653947 | -79.361149 |


In [91]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

45 venues were returned by Foursquare.


In [92]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Query Foursquare around each neighborhood and return one row per venue."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # keep only the relevant information for each nearby venue
        # (assumes every venue has at least one category)
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
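
The function above indexes `categories[0]` directly, so a venue returned without any category would raise an `IndexError`. The sketch below is one possible way to guard against that. It assumes the same item structure as the response used above (`item['venue']['name']`, `['location']`, `['categories']`). The names `collect_venues` and `fake_fetch` are hypothetical, and the HTTP call is replaced by a stand-in so the example runs offline.

```python
def collect_venues(names, latitudes, longitudes, fetch_items):
    """Collect one row per venue; fetch_items(lat, lng) returns the API 'items' list."""
    rows = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        for item in fetch_items(lat, lng):
            venue = item["venue"]
            categories = venue.get("categories") or []
            # Venues without a category are kept and labeled None
            # instead of raising IndexError.
            category = categories[0]["name"] if categories else None
            rows.append((name, lat, lng, venue["name"],
                         venue["location"]["lat"], venue["location"]["lng"],
                         category))
    return rows

def fake_fetch(lat, lng):
    # Stand-in for the HTTP request, with made-up venues for illustration only.
    return [
        {"venue": {"name": "Example Cafe",
                   "location": {"lat": lat, "lng": lng},
                   "categories": [{"name": "Café"}]}},
        {"venue": {"name": "Unlabeled Spot",
                   "location": {"lat": lat, "lng": lng},
                   "categories": []}},
    ]

rows = collect_venues(["Example Hood"], [43.65], [-79.36], fake_fetch)
for row in rows:
    print(row)
```

Passing the fetch step in as a function also makes the collection logic testable without API credentials.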

In [93]:
nearby_venues.head()

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Roselle Desserts | Bakery | 43.653447 | -79.362017 |
| 1 | Tandem Coffee | Coffee Shop | 43.653559 | -79.361809 |
| 2 | Cooper Koo Family YMCA | Distribution Center | 43.653249 | -79.358008 |
| 3 | Body Blitz Spa East | Spa | 43.654735 | -79.359874 |
| 4 | Morning Glory Cafe | Breakfast Spot | 43.653947 | -79.361149 |


In [94]:
toronto_venues = getNearbyVenues(names=toronto_in_name['Neighborhood'],
                                   latitudes=toronto_in_name['Latitude'],
                                   longitudes=toronto_in_name['Longitude']
                                  )

Regent Park / Harbourfront
Queen's Park / Ontario Provincial Government
Garden District, Ryerson
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Richmond / Adelaide / King
Dufferin / Dovercourt Village
Harbourfront East / Union Station / Toronto Islands
Little Portugal / Trinity
The Danforth West / Riverdale
Toronto Dominion Centre / Design Exchange
Brockton / Parkdale Village / Exhibition Place
India Bazaar / The Beaches West
Commerce Court / Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North & West
High Park / The Junction South
North Toronto West
The Annex / North Midtown / Yorkville
Parkdale / Roncesvalles
Davisville
University of Toronto / Harbord
Runnymede / Swansea
Moore Park / Summerhill East
Kensington Market / Chinatown / Grange Park
Summerhill West / Rathnelly / South Hill / Forest Hill SE / Deer Park
CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst  Quay / South Niagara / Island airport
Rosed

In [96]:
print(toronto_venues.shape)
toronto_venues.head()

(1681, 7)


| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Regent Park / Harbourfront | 43.65426 | -79.360636 | Roselle Desserts | 43.653447 | -79.362017 | Bakery |
| 1 | Regent Park / Harbourfront | 43.65426 | -79.360636 | Tandem Coffee | 43.653559 | -79.361809 | Coffee Shop |
| 2 | Regent Park / Harbourfront | 43.65426 | -79.360636 | Cooper Koo Family YMCA | 43.653249 | -79.358008 | Distribution Center |
| 3 | Regent Park / Harbourfront | 43.65426 | -79.360636 | Body Blitz Spa East | 43.654735 | -79.359874 | Spa |
| 4 | Regent Park / Harbourfront | 43.65426 | -79.360636 | Morning Glory Cafe | 43.653947 | -79.361149 | Breakfast Spot |


In [97]:
toronto_venues.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Berczy Park | 55 | 55 | 55 | 55 | 55 | 55 |
| Brockton / Parkdale Village / Exhibition Place | 22 | 22 | 22 | 22 | 22 | 22 |
| Business reply mail Processing CentrE | 16 | 16 | 16 | 16 | 16 | 16 |
| CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst Quay / South Niagara / Island airport | 16 | 16 | 16 | 16 | 16 | 16 |
| Central Bay Street | 77 | 77 | 77 | 77 | 77 | 77 |
| Christie | 19 | 19 | 19 | 19 | 19 | 19 |
| Church and Wellesley | 79 | 79 | 79 | 79 | 79 | 79 |
| Commerce Court / Victoria Hotel | 100 | 100 | 100 | 100 | 100 | 100 |
| Davisville | 36 | 36 | 36 | 36 | 36 | 36 |
| Davisville North | 8 | 8 | 8 | 8 | 8 | 8 |
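
The counts above vary widely, and one neighborhood sits exactly at the request limit of 100, which may mean its venue list was truncated by the API. The sketch below flags both extremes using a few counts copied from the table. The threshold of 10 for "sparse" is an arbitrary assumption chosen for illustration, not a value from the notebook.

```python
LIMIT = 100  # same cap as in the request cells above

# Venue counts copied from the table above (a subset of rows).
venue_counts = {
    "Berczy Park": 55,
    "Central Bay Street": 77,
    "Church and Wellesley": 79,
    "Commerce Court / Victoria Hotel": 100,
    "Davisville North": 8,
}

SPARSE_THRESHOLD = 10  # assumed cut-off for "few venues"

# Neighborhoods whose count reached the cap may have more venues than were returned.
capped = [name for name, n in venue_counts.items() if n >= LIMIT]
# Neighborhoods with few venues yield coarse frequencies (each venue weighs 1/n).
sparse = [name for name, n in venue_counts.items() if n < SPARSE_THRESHOLD]

print("At the limit:", capped)
print("Fewer than {} venues:".format(SPARSE_THRESHOLD), sparse)
```

With only 8 venues, each venue in Davisville North contributes 1/8 = 0.125 to a category frequency, so its profile is more sensitive to a single venue than that of a dense downtown neighborhood.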


In [99]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 235 unique categories.


## One-hot encoding

In [100]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

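# add the neighborhood label to the one-hot dataframe
# NOTE: Foursquare also returns a venue category literally named "Neighborhood",
# so this assignment overwrites that dummy column instead of appending a new one
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move the last column to the front; because of the overwrite above, the last
# column is 'Yoga Studio' rather than 'Neighborhood' (see the header in the output below)
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]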

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Business Service,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Home Service,Hospital,Hostel,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Lawyer,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Poutine Place,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Snack Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Summer Camp,Supermarket,Sushi Restaurant,Swim School,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Regent Park / Harbourfront,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Regent Park / Harbourfront,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Regent Park / Harbourfront,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Regent Park / Harbourfront,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Regent Park / Harbourfront,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [101]:
toronto_onehot.shape

(1681, 235)
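
In the header above, `Neighborhood` appears between `Music Venue` and `New American Restaurant`, and `Yoga Studio` comes first. The column count also equals the number of unique categories (235). Together these suggest that the label column overwrote a venue category that is itself named "Neighborhood". The sketch below shows one way to avoid that collision on a made-up three-row dataframe. The renamed column `Neighborhood (category)` is a hypothetical choice, not something the original notebook does.

```python
import pandas as pd

# Made-up venues; one has the category "Neighborhood" to reproduce the collision.
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Bakery", "Neighborhood", "Yoga Studio"],
})

dummies = pd.get_dummies(toy_venues["Venue Category"], dtype=int)

# Rename any category that would clash with the label column before adding it.
dummies = dummies.rename(columns={"Neighborhood": "Neighborhood (category)"})

# insert() places the label column first and raises ValueError if the
# name already exists, so a silent overwrite cannot happen.
dummies.insert(0, "Neighborhood", toy_venues["Neighborhood"])

print(dummies.columns.tolist())
```

With this approach the label column is always first, which is also what `row.iloc[1:]`-style slicing later in a pipeline would rely on.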

In [102]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Business Service,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Home Service,Hospital,Hostel,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Lawyer,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Polish Restaurant,Pool,Poutine Place,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Snack Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Summer Camp,Supermarket,Sushi Restaurant,Swim School,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Women's Store
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.018182,0.0,0.018182,0.036364,0.0,0.0,0.0,0.018182,0.018182,0.0,0.036364,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.036364,0.0,0.0,0.0,0.0,0.036364,0.0,0.0,0.0,0.0,0.0,0.054545,0.090909,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.018182,0.0,0.018182,0.0,0.018182,0.0,0.0,0.0,0.0,0.018182,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.036364,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.018182,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.036364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.036364,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0
1,Brockton / Parkdale Village / Exhibition Place,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.136364,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Business reply mail Processing CentrE,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,CN Tower / King and Spadina / Railway Lands / ...,0.0,0.0,0.0625,0.0625,0.0625,0.125,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.025974,0.0,0.038961,0.0,0.0,0.012987,0.0,0.025974,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.012987,0.025974,0.012987,0.012987,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.012987,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025974,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.025974,0.012987,0.0,0.0,0.0,0.0,0.051948,0.038961,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.012987,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.025974,0.0,0.038961,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.025974,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.012987,0.038961,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.012987,0.0
5,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.157895,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.210526,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Church and Wellesley,0.025316,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.012658,0.0,0.012658,0.0,0.012658,0.0,0.025316,0.012658,0.0,0.0,0.0,0.025316,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.063291,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.012658,0.012658,0.0,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.025316,0.050633,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.025316,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.0,0.012658,0.063291,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.025316,0.025316,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.025316,0.012658,0.0,0.0,0.037975,0.0,0.012658,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.012658,0.0,0.0,0.037975,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Commerce Court / Victoria Hotel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.07,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0
8,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.027778,0.0,0.0,0.055556,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.027778,0.0,0.0,0.027778,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Davisville North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [103]:
toronto_grouped.shape

(39, 235)
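
Taking the mean of one-hot columns per neighborhood yields, for each category, the share of that neighborhood's venues belonging to it. The sketch below reproduces that arithmetic in plain Python. The venue list is made up, sized to be consistent with the Berczy Park row above (55 venues, a Coffee Shop frequency of 0.090909 and a Cocktail Bar frequency of 0.054545). The helper name `category_shares` is hypothetical.

```python
from collections import Counter

def category_shares(categories):
    """Share of each category among a neighborhood's venues.

    This equals the mean of the corresponding one-hot column.
    """
    counts = Counter(categories)
    total = len(categories)
    return {cat: n / total for cat, n in counts.items()}

# Made-up list: 55 venues, of which 5 are coffee shops and 3 are cocktail bars.
venues = ["Coffee Shop"] * 5 + ["Cocktail Bar"] * 3 + ["Other"] * 47
shares = category_shares(venues)

print(round(shares["Coffee Shop"], 6))   # 0.090909
print(round(shares["Cocktail Bar"], 6))  # 0.054545
```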

In [104]:
#Top 5 types of venues per neighborhood.
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Berczy Park----
          venue  freq
0   Coffee Shop  0.09
1  Cocktail Bar  0.05
2    Restaurant  0.04
3          Café  0.04
4        Bakery  0.04


----Brockton / Parkdale Village / Exhibition Place----
                venue  freq
0                Café  0.14
1         Coffee Shop  0.09
2      Breakfast Spot  0.09
3                 Gym  0.05
4  Italian Restaurant  0.05


----Business reply mail Processing CentrE----
           venue  freq
0    Yoga Studio  0.06
1  Auto Workshop  0.06
2           Park  0.06
3     Comic Shop  0.06
4    Pizza Place  0.06


----CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst  Quay / South Niagara / Island airport----
              venue  freq
0    Airport Lounge  0.12
1   Airport Service  0.12
2  Airport Terminal  0.12
3       Coffee Shop  0.06
4  Sculpture Garden  0.06


----Central Bay Street----
                 venue  freq
0          Coffee Shop  0.18
1   Italian Restaurant  0.05
2       Sandwich Place  0.04
3         Bu

## Pandas DataFrame creation

In [105]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [106]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

| | Neighborhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Berczy Park | Coffee Shop | Cocktail Bar | Cheese Shop | Bakery | Seafood Restaurant | Farmers Market | Restaurant | Café | Beer Bar | Department Store |
| 1 | Brockton / Parkdale Village / Exhibition Place | Café | Coffee Shop | Breakfast Spot | Pet Store | Stadium | Burrito Place | Restaurant | Climbing Gym | Performing Arts Venue | Bakery |
| 2 | Business reply mail Processing CentrE | Yoga Studio | Auto Workshop | Comic Shop | Park | Pizza Place | Burrito Place | Restaurant | Brewery | Light Rail Station | Smoke Shop |
| 3 | CN Tower / King and Spadina / Railway Lands / ... | Airport Terminal | Airport Lounge | Airport Service | Rental Car Location | Boat or Ferry | Coffee Shop | Boutique | Harbor / Marina | Bar | Airport Gate |
| 4 | Central Bay Street | Coffee Shop | Italian Restaurant | Thai Restaurant | Sandwich Place | Burger Joint | Japanese Restaurant | Café | Dessert Shop | Spa | Salad Place |
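
The column names above are built from a three-item suffix list with a fallback to "th", which works for the ten columns used here. As a sketch of a more general alternative, the hypothetical helper `ordinal` below also handles larger ranks such as 11th to 13th and 21st. It is an illustration, not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

num_top_venues = 10
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]

print(columns[1])   # 1st Most Common Venue
print(columns[-1])  # 10th Most Common Venue
```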


## Clustering the neighborhoods

In [108]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3], dtype=int32)
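
The first ten labels above are all 3, which may indicate that one cluster absorbs most neighborhoods. Before interpreting the clusters, it can help to count how many members each one has. The sketch below does this with the standard library. `cluster_sizes` is a hypothetical helper, and only the ten labels printed above are used because the full label array is not shown in the output.

```python
from collections import Counter

def cluster_sizes(labels):
    """Return {cluster label: number of members}, ordered by label."""
    return dict(sorted(Counter(int(label) for label in labels).items()))

# The first ten labels shown above; in the notebook the full array is kmeans.labels_.
first_labels = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
print(cluster_sizes(first_labels))  # {3: 10}
```

If the full label array shows the same imbalance, trying other values of `kclusters` or other features would be a reasonable next step.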

In [113]:
# add clustering labels
# (insert() raises ValueError if the column already exists, so run this cell only once)
neighborhoods_venues_sorted.insert(0, 'Cluster Label', kmeans.labels_)

toronto_merged = toronto_in_name

# join the sorted venue table onto toronto_in_name so each neighborhood keeps its latitude/longitude
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

| | Postal code | Borough | Neighborhood | Latitude | Longitude | Cluster Label | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | M5A | Downtown Toronto | Regent Park / Harbourfront | 43.65426 | -79.360636 | 3 | 3 | Coffee Shop | Pub | Park | Bakery | Mexican Restaurant | Café | Restaurant | Breakfast Spot | Event Space | Hotel |
| 1 | M7A | Downtown Toronto | Queen's Park / Ontario Provincial Government | 43.662301 | -79.389494 | 3 | 3 | Coffee Shop | Creperie | Beer Bar | Boutique | Burger Joint | Burrito Place | Restaurant | Café | Park | College Auditorium |
| 2 | M5B | Downtown Toronto | Garden District, Ryerson | 43.657162 | -79.378937 | 3 | 3 | Coffee Shop | Clothing Store | Café | Japanese Restaurant | Bubble Tea Shop | Middle Eastern Restaurant | Cosmetics Shop | Tea Room | Bookstore | Diner |
| 3 | M5C | Downtown Toronto | St. James Town | 43.651494 | -79.375418 | 3 | 3 | Coffee Shop | Café | Restaurant | American Restaurant | Beer Bar | Cosmetics Shop | Bakery | Diner | Japanese Restaurant | Italian Restaurant |
| 4 | M4E | East Toronto | The Beaches | 43.676357 | -79.293031 | 0 | 0 | Pub | Coffee Shop | Trail | Health Food Store | Doner Restaurant | Dessert Shop | Diner | Discount Store | Distribution Center | Dog Run |


In [121]:
# create map
map_clusters = folium.Map(location=[tor_lat, tor_lon], zoom_start=10)

# set color scheme for the clusters (one color per cluster)
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lng, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Label']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
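
The palette step in the cell above only needs one visually distinct color per cluster label. As a minimal sketch of that idea without matplotlib, the hypothetical helper `cluster_palette` below spreads hues evenly using the standard-library `colorsys` module. The function name and the saturation/value defaults are assumptions made for illustration; they are not part of the notebook.

```python
import colorsys


def cluster_palette(n_clusters, saturation=0.85, value=0.9):
    """Return n_clusters hex colors with evenly spaced hues (hypothetical helper)."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    palette = []
    for k in range(n_clusters):
        # spread hues evenly around the color wheel: 0, 1/n, 2/n, ...
        hue = k / n_clusters
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        # convert the 0..1 RGB floats to a '#rrggbb' string
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


# five clusters, as in the cluster tables of this notebook (labels 0 to 4)
palette = cluster_palette(5)
print(palette)
```

A list built this way can be indexed by the cluster label itself, for example `palette[cluster]`, when setting `color` and `fill_color` on each marker.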

In [122]:
toronto_merged.loc[toronto_merged['Cluster Label'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

|  | Borough | Cluster Label | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | East Toronto | 0 | 0 | Pub | Coffee Shop | Trail | Health Food Store | Doner Restaurant | Dessert Shop | Diner | Discount Store | Distribution Center | Dog Run |
| 18 | Central Toronto | 0 | 0 | Lawyer | Park | Bus Line | Swim School | Dessert Shop | Ethiopian Restaurant | Electronics Store | Eastern European Restaurant | Dumpling Restaurant | Donut Shop |


In [123]:
toronto_merged.loc[toronto_merged['Cluster Label'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

|  | Borough | Cluster Label | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 29 | Central Toronto | 1 | 1 | Playground | Summer Camp | Women's Store | Department Store | Ethiopian Restaurant | Electronics Store | Eastern European Restaurant | Dumpling Restaurant | Donut Shop | Doner Restaurant |


In [124]:
toronto_merged.loc[toronto_merged['Cluster Label'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

|  | Borough | Cluster Label | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | Central Toronto | 2 | 2 | Sushi Restaurant | Park | Jewelry Store | Trail | Doner Restaurant | Dessert Shop | Diner | Discount Store | Distribution Center | Dog Run |
| 33 | Downtown Toronto | 2 | 2 | Park | Trail | Playground | Women's Store | Deli / Bodega | Electronics Store | Eastern European Restaurant | Dumpling Restaurant | Donut Shop | Doner Restaurant |


In [125]:
toronto_merged.loc[toronto_merged['Cluster Label'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

|  | Borough | Cluster Label | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Downtown Toronto | 3 | 3 | Coffee Shop | Pub | Park | Bakery | Mexican Restaurant | Café | Restaurant | Breakfast Spot | Event Space | Hotel |
| 1 | Downtown Toronto | 3 | 3 | Coffee Shop | Creperie | Beer Bar | Boutique | Burger Joint | Burrito Place | Restaurant | Café | Park | College Auditorium |
| 2 | Downtown Toronto | 3 | 3 | Coffee Shop | Clothing Store | Café | Japanese Restaurant | Bubble Tea Shop | Middle Eastern Restaurant | Cosmetics Shop | Tea Room | Bookstore | Diner |
| 3 | Downtown Toronto | 3 | 3 | Coffee Shop | Café | Restaurant | American Restaurant | Beer Bar | Cosmetics Shop | Bakery | Diner | Japanese Restaurant | Italian Restaurant |
| 5 | Downtown Toronto | 3 | 3 | Coffee Shop | Cocktail Bar | Cheese Shop | Bakery | Seafood Restaurant | Farmers Market | Restaurant | Café | Beer Bar | Department Store |
| 6 | Downtown Toronto | 3 | 3 | Coffee Shop | Italian Restaurant | Thai Restaurant | Sandwich Place | Burger Joint | Japanese Restaurant | Café | Dessert Shop | Spa | Salad Place |
| 7 | Downtown Toronto | 3 | 3 | Grocery Store | Café | Park | Coffee Shop | Nightclub | Baby Store | Candy Store | Gas Station | Diner | Athletics & Sports |
| 8 | Downtown Toronto | 3 | 3 | Restaurant | Coffee Shop | Café | Hotel | Gym | Bar | Bakery | Thai Restaurant | Asian Restaurant | Salad Place |
| 9 | West Toronto | 3 | 3 | Pharmacy | Bakery | Grocery Store | Café | Bar | Bank | Supermarket | Pool | Middle Eastern Restaurant | Art Gallery |
| 10 | Downtown Toronto | 3 | 3 | Coffee Shop | Aquarium | Hotel | Italian Restaurant | Café | Restaurant | Fried Chicken Joint | Brewery | Scenic Lookout | Sporting Goods Shop |


In [126]:
toronto_merged.loc[toronto_merged['Cluster Label'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

|  | Borough | Cluster Label | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | Central Toronto | 4 | 4 | Home Service | Pool | Garden | Women's Store | Department Store | Ethiopian Restaurant | Electronics Store | Eastern European Restaurant | Dumpling Restaurant | Donut Shop |
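
The cells above inspect one cluster at a time. One way to compare them side by side is to tally the top-ranked venue category within each cluster. The sketch below does this in plain Python, using only the rows displayed in the outputs above (the full `toronto_merged` frame may contain more rows). The `rows` list and the `top_venue_by_cluster` helper are illustrative names, not part of the notebook.

```python
from collections import Counter, defaultdict

# (cluster label, 1st most common venue) pairs copied from the outputs shown above
rows = [
    (0, "Pub"), (0, "Lawyer"),
    (1, "Playground"),
    (2, "Sushi Restaurant"), (2, "Park"),
    (3, "Coffee Shop"), (3, "Coffee Shop"), (3, "Coffee Shop"),
    (3, "Coffee Shop"), (3, "Coffee Shop"), (3, "Coffee Shop"),
    (3, "Grocery Store"), (3, "Restaurant"), (3, "Pharmacy"),
    (3, "Coffee Shop"),
    (4, "Home Service"),
]


def top_venue_by_cluster(pairs):
    """Map each cluster label to a Counter of its top-ranked venue categories."""
    tally = defaultdict(Counter)
    for cluster, venue in pairs:
        tally[cluster][venue] += 1
    return dict(tally)


summary = top_venue_by_cluster(rows)
for cluster in sorted(summary):
    # most frequent top-ranked venue in this cluster, and how many rows it covers
    venue, count = summary[cluster].most_common(1)[0]
    total = sum(summary[cluster].values())
    print(f"Cluster {cluster}: {venue} ({count} of {total} rows shown)")
```

On the full data frame, a comparable tally could likely be obtained with `toronto_merged.groupby('Cluster Label')['1st Most Common Venue'].value_counts()`.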
