# Pizza Place Location

### Description of the project

The main objective of this project is to select the *comuna* (borough in English) of Santiago, Chile, that is best suited for opening a pizza place, using a comparison with information obtained for New York City.

In this first part I create a dataframe with the geolocation of the different *comunas* in Santiago, Chile, and then analyze the information Foursquare has on them, focusing on pizza places and their ratings.
The same analysis then needs to be done for New York City, looking for the characteristics of the boroughs that have the most pizza places and the best ratings, in order to find the best *comuna* for a pizza place in Santiago.

### Libraries needed

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files


!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if it is already installed

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if it is already installed
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): failed

NotWritableError: The current user does not have write permissions to a required path.
  path: /home/lorena/.conda/pkgs/urls.txt
  uid: 1000
  gid: 1000

If you feel that permissions on this path are set incorrectly, you can manually
change them by executing

  $ sudo chown 1000:1000 /home/lorena/.conda/pkgs/urls.txt

In general, it's not advisable to use 'sudo conda'.


Libraries imported.
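The conda output above reports a permissions error, while the imports still succeed, which suggests the packages were already present. As a minimal sketch (the helper name `missing_packages` is hypothetical, not part of the notebook), one way to check which packages actually need installing before calling conda or pip:

```python
import importlib.util

def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# geopy and folium are the two packages this notebook installs separately
to_install = missing_packages(['geopy', 'folium'])
if to_install:
    print('Install before continuing:', ', '.join(to_install))
else:
    print('geopy and folium are already available.')
```

Running the install commands only when the list is non-empty avoids touching the conda package cache unnecessarily.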


##  *Comunas* of Santiago, Chile

Import information about the ***comunas*** of Santiago from Wikipedia

In [2]:
santiago_comunas = pd.read_html('https://es.wikipedia.org/wiki/Anexo:Comunas_de_Santiago_de_Chile')[3]
print(santiago_comunas.shape)
santiago_comunas.head()

(36, 8)


Unnamed: 0,Comuna,Ubicación?,Población (2017)?,Viviendas (2002)?,Densidad poblacional (2002) ?,Crecimiento demográfico (2002-2017)?,ICVU (2019)?,Pobreza (2015)?
0,Cerrillos,surponiente,80832,19811.0,4329.08,12.9%,47.82 (74),19.7
1,Cerro Navia,norponiente,132622,35277.0,13482.91,-10.7%,42.42 (92),35.6
2,Conchalí,norte,126955,32609.0,12070.29,-4.4%,46.52 (84),21.6
3,El Bosque,sur,162505,42808.0,12270.72,-7.3%,48.54 (70),27.0
4,Estación Central,surponiente,147041,32357.0,9036.31,16.6%,49.96 (64),14.5
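Selecting the table by position (`[3]`) depends on the current layout of the Wikipedia page. A hedged alternative is to pick the table by the columns it must contain. The sketch below uses a hypothetical helper `pick_table` and small stand-in dataframes instead of the real `pd.read_html` result:

```python
import pandas as pd

def pick_table(tables, required_columns):
    """Return the first DataFrame whose columns include every required name."""
    for table in tables:
        if set(required_columns).issubset(table.columns):
            return table
    raise ValueError('No table contains columns: {}'.format(required_columns))

# Stand-ins for the list returned by pd.read_html (illustrative data only)
tables = [
    pd.DataFrame({'Note': ['unrelated table']}),
    pd.DataFrame({'Comuna': ['Cerrillos', 'Cerro Navia'],
                  'Ubicación': ['surponiente', 'norponiente']}),
]
comunas = pick_table(tables, ['Comuna'])
print(comunas.shape)
```

`pd.read_html` also accepts a `match` argument that keeps only tables containing a given text, which may serve the same purpose.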


Importing the postcode of every *comuna* in Chile from Wikipedia

In [3]:
post_chile = pd.read_html('https://es.wikipedia.org/wiki/Anexo:C%C3%B3digos_postales_de_Chile')[0]
post_chile.rename(columns={'Comuna/localidad':'Comuna','Código':'Codigo'},inplace=True)
 
print(post_chile.shape)
post_chile.head()

(344, 2)


Unnamed: 0,Comuna,Codigo
0,Algarrobo,2710000
1,Alhué,9650000
2,Alto Biobío,4590000
3,Alto del Carmen,1650000
4,Alto Hospicio,1130000


Cleaning the dataframe to leave only the information needed

In [4]:
santiago_comunas.drop(columns=santiago_comunas.columns[1:], inplace=True) # keep only the first column (Comuna)
santiago_comunas.head()

Unnamed: 0,Comuna
0,Cerrillos
1,Cerro Navia
2,Conchalí
3,El Bosque
4,Estación Central


Merging both dataframes so that each *comuna* name is paired with its postcode

In [5]:
Santiago = pd.merge(post_chile , santiago_comunas, on='Comuna', how='inner')
Santiago = Santiago.reset_index(drop=True) # reset_index returns a new dataframe, so the result must be assigned
print(Santiago.shape)
Santiago.head()

(36, 2)


Unnamed: 0,Comuna,Codigo
0,Cerrillos,9200000
1,Cerro Navia,9080000
2,Conchalí,8540000
3,El Bosque,8010000
4,Estación Central,9160000
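An inner merge silently drops any *comuna* whose name is spelled differently in the two sources. A minimal sketch of how this could be checked, using illustrative stand-ins for `post_chile` and `santiago_comunas` (the values below are examples, not the real tables):

```python
import pandas as pd

# Illustrative stand-ins for post_chile and santiago_comunas
postcodes = pd.DataFrame({'Comuna': ['Cerrillos', 'Conchalí', 'Algarrobo'],
                          'Codigo': [9200000, 8540000, 2710000]})
comunas = pd.DataFrame({'Comuna': ['Cerrillos', 'Conchalí', 'Ñuñoa']})

# A left merge keeps every comuna; indicator=True adds a _merge column
checked = pd.merge(comunas, postcodes, on='Comuna', how='left', indicator=True)

# Rows marked left_only found no postcode in the other table
unmatched = checked.loc[checked['_merge'] == 'left_only', 'Comuna'].tolist()
print('Comunas without a postcode:', unmatched)
```

Here the shape `(36, 2)` already shows that all 36 *comunas* matched, so the check is only a safeguard for when the source pages change.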


### Importing geolocation information of every borough in Chile

In [6]:
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
# or: requests.get(url).content

# download the GeoNames postal code archive for Chile and open it in memory
resp = urlopen('http://download.geonames.org/export/zip/CL.zip')
zip_file = ZipFile(BytesIO(resp.read()))

# column names of CL.txt, used in the next cell to split each line
columns_geodata =['Country code','Postal code','Place name','Admin name1','Admin code1','Admin name2','Admin code2','Admin name3','Admin code3','lat','lng','accuracy']
# read every line as a single string column; the tab-separated fields are split in the next cell
geodata_chile = pd.read_csv(zip_file.open('CL.txt'),names = ['Name'])
geodata_chile.head()


Unnamed: 0,Name
0,CL\t2340000\tValparaíso\tRegión de Valparaíso\...
1,CL\t2480000\tCasablanca\tRegión de Valparaíso\...
2,CL\t2490000\tQuintero\tRegión de Valparaíso\t0...
3,CL\t2500000\tPuchuncaví\tRegión de Valparaíso\...
4,CL\t2510000\tConcón\tRegión de Valparaíso\t01\...


In [7]:
# Cleaning the dataframe

geodata_chile[columns_geodata] = geodata_chile.Name.str.split("\t",expand=True) 
geodata_chile.drop(['Name'],axis=1,inplace=True)
geodata_chile.head()

Unnamed: 0,Country code,Postal code,Place name,Admin name1,Admin code1,Admin name2,Admin code2,Admin name3,Admin code3,lat,lng,accuracy
0,CL,2340000,Valparaíso,Región de Valparaíso,1,Provincia de Valparaíso,51,Valparaíso,5101,-33.1298,-71.5735,4
1,CL,2480000,Casablanca,Región de Valparaíso,1,Provincia de Valparaíso,51,Casablanca,5102,-33.3158,-71.4353,4
2,CL,2490000,Quintero,Región de Valparaíso,1,Provincia de Valparaíso,51,Quintero,5107,-32.843,-71.4738,4
3,CL,2500000,Puchuncaví,Región de Valparaíso,1,Provincia de Valparaíso,51,Puchuncaví,5105,-32.7176,-71.4111,4
4,CL,2510000,Concón,Región de Valparaíso,1,Provincia de Valparaíso,51,Concón,5103,-32.9534,-71.4678,4
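The two steps above (reading each line as one string, then splitting on tabs) can probably be collapsed by telling `pd.read_csv` that the file is tab-separated. The sketch below assumes the layout shown in the output and uses two illustrative rows in place of the downloaded `CL.txt`:

```python
import io
import pandas as pd

columns_geodata = ['Country code', 'Postal code', 'Place name', 'Admin name1',
                   'Admin code1', 'Admin name2', 'Admin code2', 'Admin name3',
                   'Admin code3', 'lat', 'lng', 'accuracy']

# Two illustrative rows modelled on the output above, not the real file
sample = (
    'CL\t2340000\tValparaíso\tRegión de Valparaíso\t01\t'
    'Provincia de Valparaíso\t51\tValparaíso\t5101\t-33.1298\t-71.5735\t4\n'
    'CL\t2480000\tCasablanca\tRegión de Valparaíso\t01\t'
    'Provincia de Valparaíso\t51\tCasablanca\t5102\t-33.3158\t-71.4353\t4\n'
)

# sep='\t' splits the fields while reading; names supplies the missing header
geodata = pd.read_csv(io.StringIO(sample), sep='\t', names=columns_geodata,
                      dtype={'Postal code': 'int64', 'Admin code1': str})
print(geodata[['Postal code', 'Place name', 'lat', 'lng']])
```

Read this way, `lat` and `lng` are parsed as floats and the postcode as an integer, so no later type conversion is needed.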


In [12]:
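# Dropping unnecessary information

new_geo_df = geodata_chile[['Postal code','lat','lng']].copy() # copy so the rename does not act on a view of geodata_chile
new_geo_df.rename(columns={'Postal code':'Codigo'},inplace = True)


new_geo_df = new_geo_df.astype({'Codigo':'int64'}) # Codigo must be an integer to merge with the Wikipedia postcodes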
print(new_geo_df.dtypes)


new_geo_df.head()

Codigo     int64
lat       object
lng       object
dtype: object


Unnamed: 0,Codigo,lat,lng
0,2340000,-33.1298,-71.5735
1,2480000,-33.3158,-71.4353
2,2490000,-32.843,-71.4738
3,2500000,-32.7176,-71.4111
4,2510000,-32.9534,-71.4678
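The `dtypes` output above shows that `lat` and `lng` are still stored as strings (`object`). If numeric coordinates are needed later, for example to compute distances or averages, they could be converted as in this sketch, which uses illustrative rows rather than the full `new_geo_df`:

```python
import pandas as pd

# Illustrative rows with coordinates stored as strings, as in new_geo_df
coords = pd.DataFrame({'Codigo': [2340000, 2480000],
                       'lat': ['-33.1298', '-33.3158'],
                       'lng': ['-71.5735', '-71.4353']})

# errors='coerce' turns unparseable values into NaN instead of raising
for column in ['lat', 'lng']:
    coords[column] = pd.to_numeric(coords[column], errors='coerce')

print(coords.dtypes)
```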


### Dataframe to be used

In [9]:
import unidecode # third-party package used to strip accents


santiago_geolocation = pd.merge(Santiago , new_geo_df, on='Codigo', how='inner')
print(santiago_geolocation.shape)
print(santiago_geolocation.dtypes)

# Remove accents from the comuna names
santiago_geolocation['Comuna'] = santiago_geolocation['Comuna'].apply(unidecode.unidecode)


santiago_geolocation.head()

(36, 4)
Comuna    object
Codigo     int64
lat       object
lng       object
dtype: object


Unnamed: 0,Comuna,Codigo,lat,lng
0,Cerrillos,9200000,-33.5003,-70.7174
1,Cerro Navia,9080000,-33.4228,-70.745
2,Conchali,8540000,-33.3837,-70.6774
3,El Bosque,8010000,-33.5629,-70.6764
4,Estacion Central,9160000,-33.4645,-70.6986
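If installing `unidecode` is not an option, accents can also be removed with the standard library's `unicodedata` module. This is a sketch of an alternative, not the method used in the cell above, and `strip_accents` is a hypothetical helper name:

```python
import unicodedata

def strip_accents(text):
    """Remove combining marks (accents, tildes) using only the standard library."""
    # NFKD splits an accented letter into its base letter plus a combining mark
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

for name in ['Conchalí', 'Estación Central', 'Ñuñoa']:
    print(name, '->', strip_accents(name))
```

Unlike `unidecode`, this approach only removes combining marks; characters without a decomposition (such as `ß`) are left unchanged.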


### Map of Santiago

In [10]:
# Santiago latitude and longitude
latitude = -33.416889
longitude = -70.606705

# create map of Santiago using latitude and longitude values
map_santiago = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per comuna to the map
for lat, lng, label in zip(santiago_geolocation['lat'], santiago_geolocation['lng'], santiago_geolocation['Comuna']):
    label = folium.Popup(str(label), parse_html=True)
    folium.CircleMarker(
        [float(lat), float(lng)], # lat and lng are stored as strings
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_santiago)
    
map_santiago
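The map center above is hard-coded. As a hedged alternative, it could be derived from the data as the mean of the *comuna* coordinates. The sketch below uses an illustrative three-row subset of `santiago_geolocation`:

```python
import pandas as pd

# Illustrative subset of santiago_geolocation (coordinates stored as strings)
points = pd.DataFrame({'Comuna': ['Cerrillos', 'Cerro Navia', 'Conchali'],
                       'lat': ['-33.5003', '-33.4228', '-33.3837'],
                       'lng': ['-70.7174', '-70.745', '-70.6774']})

# Convert to float before averaging, since the columns hold strings
center_lat = points['lat'].astype(float).mean()
center_lng = points['lng'].astype(float).mean()
print(round(center_lat, 4), round(center_lng, 4))
```

The resulting values can be passed to `folium.Map(location=[center_lat, center_lng], zoom_start=11)` so that the map stays centered if the set of *comunas* changes.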