**Table of Contents**

1. [Notebook setup](#Notebook setup)
1. [Data retrieval](#Data retrieval)
    1. [Population Data](#Population Data)
    1. [Real Estate Data](#Real Estate Data)
    1. [Web scraping the list of neighborhoods](#Web scrapping the list of Neighborhoods)


Install and import the Python packages needed for the data analysis.

1. > #### Notebook setup<a id="Notebook setup"/>

In [None]:
# standard library
import io
import random  # library for random number generation
import re
import time

import requests  # library to handle requests
import numpy as np  # library to handle data in a vectorized manner
import pandas as pd
from pandas import json_normalize  # transform JSON into a pandas dataframe

# geocoding, maps and plots
import folium  # map plotting library
import seaborn as sns
from geopy.geocoders import Nominatim  # convert an address into latitude and longitude values

# web scraping
from bs4 import BeautifulSoup

# libraries for displaying images
from IPython.display import HTML, Image

# libraries for data preprocessing
from sklearn.preprocessing import Normalizer, StandardScaler

# library for ML clustering
from sklearn.cluster import KMeans

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', 500)

LIMIT = 100 # limit of number of venues returned by Foursquare API

In [None]:
!conda install -c conda-forge geopy --yes

2. > ### Data retrieval<a id="Data retrieval"></a>

We will need three sources of data:
* Population data
* Real estate data
* Venue location and profile

2. 1. > #### Population data<a id="Population Data"></a>

We will retrieve the population distribution by neighborhood of Lisbon from a public-access database. The data is from 2011, the year of the last national census in Portugal, and is available on the website of the national statistics authority.

To retrieve the data from the website, we need to add the *headers={'User-Agent': 'Mozilla/5.0'}* option to the *requests.get* call because the website requires it to allow us to download the data.

In [None]:
req = requests.get('https://www.ine.pt/clientFiles/Tx229qn_YCWF3G-boMIgFv1ysVK3XljoZXw-264d_32562.csv', headers={'User-Agent': 'Mozilla/5.0'}).text

A quick inspection of the file shows that it has a few lines at the beginning and at the end that describe the data; these need to be skipped before we load the data into a data frame.

In [None]:
print(req[1:600])

The file has 53 lines of data, one per neighborhood; this count comes from inspecting the file and is confirmed by the website. So we can skip the first 14 lines and load just the next 53 lines.

We will also name the data frame headers according to the data in the file. We will only keep the relevant columns for the data analysis and skip the other ones. Then we will review what was loaded and clean the data.
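
As a rough illustration of the loading options described above, the sketch below parses a small in-memory text instead of the real file. The sample rows, numbers and column names are hypothetical; only the `pd.read_csv` options mirror the ones used in this notebook.

```python
import io
import pandas as pd

# Hypothetical file: two description lines, three data rows, one footer line.
raw = (
    "Population by parish\n"
    "Source: example\n"
    "2011;110501: Ajuda;1000;100\n"
    "2011;110502: Alcântara;2000;200\n"
    "2011;110503: Alvalade;3000;300\n"
    "Last updated: never\n"
)

names = ["period", "id", "Neighborhood", "Total", "15 - 19 years"]
sample = pd.read_csv(
    io.StringIO(raw),
    skiprows=2,          # skip the description lines at the top
    nrows=3,             # stop before the footer
    header=None,
    names=names,
    sep=":|;",           # regex separator: split on ':' or ';'
    engine="python",     # regex separators need the python engine
    usecols=[2, 3, 4],   # keep only the columns used in the analysis
)

# Splitting on ':' leaves a leading space in the name, so strip it.
sample["Neighborhood"] = sample["Neighborhood"].str.strip()
print(sample)
```

The regex separator is what makes the `id: name` pair fall into two columns, which is also why the names come back with leading whitespace.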

In [None]:
req = requests.get('https://www.ine.pt/clientFiles/Tx229qn_YCWF3G-boMIgFv1ysVK3XljoZXw-264d_32562.csv', headers={'User-Agent': 'Mozilla/5.0'}).text
headers = ['Data reference period','id', 'Neighborhood','Total','15 - 19 years', '20 - 24 years', '25 - 29 years','30 - 34 years','empty']
# named col_types to avoid shadowing the built-in type()
col_types = {'Neighborhood': 'string'}
population_data = pd.read_csv(io.StringIO(req), skiprows=14, nrows=53, header=None, names=headers, sep=':|;', engine='python', usecols=list(range(2,8)), dtype=col_types)

In [None]:
population_data.Neighborhood[:6].apply(lambda x: '-{}-'.format(x)) #check what was loaded in the neighborhood column.

As we can see above, we need to strip the leading whitespace from the values in the Neighborhood column.

In [None]:
population_data.Neighborhood = population_data.Neighborhood.apply(lambda x: x.strip())

In [None]:
population_data.dtypes

In [None]:
population_data.head().sort_values('Neighborhood')

2. 2. > #### Real Estate data<a id="Real Estate Data"></a>

Next, we will retrieve from the same source the median value per square meter of dwelling sales in the city of Lisbon, by neighborhood. The data is from the last quarter of 2019.
We will use the same strategy described above to retrieve the data.
We need to skip the first 12 lines of the file and load only the next 24 lines.

In [None]:
req = requests.get('http://www.ine.pt/clientFiles/doiruoM_g0OE783CqcQFlrT_gZ5W4b37oicowX1X_93336.csv', headers={'User-Agent': 'Mozilla/5.0'}).text
headers = ['Neighborhood','Median value per m2 of dwellings sales']
# named col_types to avoid shadowing the built-in type()
col_types = {'Neighborhood': 'string'}
real_estate = pd.read_csv(io.StringIO(req), skiprows=12, nrows=24, header=None, names=headers, sep=':|;', engine='python', usecols=[1,2], dtype=col_types)

In [None]:
real_estate.Neighborhood[:6].apply(lambda x: '-{}-'.format(x)) #check what was loaded in the neighborhood column.

In [None]:
real_estate.Neighborhood = real_estate.Neighborhood.apply(lambda x: x.strip())

In [None]:
real_estate.head().sort_values('Neighborhood')

In [None]:
real_estate.info()

You may notice that these data show a different set of neighborhoods. This is because between 2011 and 2019 there was an administrative reorganization of the neighborhoods in Portugal, and many were merged.

So the first data frame shows the population distribution for an outdated administrative division of the city. We will need to scrape a Wikipedia page to get the relation between the current neighborhoods and the old ones.

3. > ### Web scraping the list of neighborhoods<a id="Web scrapping the list of Neighborhoods"/>

We will get the information we need to merge the two data frames from a Wikipedia page that relates the two sets of neighborhood names.
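
To make the scraping idea concrete, here is a minimal sketch on a hypothetical HTML table, not the real Wikipedia page. It assumes the new name sits in a cell with `rowspan`, so follow-up rows have fewer cells and the last seen name has to be carried forward. The names and layout are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical table: the first row of each group carries the new name in a
# cell with rowspan, so continuation rows have fewer <td> cells.
html = """
<table>
  <tr><th>New</th><th>Old</th></tr>
  <tr><td rowspan="2">Alpha[1]</td><td>Old A1</td></tr>
  <tr><td>Old A2 (part)</td></tr>
  <tr><td>Beta</td><td>Old B1</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
pairs = []
current = None
for row in soup.table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    if len(cells) == 2:                              # full row: new name + old name
        current = cells[0]
        pairs.append((current, cells[1]))
    elif len(cells) == 1 and current is not None:    # continuation row
        pairs.append((current, cells[0]))

# Drop footnote markers such as "[1]" and parenthetical notes.
def clean_name(text):
    return text.split("[")[0].split("(")[0].strip()

clean = [(clean_name(new), clean_name(old)) for new, old in pairs]
print(clean)
```

The header row contains only `<th>` cells, so it yields an empty list and is skipped without special handling.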

In [None]:
website_url = requests.get('https://pt.wikipedia.org/wiki/Lista_de_freguesias_de_Lisboa').text

In [None]:
soup = BeautifulSoup(website_url,'lxml')

In [None]:
headers = ['Neighborhood', 'Old Neighborhood']
neighborhoods = pd.DataFrame(columns = headers)

In [None]:
current_neighborhood = None
for row in soup.table.find_all('tr'):
    row_data = [data.text.strip() for data in row.find_all('td')]
    if len(row_data) == 10:
        # full row: starts a new current neighborhood and names its first old neighborhood
        neighborhoods.loc[len(neighborhoods)] = [row_data[2], row_data[7]]
        current_neighborhood = row_data[2]
    elif len(row_data) == 5 and row_data[0] != '62':
        # continuation row: another old neighborhood merged into the current one
        neighborhoods.loc[len(neighborhoods)] = [current_neighborhood, row_data[2]]
    elif len(row_data) == 5 and row_data[0] == '62':
        # row numbered 62: stored with an empty old neighborhood
        neighborhoods.loc[len(neighborhoods)] = [row_data[2], '']

In [None]:
neighborhoods.head()

In [None]:
neighborhoods = neighborhoods.astype('string')
neighborhoods.info()

In [None]:
def strip_annotations(text):
    """Return the text before the first '[' or '(' (drops footnote markers and parenthetical notes)."""
    return text.split('[')[0].split('(')[0]

In [None]:
neighborhoods['Neighborhood'] = neighborhoods['Neighborhood'].apply(lambda x: strip_annotations(x).strip())
neighborhoods['Old Neighborhood'] = neighborhoods['Old Neighborhood'].apply(lambda x: strip_annotations(x).strip())
neighborhoods = neighborhoods.astype('string')

In [None]:
neighborhoods

4. > ### Data Preparation<a id="Data Preparation"/>

We will merge the data sets in two steps: first the population data with the neighborhood mapping, then the resulting data frame with the real estate data.
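
Merging on names silently drops rows whose spelling differs between the two sources. The sketch below, on hypothetical data, shows one way to check for that using an outer merge with `indicator=True`. It is a diagnostic suggestion, not a step this notebook performs.

```python
import pandas as pd

# Hypothetical mapping and population tables; one name is misspelled on purpose.
mapping = pd.DataFrame({
    "Neighborhood": ["Alpha", "Alpha", "Beta"],
    "Old Neighborhood": ["Old A1", "Old A2", "Old B1"],
})
population = pd.DataFrame({
    "Old Neighborhood": ["Old A1", "Old A2", "Old B 1"],  # note the extra space
    "Total": [100, 200, 300],
})

# An outer merge keeps unmatched rows from both sides and labels their origin.
check = pd.merge(mapping, population, on="Old Neighborhood",
                 how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["Neighborhood", "Old Neighborhood", "_merge"]])
```

Any row flagged `left_only` or `right_only` points at a name that needs cleaning before the default inner merge is trusted.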

In [None]:
population_data.head()

In [None]:
population_data.rename(columns={'Neighborhood': 'Old Neighborhood'}, inplace = True)

In [None]:
df_Neighborhood_population = pd.merge(neighborhoods, population_data, on='Old Neighborhood')

In [None]:
df_Neighborhood_population.info()

In [None]:
df_Neighborhood_population.head(10)

In [None]:
columns = df_Neighborhood_population.columns

In [None]:
stats = df_Neighborhood_population[(columns[2:])].loc[29]

In [None]:
olivais = (stats * 0.6).astype(int)
parque = (stats * 0.4).astype(int)

In [None]:
parque

In [None]:
df_Neighborhood_population = df_Neighborhood_population.drop([29], axis = 0)

In [None]:
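# build the two replacement rows: [Neighborhood, Old Neighborhood, population columns...]
a = ["Parque das Nações", ""]
a.extend(parque.tolist())

b = ["Olivais", "Santa Maria dos Olivais"]
b.extend(olivais.tolist())
dfnew = pd.DataFrame([a,b], columns=df_Neighborhood_population.columns)
# DataFrame.append was removed in pandas 2.0; pd.concat is the supported equivalent
df_Neighborhood_population = pd.concat([df_Neighborhood_population, dfnew])
df_Neighborhood_population.info()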

In [None]:
df_Neighborhood_population.head(10)

In [None]:
neighborhoods = neighborhoods.groupby(['Neighborhood'])['Old Neighborhood'].apply(', '.join).reset_index()
neighborhoods.head(10)

In [None]:
df_Neighborhood_real_estate = pd.merge(neighborhoods, real_estate, on='Neighborhood')

In [None]:
df_Neighborhood_real_estate.head(10)

In [None]:
df_Neighborhood_population.head()

In [None]:
df = df_Neighborhood_population.groupby(['Neighborhood']).sum()

In [None]:
df_Neighborhood_population_grouped = df

In [None]:
df_Neighborhood_population_grouped.head(10)

There is an issue with the merge: one of the old neighborhoods, "Santa Maria dos Olivais", was divided between two new neighborhoods, so its data has to be split between them (as done in the cells above).
We know that the split was around 60% to the new "Olivais" neighborhood and 40% to the new "Parque das Nações" neighborhood.
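
The proportional split can be sketched on hypothetical counts as follows. The 60/40 shares come from the text above; the numbers are invented. This version rounds rather than truncates, which is a small departure from the `astype(int)` used in this notebook.

```python
import pandas as pd

# Hypothetical counts for one old neighborhood that was divided in two.
old_row = pd.Series({"Total": 1000, "15 - 19 years": 55, "20 - 24 years": 65})
shares = {"Olivais": 0.6, "Parque das Nações": 0.4}  # approximate split from the text

# One column per new neighborhood, then transpose so each becomes a row.
split = pd.DataFrame({name: (old_row * share).round().astype(int)
                      for name, share in shares.items()}).T
split.index.name = "Neighborhood"
print(split)
```

With other inputs, rounding each part separately may leave the parts one unit off the original total, which is acceptable for an approximate split like this one.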

In [None]:
df2 = pd.merge(df_Neighborhood_real_estate, df_Neighborhood_population_grouped, on = "Neighborhood")

In [None]:
score_columns = df2[['Median value per m2 of dwellings sales', 'Total', '15 - 19 years', '20 - 24 years', '25 - 29 years', '30 - 34 years']].columns

In [None]:
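# Note: 'total score' is only computed in the scoring cell further down,
# so this cell raises a KeyError until that cell has been run.
df2[['Neighborhood', 'total score']]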

In [None]:
import sys
!{sys.executable} -m pip install geocoder

In [None]:
import geocoder

In [None]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
ACCESS_TOKEN = user_secrets.get_secret("ACCESS_TOKEN")
CLIENT_ID = user_secrets.get_secret("CLIENT_ID")
CLIENT_SECRET = user_secrets.get_secret("CLIENT_SECRET")
google_api_key = user_secrets.get_secret("google_api_key")

VERSION = '20180605' # Foursquare API version

In [None]:
g = geocoder.google('Lisboa, Portugal', key=google_api_key)
g.latlng
print(g.latlng)
latitude, longitude = g.latlng

In [None]:
def get_latlng(place, max_retries=5):
    """Geocode a place in Lisbon; return (lat, lng), or (None, None) if it cannot be resolved."""
    # retry a bounded number of times instead of looping forever
    for _ in range(max_retries):
        g = geocoder.google('{}, Lisboa, Portugal'.format(place), key=google_api_key)
        if g.latlng is not None:
            return g.latlng[0], g.latlng[1]
        time.sleep(1)  # brief pause before retrying
    return None, None

In [None]:
df2.head()

In [None]:
df2['coord'] = df2.Neighborhood.apply(lambda x: get_latlng(x))

In [None]:
df2.head()

In [None]:
df2['Latitude'] = df2.coord.apply(lambda x: x[0])
df2['Longitude'] = df2.coord.apply(lambda x: x[1])

In [None]:
df2.drop("coord", axis=1, inplace=True)

In [None]:
address = 'Lisbon, Portugal'

geolocator = Nominatim(user_agent="lisbon_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the city of Lisbon are {}, {}.'.format(latitude, longitude))

In [None]:
# create map of Lisbon using latitude and longitude values
map_lisbon = folium.Map(location=[latitude, longitude], zoom_start=14)

# add markers to map
for lat, lng, Neighborhood in zip(df2['Latitude'], df2['Longitude'], df2['Neighborhood']):
    label = '{}'.format(Neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_lisbon)  
    
map_lisbon

In [None]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [None]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # build the API request; requests encodes the query parameters
        url = 'https://api.foursquare.com/v2/venues/explore'
        params = {
            'client_id': CLIENT_ID,
            'client_secret': CLIENT_SECRET,
            'v': VERSION,
            'll': '{},{}'.format(lat, lng),
            'radius': radius,
            'limit': LIMIT,
        }

        # make the GET request
        results = requests.get(url, params=params).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [None]:
downtown_venues = getNearbyVenues(names=df2['Neighborhood'], latitudes=df2['Latitude'], longitudes=df2['Longitude'])

In [None]:
print(downtown_venues.shape)

In [None]:
downtown_venues.head()

In [None]:
downtown_venues.groupby('Neighborhood').count()

In [None]:
print('There are {} unique categories.'.format(len(downtown_venues['Venue Category'].unique())))

In [None]:
# one hot encoding
downtown_onehot = pd.get_dummies(downtown_venues[['Venue Category']], prefix="", prefix_sep="")

# drop the Neighborhood column (that doesn't have the names at the moment)
#downtown_onehot = downtown_onehot.drop(['Neighborhood'], axis = 1)

# add neighborhood column back to dataframe
downtown_onehot['Neighborhood'] = downtown_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [downtown_onehot.columns[-1]]  + list(downtown_onehot.columns[:-1])
downtown_onehot = downtown_onehot[fixed_columns]

downtown_onehot.head()

In [None]:
downtown_onehot.shape

In [None]:
downtown_grouped = downtown_onehot.groupby('Neighborhood').mean().reset_index()
downtown_grouped
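
The one-hot encoding followed by a group mean turns the venue list into per-neighborhood category frequencies. A minimal sketch on hypothetical venues:

```python
import pandas as pd

# Hypothetical venue list: four venues in one neighborhood, two in another.
venues = pd.DataFrame({
    "Neighborhood": ["Alpha", "Alpha", "Alpha", "Alpha", "Beta", "Beta"],
    "Venue Category": ["Café", "Café", "Bakery", "Restaurant", "Restaurant", "Restaurant"],
})

# One indicator column per category, then the mean per neighborhood.
onehot = pd.get_dummies(venues["Venue Category"]).astype(float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
freq = onehot.groupby("Neighborhood").mean()
print(freq)
```

Each row of `freq` sums to 1, so a value is the share of that neighborhood's venues in the given category rather than a raw count.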

In [None]:
downtown_grouped.shape

In [None]:
num_top_venues = 5

for hood in downtown_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = downtown_grouped[downtown_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

In [None]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
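
As a quick check of the ranking logic, the same slicing and sorting can be applied to a single hypothetical row (the values are invented):

```python
import pandas as pd

# Hypothetical frequency row: the first entry is the name, the rest are frequencies.
row = pd.Series({"Neighborhood": "Alpha", "Bakery": 0.25, "Café": 0.5,
                 "Restaurant": 0.15, "Bar": 0.10})

# Skip the name, sort the frequencies in descending order, keep the top two labels.
top = row.iloc[1:].sort_values(ascending=False).index.values[:2]
print(list(top))
```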

In [None]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = downtown_grouped['Neighborhood']

for ind in np.arange(downtown_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(downtown_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

In [None]:
# set number of clusters
kclusters = 5

downtown_grouped_clustering = downtown_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(downtown_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:15] 
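
To see what the clustering step does, here is a small sketch on a hypothetical two-category frequency matrix. The data and the choice of two clusters are assumptions for illustration; `n_init` is set explicitly because its default differs between scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical frequency rows: two café-heavy and two restaurant-heavy neighborhoods.
X = np.array([
    [0.8, 0.2],
    [0.7, 0.3],
    [0.1, 0.9],
    [0.2, 0.8],
])

model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
labels = model.labels_
print(labels)
```

The label numbers themselves are arbitrary; only which rows share a label is meaningful.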

In [None]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
downtown_merged = df2

# join the sorted venues onto df2 so each neighborhood keeps its latitude/longitude
downtown_merged = downtown_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

downtown_merged.head() # check the last columns!

In [None]:
df2.rename(columns={'Median value per m2 of dwellings sales': 'real estate value'}, inplace=True)
df2.info()

In [None]:
url = 'https://opendata.arcgis.com/datasets/e0ebb7f5038e4114979f73cbf66321ef_1.geojson'

neighborhoods_json=requests.get(url).json()

In [None]:
#!wget --quiet https://opendata.arcgis.com/datasets/e0ebb7f5038e4114979f73cbf66321ef_1.geojson lisbon_json
#lisbon_geo = r'lisbon_json'
def makeMap(center = [latitude, longitude], zoom = 12):
    neighMap = folium.Map(location = center, zoom_start = zoom)#, tiles='cartodbpositron')

    # choropleth map without data to outline the neighborhoods    
    choropleth = folium.Choropleth(
        geo_data = neighborhoods_json,#neighborhoods_json,
        data = df2,
        columns = ['Neighborhood', 'real estate value'],
        key_on ='feature.properties.NOME',
        name = 'choropleth',
        fill_color = 'YlOrRd',
        fill_opacity = 0.7, 
        line_opacity = 0.3,
        legend_name = 'Real Estate Value',
        line_color = 'black',
        highlight = True,
    ).add_to(neighMap)

    #choropleth.geojson.add_child(folium.features.GeoJsonTooltip(['Neighborhood'],labels=False))
    
    return neighMap

In [None]:
lisbon_map = makeMap()

# add approximate business center markers to map
for lat, lng, neighborhood in zip(df2['Latitude'], df2['Longitude'], df2['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=label,
        fill=True,
        parse_html=False,
        color='blue'
    ).add_to(lisbon_map)
    
# display map
lisbon_map

## The business types criteria specified by the client: 'Restaurants', 'Cafés' and 'Bars'

> Let's look at their frequency of occurrence across all the Lisbon neighborhoods, isolating the venue categories.
These are the venue types that the client wants in abundance around the ideal store locations. I've used violin plots from the seaborn library: they are a good way to visualise frequency distributions because they display a density estimate of the underlying distribution.

In [None]:
# Categorical plot
# Explore a plot of this data (a violin plot is used which is a density estimation of the underlying distribution).
# The top 3 venue types as specified by the client for each neighborhood are used for the plotting.
import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(figsize=(50,25))
sns.set(font_scale=1.1)

ax = plt.subplot(3,1,1)
sns.violinplot(x="Neighborhood", y="Restaurant", data=downtown_onehot, cut=0);
plt.xlabel("")

ax = plt.subplot(3,1,2)
sns.violinplot(x="Neighborhood", y="Café", data=downtown_onehot, cut=0);
plt.xlabel("")

plt.subplot(3,1,3)
sns.violinplot(x="Neighborhood", y="Bakery", data=downtown_onehot, cut=0);

ax.text(-1.0, 3.1, 'Frequency distribution for the top 3 venue categories for each neighborhood', fontsize=60)
plt.savefig ("Distribution_Frequency_Venues_3_categories.png", dpi=240)
plt.show()

In [None]:
fig = plt.figure(figsize=(50,25))
sns.set(font_scale=1.1)

ax = plt.subplot(4,1,1)
sns.violinplot(x="Neighborhood", y="Restaurant", data=downtown_onehot, cut=0);
plt.xlabel("")

ax = plt.subplot(4,1,2)
sns.violinplot(x="Neighborhood", y="Café", data=downtown_onehot, cut=0);
plt.xlabel("")

plt.subplot(4,1,3)
sns.violinplot(x="Neighborhood", y="Bakery", data=downtown_onehot, cut=0);

plt.subplot(4,1,4)
sns.violinplot(x="Neighborhood", y="Bookstore", data=downtown_onehot, cut=0);

ax.text(-1.0, 3.1, 'Frequency distribution for the top 4 venue categories for each neighborhood (click to enlarge)', fontsize=60)
plt.savefig ("Distribution_Frequency_Venues_4_categories.png", dpi=240)
plt.show()

So our candidates are:
* Alvalade
* Areeiro
* Avenidas Novas
* Campo de Ourique
* Campolide
* Parque das Nações

In [None]:
violin_data = ['Alvalade', 'Areeiro', 'Avenidas Novas', 'Campo de Ourique', 'Campolide', 'Parque das Nações']

There are still a lot of neighborhoods to analyse. We have another data source to include in our analysis: where younger people live. So let's display it on the map.

In [None]:
def population_map(center = [latitude, longitude], zoom = 12):
    map = folium.Map(location = center, zoom_start = zoom)

    # choropleth map without data to outline the neighborhoods    
    choropleth = folium.Choropleth(
        geo_data = neighborhoods_json,#neighborhoods_json,
        data = df2,
        columns = ['Neighborhood', 'Total'],
        key_on ='feature.properties.NOME',
        name = 'choropleth',
        fill_color = 'YlOrRd',
        fill_opacity = 0.7, 
        line_opacity = 0.3,
        legend_name = 'Total of younger Population distribution',
        line_color = 'black',
        highlight = True,
    ).add_to(map)

    return map

younger_pop_map = population_map()

# add approximate neighborhood center markers to map
for lat, lng, neighborhood in zip(df2['Latitude'], df2['Longitude'], df2['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=label,
        fill=True,
        parse_html=False,
        color='blue'
    ).add_to(younger_pop_map)
    
# display map
younger_pop_map

In [None]:
df2.head()

In [None]:
# weighted score: 70% real estate value, 30% population, each normalized by its mean
df2['real estate score'] = 0.7 * df2['real estate value'] / df2['real estate value'].mean()
df2['population score'] = 0.3 * df2['Total'] / df2['Total'].mean()
df2['total score'] = df2['population score'] + df2['real estate score']
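
A worked example of this weighting on two hypothetical neighborhoods may help with reading the scores. The values are invented; the 0.7 and 0.3 weights are the ones used in the cell above.

```python
import pandas as pd

# Hypothetical values: Alpha is expensive with few residents, Beta the opposite.
toy = pd.DataFrame({
    "Neighborhood": ["Alpha", "Beta"],
    "real estate value": [3000.0, 1000.0],
    "Total": [100, 300],
})

# Each component is divided by its own mean, then weighted.
toy["real estate score"] = 0.7 * toy["real estate value"] / toy["real estate value"].mean()
toy["population score"] = 0.3 * toy["Total"] / toy["Total"].mean()
toy["total score"] = toy["real estate score"] + toy["population score"]
print(toy[["Neighborhood", "total score"]])
```

Because each component is divided by its own mean, the total scores average to 0.7 + 0.3 = 1, so a score above 1 means a neighborhood is above average on the weighted combination.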

In [None]:
def population_map(center = [latitude, longitude], zoom = 12):
    map = folium.Map(location = center, zoom_start = zoom)

    # choropleth map without data to outline the neighborhoods    
    choropleth = folium.Choropleth(
        geo_data = neighborhoods_json,#neighborhoods_json,
        data = df2,
        columns = ['Neighborhood', 'total score'],
        key_on ='feature.properties.NOME',
        name = 'choropleth',
        fill_color = 'YlOrRd',
        fill_opacity = 0.7, 
        line_opacity = 0.3,
        legend_name = 'Best score',
        line_color = 'black',
        highlight = True,
    ).add_to(map)

    return map

younger_pop_map = population_map()

# add approximate neighborhood center markers to map
for lat, lng, neighborhood in zip(df2['Latitude'], df2['Longitude'], df2['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=label,
        fill=True,
        parse_html=False,
        color='blue'
    ).add_to(younger_pop_map)
    
# display map
younger_pop_map

In [None]:
df2.head()

In [None]:
df2.set_index(df2.Neighborhood, inplace = True)

In [None]:
df3 = df2.loc[violin_data]

In [None]:
def population_map(center = [latitude, longitude], zoom = 12):
    map = folium.Map(location = center, zoom_start = zoom)

    # choropleth map without data to outline the neighborhoods    
    choropleth = folium.Choropleth(
        geo_data = neighborhoods_json,
        data = df2,
        columns = ['Neighborhood', 'total score'],
        key_on ='feature.properties.NOME',
        name = 'choropleth',
        fill_color = 'YlOrRd',
        fill_opacity = 0.7, 
        line_opacity = 0.3,
        legend_name = 'Best score',
        line_color = 'black',
        highlight = True,
    ).add_to(map)

    return map

younger_pop_map = population_map()

# highlight the candidate neighborhoods on the map
for lat, lng, neighborhood in zip(df3['Latitude'], df3['Longitude'], df3['Neighborhood']):
    folium.CircleMarker(
        location=[lat, lng],
        radius=20,
        fill=True,
        parse_html=True,
        color='green',
        fill_color='#3186cc'
    ).add_to(younger_pop_map)
    
younger_pop_map

In [None]:
m = folium.Map(
    location=[latitude, longitude],
    zoom_start=12  # Limited levels of zoom for free Mapbox tiles.
)

folium.GeoJson(
   neighborhoods_json,
    name='geojson'
).add_to(m)
m