# Similar district
## Coursera Capstone Project
### Sergii Guzenko

## Introduction/Business Problem
A family with a small child has decided to move from Turin, Italy to Manhattan, NY. They are looking for the most suitable neighbourhood and want to minimise the impact of the relocation on their daily life and work. They therefore asked me to compare their current home city with the new one and identify similar neighbourhoods.

Together we drew up a list of criteria for the research, based on the presence of, and distance to, the following:
- schools/kindergartens;
- parks/playgrounds for children;
- gyms/swimming pools;
- supermarkets/grocery shops;
- train and bus stations;
- airport;
- restaurants;
- landmarks.

In the future, this model can be used to find similar districts in another city or country in order to:
- suggest relocation options;
- propose investment solutions;
- help solve urban problems.

## Data type and sources
I will use Foursquare data to characterise and cluster neighbourhoods:
- venues by category;
- distance from the centre of the neighbourhood.
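
The distance criterion can be made concrete with a great-circle distance. The block below is a minimal sketch, not part of the original notebook: the function name `haversine_m` and the Earth-radius constant are assumptions introduced here, and the two coordinate pairs are copied from the notebook output (the geocoded Turin home and the venue Parco della Tesoriera).

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Coordinates copied from the notebook output.
home = (45.0717654, 7.6468397)   # geocoded Turin home
park = (45.076597, 7.638373)     # Parco della Tesoriera
distance_to_park = haversine_m(*home, *park)
print(round(distance_to_park))   # distance in metres
```

The same helper could be applied to any venue, school, or station once its coordinates are known.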

I will also check other sources for crime rates, subway stations, etc. <br>
Here are some examples: <br>
Chicago crime: https://data.world/publicsafety/chicago-crime/file/chicago_crime_2014.csv or https://home.chicagopolice.org/statistics-data/public-arrest-data/ <br>
Manhattan subway stations: https://en.wikipedia.org/wiki/List_of_New_York_City_Subway_stations_in_Manhattan <br>
NYC open data for schools: https://data.cityofnewyork.us/Education/2017-2018-School-Locations/p6h4-mpyy

In [2]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

import requests # library to handle requests
import urllib.request
import time

#!conda install -c conda-forge beautifulsoup4 --y
from bs4 import BeautifulSoup

#!conda install -c conda-forge lxml --y
#from lxml import etree

#from urllib.request import urlopen

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library
from folium import plugins

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

import seaborn as sns

# import k-means from clustering stage
from sklearn.cluster import KMeans

print('Libraries imported.')


In [3]:
torino_df = pd.read_excel('https://github.com/fint113/Coursera_Capstone/raw/a94e02e995538aa9da626a05fb04afa26afe71a9/ex_neighborhood_boundary.xlsx')
#torino_df.dropna(inplace = True) 
torino_df.head()

Unnamed: 0,ID_QUART,DENOM
0,1,Centro
1,2,San Salvario
2,3,Crocetta
3,8,Vanchiglia
4,7,Aurora


In [4]:
address = 'Corso Racconigi 28, Torino TO, Italy'
geolocator = Nominatim(user_agent="turin_explorer")  # geopy expects a user_agent; this name is arbitrary
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the Italian home are {}, {}.'.format(latitude, longitude))

The geographical coordinates of the Italian home are 45.0717654, 7.6468397.


In [5]:
TU_neighborhood_latitude=latitude
TU_neighborhood_longitude=longitude

## Query Foursquare to find venues around the current residence in Turin

In [6]:
# The code was removed by Watson Studio for sharing.
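
The hidden cell presumably defined `CLIENT_ID`, `CLIENT_SECRET` and `VERSION`, which the next cell uses. The block below is a sketch of one way to supply them without embedding secrets in the notebook. The environment variable names, the helper `load_foursquare_credentials` and the default version string are all assumptions made for illustration, not the author's actual code.

```python
import os

def load_foursquare_credentials(env=None):
    """Return (client_id, client_secret, version) read from environment variables.

    The variable names are hypothetical; the original cell was removed before sharing.
    """
    env = os.environ if env is None else env
    try:
        client_id = env["FOURSQUARE_CLIENT_ID"]
        client_secret = env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None
    # The API version is a YYYYMMDD date; this default is an assumed placeholder.
    version = env.get("FOURSQUARE_API_VERSION", "20180605")
    return client_id, client_secret, version

# Example with placeholder values instead of real credentials.
CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials(
    {"FOURSQUARE_CLIENT_ID": "your-id", "FOURSQUARE_CLIENT_SECRET": "your-secret"}
)
```

In a real run the function would be called without arguments so that it reads `os.environ`.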

In [7]:
LIMIT = 250 # limit of number of venues returned by Foursquare API
radius = 1000 # define radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    TU_neighborhood_latitude, 
    TU_neighborhood_longitude, 
    radius, 
    LIMIT)
#url # display URL
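
As an alternative to formatting the query string by hand, the parameters can be collected in a dictionary and encoded with the standard library, which also escapes special characters. This is a sketch only: `build_explore_url` is a hypothetical helper, and the credential values in the example are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL with the same parameters as the cell above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",   # latitude,longitude of the search centre
        "radius": radius,       # search radius in metres
        "limit": limit,         # maximum number of venues requested
    }
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"

# Placeholder credentials; coordinates are the geocoded Turin home.
example_url = build_explore_url("ID", "SECRET", "20180605", 45.0717654, 7.6468397, 1000, 250)
print(example_url)
```

Note that `urlencode` percent-encodes the comma in `ll`, which servers decode back to the same value.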

In [8]:
# results display is hidden for report simplification 
results = requests.get(url).json()
#results

##### Function that extracts the category of the venue (borrowed from the Foursquare lab)

In [9]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # flattened JSON uses the dotted column name instead
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [10]:
venues = results['response']['groups'][0]['items']
TUnearby_venues = json_normalize(venues) # flatten JSON
# filter columns
filtered_columns = ['venue.location.neighborhood','venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
TUnearby_venues =TUnearby_venues.loc[:, filtered_columns]
# filter the category for each row
TUnearby_venues['venue.categories'] = TUnearby_venues.apply(get_category_type, axis=1)
# clean columns
TUnearby_venues.columns = [col.split(".")[-1] for col in TUnearby_venues.columns]

TUnearby_venues.shape

(52, 5)

In [11]:
# Venues near current Turin residence place
TUnearby_venues['neighborhood'] = 'Italian home'
TUnearby_venues.head(10)

Unnamed: 0,neighborhood,name,categories,lat,lng
0,Italian home,Osteria Antiche Sere,Piedmontese Restaurant,45.071046,7.643011
1,Italian home,Brasserie de La Mer,French Restaurant,45.071297,7.646836
2,Italian home,Vale un Perù,Peruvian Restaurant,45.07026,7.645671
3,Italian home,Bar Torrefazione Ferrucci,Coffee Shop,45.067947,7.655234
4,Italian home,Piola da Celso,Piedmontese Restaurant,45.066948,7.647337
5,Italian home,Parco della Tesoriera,Park,45.076597,7.638373
6,Italian home,Plin & Tajarin,Piedmontese Restaurant,45.073978,7.657748
7,Italian home,Hamburgeria,Burger Joint,45.065308,7.647515
8,Italian home,Wasabi,Japanese Restaurant,45.066104,7.655126
9,Italian home,Teatro Astra,Theater,45.07734,7.650184


In [12]:
#TUnearby_venues.groupby('categories').count()

In [13]:
print('There are {} unique categories.'.format(len(TUnearby_venues['categories'].unique())))

There are 34 unique categories.


In [14]:
# one hot encoding
TU_onehot = pd.get_dummies(TUnearby_venues[['categories']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
TU_onehot['neighborhood']=TUnearby_venues['neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [TU_onehot.columns[-1]] + list(TU_onehot.columns[:-1])
TU_onehot = TU_onehot[fixed_columns]


TU_onehot.head()

Unnamed: 0,neighborhood,Asian Restaurant,Bike Rental / Bike Share,Burger Joint,Bus Station,Café,Chinese Restaurant,Cocktail Bar,Coffee Shop,Deli / Bodega,French Restaurant,Greek Restaurant,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Karaoke Bar,Kebab Restaurant,Market,Metro Station,Movie Theater,Park,Peruvian Restaurant,Piadineria,Piedmontese Restaurant,Pizza Place,Plaza,Pub,Restaurant,Salon / Barbershop,Sandwich Place,Soccer Field,Sushi Restaurant,Theater
0,Italian home,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
1,Italian home,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Italian home,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
3,Italian home,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Italian home,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0


In [15]:
TU_grouped = TU_onehot.groupby('neighborhood').mean().reset_index()
TU_grouped

Unnamed: 0,neighborhood,Asian Restaurant,Bike Rental / Bike Share,Burger Joint,Bus Station,Café,Chinese Restaurant,Cocktail Bar,Coffee Shop,Deli / Bodega,French Restaurant,Greek Restaurant,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Karaoke Bar,Kebab Restaurant,Market,Metro Station,Movie Theater,Park,Peruvian Restaurant,Piadineria,Piedmontese Restaurant,Pizza Place,Plaza,Pub,Restaurant,Salon / Barbershop,Sandwich Place,Soccer Field,Sushi Restaurant,Theater
0,Italian home,0.019231,0.019231,0.038462,0.038462,0.019231,0.057692,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.057692,0.057692,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.057692,0.115385,0.076923,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231,0.019231
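
The one-hot encoding followed by `groupby('neighborhood').mean()` amounts to computing, for each category, the share of venues that fall into it. For example, 0.115385 for Pizza Place corresponds to 6 of the 52 venues. The sketch below shows the same idea with the standard library only; `category_frequencies` is a hypothetical helper and the input list is invented for illustration.

```python
from collections import Counter

def category_frequencies(categories):
    """Relative frequency of each venue category (count / total venues)."""
    counts = Counter(categories)
    total = len(categories)
    return {name: count / total for name, count in counts.items()}

# Toy input, invented for illustration (not the real Foursquare response).
sample_categories = ["Pizza Place", "Pizza Place", "Plaza", "Park"]
frequencies = category_frequencies(sample_categories)
print(frequencies)
```

The frequencies always sum to 1 within a neighborhood, which makes neighborhoods with different venue counts comparable.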


In [16]:
num_top_venues = 5

for hood in TU_grouped['neighborhood']:
    print("----"+hood+"----")
    temp = TU_grouped[TU_grouped['neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Italian home----
                    venue  freq
0             Pizza Place  0.12
1                   Plaza  0.08
2      Chinese Restaurant  0.06
3  Piedmontese Restaurant  0.06
4     Japanese Restaurant  0.06




In [17]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [20]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # beyond 'st', 'nd', 'rd' every ordinal up to 10 takes 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
TU_venues_sorted = pd.DataFrame(columns=columns)
TU_venues_sorted['Neighborhood'] = TU_grouped['neighborhood']

for ind in np.arange(TU_grouped.shape[0]):
    TU_venues_sorted.iloc[ind, 1:] = return_most_common_venues(TU_grouped.iloc[ind, :], num_top_venues)

TU_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Italian home,Pizza Place,Plaza,Piedmontese Restaurant,Chinese Restaurant,Japanese Restaurant,Italian Restaurant,Bus Station,Burger Joint,Café,Bike Rental / Bike Share


### Map of the Turin residence with nearby venues (for reference)

In [22]:
# create map of Turin place  using latitude and longitude values
map_tu = folium.Map(location=[TU_neighborhood_latitude, TU_neighborhood_longitude], zoom_start=15)
# add markers to map
for lat, lng, label in zip(TUnearby_venues['lat'], TUnearby_venues['lng'], TUnearby_venues['name']):
    label = folium.Popup(label, parse_html=True)
    folium.RegularPolygonMarker(
        [lat, lng],
        number_of_sides=30,
        radius=7,
        popup=label,
        color='blue',
        fill_color='#0f0f0f',
        fill_opacity=0.6,
    ).add_to(map_tu)  
    
map_tu

## MANHATTAN NEIGHBORHOODS - DATA AND MAPPING

The clustered neighborhood data was produced with Foursquare during the course lab work and saved to a csv file containing the 40 neighborhoods of the Manhattan borough. Here the csv file is simply read back, for convenience and to keep the report compact.

In [23]:
# Read csv file with clustered neighborhoods with geodata
manhattan_data  = pd.read_csv('https://raw.githubusercontent.com/fint113/Coursera_Capstone/master/mh_neighboorhoods_data.csv') 
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels
0,Manhattan,Marble Hill,40.876551,-73.91066,2
1,Manhattan,Chinatown,40.715618,-73.994279,2
2,Manhattan,Washington Heights,40.851903,-73.9369,4
3,Manhattan,Inwood,40.867684,-73.92121,3
4,Manhattan,Hamilton Heights,40.823604,-73.949688,0


In [24]:
manhattan_data.tail()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels
35,Manhattan,Turtle Bay,40.752042,-73.967708,3
36,Manhattan,Tudor City,40.746917,-73.971219,3
37,Manhattan,Stuyvesant Town,40.731,-73.974052,4
38,Manhattan,Flatiron,40.739673,-73.990947,3
39,Manhattan,Hudson Yards,40.756658,-74.000111,2


In [25]:
manhattan_merged = pd.read_csv('https://raw.githubusercontent.com/fint113/Coursera_Capstone/master/manhattan_merged.csv')
manhattan_merged.shape

(40, 15)

## Map of Manhattan neighborhoods with top 10 clustered venues

#### Popups identify each neighborhood and its venue cluster, so that the clusters can be examined in more detail in the following cells

In [26]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


In [27]:
MA_neighborhood_latitude=latitude
MA_neighborhood_longitude=longitude

kclusters=5
map_clusters = folium.Map(location=[MA_neighborhood_latitude, MA_neighborhood_longitude], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=20,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
# add smaller markers for the neighborhood centers to the map

for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_clusters)    
           
map_clusters

## Examine a particular cluster - print venues

#### Cluster 1

In [28]:
#manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

#### Cluster 2

In [29]:
#manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

#### Cluster 3

In [30]:
#manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

#### Cluster 4

In [31]:
#manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

#### Cluster 5

In [32]:
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Washington Heights,Café,Bakery,Mobile Phone Shop,Pizza Place,Sandwich Place,Park,Gym,Latin American Restaurant,Tapas Restaurant,Mexican Restaurant
7,East Harlem,Mexican Restaurant,Bakery,Latin American Restaurant,Deli / Bodega,Thai Restaurant,French Restaurant,Café,Taco Place,Street Art,Steakhouse
11,Roosevelt Island,Coffee Shop,Sandwich Place,Park,Japanese Restaurant,Kosher Restaurant,Greek Restaurant,Baseball Field,Gym,Outdoors & Recreation,Dog Run
13,Lincoln Square,Theater,Gym / Fitness Center,Concert Hall,Plaza,Italian Restaurant,French Restaurant,Café,Opera House,Indie Movie Theater,Park
15,Midtown,Hotel,Theater,Coffee Shop,Steakhouse,Food Truck,Cocktail Bar,Clothing Store,Spa,Bookstore,Sporting Goods Shop
19,East Village,Ice Cream Shop,Bar,Wine Bar,Mexican Restaurant,Cocktail Bar,Pizza Place,Coffee Shop,Chinese Restaurant,Speakeasy,Vegetarian / Vegan Restaurant
20,Lower East Side,Chinese Restaurant,Coffee Shop,Café,Bakery,Latin American Restaurant,Park,Cocktail Bar,Japanese Restaurant,Pizza Place,Ramen Restaurant
21,Tribeca,American Restaurant,Italian Restaurant,Park,Spa,Café,Boutique,Wine Bar,Coffee Shop,Greek Restaurant,Gym
22,Little Italy,Bakery,Café,Yoga Studio,Cocktail Bar,Sandwich Place,Salon / Barbershop,Pizza Place,Ice Cream Shop,Seafood Restaurant,Chinese Restaurant
25,Manhattan Valley,Coffee Shop,Bar,Pizza Place,Chinese Restaurant,Indian Restaurant,Italian Restaurant,Thai Restaurant,Deli / Bodega,Mexican Restaurant,Yoga Studio


#### After examining the data for each cluster, I concluded that cluster 5 most closely resembles the Italian home neighborhood, so it provides guidance on where to look for the future apartment
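
The conclusion above comes from reading the cluster tables by eye. One way to make such a comparison quantitative, which is not what this notebook does, is to compute the cosine similarity between venue-frequency vectors. In the sketch below the Turin values are copied from the `TU_grouped` output, while the Manhattan profiles, their names and the helper `cosine_similarity` are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse frequency vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# A few Turin frequencies copied from the TU_grouped output.
turin = {"Pizza Place": 0.115385, "Plaza": 0.076923,
         "Italian Restaurant": 0.057692, "Park": 0.019231}

# Hypothetical Manhattan profiles, invented for illustration only.
candidates = {
    "Neighborhood A": {"Pizza Place": 0.10, "Italian Restaurant": 0.08, "Park": 0.05},
    "Neighborhood B": {"Hotel": 0.20, "Theater": 0.10, "Steakhouse": 0.05},
}

# Rank candidates from most to least similar to the Turin profile.
ranking = sorted(candidates,
                 key=lambda name: cosine_similarity(turin, candidates[name]),
                 reverse=True)
print(ranking)
```

With real data, each Manhattan neighborhood would need a full frequency vector built the same way as the Turin one.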

## Map of Manhattan schools

In [33]:
MA_schools_df = pd.read_csv('https://data.cityofnewyork.us/api/views/p6h4-mpyy/rows.csv')
MA_schools_df = MA_schools_df[['LOCATION_NAME','Location 1','LOCATION_CATEGORY_DESCRIPTION']]
MA_schools_df.columns = ['school_name','location','school_type']
MA_schools_df.dropna(inplace = True)
MA_schools_df.head()

Unnamed: 0,school_name,location,school_type
0,P.S. 015 Roberto Clemente,"333 EAST 4 STREET\nMANHATTAN, NY 10009\n(40.72...",Elementary
1,P.S. 019 Asher Levy,"185 1 AVENUE\nMANHATTAN, NY 10003\n(40.730009,...",Elementary
2,P.S. 020 Anna Silver,"166 ESSEX STREET\nMANHATTAN, NY 10002\n(40.721...",Elementary
3,P.S. 034 Franklin D. Roosevelt,"730 EAST 12 STREET\nMANHATTAN, NY 10009\n(40.7...",K-8
4,The STAR Academy - P.S.63,"121 EAST 3 STREET\nMANHATTAN, NY 10009\n(40.72...",Elementary


In [34]:
split1 = MA_schools_df['location'].str.split(r'\n()', expand=True)
MA_schools_df.drop(columns='location', inplace=True)
split1.head()

Unnamed: 0,0,1,2,3,4
0,333 EAST 4 STREET,,"MANHATTAN, NY 10009",,"(40.722075, -73.978747)"
1,185 1 AVENUE,,"MANHATTAN, NY 10003",,"(40.730009, -73.984496)"
2,166 ESSEX STREET,,"MANHATTAN, NY 10002",,"(40.721305, -73.986312)"
3,730 EAST 12 STREET,,"MANHATTAN, NY 10009",,"(40.726008, -73.975058)"
4,121 EAST 3 STREET,,"MANHATTAN, NY 10009",,"(40.72444, -73.986214)"


In [35]:
MA_schools_df[['address1','address2']] = split1[[0,2]]
MA_schools_df[['latitude','longitude']] = split1[4].str.split('[(,)]',expand=True)[[1,2]].astype('float64')
print(MA_schools_df.shape)
MA_schools_df.head()

(1822, 6)


Unnamed: 0,school_name,school_type,address1,address2,latitude,longitude
0,P.S. 015 Roberto Clemente,Elementary,333 EAST 4 STREET,"MANHATTAN, NY 10009",40.722075,-73.978747
1,P.S. 019 Asher Levy,Elementary,185 1 AVENUE,"MANHATTAN, NY 10003",40.730009,-73.984496
2,P.S. 020 Anna Silver,Elementary,166 ESSEX STREET,"MANHATTAN, NY 10002",40.721305,-73.986312
3,P.S. 034 Franklin D. Roosevelt,K-8,730 EAST 12 STREET,"MANHATTAN, NY 10009",40.726008,-73.975058
4,The STAR Academy - P.S.63,Elementary,121 EAST 3 STREET,"MANHATTAN, NY 10009",40.72444,-73.986214
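
The two `str.split` calls above rely on column positions in the split result. A more explicit alternative is to parse one location string with a regular expression, as sketched below. `parse_location` and `COORD_PATTERN` are hypothetical names, and the sample string is reconstructed from the first row of the output above.

```python
import re

# Matches a trailing "(lat, lng)" pair such as "(40.722075, -73.978747)".
COORD_PATTERN = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)\s*$")

def parse_location(text):
    """Split a location string into (street, city_line, latitude, longitude).

    Returns None when no trailing coordinate pair is found.
    """
    match = COORD_PATTERN.search(text)
    if match is None:
        return None
    lines = text[:match.start()].strip().split("\n")
    street = lines[0].strip()
    city_line = lines[1].strip() if len(lines) > 1 else ""
    return street, city_line, float(match.group(1)), float(match.group(2))

sample = "333 EAST 4 STREET\nMANHATTAN, NY 10009\n(40.722075, -73.978747)"
parsed = parse_location(sample)
print(parsed)
```

Rows without coordinates return `None`, so they can be filtered out explicitly instead of surfacing later as conversion errors.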


In [40]:
MA_schools_df = MA_schools_df[MA_schools_df.latitude != 0].copy()
#MA_schools_df = MA_schools_df[MA_schools_df.address2 != 'MANHATTAN, NY']
# str.rstrip('[1-9]') treats its argument as a set of characters, not a regex;
# assuming the intent was to drop the trailing ZIP code, use a regex replace instead
MA_schools_df['address2'] = MA_schools_df['address2'].str.replace(r'\s*\d+$', '', regex=True)
#MA_schools_df.sort_values(by=['longitude'], ascending=False).head()
print(MA_schools_df.shape)
MA_schools_df.head()

(1821, 6)


Unnamed: 0,school_name,school_type,address1,address2,latitude,longitude
0,P.S. 015 Roberto Clemente,Elementary,333 EAST 4 STREET,"MANHATTAN, NY",40.722075,-73.978747
1,P.S. 019 Asher Levy,Elementary,185 1 AVENUE,"MANHATTAN, NY",40.730009,-73.984496
2,P.S. 020 Anna Silver,Elementary,166 ESSEX STREET,"MANHATTAN, NY",40.721305,-73.986312
3,P.S. 034 Franklin D. Roosevelt,K-8,730 EAST 12 STREET,"MANHATTAN, NY",40.726008,-73.975058
4,The STAR Academy - P.S.63,Elementary,121 EAST 3 STREET,"MANHATTAN, NY",40.72444,-73.986214


In [159]:
map_schools = folium.Map(location=[MA_neighborhood_latitude, MA_neighborhood_longitude], zoom_start=13)

for lat, lng, label in zip(MA_schools_df['latitude'], MA_schools_df['longitude'], MA_schools_df['school_name']+ ', '+MA_schools_df['school_type']):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(map_schools)
    
map_schools

In [157]:
# create map of Manhattan schools using latitude and longitude values
map_schools = folium.Map(location=[MA_neighborhood_latitude, MA_neighborhood_longitude], zoom_start=13)
# add markers to map

for lat, lng, label in zip(MA_schools_df['latitude'], MA_schools_df['longitude'], MA_schools_df['school_name']+ ', '+MA_schools_df['school_type']):
    label = folium.Popup(label, parse_html=True)
    folium.RegularPolygonMarker(
        [lat, lng],
        number_of_sides=30,
        radius=7,
        popup=label,
        color='blue',
        fill_color='#0f0f0f',
        fill_opacity=0.6,
    ).add_to(map_schools)  
    
map_schools



In [None]:
# Munich
# https://www.citypopulation.de/en/germany/munchen/admin/

# Rotterdam
# https://www.citypopulation.de/en/netherlands/randstadzuid/

# Berlin
# https://www.citypopulation.de/en/germany/berlin/admin/


This notebook is part of a course on **Coursera** called *Applied Data Science Capstone*