# Introduction/Business Problem:

Clearly define a problem or an idea of your choice, where you would need to leverage the Foursquare location data to solve or execute. Remember that data science problems always target an audience and are meant to help a group of stakeholders solve a problem, so make sure that you explicitly describe your audience and why they would care about your problem.

This submission will eventually become your Introduction/Business Problem section in your final report. So I recommend that you push the report (having your Introduction/Business Problem section only for now) to your Github repository and submit a link to it.

**Warsaw, the capital of Poland, covers 517.24 km². Although it was founded in the 9th century, its current urban form is the result of severe destruction during World War II (including the almost complete destruction of all infrastructure on the left bank of the Vistula river). As the capital of a booming economy, it is growing rapidly in all directions, including the suburbs, where new commercial and apartment buildings are developed on a daily basis. This raises a question for its 1.8 million residents (and for people migrating into Warsaw):**

**Do any of its 18 districts (boroughs) cluster, meaning that even if you like the south you might also consider living in the north?**

**The target audience is anyone already living in Warsaw who is considering relocating, as well as anyone considering moving to Warsaw.**

Describe the data that you will be using to solve the problem or execute your idea. Remember that you will need to use the Foursquare location data to solve the problem or execute your idea. You can absolutely use other datasets in combination with the Foursquare location data. So make sure that you provide adequate explanation and discussion, with examples, of the data that you will be using, even if it is only Foursquare location data.

This submission will eventually become your Data section in your final report. So I recommend that you push the report (having your Data section) to your Github repository and submit a link to it.

**I will be using Foursquare location data to determine the position and ranking of the restaurants available across Warsaw.
This data will be combined with separately collected data on the locations of kindergartens, preschools, high schools and post-secondary schools, along with the positions of Warsaw theaters, to allow for better district clustering (and to help the author of this notebook find his place to live).**

**Author's note: Naturally, much of the data used contains Polish elements. The author has translated as much of it into English as he could, but some challenges may remain.**
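
One such challenge shows up later in this notebook as garbled district names (for example `MokotÃ³w`). That pattern is typical of UTF-8 text that was decoded as Windows-1252. The sketch below is one possible way to reverse it, assuming that is indeed what happened to the file; `repair_mojibake` is a hypothetical helper, not something the notebook defines.

```python
def repair_mojibake(text):
    """Undo UTF-8 text that was mistakenly decoded as Windows-1252 (hypothetical helper)."""
    try:
        return text.encode('cp1252').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # already clean, or damaged in some other way

# Two garbled names, one plain ASCII name and one name that is already correct
for name in ['MokotÃ³w', 'Å»oliborz', 'Bemowo', 'Mokotów']:
    print(name, '->', repair_mojibake(name))
```

Not every character survives this round trip, so a hand-written lookup table like the one used later in the notebook remains a reasonable fallback.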

# Data:

Naturally, the project uses the Foursquare API to acquire as much venue data as possible. Rather than querying around each neighbourhood's (called 'Districts' in this notebook) central position, the notebook queries around the locations of venues acquired from other data sources.
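
As a rough illustration of querying around an already-known venue rather than a district centre, the sketch below builds a request URL for the legacy Foursquare v2 `venues/explore` endpoint. The helper name `build_explore_url`, the placeholder credentials, the version date and the radius/limit defaults are assumptions for illustration, not values taken from this notebook.

```python
from urllib.parse import urlencode

# Legacy Foursquare v2 endpoint (assumed here; check the current Foursquare docs before relying on it)
EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, lat, lng,
                      radius=500, limit=100, version='20180605'):
    """Build an explore URL centred on a known venue (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,            # API version date, an assumed value
        'll': f'{lat},{lng}',    # centre of the search: latitude,longitude
        'radius': radius,        # metres around the anchor venue
        'limit': limit,          # maximum number of venues returned
    }
    # safe=',' keeps the comma in "ll" readable instead of encoding it as %2C
    query = urlencode(params, safe=',')
    return f'{EXPLORE_ENDPOINT}?{query}'

# Anchor the search on a school location from the dataset rather than on a district centre
url = build_explore_url('YOUR_ID', 'YOUR_SECRET', 52.217744, 21.039742)
print(url)
```

Anchoring on schools and theaters means the sampled venues follow where infrastructure actually is, at the cost of more requests than one query per district.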

Those data sources consist of Warsaw's infrastructure data, specifically schools, educational facilities, kindergartens etc., along with places of culture (in this instance, theaters). If a dataset does not come with geographical positions, separate code is used to obtain them.
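
The idea of filling in missing positions can be sketched as a small fallback chain: try a geocoder first, then any coordinates the dataset already carries, and only then give up. This is a simplified illustration rather than the exact logic of the cells below; `resolve_position` and the stand-in geocoder are hypothetical.

```python
def resolve_position(address, geocode, baseline=None):
    """Return (lat, lon, source) for an address (hypothetical helper).

    geocode  -- callable returning (lat, lon) or None, e.g. a wrapper around a geocoding service
    baseline -- optional (lat, lon) already present in the dataset
    """
    try:
        found = geocode(address)
    except Exception:
        found = None  # treat a failed lookup like "not found"
    if found is not None:
        return found[0], found[1], 'geocoded'
    if baseline is not None:
        return baseline[0], baseline[1], 'baseline'
    return None, None, 'missing'

# Stand-in geocoder so the sketch runs offline; a real one would query a service
known = {'Kawalerii 5 00-468 WARSZAWA': (52.217744, 21.039742)}
fake_geocode = known.get

print(resolve_position('Kawalerii 5 00-468 WARSZAWA', fake_geocode))
print(resolve_position('Unknown 1 00-000 WARSZAWA', fake_geocode, baseline=(52.22057, 21.035181)))
print(resolve_position('Unknown 1 00-000 WARSZAWA', fake_geocode))
```

Recording the source of each position makes it easy to see later how many rows relied on the fallback.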

The end result is a detailed, comprehensive summary of important Warsaw information, clearly presented using Folium maps.

Data API currently developed by Warsaw City:
- https://api.um.warszawa.pl/#

Data API currently developed by Polish Government:
- https://www.dane.gov.pl/institution/65,miasto-stoleczne-warszawa?page=1&per_page=5&q=&sort=-verified

Dataset provided by the City of Warsaw:
- https://edukacja.warszawa.pl/placowki/przedszkola

GeoJson representing position of Warsaw districts:
- https://github.com/andilabs/warszawa-dzielnice-geojson

# Importing main libraries and dependencies:

In [1]:
import os # For operating system operations
import math
import numpy as np # library to handle data in a vectorized manner
import time

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files from APIs

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

# Import BeautifulSoup  for web scraping
from bs4 import BeautifulSoup

# For rate-limited geocoordinates acquisition (Nominatim is already imported above):
from geopy.extra.rate_limiter import RateLimiter

# Importing shapely to check if specified location (lat, lon) is within convex polygon:
from shapely.geometry import MultiPoint, Point, Polygon

print('Libraries imported.')

Libraries imported.


In [2]:
# Processing credentials:
with open('../MainProjectDatafiles/Credentials.txt') as credentialsFile:
    credentialsData = credentialsFile.readlines()
credentialsData = [eachItem.replace('\n', '').split(',') for eachItem in credentialsData]
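
The layout of `Credentials.txt` is not shown in this notebook. Assuming each line holds a `name,value` pair, the list produced above could be turned into a dictionary roughly as follows; the helper `parse_credentials` and the key names `CLIENT_ID` and `CLIENT_SECRET` are hypothetical.

```python
import io

def parse_credentials(lines):
    """Parse 'name,value' lines into a dict, skipping blank lines (hypothetical helper)."""
    credentials = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, _, value = line.partition(',')  # split on the first comma only
        credentials[name.strip()] = value.strip()
    return credentials

# In-memory stand-in for Credentials.txt; the key names are assumptions
sample_file = io.StringIO('CLIENT_ID,abc123\nCLIENT_SECRET,s3cr3t\n')
credentials = parse_credentials(sample_file)
print(sorted(credentials))
```

Looking values up by name avoids depending on the order of lines in the file.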

In [3]:
# Declaring a file with Warsaw's .geojson:
district_geo = u'../MainProjectDatafiles/warszawa-dzielnice.geojson'
# Source and reference: https://github.com/andilabs/warszawa-dzielnice-geojson

In [4]:
districtsOfWarsaw = ['Bemowo', 'Białołęka', 'Bielany', 'Mokotów', 'Ochota', 'Praga Południe',
                     'Praga Północ', 'Rembertów', 'Targówek', 'Ursus', 'Ursynów', 'Wawer',
                     'Wesoła', 'Wilanów', 'Wola', 'Włochy', 'poza Warszawą', 'Śródmieście',
                     'Żoliborz']

# Handling for known encoding issues:
districtsOfWarsawDist = {'Bemowo': 'Bemowo', 'BiaÅ‚oÅ‚Ä™ka': 'Białołęka', 'Bielany': 'Bielany', 
                         'MokotÃ³w': 'Mokotów', 'Ochota': 'Ochota', 'Praga PoÅ‚udnie': 'Praga Południe',
                         'Praga PÃ³Å‚noc': 'Praga Północ', 'RembertÃ³w':'Rembertów', 'TargÃ³wek': 'Targówek',
                         'Ursus': 'Ursus', 'UrsynÃ³w': 'Ursynów', 'Wawer': 'Wawer', 'WesoÅ‚a': 'Wesoła', 
                         'WilanÃ³w': 'Wilanów', 'Wola': 'Wola', 'WÅ‚ochy': 'Włochy', 
                         'poza Warszawa': 'poza Warszawą', 'ÅšrÃ³dmieÅ›cie': 'Śródmieście', 'Å»oliborz': 'Żoliborz'}

# Extracting json string based data from JSON:
with open(district_geo, 'r') as f:
    data = json.load(f)

# Extracting GeoJSON data:
districtsGeoPolygon = []
nameOfDF = data['type']
featuresGeometryType = data['features'][0]['type']
for eachFeature in data['features']:
    districtName = eachFeature['properties']['name']
    districtID = eachFeature['properties']['cartodb_id']
    # First polygon, outer ring: a list of [longitude, latitude] vertices
    districtPolygonVertices = eachFeature['geometry']['coordinates'][0][0]
    # Calculating average district position (mean of the boundary vertices):
    xCoord = sum(vertex[0] for vertex in districtPolygonVertices)/len(districtPolygonVertices)
    yCoord = sum(vertex[1] for vertex in districtPolygonVertices)/len(districtPolygonVertices)
    # The convex hull only approximates the district outline, so hulls of neighbouring districts may overlap:
    polygon = MultiPoint(districtPolygonVertices).convex_hull #polygon = Polygon(districtPolygonVertices)
    if districtName!='Warszawa': # A city name itself is excluded as it is too broad 
        districtsGeoPolygon.append([districtsOfWarsawDist[districtName], polygon, [xCoord, yCoord]])
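
The "average district position" above is the mean of the boundary vertices, which drifts toward stretches of the outline that happen to be sampled more densely. As a point of comparison, and not as the method this notebook uses, the sketch below computes an area-weighted centroid with the shoelace formula on a toy outline. Both function names are hypothetical, and treating longitude/latitude as planar coordinates is an approximation.

```python
def vertex_mean(ring):
    """Mean of the boundary vertices, the quantity computed in the cell above."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return sum(xs) / len(ring), sum(ys) / len(ring)

def area_centroid(ring):
    """Area-weighted centroid of a simple, non-degenerate polygon (shoelace formula)."""
    twice_area = cx = cy = 0.0
    # Pair every vertex with its successor, wrapping around to close the ring
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * twice_area), cy / (3 * twice_area)

# Unit square whose bottom edge carries three extra vertices
square = [(0, 0), (0.25, 0), (0.5, 0), (0.75, 0), (1, 0), (1, 1), (0, 1)]
print(vertex_mean(square))    # pulled toward the densely sampled bottom edge
print(area_centroid(square))  # stays at the centre of the square
```

For map labels either choice is usually fine; the difference matters more when the centre is used as the origin of a radius-based venue search.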

# Part 1a - Webscraping and data processing:

In [5]:
os.listdir()

['.ipynb_checkpoints',
 'BIP_Education.pickle',
 'Coursera_Capstone_MainProject6.ipynb',
 'WarsawDistrictClusters.html',
 'WarsawVenues.html']

Helper functions:

### Processing main school datafile:

In [6]:
# Processing school data from the offline excel file:
if 'schools1.pickle' not in os.listdir('../MainProjectDatafiles/'):
    schools1 = pd.ExcelFile('../MainProjectDatafiles/Szkoly.xlsx').parse('30.09.2018 r.')
    schools1.rename(columns={'Typ': 'Type', 'Typ nadrzędny':'Type_Main',
                       'Nazwa':'Name', 'Patron':'Patron',
                       'Samodzielna jednostka':'Independent Unit',
                       'Nazwa organizacji':'OrganisationName',
                       'Filia':'Agency', 
                       'Związanie organizacyjne':'OrganisationalConnections',
                       'Niepełnosprawność dominująca':'MainHandicap',
                       'Delegatury':'Representatives',
                       'Powiat':'Borough[Powiat]',
                       'Gmina':'Borough[Gmina]',
                       'Szkoła/Placówka jest':'TypeOfSchool',
                       'Kategoria uczniów':'SchoolAttendeesCathegory',
                       'Specyfika szkoły':'SchoolType',
                       'Organ Prowadzący - Typ':'ManagingBody',
                       'Rodzaj gminy TERYT':'BoroughType',
                       'Rodzaj miejscowości':'CityType',
                       'Miejscowość':'CityName',
                       'Ulica':'StreetName',
                       'Nr domu':'BuildingNumber',
                       'Kod pocztowy':'Zip_Code',
                       'Telefon z nr kier ':'Telephone',
                       'E-mail':'eMail',
                       'Strona WWW':'Website'}, inplace=True) # Renaming columns
    schools1 = schools1[schools1['CityName']=='WARSZAWA']
    
    geolocator = Nominatim(user_agent="my-application")
    latitudes, longitudes, addresses = [], [], []
    for (eachStreet, eachBuildingNumber, 
         eachCityName, eachZip_Code) in zip(schools1['StreetName'], schools1['BuildingNumber'], 
                                            schools1['CityName'], schools1['Zip_Code']):
        try:
            location = geolocator.geocode(f'{eachStreet} {eachBuildingNumber} {eachZip_Code} {eachCityName}')
            time.sleep(1)
            if location is not None:
                latitudes.append(location.latitude)
                longitudes.append(location.longitude)
                addresses.append(location.address)
                print(location.latitude, location.longitude, location.address)
            else:
                latitudes.append(None)
                longitudes.append(None)
                addresses.append(None)
                print('None, None, None')
        except Exception as e:
            latitudes.append(None)
            longitudes.append(None)
            addresses.append(None)
            print(e)
    
    schools1['Address'] = addresses
    schools1['Latitude'] = latitudes
    schools1['Longitudes'] = longitudes
    schools1.to_pickle('../MainProjectDatafiles/schools1.pickle')  # where to save it, usually as a .pkl
    
else:
    schools1 = pd.read_pickle('../MainProjectDatafiles/schools1.pickle')
    if (len(schools1[schools1['Latitude'].isnull()]))>0:
        geolocator = Nominatim(user_agent="my-application")
        latitudes, longitudes, addresses = [], [], []
        for (eachStreet, eachBuildingNumber, 
             eachCityName, eachZip_Code, 
             eachLat, eachLong, eachAddress) in zip(schools1['StreetName'], schools1['BuildingNumber'],
                                       schools1['CityName'], schools1['Zip_Code'],
                                       schools1['Latitude'], schools1['Longitudes'], schools1['Address']):
            if math.isnan(eachLong): 
                try:
                    location = geolocator.geocode(f'{eachStreet} {eachBuildingNumber} {eachZip_Code} {eachCityName}')
                    time.sleep(1)
                    if location is not None:
                        latitudes.append(location.latitude)
                        longitudes.append(location.longitude)
                        addresses.append(location.address)
                    else:
                        latitudes.append(None)
                        longitudes.append(None)
                        addresses.append(None)
                        print('None, None, None')
                except Exception as e:
                    latitudes.append(None)
                    longitudes.append(None)
                    addresses.append(None)
                    print(e)
            else:
                latitudes.append(eachLat)
                longitudes.append(eachLong)
                addresses.append(eachAddress)

        schools1['Address'] = addresses
        schools1['Latitude'] = latitudes
        schools1['Longitudes'] = longitudes
        schools1.dropna(subset=["Latitude"], axis=0, inplace=True) # simply drop whole row with NaN in "Latitude" column
        schools1.to_pickle('../MainProjectDatafiles/schools1.pickle')  # save as a pickle file

if 'District' not in schools1.columns:
    # Reference #1: https://stackoverflow.com/questions/48263802/finding-location-using-geojson-file-using-python
    # Reference #2: https://streamhacker.com/2010/03/23/python-point-in-polygon-shapely/
    # Reference #3: https://gis.stackexchange.com/questions/173835/point-in-polygon-geojson-using-shapely-python-returning-incorrect-results

    districts = [] # District
    for (latitude, longitude) in zip(schools1['Latitude'], schools1['Longitudes']):
        point = Point(longitude, latitude) # shapely points are (x, y), i.e. (longitude, latitude)
        for eachSet in districtsGeoPolygon:
            # Test the point against this district's own hull, not the leftover global `polygon`:
            if point.within(eachSet[1]):
                districts.append(eachSet[0])
                break # Hulls of neighbouring districts may overlap, so the first match wins
        else:
            districts.append(None) # No district contains the point; keeps the list aligned with the rows

    schools1['District'] = districts 
    schools1.to_pickle('../MainProjectDatafiles/schools1.pickle')
    schools1 = pd.read_pickle('../MainProjectDatafiles/schools1.pickle')
    
# Tests to see what can be done with single and double quotes:
#schools1.replace("'", "_", inplace = True)
#schools1 = schools1.applymap(lambda x: x.replace('"', ''))
#schools1.replace({'\"': ''}, regex=True)
schools1.head()

Unnamed: 0,Type,Type_Main,Name,Patron,Independent Unit,OrganisationName,Agency,OrganisationalConnections,MainHandicap,Representatives,Borough[Powiat],Borough[Gmina],TypeOfSchool,SchoolAttendeesCathegory,SchoolType,ManagingBody,BoroughType,CityType,CityName,StreetName,BuildingNumber,Zip_Code,Telephone,eMail,Website,Address,Latitude,Longitudes,District
0,Bednarska Szkoła Realna,Szkoła ponadgimnazjalna/ponadpodstawowa,Bednarska Szkoła Realna,,tak,Bednarska Szkoła Realna,nie,brak związania,,W,Powiat m. st. Warszawa,M. st. Warszawa,niepubliczna o uprawnieniach szkoły publicznej,Dzieci lub młodzież,brak specyfiki,Stowarzyszenia,gmina miejska,miasto powyżej 5 tys.mieszkańców,WARSZAWA,Kawalerii,5,00-468,,sekretariat@bsr.edu.pl,,"5, Kawalerii, XII, Śródmieście, Warszawa, woje...",52.217744,21.039742,Śródmieście
6,Biblioteki pedagogiczne,Inna placówka systemu oświaty lub placówka spo...,Pedagogiczna Biblioteka Wojewódzka w Warszawie,Komisja Edukacji Narodowej,tak,Pedagogiczna Biblioteka Wojewódzka w Warszawie,nie,,,W,Powiat m. st. Warszawa,M. st. Warszawa,publiczna,Bez kategorii,,Samorząd województwa,gmina miejska,miasto powyżej 5 tys.mieszkańców,WARSZAWA,Gocławska,4,03-810,228104664.0,sekretariat@pbw.waw.pl,www.pbw.waw.pl,Pedagogiczna Biblioteka Wojewódzka imienia Kom...,52.247195,21.063722,Praga Południe
23,Branżowa szkoła I stopnia,Szkoła ponadgimnazjalna/ponadpodstawowa,Branżowa Szkoła Samochodowa I stopnia nr 2,,nie,Zespół Szkół Samochodowych i Licealnych Nr 1,nie,brak związania,,W,Powiat m. st. Warszawa,M. st. Warszawa,publiczna,Dzieci lub młodzież,brak specyfiki,Miasto na prawach powiatu,gmina miejska,miasto powyżej 5 tys.mieszkańców,WARSZAWA,Szczęśliwicka,56,02-353,228240545.0,zssamoch@zssamoch.internetdsl.pl,,"Technikum nr 7, 56, Szczęśliwicka, Szczęśliwic...",52.213719,20.967551,Ochota
24,Branżowa szkoła I stopnia,Szkoła ponadgimnazjalna/ponadpodstawowa,Branżowa Szkoła I stopnia nr 39,,nie,Zespół Szkół nr 32,nie,brak związania,,W,Powiat m. st. Warszawa,M. st. Warszawa,publiczna,Dzieci lub młodzież,brak specyfiki,Miasto na prawach powiatu,gmina miejska,miasto powyżej 5 tys.mieszkańców,WARSZAWA,Ożarowska,71,01-408,228364062.0,zs32@edu.um.warszawa.pl,www.zs32.edu.pl,"71, Ożarowska, Osiedle Towarzystwa Osiedli Rob...",52.24596,20.949187,Wola
33,Branżowa szkoła I stopnia,Szkoła ponadgimnazjalna/ponadpodstawowa,Branżowa szkoła I stopnia nr 37 im. Jana Karsk...,Jan Karski,nie,Zespół Szkól Nr 42,nie,brak związania,,W,Powiat m. st. Warszawa,M. st. Warszawa,publiczna,Dzieci lub młodzież,brak specyfiki,Miasto na prawach powiatu,gmina miejska,miasto powyżej 5 tys.mieszkańców,WARSZAWA,Dzieci Warszawy,42,02-495,226626281.0,zs42@karski.edu.pl,www.zsnr42.edu.pl,"Szkoła Podstawowa nr 360, 42, Dzieci Warszawy,...",52.19707,20.899764,Ursus
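
The `District` column above comes from testing each school's coordinates against district shapes. The notebook does this with shapely convex hulls; the pure-Python sketch below shows the same idea with a ray-casting test against the exact outline, on toy shapes. The function names and the toy districts are hypothetical.

```python
def point_in_ring(x, y, ring):
    """Ray casting: count how many polygon edges a horizontal ray from (x, y) crosses."""
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the ray's height
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def assign_district(lon, lat, districts):
    """Return the first district whose outline contains the point, or None."""
    for name, ring in districts:
        if point_in_ring(lon, lat, ring):
            return name
    return None

# Toy outlines in (longitude, latitude): an L-shaped "West" next to a square "East"
toy_districts = [
    ('West', [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]),
    ('East', [(2, 0), (3, 0), (3, 1), (2, 1)]),
]
print(assign_district(0.5, 1.5, toy_districts))  # inside the L
print(assign_district(1.4, 1.4, toy_districts))  # in the notch of the L
print(assign_district(2.5, 0.5, toy_districts))
print(assign_district(5.0, 5.0, toy_districts))
```

The point in the notch lies inside the convex hull of "West" but outside its real outline, which is the kind of case where a hull-based test and an exact test can disagree.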


In [7]:
# Saving schools1 data in the format corresponding to FourSquare API data:
if 'warsaw_schools.pickle' not in os.listdir('../MainProjectDatafiles/'):
    # .copy() avoids pandas' SettingWithCopyWarning when columns are renamed and added below
    warsaw_schools = schools1[['District', 'Name', 'Latitude', 'Longitudes']].copy()
    warsaw_schools.rename(columns={'Latitude': 'Neighborhood Latitude', 'Longitudes': 'Neighborhood Longitude',
                                   'Name': 'Venue'}, inplace=True)
    warsaw_schools['Venue Latitude'] = schools1['Latitude']
    warsaw_schools['Venue Longitude'] = schools1['Longitudes']
    warsaw_schools['Venue Category'] = schools1['Type_Main']
    warsaw_schools.reset_index(drop=True, inplace=True)
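    # Look up each district's average position ([longitude, latitude]) once instead of scanning the list per row;
    # rows whose district is unknown get None so the columns stay aligned with the DataFrame:
    districtCentres = {eachDist[0]: eachDist[2] for eachDist in districtsGeoPolygon}
    N_lat = [districtCentres.get(eachItem, [None, None])[1] for eachItem in warsaw_schools['District']]
    N_long = [districtCentres.get(eachItem, [None, None])[0] for eachItem in warsaw_schools['District']]
    warsaw_schools['Neighborhood Latitude'] = N_lat
    warsaw_schools['Neighborhood Longitude'] = N_long       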
    warsaw_schools.to_pickle('../MainProjectDatafiles/warsaw_schools.pickle')  # save as a pickle file
    warsaw_schools = pd.read_pickle('../MainProjectDatafiles/warsaw_schools.pickle')
else:
    warsaw_schools = pd.read_pickle('../MainProjectDatafiles/warsaw_schools.pickle')
warsaw_schools.head()

Unnamed: 0,District,Venue,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
0,Śródmieście,Bednarska Szkoła Realna,52.236558,21.01659,52.217744,21.039742,Szkoła ponadgimnazjalna/ponadpodstawowa
1,Praga Południe,Pedagogiczna Biblioteka Wojewódzka w Warszawie,52.244749,21.074174,52.247195,21.063722,Inna placówka systemu oświaty lub placówka spo...
2,Ochota,Branżowa Szkoła Samochodowa I stopnia nr 2,52.21405,20.96109,52.213719,20.967551,Szkoła ponadgimnazjalna/ponadpodstawowa
3,Wola,Branżowa Szkoła I stopnia nr 39,52.231552,20.947744,52.24596,20.949187,Szkoła ponadgimnazjalna/ponadpodstawowa
4,Ursus,Branżowa szkoła I stopnia nr 37 im. Jana Karsk...,52.195898,20.876551,52.19707,20.899764,Szkoła ponadgimnazjalna/ponadpodstawowa


In [8]:
# Display main info:
schools1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1580 entries, 0 to 6161
Data columns (total 29 columns):
Type                         1580 non-null object
Type_Main                    1580 non-null object
Name                         1580 non-null object
Patron                       461 non-null object
Independent Unit             1580 non-null object
OrganisationName             1580 non-null object
Agency                       1580 non-null object
OrganisationalConnections    1421 non-null object
MainHandicap                 45 non-null object
Representatives              1580 non-null object
Borough[Powiat]              1580 non-null object
Borough[Gmina]               1580 non-null object
TypeOfSchool                 1580 non-null object
SchoolAttendeesCathegory     1580 non-null object
SchoolType                   1421 non-null object
ManagingBody                 1580 non-null object
BoroughType                  1580 non-null object
CityType                     1580 non-null obj

### Making sure that NaNs are removed:

In [9]:
# Display nan rows - not containing longitude, latitude or an address:
nan_rows = schools1[schools1['Latitude'].isnull()]
print(len(nan_rows), len(schools1))
nan_rows.head()

0 1580


Unnamed: 0,Type,Type_Main,Name,Patron,Independent Unit,OrganisationName,Agency,OrganisationalConnections,MainHandicap,Representatives,Borough[Powiat],Borough[Gmina],TypeOfSchool,SchoolAttendeesCathegory,SchoolType,ManagingBody,BoroughType,CityType,CityName,StreetName,BuildingNumber,Zip_Code,Telephone,eMail,Website,Address,Latitude,Longitudes,District


In [10]:
print(len(nan_rows), len(schools1))

0 1580


### Alternative source of school data:

In [11]:
'''
URL = 'https://api.um.warszawa.pl/api/action/datastore_search?resource_id=1cae4865-bb17-4944-a222-0d0cdc377951&limit=5' 
response = requests.post(URL)
jData = json.loads(response.content)
for eachItem in jData['result'].keys():
    print(eachItem, len(eachItem))
jData
'''
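
The commented-out request above targets what looks like a CKAN-style `datastore_search` action. Assuming the usual CKAN response layout (`result` containing `records`), which this notebook does not confirm for the Warsaw API, building the query and reading the records might look like the sketch below. The function names and the sample payload are made up for illustration.

```python
import json
from urllib.parse import urlencode

BASE_URL = 'https://api.um.warszawa.pl/api/action/datastore_search'

def build_search_url(resource_id, limit=5):
    """Compose the query string instead of hand-writing it into the URL."""
    query = urlencode({'resource_id': resource_id, 'limit': limit})
    return f'{BASE_URL}?{query}'

def extract_records(payload_text):
    """Return the list of records from a response body, or [] if the layout differs."""
    payload = json.loads(payload_text)
    result = payload.get('result', {})
    return result.get('records', []) if isinstance(result, dict) else []

url = build_search_url('1cae4865-bb17-4944-a222-0d0cdc377951')
print(url)

# Made-up response body standing in for a real API call
sample = '{"success": true, "result": {"records": [{"nazwa": "Przedszkole nr 1"}]}}'
print(extract_records(sample))
```

Keeping URL construction and response parsing in separate functions lets the parsing be checked offline before any request is sent.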

"\nURL = 'https://api.um.warszawa.pl/api/action/datastore_search?resource_id=1cae4865-bb17-4944-a222-0d0cdc377951&limit=5' \nresponse = requests.post(URL)\njData = json.loads(response.content)\nfor eachItem in jData['result'].keys():\n    print(eachItem, len(eachItem))\njData\n"

### Processing schools from official system:

In [12]:
# https://edukacja.warszawa.pl/placowki/przedszkola
if 'BIP_Education_df.pickle' not in os.listdir('../MainProjectDatafiles/'):
    BIP_Education_df = pd.read_csv('../MainProjectDatafiles/schools2.csv')
    BIP_Education_df.rename(columns={'Skrót nazwy placówki': 'LocNameShort', 'Skrót nazwy jednostki': 'UnitNameShort',
                                     'Regon placówki': 'LocName', 'Regon szkoły': 'InstitutioName',
                                     'RSPO placówki': 'LocName_RSPO', 'Rspo': 'RSPO',
                                     'Nazwa placówki': 'FullNameLoc', 'Nazwa szkoły/ jednostki': 'FullNameInstitu',
                                     'Typ placówki': 'LocationType', 'Typ jednostki': 'UnitType',
                                     'Kategoria uczniów': 'AttendeeCathegory', 'Jednostka specjalna': 'OverseeingUnit',
                                     'Bezpośredni nadzór': 'DirectOversee', 'Miejscowość': 'CityName',
                                     'Dzielnica': 'District', 'Miejski System Informacji': 'CityInformationSystem',
                                     'Ulica nazwa': 'StreetName', 'Ulica nr': 'BuildingNumber',
                                     'Nr Kodu': 'Zip_Code', 'Imię dyrektora': 'PrincipalFirstName', 
                                     'Nazwisko dyrektora': 'PrincipalLastName',
                                     'Telefon': 'Telephone', 'Faks': 'Fax', 'WWW': 'Website', 'Strona BIP': 'BIP_site',
                                     'E-mail': 'eMail', 'Numer jednostki': 'unitNumber',
                                     'Placówka prowadzi internat/bursę': 'MaintainsInternatOrBursa',
                                     'jednostka/zakład budż': 'BudgetUnit', 'Liczba uczniów': 'NumberOfStudents',
                                     'Liczba dzieci w oddziałach przedszkolnych': 'NumberOfKindekartenChildren',
                                     'SUMA - Liczba uczniów i dzieci  ': 'SumOfKinderChildrenAndStudents',
                                     'Liczba oddziałów - uczniowie': 'UnitsSumStudents',
                                     'Liczba oddziałów przedszkolnych': 'UnitsSumKindegartenChildren',
                                     'SUMA - Liczba oddziałów  ': 'SumNumberOfUnits',
                                     'Oddziały dwujęzyczne': 'BilingualUnits',
                                     'Oddziały integracyjne': 'IntegrationUnits',
                                     'Oddziały sportowe': 'SportUnits',
                                     'Oddziały mistrzostwa sportowego': 'MasterOfSportsUnits',
                                     'Oddziały międzynarodowe': 'InternationalUnits',
                                     'Oddziały specjalne': 'SpecialUnits',
                                     'Oddziały specjalne przysposabiające do pracy': 'ProWorkUnits',
                                     'Oddziały terapeutyczne': 'TherapeuticalUnits',
                                     'Oddziały eksperymentalne': 'ExperimentalUnits',
                                     'uwagi': 'Comments',
                                     'Czy w zespole?': 'IsItCombineWIthOtherSchools',
                                     'Typ jednostki (generalizacja)': 'TypeOfUnit[Generalization]',
                                     'Uwzględniać na liście jednostek samodzielnych i złożonych?': 'IsOnIndependentAndComplexUnitsList',
                                     'Szkoły i przedszkola': 'IsSchoolsandPreschools',
                                     'samodzielna lub zespół - Czy szkoła lub przedszkole w składzie?': 'IsSchoolOrKindergartenInComplex',
                                     'samodzielna lub zespół - Czy szkoła w składzie?': 'IsSchoolInComplex',
                                     'samodzielna lub zespół - Czy szkoła ponadpodstawowa w składzie?': 'IsMIddleSchoolInComplex',
                                     'samodzielna lub zespół - Czy przedszkole lub zespół z przedszkolem w składzie?': 'IsKindergartenInComplex',
                                     'samodzielna lub zespół - Czy szkoła podstawowa (w tym muzyczna) w składzie?': 'IsPrimarySchoolInComplex',
                                     'samodzielna lub zespół - Czy liceum ogólnokształcące w składzie?': 'IsHighSchoolInComplex',
                                     'samodzielna lub zespół - Czy technikum w składzie?': 'IsTechnicalSchoolInComplex',
                                     'samodzielna lub zespół - Czy branżowa szkoła I stopnia w składzie?': 'IsTradeSchoolInComplex',
                                     'samodzielna lub zespół - Czy szkoła policealna w składzie?': 'IsPostHighSchoolInComplex',
                                     'samodzielna lub zespół - Czy jednostka specjalna w składzie?': 'IsSpecialSchoolInComplex',
                                     'Edukacyjna Wartość Dodana': 'EducationalValueAdded',
                                     'RSPO - wyszukiwarka': 'RSPO_Search',
                                     'obwód szkoły podstawowej': 'SchoolVicinity',
                                     'Teren działania poradni': 'AreaOfSupportUnitCoverage',
                                     'Niepełnosprawność dominująca': 'DominatingHendicapp',
                                     'Jednostka "przyszpitalna"': 'HospitalUnit',
                                     'długość geograficzna': 'Longitude_Base',
                                     'szerokość geograficzna': 'Latitude_Base',
                                     'Wars i Sawa': 'Has_[Wars_i_Sawa]_certificate',
                                     'na obszarze Zintegrowanego Programu Rewitalizacji m.st. Warszaw do 2022 (Praga Północ, Praga Południe, Targówek)': 'IsOnWarsaw2022RevitalizationArea',
                                     'Identyfikacja w systemie sprawozdawczości (ankietyBE)': 'BEpoolIdentification'}, 
                            inplace=True) # Renaming columns


    geolocator = Nominatim(user_agent="my-application")
    latitudes, longitudes, addresses = [], [], []
    for (eachStreet, eachBuildingNumber, 
         eachCityName, eachZip_Code,
         Latitude_Base, Longitude_Base) in zip(BIP_Education_df['StreetName'], BIP_Education_df['BuildingNumber'],
                                               BIP_Education_df['CityName'], BIP_Education_df['Zip_Code'],
                                               BIP_Education_df['Latitude_Base'], BIP_Education_df['Longitude_Base']
                                              ):
            try:
                location = geolocator.geocode(f'{eachStreet} {eachBuildingNumber} {eachZip_Code} {eachCityName}')
                time.sleep(1)
                if location is not None:
                    latitudes.append(location.latitude)
                    longitudes.append(location.longitude)
                    addresses.append(location.address)
                    print(location.latitude, location.longitude, location.address)
                else:
                    latitudes.append(Latitude_Base)
                    longitudes.append(Longitude_Base)
                    addresses.append(f'{eachStreet} {eachBuildingNumber} {eachZip_Code} {eachCityName}')
                    print(f'Baseline address: {Latitude_Base}, {Longitude_Base}, {eachStreet} {eachBuildingNumber} {eachZip_Code} {eachCityName}')
            except Exception as e:
                latitudes.append(None)
                longitudes.append(None)
                addresses.append(None)
                print(e)
        
    BIP_Education_df['Address'] = addresses
    BIP_Education_df['Latitude'] = latitudes
    BIP_Education_df['Longitudes'] = longitudes
    BIP_Education_df.to_pickle('../MainProjectDatafiles/BIP_Education_df.pickle')  # save as a pickle file
    BIP_Education_df.dropna(subset=["Latitude"], axis=0, inplace=True) # simply drop whole row with NaN in "Latitude" column
else:
    BIP_Education_df = pd.read_pickle('../MainProjectDatafiles/BIP_Education_df.pickle')
    BIP_Education_df.dropna(subset=["Latitude"], axis=0, inplace=True) # simply drop whole row with NaN in "Latitude" column
    
BIP_Education_df.head()

Unnamed: 0,LocNameShort,UnitNameShort,LocName,InstitutioName,LocName_RSPO,RSPO,FullNameLoc,FullNameInstitu,LocationType,UnitType,AttendeeCathegory,OverseeingUnit,DirectOversee,CityName,District,CityInformationSystem,StreetName,BuildingNumber,Zip_Code,PrincipalFirstName,PrincipalLastName,Telephone,Fax,Website,BIP_site,eMail,unitNumber,MaintainsInternatOrBursa,BudgetUnit,NumberOfStudents,NumberOfKindekartenChildren,SumOfKinderChildrenAndStudents,UnitsSumStudents,UnitsSumKindegartenChildren,SumNumberOfUnits,BilingualUnits,IntegrationUnits,SportUnits,MasterOfSportsUnits,InternationalUnits,SpecialUnits,ProWorkUnits,TherapeuticalUnits,ExperimentalUnits,Comments,IsItCombineWIthOtherSchools,TypeOfUnit[Generalization],IsOnIndependentAndComplexUnitsList,IsSchoolsandPreschools,IsSchoolOrKindergartenInComplex,IsSchoolInComplex,IsMIddleSchoolInComplex,IsKindergartenInComplex,IsPrimarySchoolInComplex,IsHighSchoolInComplex,IsTechnicalSchoolInComplex,IsTradeSchoolInComplex,IsPostHighSchoolInComplex,IsSpecialSchoolInComplex,EducationalValueAdded,RSPO_Search,SchoolVicinity,AreaOfSupportUnitCoverage,DominatingHendicapp,HospitalUnit,Longitude_Base,Latitude_Base,Has_[Wars_i_Sawa]_certificate,IsOnWarsaw2022RevitalizationArea,BEpoolIdentification,Address,Latitude,Longitudes
0,_AGRYKOLA,_AGRYKOLA_MOS_1,17316253,146735464,87072,115214,"Warszawskie Centrum Sportu Młodzieżowego ""Agry...","Międzyszkolny Ośrodek Sportowy nr 1 ""Agrykola""",zespół lub jednostka złożona,międzyszkolny ośrodek sportowy,dzieci lub młodzież,nie,Biuro Edukacji,Warszawa,Śródmieście,Ujazdów,ul. Myśliwiecka,9,00-459,Mirosław,Robak,226229107,226229106,http://agrykola-noclegi.pl,http://www.agrykola-noclegi.pl/pl/s307/BIP.html,sekretariat.agrykola@gmail.com,1.0,nie,jednostka budżetowa,,,,,,,,,,,,,,,,,w zespole,jednostka inna niż przedszkole lub szkoła,tak,,,,,,,,,,,,,https://rspo.men.gov.pl/rspo/115214,,Poradnia Psychologiczno-Pedagogiczna nr 11,,,21.035181,52.22057,,,1059,ul. Myśliwiecka 9 00-459 Warszawa,52.22057,21.035181
1,_AGRYKOLA,_AGRYKOLA_O_ST,17316253,280566859,87072,86896,"Warszawskie Centrum Sportu Młodzieżowego ""Agry...",Pozaszkolna Placówka Specjalistyczna – Międzys...,zespół lub jednostka złożona,pozaszkolna placówka specjalistyczna,bez kategorii,nie,Biuro Edukacji,Pozezdrze,poza Warszawą,,Stręgielek,40,11-610,Mirosław,Robak,874279012,874279014,http://omega.mazury.info,http://www.bip.omega.mazury.info,stregielek@edu.um.warszawa.pl,,nie,jednostka budżetowa,,,,,,,,,,,,,,,,,w zespole,jednostka inna niż przedszkole lub szkoła,nie,,,,,,,,,,,,,https://rspo.men.gov.pl/rspo/86896,,,,,,,,,1112,"Stręgielek, gmina Pozezdrze, powiat węgorzewsk...",54.174352,21.869453
2,_AGRYKOLA,_AGRYKOLA_SSM_1,17316253,146735398,87072,115215,"Warszawskie Centrum Sportu Młodzieżowego ""Agry...",Szkolne Schronisko Młodzieżowe nr 1,zespół lub jednostka złożona,szkolne schronisko młodzieżowe,dzieci lub młodzież,nie,Biuro Edukacji,Warszawa,Śródmieście,Ujazdów,ul. Myśliwiecka,9,00-459,Mirosław,Robak,226229110,226229105,http://agrykola-noclegi.pl,http://www.agrykola-noclegi.pl/pl/s307/BIP.html,recepcja@agrykola-noclegi.pl,1.0,nie,jednostka budżetowa,,,,,,,,,,,,,,,,,w zespole,jednostka inna niż przedszkole lub szkoła,tak,,,,,,,,,,,,,https://rspo.men.gov.pl/rspo/115215,,Poradnia Psychologiczno-Pedagogiczna nr 11,,,21.035181,52.22057,,,1060,ul. Myśliwiecka 9 00-459 Warszawa,52.22057,21.035181
3,_BS_04,_BS_04,11782304,11782304,86891,86891,Bursa Szkolna nr 4,Bursa Szkolna nr 4,bursa internat,bursa internat,dzieci lub młodzież,nie,Biuro Edukacji,Warszawa,Wola,Koło,ul. Księcia Janusza,45/47,01-452,Piotr,Iwiński,228361813,228773364,http://www.bursa4.waw.pl,http://bursa4.bip.um.warszawa.pl,bursa4@bursa4.waw.pl,4.0,tak,jednostka budżetowa,,,,,,,,,,,,,,,,,samodzielna,jednostka inna niż przedszkole lub szkoła,tak,,,,,,,,,,,,,https://rspo.men.gov.pl/rspo/86891,,Poradnia Psychologiczno-Pedagogiczna nr 2,,,20.939062,52.245195,,,484,"Technikum Budowlane nr 5, 45/47, Księcia Janus...",52.244422,20.939565
4,_BS_05,_BS_05,192407,192407,86886,86886,Bursa nr 5 im. ppłk mgr inż. Grażyny Lipińskiej,Bursa nr 5 im. ppłk mgr inż. Grażyny Lipińskiej,bursa internat,bursa internat,dzieci lub młodzież,nie,Biuro Edukacji,Warszawa,Praga Południe,Grochów,ul. Zagójska,3,04-160,Marta,Baj,228108488,228108488,http://www.bursa5.edu.pl,http://bursa5.bip.um.warszawa.pl,bursa@bursa.edu.pl,5.0,tak,jednostka budżetowa,,,,,,,,,,,,,,,,,samodzielna,jednostka inna niż przedszkole lub szkoła,tak,,,,,,,,,,,,,https://rspo.men.gov.pl/rspo/86886,,Poradnia Psychologiczno – Pedagogiczna nr 16,,,21.092077,52.241321,,,119,ul. Zagójska 3 04-160 Warszawa,52.241321,21.092077


In [13]:
# Saving BIP_Education_df data in the format corresponding to FourSquare API data:
# (the same file name is now used for the existence check, the save and the load)
if 'warsaw_BIP_Education.pickle' not in os.listdir('../MainProjectDatafiles/'):
    warsaw_BIP_Education = BIP_Education_df[['District', 'FullNameInstitu', 'Latitude', 'Longitudes']]
    # Dropping units located outside Warsaw; .copy() makes the in-place edits below safe:
    warsaw_BIP_Education = warsaw_BIP_Education[warsaw_BIP_Education['District']!='poza Warszawą'].copy()
    warsaw_BIP_Education.rename(columns={'Latitude': 'Neighborhood Latitude', 'Longitudes': 'Neighborhood Longitude',
                                   'FullNameInstitu': 'Venue'}, inplace=True)
    warsaw_BIP_Education['Venue Latitude'] = BIP_Education_df['Latitude']
    warsaw_BIP_Education['Venue Longitude'] = BIP_Education_df['Longitudes']
    warsaw_BIP_Education['Venue Category'] = BIP_Education_df['UnitType']
    warsaw_BIP_Education.reset_index(drop=True, inplace=True)
    N_lat, N_long = [], []
    for eachItem in warsaw_BIP_Education['District']:  # iterating over the filtered rows only
        for eachDist in districtsGeoPolygon:
            if eachItem==eachDist[0]:
                N_lat.append(eachDist[2][1]) 
                N_long.append(eachDist[2][0])
    warsaw_BIP_Education['Neighborhood Latitude'] = N_lat
    warsaw_BIP_Education['Neighborhood Longitude'] = N_long       
    warsaw_BIP_Education.to_pickle('../MainProjectDatafiles/warsaw_BIP_Education.pickle')  # save as a pickle file
    warsaw_BIP_Education = pd.read_pickle('../MainProjectDatafiles/warsaw_BIP_Education.pickle')
else:
    warsaw_BIP_Education = pd.read_pickle('../MainProjectDatafiles/warsaw_BIP_Education.pickle')
warsaw_BIP_Education.head()

Unnamed: 0,District,Venue,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
0,Śródmieście,"Międzyszkolny Ośrodek Sportowy nr 1 ""Agrykola""",52.236558,21.01659,52.22057,21.035181,międzyszkolny ośrodek sportowy
1,Śródmieście,Szkolne Schronisko Młodzieżowe nr 1,52.236558,21.01659,52.22057,21.035181,szkolne schronisko młodzieżowe
2,Wola,Bursa Szkolna nr 4,52.231552,20.947744,52.244422,20.939565,bursa internat
3,Praga Południe,Bursa nr 5 im. ppłk mgr inż. Grażyny Lipińskiej,52.244749,21.074174,52.241321,21.092077,bursa internat
4,Wola,Bursa nr 6,52.231552,20.947744,52.248368,20.976765,bursa internat
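
The nested loop above looks up each row's district in `districtsGeoPolygon` and copies the centroid coordinates. A minimal sketch of the same lookup as a dictionary, assuming each entry has the shape `(name, polygon, (longitude, latitude))` implied by the `eachDist[0]` and `eachDist[2]` indexing. The helper name `district_centroids` is hypothetical, and the sample entries reuse centroids from the output above:

```python
def district_centroids(district_names, districts):
    """Return (latitudes, longitudes) lists, one entry per district name.

    `districts` is assumed to hold (name, polygon, (lon, lat)) tuples.
    Unknown names yield None instead of being skipped, so the result
    always has the same length as the input.
    """
    lookup = {d[0]: d[2] for d in districts}
    lats, lons = [], []
    for name in district_names:
        lon, lat = lookup.get(name, (None, None))
        lats.append(lat)
        lons.append(lon)
    return lats, lons


# Illustrative entries; polygons are omitted (None) for brevity
sample_districts = [
    ('Śródmieście', None, (21.01659, 52.236558)),
    ('Wola', None, (20.947744, 52.231552)),
]
lats, lons = district_centroids(['Wola', 'Śródmieście', 'poza Warszawą'], sample_districts)
print(lats)
print(lons)
```

Because unmatched names produce `None` rather than shortening the list, assigning the result to a dataframe column cannot fail on a length mismatch.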


### Processing theater data:

In [14]:
if 'WarsawTheater_DF.pickle' not in os.listdir('../MainProjectDatafiles/'):
    # Adding dataframes for Warsaw's theaters. Source: https://api.um.warszawa.pl/#
    APIKEY = credentialsData[1][1]
    URL = f'https://api.um.warszawa.pl/api/action/wfsstore_get/?id=e26218cb-61ec-4ccb-81cc-fd19a6fee0f8&apikey={APIKEY}'
    response = requests.post(URL)
    jData = json.loads(response.content)
    for eachItem in jData['result'].keys():
        print(eachItem, len(eachItem))
    
    # Processing 'featureMemberProperties':
    pd.DataFrame(jData['result']['featureMemberProperties'])

    # Processing 'featureMemberList' object:
    theaterArray = []
    headers = ['GEOMETRY', 'LATITUDE', 'LONGITUDE', 'OBJECTID', 'ULICA', 'NUMER', 'KOD', 'OPIS', 'DZIELNICA', 'TEL_FAX', 'WWW', 'MAIL', 'AKTU_DAN']
    for eachItem in jData['result']['featureMemberList']:
        currentRowData = [eachItem['geometry']['type'], eachItem['geometry']['coordinates'][0]['latitude'], eachItem['geometry']['coordinates'][0]['longitude']]
        # Collecting the key/value property pairs of the current feature
        # (a separate loop variable avoids shadowing eachItem):
        addRowData = {eachProperty['key']: eachProperty['value'] for eachProperty in eachItem['properties']}
        # Properties missing from the response are filled with NaN:
        for i in headers[3:]:
            currentRowData.append(addRowData.get(i, np.nan))
        theaterArray.append(currentRowData)

    WarsawTheater_DF = pd.DataFrame(theaterArray, columns=headers)
    WarsawTheater_DF = WarsawTheater_DF.drop('GEOMETRY', axis=1) # Dropping a worthless column from a dataframe

    WarsawTheater_DF.to_pickle('../MainProjectDatafiles/WarsawTheater_DF.pickle')
    WarsawTheater_DF = pd.read_pickle('../MainProjectDatafiles/WarsawTheater_DF.pickle')
else:
    WarsawTheater_DF = pd.read_pickle('../MainProjectDatafiles/WarsawTheater_DF.pickle')
    
WarsawTheater_DF.rename(columns={'LATITUDE': 'Latitude', 'LONGITUDE': 'Longitude', 'OBJECTID': 'ObjectID',
                                 'ULICA': 'StreetName', 'NUMER': 'BuildingNumber', 'KOD': 'Zip_Code',
                                 'OPIS': 'Description', 'DZIELNICA': 'District', 'TEL_FAX': 'TEL_FAX',
                                 'WWW': 'WebsiteWWW', 'MAIL': 'eMail', 'AKTU_DAN': 'LastDataUpdate'}, 
                        inplace=True) # Renaming columns from Polish to English
WarsawTheater_DF.head()

Unnamed: 0,Latitude,Longitude,ObjectID,StreetName,BuildingNumber,Zip_Code,Description,District,TEL_FAX,WebsiteWWW,eMail,LastDataUpdate
0,52.216505,21.022159,344,Litewska,3,00-589,Teatr Syrena,Śródmieście,"22 101 16 16, 22 101 16 13",http://www.teatrsyrena.pl/,,czerwiec 2014
1,52.240616,20.998012,367,Elektoralna,12,,Mazowieckie Centrum Kultury i Sztuki,Śródmieście,22 586 42 59,www.teatrpraga.pl,sekretariat@teatrpraga.pl,czerwiec 2014
2,52.227977,21.026202,339,M. Konopnickiej,6,00-491,Teatr IMKA,Śródmieście,22 339 05 20,http://www.teatr-imka.pl/,mailto:rezerwacja@teatr-imka.pl,czerwiec 2014
3,52.23268,20.991759,321,Żelazna,51/53,00-841,Teatr Scena Prezentacje,Wola,22 620 82 88 / 22 620 34 90,http://www.teatrprezentacje.pl/,,czerwiec 2014
4,52.252501,21.008513,366,Rynek Nowego Miasta,5/7,00-229,Teatr WARSawy,Śródmieście,509 780 261,www.teatrkonsekwentny.pl,,czerwiec 2014
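
The parsing loop above flattens each `featureMemberList` entry into one row. A minimal sketch of that step as a standalone function, assuming the entry shape the loop relies on (a `geometry` with a list of coordinate dicts and a `properties` list of `key`/`value` pairs). The function name `flatten_feature` is hypothetical, and the sample feature is hand-made with values borrowed from the first row of the table above, not real API output:

```python
import math

HEADERS = ['LATITUDE', 'LONGITUDE', 'OBJECTID', 'ULICA', 'NUMER', 'KOD', 'OPIS',
           'DZIELNICA', 'TEL_FAX', 'WWW', 'MAIL', 'AKTU_DAN']


def flatten_feature(feature, headers=HEADERS):
    """Turn one featureMemberList entry into a flat row ordered like `headers`."""
    point = feature['geometry']['coordinates'][0]
    props = {p['key']: p['value'] for p in feature['properties']}
    row = [point['latitude'], point['longitude']]
    # Properties missing from the entry become NaN, as in the original loop
    row += [props.get(key, math.nan) for key in headers[2:]]
    return row


# Hand-made feature in the assumed response shape
feature = {
    'geometry': {'type': 'ShapePoint',
                 'coordinates': [{'latitude': 52.216505, 'longitude': 21.022159}]},
    'properties': [{'key': 'OBJECTID', 'value': '344'},
                   {'key': 'ULICA', 'value': 'Litewska'},
                   {'key': 'OPIS', 'value': 'Teatr Syrena'}],
}
row = flatten_feature(feature)
print(row)
```

Keeping the flattening in one function makes it easy to check a single entry before building the full dataframe.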


In [15]:
# Saving WarsawTheater_DF data in the format corresponding to FourSquare API data:
if 'warsaw_theaters.pickle' not in os.listdir('../MainProjectDatafiles/'):
    warsaw_theaters = WarsawTheater_DF[['District', 'Description', 'Latitude', 'Longitude']]
    warsaw_theaters = warsaw_theaters[warsaw_theaters['District']!='poza Warszawą']
    warsaw_theaters.rename(columns={'Latitude': 'Neighborhood Latitude', 'Longitude': 'Neighborhood Longitude',
                                   'Description': 'Venue'}, inplace=True)
    warsaw_theaters['Venue Latitude'] = WarsawTheater_DF['Latitude']
    warsaw_theaters['Venue Longitude'] = WarsawTheater_DF['Longitude']
    warsaw_theaters['Venue Category'] = 'theater'
    warsaw_theaters.reset_index(drop=True, inplace=True)
    N_lat, N_long = [], []
    for eachItem in warsaw_theaters['District']:
        if eachItem=='Praga-Północ': eachItem='Praga Północ'
        if eachItem=='Warszawa': eachItem='Śródmieście'  
        if eachItem=='Praga-Południe': eachItem='Praga Południe' 
        for eachDist in districtsGeoPolygon:
            if eachItem==eachDist[0]:
                N_lat.append(eachDist[2][1]) 
                N_long.append(eachDist[2][0])
    warsaw_theaters['Neighborhood Latitude'] = N_lat
    warsaw_theaters['Neighborhood Longitude'] = N_long   
    warsaw_theaters.replace('Praga-Południe', 'Praga Południe', inplace = True) # Replacing dataframe values with a specified item
    warsaw_theaters.replace('Praga-Północ', 'Praga Północ', inplace = True) # Replacing dataframe values with a specified item
    warsaw_theaters.replace('Warszawa', 'Śródmieście', inplace = True) # Replacing dataframe values with a specified item
    warsaw_theaters.to_pickle('../MainProjectDatafiles/warsaw_theaters.pickle')  # save as a pickle file
    warsaw_theaters = pd.read_pickle('../MainProjectDatafiles/warsaw_theaters.pickle')
else:
    warsaw_theaters = pd.read_pickle('../MainProjectDatafiles/warsaw_theaters.pickle')
warsaw_theaters.head()

Unnamed: 0,District,Venue,Neighborhood Latitude,Neighborhood Longitude,Venue Latitude,Venue Longitude,Venue Category
0,Śródmieście,Teatr Syrena,52.236558,21.01659,52.216505,21.022159,theater
1,Śródmieście,Mazowieckie Centrum Kultury i Sztuki,52.236558,21.01659,52.240616,20.998012,theater
2,Śródmieście,Teatr IMKA,52.236558,21.01659,52.227977,21.026202,theater
3,Wola,Teatr Scena Prezentacje,52.231552,20.947744,52.23268,20.991759,theater
4,Śródmieście,Teatr WARSawy,52.236558,21.01659,52.252501,21.008513,theater
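
The theater data spells some districts differently from the other sources, so the cell above maps `'Praga-Północ'`, `'Praga-Południe'` and `'Warszawa'` to the names used elsewhere. A small sketch of the same mapping kept in one place; the names `DISTRICT_ALIASES` and `normalize_district` are hypothetical:

```python
# Spelling variants seen in the theater data, mapped to the names used elsewhere
DISTRICT_ALIASES = {
    'Praga-Północ': 'Praga Północ',
    'Praga-Południe': 'Praga Południe',
    'Warszawa': 'Śródmieście',
}


def normalize_district(name):
    """Return the canonical district name, leaving unknown names unchanged."""
    return DISTRICT_ALIASES.get(name, name)


names = ['Praga-Północ', 'Wola', 'Warszawa']
normalized = [normalize_district(n) for n in names]
print(normalized)
```

With pandas, `warsaw_theaters['District'].replace(DISTRICT_ALIASES)` would apply the same dictionary to the `District` column only, whereas a dataframe-wide `replace` also rewrites any other cell that happens to equal one of the aliases.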


### Processing FourSquare API data:

In [16]:
# Specifying credentials:
CLIENT_ID = credentialsData[0][1] # your Foursquare ID
CLIENT_SECRET = credentialsData[0][2] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 200

In [17]:
# Using the course-based function to repeat the venue acquisition across all neighborhoods:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        
        # Create the API request URL:
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # Make the GET request; on failure print the response and skip this location,
        # so that results from a previous iteration are never reused:
        response = None
        try:
            response = requests.get(url).json()["response"]
            results = response['groups'][0]['items']
        except Exception as e:
            print(response)
            print(e)
            continue
            
        print(name, len(results))
        
        # Returns only relevant information for each nearby venue:
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
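
The list comprehension in `getNearbyVenues` reads `categories[0]` for every venue, which raises `IndexError` if a venue comes back without a category. A hedged sketch of the row-building step with that case handled, assuming only the item shape the function above already relies on. The helper name `parse_explore_items` and the `'Unknown'` placeholder are choices made here, and the second sample item is invented for illustration:

```python
def parse_explore_items(name, lat, lng, items):
    """Flatten 'explore' items into 7-field rows.

    Venues without any category get the placeholder 'Unknown'
    instead of raising IndexError.
    """
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows


# Hand-made items in the assumed response shape (not real API output)
items = [
    {'venue': {'name': 'Agrykola',
               'location': {'lat': 52.219628, 'lng': 21.033156},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Example Venue',
               'location': {'lat': 52.22, 'lng': 21.03},
               'categories': []}},
]
rows = parse_explore_items('Śródmieście', 52.22057, 21.035181, items)
for r in rows:
    print(r)
```

Separating parsing from the HTTP request also lets the parsing be checked without calling the API.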

In [18]:
# Running the above function to acquire venue data:
if 'warsaw_venues.pickle' not in os.listdir('../MainProjectDatafiles/'):
    warsaw_venues = getNearbyVenues(names=schools1['District'],
                                       latitudes=schools1['Latitude'],
                                       longitudes=schools1['Longitudes'])
    warsaw_venues.rename(columns={'Neighborhood': 'District'}, inplace=True)
    warsaw_venues = warsaw_venues.drop_duplicates('Venue')
    warsaw_venues.reset_index(drop=True, inplace=True)
    N_lat, N_long = [], []
    for eachItem in warsaw_venues['District']:
        for eachDist in districtsGeoPolygon:
            if eachItem==eachDist[0]:
                N_lat.append(eachDist[2][1]) 
                N_long.append(eachDist[2][0])
    warsaw_venues['Neighborhood Latitude'] = N_lat
    warsaw_venues['Neighborhood Longitude'] = N_long       
    warsaw_venues.to_pickle('../MainProjectDatafiles/warsaw_venues.pickle')  # save as a pickle file
    warsaw_venues = pd.read_pickle('../MainProjectDatafiles/warsaw_venues.pickle')
else:
    warsaw_venues = pd.read_pickle('../MainProjectDatafiles/warsaw_venues.pickle')

In [19]:
# Checking the size of the resulting dataframe:
print(warsaw_venues.shape)
warsaw_venues.head()

(3469, 7)


Unnamed: 0,District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Śródmieście,52.236558,21.01659,Palace On The Isle (Pałac Łazienkowski (Pałac ...,52.21486,21.035599,Palace
1,Śródmieście,52.236558,21.01659,Stadion Miejski Legii Warszawa im. Marszałka J...,52.22082,21.040685,Soccer Stadium
2,Śródmieście,52.236558,21.01659,Polskie Radio Program 3,52.22025,21.036175,Radio Station
3,Śródmieście,52.236558,21.01659,Agrykola,52.219628,21.033156,Park
4,Śródmieście,52.236558,21.01659,Pałac Myślewicki,52.215619,21.03834,Palace


In [20]:
# Checking how many venues were returned for each neighborhood:
warsaw_venues.groupby(['District']).count()

District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bemowo,105,105,105,105,105,105
Białołęka,143,143,143,143,143,143
Bielany,141,141,141,141,141,141
Mokotów,433,433,433,433,433,433
Ochota,173,173,173,173,173,173
Praga Południe,268,268,268,268,268,268
Praga Północ,134,134,134,134,134,134
Rembertów,19,19,19,19,19,19
Targówek,68,68,68,68,68,68
Ursus,61,61,61,61,61,61


In [21]:
len(warsaw_venues['Venue'].unique())

3469

In [22]:
# Let's find out how many unique categories can be curated from all the returned venues:
print('There are {} unique categories.'.format(len(warsaw_venues['Venue Category'].unique())))

There are 337 unique categories.


# Part 2 - Data visualization:

Creating a map of Warsaw with neighborhoods superimposed on top:

In [23]:
print(schools1.shape)
print(BIP_Education_df.shape)
print(WarsawTheater_DF.shape)
print(warsaw_venues.shape)

(1580, 29)
(954, 73)
(48, 12)
(3469, 7)


In [24]:
# Folium references: https://python-visualization.github.io/folium/quickstart.html#GeoJSON/TopoJSON-Overlays

address = 'Warsaw, Poland'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
w_latitude = location.latitude
w_longitude = location.longitude
print(f'The geograpical coordinate of Warsaw are: {w_latitude}, {w_longitude}.')

# Create a map of Warsaw using latitude and longitude values:
mapWarsaw = folium.Map(location=[w_latitude, w_longitude], zoom_start=11)

# style = {'fillColor': '#00000000', 'color': '#00000000'} # transparent GeoJson
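# Leaflet path options use 'color' for the stroke; 'lineColor' is not a recognized option and is ignored:
style = {'fillColor': '#00FFFFFF', 'color': '#00FFFFFF', 'opacity': 0.25, 'fillOpacity': 0.05} # https://leafletjs.com/reference-1.5.0.html#path-option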
folium.GeoJson(district_geo,
               name='geojson',
               style_function=lambda x: style
              ).add_to(mapWarsaw) 

# Add schools1 markers to map:
for lat, lng, schoolName, schoolType in zip(schools1['Latitude'],
                                            schools1['Longitudes'],
                                            schools1['Name'],
                                            schools1['Type']):
    #print(index)
    schoolName = schoolName.replace('\"', '').replace("\'", "")
    label = (f'{schoolName}, {schoolType}')
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [float(lat), float(lng)],
        radius=2,
        popup=label,
        color='orange',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(mapWarsaw)  

# Add BIP_Education_df markers to map:
for lat, lng, schoolName, schoolType, district in zip(BIP_Education_df['Latitude'],
                                                      BIP_Education_df['Longitudes'],
                                                      BIP_Education_df['FullNameLoc'],
                                                      BIP_Education_df['UnitType'],
                                                      BIP_Education_df['District']):
    label = (f'{schoolName}, {schoolType}, {district}')
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='green',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(mapWarsaw)      

# Add theater markers to map:
for lat, lng, t_name, neighborhood in zip(WarsawTheater_DF['Latitude'], WarsawTheater_DF['Longitude'], 
                                          WarsawTheater_DF['Description'], WarsawTheater_DF['District']):
    label = (f'{t_name}, {neighborhood}')
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(mapWarsaw)  

# Add venue markers to map:
for lat, lng, v_name, district, category in zip(warsaw_venues['Venue Latitude'], warsaw_venues['Venue Longitude'], 
                                      warsaw_venues['Venue'], warsaw_venues['District'], warsaw_venues['Venue Category']):
    label = (f'{v_name}, {district}, {category}')
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=2,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.2,
        parse_html=False).add_to(mapWarsaw) 
    
# Workaround for folium in Jupyter not displaying a map with more than about 500 markers:
# reference: https://github.com/python-visualization/folium/issues/812#issuecomment-437483792
def embed_map(m):
    from IPython.display import IFrame
    m.save('WarsawVenues.html')
    return IFrame('WarsawVenues.html', width='100%', height='750px')

embed_map(mapWarsaw)

The geograpical coordinate of Warsaw are: 52.2337172, 21.07141112883227.
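
The four marker loops above differ only in their data source, label and color. A minimal sketch of how the shared part could be factored out: a generator that yields keyword arguments for `folium.CircleMarker`. The name `marker_specs` is hypothetical, and passing the popup as a plain string (instead of a `folium.Popup` built with `parse_html=True`) is an assumption of this sketch, which is why quotes are stripped from every label:

```python
def marker_specs(rows, color, fill_opacity=0.7):
    """Yield folium.CircleMarker keyword arguments for (lat, lng, label) rows."""
    for lat, lng, label in rows:
        yield {
            'location': [float(lat), float(lng)],
            'radius': 2,
            # Quotes are removed so the label cannot break the generated HTML/JS
            'popup': str(label).replace('"', '').replace("'", ''),
            'color': color,
            'fill': True,
            'fill_color': '#3186cc',
            'fill_opacity': fill_opacity,
        }


specs = list(marker_specs([('52.216505', '21.022159', 'Teatr "Syrena", Śródmieście')], 'blue'))
print(specs[0])
```

Each dictionary can then be unpacked into a marker, for example `folium.CircleMarker(**spec).add_to(mapWarsaw)` inside a loop over `marker_specs(...)`, once per data source and color.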


### Merging all dataframes into one for clustering:

In [25]:
print(warsaw_schools.shape)
print(warsaw_BIP_Education.shape)
print(warsaw_theaters.shape)
print(warsaw_venues.shape)

(1580, 7)
(948, 7)
(48, 7)
(3469, 7)


In [26]:
frames = [warsaw_schools, warsaw_BIP_Education, warsaw_theaters, warsaw_venues]
warsaw_venues = pd.concat(frames)
warsaw_venues.reset_index(drop=True, inplace=True)
warsaw_venues.head(3)

Unnamed: 0,District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category,Venue Latitude,Venue Longitude
0,Śródmieście,52.236558,21.01659,Bednarska Szkoła Realna,Szkoła ponadgimnazjalna/ponadpodstawowa,52.2177,21.0397
1,Praga Południe,52.244749,21.074174,Pedagogiczna Biblioteka Wojewódzka w Warszawie,Inna placówka systemu oświaty lub placówka spo...,52.2472,21.0637
2,Ochota,52.21405,20.96109,Branżowa Szkoła Samochodowa I stopnia nr 2,Szkoła ponadgimnazjalna/ponadpodstawowa,52.2137,20.9676
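
The column order in the merged output differs from the earlier `warsaw_venues` tables (`Venue Category` now precedes the coordinates), which is consistent with `pd.concat` matching columns by name rather than by position. A toy sketch of that behavior with illustrative values; `ignore_index=True` is shown as an alternative to the separate `reset_index` call:

```python
import pandas as pd

# Two toy frames with the same columns in a different order (illustrative values)
a = pd.DataFrame({'District': ['Wola'], 'Venue': ['A'], 'Venue Category': ['Park']})
b = pd.DataFrame({'Venue Category': ['theater'], 'District': ['Ochota'], 'Venue': ['B']})

# Columns are matched by name; ignore_index=True renumbers rows 0..n-1
merged = pd.concat([a, b], ignore_index=True)
print(merged)

# The merged frame should hold exactly the rows of its parts
assert len(merged) == len(a) + len(b)
```

The same length check applies to the real data: the four shapes printed in `In [25]` give 1580 + 948 + 48 + 3469 = 6045 rows.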


In [27]:
warsaw_venues.tail(3)

Unnamed: 0,District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category,Venue Latitude,Venue Longitude
6042,Wawer,52.186094,21.183994,Mazowiecki Park Krajobrazowy im. Czesława Łaszka,Forest,52.2235,21.1682
6043,Wawer,52.186094,21.183994,Gorka Delmacha,Park,52.2234,21.1683
6044,Wawer,52.186094,21.183994,Odrodzenia,Bus Station,52.2217,21.1628


In [28]:
warsaw_venues.groupby(['District']).count()

District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category,Venue Latitude,Venue Longitude
Bemowo,206,206,206,206,206,206
Białołęka,312,312,312,312,312,312
Bielany,303,303,303,303,303,303
Mokotów,762,762,762,762,762,762
Ochota,309,309,309,309,309,309
Praga Południe,506,506,506,506,506,506
Praga Północ,250,250,250,250,250,250
Rembertów,51,51,51,51,51,51
Targówek,182,182,182,182,182,182
Ursus,128,128,128,128,128,128


# Part 3 - Comparing Neighbourhoods by Available Resources:

In [29]:
# One hot encoding:
warsaw_onehot = pd.get_dummies(warsaw_venues[['Venue Category']], prefix="", prefix_sep="")

# Add 'District' column back to dataframe:
warsaw_onehot['District'] = warsaw_venues['District'] 

# Move 'District' column to the first column:
fixed_columns = [warsaw_onehot.columns[-1]] + list(warsaw_onehot.columns[:-1])
warsaw_onehot = warsaw_onehot[fixed_columns]

warsaw_onehot.head()

Unnamed: 0,District,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Service,American Restaurant,Amphitheater,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bakery,Bank,Bar,Baseball Field,Basketball Court,Bay,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buffet,Building,Bulgarian Restaurant,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Camera Store,Campground,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Castle,Caucasian Restaurant,Cave,Cemetery,Centrum Kształcenia Ustawicznego,Chinese Restaurant,Chocolate Shop,Church,Circus,City Hall,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Cafeteria,College Gym,College Library,College Soccer Field,College Stadium,Comedy Club,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Czech Restaurant,Dairy Store,Dance Studio,Deli / Bodega,Dentist's Office,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Spot,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Go Kart Track,Golf Course,Gourmet Shop,Government Building,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotel Pool,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Inna placówka systemu oświaty lub placówka spoza systemu oświaty,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Kosher Restaurant,Lake,Laser Tag,Laundromat,Lebanese Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Młodzieżowy Ośrodek Socjoterapii,Młodzieżowy Ośrodek Wychowawczy,Neighborhood,New American Restaurant,Night Market,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Outdoors & Recreation,Outlet Mall,Outlet Store,Paintball Field,Palace,Paper / Office Supplies Store,Park,Parking,Pedestrian Plaza,Pelmeni House,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pizza Place,Planetarium,Platform,Playground,Plaza,Polish Restaurant,Pool,Pool Hall,Print Shop,"Przedszkole, szkoła podstawowa, gimnazjum",Pub,Public Art,RV Park,Racetrack,Radio Station,Ramen Restaurant,Recording Studio,Recreation Center,Rental Car Location,Rest Area,Restaurant,River,Road,Rock Club,Russian Restaurant,SOW,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Ski Area,Ski Trail,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Spanish Restaurant,Specjalny Ośrodek Szkolno-Wychowawczy,Sporting Goods Shop,Sports Bar,Sports Club,Stables,Stadium,Steakhouse,Street Art,Street Food Gathering,Supermarket,Surf Spot,Sushi Restaurant,Szkoła artystyczna,Szkoła ponadgimnazjalna/ponadpodstawowa,Taco Place,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Tibetan Restaurant,Tiki Bar,Toy / Game Store,Trade School,Trail,Train,Train Station,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit,biuro finansów oświaty,bursa internat,centrum kształcenia zawodowego,liceum ogólnokształcące,międzyszkolny ośrodek sportowy,młodzieżowy dom kultury,ognisko pracy pozaszkolnej,ogród jordanowski,ośrodek rewalidacyjno-wychowawczy,pałac młodzieży,placówka artystyczna (ognisko),placówka doskonalenia nauczycieli,poradnia psychologiczno-pedagogiczna,pozaszkolna placówka specjalistyczna,przedszkole,specjalistyczna poradnia psychologiczno-pedagogiczna,szkolne schronisko młodzieżowe,szkoła branżowa I stopnia,szkoła podstawowa,szkoła podstawowa artystyczna,szkoła policealna,szkoła specjalna przysposabiająca do pracy,technikum,theater,zespół lub jednostka złożona
0,Śródmieście,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Praga Południe,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Ochota,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Wola,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Ursus,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
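
Each row of the one-hot table above holds a single 1 in the column of its venue's category. A toy sketch of the same encoding on three illustrative rows, followed by a per-district sum as one possible next step. The `dtype=int` argument and the use of `insert` are choices made for this sketch, not taken from the notebook:

```python
import pandas as pd

# Toy stand-in for warsaw_venues (values are illustrative, not taken from the dataset)
toy = pd.DataFrame({
    'District': ['Wola', 'Wola', 'Ochota'],
    'Venue Category': ['Park', 'Café', 'Park'],
})

# One indicator column per category; dtype=int keeps 0/1 instead of booleans
onehot = pd.get_dummies(toy[['Venue Category']], prefix="", prefix_sep="", dtype=int)
onehot.insert(0, 'District', toy['District'])  # put 'District' first in one step

# Venue counts per district and category
counts = onehot.groupby('District').sum()
print(counts)
```

Summing (or averaging) the indicator columns per district turns the long, sparse table into one row per district, which is the form a clustering step would typically consume.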


In [30]:
warsaw_onehot.tail()

Unnamed: 0,District,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Service,American Restaurant,Amphitheater,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bakery,Bank,Bar,Baseball Field,Basketball Court,Bay,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buffet,Building,Bulgarian Restaurant,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Camera Store,Campground,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Castle,Caucasian Restaurant,Cave,Cemetery,Centrum Kształcenia Ustawicznego,Chinese Restaurant,Chocolate Shop,Church,Circus,City Hall,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Cafeteria,College Gym,College Library,College Soccer Field,College Stadium,Comedy Club,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Czech Restaurant,Dairy Store,Dance Studio,Deli / Bodega,Dentist's Office,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Spot,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Go Kart Track,Golf Course,Gourmet Shop,Government Building,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotel Pool,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Inna placówka systemu oświaty lub placówka spoza systemu oświaty,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Kosher Restaurant,Lake,Laser Tag,Laundromat,Lebanese Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Młodzieżowy Ośrodek Socjoterapii,Młodzieżowy Ośrodek Wychowawczy,Neighborhood,New American Restaurant,Night Market,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Outdoors & Recreation,Outlet Mall,Outlet Store,Paintball Field,Palace,Paper / Office Supplies Store,Park,Parking,Pedestrian Plaza,Pelmeni House,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pizza Place,Planetarium,Platform,Playground,Plaza,Polish Restaurant,Pool,Pool Hall,Print Shop,"Przedszkole, szkoła podstawowa, gimnazjum",Pub,Public Art,RV Park,Racetrack,Radio Station,Ramen Restaurant,Recording Studio,Recreation Center,Rental Car Location,Rest Area,Restaurant,River,Road,Rock Club,Russian Restaurant,SOW,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Ski Area,Ski Trail,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Spanish Restaurant,Specjalny Ośrodek Szkolno-Wychowawczy,Sporting Goods Shop,Sports Bar,Sports Club,Stables,Stadium,Steakhouse,Street Art,Street Food Gathering,Supermarket,Surf Spot,Sushi Restaurant,Szkoła artystyczna,Szkoła ponadgimnazjalna/ponadpodstawowa,Taco Place,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Tibetan Restaurant,Tiki Bar,Toy / Game Store,Trade School,Trail,Train,Train Station,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit,biuro finansów oświaty,bursa internat,centrum kształcenia zawodowego,liceum ogólnokształcące,międzyszkolny ośrodek sportowy,młodzieżowy dom kultury,ognisko pracy pozaszkolnej,ogród jordanowski,ośrodek rewalidacyjno-wychowawczy,pałac młodzieży,placówka artystyczna (ognisko),placówka doskonalenia nauczycieli,poradnia psychologiczno-pedagogiczna,pozaszkolna placówka specjalistyczna,przedszkole,specjalistyczna poradnia psychologiczno-pedagogiczna,szkolne schronisko młodzieżowe,szkoła branżowa I stopnia,szkoła podstawowa,szkoła podstawowa artystyczna,szkoła policealna,szkoła specjalna przysposabiająca do pracy,technikum,theater,zespół lub jednostka złożona
6040,Bielany,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6041,Wawer,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6042,Wawer,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6043,Wawer,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6044,Wawer,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Examine the new dataframe size:

In [31]:
warsaw_onehot.shape

(6045, 372)

Grouping rows by district and taking the mean of the frequency of occurrence of each category:

In [32]:
warsaw_grouped = warsaw_onehot.groupby('District').mean().reset_index()
warsaw_grouped

Unnamed: 0,District,Accessories Store,Adult Boutique,African Restaurant,Airport,Airport Service,American Restaurant,Amphitheater,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bakery,Bank,Bar,Baseball Field,Basketball Court,Bay,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Buffet,Building,Bulgarian Restaurant,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Camera Store,Campground,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Castle,Caucasian Restaurant,Cave,Cemetery,Centrum Kształcenia Ustawicznego,Chinese Restaurant,Chocolate Shop,Church,Circus,City Hall,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Academic Building,College Cafeteria,College Gym,College Library,College Soccer Field,College Stadium,Comedy Club,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Czech Restaurant,Dairy Store,Dance Studio,Deli / Bodega,Dentist's Office,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Spot,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Football Stadium,Forest,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Go Kart Track,Golf Course,Gourmet Shop,Government Building,Greek Restaurant,Grilled Meat Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotel Pool,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Inna placówka systemu oświaty lub placówka spoza systemu oświaty,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Korean Restaurant,Kosher Restaurant,Lake,Laser Tag,Laundromat,Lebanese Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Młodzieżowy Ośrodek Socjoterapii,Młodzieżowy Ośrodek Wychowawczy,Neighborhood,New American Restaurant,Night Market,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Outdoor Sculpture,Outdoors & Recreation,Outlet Mall,Outlet Store,Paintball Field,Palace,Paper / Office Supplies Store,Park,Parking,Pedestrian Plaza,Pelmeni House,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pizza Place,Planetarium,Platform,Playground,Plaza,Polish Restaurant,Pool,Pool Hall,Print Shop,"Przedszkole, szkoła podstawowa, gimnazjum",Pub,Public Art,RV Park,Racetrack,Radio Station,Ramen Restaurant,Recording Studio,Recreation Center,Rental Car Location,Rest Area,Restaurant,River,Road,Rock Club,Russian Restaurant,SOW,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Ski Area,Ski Trail,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Spanish Restaurant,Specjalny Ośrodek Szkolno-Wychowawczy,Sporting Goods Shop,Sports Bar,Sports Club,Stables,Stadium,Steakhouse,Street Art,Street Food Gathering,Supermarket,Surf Spot,Sushi Restaurant,Szkoła artystyczna,Szkoła ponadgimnazjalna/ponadpodstawowa,Taco Place,Tapas Restaurant,Tattoo Parlor,Tea Room,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Tibetan Restaurant,Tiki Bar,Toy / Game Store,Trade School,Trail,Train,Train Station,Tram Station,Trattoria/Osteria,Tunnel,Turkish Restaurant,Udon Restaurant,Ukrainian Restaurant,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo,Zoo Exhibit,biuro finansów oświaty,bursa internat,centrum kształcenia zawodowego,liceum ogólnokształcące,międzyszkolny ośrodek sportowy,młodzieżowy dom kultury,ognisko pracy pozaszkolnej,ogród jordanowski,ośrodek rewalidacyjno-wychowawczy,pałac młodzieży,placówka artystyczna (ognisko),placówka doskonalenia nauczycieli,poradnia psychologiczno-pedagogiczna,pozaszkolna placówka specjalistyczna,przedszkole,specjalistyczna poradnia psychologiczno-pedagogiczna,szkolne schronisko młodzieżowe,szkoła branżowa I stopnia,szkoła podstawowa,szkoła podstawowa artystyczna,szkoła policealna,szkoła specjalna przysposabiająca do pracy,technikum,theater,zespół lub jednostka złożona
0,Bemowo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.004854,0.0,0.0,0.009709,0.0,0.0,0.063107,0.0,0.0,0.0,0.0,0.024272,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.004854,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.009709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.009709,0.0,0.004854,0.004854,0.0,0.009709,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.009709,0.0,0.0,0.009709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.019417,0.009709,0.009709,0.004854,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.009709,0.0,0.0,0.0,0.0,0.009709,0.004854,0.0,0.004854,0.019417,0.0,0.009709,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.004854,0.024272,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029126,0.0,0.0,0.038835,0.0,0.0,0.0,0.0,0.0,0.26699,0.004854,0.0,0.0,0.004854,0.0,0.0,0.0,0.004854,0.004854,0.004854,0.009709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.009709,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024272,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.009709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.0,0.004854,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004854,0.0,0.101942,0.0,0.0,0.0,0.058252,0.0,0.0,0.0,0.0,0.0,0.0
1,Białołęka,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.00641,0.00641,0.003205,0.003205,0.003205,0.0,0.003205,0.003205,0.003205,0.0,0.00641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.012821,0.073718,0.00641,0.003205,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.00641,0.0,0.0,0.0,0.003205,0.0,0.0,0.003205,0.009615,0.0,0.00641,0.0,0.0,0.0,0.003205,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.003205,0.003205,0.0,0.0,0.00641,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.009615,0.0,0.009615,0.0,0.0,0.003205,0.0,0.0,0.003205,0.003205,0.0,0.0,0.0,0.003205,0.00641,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.019231,0.0,0.016026,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.003205,0.003205,0.0,0.0,0.0,0.0,0.016026,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.00641,0.0,0.0,0.019231,0.0,0.0,0.009615,0.0,0.0,0.003205,0.003205,0.0,0.413462,0.003205,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.003205,0.003205,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016026,0.0,0.003205,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00641,0.0,0.003205,0.0,0.016026,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.003205,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.00641,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.0,0.003205,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003205,0.0,0.032051,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0
2,Bielany,0.0,0.0,0.0,0.0033,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016502,0.0033,0.0033,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0033,0.0,0.0033,0.0,0.0,0.0,0.0,0.0033,0.0033,0.0,0.0,0.006601,0.0,0.0033,0.0,0.0,0.006601,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.009901,0.046205,0.013201,0.0,0.0,0.0,0.023102,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006601,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.009901,0.0,0.0,0.0033,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0033,0.009901,0.0,0.006601,0.0,0.0033,0.009901,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006601,0.0,0.013201,0.019802,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006601,0.0,0.0033,0.0,0.0,0.0,0.0033,0.006601,0.0,0.0,0.023102,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013201,0.0033,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.029703,0.0,0.0,0.0,0.006601,0.0,0.0,0.0,0.0,0.006601,0.0,0.0,0.016502,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.224422,0.0033,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0033,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.006601,0.0033,0.0033,0.0,0.056106,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.016502,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0033,0.0,0.033003,0.0,0.006601,0.0,0.0,0.0,0.0,0.0,0.0,0.0033,0.0033,0.09571,0.0,0.0,0.0033,0.059406,0.0,0.0033,0.0,0.009901,0.0033,0.0033
3,Mokotów,0.001312,0.0,0.0,0.0,0.0,0.002625,0.0,0.001312,0.001312,0.002625,0.0,0.001312,0.0,0.007874,0.003937,0.0,0.0,0.0,0.0,0.001312,0.0,0.007874,0.0,0.006562,0.001312,0.0,0.0,0.0,0.0,0.001312,0.0,0.0,0.0,0.0,0.001312,0.001312,0.005249,0.0,0.0,0.0,0.0,0.001312,0.001312,0.0,0.005249,0.0,0.0,0.001312,0.0,0.0,0.0,0.009186,0.001312,0.006562,0.034121,0.009186,0.0,0.001312,0.001312,0.028871,0.0,0.0,0.0,0.001312,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005249,0.0,0.0,0.0,0.0,0.001312,0.003937,0.001312,0.011811,0.0,0.001312,0.001312,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.009186,0.001312,0.0,0.001312,0.002625,0.0,0.001312,0.0,0.003937,0.0,0.0,0.0,0.006562,0.001312,0.013123,0.0,0.0,0.0,0.0,0.0,0.0,0.001312,0.009186,0.003937,0.001312,0.0,0.001312,0.0,0.005249,0.003937,0.002625,0.001312,0.001312,0.001312,0.003937,0.0,0.001312,0.0,0.001312,0.0,0.001312,0.0,0.0,0.001312,0.0,0.001312,0.0,0.0,0.0,0.001312,0.0,0.0,0.0,0.0,0.001312,0.0,0.001312,0.0,0.0,0.013123,0.001312,0.010499,0.009186,0.001312,0.0,0.0,0.0,0.0,0.001312,0.001312,0.001312,0.0,0.001312,0.011811,0.0,0.0,0.0,0.0,0.009186,0.002625,0.0,0.0,0.02231,0.0,0.024934,0.003937,0.001312,0.0,0.002625,0.0,0.002625,0.001312,0.0,0.001312,0.0,0.0,0.0,0.001312,0.0,0.001312,0.002625,0.0,0.003937,0.001312,0.0,0.001312,0.0,0.0,0.0,0.002625,0.0,0.0,0.002625,0.0,0.001312,0.0,0.0,0.0,0.0,0.001312,0.001312,0.001312,0.001312,0.002625,0.001312,0.0,0.001312,0.0,0.0,0.0,0.0,0.001312,0.001312,0.0,0.0,0.001312,0.0,0.0,0.001312,0.001312,0.0,0.0,0.0,0.0,0.0,0.026247,0.0,0.0,0.0,0.001312,0.0,0.0,0.0,0.0,0.007874,0.001312,0.0,0.007874,0.0,0.0,0.002625,0.002625,0.007874,0.0,0.0,0.0,0.183727,0.002625,0.0,0.0,0.001312,0.0,0.001312,0.0,0.001312,0.0,0.0,0.013123,0.001312,0.0,0.0,0.0,0.0,0.0,0.001312,0.0,0.002625,0.001312,0.0,0.001312,0.0,0.002625,0.0,0.0,0.0,0.0,0.002625,0.0,0.001312,0.0,0.0,0.001312,0.002625,0.001312,0.001312,0.0,0.005249,0.0,0.001312,0.0,0.0,0.0,0.001312,0.0,0.001312,0.0,0.0,0.001312,0.0,0.011811,0.0,0.044619,0.0,0.0,0.0,0.001312,0.002625,0.0,0.003937,0.005249,0.001312,0.001312,0.0,0.0,0.001312,0.001312,0.001312,0.0,0.001312,0.009186,0.0,0.0,0.0,0.0,0.001312,0.0,0.005249,0.007874,0.0,0.0,0.001312,0.0,0.0,0.0,0.0,0.001312,0.001312,0.0,0.0,0.026247,0.0,0.001312,0.002625,0.001312,0.001312,0.0,0.0,0.0,0.002625,0.0,0.064304,0.002625,0.0,0.005249,0.041995,0.0,0.011811,0.001312,0.006562,0.006562,0.001312
4,Ochota,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.006472,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.009709,0.0,0.006472,0.0,0.003236,0.0,0.0,0.0,0.0,0.003236,0.003236,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.003236,0.029126,0.006472,0.0,0.0,0.003236,0.032362,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.006472,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.006472,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.003236,0.0,0.003236,0.0,0.006472,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.006472,0.0,0.0,0.003236,0.0,0.006472,0.003236,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.003236,0.003236,0.0,0.003236,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.003236,0.0,0.009709,0.0,0.006472,0.016181,0.003236,0.0,0.0,0.0,0.003236,0.0,0.0,0.003236,0.0,0.003236,0.016181,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.032362,0.0,0.019417,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.003236,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.006472,0.0,0.003236,0.0,0.0,0.0,0.012945,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032362,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.02589,0.0,0.0,0.0,0.006472,0.0,0.009709,0.003236,0.0,0.161812,0.016181,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012945,0.0,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.003236,0.0,0.0,0.003236,0.0,0.003236,0.003236,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.009709,0.0,0.006472,0.0,0.006472,0.0,0.0,0.0,0.003236,0.0,0.003236,0.0,0.055016,0.0,0.0,0.0,0.0,0.009709,0.003236,0.0,0.006472,0.003236,0.0,0.0,0.0,0.003236,0.0,0.0,0.003236,0.0,0.009709,0.0,0.0,0.003236,0.003236,0.0,0.0,0.003236,0.009709,0.0,0.0,0.003236,0.0,0.0,0.0,0.0,0.0,0.003236,0.0,0.0,0.022654,0.003236,0.003236,0.003236,0.003236,0.0,0.0,0.0,0.0,0.003236,0.0,0.064725,0.006472,0.0,0.003236,0.038835,0.0,0.009709,0.0,0.012945,0.006472,0.0
5,Praga Południe,0.0,0.0,0.001976,0.0,0.0,0.001976,0.0,0.0,0.0,0.0,0.001976,0.0,0.0,0.005929,0.001976,0.0,0.001976,0.0,0.0,0.0,0.0,0.007905,0.0,0.003953,0.0,0.0,0.0,0.005929,0.0,0.0,0.001976,0.001976,0.001976,0.001976,0.0,0.0,0.003953,0.001976,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001976,0.001976,0.0,0.005929,0.047431,0.005929,0.001976,0.0,0.0,0.023715,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003953,0.001976,0.001976,0.0,0.001976,0.0,0.001976,0.001976,0.003953,0.011858,0.0,0.0,0.001976,0.0,0.001976,0.0,0.001976,0.0,0.001976,0.0,0.001976,0.0,0.001976,0.0,0.001976,0.0,0.0,0.001976,0.001976,0.0,0.0,0.0,0.007905,0.0,0.007905,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.003953,0.005929,0.001976,0.001976,0.001976,0.0,0.005929,0.005929,0.0,0.0,0.0,0.0,0.007905,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001976,0.001976,0.0,0.003953,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.005929,0.0,0.011858,0.0,0.013834,0.009881,0.001976,0.001976,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.001976,0.001976,0.0,0.0,0.001976,0.0,0.003953,0.005929,0.0,0.0,0.027668,0.0,0.023715,0.001976,0.0,0.001976,0.0,0.0,0.003953,0.0,0.0,0.003953,0.0,0.0,0.0,0.005929,0.0,0.003953,0.001976,0.0,0.001976,0.0,0.0,0.001976,0.0,0.0,0.007905,0.003953,0.0,0.0,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001976,0.001976,0.0,0.001976,0.0,0.0,0.005929,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011858,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.003953,0.0,0.001976,0.009881,0.0,0.001976,0.003953,0.005929,0.005929,0.0,0.0,0.0,0.167984,0.001976,0.0,0.0,0.001976,0.0,0.001976,0.0,0.001976,0.0,0.0,0.019763,0.0,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.001976,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001976,0.0,0.0,0.0,0.0,0.001976,0.0,0.005929,0.0,0.0,0.0,0.001976,0.0,0.001976,0.0,0.0,0.0,0.0,0.001976,0.0,0.0,0.001976,0.0,0.005929,0.0,0.071146,0.0,0.0,0.001976,0.0,0.001976,0.0,0.003953,0.005929,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005929,0.009881,0.0,0.0,0.001976,0.0,0.001976,0.0,0.003953,0.001976,0.001976,0.0,0.0,0.0,0.0,0.001976,0.0,0.0,0.001976,0.001976,0.001976,0.025692,0.001976,0.0,0.007905,0.0,0.0,0.0,0.0,0.0,0.003953,0.0,0.073123,0.0,0.0,0.009881,0.045455,0.0,0.005929,0.003953,0.011858,0.001976,0.0
6,Praga Północ,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024,0.0,0.004,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.012,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.04,0.012,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.012,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.012,0.0,0.004,0.0,0.004,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.004,0.0,0.004,0.004,0.0,0.0,0.0,0.0,0.004,0.008,0.0,0.0,0.008,0.02,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.02,0.0,0.008,0.0,0.0,0.0,0.004,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.004,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.004,0.0,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.008,0.0,0.0,0.004,0.0,0.0,0.0,0.008,0.008,0.004,0.0,0.0,0.136,0.004,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016,0.0,0.012,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.004,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.008,0.0,0.132,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.012,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.004,0.02,0.004,0.0,0.0,0.024,0.0,0.0,0.0,0.004,0.0,0.0,0.0,0.0,0.004,0.0,0.056,0.0,0.0,0.012,0.032,0.0,0.012,0.004,0.016,0.008,0.0
7,Rembertów,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.078431,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.313725,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.098039,0.0,0.0,0.0,0.098039,0.019608,0.0,0.0,0.0,0.0,0.0
8,Targówek,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.005495,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.005495,0.005495,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.010989,0.0,0.005495,0.0,0.0,0.005495,0.0,0.005495,0.016484,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.010989,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.005495,0.021978,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016484,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.005495,0.0,0.0,0.016484,0.0,0.005495,0.0,0.016484,0.0,0.005495,0.0,0.0,0.236264,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.005495,0.0,0.005495,0.032967,0.0,0.0,0.0,0.005495,0.0,0.0,0.0,0.0,0.005495,0.0,0.120879,0.0,0.0,0.005495,0.087912,0.0,0.010989,0.0,0.010989,0.005495,0.0
9,Ursus,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023438,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054688,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.015625,0.0,0.023438,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.007812,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.007812,0.007812,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.03125,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023438,0.0,0.0,0.007812,0.007812,0.0,0.0,0.0,0.0,0.257812,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.007812,0.0,0.0,0.0,0.039062,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.0,0.007812,0.0,0.0,0.0,0.007812,0.0,0.0,0.015625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.007812,0.0,0.085938,0.0,0.0,0.007812,0.0625,0.0,0.0,0.0,0.007812,0.0,0.0


Confirming the new size:

In [33]:
warsaw_grouped.shape

(18, 372)

Printing each neighborhood along with the top 5 most common venues:

In [34]:
num_top_venues = 5

for hood in warsaw_grouped['District']:
    print("----"+hood+"----")
    temp = warsaw_grouped[warsaw_grouped['District'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bemowo----
                                       venue  freq
0  Przedszkole, szkoła podstawowa, gimnazjum  0.27
1                                przedszkole  0.10
2                                Bus Station  0.06
3                          szkoła podstawowa  0.06
4                                 Playground  0.04


----Białołęka----
                                       venue  freq
0  Przedszkole, szkoła podstawowa, gimnazjum  0.41
1                                Bus Station  0.07
2                          szkoła podstawowa  0.05
3                                przedszkole  0.03
4                                       Park  0.02


----Bielany----
                                       venue  freq
0  Przedszkole, szkoła podstawowa, gimnazjum  0.22
1                                przedszkole  0.10
2                          szkoła podstawowa  0.06
3    Szkoła ponadgimnazjalna/ponadpodstawowa  0.06
4                                Bus Station  0.05


----Mokotów----
           

A function to sort the venues in descending order:

In [35]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
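
As a quick illustration of what this function returns, the sketch below applies it to a single made-up row. The district name and the three frequencies are hypothetical values chosen for the example, not taken from the Warsaw data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the district name), then rank the remaining frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row: a district label followed by venue-category frequencies
example_row = pd.Series({'District': 'Example', 'Café': 0.10, 'Park': 0.30, 'Bus Station': 0.20})

top_two = list(return_most_common_venues(example_row, 2))
print(top_two)
```

The function returns category names only, ordered from the most to the least frequent; the frequencies themselves are discarded.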

Creating the new dataframe and displaying the top 10 venues for each neighborhood:

In [36]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# Create columns according to number of top venues:
columns = ['District']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['District'] = warsaw_grouped['District']

for ind in np.arange(warsaw_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(warsaw_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bemowo,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Bus Station,szkoła podstawowa,Playground,Pizza Place,Park,Café,Szkoła ponadgimnazjalna/ponadpodstawowa,Inna placówka systemu oświaty lub placówka spo...
1,Białołęka,"Przedszkole, szkoła podstawowa, gimnazjum",Bus Station,szkoła podstawowa,przedszkole,Pizza Place,Inna placówka systemu oświaty lub placówka spo...,Park,Italian Restaurant,Szkoła ponadgimnazjalna/ponadpodstawowa,Shopping Mall
2,Bielany,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,szkoła podstawowa,Szkoła ponadgimnazjalna/ponadpodstawowa,Bus Station,liceum ogólnokształcące,Park,Café,Inna placówka systemu oświaty lub placówka spo...,Gym / Fitness Center
3,Mokotów,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,szkoła podstawowa,Bus Station,Café,Park,liceum ogólnokształcące,Italian Restaurant,Inna placówka systemu oświaty lub placówka spo...
4,Ochota,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,szkoła podstawowa,Park,Inna placówka systemu oświaty lub placówka spo...,Café,Bus Station,Pizza Place,liceum ogólnokształcące
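
The `try`/`except` loop above produces correct labels up to 20, but it would yield labels such as "21th" for larger values of `num_top_venues`. A minimal sketch of a more general helper, assuming English ordinal rules; the function name `ordinal` is hypothetical and not part of the notebook:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # Numbers whose last two digits are 10-20 (including 11, 12, 13) always take 'th'
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['District'] + ['{} Most Common Venue'.format(ordinal(i))
                          for i in range(1, num_top_venues + 1)]
print(columns[:4])
```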


Running k-means with 3 clusters, chosen so that every cluster contains at least 2 districts:

In [37]:
# Set the number of clusters so that every cluster contains at least two districts:
kclusters = 3

warsaw_grouped_clustering = warsaw_grouped.drop('District', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(warsaw_grouped_clustering)

# check cluster labels generated for each row in the dataframe
print(kmeans.labels_)
print(len(kmeans.labels_ ))

[1 2 1 0 0 0 0 2 1 1 1 1 2 1 0 1 0 0]
18
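
The requirement that every cluster holds at least two districts can be checked directly from the labels printed above. A small sketch using only the standard library; the list is copied from the output, and the variable names are chosen for this illustration:

```python
from collections import Counter

# Cluster labels copied from the kmeans.labels_ output above
labels = [1, 2, 1, 0, 0, 0, 0, 2, 1, 1, 1, 1, 2, 1, 0, 1, 0, 0]

cluster_sizes = Counter(labels)
for cluster in sorted(cluster_sizes):
    print('Cluster', cluster, 'contains', cluster_sizes[cluster], 'districts')

smallest = min(cluster_sizes.values())
print('Smallest cluster size:', smallest)
```

In the notebook itself, the same count could presumably be taken from `kmeans.labels_` instead of a hard-coded list.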


Creating a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood:

In [38]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

warsaw_merged = warsaw_venues

# join warsaw_venues with the sorted top venues to add the cluster label and top 10 venues for each district
warsaw_merged = warsaw_merged.join(neighborhoods_venues_sorted.set_index('District'), on='District')

warsaw_merged.head(2)

Unnamed: 0,District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category,Venue Latitude,Venue Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Śródmieście,52.236558,21.01659,Bednarska Szkoła Realna,Szkoła ponadgimnazjalna/ponadpodstawowa,52.2177,21.0397,0,Szkoła ponadgimnazjalna/ponadpodstawowa,"Przedszkole, szkoła podstawowa, gimnazjum",Café,przedszkole,Restaurant,Inna placówka systemu oświaty lub placówka spo...,theater,Italian Restaurant,Cocktail Bar,Plaza
1,Praga Południe,52.244749,21.074174,Pedagogiczna Biblioteka Wojewódzka w Warszawie,Inna placówka systemu oświaty lub placówka spo...,52.2472,21.0637,0,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,Bus Station,szkoła podstawowa,Inna placówka systemu oświaty lub placówka spo...,liceum ogólnokształcące,Italian Restaurant,Café,Restaurant


In [39]:
warsaw_merged.dropna(subset=["Cluster Labels"], axis=0, inplace=True) # dropping neighborhoods with no venues provided
warsaw_merged.reset_index(drop=True, inplace=True) # resetting the index for better visuals
warsaw_merged.head(2) # displaying the final dataset

Unnamed: 0,District,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category,Venue Latitude,Venue Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Śródmieście,52.236558,21.01659,Bednarska Szkoła Realna,Szkoła ponadgimnazjalna/ponadpodstawowa,52.2177,21.0397,0,Szkoła ponadgimnazjalna/ponadpodstawowa,"Przedszkole, szkoła podstawowa, gimnazjum",Café,przedszkole,Restaurant,Inna placówka systemu oświaty lub placówka spo...,theater,Italian Restaurant,Cocktail Bar,Plaza
1,Praga Południe,52.244749,21.074174,Pedagogiczna Biblioteka Wojewódzka w Warszawie,Inna placówka systemu oświaty lub placówka spo...,52.2472,21.0637,0,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,Bus Station,szkoła podstawowa,Inna placówka systemu oświaty lub placówka spo...,liceum ogólnokształcące,Italian Restaurant,Café,Restaurant


In [40]:
# Build one row per district with its cluster label and coordinates
district_columns = ['District', 'Cluster Labels', 'Neighborhood Latitude', 'Neighborhood Longitude']
warsaw_merged_grouped = warsaw_merged[district_columns].drop_duplicates().sort_values(district_columns).reset_index(drop=True)
warsaw_merged_grouped

Unnamed: 0,District,Cluster Labels,Neighborhood Latitude,Neighborhood Longitude
0,Bemowo,1,52.239606,20.899061
1,Białołęka,2,52.328444,20.998763
2,Bielany,1,52.296865,20.92907
3,Mokotów,0,52.189544,21.047263
4,Ochota,0,52.21405,20.96109
5,Praga Południe,0,52.244749,21.074174
6,Praga Północ,0,52.265506,21.034593
7,Rembertów,2,52.259174,21.154148
8,Targówek,1,52.283011,21.047594
9,Ursus,1,52.195898,20.876551


# Results:

The analysis has shown that the best choice is 3 clusters, as it is the only one that places at least two districts in every cluster.

This grouping is also grounded in reality: naturally, the inner city districts have better infrastructure, mainly due to their higher population density and longer history. That being said, the best fit of 3 clusters is unexpected and is mainly attributed to the data distribution.
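
The selection rule used here, keeping a number of clusters only if every cluster contains at least two items, could be sketched as a scan over candidate values of k. The data below is synthetic (three well-separated groups of points), and `min_cluster_size` is a hypothetical helper name; this is an illustration of the idea, not the procedure actually used to pick k = 3 in this notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

def min_cluster_size(data, k):
    """Fit k-means with k clusters and return the size of the smallest cluster."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data).labels_
    return int(np.bincount(labels, minlength=k).min())

# Synthetic stand-in for the district feature matrix: 18 points in 3 tight groups
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
data = np.vstack([c + 0.1 * rng.randn(6, 2) for c in centers])

smallest_by_k = {k: min_cluster_size(data, k) for k in range(2, 6)}
for k, smallest in smallest_by_k.items():
    print('k =', k, '-> smallest cluster has', smallest, 'points')
```

On the real data, `data` would be replaced by `warsaw_grouped_clustering`, and the candidate values of k would be filtered to those whose smallest cluster has at least two districts.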

Visualizing the resulting clusters:

In [41]:
import math

# Create map of Warsaw:
map_clusters = folium.Map(location=[52.2337172, 21.07141112883227], zoom_start=10)

# Set color scheme for the clusters:
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to the map:
markers_colors = []
for lat, lon, poi, cluster in zip(warsaw_merged_grouped['Neighborhood Latitude'], warsaw_merged_grouped['Neighborhood Longitude'], 
                                  warsaw_merged_grouped['District'], warsaw_merged_grouped['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)

# This is not strictly necessary, as folium renders the map below without a problem (after all, it is just 18 points).
# However, saving the map also allows the output to be viewed directly as an .html file.
def embed_map(m):
    from IPython.display import IFrame
    m.save('WarsawDistrictClusters.html')
    return IFrame('WarsawDistrictClusters.html', width='100%', height='750px')

embed_map(map_clusters)

### Examining the resultant clusters:

Cluster 1 (index=0):

In [42]:
warsaw_merged.loc[warsaw_merged['Cluster Labels'] == 0, warsaw_merged.columns[[1] + list(range(5, warsaw_merged.shape[1]))]].head(5)

Unnamed: 0,Neighborhood Latitude,Venue Latitude,Venue Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,52.236558,52.2177,21.0397,0,Szkoła ponadgimnazjalna/ponadpodstawowa,"Przedszkole, szkoła podstawowa, gimnazjum",Café,przedszkole,Restaurant,Inna placówka systemu oświaty lub placówka spo...,theater,Italian Restaurant,Cocktail Bar,Plaza
1,52.244749,52.2472,21.0637,0,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,Bus Station,szkoła podstawowa,Inna placówka systemu oświaty lub placówka spo...,liceum ogólnokształcące,Italian Restaurant,Café,Restaurant
2,52.21405,52.2137,20.9676,0,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,szkoła podstawowa,Park,Inna placówka systemu oświaty lub placówka spo...,Café,Bus Station,Pizza Place,liceum ogólnokształcące
3,52.231552,52.246,20.9492,0,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Szkoła ponadgimnazjalna/ponadpodstawowa,szkoła podstawowa,liceum ogólnokształcące,Café,Inna placówka systemu oświaty lub placówka spo...,Bus Station,technikum,Italian Restaurant
5,52.236558,52.2495,21.0077,0,Szkoła ponadgimnazjalna/ponadpodstawowa,"Przedszkole, szkoła podstawowa, gimnazjum",Café,przedszkole,Restaurant,Inna placówka systemu oświaty lub placówka spo...,theater,Italian Restaurant,Cocktail Bar,Plaza


Cluster 2 (index=1):

In [43]:
warsaw_merged.loc[warsaw_merged['Cluster Labels'] == 1, warsaw_merged.columns[[1] + list(range(5, warsaw_merged.shape[1]))]].head(5)

Unnamed: 0,Neighborhood Latitude,Venue Latitude,Venue Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,52.195898,52.1971,20.8998,1,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,szkoła podstawowa,Bus Station,Szkoła ponadgimnazjalna/ponadpodstawowa,Inna placówka systemu oświaty lub placówka spo...,Italian Restaurant,Park,Boutique,Coffee Shop
9,52.239606,52.2444,20.9396,1,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Bus Station,szkoła podstawowa,Playground,Pizza Place,Park,Café,Szkoła ponadgimnazjalna/ponadpodstawowa,Inna placówka systemu oświaty lub placówka spo...
16,52.283011,52.2677,21.0691,1,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,szkoła podstawowa,Bus Station,Szkoła ponadgimnazjalna/ponadpodstawowa,liceum ogólnokształcące,Inna placówka systemu oświaty lub placówka spo...,Food & Drink Shop,Plaza,Liquor Store
19,52.296865,52.2708,20.9647,1,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,szkoła podstawowa,Szkoła ponadgimnazjalna/ponadpodstawowa,Bus Station,liceum ogólnokształcące,Park,Café,Inna placówka systemu oświaty lub placówka spo...,Gym / Fitness Center
20,52.239606,52.2444,20.9396,1,"Przedszkole, szkoła podstawowa, gimnazjum",przedszkole,Bus Station,szkoła podstawowa,Playground,Pizza Place,Park,Café,Szkoła ponadgimnazjalna/ponadpodstawowa,Inna placówka systemu oświaty lub placówka spo...


Cluster 3 (index=2):

In [44]:
warsaw_merged.loc[warsaw_merged['Cluster Labels'] == 2, warsaw_merged.columns[[1] + list(range(5, warsaw_merged.shape[1]))]].head(5)

Unnamed: 0,Neighborhood Latitude,Venue Latitude,Venue Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,52.225167,52.2229,21.2431,2,"Przedszkole, szkoła podstawowa, gimnazjum",Szkoła ponadgimnazjalna/ponadpodstawowa,szkoła podstawowa,Bus Station,przedszkole,Inna placówka systemu oświaty lub placówka spo...,Pizza Place,Athletics & Sports,Diner,Café
38,52.328444,52.294,21.0203,2,"Przedszkole, szkoła podstawowa, gimnazjum",Bus Station,szkoła podstawowa,przedszkole,Pizza Place,Inna placówka systemu oświaty lub placówka spo...,Park,Italian Restaurant,Szkoła ponadgimnazjalna/ponadpodstawowa,Shopping Mall
42,52.328444,52.2956,21.0166,2,"Przedszkole, szkoła podstawowa, gimnazjum",Bus Station,szkoła podstawowa,przedszkole,Pizza Place,Inna placówka systemu oświaty lub placówka spo...,Park,Italian Restaurant,Szkoła ponadgimnazjalna/ponadpodstawowa,Shopping Mall
50,52.225167,52.219,21.2172,2,"Przedszkole, szkoła podstawowa, gimnazjum",Szkoła ponadgimnazjalna/ponadpodstawowa,szkoła podstawowa,Bus Station,przedszkole,Inna placówka systemu oświaty lub placówka spo...,Pizza Place,Athletics & Sports,Diner,Café
54,52.328444,52.3138,20.9799,2,"Przedszkole, szkoła podstawowa, gimnazjum",Bus Station,szkoła podstawowa,przedszkole,Pizza Place,Inna placówka systemu oświaty lub placówka spo...,Park,Italian Restaurant,Szkoła ponadgimnazjalna/ponadpodstawowa,Shopping Mall


# Discussion:

The districts were clustered into 3 groups based on the acquired data. The notebook draws on a large amount of infrastructure data, but more could be obtained in the future. This includes information on shops, malls, hospitals, cinemas, parks and grocery stores, as well as other sources that could make the map more comprehensive.

Additional information, such as traffic data, real estate prices and a heatmap of available commercial space, could be used to build a good estimate of Warsaw's real estate prices. This in turn could help identify real estate development opportunities and allow a user to spot outliers in apartment prices, eventually turning this notebook into a valuable business opportunity.

# Conclusion:

It seems that the district clustering approach has produced a sensible division of Warsaw's districts that is grounded in reality. Inner districts are indeed the most expensive ones, followed by the outer districts on the left bank of the Vistula (Warsaw's main river) and finally the outer districts on its east bank.

The inner city was assigned to cluster 1, which combines the districts of Mokotów, Ochota, Praga Południe, Praga Północ, Wola, Śródmieście and Żoliborz.
The outer regions were assigned to cluster 2, which consists of the districts of Bemowo, Bielany, Targówek, Ursus, Ursynów, Wawer, Wilanów and Włochy.
Finally, the remaining districts were assigned to cluster 3: Białołęka, Rembertów and Wesoła.
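
These lists can be reproduced from the k-means labels printed earlier. The sketch below assumes that the labels follow the alphabetical row order of `warsaw_grouped`, an ordering inferred from the grouped table rather than stated explicitly in the notebook; the district names are taken from the lists above.

```python
from collections import defaultdict

# District names, assumed to be in the row order of warsaw_grouped
districts = ['Bemowo', 'Białołęka', 'Bielany', 'Mokotów', 'Ochota', 'Praga Południe',
             'Praga Północ', 'Rembertów', 'Targówek', 'Ursus', 'Ursynów', 'Wawer',
             'Wesoła', 'Wilanów', 'Wola', 'Włochy', 'Śródmieście', 'Żoliborz']

# Cluster labels copied from the kmeans.labels_ output
labels = [1, 2, 1, 0, 0, 0, 0, 2, 1, 1, 1, 1, 2, 1, 0, 1, 0, 0]

members = defaultdict(list)
for district, label in zip(districts, labels):
    members[label].append(district)

for label in sorted(members):
    # Report clusters as 1-based numbers, matching the text
    print('Cluster {}: {}'.format(label + 1, ', '.join(members[label])))
```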

### Technical Note:

If you would like to see the folium map-based visualization, please use https://nbviewer.jupyter.org/ with this notebook's URL. GitHub cannot render the .html maps in viewer mode.