<h1>Opening a retail-entertainment center in Ukraine</h1>

<h3>1. Import libraries</h3>

In [1]:
!conda install -c conda-forge geopy folium=0.5.0 geocoder --yes 

print('Libraries installed')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries installed


In [2]:
import requests # library to handle requests
import json

import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates

# libraries for displaying images
from IPython.display import Image, HTML

# function for transforming JSON into a pandas DataFrame
from pandas import json_normalize

import folium # plotting library

# import k-means from clustering stage
from sklearn.cluster import KMeans

# colors
import matplotlib.cm as cm
import matplotlib.colors as colors

print('Libraries imported.')

Libraries imported.


<h3>2. Reading data</h3>

I prepared the district data beforehand. For each district we have the city, the district name, and the district's latitude and longitude.

In [3]:
districts = pd.read_csv('Ukraine_cities_n_districts.csv', sep=",")
districts.head(5)

Unnamed: 0,City,District,Latitude,Longitude
0,Kyiv,Holossijiw,50.340463,30.552943
1,Kyiv,Sviatoshynskyi,50.465895,30.340899
2,Kyiv,Solomianskyi,50.428417,30.45239
3,Kyiv,Obolonskyi,50.501431,30.497548
4,Kyiv,Podilskyi,50.504455,30.423153


In [4]:
districts.shape

(53, 4)

In [5]:
districts.dtypes

City          object
District      object
Latitude     float64
Longitude    float64
dtype: object

<h3>3. Exploring dataset. Creating a map of Ukraine with districts superimposed on top</h3>

<h4>Use the geopy library to get the latitude and longitude values of Ukraine</h4>

In [6]:
address = 'Ukraine'

geolocator = Nominatim(user_agent="ukr")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Ukraine are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Ukraine are 49.4871968, 31.2718321.


Zoom in on the map for a better look at the data.

In [7]:
# create map of Ukraine using latitude and longitude values
map_ukr = folium.Map(location=[latitude, longitude], zoom_start=6)

# add markers to map
for lat, lng, district in zip(districts['Latitude'], districts['Longitude'], districts['District']):
    label = '{}'.format(district)  # label each marker with its own district name
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_ukr)  
    
map_ukr

In [8]:
map_ukr.save('Ukraine_cities_n_districts.html')

<h3>4. Exploring dataset with Foursquare API</h3>

<h4>Define Foursquare Credentials and Version</h4>

In [9]:
# Load credentials
with open('credentials.json') as json_file:
    credentials = json.load(json_file)

CLIENT_ID = credentials['CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = credentials['CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
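
The cell above assumes that `credentials.json` exists and holds both keys. A minimal sketch of a more defensive loader is shown below. The function name `load_credentials` and the error message are illustrative assumptions and are not part of the original notebook.

```python
import json

# Keys that the notebook reads from the credentials file.
REQUIRED_KEYS = ("CLIENT_ID", "CLIENT_SECRET")


def load_credentials(path="credentials.json"):
    """Read the credentials file and fail early if a key is missing or empty.

    Hypothetical helper: the notebook itself reads the file inline.
    """
    with open(path, encoding="utf-8") as json_file:
        credentials = json.load(json_file)

    # Collect every required key that is absent or has an empty value.
    missing = [key for key in REQUIRED_KEYS if not credentials.get(key)]
    if missing:
        raise KeyError("Missing credential(s): " + ", ".join(missing))
    return credentials
```

Failing at load time gives a clearer error than a rejected API request later in the notebook.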

<h4>Let's create a function to get venues for all the districts</h4>

In [10]:
def getNearbyVenues(cities, names, latitudes, longitudes, radius, LIMIT):
    
    venues_list=[]
    for city, name, lat, lng in zip(cities, names, latitudes, longitudes):
        print('{}, {}'.format(city,name))
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            city,
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
        'City',
        'District', 
        'District Latitude',
        'District Longitude',
        'Venue',
        'Venue Latitude',
        'Venue Longitude',
        'Venue Category'
    ]
    
    return(nearby_venues)
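
`getNearbyVenues` indexes straight into the response, so a district with no results or a venue without a category would raise an exception. The sketch below shows one way the parsing step could be made tolerant of that. `venue_rows` is a hypothetical helper, the `"Unknown"` fallback is an assumption, and the sample payload only imitates the fields the function above reads.

```python
def venue_rows(city, district, lat, lng, payload):
    """Flatten one 'explore' payload into row tuples.

    Hypothetical helper; the field layout mirrors what getNearbyVenues reads.
    Returns an empty list when the payload has no groups and uses
    'Unknown' when a venue has no category.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []

    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else "Unknown"
        rows.append((
            city, district, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-written payload in the same shape, using one venue from the results.
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {
            "name": "Ingul-kart",
            "location": {"lat": 50.340922, "lng": 30.552466},
            "categories": [{"name": "Go Kart Track"}],
        }},
        {"venue": {
            "name": "No category venue",  # made-up entry for illustration
            "location": {"lat": 50.34, "lng": 30.55},
            "categories": [],
        }},
    ]}]}
}

rows = venue_rows("Kyiv", "Holossijiw", 50.340463, 30.552943, sample_payload)
```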

<h4>Retrieving up to 100 venues within 3000 meters of each district center (roughly the size of a district)</h4>

In [11]:
districts_venues = getNearbyVenues(
    cities=districts['City'],
    names=districts['District'],
    latitudes=districts['Latitude'],
    longitudes=districts['Longitude'],
    radius=3000,
    LIMIT=100
)

Kyiv, Holossijiw
Kyiv, Sviatoshynskyi
Kyiv, Solomianskyi
Kyiv, Obolonskyi
Kyiv, Podilskyi
Kyiv, Pecherskyi
Kyiv, Shevchenkivskyi
Kyiv, Darnytskyi
Kyiv, Dniprovskyi
Kyiv, Desnianskyi
Kharkiv, Shevchenkivskyi
Kharkiv, Kyivskyi
Kharkiv, Slobidskyi
Kharkiv, Osnovianskyi
Kharkiv, Kholodnohirskyi
Kharkiv, Moskovskyi
Kharkiv, Novobavarskyi
Kharkiv, Industrialnyi
Kharkiv, Nemyshlyanskyi
Odessa, Suvorovsky
Odessa, Prymorsky
Odessa, Malynovsky
Odessa, Kyivsky
Dnipro, Amur-Nyzhnodniprovskyi
Dnipro, Shevchenkivskyi
Dnipro, Sobornyi
Dnipro, Industrialnyi
Dnipro, Tsentralnyi
Dnipro, Chechelivskyi
Dnipro, Novokodatskyi
Dnipro, Samarskyi
Zaporizhia, Oleksandrivskyi
Zaporizhia, Komunarskyi
Zaporizhia, Dneprovsky
Zaporizhia, Voznesenskyi
Zaporizhia, Khortytskyi
Zaporizhia, Shevchenkivskyi
Lviv, Halytskyi
Lviv, Zaliznychnyi
Lviv, Lychakivskyi
Lviv, Frankivskyi
Lviv, Shevchenkivskyi
Lviv, Sykhivskyi
Kryvyi Rih, Dolgintsevskiy
Kryvyi Rih, Inhuletskyi
Kryvyi Rih, Metalurhiynyy
Kryvyi Rih, Pokrovsky
Kryvyi R

<h4>Let's check the result</h4>

In [12]:
districts_venues.head(10)

Unnamed: 0,City,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kyiv,Holossijiw,50.340463,30.552943,Ingul-kart,50.340922,30.552466,Go Kart Track
1,Kyiv,Holossijiw,50.340463,30.552943,Wine House,50.338828,30.554026,Wine Shop
2,Kyiv,Holossijiw,50.340463,30.552943,Multiplex (Мультиплекс),50.341234,30.55081,Multiplex
3,Kyiv,Holossijiw,50.340463,30.552943,PROLESKI CLUB,50.343436,30.545772,Ski Trail
4,Kyiv,Holossijiw,50.340463,30.552943,Тактильний зоопарк Єнотія,50.342856,30.544204,Zoo
5,Kyiv,Holossijiw,50.340463,30.552943,Ramada Encore Kyiv,50.340487,30.551002,Hotel
6,Kyiv,Holossijiw,50.340463,30.552943,PLANETTOYS,50.343332,30.545728,Toy / Game Store
7,Kyiv,Holossijiw,50.340463,30.552943,Fiori Il Ristorante,50.34301,30.55223,Italian Restaurant
8,Kyiv,Holossijiw,50.340463,30.552943,ART фабрика,50.343226,30.54556,Arcade
9,Kyiv,Holossijiw,50.340463,30.552943,Cool Jumper,50.343252,30.544658,Arcade


In [13]:
districts_venues.shape

(3691, 8)

In [14]:
districts_venues.dtypes

City                   object
District               object
District Latitude     float64
District Longitude    float64
Venue                  object
Venue Latitude        float64
Venue Longitude       float64
Venue Category         object
dtype: object

In [15]:
# checking out for duplicates
districts_venues.duplicated(subset=["Venue","Venue Latitude","Venue Longitude"], keep='first').any()

True

In [16]:
# remove duplicates
districts_venues.drop_duplicates(subset=["Venue","Venue Latitude","Venue Longitude"],keep='first',inplace=True) 

In [17]:
districts_venues.shape

(3350, 8)
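
One likely source of the duplicates is that the search circles of neighbouring districts overlap, so the same venue is returned for more than one district. The sketch below checks this for two Kyiv district centers from the table, using the haversine formula and the 3000 m radius passed to `getNearbyVenues`. The helper name `haversine_m` and the Earth radius constant are assumptions made for the illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# District centers taken from the districts table.
obolonskyi = (50.501431, 30.497548)
podilskyi = (50.504455, 30.423153)

search_radius = 3000  # meters, as passed to getNearbyVenues
distance = haversine_m(*obolonskyi, *podilskyi)

# Two circles overlap when their centers are closer than the sum of the radii.
circles_overlap = distance < 2 * search_radius
```

The two centers are a little over 5 km apart, so their 3000 m circles share an area and any venue inside it is fetched twice. Note that `keep='first'` assigns such a venue to whichever district was queried first.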

<h4>Let's find out how many unique venue categories are present in the data</h4>

In [18]:
print('There are {} unique categories.'.format(len(districts_venues['Venue Category'].unique())))

There are 339 unique categories.


<h4>Let's list the names of the unique venue categories present in the data</h4>

In [19]:
# print out the list of categories
districts_venues['Venue Category'].unique()

array(['Go Kart Track', 'Wine Shop', 'Multiplex', 'Ski Trail', 'Zoo',
       'Hotel', 'Toy / Game Store', 'Italian Restaurant', 'Arcade',
       'Theme Park', 'Automotive Shop', 'Gas Station', 'Coffee Shop',
       'Soccer Field', 'Athletics & Sports', 'Restaurant', 'Café',
       'Shopping Mall', 'Dessert Shop', 'Racetrack',
       'Eastern European Restaurant', 'Supermarket', 'Burger Joint',
       'Historic Site', 'Roller Rink', 'Molecular Gastronomy Restaurant',
       'Fried Chicken Joint', 'Indie Movie Theater', 'Electronics Store',
       'Bowling Alley', 'Boutique', 'Gym / Fitness Center', 'Auto Garage',
       'Auto Workshop', 'Park', 'BBQ Joint', 'Forest', 'Market',
       'Sports Club', 'Lake', 'Harbor / Marina', 'Bus Stop', 'Bay',
       'Pier', 'Beach', 'History Museum', 'Candy Store', 'Gym',
       'Adult Boutique', 'Caucasian Restaurant', 'Paintball Field',
       'Nightclub', 'Beer Store', 'Fast Food Restaurant', 'Pizza Place',
       'Gourmet Shop', 'Platform', 'Rental

<h4>Let's find out whether a retail-entertainment center category is present in Venue Category</h4>

In [20]:
districts_venues[districts_venues['Venue Category'].str.match('Retail-entertainment center')]

Unnamed: 0,City,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


None. It turns out that shopping centers are not subdivided by type: all of them are labeled as Shopping Mall.

Let's check whether Shopping Mall is present.

In [21]:
"Shopping Mall" in districts_venues['Venue Category'].unique()

True
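
The `str.match` calls in this section treat their argument as a regular expression and only require a match at the start of the string, in the same way as `re.match`. The sketch below contrasts that with an exact comparison. The second and third category names are made up for the illustration and do not come from the data.

```python
import re

# Only the first name is a real category from the data; the other two are
# hypothetical and exist to show the difference between the two tests.
categories = ["Shopping Mall", "Shopping Mall Outlet", "Mini Shopping Mall"]

# Prefix-style regex match, the behaviour str.match is built on.
prefix_hits = [c for c in categories if re.match("Shopping Mall", c)]

# Exact comparison, equivalent to df['Venue Category'] == 'Shopping Mall'.
exact_hits = [c for c in categories if c == "Shopping Mall"]
```

With the categories in this dataset both approaches appear to select the same rows, but the exact comparison states the intent more directly.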

Let's print all the Shopping centers (Shopping Malls).

In [22]:
retail_entertainment_centers = districts_venues[districts_venues['Venue Category'].str.match('Shopping Mall')]
retail_entertainment_centers

Unnamed: 0,City,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
20,Kyiv,Holossijiw,50.340463,30.552943,Art Mall (ТРЦ «Art Mall»),50.343023,30.544996,Shopping Mall
25,Kyiv,Holossijiw,50.340463,30.552943,Домосфера / Domosfera,50.343931,30.552467,Shopping Mall
36,Kyiv,Holossijiw,50.340463,30.552943,ТРЦ «Атмосфера»,50.341034,30.550947,Shopping Mall
263,Kyiv,Obolonskyi,50.501431,30.497548,"Dream Town, 1 лінія",50.506977,30.498576,Shopping Mall
321,Kyiv,Podilskyi,50.504455,30.423153,ТЦ «Орнамент»,50.504502,30.437537,Shopping Mall
798,Kyiv,Dniprovskyi,50.454127,30.602018,ТРК «Проспект» (ТРЦ «Проспект»),50.455011,30.634672,Shopping Mall
847,Kyiv,Desnianskyi,50.516639,30.607183,ТРЦ «РайON»,50.516511,30.602056,Shopping Mall
891,Kyiv,Desnianskyi,50.516639,30.607183,"ТЦ ""КОРАЛЛ""",50.533629,30.60329,Shopping Mall
968,Kharkiv,Shevchenkivskyi,50.043332,36.221213,РОСТ Олексіївський / ROST Oleksiivskyi,50.060865,36.202359,Shopping Mall
1044,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Дафi» / Dafi Mall (ТРЦ «Дафi»),50.026825,36.330973,Shopping Mall


Let's drop regular shopping centers, i.e. venues whose names contain "ТЦ".

In [23]:
retail_entertainment_centers = retail_entertainment_centers[~retail_entertainment_centers['Venue'].str.contains("ТЦ", na=False)]
retail_entertainment_centers

Unnamed: 0,City,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
20,Kyiv,Holossijiw,50.340463,30.552943,Art Mall (ТРЦ «Art Mall»),50.343023,30.544996,Shopping Mall
25,Kyiv,Holossijiw,50.340463,30.552943,Домосфера / Domosfera,50.343931,30.552467,Shopping Mall
36,Kyiv,Holossijiw,50.340463,30.552943,ТРЦ «Атмосфера»,50.341034,30.550947,Shopping Mall
263,Kyiv,Obolonskyi,50.501431,30.497548,"Dream Town, 1 лінія",50.506977,30.498576,Shopping Mall
798,Kyiv,Dniprovskyi,50.454127,30.602018,ТРК «Проспект» (ТРЦ «Проспект»),50.455011,30.634672,Shopping Mall
847,Kyiv,Desnianskyi,50.516639,30.607183,ТРЦ «РайON»,50.516511,30.602056,Shopping Mall
968,Kharkiv,Shevchenkivskyi,50.043332,36.221213,РОСТ Олексіївський / ROST Oleksiivskyi,50.060865,36.202359,Shopping Mall
1044,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Дафi» / Dafi Mall (ТРЦ «Дафi»),50.026825,36.330973,Shopping Mall
1050,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Караван» / Karavan Mall (ТРЦ «Караван»),50.029041,36.328247,Shopping Mall
1122,Kharkiv,Slobidskyi,49.944131,36.282172,РОСТ Одеський,49.952084,36.260823,Shopping Mall
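
The filter in cell 23 works because "ТЦ" (shopping center) is not a substring of "ТРЦ" (retail-entertainment center): the letter "Р" sits between "Т" and "Ц". The sketch below replays that test on a few venue names from the table. `is_plain_shopping_center` is a hypothetical helper name.

```python
def is_plain_shopping_center(venue_name):
    """True if the name carries the Cyrillic abbreviation 'ТЦ'."""
    return "ТЦ" in venue_name


# Venue names copied from the Shopping Mall table.
names = [
    "ТЦ «Орнамент»",
    'ТЦ "КОРАЛЛ"',
    "ТРЦ «Атмосфера»",
    "ТРК «Проспект» (ТРЦ «Проспект»)",
    "Dream Town, 1 лінія",
]

# Same logic as ~str.contains("ТЦ"): keep everything without the abbreviation.
kept = [name for name in names if not is_plain_shopping_center(name)]
```

Names without any abbreviation, such as "Dream Town, 1 лінія", pass the filter regardless of what kind of venue they are, which is why a manual review is still needed afterwards.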


There is no way to automate the rest of the drop procedure, so I checked each remaining venue manually and dropped every one that is not a retail-entertainment center.

In [24]:
retail_entertainment_centers.reset_index(drop=True,inplace=True) 

In [25]:
retail_entertainment_centers

Unnamed: 0,City,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kyiv,Holossijiw,50.340463,30.552943,Art Mall (ТРЦ «Art Mall»),50.343023,30.544996,Shopping Mall
1,Kyiv,Holossijiw,50.340463,30.552943,Домосфера / Domosfera,50.343931,30.552467,Shopping Mall
2,Kyiv,Holossijiw,50.340463,30.552943,ТРЦ «Атмосфера»,50.341034,30.550947,Shopping Mall
3,Kyiv,Obolonskyi,50.501431,30.497548,"Dream Town, 1 лінія",50.506977,30.498576,Shopping Mall
4,Kyiv,Dniprovskyi,50.454127,30.602018,ТРК «Проспект» (ТРЦ «Проспект»),50.455011,30.634672,Shopping Mall
5,Kyiv,Desnianskyi,50.516639,30.607183,ТРЦ «РайON»,50.516511,30.602056,Shopping Mall
6,Kharkiv,Shevchenkivskyi,50.043332,36.221213,РОСТ Олексіївський / ROST Oleksiivskyi,50.060865,36.202359,Shopping Mall
7,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Дафi» / Dafi Mall (ТРЦ «Дафi»),50.026825,36.330973,Shopping Mall
8,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Караван» / Karavan Mall (ТРЦ «Караван»),50.029041,36.328247,Shopping Mall
9,Kharkiv,Slobidskyi,49.944131,36.282172,РОСТ Одеський,49.952084,36.260823,Shopping Mall


In [26]:
retail_entertainment_centers = retail_entertainment_centers.drop(index=[9,10,12,13,19,20,21,22,23,27,28,29]) 

In [27]:
retail_entertainment_centers.reset_index(drop=True,inplace=True) 

In [28]:
retail_entertainment_centers

Unnamed: 0,City,District,District Latitude,District Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kyiv,Holossijiw,50.340463,30.552943,Art Mall (ТРЦ «Art Mall»),50.343023,30.544996,Shopping Mall
1,Kyiv,Holossijiw,50.340463,30.552943,Домосфера / Domosfera,50.343931,30.552467,Shopping Mall
2,Kyiv,Holossijiw,50.340463,30.552943,ТРЦ «Атмосфера»,50.341034,30.550947,Shopping Mall
3,Kyiv,Obolonskyi,50.501431,30.497548,"Dream Town, 1 лінія",50.506977,30.498576,Shopping Mall
4,Kyiv,Dniprovskyi,50.454127,30.602018,ТРК «Проспект» (ТРЦ «Проспект»),50.455011,30.634672,Shopping Mall
5,Kyiv,Desnianskyi,50.516639,30.607183,ТРЦ «РайON»,50.516511,30.602056,Shopping Mall
6,Kharkiv,Shevchenkivskyi,50.043332,36.221213,РОСТ Олексіївський / ROST Oleksiivskyi,50.060865,36.202359,Shopping Mall
7,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Дафi» / Dafi Mall (ТРЦ «Дафi»),50.026825,36.330973,Shopping Mall
8,Kharkiv,Kyivskyi,50.042352,36.302577,ТРЦ «Караван» / Karavan Mall (ТРЦ «Караван»),50.029041,36.328247,Shopping Mall
9,Kharkiv,Moskovskyi,49.998253,36.314906,ТРЦ «Французький Бульвар»,49.990234,36.289936,Shopping Mall


In [29]:
retail_entertainment_centers.shape

(27, 8)

Let's count the number of retail-entertainment centers per district.

In [30]:
districts_2 = retail_entertainment_centers.groupby(['City','District']).count().reset_index()
districts_2 = districts_2.drop(['District Latitude', 'District Longitude', 'Venue Latitude', 'Venue Longitude', 'Venue Category'], axis=1) 
districts_2.rename(columns={"Venue": "Amount"}, inplace=True)
districts_2

Unnamed: 0,City,District,Amount
0,Dnipro,Amur-Nyzhnodniprovskyi,1
1,Dnipro,Chechelivskyi,1
2,Dnipro,Shevchenkivskyi,3
3,Kharkiv,Kyivskyi,2
4,Kharkiv,Moskovskyi,1
5,Kharkiv,Shevchenkivskyi,1
6,Kherson,Suvorovskiy,3
7,Kryvyi Rih,Pokrovsky,1
8,Kryvyi Rih,Saksahanskyi,1
9,Kryvyi Rih,Tsentral'no-Gorodskoy,1


Let's add districts without retail-entertainment centers.

In [31]:
tmp = districts[districts.columns[0:2]].copy()  # copy, so insert() does not modify a slice of districts
tmp.insert(2, 'Amount', 0)
tmp = pd.concat([districts_2,tmp])
tmp.drop_duplicates(subset=["City","District"],keep='first',inplace=True) 
tmp.sort_values(by=['City', 'District'],inplace=True)
tmp.reset_index(drop=True,inplace=True) 
districts_2 = tmp

In [32]:
districts_2

Unnamed: 0,City,District,Amount
0,Dnipro,Amur-Nyzhnodniprovskyi,1
1,Dnipro,Chechelivskyi,1
2,Dnipro,Industrialnyi,0
3,Dnipro,Novokodatskyi,0
4,Dnipro,Samarskyi,0
5,Dnipro,Shevchenkivskyi,3
6,Dnipro,Sobornyi,0
7,Dnipro,Tsentralnyi,0
8,Kharkiv,Industrialnyi,0
9,Kharkiv,Kholodnohirskyi,0


In [33]:
rows = []

for city, district, amount in zip(districts_2["City"], districts_2["District"], districts_2["Amount"]):
    # look up the coordinates of this district in the original table
    match = districts.query('City == "{}" & District == "{}"'.format(city, district))
    match = match.reset_index(drop=True)
    rows.append((
        city,
        district,
        match.Latitude[0],
        match.Longitude[0],
        amount
    ))

districts_2 = pd.DataFrame(rows, columns=[
    'City',
    'District',
    'District Latitude',
    'District Longitude',
    'Amount'
])
districts_2

Unnamed: 0,City,District,District Latitude,District Longitude,Amount
0,Dnipro,Amur-Nyzhnodniprovskyi,48.517912,34.998314,1
1,Dnipro,Chechelivskyi,48.435958,34.986224,1
2,Dnipro,Industrialnyi,48.522197,35.07921,0
3,Dnipro,Novokodatskyi,48.471688,34.886712,0
4,Dnipro,Samarskyi,48.483577,35.133972,0
5,Dnipro,Shevchenkivskyi,48.408799,35.012917,3
6,Dnipro,Sobornyi,48.429569,35.058036,0
7,Dnipro,Tsentralnyi,48.462277,35.02599,0
8,Kharkiv,Industrialnyi,49.933503,36.405567,0
9,Kharkiv,Kholodnohirskyi,50.006473,36.18064,0


In [34]:
districts_2.shape

(53, 5)
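
Cells 30 to 33 count centers per district, add the districts with zero centers, and attach the coordinates. A possible shorter route to the same table is a single left merge, sketched below on a three-district sample. The venue names are placeholders, and the resulting row order follows `districts` rather than being sorted by city and district.

```python
import pandas as pd

# Small sample in the same shape as the notebook's tables.
districts = pd.DataFrame({
    "City": ["Dnipro", "Dnipro", "Dnipro"],
    "District": ["Chechelivskyi", "Industrialnyi", "Shevchenkivskyi"],
    "Latitude": [48.435958, 48.522197, 48.408799],
    "Longitude": [34.986224, 35.07921, 35.012917],
})
retail_entertainment_centers = pd.DataFrame({
    "City": ["Dnipro"] * 4,
    "District": ["Chechelivskyi", "Shevchenkivskyi",
                 "Shevchenkivskyi", "Shevchenkivskyi"],
    "Venue": ["A", "B", "C", "D"],  # placeholder venue names
})

# Count centers per district.
counts = (
    retail_entertainment_centers
    .groupby(["City", "District"])
    .size()
    .reset_index(name="Amount")
)

# A left merge keeps every district; districts without centers get NaN -> 0.
districts_2 = districts.merge(counts, on=["City", "District"], how="left")
districts_2["Amount"] = districts_2["Amount"].fillna(0).astype(int)
districts_2 = districts_2.rename(columns={
    "Latitude": "District Latitude",
    "Longitude": "District Longitude",
})
```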

<h3>5. Cluster districts by k-means method</h3>

In [35]:
# set number of clusters
nclusters = 3

rec_clustering = districts_2.drop(['City', 'District', 'District Latitude', 'District Longitude'], axis = 1) 

# run k-means clustering
kmeans = KMeans(n_clusters=nclusters, random_state=1).fit(rec_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([2, 2, 0, 0, 0, 1, 0, 0, 0, 0, 1, 2, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0,
       0, 2, 2, 0, 2, 0, 2, 2, 1, 2, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 2, 2,
       0, 0, 0, 0, 0, 0, 1, 0, 0], dtype=int32)
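
Because the clustering uses only the `Amount` column, the labels from this run appear to coincide with simple bands of that count, judging by the cluster tables in section 6: districts with no centers, with exactly one, and with two or three. The sketch below writes those bands out explicitly. `density_band` is a hypothetical helper, and the label numbers are copied from this particular run, since k-means may number its clusters differently with another seed.

```python
def density_band(amount):
    """Map a per-district center count to the cluster label seen in this run.

    Hypothetical helper: 0 centers -> label 0, 1 center -> label 2,
    2 or more centers -> label 1.
    """
    if amount == 0:
        return 0
    if amount == 1:
        return 2
    return 1


# First six Amount values from districts_2 and the labels they map to.
amounts = [1, 1, 0, 0, 0, 3]
labels = [density_band(a) for a in amounts]
```

The result for these six districts matches the first six entries of `kmeans.labels_` above.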

In [36]:
# add clustering labels
districts_2["Cluster"] = kmeans.labels_

In [37]:
districts_2

Unnamed: 0,City,District,District Latitude,District Longitude,Amount,Cluster
0,Dnipro,Amur-Nyzhnodniprovskyi,48.517912,34.998314,1,2
1,Dnipro,Chechelivskyi,48.435958,34.986224,1,2
2,Dnipro,Industrialnyi,48.522197,35.07921,0,0
3,Dnipro,Novokodatskyi,48.471688,34.886712,0,0
4,Dnipro,Samarskyi,48.483577,35.133972,0,0
5,Dnipro,Shevchenkivskyi,48.408799,35.012917,3,1
6,Dnipro,Sobornyi,48.429569,35.058036,0,0
7,Dnipro,Tsentralnyi,48.462277,35.02599,0,0
8,Kharkiv,Industrialnyi,49.933503,36.405567,0,0
9,Kharkiv,Kholodnohirskyi,50.006473,36.18064,0,0


<h4>Visualize results</h4>

In [38]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=6)

# set color scheme for the clusters
x = np.arange(nclusters)
ys = [i+x+(i*x)**2 for i in range(nclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(districts_2['District Latitude'], districts_2['District Longitude'], districts_2['District'], districts_2['Cluster']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
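
The markers are colored with `rainbow[cluster-1]`, so cluster 0 picks up the last color in the list through Python's negative indexing. The sketch below compares that shifted lookup with a direct one. The palette values are placeholders and are not the colors `cm.rainbow` produces.

```python
# Placeholder palette with one entry per cluster (hypothetical hex values).
palette = ["#1f77b4", "#2ca02c", "#d62728"]
nclusters = len(palette)

# Lookup as used in the map cell: index cluster - 1, so 0 wraps to the end.
shifted = {cluster: palette[cluster - 1] for cluster in range(nclusters)}

# Direct lookup: cluster number equals palette position.
direct = {cluster: palette[cluster] for cluster in range(nclusters)}
```

Both variants give each cluster a distinct color. The direct lookup is easier to read when a legend has to be matched to cluster numbers.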

<h3>6. Examine Clusters</h3>

<h4>Cluster 0</h4>

In [39]:
districts_2.loc[districts_2['Cluster'] == 0]

Unnamed: 0,City,District,District Latitude,District Longitude,Amount,Cluster
2,Dnipro,Industrialnyi,48.522197,35.07921,0,0
3,Dnipro,Novokodatskyi,48.471688,34.886712,0,0
4,Dnipro,Samarskyi,48.483577,35.133972,0,0
6,Dnipro,Sobornyi,48.429569,35.058036,0,0
7,Dnipro,Tsentralnyi,48.462277,35.02599,0,0
8,Kharkiv,Industrialnyi,49.933503,36.405567,0,0
9,Kharkiv,Kholodnohirskyi,50.006473,36.18064,0,0
12,Kharkiv,Nemyshlyanskyi,49.963085,36.333047,0,0
13,Kharkiv,Novobavarskyi,49.950876,36.165208,0,0
14,Kharkiv,Osnovianskyi,49.937761,36.239656,0,0


<h4>Cluster 1</h4>

In [40]:
districts_2.loc[districts_2['Cluster'] == 1]

Unnamed: 0,City,District,District Latitude,District Longitude,Amount,Cluster
5,Dnipro,Shevchenkivskyi,48.408799,35.012917,3,1
10,Kharkiv,Kyivskyi,50.042352,36.302577,2,1
19,Kherson,Suvorovskiy,46.669802,32.60638,3,1
30,Kyiv,Holossijiw,50.340463,30.552943,3,1
50,Zaporizhia,Oleksandrivskyi,47.822009,35.169259,2,1


<h4>Cluster 2</h4>

In [41]:
districts_2.loc[districts_2['Cluster'] == 2]

Unnamed: 0,City,District,District Latitude,District Longitude,Amount,Cluster
0,Dnipro,Amur-Nyzhnodniprovskyi,48.517912,34.998314,1,2
1,Dnipro,Chechelivskyi,48.435958,34.986224,1,2
11,Kharkiv,Moskovskyi,49.998253,36.314906,1,2
15,Kharkiv,Shevchenkivskyi,50.043332,36.221213,1,2
23,Kryvyi Rih,Pokrovsky,48.019842,33.466228,1,2
24,Kryvyi Rih,Saksahanskyi,47.945405,33.411358,1,2
26,Kryvyi Rih,Tsentral'no-Gorodskoy,47.897843,33.319511,1,2
28,Kyiv,Desnianskyi,50.516639,30.607183,1,2
29,Kyiv,Dniprovskyi,50.454127,30.602018,1,2
31,Kyiv,Obolonskyi,50.501431,30.497548,1,2
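
The three cluster tables can be condensed into one summary per cluster. The sketch below shows one possible aggregation on five rows taken from `districts_2`. The output column names are assumptions chosen for readability.

```python
import pandas as pd

# Five rows copied from districts_2 (District, Amount, Cluster only).
sample = pd.DataFrame({
    "District": ["Industrialnyi", "Shevchenkivskyi", "Kyivskyi",
                 "Chechelivskyi", "Moskovskyi"],
    "Amount": [0, 3, 2, 1, 1],
    "Cluster": [0, 1, 1, 2, 2],
})

# Per cluster: number of districts, total centers, and the range per district.
summary = sample.groupby("Cluster")["Amount"].agg(
    districts="count",
    centers="sum",
    min_per_district="min",
    max_per_district="max",
)
```

Running the same aggregation on the full `districts_2` table would give the figures needed to compare the clusters directly.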


<h3>7. Observations</h3>

<p>Most districts that have a retail-entertainment center have exactly one; these districts form Cluster 2, which holds the highest number of centers. Cluster 1 groups the districts with a high density of retail-entertainment centers (two or three each). Cluster 0 contains no retail-entertainment centers.</p>
<p>I would recommend building a new retail-entertainment center in Cluster 0 or Cluster 1. Cluster 0 would be a good choice to avoid competition, while Cluster 1 would offer competition with stable revenue. Competing in Cluster 2 is a bad idea, because it would just cut into the income of the retail-entertainment center.</p>
<p>On the other hand, we can avoid any kind of competition if we build a super-community retail-entertainment center and draw income from the whole city.</p>