# Immigrants in Istanbul
## Capstone Project - The Battle of Neighborhoods (Week 1-2)

### Introduction

#### Background

Since 2011, Syria has been living through one of the biggest humanitarian crises in history. More than 5.5 million Syrians have been forced to leave their country, and 3.2 million of them, more than half, live in Turkey. Migration has been the main source of urban population growth in Istanbul over the last seventy years. Syrian refugees under temporary protection live in all 39 districts of Istanbul, although in different numbers and densities. Their distribution is clearly concentrated on the European side: as of November 2016, 86% (411,318) of the 478,850 Syrians under temporary protection in Istanbul lived on the European side and 14% (67,532) on the Asian side.

With some exceptions, the districts that refugees prefer, or where they find shelter, tend to be places where poverty is common, conservatism and religiosity are evident, the social environment is resilient, and life is relatively cheap.
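The November 2016 shares quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three figures from the text; the variable names are chosen for illustration.

```python
# Figures quoted in the Background section (November 2016).
total = 478_850
european_side = 411_318
asian_side = 67_532

# The two sides should add up to the Istanbul total.
assert european_side + asian_side == total

# Shares as rounded percentages.
european_share = round(100 * european_side / total)
asian_share = round(100 * asian_side / total)
print(european_share, asian_share)  # prints: 86 14
```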

#### Business Problem

Local governments want to understand the immigrant population of Istanbul. They know how many immigrants there are, but not why immigrants choose a particular borough. As the population grows day by day, local governments need more detail, which would be very useful when placing newly arriving immigrants. Learning more about the boroughs where immigrants live will broaden the vision and policies concerning them. We will combine the population data with venue data from Foursquare and cluster the boroughs using machine learning techniques.

### Data Section

We use up-to-date statistical data on the immigrant population of Istanbul. Local government data gives the number of immigrants living in each borough: the figures come from a report by the Immigration Administration of Istanbul and are imported as 'multeci.xlsx'. The table has 3 columns: [Boroughs], [Syrian(population)], [Ratio(to population)]. To explore each location in terms of the amenities and essential facilities present, we access venue data through the Foursquare API and arrange it in a dataframe for visualization.

The figures are taken from the presentation at http://marmara.gov.tr/UserFiles/Attachments/2018/09/14/52dd9554-8e48-4982-8051-a148b65cbdb3.pdf. The raw data could not be found, but the tables in the PDF file were usable.

### Methodology section

The Methodology section describes the main components of our analysis and prediction system. It comprises four stages:

    1. Collect Inspection Data
    2. Explore and Understand Data
    3. Data preparation and preprocessing 
    4. Modeling

##### 1. Collect Inspection Data

After importing the necessary libraries, we load the table, which has been converted to an Excel file:

In [1]:
import os # Operating System
import numpy as np
import pandas as pd
import datetime as dt # Datetime
import json # library to handle JSON files

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')

In [2]:
df_mult = pd.read_excel('multeci.xlsx')

Before using data, we will have to explore and understand it.

##### 2. Explore and Understand Data

We read the dataset from the Excel file into a pandas data frame and display its first five rows as follows:

In [3]:
df_mult.head()

Unnamed: 0,Borough,Syrian,%
0,Küçükçekmece,38278.0,5.02
1,Bağcılar,37643.0,4.97
2,Sultangazi,31426.0,6.02
3,Fatih,30747.0,7.33
4,Esenyurt,29177.0,3.92


In [4]:
df_mult.shape

(27, 3)

In [5]:
df_mult.dtypes

Borough     object
Syrian     float64
%          float64
dtype: object
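The Syrian column holds head counts but is typed float64, so one quick sanity check is to look for values that are not whole numbers. One such value, 6.728, appears in the cluster 1 output later in this notebook. The sketch below only flags such entries. Whether 6.728 stands for 6,728 is an assumption that would have to be checked against the source report. The function name `non_integer_counts` is hypothetical and not part of the original notebook.

```python
import math

def non_integer_counts(counts):
    """Return (position, value) pairs whose value is not a whole number."""
    flagged = []
    for position, value in enumerate(counts):
        if math.isnan(value):
            continue  # missing values are a separate problem
        if value != int(value):
            flagged.append((position, value))
    return flagged

# Three Syrian values as they appear in the cluster 1 output of this notebook.
sample = [5555.0, 6.728, 4951.0]
print(non_integer_counts(sample))  # prints: [(1, 6.728)]
```

In the notebook itself the same check could be run as `non_integer_counts(df_mult['Syrian'])`.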

In [6]:
geolocator = Nominatim(user_agent="istanbul_immigrants")  # user_agent value is an arbitrary example; recent geopy versions require one


In [7]:
df_mult['borough_coord'] = df_mult['Borough'].apply(geolocator.geocode).apply(lambda x: (x.latitude, x.longitude))
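The one-liner above assumes every borough name is found: if the geocoder returns `None`, the lambda raises an `AttributeError`. A minimal sketch of a more forgiving variant follows. The helper `geocode_boroughs`, the `", Istanbul, Turkey"` suffix and the stand-in geocoder are all assumptions made for illustration, so the example runs without network access.

```python
from types import SimpleNamespace

def geocode_boroughs(boroughs, geocode, suffix=", Istanbul, Turkey"):
    """Map each borough name to (latitude, longitude), or (None, None) if not found.

    `geocode` is any callable that takes an address string and returns an object
    with .latitude and .longitude, or None (as geopy geocoders do).
    """
    coords = {}
    for name in boroughs:
        location = geocode(name + suffix)
        if location is None:
            coords[name] = (None, None)
        else:
            coords[name] = (location.latitude, location.longitude)
    return coords

# Stand-in geocoder so the sketch runs offline; the coordinates are the ones
# shown for Fatih in this notebook.
_known = {
    "Fatih, Istanbul, Turkey": SimpleNamespace(latitude=41.014462, longitude=28.9545506959692),
}
fake_geocode = _known.get

result = geocode_boroughs(["Fatih", "Nowhere"], fake_geocode)
print(result)
```

With geopy, `geocode` could be the `geocode` method of a `Nominatim` instance, optionally wrapped in `geopy.extra.rate_limiter.RateLimiter` to space out requests.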

In [8]:
df_mult

Unnamed: 0,Borough,Syrian,%,borough_coord
0,Küçükçekmece,38278.0,5.02,"(41.021033, 28.7764316020368)"
1,Bağcılar,37643.0,4.97,"(41.0338992, 28.8578982)"
2,Sultangazi,31426.0,6.02,"(41.1092401, 28.8826142)"
3,Fatih,30747.0,7.33,"(41.014462, 28.9545506959692)"
4,Esenyurt,29177.0,3.92,"(41.0342402, 28.6800178)"
5,Başakşehir,26424.0,7.48,"(41.1027276, 28.7724595536567)"
6,Zeytinburnu,25000.0,8.63,"(40.9881179, 28.9036351)"
7,Esenler,22678.0,4.93,"(41.0332539, 28.8909528)"
8,Sultanbeyli,20192.0,6.27,"(40.9658834, 29.2723659)"
9,Avcılar,19554.0,4.59,"(40.9801353, 28.7175465)"


In [9]:
df_mult[['Latitude', 'Longitude']] = df_mult['borough_coord'].apply(pd.Series)

In [10]:
df_mult

Unnamed: 0,Borough,Syrian,%,borough_coord,Latitude,Longitude
0,Küçükçekmece,38278.0,5.02,"(41.021033, 28.7764316020368)",41.021033,28.776432
1,Bağcılar,37643.0,4.97,"(41.0338992, 28.8578982)",41.033899,28.857898
2,Sultangazi,31426.0,6.02,"(41.1092401, 28.8826142)",41.10924,28.882614
3,Fatih,30747.0,7.33,"(41.014462, 28.9545506959692)",41.014462,28.954551
4,Esenyurt,29177.0,3.92,"(41.0342402, 28.6800178)",41.03424,28.680018
5,Başakşehir,26424.0,7.48,"(41.1027276, 28.7724595536567)",41.102728,28.77246
6,Zeytinburnu,25000.0,8.63,"(40.9881179, 28.9036351)",40.988118,28.903635
7,Esenler,22678.0,4.93,"(41.0332539, 28.8909528)",41.033254,28.890953
8,Sultanbeyli,20192.0,6.27,"(40.9658834, 29.2723659)",40.965883,29.272366
9,Avcılar,19554.0,4.59,"(40.9801353, 28.7175465)",40.980135,28.717547


In [11]:
df = df_mult.drop(columns=['borough_coord'])

In [12]:
df

Unnamed: 0,Borough,Syrian,%,Latitude,Longitude
0,Küçükçekmece,38278.0,5.02,41.021033,28.776432
1,Bağcılar,37643.0,4.97,41.033899,28.857898
2,Sultangazi,31426.0,6.02,41.10924,28.882614
3,Fatih,30747.0,7.33,41.014462,28.954551
4,Esenyurt,29177.0,3.92,41.03424,28.680018
5,Başakşehir,26424.0,7.48,41.102728,28.77246
6,Zeytinburnu,25000.0,8.63,40.988118,28.903635
7,Esenler,22678.0,4.93,41.033254,28.890953
8,Sultanbeyli,20192.0,6.27,40.965883,29.272366
9,Avcılar,19554.0,4.59,40.980135,28.717547


In [13]:
address = 'Istanbul, TR'

geolocator = Nominatim(user_agent="istanbul_immigrants")  # user_agent value is an arbitrary example
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Istanbul City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Istanbul City are 41.0096334, 28.9651646.


In [14]:

# create map of Istanbul using latitude and longitude values
map_istanbul = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, syr, borough in zip(df['Latitude'], df['Longitude'], df['Syrian'], df['Borough']):
    label = '{}, {}'.format(borough, syr)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_istanbul)  
    
map_istanbul

In [15]:

# Define Foursquare credentials and version
# Replace the placeholders with your own keys; real credentials should not be published.

CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # Foursquare Secret
VERSION = '20181206' # Foursquare API version

print('Credentials set.')

Credentials set.


In [16]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Borough', 
                  'Borough Latitude', 
                  'Borough Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
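`getNearbyVenues` indexes straight into `["response"]['groups'][0]['items']` and `categories[0]`, so a response without groups (for example after a quota error) or a venue without a category would raise an exception. The sketch below shows one way to flatten a payload defensively. The payload shape is inferred from the indexing used above, not from the Foursquare documentation, and `parse_explore_items` and the sample payload are hypothetical.

```python
def parse_explore_items(payload, borough, lat, lng):
    """Flatten one venues/explore JSON payload into row tuples.

    Returns an empty list when the payload has no groups, and uses None
    as the category when a venue has no categories.
    """
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []
    rows = []
    for item in groups[0].get("items", []):
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((
            borough, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows

# Hand-written payload in the shape the function above indexes into.
# The second venue is made up to exercise the "no category" branch.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Taştepe Parkı",
               "location": {"lat": 41.022355, "lng": 28.771499},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Uncategorised venue",
               "location": {"lat": 41.02, "lng": 28.77},
               "categories": []}},
]}]}}

rows = parse_explore_items(sample, "Küçükçekmece", 41.021033, 28.776432)
empty = parse_explore_items({"meta": {"code": 429}}, "Fatih", 41.014462, 28.954551)
print(rows)
print(empty)
```

Rows produced this way have the same seven fields as the tuples built inside `getNearbyVenues`, so they could feed the same `pd.DataFrame` construction.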

In [18]:
location_venues = getNearbyVenues(names=df['Borough'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Küçükçekmece
Bağcılar
Sultangazi
Fatih
Esenyurt
Başakşehir
Zeytinburnu
Esenler
Sultanbeyli
Avcılar
Arnavutköy
Bahçelievler
Gaziosmanpaşa
Şişli
Ümraniye
Kâğıthane
Güngören
Sancaktepe
Beyoğlu
Bayrampaşa
Eyüp
Beylikdüzü
Büyükçekmece
Pendik
Bakırköy
Ataşehir
Kadıköy


In [19]:
location_venues

Unnamed: 0,Borough,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Küçükçekmece,41.021033,28.776432,Etnospor Kültür Festivali,41.022587,28.776108,Performing Arts Venue
1,Küçükçekmece,41.021033,28.776432,Sevilin kurutemizleme ve butik,41.018425,28.772099,Dry Cleaner
2,Küçükçekmece,41.021033,28.776432,taştepespor tesisleri,41.022219,28.773107,Soccer Stadium
3,Küçükçekmece,41.021033,28.776432,Taştepe Mavran Elit Cafe,41.022356,28.771942,Garden Center
4,Küçükçekmece,41.021033,28.776432,Taştepe Parkı,41.022355,28.771499,Park
5,Küçükçekmece,41.021033,28.776432,Taştepe,41.022998,28.773793,Rest Area
6,Küçükçekmece,41.021033,28.776432,Fırın Cake Café,41.023662,28.771962,Café
7,Küçükçekmece,41.021033,28.776432,ÇETİN TERAS,41.023123,28.774250,BBQ Joint
8,Küçükçekmece,41.021033,28.776432,Home,41.017916,28.779994,Amphitheater
9,Küçükçekmece,41.021033,28.776432,Pastella,41.023671,28.772007,Dessert Shop


In [20]:
location_venues.groupby('Borough').count()

Unnamed: 0_level_0,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Borough,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Arnavutköy,74,74,74,74,74,74
Ataşehir,37,37,37,37,37,37
Avcılar,100,100,100,100,100,100
Bahçelievler,1,1,1,1,1,1
Bakırköy,35,35,35,35,35,35
Bayrampaşa,44,44,44,44,44,44
Bağcılar,85,85,85,85,85,85
Başakşehir,11,11,11,11,11,11
Beylikdüzü,69,69,69,69,69,69
Beyoğlu,100,100,100,100,100,100


In [21]:
# get the list of unique categories
print('There are {} unique categories.'.format(len(location_venues['Venue Category'].unique())))

There are 217 unique categories.


In [22]:
location_venues.shape

(1426, 7)

In [23]:
# one hot encoding
venues_onehot = pd.get_dummies(location_venues[['Venue Category']], prefix="", prefix_sep="")

# add Borough column back to dataframe
venues_onehot['Borough'] = location_venues['Borough'] 

# move borough column to the first column
fixed_columns = [venues_onehot.columns[-1]] + list(venues_onehot.columns[:-1])

#fixed_columns
venues_onehot = venues_onehot[fixed_columns]

venues_onehot.head()

Unnamed: 0,Borough,Accessories Store,Afghan Restaurant,American Restaurant,Amphitheater,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Art Gallery,...,Tunnel,Turkish Coffeehouse,Turkish Home Cooking Restaurant,Turkish Restaurant,Water Park,Waterfront,Wine Bar,Wings Joint,Women's Store,Çöp Şiş Place
0,Küçükçekmece,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Küçükçekmece,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Küçükçekmece,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Küçükçekmece,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Küçükçekmece,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [24]:
istanbul_grouped = venues_onehot.groupby('Borough').mean().reset_index()
istanbul_grouped

Unnamed: 0,Borough,Accessories Store,Afghan Restaurant,American Restaurant,Amphitheater,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Art Gallery,...,Tunnel,Turkish Coffeehouse,Turkish Home Cooking Restaurant,Turkish Restaurant,Water Park,Waterfront,Wine Bar,Wings Joint,Women's Store,Çöp Şiş Place
0,Arnavutköy,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.040541,...,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0
1,Ataşehir,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Avcılar,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,...,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.0
3,Bahçelievler,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bakırköy,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bayrampaşa,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,...,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0
6,Bağcılar,0.0,0.0,0.0,0.0,0.0,0.0,0.023529,0.0,0.011765,...,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0
7,Başakşehir,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0
8,Beylikdüzü,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,...,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0
9,Beyoğlu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,...,0.0,0.0,0.01,0.05,0.0,0.0,0.01,0.0,0.0,0.0


In [25]:
istanbul_grouped.shape

(27, 218)
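To make the one-hot and group-mean steps concrete, here is a toy version of the same pipeline. The borough and category names are taken from the outputs above, but the six rows are made up for illustration. The mean of the one-hot columns per borough is simply the share of that borough's venues in each category, which is why every row of `istanbul_grouped` sums to 1.

```python
import pandas as pd

# Toy venue table: names come from this notebook, the rows are invented.
venues = pd.DataFrame({
    "Borough": ["Avcılar", "Avcılar", "Avcılar", "Avcılar", "Bakırköy", "Bakırköy"],
    "Venue Category": ["Café", "Café", "Bar", "Dessert Shop", "Café", "Park"],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Borough", venues["Borough"])

# Averaging the indicators per borough gives category frequencies.
freq = onehot.groupby("Borough").mean()
print(freq)
```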

In [26]:
# What are the top 5 venue categories in each borough?

num_top_venues = 5

for hood in istanbul_grouped['Borough']:
    print("----"+hood+"----")
    temp = istanbul_grouped[istanbul_grouped['Borough'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Arnavutköy----
                venue  freq
0  Seafood Restaurant  0.23
1                Café  0.14
2          Restaurant  0.05
3              Lounge  0.04
4         Art Gallery  0.04


----Ataşehir----
              venue  freq
0              Café  0.16
1        Restaurant  0.11
2  Doner Restaurant  0.08
3       Coffee Shop  0.05
4  Kebab Restaurant  0.05


----Avcılar----
                  venue  freq
0                  Café  0.30
1  Gym / Fitness Center  0.05
2          Dessert Shop  0.05
3        Breakfast Spot  0.04
4                   Bar  0.04


----Bahçelievler----
                       venue  freq
0                       Farm   1.0
1                 Nail Salon   0.0
2                    Meyhane   0.0
3  Middle Eastern Restaurant   0.0
4          Mobile Phone Shop   0.0


----Bakırköy----
                venue  freq
0                Café  0.26
1                Park  0.09
2          Playground  0.06
3          Restaurant  0.06
4  Athletics & Sports  0.06


----Bayrampaşa----

In [27]:
# Define a function to return the most common venues

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [28]:

num_top_venues = 10

# ordinal suffixes for 1st, 2nd and 3rd; later positions fall back to 'th'
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Borough']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))
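The column labels above are built from a short suffix list plus a fallback. A more general way to produce English ordinals is sketched below; `ordinal` is a hypothetical helper that is not part of the original notebook, and it also handles the 11th to 13th exceptions that a simple lookup by last digit would get wrong.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

# Same column layout as in the cell above.
columns = ["Borough"] + ["{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)]
print(columns[1])   # prints: 1st Most Common Venue
print(columns[10])  # prints: 10th Most Common Venue
```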

In [29]:
# create a new dataframe
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['Borough'] = istanbul_grouped['Borough']

for ind in np.arange(istanbul_grouped.shape[0]):
    venues_sorted.iloc[ind, 1:] = return_most_common_venues(istanbul_grouped.iloc[ind, :], num_top_venues)

In [30]:
venues_sorted.head(10)

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arnavutköy,Seafood Restaurant,Café,Restaurant,Lounge,Art Gallery,Pastry Shop,Arts & Entertainment,Lighthouse,Boat or Ferry,Cocktail Bar
1,Ataşehir,Café,Restaurant,Doner Restaurant,Bakery,Coffee Shop,Bistro,Kebab Restaurant,Gym / Fitness Center,Meyhane,Residential Building (Apartment / Condo)
2,Avcılar,Café,Gym / Fitness Center,Dessert Shop,Bar,Breakfast Spot,Restaurant,Turkish Restaurant,Art Gallery,Jewelry Store,Boutique
3,Bahçelievler,Farm,Deli / Bodega,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Dry Cleaner,Donut Shop,Doner Restaurant,Dog Run,Dive Bar
4,Bakırköy,Café,Park,Athletics & Sports,Restaurant,Playground,Pizza Place,Shopping Mall,Dessert Shop,Kebab Restaurant,Spa
5,Bayrampaşa,Clothing Store,Café,Turkish Restaurant,Gym / Fitness Center,Buffet,Fast Food Restaurant,Cosmetics Shop,Men's Store,Dessert Shop,Breakfast Spot
6,Bağcılar,Café,Gym,Coffee Shop,Turkish Restaurant,Steakhouse,Hookah Bar,Dessert Shop,Gym / Fitness Center,Arcade,Soup Place
7,Başakşehir,Pizza Place,Butcher,Soccer Stadium,Department Store,Kebab Restaurant,Turkish Restaurant,Garden,Steakhouse,Playground,Café
8,Beylikdüzü,Café,Turkish Restaurant,Restaurant,Nightclub,Pizza Place,Gym,Dessert Shop,Bar,Supermarket,Kebab Restaurant
9,Beyoğlu,Café,Restaurant,Coffee Shop,Turkish Restaurant,Hotel,Cocktail Bar,Bar,Italian Restaurant,Theater,History Museum


In [31]:
venues_sorted.shape

(27, 11)

In [32]:
istanbul_grouped.shape

(27, 218)

In [33]:
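# from this point on, istanbul_grouped refers to the population table df
# (Borough, Syrian, %, Latitude, Longitude), not to the venue frequencies
istanbul_grouped=df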

In [34]:
from sklearn.cluster import KMeans

In [35]:

#Distribute in 5 Clusters

# set number of clusters
kclusters = 5

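# keep only the numeric columns; the axis is named explicitly because newer pandas versions reject the positional form
istanbul_grouped_clustering = istanbul_grouped.drop(columns='Borough')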

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(istanbul_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:50]

array([3, 3, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 4, 4, 4, 4, 4, 4, 4, 4, 1,
       1, 1, 1, 1, 1], dtype=int32)
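k-means relies on Euclidean distance, so a column with a much larger numeric range than the others tends to dominate the result. The sketch below uses the first five rows of the table shown earlier to compare the raw column ranges with the ranges after z-scoring. It is an illustration only, not part of the original analysis. scikit-learn's `StandardScaler` applies the same transformation.

```python
from statistics import mean, pstdev

# First five rows of the table above: (Syrian, %, Latitude, Longitude).
rows = [
    (38278.0, 5.02, 41.021033, 28.776432),
    (37643.0, 4.97, 41.033899, 28.857898),
    (31426.0, 6.02, 41.109240, 28.882614),
    (30747.0, 7.33, 41.014462, 28.954551),
    (29177.0, 3.92, 41.034240, 28.680018),
]

def standardize(rows):
    """Z-score every column: subtract the column mean, divide by its std."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        tuple((value - mu) / sigma for value, (mu, sigma) in zip(row, stats))
        for row in rows
    ]

# Range (max - min) of each column before and after scaling.
raw_ranges = [max(col) - min(col) for col in zip(*rows)]
scaled = standardize(rows)
scaled_ranges = [max(col) - min(col) for col in zip(*scaled)]
print(raw_ranges)
print(scaled_ranges)
```

In the raw data the Syrian column spans thousands while latitude spans less than a degree, so distances between boroughs are driven almost entirely by population unless the columns are scaled first.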

In [36]:

#Dataframe to include Clusters

istanbul_grouped_clustering=df
istanbul_grouped_clustering.head()

Unnamed: 0,Borough,Syrian,%,Latitude,Longitude
0,Küçükçekmece,38278.0,5.02,41.021033,28.776432
1,Bağcılar,37643.0,4.97,41.033899,28.857898
2,Sultangazi,31426.0,6.02,41.10924,28.882614
3,Fatih,30747.0,7.33,41.014462,28.954551
4,Esenyurt,29177.0,3.92,41.03424,28.680018


In [37]:
df.dtypes

Borough       object
Syrian       float64
%            float64
Latitude     float64
Longitude    float64
dtype: object

In [38]:
# add clustering labels
istanbul_grouped_clustering['Cluster Labels'] = kmeans.labels_

# join the ten most common venues onto the population table, which already holds latitude/longitude for each borough
istanbul_grouped_clustering = istanbul_grouped_clustering.join(venues_sorted.set_index('Borough'), on='Borough')

istanbul_grouped_clustering.head(30) # check the last columns!

Unnamed: 0,Borough,Syrian,%,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Küçükçekmece,38278.0,5.02,41.021033,28.776432,3,Park,Soccer Stadium,Amphitheater,Performing Arts Venue,Garden Center,BBQ Joint,Dry Cleaner,Rental Car Location,Café,Stationery Store
1,Bağcılar,37643.0,4.97,41.033899,28.857898,3,Café,Gym,Coffee Shop,Turkish Restaurant,Steakhouse,Hookah Bar,Dessert Shop,Gym / Fitness Center,Arcade,Soup Place
2,Sultangazi,31426.0,6.02,41.10924,28.882614,2,Music Venue,Factory,Farm,Tunnel,Soccer Stadium,Deli / Bodega,Dumpling Restaurant,Dry Cleaner,Donut Shop,Doner Restaurant
3,Fatih,30747.0,7.33,41.014462,28.954551,2,Café,Hotel,Turkish Restaurant,Kebab Restaurant,Steakhouse,Dessert Shop,Restaurant,Middle Eastern Restaurant,Tea Room,Historic Site
4,Esenyurt,29177.0,3.92,41.03424,28.680018,2,Café,Mobile Phone Shop,Restaurant,Electronics Store,Farm,Burger Joint,Fast Food Restaurant,Pool Hall,Bookstore,Hotel
5,Başakşehir,26424.0,7.48,41.102728,28.77246,2,Pizza Place,Butcher,Soccer Stadium,Department Store,Kebab Restaurant,Turkish Restaurant,Garden,Steakhouse,Playground,Café
6,Zeytinburnu,25000.0,8.63,40.988118,28.903635,2,Café,Turkish Restaurant,Clothing Store,Restaurant,Steakhouse,Hookah Bar,Tea Room,Dessert Shop,Ice Cream Shop,Coffee Shop
7,Esenler,22678.0,4.93,41.033254,28.890953,0,Café,Gym / Fitness Center,Gym,Soccer Stadium,Pizza Place,Pharmacy,Fried Chicken Joint,Restaurant,Fast Food Restaurant,Pub
8,Sultanbeyli,20192.0,6.27,40.965883,29.272366,0,Café,Steakhouse,Kebab Restaurant,Turkish Restaurant,Gym,Electronics Store,Mobile Phone Shop,Tea Room,Doner Restaurant,Boutique
9,Avcılar,19554.0,4.59,40.980135,28.717547,0,Café,Gym / Fitness Center,Dessert Shop,Bar,Breakfast Spot,Restaurant,Turkish Restaurant,Art Gallery,Jewelry Store,Boutique


In [39]:
# Create Map

map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lng, poi, cluster in zip(istanbul_grouped_clustering['Latitude'], istanbul_grouped_clustering['Longitude'], istanbul_grouped_clustering['Borough'], istanbul_grouped_clustering['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [40]:
istanbul_grouped_clustering.loc[istanbul_grouped_clustering['Cluster Labels'] == 0, istanbul_grouped_clustering.columns[[1] + list(range(5, istanbul_grouped_clustering.shape[1]))]].head()

Unnamed: 0,Syrian,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,22678.0,0,Café,Gym / Fitness Center,Gym,Soccer Stadium,Pizza Place,Pharmacy,Fried Chicken Joint,Restaurant,Fast Food Restaurant,Pub
8,20192.0,0,Café,Steakhouse,Kebab Restaurant,Turkish Restaurant,Gym,Electronics Store,Mobile Phone Shop,Tea Room,Doner Restaurant,Boutique
9,19554.0,0,Café,Gym / Fitness Center,Dessert Shop,Bar,Breakfast Spot,Restaurant,Turkish Restaurant,Art Gallery,Jewelry Store,Boutique
10,17838.0,0,Seafood Restaurant,Café,Restaurant,Lounge,Art Gallery,Pastry Shop,Arts & Entertainment,Lighthouse,Boat or Ferry,Cocktail Bar
11,17710.0,0,Farm,Deli / Bodega,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Dry Cleaner,Donut Shop,Doner Restaurant,Dog Run,Dive Bar


In [41]:
istanbul_grouped_clustering.loc[istanbul_grouped_clustering['Cluster Labels'] == 1, istanbul_grouped_clustering.columns[[1] + list(range(5, istanbul_grouped_clustering.shape[1]))]].head()

Unnamed: 0,Syrian,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,6.728,1,Café,Turkish Restaurant,Restaurant,Nightclub,Pizza Place,Gym,Dessert Shop,Bar,Supermarket,Kebab Restaurant
22,5555.0,1,Café,Beach,Seafood Restaurant,Bar,Turkish Restaurant,Art Gallery,Italian Restaurant,Salon / Barbershop,Restaurant,Ice Cream Shop
23,4951.0,1,Borek Place,Dessert Shop,Department Store,Snack Place,Bakery,Arts & Crafts Store,Furniture / Home Store,Rental Car Location,Fast Food Restaurant,Soccer Stadium
24,2191.0,1,Café,Park,Athletics & Sports,Restaurant,Playground,Pizza Place,Shopping Mall,Dessert Shop,Kebab Restaurant,Spa
25,1436.0,1,Café,Restaurant,Doner Restaurant,Bakery,Coffee Shop,Bistro,Kebab Restaurant,Gym / Fitness Center,Meyhane,Residential Building (Apartment / Condo)


In [42]:
istanbul_grouped_clustering.loc[istanbul_grouped_clustering['Cluster Labels'] == 2, istanbul_grouped_clustering.columns[[1] + list(range(5, istanbul_grouped_clustering.shape[1]))]].head()

Unnamed: 0,Syrian,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,31426.0,2,Music Venue,Factory,Farm,Tunnel,Soccer Stadium,Deli / Bodega,Dumpling Restaurant,Dry Cleaner,Donut Shop,Doner Restaurant
3,30747.0,2,Café,Hotel,Turkish Restaurant,Kebab Restaurant,Steakhouse,Dessert Shop,Restaurant,Middle Eastern Restaurant,Tea Room,Historic Site
4,29177.0,2,Café,Mobile Phone Shop,Restaurant,Electronics Store,Farm,Burger Joint,Fast Food Restaurant,Pool Hall,Bookstore,Hotel
5,26424.0,2,Pizza Place,Butcher,Soccer Stadium,Department Store,Kebab Restaurant,Turkish Restaurant,Garden,Steakhouse,Playground,Café
6,25000.0,2,Café,Turkish Restaurant,Clothing Store,Restaurant,Steakhouse,Hookah Bar,Tea Room,Dessert Shop,Ice Cream Shop,Coffee Shop


In [43]:
istanbul_grouped_clustering.loc[istanbul_grouped_clustering['Cluster Labels'] == 3, istanbul_grouped_clustering.columns[[1] + list(range(5, istanbul_grouped_clustering.shape[1]))]].head()

Unnamed: 0,Syrian,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,38278.0,3,Park,Soccer Stadium,Amphitheater,Performing Arts Venue,Garden Center,BBQ Joint,Dry Cleaner,Rental Car Location,Café,Stationery Store
1,37643.0,3,Café,Gym,Coffee Shop,Turkish Restaurant,Steakhouse,Hookah Bar,Dessert Shop,Gym / Fitness Center,Arcade,Soup Place


In [44]:
istanbul_grouped_clustering.loc[istanbul_grouped_clustering['Cluster Labels'] == 4, istanbul_grouped_clustering.columns[[1] + list(range(5, istanbul_grouped_clustering.shape[1]))]].head()

Unnamed: 0,Syrian,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,15269.0,4,Hotel,Café,Restaurant,Turkish Restaurant,Gym / Fitness Center,Clothing Store,Bakery,Gym,Health & Beauty Service,Coffee Shop
14,14858.0,4,Clothing Store,Café,Furniture / Home Store,Coffee Shop,Bookstore,Restaurant,Cosmetics Shop,Gym / Fitness Center,Shopping Mall,Boutique
15,14216.0,4,Clothing Store,Café,Shoe Store,Turkish Restaurant,Italian Restaurant,Fast Food Restaurant,Coffee Shop,Cosmetics Shop,Electronics Store,Fried Chicken Joint
16,12727.0,4,Turkish Restaurant,Department Store,Café,Restaurant,Bakery,Music Store,Bar,Donut Shop,Dessert Shop,Pide Place
17,12072.0,4,Café,Turkish Restaurant,Cosmetics Shop,Convenience Store,Ice Cream Shop,Arcade,Snack Place,Dessert Shop,Gym,Turkish Coffeehouse
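The five cells above list the clusters one at a time. A compact way to compare them is a per-cluster summary of the Syrian column. The sketch below uses only the ten rows displayed in the merged table earlier, so cluster 0 is incomplete here and clusters 1 and 4 are absent. The frame name `sample` is chosen for illustration.

```python
import pandas as pd

# (Borough, Syrian, Cluster Labels) for the ten rows shown in the merged table.
rows = [
    ("Küçükçekmece", 38278.0, 3), ("Bağcılar", 37643.0, 3),
    ("Sultangazi", 31426.0, 2), ("Fatih", 30747.0, 2), ("Esenyurt", 29177.0, 2),
    ("Başakşehir", 26424.0, 2), ("Zeytinburnu", 25000.0, 2),
    ("Esenler", 22678.0, 0), ("Sultanbeyli", 20192.0, 0), ("Avcılar", 19554.0, 0),
]
sample = pd.DataFrame(rows, columns=["Borough", "Syrian", "Cluster Labels"])

# Size and population range of each cluster, largest populations first.
summary = (
    sample.groupby("Cluster Labels")["Syrian"]
    .agg(["count", "min", "max"])
    .sort_values("max", ascending=False)
)
print(summary)
```

In this sample the clusters occupy non-overlapping population ranges. On the full table the same summary could be computed from `istanbul_grouped_clustering`.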


### Results and Discussion Section

The main factor that shapes how the 39 district municipalities of Istanbul, which host between 166 and 38,278 refugees each, manage the refugee process is naturally the size of the refugee population. Küçükçekmece, Bağcılar, Sultangazi, Fatih and Esenyurt each host roughly 30 thousand Syrian refugees or more (29,177 to 38,278 in the table above). In Başakşehir, Zeytinburnu, Esenler, Sultanbeyli and Avcılar the number is roughly between 20 and 30 thousand; in Arnavutköy, Bahçelievler, Gaziosmanpaşa, Şişli, Ümraniye, Kağıthane, Güngören, Sancaktepe, Beyoğlu, Bayrampaşa and Eyüp it is between 10 thousand and 20 thousand; and in Beylikdüzü, Büyükçekmece and Pendik it is roughly between 5 thousand and 10 thousand. Ten district municipalities host between 1,000 and 3 thousand refugees, and five host fewer than 1,000.
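The population bands used in this paragraph can be reproduced mechanically. The sketch below assigns the ten boroughs shown in the tables of this notebook to bands with edges at 10, 20 and 30 thousand. The edges, labels and the helper `population_band` are assumptions made for illustration.

```python
from bisect import bisect_right

EDGES = [10_000, 20_000, 30_000]
LABELS = ["under 10k", "10k-20k", "20k-30k", "30k and over"]

def population_band(count):
    """Return the label of the band that contains `count` (lower edge inclusive)."""
    return LABELS[bisect_right(EDGES, count)]

# Syrian populations of the ten boroughs shown in the tables of this notebook.
populations = {
    "Küçükçekmece": 38278, "Bağcılar": 37643, "Sultangazi": 31426,
    "Fatih": 30747, "Esenyurt": 29177, "Başakşehir": 26424,
    "Zeytinburnu": 25000, "Esenler": 22678, "Sultanbeyli": 20192,
    "Avcılar": 19554,
}

bands = {name: population_band(count) for name, count in populations.items()}
for name, band in bands.items():
    print(name, band)
```

Boroughs that sit just below an edge, such as Esenyurt (29,177) and Avcılar (19,554), fall into the lower band under strict thresholds, so the grouping in the text is best read as approximate.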

This gives us a big picture of where the immigrants live. Istanbul is home to many cultures and is also a very attractive city for domestic migration, and the differences among its boroughs are visible in the data. The Foursquare data gives hints about where immigrants choose to live and what kind of work they may find there.

We can analyze the results according to the five clusters we produced.

 -  The boroughs with the most immigrants (clusters 2-3) suggest that immigrants choose developing boroughs, and these boroughs in reality have the lowest populations in Istanbul. Government policies also guide the placement.
 -  The most common venues (clusters 0-4) show where most job opportunities for immigrants lie. Local governments have been expanding their immigration policies, for example with tax privileges for opening a new workplace.
 -  Adaptation is the first problem: immigrants share the same religion as the locals, but lifestyles differ. Many immigrants are unemployed, but they can easily find jobs at restaurants (especially Kebab and Turkish restaurants) and coffee houses.
 -  These clusters may benefit local governments as well as real estate brokers. The immigrant population affects house and rental prices, and Turkish citizens generally choose to live where the immigrant population is smallest.


### Conclusion

To sum up, migration has many causes, such as war, economic crisis, or the hope of a better future. Immigration policies have to differ from other policies. This report shows how important it is to collect more data about immigrants and to research it. Places change along with the people who live in them, which affects lifestyle, education policies, health policies, house prices, retail and insurance.

For local citizens:

-  They can use this potential for new workplaces.
-  They can look at these maps and easily find new employees among immigrants.
-  Homeowners and real estate brokers can set their prices and plan their investments.
-  Local governments can set up their crisis management and plan projects concerning immigrants.

For immigrants:

-  They can see which boroughs receive the most migration and use this to choose places where adaptation is easier.
-  If an immigrant wants to become an investor or run their own business, this report can help with where to go and what to do.
-  A borough's living standards can make life easier or harder, so researching the city before migrating is the best way to make a sound choice.

## "Peace at home, peace in the world!" M.Kemal Ataturk

Many thanks for reading my report! And especially thanks to Coursera!
#### AYDIN KEMENT
##### Istanbul, 2019 ©