# CAPSTONE PROJECT - The Battle of the Neighborhoods

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## 1. Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find optimal locations for improving health service points and medical establishments. Specifically, this report is targeted at the Istanbul City Health Department, which is interested in making a presentation to the Mayor/Governor about the need for medical establishments. The Department wants to distinguish neighborhoods according to their number of medical establishments and their health care service capabilities, so that it can provide efficient guidance on funding new establishments or enhancing the capabilities of the identified areas.

## 2. Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* number of existing medical establishments in the neighborhood (any type of health care services)
* number of beds, if any
* capabilities of emergency service or ambulance

We decided to use a regularly spaced grid of locations, centered on the city center, to define our neighborhoods.
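A minimal sketch of how such a grid could be generated is shown here. The function name `make_grid`, the 1 km spacing, and the 7 x 7 extent are illustrative assumptions, not values taken from this project; the center is the coordinate printed by the geocoding cell later in this notebook. The conversion from meters to degrees is a flat-Earth approximation that is adequate only at city scale.

```python
import math

def make_grid(center_lat, center_lon, spacing_m=1000.0, half_steps=3):
    """Return (lat, lon) centers of a square grid around a center point.

    Hypothetical helper: spacing_m and half_steps are illustrative defaults.
    """
    # One degree of latitude is roughly 111.32 km; a degree of longitude
    # shrinks with the cosine of the latitude.
    meters_per_deg_lat = 111_320.0
    meters_per_deg_lon = meters_per_deg_lat * math.cos(math.radians(center_lat))

    points = []
    for i in range(-half_steps, half_steps + 1):
        for j in range(-half_steps, half_steps + 1):
            lat = center_lat + i * spacing_m / meters_per_deg_lat
            lon = center_lon + j * spacing_m / meters_per_deg_lon
            points.append((lat, lon))
    return points

# Center taken from the geocoding output shown later in this notebook.
grid = make_grid(40.9561106, 29.0899629, spacing_m=1000.0, half_steps=3)
print(len(grid))  # (2 * 3 + 1) ** 2 candidate centers
```

Each returned pair can then serve as the center of one candidate neighborhood.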

The following data sources will be needed to extract/generate the required information:
* Centers of candidate areas will be generated algorithmically, and approximate addresses of those centers will be obtained using **Google Maps API reverse geocoding**
* The number, type, and location of medical establishments in every neighborhood will be obtained using the **Foursquare API and the Istanbul Municipality Open Data Platform**
* The coordinates of the Istanbul city center will be obtained using **Google Maps API geocoding**
* The data is in CSV format. It is provided by the Istanbul Municipality IT Department on their Open Data Platform and is available at the following link:  
https://data.ibb.gov.tr/dataset/bd3b9489-c7d5-4ff3-897c-8667f57c70bb/resource/f2154883-68e3-41dc-b2be-a6c2eb721c9e/download/salk-kurum-ve-kurulularna-ait-bilgiler.csv
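One way to count the establishments that belong to a candidate area is to count those within a fixed radius of its center. The sketch below assumes this radius-based rule, which is not stated in this report; `haversine_m`, `count_within`, the center point, and the radii are hypothetical. The three establishment coordinates are copied from the data set shown later in this notebook.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def count_within(center, establishments, radius_m):
    """Count establishments whose distance to center is at most radius_m."""
    return sum(
        1
        for lat, lon in establishments
        if haversine_m(center[0], center[1], lat, lon) <= radius_m
    )

# (Latitude, Longitude) pairs copied from three rows of the data set.
establishments = [
    (41.004663, 29.034348),
    (41.003074, 28.9383),
    (41.021445, 29.015832),
]
center = (41.0, 29.03)  # hypothetical candidate-area center

near = count_within(center, establishments, radius_m=1000.0)
wider = count_within(center, establishments, radius_m=3000.0)
print(near, wider)
```

A larger radius makes neighboring areas overlap, so the radius should be chosen together with the grid spacing.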

Let us import the dependencies for our project.

In [2]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
import os
from sklearn.cluster import KMeans
import folium 
from geopy.geocoders import Nominatim 
import matplotlib.cm as cm
import matplotlib.colors as colors

First we acquire the data set from the Istanbul Municipality Open Data Platform in CSV format. It describes medical establishments in Istanbul and includes their names and types, boroughs, neighborhoods, contact information, capabilities (emergency service, ambulance, number of beds), and geographic coordinates.
We will drop unnecessary columns so we can focus on our main purpose. Column names and records are in Turkish, and the file does not load cleanly as UTF-8, so we read it with the `iso-8859-1` encoding and rename the columns to English. We will drop records that cover only side health services (for example dental centers, dialysis centers, veterinary clinics, and physical rehabilitation centers). We will convert the 'Ambulance' and 'Emergency_Service' columns to numerical data so we can use them: 1 refers to existence and 0 to non-existence. We will also drop rows whose 'Ambulance', 'Emergency_Service', or 'Bed' values are NaN.

In [154]:
df = pd.read_csv("https://data.ibb.gov.tr/dataset/bd3b9489-c7d5-4ff3-897c-8667f57c70bb/resource/f2154883-68e3-41dc-b2be-a6c2eb721c9e/download/salk-kurum-ve-kurulularna-ait-bilgiler.csv", encoding="iso-8859-1")
df.head()

Unnamed: 0,ILCE_UAVT,ILCE_ADI,ADI,ALT_KATEGORI,ADRES,TELEFON,WEBSITESI,ACIL_SERVIS,YATAK,AMBULANS,MAHALLE,ENLEM,BOYLAM
0,1421.0,KADIKÖY,Marmara Veteriner Kliniði,Veteriner,Bahçeler Sk. No:3 FENERYOLU/KADIKÖY,0216 347 49 42,,,0.0,Yok,FENERYOLU,40.979645,29.050702
1,1327.0,FATÝH,Ýstanbul Üniversitesi-Cerrahpaþa Týp Fakültesi...,Poliklinik,ÝÜ Cerrahpaþa Týp Fakültesi Yolu No: CERRAHPAÞ...,0212 414 23 59,http://istanbultip.istanbul.edu.tr,Var,60.0,,CERRAHPAÞA,41.003546,28.942294
2,2015.0,TUZLA,Özel Saradent Aðýz ve Diþ Saðlýðý Polikliniði,Özel Aðýz Diþ Saðlýðý Merkezleri,Ýnönü Cad. ÞÝFA/TUZLA,0216 423 23 63,www.saradent.com.tr,,,,ÞÝFA,40.826815,29.355745
3,1421.0,KADIKÖY,Atalay Veteriner Kliniði,Veteriner,Sinan Ercan Cad. No:6 19 MAYIS/KADIKÖY,0216 372 02 62,,,0.0,Yok,19 MAYIS,40.969398,29.085752
4,1421.0,KADIKÖY,Medipol Üniversitesi Hastanesi Kadýköy,Üniversite Hastanesi,Lambacý Sk. No:2/1 KOÞUYOLU/KADIKÖY,0216 544 66 66,www.medipol.com.tr,Var,72.0,Var,KOÞUYOLU,41.004663,29.034348
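The garbled letters in the output above (for example `Kliniði` and `FATÝH`) are consistent with Turkish text being decoded as ISO-8859-1, which differs from the Turkish ISO-8859-9 code page in only six characters. The sketch below shows how such strings could be repaired after loading; it assumes the file is actually encoded as ISO-8859-9, which this notebook does not confirm, and `fix_turkish_mojibake` is a hypothetical helper.

```python
def fix_turkish_mojibake(text):
    """Re-interpret text that was decoded as ISO-8859-1 as ISO-8859-9 (Turkish)."""
    # Encoding back to ISO-8859-1 recovers the original bytes; decoding
    # those bytes as ISO-8859-9 yields the intended Turkish letters.
    return text.encode("iso-8859-1").decode("iso-8859-9")

# Sample strings copied from the output above.
samples = ["Kliniði", "FATÝH", "KOÞUYOLU", "Kadýköy"]
fixed = [fix_turkish_mojibake(s) for s in samples]
print(fixed)
```

Under the same assumption, passing `encoding="iso-8859-9"` to `pd.read_csv` would avoid the problem at load time, but string literals used later for filtering would then need the correctly spelled Turkish names.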


In [155]:
df.drop(columns = ['ILCE_UAVT', 'ADRES', 'TELEFON','WEBSITESI'], inplace = True)

In [156]:
df.rename(columns={'ILCE_ADI': 'Borough', 'ADI': 'Name', 'ALT_KATEGORI': 'Sub_Category',
                   'ACIL_SERVIS': 'Emergency_Service', 'YATAK': 'Bed', 'AMBULANS': 'Ambulance',
                   'MAHALLE': 'Neighborhood', 'ENLEM': 'Latitude', 'BOYLAM': 'Longitude'}, inplace=True)

In [179]:
# Drop side health services: dental centers, veterinary clinics, dialysis and rehabilitation centers
side_services = ['Özel Aðýz Diþ Saðlýðý Merkezleri', 'Veteriner', 'Diyaliz Merkezi']
indexNames = df[df['Sub_Category'].isin(side_services) | df['Sub_Category'].str.contains('Rehabilitasyon', na=False)].index
df.drop(indexNames, inplace=True)



In [166]:
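# Drop rows with a missing value in any remaining column, including 'Emergency_Service', 'Bed' and 'Ambulance'
df = df.dropna()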

In [183]:
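# Map 'Var' (exists) to 1 and 'Yok' (does not exist) to 0; assign the result back instead of using a chained inplace replace
df['Emergency_Service'] = df['Emergency_Service'].replace({'Var': 1, 'Yok': 0})
df['Ambulance'] = df['Ambulance'].replace({'Var': 1, 'Yok': 0})
df.head()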

Unnamed: 0,Borough,Name,Sub_Category,Emergency_Service,Bed,Ambulance,Neighborhood,Latitude,Longitude
4,KADIKÖY,Medipol Üniversitesi Hastanesi Kadýköy,Üniversite Hastanesi,1,72.0,1,KOÞUYOLU,41.004663,29.034348
10,ADALAR,Adalar 1 Nolu Acil Yardým Ýstasyonu,Acil Yardým Ýstasyonu,0,3.0,1,MADEN,40.874674,29.132746
21,FATÝH,Ýstanbul Eðitim ve Araþtýrma Hastanesi,Eðitim Araþtýrma Hastanesi,1,507.0,0,CERRAHPAÞA,41.003074,28.9383
43,KARTAL,Kartal Koþuyolu Yüksek Ýhtisas Eðitim ve Araþt...,Eðitim Araþtýrma Hastanesi,1,465.0,0,CEVÝZLÝ,40.915824,29.171704
44,ÞÝÞLÝ,Özel Ýac Ýstanbul Aesthetic Týp Merkezi,Týp Merkezi Özel,1,8.0,0,ESENTEPE,41.069186,29.006582


In [184]:
pd.set_option('display.max_rows', None)
df

Unnamed: 0,Borough,Name,Sub_Category,Emergency_Service,Bed,Ambulance,Neighborhood,Latitude,Longitude
4,KADIKÖY,Medipol Üniversitesi Hastanesi Kadýköy,Üniversite Hastanesi,1,72.0,1,KOÞUYOLU,41.004663,29.034348
10,ADALAR,Adalar 1 Nolu Acil Yardým Ýstasyonu,Acil Yardým Ýstasyonu,0,3.0,1,MADEN,40.874674,29.132746
21,FATÝH,Ýstanbul Eðitim ve Araþtýrma Hastanesi,Eðitim Araþtýrma Hastanesi,1,507.0,0,CERRAHPAÞA,41.003074,28.9383
43,KARTAL,Kartal Koþuyolu Yüksek Ýhtisas Eðitim ve Araþt...,Eðitim Araþtýrma Hastanesi,1,465.0,0,CEVÝZLÝ,40.915824,29.171704
44,ÞÝÞLÝ,Özel Ýac Ýstanbul Aesthetic Týp Merkezi,Týp Merkezi Özel,1,8.0,0,ESENTEPE,41.069186,29.006582
54,SÝLÝVRÝ,Silivri Devlet Hastanesi Selimpaþa Ek Hizmet B...,Devlet Hastanesi,1,35.0,1,SELÝMPAÞA,41.054333,28.379361
57,FATÝH,Ýstanbul Haseki Eðitim ve Araþtýrma Hastanesi ...,Eðitim Araþtýrma Hastanesi,1,125.0,0,HIRKA-Ý ÞERÝF,41.020465,28.937514
72,ÞÝÞLÝ,Özel Ýstanbul Cerrahi Hastanesi Fulya,Özel Hastane,1,48.0,1,TEÞVÝKÝYE,41.055628,28.997872
77,ÜSKÜDAR,Özel Üsküdar Anadolu Hastanesi,Özel Hastane,1,20.0,1,AZÝZ MAHMUT HÜDAYÝ,41.021445,29.015832
87,BEÞÝKTAÞ,Özel Estethica Levent Hastanesi,Özel Hastane,1,0.0,0,LEVENT,41.086068,29.017861
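With the table cleaned, the three decision factors (number of establishments, number of beds, emergency and ambulance availability) can be summarized per area. The sketch below groups by borough as a stand-in, which is an assumption: this project defines its neighborhoods as grid cells. The five rows are copied from the table above, with borough names kept as they appear there.

```python
import pandas as pd

# Rows copied from the cleaned table above (borough names as printed there).
sample = pd.DataFrame(
    {
        "Borough": ["KADIKÖY", "FATÝH", "FATÝH", "ÞÝÞLÝ", "ÞÝÞLÝ"],
        "Emergency_Service": [1, 1, 1, 1, 1],
        "Bed": [72.0, 507.0, 125.0, 8.0, 48.0],
        "Ambulance": [1, 0, 0, 0, 1],
    }
)

# One row per borough: establishment count, total beds, and how many
# establishments offer an emergency service or an ambulance.
summary = sample.groupby("Borough").agg(
    Establishments=("Bed", "count"),
    Beds=("Bed", "sum"),
    Emergency_Services=("Emergency_Service", "sum"),
    Ambulances=("Ambulance", "sum"),
)
print(summary)
```

The same aggregation applies unchanged once each establishment has been assigned a grid-cell identifier in place of `Borough`.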


In [40]:
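# NOTE: the "ON" suffix (Ontario) does not apply to Istanbul; "Istanbul, Turkey" would be the unambiguous query and may return different coordinates
address = "Istanbul, ON"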

geolocator = Nominatim(user_agent="istanbul_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Istanbul city are {}, {}.'.format(latitude, longitude))

istanbul_map = folium.Map(location=[latitude, longitude], zoom_start=10)

The geographical coordinates of Istanbul city are 40.9561106, 29.0899629.


In [41]:
istanbul_map