# Battle of Neighbourhoods - Week 1

## Introduction & Business Problem

Besiktas and Kadikoy are two popular towns of Istanbul, located on opposite sides of the Bosphorus, where tourists and locals spend much of their time. The objective of this study is to provide some exploratory information to an entrepreneur who is considering opening a coffee shop or a burger joint in one of these towns.

### Loading necessary libraries 

In [44]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
!pip install geopy

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON data into a pandas dataframe (pandas >= 1.0; replaces pandas.io.json.json_normalize)

import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library



## Data Description

### Gathering necessary data for town coordinates from heroku api 

To gather the locations of the districts, we will use a REST API hosted on herokuapp that provides location data for towns in Turkey.

In [45]:
#Getting the ID of Istanbul
url_cities="https://il-ilce-rest-api.herokuapp.com/v1/cities"
results_cities=requests.get(url_cities).json()
results_cities=pd.DataFrame(results_cities['data'])

index= results_cities.loc[results_cities['name'] == 'İstanbul'].index[0]
index

74

In [46]:
ist_id=results_cities['_id'].iloc[index]
ist_id

'ce941560c5a7ba9ff5cd24f5f9d75065'

In [47]:
#Towns of Istanbul
url='https://il-ilce-rest-api.herokuapp.com/v1/cities/{}/towns?fields=name,geolocation.lat,geolocation.lon'.format(ist_id)
results_ist=requests.get(url).json()

ist_towns=results_ist['data']
ist_towns=pd.DataFrame(ist_towns)
ist_towns.head()

| | _id | name | geolocation |
|---|---|---|---|
| 0 | 10dd43dbbbe3e3a4ea83b9a5a05b7383 | Gaziosmanpaşa | {'lat': '41.0734206', 'lon': '28.9015561330191'} |
| 1 | 1300c0624eb6a1f433c7b4860eb4769f | Üsküdar | {'lat': '41.0352214', 'lon': '29.0573344413904'} |
| 2 | 1f8028830ea7d0c0932a9dc26b3ae69b | Bağcılar | {'lat': '41.0447291', 'lon': '28.8337135105443'} |
| 3 | 25c70d3e12f9cb9cfc58990ef09e66a0 | Eyüpsultan | {'lat': '41.0460444', 'lon': '28.9253241'} |
| 4 | 374e64370d076b57d606cb7a1dd962a4 | Pendik | {'lat': '40.95637645', 'lon': '29.3545807771283'} |


In [48]:
#Splitting 'geolocation' into 'lat' and 'lon' columns, then indexing by 'name' so the _id of a town is easy to look up

lat_lon=ist_towns['geolocation'].apply(pd.Series)
ist_towns=pd.concat([ist_towns,lat_lon], axis=1)
ist_towns=ist_towns.drop(['geolocation'],axis=1)
ist_towns.set_index('name', inplace=True)
ist_towns.head()

| name | _id | lat | lon |
|---|---|---|---|
| Gaziosmanpaşa | 10dd43dbbbe3e3a4ea83b9a5a05b7383 | 41.0734206 | 28.9015561330191 |
| Üsküdar | 1300c0624eb6a1f433c7b4860eb4769f | 41.0352214 | 29.0573344413904 |
| Bağcılar | 1f8028830ea7d0c0932a9dc26b3ae69b | 41.0447291 | 28.8337135105443 |
| Eyüpsultan | 25c70d3e12f9cb9cfc58990ef09e66a0 | 41.0460444 | 28.9253241 |
| Pendik | 374e64370d076b57d606cb7a1dd962a4 | 40.95637645 | 29.3545807771283 |
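
The first preview shows that `geolocation` arrives as a nested dictionary whose `lat` and `lon` values are strings. Below is a minimal sketch of the same flattening step in plain Python that also converts the coordinates to floats. The helper name `flatten_town` is hypothetical, and the two sample records are copied from the output above.

```python
def flatten_town(record):
    """Return a flat dict with float coordinates from one town record.

    Assumes the record shape shown in the preview above:
    {'_id': ..., 'name': ..., 'geolocation': {'lat': '<str>', 'lon': '<str>'}}
    """
    geo = record.get("geolocation") or {}
    return {
        "_id": record.get("_id"),
        "name": record.get("name"),
        # Coordinates arrive as strings, so convert them explicitly.
        "lat": float(geo["lat"]) if "lat" in geo else None,
        "lon": float(geo["lon"]) if "lon" in geo else None,
    }


sample_towns = [
    {"_id": "10dd43dbbbe3e3a4ea83b9a5a05b7383", "name": "Gaziosmanpaşa",
     "geolocation": {"lat": "41.0734206", "lon": "28.9015561330191"}},
    {"_id": "1300c0624eb6a1f433c7b4860eb4769f", "name": "Üsküdar",
     "geolocation": {"lat": "41.0352214", "lon": "29.0573344413904"}},
]

flat_towns = [flatten_town(t) for t in sample_towns]
for town in flat_towns:
    print(town["name"], town["lat"], town["lon"])
```

A list such as `flat_towns` can be passed directly to `pd.DataFrame(...)` if a dataframe is needed afterwards.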


In [49]:
b_id=ist_towns.loc['Beşiktaş']['_id']
k_id=ist_towns.loc['Kadıköy']['_id']

print(b_id) #ID of Besiktas
print(k_id) #ID of Kadikoy


8990b6a1d21e9c681dba64c11ddccb9d
bccdf16204b5a81620ed39c8c69930ea


In [50]:
#Listing districts of Besiktas and Kadikoy via their IDs

k_url='https://il-ilce-rest-api.herokuapp.com/v1/towns/{}/districts'.format(k_id)
b_url='https://il-ilce-rest-api.herokuapp.com/v1/towns/{}/districts'.format(b_id)

results_k=requests.get(k_url).json()
results_b=requests.get(b_url).json()

kadikoy_n=results_k['data']
besiktas_n=results_b['data']
kadikoy_n=pd.DataFrame(kadikoy_n)
besiktas_n=pd.DataFrame(besiktas_n)
bn_neighbourhoods=pd.concat([besiktas_n,kadikoy_n], axis=0)

bn=bn_neighbourhoods.drop(['_id','city'],axis=1)      #drop unnecessary columns
bn

            name      town
0       Abbasağa  Beşiktaş
1        Türkali  Beşiktaş
2         Levent  Beşiktaş
3     Gayrettepe  Beşiktaş
4        Akatlar  Beşiktaş
5         Etiler  Beşiktaş
6        Ortaköy  Beşiktaş
7     Arnavutköy  Beşiktaş
8          Bebek  Beşiktaş
9        Levazım  Beşiktaş
0       Koşuyolu   Kadıköy
1        Suadiye   Kadıköy
2       Bostancı   Kadıköy
3       Caferağa   Kadıköy
4        Göztepe   Kadıköy
5       Osmanağa   Kadıköy
6      Fikirtepe   Kadıköy
7     Fenerbahçe   Kadıköy
8   Ondokuzmayıs   Kadıköy
9      Rasimpaşa   Kadıköy
10   Merdivenköy   Kadıköy
11  Sahrayıcedit   Kadıköy
12       Erenköy   Kadıköy
13   Caddebostan   Kadıköy
14     Feneryolu   Kadıköy
15     Kozyatağı   Kadıköy
16        Eğitim   Kadıköy

In [51]:
#Converting Turkish characters to English characters
tr_chars = {'ç':'c', 'Ç':'C', 'ğ':'g', 'Ğ':'G', 'ı':'i', 'İ':'I', 'ö':'o', 'Ö':'O', 'ş':'s', 'Ş':'S', 'ü':'u', 'Ü':'U'}
bn.replace(tr_chars, regex=True, inplace=True) 
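
For single strings, the same character mapping can be applied without regular expressions. The sketch below uses `str.maketrans` and `str.translate` from the standard library. The helper name `to_ascii` is an assumption made for illustration, and the mapping is copied from the cell above.

```python
# Mapping copied from the cell above; str.maketrans accepts a dict whose
# keys are single characters.
tr_chars = {'ç': 'c', 'Ç': 'C', 'ğ': 'g', 'Ğ': 'G', 'ı': 'i', 'İ': 'I',
            'ö': 'o', 'Ö': 'O', 'ş': 's', 'Ş': 'S', 'ü': 'u', 'Ü': 'U'}
TR_TABLE = str.maketrans(tr_chars)


def to_ascii(text):
    """Replace Turkish-specific letters with their closest ASCII letters."""
    return text.translate(TR_TABLE)


for name in ["Beşiktaş", "Kadıköy", "Kozyatağı", "İstanbul"]:
    print(name, "->", to_ascii(name))
```

On a dataframe column, `Series.str.translate(TR_TABLE)` should give the same result as the regex-based `replace` used above.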



I'll use the Google Maps Geocoding API to get the coordinates of the districts.

In [52]:
#Adding a query column for the API by joining district name and town with a URL-encoded space (%20)
bn['API'] = bn.name.astype(str).str.cat(bn.town.astype(str), sep='%20')

bn.head()

| | name | town | API |
|---|---|---|---|
| 0 | Abbasaga | Besiktas | Abbasaga%20Besiktas |
| 1 | Turkali | Besiktas | Turkali%20Besiktas |
| 2 | Levent | Besiktas | Levent%20Besiktas |
| 3 | Gayrettepe | Besiktas | Gayrettepe%20Besiktas |
| 4 | Akatlar | Besiktas | Akatlar%20Besiktas |
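
Joining the parts with a literal `%20` works for these names, but the escaping can also be left to the standard library. The following is a minimal sketch, assuming the same geocoding endpoint as the next cell. `build_geocode_url` is a hypothetical helper and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(district, town, api_key, city="Istanbul"):
    """Build a geocoding URL and let urlencode handle the escaping."""
    query = {
        "address": f"{district} {town} {city}",
        "key": api_key,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{BASE_URL}?{urlencode(query, quote_via=quote)}"


print(build_geocode_url("Abbasaga", "Besiktas", "YOUR_API_KEY"))
```

Because `quote` percent-encodes non-ASCII characters as UTF-8, this approach also accepts the original Turkish spellings. Whether the geocoder returns the same coordinates for both spellings is not verified here.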


In [53]:

GOOGLE_API_KEY =  'YOUR_GOOGLE_API_KEY' # placeholder: supply your own key and never publish a real one

bk = bn.copy() #working on a copy so the original dataframe stays intact if something goes wrong

def extract_lat_long_via_address(address_or_zipcode):
    """Geocode an address with the Google Geocoding API; return (lat, lng) or (None, None)."""
    lat, lng = None, None
    api_key = GOOGLE_API_KEY
    base_url = "https://maps.googleapis.com/maps/api/geocode/json"
    # The endpoint embeds the API key, which is one more reason to restrict the key.
    endpoint = f"{base_url}?address={address_or_zipcode}%20Istanbul&key={api_key}"
    r = requests.get(endpoint)
    if r.status_code not in range(200, 300):  # accept any 2xx status
        return None, None
    try:
        # Invalid inputs come back without results; fall back to (None, None)
        # instead of writing handlers for every kind of response.
        results = r.json()['results'][0]
        lat = results['geometry']['location']['lat']
        lng = results['geometry']['location']['lng']
    except (KeyError, IndexError, ValueError):
        pass
    return lat, lng
    
def enrich_with_geocoding_api(row):
    column_name = 'API'
    address_value = row[column_name]
    address_lat, address_lng = extract_lat_long_via_address(address_value)
    row['lat'] = address_lat
    row['lng'] = address_lng
    return row

bk=bk.apply(enrich_with_geocoding_api, axis=1) # axis=1 is important to use the row itself
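
The parsing step can be separated from the network call so that it can be checked without an API key. This is a minimal sketch, assuming the response layout that the function above relies on (`results[0].geometry.location`). `parse_lat_lng` is a hypothetical helper and both payloads are hand-written examples.

```python
def parse_lat_lng(payload):
    """Return (lat, lng) from a geocoding response dict, or (None, None)."""
    results = payload.get("results") or []
    if not results:
        return None, None
    location = results[0].get("geometry", {}).get("location", {})
    return location.get("lat"), location.get("lng")


# Hand-written payloads that mimic the response shape used above.
found = {"results": [{"geometry": {"location": {"lat": 41.04801, "lng": 29.004528}}}]}
not_found = {"results": [], "status": "ZERO_RESULTS"}

print(parse_lat_lng(found))
print(parse_lat_lng(not_found))
```

Returning `(None, None)` for an empty result keeps the behaviour of the original function, so rows that fail to geocode show up as missing values in the dataframe.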

In [54]:
bk=bk.drop(['API'],axis=1)      #drop unnecessary columns

column_names=['District','Town','Lat','Lon']
bk.columns=column_names
bk

| | District | Town | Lat | Lon |
|---|---|---|---|---|
| 0 | Abbasaga | Besiktas | 41.04801 | 29.004528 |
| 1 | Turkali | Besiktas | 41.047652 | 29.001618 |
| 2 | Levent | Besiktas | 41.081829 | 29.018351 |
| 3 | Gayrettepe | Besiktas | 41.064196 | 29.006711 |
| 4 | Akatlar | Besiktas | 41.085614 | 29.025626 |
| 5 | Etiler | Besiktas | 41.087042 | 29.037264 |
| 6 | Ortakoy | Besiktas | 41.05397 | 29.027081 |
| 7 | Arnavutkoy | Besiktas | 41.068056 | 29.043056 |
| 8 | Bebek | Besiktas | 41.077744 | 29.041629 |
| 9 | Levazim | Besiktas | 41.063314 | 29.018351 |


In [55]:
#split dataframe into 2 dataframes by Towns
kadikoy_df = bk[bk['Town']=="Kadikoy"]
print (kadikoy_df)

besiktas_df =bk[bk['Town'] == "Besiktas"]
print (besiktas_df)

        District     Town        Lat        Lon
0       Kosuyolu  Kadikoy  41.009087  29.038719
1        Suadiye  Kadikoy  40.963081  29.083810
2       Bostanci  Kadikoy  40.958317  29.096898
3       Caferaga  Kadikoy  40.983720  29.025626
4        Goztepe  Kadikoy  40.977156  29.066357
5       Osmanaga  Kadikoy  40.991432  29.027081
6      Fikirtepe  Kadikoy  40.994303  29.050357
7     Fenerbahce  Kadikoy  40.974286  29.043083
8   Ondokuzmayis  Kadikoy  40.974626  29.088173
9      Rasimpasa  Kadikoy  40.996066  29.027081
10   Merdivenkoy  Kadikoy  40.986425  29.066357
11  Sahrayicedit  Kadikoy  40.983180  29.082355
12       Erenkoy  Kadikoy  40.973195  29.076538
13   Caddebostan  Kadikoy  40.967927  29.061993
14     Feneryolu  Kadikoy  40.981957  29.048902
15     Kozyatagi  Kadikoy  40.969147  29.095444
16        Egitim  Kadikoy  40.989668  29.050357
     District      Town        Lat        Lon
0    Abbasaga  Besiktas  41.048010  29.004528
1     Turkali  Besiktas  41.047652  29.00161

In [56]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
ist_address='Istanbul,TR'
geolocator=Nominatim(user_agent="city_explorer")

location=geolocator.geocode(ist_address)
ist_latitude = location.latitude
ist_longitude = location.longitude

print('The geographical coordinates of Istanbul are {}, {}.'.format(ist_latitude, ist_longitude))

The geographical coordinates of Istanbul are 41.0096334, 28.9651646.


In [57]:
besiktas_map=folium.Map(location=[ist_latitude,ist_longitude],zoom_start=12)
for name,lat,lng in zip(besiktas_df['District'],besiktas_df['Lat'],besiktas_df['Lon']):
    label='{}'.format(name)
    label=folium.Popup(label,parse_html=True)
    folium.CircleMarker(
        [lat,lng],
        radius = 7.5,
        popup=label,
        color='gray',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.75,
        parse_html=False).add_to(besiktas_map)
# add the tile layer once, outside the loop, instead of once per marker
# ('Mapbox Bright' may not be available in newer folium releases)
folium.TileLayer('Mapbox Bright').add_to(besiktas_map)
besiktas_map

In [58]:
kadikoy_map=folium.Map(location=[ist_latitude,ist_longitude],zoom_start=12)
for name,lat,lng in zip(kadikoy_df['District'],kadikoy_df['Lat'],kadikoy_df['Lon']):
    label='{}'.format(name)
    label=folium.Popup(label,parse_html=True)
    folium.CircleMarker(
        [lat,lng],
        radius = 7.5,
        popup=label,
        color='gray',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.75,
        parse_html=False).add_to(kadikoy_map)
# add the tile layer once, outside the loop, instead of once per marker
# ('Mapbox Bright' may not be available in newer folium releases)
folium.TileLayer('Mapbox Bright').add_to(kadikoy_map)
kadikoy_map

## Using Foursquare API to list the coffee shops in Kadikoy & Besiktas

In [59]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (placeholder: never publish real credentials)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604'
radius = 1000
LIMIT=10000

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET:YOUR_FOURSQUARE_CLIENT_SECRET


## Kadikoy

In [60]:
# Finding venues within a 1000 m radius of each district in Kadikoy

kadikoy_venues = []
for town, lat, long, name in zip(kadikoy_df['Town'], kadikoy_df['Lat'],kadikoy_df['Lon'], kadikoy_df['District']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius,
        LIMIT)
    
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for venue in results:
        kadikoy_venues.append((
            town,
            name,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
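
The loop above assumes that every venue has at least one category and that every response contains a group. Here is a minimal sketch of a more defensive extraction step that could serve both towns. `extract_venues` is a hypothetical helper, the payload layout is the one the loop above indexes into, and the sample payload is hand-built from this notebook's Kadikoy results, with one category list emptied on purpose to exercise the fallback.

```python
def extract_venues(payload, town, district, lat, lng):
    """Flatten one explore response into rows shaped like the loop above.

    Assumes the layout used in this notebook:
    payload['response']['groups'][0]['items'][i]['venue'].
    """
    groups = payload.get("response", {}).get("groups") or []
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # A venue without categories is kept with a None label.
        category = categories[0]["name"] if categories else None
        rows.append((town, district, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows


# Hand-built sample; the second venue's categories are emptied deliberately.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Starbucks Reserve",
               "location": {"lat": 41.009617, "lng": 29.040705},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "Kaen Sushi",
               "location": {"lat": 41.007456, "lng": 29.038368},
               "categories": []}},
]}]}}

for row in extract_venues(sample, "Kadikoy", "Kosuyolu", 41.009087, 29.038719):
    print(row)
```

With a helper like this, the Kadikoy and Besiktas loops would differ only in the dataframe they iterate over.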

In [61]:
kadikoy_venues=pd.DataFrame(kadikoy_venues)
kadikoy_venues.columns = ['Town','DistrictName', 'Latitude','Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']
kadikoy_venues.head()

| | Town | DistrictName | Latitude | Longitude | VenueName | VenueLatitude | VenueLongitude | VenueCategory |
|---|---|---|---|---|---|---|---|---|
| 0 | Kadikoy | Kosuyolu | 41.009087 | 29.038719 | Aytunç Bentürk Dance Academy | 41.008095 | 29.039437 | Dance Studio |
| 1 | Kadikoy | Kosuyolu | 41.009087 | 29.038719 | Kaen Sushi | 41.007456 | 29.038368 | Sushi Restaurant |
| 2 | Kadikoy | Kosuyolu | 41.009087 | 29.038719 | Starbucks Reserve | 41.009617 | 29.040705 | Coffee Shop |
| 3 | Kadikoy | Kosuyolu | 41.009087 | 29.038719 | Day & Night Shisha \| Pub | 41.00847 | 29.039018 | Pub |
| 4 | Kadikoy | Kosuyolu | 41.009087 | 29.038719 | Cara Cafe&Lounge | 41.008746 | 29.039124 | Café |


In [62]:

import re #needed for the case-insensitive match below
kadikoy_coffee = kadikoy_venues[kadikoy_venues['VenueCategory'].str.contains('coffee', flags = re.IGNORECASE)]   #keep categories that contain "coffee"
kadikoy_burger = kadikoy_venues[kadikoy_venues['VenueCategory'].str.contains('burger', flags = re.IGNORECASE)]   #keep categories that contain "burger"

kadikoy_coffee=kadikoy_coffee.drop_duplicates(subset=['VenueLatitude', 'VenueLongitude']) #drop venues that are listed under two districts but share the same coordinates
kadikoy_burger=kadikoy_burger.drop_duplicates(subset=['VenueLatitude', 'VenueLongitude']) #drop venues that are listed under two districts but share the same coordinates


print('There are {} coffee shops and {} burger joints in Kadikoy.'.format(kadikoy_coffee.shape[0],kadikoy_burger.shape[0]))



There are 87 coffee shops and 11 burger joints in Kadikoy.
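
Duplicates appear because neighbouring district centres can be closer together than twice the search radius, so their search circles overlap. The sketch below checks this for two Kadikoy districts with the haversine formula. The coordinates are taken from the geocoding output above, while the helper name and the Earth-radius constant are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# District centres taken from the geocoding output earlier in the notebook.
osmanaga = (40.991432, 29.027081)
rasimpasa = (40.996066, 29.027081)
radius = 1000  # metres, as used in the Foursquare queries

distance = haversine_m(*osmanaga, *rasimpasa)
# Two search circles overlap when their centres are closer than 2 * radius.
print(f"{distance:.0f} m apart, overlap: {distance < 2 * radius}")
```

Dropping duplicates on the venue coordinates, as done above, is a simple way to handle this overlap. Two distinct venues at exactly the same coordinates would also be merged by it.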


## Besiktas

###### Finding venues within a 1000 m radius of each district in Besiktas

besiktas_venues = []
for town, lat, long, name in zip( besiktas_df['Town'],besiktas_df['Lat'],besiktas_df['Lon'], besiktas_df['District']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius,
        LIMIT)
    
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for venue in results:
        besiktas_venues.append((
            town,
            name,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))

In [63]:
besiktas_venues=pd.DataFrame(besiktas_venues)
besiktas_venues.columns = ['Town','DistrictName', 'Latitude','Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

besiktas_coffee = besiktas_venues[besiktas_venues['VenueCategory'].str.contains('coffee', flags = re.IGNORECASE)]   #keep categories that contain "coffee"
besiktas_burger = besiktas_venues[besiktas_venues['VenueCategory'].str.contains('burger', flags = re.IGNORECASE)]   #keep categories that contain "burger"

besiktas_coffee=besiktas_coffee.drop_duplicates(subset=['VenueLatitude', 'VenueLongitude']) #drop venues that are listed under two districts but share the same coordinates
besiktas_burger=besiktas_burger.drop_duplicates(subset=['VenueLatitude', 'VenueLongitude']) #drop venues that are listed under two districts but share the same coordinates

print('There are {} coffee shops and {} burger joints in Besiktas.'.format(besiktas_coffee.shape[0],besiktas_burger.shape[0]))


There are 66 coffee shops and 9 burger joints in Besiktas.
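
The two towns were queried through a different number of districts (17 for Kadikoy and 10 for Besiktas in the list returned by the API), so the raw totals are not directly comparable. As a rough sketch only, the counts can be divided by the number of queried districts. All figures are copied from the outputs above, and because the search circles overlap, the result is an indication rather than a true density.

```python
# Counts and district totals copied from the outputs in this notebook.
counts = {
    "Kadikoy": {"districts": 17, "coffee": 87, "burger": 11},
    "Besiktas": {"districts": 10, "coffee": 66, "burger": 9},
}

for town, c in counts.items():
    coffee_rate = c["coffee"] / c["districts"]
    burger_rate = c["burger"] / c["districts"]
    print(f"{town}: {coffee_rate:.2f} coffee shops and "
          f"{burger_rate:.2f} burger joints per queried district")
```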
