# Capstone Project - The Battle of Neighborhoods

## Prospects of a Lunch Restaurant in Seoul, Korea.

### 1. Introduction/Business Problem

A friend of mine wants to open a lunch restaurant in Seoul and asked me for help.

To help him, I analyze the districts of Seoul and weigh three options:
+ Open a restaurant near major office buildings
+ Open a fast food restaurant near transit stations
+ Open a restaurant in an area with few restaurants, to avoid competition

Target audience:
+ People who, like my friend, want to open a restaurant (or perhaps a cafe) and want to see the pros and cons of each location.
+ Tourists looking for restaurants in Seoul.
+ Anyone who wants to see a worked example of a data science project.

### 2. Data

I scrape the table on the https://en.wikipedia.org/wiki/List_of_districts_of_Seoul page to create a data frame.

After that, I get the coordinates of each district with the Geopy client and prepare the data.

I first mark the district locations on a map and then use Foursquare venue data for the subsequent analysis.

In [1]:
import sys
import requests
import json

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors


import io
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np

from sklearn.cluster import KMeans

from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="Seoul_explorer", timeout = 10)

**Using BeautifulSoup to find the table**

In [2]:
response_obj = requests.get('https://en.wikipedia.org/wiki/List_of_districts_of_Seoul').text
soup = BeautifulSoup(response_obj,'lxml')
Districts_Seoul_Table = soup.find('table', {'class':'wikitable sortable'})
print(Districts_Seoul_Table.tr.text)


Name
Population
Area
Population density



**Saving the data I need**

In [3]:
Name = []
Population = []
Area = []
Popdensity = []

# Walk the table row by row; rows without <td> cells (such as the header) add nothing
for tr in Districts_Seoul_Table.find_all('tr'):
    for i, td in enumerate(tr.find_all('td')):
        value = td.text.strip()  # remove the newline that ends each cell's text
        if i == 0:
            Name.append(value)
        elif i == 1:
            Population.append(value)
        elif i == 2:
            Area.append(value)
        elif i == 3:
            Popdensity.append(value)

# Save the scraped columns so later cells can reload them without scraping again
df = pd.DataFrame({"Name": Name, "Population": Population, "Area": Area, "Population_density": Popdensity})
df.to_csv('Seoul.csv', index=False)

In [4]:
import pandas as pd
df = pd.read_csv('Seoul.csv')
df.head()

| | Name | Population | Area | Population_density |
|---|---|---|---|---|
| 0 | Dobong-gu (도봉구; 道峰區) | 355712 | 20.70 km² | 17184/km² |
| 1 | Dongdaemun-gu (동대문구; 東大門區) | 376319 | 14.21 km² | 26483/km² |
| 2 | Dongjak-gu (동작구; 銅雀區) | 419261 | 16.35 km² | 25643/km² |
| 3 | Eunpyeong-gu (은평구; 恩平區) | 503243 | 29.70 km² | 16944/km² |
| 4 | Gangbuk-gu (강북구; 江北區) | 338410 | 23.60 km² | 14339/km² |
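
The `Area` and `Population_density` columns in the preview above are still strings with units attached (`20.70 km²`, `17184/km²`), so they cannot be compared numerically as they stand. A minimal sketch of one way to convert them, assuming the formats shown in the preview hold for every row; the helper names `parse_area_km2` and `parse_density` are hypothetical and not part of the original notebook:

```python
import re


def parse_area_km2(text):
    """Turn a string such as '20.70 km²' into the float 20.7."""
    match = re.match(r'\s*([\d.,]+)', text)
    if match is None:
        raise ValueError('no number found in {!r}'.format(text))
    return float(match.group(1).replace(',', ''))


def parse_density(text):
    """Turn a string such as '17184/km²' into the int 17184."""
    match = re.match(r'\s*([\d,]+)', text)
    if match is None:
        raise ValueError('no number found in {!r}'.format(text))
    return int(match.group(1).replace(',', ''))


# Values copied from the first row of the preview (Dobong-gu)
area = parse_area_km2('20.70 km²')
density = parse_density('17184/km²')
population = 355712

# Sanity check: population / area should land close to the listed density
print(area, density, round(population / area))
```

With pandas, the same helpers could be applied column-wise, for example `df['Area'].map(parse_area_km2)`.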


**Dropping the Korean and Hanja names from the table**

In [5]:
# Keep only the romanized name: everything before the first space
df['Name'] = df['Name'].str.split(' ', n=1).str[0]
df.head()

| | Name | Population | Area | Population_density |
|---|---|---|---|---|
| 0 | Dobong-gu | 355712 | 20.70 km² | 17184/km² |
| 1 | Dongdaemun-gu | 376319 | 14.21 km² | 26483/km² |
| 2 | Dongjak-gu | 419261 | 16.35 km² | 25643/km² |
| 3 | Eunpyeong-gu | 503243 | 29.70 km² | 16944/km² |
| 4 | Gangbuk-gu | 338410 | 23.60 km² | 14339/km² |
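
Splitting on spaces relies on each romanized name containing no spaces of its own. A slightly more defensive sketch, assuming the native-script names always sit in one trailing pair of parentheses, strips that parenthetical with a regular expression instead; `strip_native_name` is a hypothetical helper, not part of the original notebook:

```python
import re


def strip_native_name(name):
    """Remove a trailing parenthesised part such as ' (도봉구; 道峰區)'."""
    return re.sub(r'\s*\([^)]*\)\s*$', '', name).strip()


names = [
    'Dobong-gu (도봉구; 道峰區)',
    'Dongdaemun-gu (동대문구; 東大門區)',
    'Gangbuk-gu',  # already clean, so it is left unchanged
]
cleaned = [strip_native_name(n) for n in names]
print(cleaned)
```

Applied to the data frame, this would read `df['Name'] = df['Name'].map(strip_native_name)`.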


**Getting the coordinates of the districts with the Geopy client and saving them**

In [6]:
Latitude = []
Longitude = []

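# Look up each district by name. geocode() returns None when nothing matches,
# in which case the attribute access below raises an AttributeError.
for name in df['Name']:
    location = geolocator.geocode(name)
    Latitude.append(location.latitude)
    Longitude.append(location.longitude)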
    
df['Latitude'] = Latitude
df['Longitude'] = Longitude
df.head()

df.to_csv('Seoul_co.csv', index = False)
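
Nominatim's `geocode` returns `None` when it finds no match, and a bare district name can also match a place outside Seoul. The sketch below shows one way to guard against both, assuming a city suffix such as `', Seoul, South Korea'` is acceptable for the query. `geocode_districts` and the stub `fake_geocode` are hypothetical names, used here so the example runs without network access:

```python
from collections import namedtuple

# Stand-in for the object geopy returns; only the two attributes used here
Location = namedtuple('Location', ['latitude', 'longitude'])


def geocode_districts(names, geocode, suffix=', Seoul, South Korea'):
    """Return {name: (lat, lng)}; names with no match map to (None, None)."""
    coords = {}
    for name in names:
        location = geocode(name + suffix)
        if location is None:
            coords[name] = (None, None)
        else:
            coords[name] = (location.latitude, location.longitude)
    return coords


def fake_geocode(query):
    """Offline stub that knows a single district and returns None otherwise."""
    known = {'Dobong-gu, Seoul, South Korea': Location(37.6686, 127.0466)}
    return known.get(query)


result = geocode_districts(['Dobong-gu', 'Nowhere-gu'], fake_geocode)
print(result)
```

In the notebook, `geolocator.geocode` would be passed in place of `fake_geocode`, and rows that come back as `(None, None)` could be inspected by hand before mapping.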

**Using Foursquare Location Data:**

In [7]:
df = pd.read_csv('Seoul_co.csv')

import folium
address = 'Seoul'

Seloc = geolocator.geocode(address)
Seoul_latitude = Seloc.latitude
Seoul_longitude = Seloc.longitude
print('The geographical coordinates of Seoul are {}, {}.'.format(Seoul_latitude, Seoul_longitude))

The geographical coordinates of Seoul are 37.564982549999996, 126.93921080358436.


In [8]:
map_seoul = folium.Map(location=[Seoul_latitude, Seoul_longitude], zoom_start=11)
# add markers to map

for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['Name']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=9,
        popup=label,
        color='magenta',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_seoul)  

map_seoul

### Foursquare ID

In [10]:
radius=1000
LIMIT=100

In [11]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    # CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT are read from the notebook's global scope
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # keep only the relevant fields of each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
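
The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` if a venue came back with an empty category list. The sketch below isolates that flattening step and falls back to `None` in that case. It assumes the response shape used above (`response`, then `groups[0]`, then `items`); `flatten_items` and the sample payload are hypothetical and made up for illustration, so the example runs without API credentials:

```python
def flatten_items(name, lat, lng, payload):
    """Turn one explore-style response into a list of row tuples."""
    items = payload['response']['groups'][0]['items']
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Use the first category name when there is one, otherwise None
        category = categories[0]['name'] if categories else None
        rows.append((
            name, lat, lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category,
        ))
    return rows


# Hypothetical payload: one categorised venue and one without categories
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Sample Cafe',
               'location': {'lat': 37.667, 'lng': 127.045},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Unlabelled Place',
               'location': {'lat': 37.668, 'lng': 127.046},
               'categories': []}},
]}]}}

rows = flatten_items('Dobong-gu', 37.6686, 127.0466, sample)
for row in rows:
    print(row)
```

Each tuple has the same seven fields, in the same order, as the columns assigned to `nearby_venues`.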

In [12]:
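# Note: radius is not passed here, so the function default of 500 applies,
# not the global radius=1000 set earlier
Seoul_venues = getNearbyVenues(names=df['Name'], latitudes=df['Latitude'], longitudes=df['Longitude'])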

Dobong-gu
Dongdaemun-gu
Dongjak-gu
Eunpyeong-gu
Gangbuk-gu
Gangdong-gu
Gangnam-gu
Gangseo-gu
Geumcheon-gu
Guro-gu
Gwanak-gu
Gwangjin-gu
Jongno-gu
Jung-gu
Jungnang-gu
Mapo-gu
Nowon-gu
Seocho-gu
Seodaemun-gu
Seongbuk-gu
Seongdong-gu
Songpa-gu
Yangcheon-gu
Yeongdeungpo-gu
Yongsan-gu
Seoul


In [13]:
print(Seoul_venues.shape)
Seoul_venues.head()

(754, 7)


| | Neighbourhood | Neighbourhood Latitude | Neighbourhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Dobong-gu | 37.6686 | 127.0466 | 맥도날드 (McDonald's) (맥도날드) | 37.670196 | 127.043726 | Fast Food Restaurant |
| 1 | Dobong-gu | 37.6686 | 127.0466 | WAGEN COFFEE | 37.666922 | 127.045057 | Café |
| 2 | Dobong-gu | 37.6686 | 127.0466 | Dunkin' | 37.668252 | 127.046433 | Donut Shop |
| 3 | Dobong-gu | 37.6686 | 127.0466 | Baskin-Robbins | 37.666314 | 127.046257 | Ice Cream Shop |
| 4 | Dobong-gu | 37.6686 | 127.0466 | VIC Market (빅마켓) | 37.667676 | 127.045963 | Big Box Store |
