#### Capstone Project - The Battle of Neighborhoods

##### **Disclaimer: The information and results in this work are simulated for the capstone project only and do not reflect any actual situation in the area.**

# Pre-feasibility study: Power plant location selection in southern Thailand

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

According to the latest Power Development Plan, a new power plant must be constructed to meet the increasing electricity demand in southern Thailand. Earlier stages of the study concluded that the new plant will be a combined-cycle power plant fired by natural gas and will be located within **Songkhla and Phatthalung provinces**.
The operation of a fossil-fired power plant has both environmental and social effects. The first priority at this stage is therefore to find a site with minimal effect on nearby communities.

**The aim of this study is to find the optimal power plant location, the one that minimizes the impact on nearby residents.**

## Data <a name="data"></a>

This study uses the following data:
* Latitude and longitude of each sub-district in the area, from [[1]]

* Number and types of venues in the area, from Foursquare.

[1]:  https://github.com/codesanook/thailand-administrative-division-province-district-subdistrict-sql?fbclid=IwAR02dmKnvJ9cEx1UXHucR60DSP8vbVukXQEdgg7R4nc-iQ5FbhTuyqVfeCg

## Methodology <a name="methodology"></a>

The methodology is as follows:
* Collect the latitude and longitude coordinates of each sub-district from [1]
* Plot the locations with Folium
* Collect the venues in the area from Foursquare
* Cluster the sub-districts by venue type with K-means clustering
* Calculate distances from the latitude and longitude coordinates

## Analysis <a name="analysis"></a>

### 1. Collect the sub-district latitude and longitude coordinates

In [1]:
#import libraries
import requests
import pandas as pd
import csv
import urllib.request
from bs4 import BeautifulSoup
import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas import json_normalize # transform JSON data into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library


In [2]:
s_province = pd.read_csv("Southern of Thailand 2.csv")
s_province[:12]

Unnamed: 0,TA_ID,TAMBON,AM_ID,AMPHOE,Province,Latitude,Longitude
0,900803,Choeng Sae,9008,Krasae Sin,Songkhla,7.561,100.356
1,901107,Khlong U Ta Phao,9011,Hat Yai,Songkhla,7.045,100.446
2,900703,Takhria,9007,Ranot,Songkhla,7.789,100.269
3,900705,Ban Mai,9007,Ranot,Songkhla,7.78,100.288
4,900106,Ko Yo,9001,Muaeng Songkhla,Songkhla,7.163,100.542
5,900204,Di Luang,9002,Sathing Phra,Songkhla,7.58,100.406
6,900804,Krasae Sin,9008,Krasae Sin,Songkhla,7.609,100.307
7,900203,Sanam Chai,9002,Sathing Phra,Songkhla,7.547,100.417
8,901509,Hua Khao,9015,Singhanakhon,Songkhla,7.214,100.567
9,900209,Wat Chan,9002,Sathing Phra,Songkhla,7.382,100.461


### 2. Plot sub-districts on map

In [3]:
address = 'Songkhla'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of the possible sites are {}, {}.'.format(latitude, longitude))

The geographical coordinates of the possible sites are 6.8790221, 100.5498542.
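
The geocoded point is only used to center the maps. As a sketch of an alternative that needs no network call, the center can be taken as the mean of the sub-district coordinates. The sample below uses the first three rows of the table above, and the helper name `map_center` is hypothetical.

```python
from statistics import mean

def map_center(latitudes, longitudes):
    """Return the (latitude, longitude) centroid of the given points."""
    return mean(latitudes), mean(longitudes)

# first three sub-districts from the table above
sample_lats = [7.561, 7.045, 7.789]
sample_lons = [100.356, 100.446, 100.269]

center_lat, center_lon = map_center(sample_lats, sample_lons)
print(round(center_lat, 3), round(center_lon, 3))
```

On the full table, the two arguments would be `s_province['Latitude']` and `s_province['Longitude']`.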


In [4]:
map_th = folium.Map(location = [latitude, longitude], zoom_start = 9)
for lat, lng, tambon in zip(s_province['Latitude'], s_province['Longitude'], s_province['TAMBON']):
    label = '{}'.format(tambon)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_th)  
    
map_th

### 3. Acquire venues in the area via Foursquare

In [5]:

import os

# Read the credentials from environment variables instead of hardcoding them in the notebook.
# The variable names FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are assumed for this example.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials are loaded.')

Your credentials are loaded.


In [6]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=200):
    """Collect the Foursquare venues within `radius` meters of each location."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            limit)

        # make the GET request and keep only the list of recommended items
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-location lists into one table
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues
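
The chained lookup `["response"]['groups'][0]['items']` in `getNearbyVenues` raises an exception if a response has no groups, and `v['venue']['categories'][0]` fails for a venue without a category. A minimal sketch of a more defensive parser follows. `parse_venues` is a hypothetical helper, and the sample payload is hand-written to mimic the nested structure the function above reads; it is not real API output.

```python
def parse_venues(payload):
    """Extract (name, lat, lng, category) tuples from an explore-style response dict."""
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []  # no recommendations for this location

    rows = []
    for item in groups[0].get('items', []):
        venue = item['venue']
        categories = venue.get('categories', [])
        # fall back to a placeholder when the venue has no category
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# hand-written sample payload (illustrative only)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Venue A', 'location': {'lat': 7.0, 'lng': 100.4},
               'categories': [{'name': 'Hotel'}]}},
    {'venue': {'name': 'Venue B', 'location': {'lat': 7.1, 'lng': 100.5},
               'categories': []}},
]}]}}

print(parse_venues(sample))
print(parse_venues({'response': {}}))
```

Returning an empty list for a location without results lets the calling loop continue with the remaining sub-districts instead of stopping at the first empty response.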

In [7]:
th_venues = getNearbyVenues(names = s_province['TAMBON'], latitudes = s_province['Latitude'], longitudes = s_province['Longitude'])

Choeng Sae
Khlong U Ta Phao
Takhria
Ban Mai
Ko Yo
Di Luang
Krasae Sin
Sanam Chai
Hua Khao
Wat Chan
Kradang Nga
Bodan
Bodaeng
Khun Tat Wai
Cha Thing Phra
Ban Han
Hua Khao
Krasae Sin
Chalae
Wat Khanun
Pak Trae
Ram Daeng
Bo Yang
Mae Thom
Ra Wa
Krasae Sin
Na Mosri
Botru
Phang Yang
Tham Nop
Chum Phon
Pak Ro
Wat Son
Hat Yai
Ching Kho
Pa Ching
Bang Klam
Chang
Khlong Daen
Muang Ngam
Choeng Sae
Na Mom
Khlong Hae
Nam Khao
Rong
Choeng Sae
Bang Khiat
Ko Yo
Koyai
Tha Hin
Ban Khao
Thung Yai
Sa Kom
Huai Luek
Ra Not
Takhria
Daen Sa Nguan
Khao Rupchang
Tha Kham
Ko Taeo
Tha Pradu
Khuan Ru
Kho Hong
Khlong Rang
Thung Lan
Khok Muang
Thung Khamin
Thung Wang
Sadao
Khu
Khu Khut
Pa Khat
Taling Chan
Plak Nu
Phichit
Sathing Mo
Ban Mai
Tha Bon
Ban Na
Khlong Re
Phang La
Na Thap
Khlong Pia
Khu Ha Tai
Thepha
Cha Nong
Ra Not
Khu Tao
Nam Noi
Pha Wong
Khuan Lang
Pak Bang
Khuan So
Tha Pho
Saphan Mai Kaen
Khae
Ban Not
Tha Mo Sai
Rong
Cha Nae
Than Khiri
Khao Mikiat
Ban Khao
Ratta Phum
Pian
Na Wa
Sa Ba Yoi
Khlong La
Ban Ph

In [93]:
#th_venues.to_csv('songkhla.csv', index=False)

In [8]:
th_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bo Yang,12,12,12,12,12,12
Cha Thing Phra,2,2,2,2,2,2
Chalung,1,1,1,1,1,1
Ching Kho,1,1,1,1,1,1
Di Luang,1,1,1,1,1,1
Han Thao,1,1,1,1,1,1
Hat Yai,69,69,69,69,69,69
Hua Khao,1,1,1,1,1,1
Khao Chiak,1,1,1,1,1,1
Khao Rupchang,4,4,4,4,4,4


### 4. Categorize the types of venues in the area

In [9]:
th_onehot = pd.get_dummies(th_venues[['Venue Category']], prefix="", prefix_sep="")
#TN_onehot.drop(['Neighborhood'], axis = 1, inplace = True)
th_onehot['Neighborhood'] = th_venues['Neighborhood']

fixed_columns = [th_onehot.columns[-1]]+ list(th_onehot.columns[:-1])
th_onehot = th_onehot[fixed_columns]

th_grouped = th_onehot.groupby('Neighborhood').mean().reset_index()

th_grouped

Unnamed: 0,Neighborhood,Airport Service,Arcade,Arts & Crafts Store,Asian Restaurant,Bakery,Basketball Court,Beach,Bookstore,Breakfast Spot,...,Som Tum Restaurant,Spa,Stables,Steakhouse,Tea Room,Thai Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant
0,Bo Yang,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Cha Thing Phra,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Chalung,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
3,Ching Kho,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
4,Di Luang,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Han Thao,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Hat Yai,0.0,0.0,0.0,0.014493,0.0,0.014493,0.0,0.014493,0.0,...,0.014493,0.014493,0.0,0.014493,0.028986,0.072464,0.0,0.0,0.0,0.014493
7,Hua Khao,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Khao Chiak,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Khao Rupchang,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
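
Taking the mean of one-hot columns per neighborhood gives the share of each venue category among that neighborhood's venues. The sketch below computes the same quantity with the standard library on a made-up venue list, to make the meaning of the table values explicit. The data and the helper name `category_shares` are illustrative assumptions.

```python
from collections import Counter, defaultdict

def category_shares(venues):
    """Map each neighborhood to {category: share of its venues}."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    return {
        hood: {cat: n / sum(cats.values()) for cat, n in cats.items()}
        for hood, cats in counts.items()
    }

# made-up (neighborhood, venue category) pairs
venues = [
    ('A', 'Coffee Shop'),
    ('A', 'Arts & Crafts Store'),
    ('B', 'Trail'),
]

shares = category_shares(venues)
print(shares['A'])  # each of the two categories accounts for half of A's venues
print(shares['B'])
```

A value of 0.5, as for Cha Thing Phra in the table above, therefore means that half of the venues found there belong to that category.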


In [96]:
#th_grouped.to_csv('songkhla_thgroup.csv', index=False)

In [10]:
import numpy as np

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # ranks beyond the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = th_grouped['Neighborhood']

for ind in np.arange(th_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(th_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted[0:40]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bo Yang,Noodle House,Snack Place,Japanese Restaurant,Hotel,Restaurant,Diner,Food,Café,Flea Market,Dim Sum Restaurant
1,Cha Thing Phra,Coffee Shop,Arts & Crafts Store,Vegetarian / Vegan Restaurant,Flower Shop,Dim Sum Restaurant,Diner,Electronics Store,Fast Food Restaurant,Flea Market,Food
2,Chalung,Trail,Vegetarian / Vegan Restaurant,Flower Shop,Dessert Shop,Dim Sum Restaurant,Diner,Electronics Store,Fast Food Restaurant,Flea Market,Food
3,Ching Kho,Thai Restaurant,Vegetarian / Vegan Restaurant,Flower Shop,Dessert Shop,Dim Sum Restaurant,Diner,Electronics Store,Fast Food Restaurant,Flea Market,Food
4,Di Luang,Seafood Restaurant,Vegetarian / Vegan Restaurant,Flea Market,Dessert Shop,Dim Sum Restaurant,Diner,Electronics Store,Fast Food Restaurant,Flower Shop,Convenience Store
5,Han Thao,Coffee Shop,Vegetarian / Vegan Restaurant,Department Store,Ice Cream Shop,Hotel,Halal Restaurant,Food Truck,Food & Drink Shop,Food,Flower Shop
6,Hat Yai,Hotel,Noodle House,Convenience Store,Thai Restaurant,Café,Seafood Restaurant,Tea Room,Pub,Chinese Restaurant,Coffee Shop
7,Hua Khao,Mountain,Vegetarian / Vegan Restaurant,Department Store,Ice Cream Shop,Hotel,Halal Restaurant,Food Truck,Food & Drink Shop,Food,Flower Shop
8,Khao Chiak,Airport Service,Japanese Restaurant,Ice Cream Shop,Hotel,Halal Restaurant,Food Truck,Food & Drink Shop,Food,Flower Shop,Flea Market
9,Khao Rupchang,Resort,Lake,Observatory,Flea Market,Dessert Shop,Dim Sum Restaurant,Diner,Electronics Store,Fast Food Restaurant,Vegetarian / Vegan Restaurant
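
Sub-districts with only one or two venues still get ten "most common" entries in the table above, because categories with a frequency of zero are ranked as well. A hedged sketch of a ranking that keeps only the categories actually observed is shown below. `top_observed` is a hypothetical helper and the frequencies are made up.

```python
def top_observed(frequencies, n):
    """Return up to n category names with non-zero frequency, most frequent first."""
    observed = [(cat, freq) for cat, freq in frequencies.items() if freq > 0]
    # sort by descending frequency, then by name so that ties are deterministic
    observed.sort(key=lambda pair: (-pair[1], pair[0]))
    return [cat for cat, _ in observed[:n]]

# made-up frequency row for one sub-district
row = {'Coffee Shop': 0.5, 'Arts & Crafts Store': 0.5, 'Hotel': 0.0, 'Trail': 0.0}

print(top_observed(row, 10))
```

With this variant a sub-district's list can be shorter than ten entries, which makes sparse sub-districts easier to spot.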


### 5. Cluster the sub-districts into 6 types by K-means clustering

In [11]:
# set number of clusters
kclusters = 6

th_grouped_clustering = th_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(th_grouped_clustering) 
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

th_merged = s_province

neighborhoods_venues_sorted.rename(columns = {'Neighborhood': 'TAMBON'}, inplace = True)
# merge the clustered venue table with the sub-district table to add latitude/longitude for each sub-district

th_merged = th_merged.join(neighborhoods_venues_sorted.set_index('TAMBON'), on='TAMBON')
#th_merged = pd.merge(th_merged, )
#df2 = pd.merge(df,neighborhoods_venues_sorted, how='inner', on = '')

th_merged # check the last columns!

Unnamed: 0,TA_ID,TAMBON,AM_ID,AMPHOE,Province,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,900803,Choeng Sae,9008,Krasae Sin,Songkhla,7.561,100.356,,,,,,,,,,,
1,901107,Khlong U Ta Phao,9011,Hat Yai,Songkhla,7.045,100.446,1.0,Middle Eastern Restaurant,Vegetarian / Vegan Restaurant,Department Store,Ice Cream Shop,Hotel,Halal Restaurant,Food Truck,Food & Drink Shop,Food,Flower Shop
2,900703,Takhria,9007,Ranot,Songkhla,7.789,100.269,,,,,,,,,,,
3,900705,Ban Mai,9007,Ranot,Songkhla,7.780,100.288,,,,,,,,,,,
4,900106,Ko Yo,9001,Muaeng Songkhla,Songkhla,7.163,100.542,0.0,Seafood Restaurant,Vegetarian / Vegan Restaurant,Flea Market,Dessert Shop,Dim Sum Restaurant,Diner,Electronics Store,Fast Food Restaurant,Flower Shop,Convenience Store
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
198,930109,Lam Pam,9301,Mueang Phatthalung,Phatthalung,7.648,100.188,,,,,,,,,,,
199,930203,Khlong Cha Loem,9302,Kong Ra,Phatthalung,7.349,99.958,,,,,,,,,,,
200,930604,Ko Mak,9306,Pak Phayun,Phatthalung,7.435,100.329,,,,,,,,,,,
201,930402,Tanot,9304,Tamot,Phatthalung,7.278,100.007,,,,,,,,,,,


In [12]:
th_merged.dropna(subset = ['Cluster Labels'], axis = 0, inplace = True)
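
The number of clusters is fixed at `kclusters = 6` above. One common way to check such a choice is to compare silhouette scores over a range of `k`. The sketch below does this on synthetic data, since the venue table is not reproduced here. The blob data, the range of `k`, and the helper name `best_k_by_silhouette` are assumptions for illustration only and are not part of the original analysis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k_by_silhouette(X, k_values):
    """Return (best_k, scores) where scores maps each k to its silhouette score."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores

# synthetic data: three well-separated blobs of 20 points each
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + 0.5 * rng.randn(20, 2) for c in centers])

best_k, scores = best_k_by_silhouette(X, range(2, 7))
print(best_k)
```

On the real data, `X` would be `th_grouped_clustering`, and the scores would only be one input to the choice of `k`.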

### 6. Plot the clustered sub-districts on the map

In [13]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=7)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(th_merged['Latitude'], th_merged['Longitude'], th_merged['Province'], th_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    clu = int(cluster)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[clu-1],
        fill=True,
        fill_color=rainbow[clu-1],
        fill_opacity=0.7).add_to(map_clusters)
map_clusters

The spots shown on the map are the clustered sub-districts. Sub-districts for which Foursquare returned no venues receive no cluster label and are dropped, so the clustering step also filters out sub-districts without recorded venues. The spots shown are therefore the least suitable locations for the new power plant, since they have many communities nearby.
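
The marker colors are looked up with `rainbow[clu-1]`, so cluster 0 maps to index -1, which Python resolves to the last color in the list. Each cluster still gets a distinct color, but the mapping is shifted by one position. The sketch below shows both lookups on a hand-written palette; the palette values and the label list are illustrative.

```python
palette = ['#ff0000', '#00ff00', '#0000ff']  # illustrative colors for 3 clusters
labels = [0, 1, 2]

# lookup as written in the notebook: cluster 0 wraps around to the last entry
shifted = {clu: palette[clu - 1] for clu in labels}

# direct lookup: cluster i uses the i-th color
direct = {clu: palette[clu] for clu in labels}

print(shifted)
print(direct)
```

Either lookup works for plotting; the direct one is easier to reason about when a legend has to match the cluster numbers.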

In [411]:
#th_merged.to_csv("Songkhla_th_merged.csv", index = False)

In [14]:
map_all = folium.Map(location=[latitude, longitude], zoom_start=9)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
rainbow2 = ['#ff0000', '#ff6200', '#f74d7d', '#ff9a2e', '#cb2eff', '#fffc2e']

# add markers to the map
markers_colors = []
for lat, lng, tambon in zip(s_province['Latitude'], s_province['Longitude'], s_province['TAMBON']):
    label = '{}'.format(tambon)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='#0090b8',
        fill=True,
        fill_color='#0090b8',
        fill_opacity=0.7,
        parse_html=False).add_to(map_all)  

for lat, lon, poi, cluster in zip(th_merged['Latitude'], th_merged['Longitude'], th_merged['Province'], th_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    clu = int(cluster)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow2[clu-1],
        fill=True,
        fill_color=rainbow2[clu-1],
        fill_opacity=0.7).add_to(map_all)

    
map_all

### 7. Find the optimal sub-districts located far from communities

#### 7.1 Calculate the distance between sub-districts from latitude and longitude coordinates
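
The function below appears to implement the haversine formula, which the Conclusion also names. For two points with latitudes $\varphi_1, \varphi_2$ and longitudes $\lambda_1, \lambda_2$ in radians, and an Earth radius $R$, the distance $d$ is:

$$
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1 \cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right)
$$

$$
c = 2\,\operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right), \qquad d = R\,c
$$

The symbols $a$, $c$ and $R$ correspond to the variables `a`, `c` and `R` in the code.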

In [15]:
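from math import sin, cos, sqrt, atan2, radians

def get_distance(Lat1, Lon1, Lat2, Lon2):
    """Return the great-circle distance in kilometers between two points given in degrees."""
    R = 6373.0  # approximate radius of the Earth in km

    # convert degrees to radians
    lat1 = radians(Lat1)
    lon1 = radians(Lon1)
    lat2 = radians(Lat2)
    lon2 = radians(Lon2)

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # haversine formula
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    distance = R * c

    return distance

# quick check with two sample points
test = get_distance(52.2296756, 21.0122287, 52.406374, 16.9251681)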
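
A few sanity checks can make the distance function easier to trust. The sketch below restates `get_distance` so that it runs on its own and checks three properties: zero distance for identical points, symmetry, and roughly 111 km per degree of latitude (which follows from R × π / 180 with the R = 6373 km used above). The test points are illustrative.

```python
from math import sin, cos, sqrt, atan2, radians, pi

R = 6373.0  # Earth radius in km, the value used in the notebook

def get_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2)**2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)**2
    return R * 2 * atan2(sqrt(a), sqrt(1 - a))

same_point = get_distance(7.0, 100.5, 7.0, 100.5)
one_degree = get_distance(7.0, 100.5, 8.0, 100.5)
forward = get_distance(7.561, 100.356, 7.045, 100.446)
backward = get_distance(7.045, 100.446, 7.561, 100.356)

print(same_point, round(one_degree, 2), round(forward, 2))
```

The last pair uses the coordinates of the first two rows of the sub-district table, so `forward` is the distance between Choeng Sae and Khlong U Ta Phao under this formula.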

#### 7.2 Find the sub-districts located far from communities

The minimum distance from any clustered sub-district used in this study is 15 kilometers.

In [18]:
th_dis = s_province.copy()

th_dis['dropp'] = 0

# flag each sub-district that lies within 15 km of any clustered sub-district;
# rows are matched by index label because TAMBON names are not unique
for idx, lats, lngs in zip(th_dis.index, th_dis['Latitude'], th_dis['Longitude']):
    for i, j in zip(th_merged['Latitude'], th_merged['Longitude']):
        dis = get_distance(lats, lngs, i, j)
        if dis <= 15:
            th_dis.loc[idx, 'dropp'] = 1
            break

# keep the unflagged sub-districts and index them by name for plotting
th_dis = th_dis[th_dis.dropp != 1].dropna(subset=['TA_ID']).set_index('TAMBON')
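
The filtering step keeps only the sub-districts that lie more than 15 km from every clustered sub-district. A self-contained sketch of that idea on made-up points is shown below. It identifies rows by a unique ID rather than by name, since names such as Choeng Sae and Krasae Sin occur more than once in the list printed earlier. The IDs, coordinates and the helper name `far_from_all` are hypothetical.

```python
from math import sin, cos, sqrt, atan2, radians

def get_distance(lat1, lon1, lat2, lon2, R=6373.0):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2)**2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)**2
    return R * 2 * atan2(sqrt(a), sqrt(1 - a))

def far_from_all(candidates, communities, min_km=15):
    """Return IDs of candidates farther than min_km from every community point."""
    keep = []
    for cid, lat, lon in candidates:
        if all(get_distance(lat, lon, clat, clon) > min_km for clat, clon in communities):
            keep.append(cid)
    return keep

# made-up data: (id, lat, lon); two rows could share a name but never an ID
candidates = [(1, 7.00, 100.50), (2, 7.05, 100.50), (3, 7.50, 100.50)]
communities = [(7.00, 100.50)]

print(far_from_all(candidates, communities))
```

Candidate 1 coincides with the community point and candidate 2 lies about 5.6 km north of it, so only candidate 3, roughly 56 km away, passes the 15 km threshold.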

In [20]:
map_opt = folium.Map(location = [latitude, longitude], zoom_start = 9)
for lat, lng, tambon in zip(th_dis['Latitude'], th_dis['Longitude'], th_dis.index):
    label = '{}'.format(tambon)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='green',
        fill=True,
        fill_color='green',
        fill_opacity=0.7,
        parse_html=False).add_to(map_opt)  
    
map_opt

### 8. Plot all sub-districts in the area, ranked from least to most suitable for the new power plant

In [27]:
map_final = folium.Map(location=[latitude, longitude], zoom_start=9)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
rainbow2 = ['#ff0000', '#ff6200', '#f74d7d', '#ff9a2e', '#cb2eff', '#FF78BC']

# add markers to the map
markers_colors = []
for lat, lng, tambon in zip(s_province['Latitude'], s_province['Longitude'], s_province['TAMBON']):
    label = '{}'.format(tambon)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='#A28901',
        fill=True,
        fill_color='#A28901',
        fill_opacity=0.7,
        parse_html=False).add_to(map_final)  

for lat, lon, poi, cluster in zip(th_merged['Latitude'], th_merged['Longitude'], th_merged['Province'], th_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    clu = int(cluster)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow2[clu-1],
        fill=True,
        fill_color=rainbow2[clu-1],
        fill_opacity=0.7).add_to(map_final)
    
for lat, lng, tambon in zip(th_dis['Latitude'], th_dis['Longitude'], th_dis.index):
    label = '{}'.format(tambon)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='#1FDC0C',
        fill=True,
        fill_color='#1FDC0C',
        fill_opacity=0.7,
        parse_html=False).add_to(map_final)  


map_final

## Results and Discussion <a name="results"></a>

On the map above, the green spots are the most suitable locations for the new power plant with respect to the impact on nearby communities. They are located at least 15 kilometers away from any clustered community. The brown spots are sub-districts that do not have a dense community themselves but lie close to one. The red-toned spots are the least suitable locations because of their concentrated communities.

Siting a power plant, however, requires other considerations, for instance transmission systems and fuel distribution. A final decision on the location cannot be made from this study alone.
**The plots from this study can therefore be used as one factor in the trade-off when selecting a location. If the power plant must be located in a brown or red-toned area, the results of this study can help assess its impact on the different types of clustered community.**

## Conclusion <a name="conclusion"></a>

This study ranks and illustrates candidate power plant locations by their impact on nearby residents. The study area is Songkhla and Phatthalung provinces in southern Thailand. The methods used are: collecting latitude and longitude coordinates, plotting locations on a map with Folium, collecting venue information from Foursquare, clustering the sub-districts by venue type with K-means, and calculating distances between coordinates with the haversine formula. The results show the order of suitability of locations for the power plant in the area. This study can be used to help assess the impact of a power plant on different types of clustered community.