# 5G in Langkawi

## _Background_

### Import the necessary libraries

In [1]:
import requests # this is to retrieve the url

import lxml.html as lh

import pandas as pd # used for dataframes

#part 2
!pip -q install geopy

from geopy.geocoders import Nominatim # library to convert an address to latitude and longitude

!pip -q install geocoder
import geocoder

#part 3
import matplotlib.cm as cm
import matplotlib.colors as colors

import numpy as np

import json # library to handle JSON files
from pandas.io.json import json_normalize # transform a JSON file into a pandas dataframe

from sklearn.cluster import KMeans

!pip -q install folium
print('folium installed...')
import folium # library for map rendering
print('folium imported...')
print('Done')

folium installed...
folium imported...
Done


### Scrape the website and parse the table

In [2]:
langkawi_url = 'https://web.archive.org/web/20190620102338/http://www.jaik.gov.my/?page_id=658' # URL of the archived page that holds the table

tables = pd.read_html(langkawi_url) # Returns list of all tables on page
langkawi_table = tables[3] # Select table of interest

print(langkawi_table)# print table

      0                            1  \
0   BIL                KARIAH MASJID   
1     1                         Kuah   
2     2                        Kisap   
3     3                   Pulau Tuba   
4     4             Selat Bagan Pauh   
5     5            Selat Bagan Nyiur   
6     6                     Kelibang   
7     7                Padang Lalang   
8     8                  Sungai Itau   
9     9                          Ewa   
10   10                        Gelam   
11   11            Bubur @ Temonyong   
12   12  Padang Kandang @ Bohor Raja   
13   13              Padang Matsirat   
14   14                Kuala Teriang   
15   15                        Bayas   
16   16                  Bukit Temin   
17   17              Ulu Melaka Lama   
18   18              Ulu Melaka Baru   
19   19      Kumpulan @ Padang Gaong   
20   20                 Nyiur Cabang   
21   21                         Yooi   
22   22        Keda Wang Tok Rendong   
23   23              Sungai Menghulu   


### Convert the parsed table to Pandas Dataframe. Print the first 5 rows to see how it looks.

In [3]:
df = langkawi_table #converting table to pandas dataframe
df.head() #print the dataframe

Unnamed: 0,0,1,2,3
0,BIL,KARIAH MASJID,ALAMAT,MUKIM
1,1,Kuah,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
2,2,Kisap,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
3,3,Pulau Tuba,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
4,4,Selat Bagan Pauh,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah


### I only want 'Kariah Masjid', 'Alamat', and 'Mukim', so the 'Bil' column is dropped. The first row, which holds the original header labels, is dropped as well.

In [4]:
df = df.drop(df.columns[0], axis=1) # drop the first column ('Bil')
df_langkawi = df.drop(df.index[0]) # define the new dataframe and drop the first row (the original header labels)
df_langkawi.head() # print the new dataframe

Unnamed: 0,1,2,3
1,Kuah,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
2,Kisap,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
3,Pulau Tuba,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
4,Selat Bagan Pauh,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
5,Selat Bagan Nyiur,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah


### After dropping the 'Bil' column and the header row, the dataframe has 28 rows and 3 columns.

In [5]:
print(df_langkawi.shape) # print the rows and columns of the dataframe

(28, 3)


### When the table was parsed from the website, the column headers came through as integers. Printing the headers confirms this.

In [6]:
list(df_langkawi.columns) # get the column headers

[1, 2, 3]

### The integers are replaced with strings. The new columns are 'Local', 'Address' and 'Mukim'.

In [7]:
df_langkawi.rename(columns={1:'Local', 2:'Address', 3:'Mukim'}, inplace = True) # replace the column headers
df_langkawi.head() # print the dataframe

Unnamed: 0,Local,Address,Mukim
1,Kuah,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
2,Kisap,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
3,Pulau Tuba,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
4,Selat Bagan Pauh,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
5,Selat Bagan Nyiur,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah
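
As an alternative to dropping and renaming in separate cells, the clean-up can be done in one pass. The sketch below is only illustrative: `raw` is a hypothetical miniature of the parsed table, and resetting the index to start at 0 is an extra choice that the notebook itself does not make.

```python
import pandas as pd

# Hypothetical miniature of the parsed table: the real header sits in row 0.
raw = pd.DataFrame([
    ["BIL", "KARIAH MASJID", "ALAMAT", "MUKIM"],
    ["1", "Kuah", "Mukim Kuah, 07000 Langkawi, Kedah Darul Aman", "Kuah"],
    ["2", "Kisap", "Mukim Kuah, 07000 Langkawi, Kedah Darul Aman", "Kuah"],
])

# Skip the header row and the running-number column, then name what is left.
tidy = raw.iloc[1:, 1:].copy()
tidy.columns = ["Local", "Address", "Mukim"]

# Assumption: a fresh 0-based index is wanted for later positional work.
tidy = tidy.reset_index(drop=True)

print(tidy)
```

A 0-based index is convenient when other frames built from plain lists are attached later, since those also start at 0.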


### Let's check that the column headers have been replaced accordingly.

In [8]:
list(df_langkawi.columns) # get the column header

['Local', 'Address', 'Mukim']

### The dataframe still has 28 rows and 3 columns, so it is intact.

In [9]:
print(df_langkawi.shape) # print the dataframe shape

(28, 3)


### Define a function that geocodes a locality in Langkawi, Malaysia, using a while loop that retries until a latitude and longitude are returned.

In [10]:
def get_latlng(locality): # return [latitude, longitude] for a locality name
    
    lat_lng_coords = None # initialise the coordinates to None
    
    while(lat_lng_coords is None): # retry the geocoder until it returns coordinates
        g = geocoder.arcgis('{}, Langkawi, Malaysia'.format(locality))
        lat_lng_coords = g.latlng
    return lat_lng_coords
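
The loop above keeps retrying for as long as the geocoder returns nothing, so a name that can never be resolved would stall the notebook. A bounded variant might look like the sketch below. The names `get_latlng_bounded`, `lookup`, `max_tries` and `pause` are hypothetical, and the coordinates in the stand-in dictionary are placeholders, not real geocoding results.

```python
import time

def get_latlng_bounded(name, lookup, max_tries=3, pause=0.0):
    """Return [lat, lng] for name, or None after max_tries failed lookups.

    lookup is a hypothetical callable (query string -> [lat, lng] or None).
    In the notebook it could wrap geocoder.arcgis(query).latlng.
    """
    query = '{}, Langkawi, Malaysia'.format(name)
    for _ in range(max_tries):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(pause)  # brief pause before the next attempt
    return None

# Stand-in lookup with placeholder values so the sketch runs offline.
fake_results = {'Kuah, Langkawi, Malaysia': [1.0, 2.0]}

print(get_latlng_bounded('Kuah', fake_results.get))
print(get_latlng_bounded('Nowhere', fake_results.get))
```

Returning `None` after a fixed number of attempts leaves the decision about unresolved localities to the caller, for example dropping them or looking them up by hand.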

### Get the latitude and longitude based on data in 'Local' column.

In [11]:
local = df_langkawi[df_langkawi.columns[0]] # define local to the 'Local' column
coordinates = [get_latlng(local) for local in local.tolist()] # define coordinates to the latitude and longitude of local

### Append the Latitude and Longitude to the respective rows. Print the dataframe.

In [12]:
df_loc = df_langkawi # dataframe that will receive the latitude and longitude columns

df_coordinates = pd.DataFrame(coordinates, columns = ['Latitude', 'Longitude']) # build a dataframe from the 'Latitude' and 'Longitude' pairs

df_loc['Latitude'] = df_coordinates['Latitude'] # append the respective latitude

df_loc['Longitude'] = df_coordinates['Longitude'] # append the respective longitude

df_loc # print the dataframe with the appended latitude and longitude

Unnamed: 0,Local,Address,Mukim,Latitude,Longitude
1,Kuah,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.347663,99.791374
2,Kisap,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.24621,99.82671
3,Pulau Tuba,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.246863,99.829304
4,Selat Bagan Pauh,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.25569,99.824
5,Selat Bagan Nyiur,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.324513,99.798385
6,Kelibang,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.4323,99.80104
7,Padang Lalang,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.4192,99.82195
8,Sungai Itau,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.347663,99.791374
9,Ewa,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.347663,99.791374
10,Gelam,"Mukim Kedawang, 07000 Langkawi, Kedah Darul Aman",Kedawang,6.29775,99.73946


### The last row of the dataframe, Bukit Tunggal, has NaN values for latitude and longitude. I propose to drop any rows with NaN values. First, check the shape of the dataframe: there are currently 28 rows and 5 columns.

In [13]:
df_loc.shape # print shape

(28, 5)
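
A NaN confined to the last row can also come from pandas index alignment rather than from a failed lookup. This is an assumption about the cause, not something the output confirms. It applies when the coordinate frame carries the default index starting at 0 while the target frame keeps labels starting at 1. The toy frames `left` and `coords` below are hypothetical.

```python
import pandas as pd

# Frame whose index starts at 1, as happens after the first row is dropped.
left = pd.DataFrame({'Local': ['A', 'B', 'C']}, index=[1, 2, 3])

# A frame built from a plain list gets the default index 0, 1, 2.
coords = pd.DataFrame({'Latitude': [10.0, 20.0, 30.0]})

# Label-based assignment: row 1 receives coords row 1, and row 3 finds no match.
by_label = left.copy()
by_label['Latitude'] = coords['Latitude']

# Positional assignment: values are attached in list order.
by_position = left.copy()
by_position['Latitude'] = coords['Latitude'].values

print(by_label)
print(by_position)
```

If the two outputs differ on real data, assigning by position (or resetting the index first) keeps each locality paired with its own coordinates.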

### As mentioned earlier, any rows with NaN values are dropped.

In [14]:
df_loc = df_loc.dropna() # drop rows with NaN values
df_loc # print the dataframe

Unnamed: 0,Local,Address,Mukim,Latitude,Longitude
1,Kuah,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.347663,99.791374
2,Kisap,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.24621,99.82671
3,Pulau Tuba,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.246863,99.829304
4,Selat Bagan Pauh,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.25569,99.824
5,Selat Bagan Nyiur,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.324513,99.798385
6,Kelibang,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.4323,99.80104
7,Padang Lalang,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.4192,99.82195
8,Sungai Itau,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.347663,99.791374
9,Ewa,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.347663,99.791374
10,Gelam,"Mukim Kedawang, 07000 Langkawi, Kedah Darul Aman",Kedawang,6.29775,99.73946
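
Several rows in the output above share the pair 6.347663, 99.791374 (Kuah, Sungai Itau and Ewa). This may mean the geocoder fell back to a general result for those names, although that is a guess. A quick way to list such rows, using a small sample copied from the output:

```python
import pandas as pd

# Four rows copied from the output above.
sample = pd.DataFrame({
    'Local': ['Kuah', 'Kisap', 'Sungai Itau', 'Ewa'],
    'Latitude': [6.347663, 6.24621, 6.347663, 6.347663],
    'Longitude': [99.791374, 99.82671, 99.791374, 99.791374],
})

# keep=False marks every member of a duplicated group, not just the repeats.
shared = sample[sample.duplicated(subset=['Latitude', 'Longitude'], keep=False)]

print(shared['Local'].tolist())
```

Localities with identical coordinates will receive identical venue lists later on, so it is worth knowing how many there are before clustering.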


### After dropping the rows with NaN values, the row count falls from 28 to 27. The column count remains at 5.

In [15]:
df_loc.shape # print shape

(27, 5)

### To generate the map of Langkawi with folium, get the geographical coordinates of Langkawi which are 6.3700386, 99.7928634.

In [16]:
address = 'Langkawi, Malaysia'

geolocator = Nominatim(user_agent="ln_explorer")

location = geolocator.geocode(address)

latitude = location.latitude

longitude = location.longitude

print('The geographical coordinates of Langkawi are {}, {}.'.format(latitude, longitude))


The geographical coordinates of Langkawi are 6.3700386, 99.7928634.


### Generate the map of Langkawi with folium.

In [17]:
map_langkawi = folium.Map(location = [latitude, longitude], zoom_start=12)

map_langkawi

### Superimpose the localities on the folium map.

In [18]:
for lat, lng, local, mukim in zip(df_loc['Latitude'],
                                  df_loc['Longitude'], 
                                  df_loc['Local'], 
                                  df_loc['Mukim']):
    label = '{} - {}'.format(mukim, local) # popup text: mukim followed by locality
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc', # hex colours need the leading '#'
        fill_opacity=0.7).add_to(map_langkawi)

display(map_langkawi)

### Set up the Foursquare API credentials.

In [37]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '' # Foursquare API version

print('Your credentials:')
#print('CLIENT_ID: ' + CLIENT_ID)
#print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:


### Limit the number of venues returned to 100 and the search radius around each locality to 2000 m.

In [20]:
LIMIT = 100 # limit of 100 venues

def getNearbyVenues(names, latitudes, longitudes, radius=2000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
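
The request URL can also be assembled from a dictionary of parameters instead of a long format string, which makes each parameter easier to read and change. The sketch below uses only the standard library and sends no request. `build_explore_url` is a hypothetical helper, and the `'ID'`, `'SECRET'` and `'20180605'` values are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius=2000, limit=100):
    """Assemble the explore URL from a parameter dict (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-encodes values, so the comma in 'll' becomes %2C.
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials and version string, for illustration only.
url = build_explore_url('ID', 'SECRET', '20180605', 6.347663, 99.791374)
print(url)
```

With `requests`, the same dictionary can be passed directly as `requests.get(EXPLORE_ENDPOINT, params=params)`, which performs the encoding itself.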

### Fetch the nearby venues for each locality. The function prints each locality name as it is processed.

In [21]:
langkawi_venues = getNearbyVenues(names=df_loc['Local'],
                                   latitudes=df_loc['Latitude'],
                                   longitudes=df_loc['Longitude']
                                  )

Kuah
Kisap
Pulau Tuba
Selat Bagan Pauh
Selat Bagan Nyiur
Kelibang
Padang Lalang
Sungai Itau
Ewa
Gelam
Bubur @ Temonyong
Padang Kandang @ Bohor Raja
Padang Matsirat
Kuala Teriang
Bayas
Bukit Temin
Ulu Melaka Lama
Ulu Melaka Baru
Kumpulan @ Padang Gaong
Nyiur Cabang
Yooi
Keda Wang Tok Rendong
Sungai Menghulu
Pasir Hitam
Bohor Merah
Kilim
Chandek Kura


### Print the head of the venues dataframe for the respective localities.

In [22]:
langkawi_venues.head()

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kuah,6.347663,99.791374,Nasi Dagang Pak Malau,6.340188,99.788376,Asian Restaurant
1,Kuah,6.347663,99.791374,Mardi Fruit Farm,6.356275,99.798794,National Park
2,Kuah,6.347663,99.791374,Mardi Agro Technology Park,6.361383,99.79232,Park
3,Kuah,6.347663,99.791374,CampValley Fitness Homestay,6.350074,99.793222,Resort
4,Kuah,6.347663,99.791374,Perigi Mahsuri,6.339417,99.784472,Art Gallery


### Check the shape of the venues dataframe. In total, 541 venues were returned.

In [23]:
langkawi_venues.shape

(541, 7)

### One hot encoding

In [24]:
# one hot encoding
langkawi_onehot = pd.get_dummies(langkawi_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
langkawi_onehot['Neighbourhood'] = langkawi_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [langkawi_onehot.columns[-1]] + list(langkawi_onehot.columns[:-1])
langkawi_onehot = langkawi_onehot[fixed_columns]

langkawi_onehot.head()

Unnamed: 0,Neighbourhood,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Bagel Shop,Bar,Beach,Beach Bar,...,Soccer Stadium,Soup Place,Spa,Steakhouse,Sushi Restaurant,Thai Restaurant,Trail,Train Station,Vape Store,Wine Bar
0,Kuah,0,0,0,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Kuah,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Kuah,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Kuah,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Kuah,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [25]:
langkawi_grouped = langkawi_onehot.groupby('Neighbourhood').mean().reset_index()
langkawi_grouped.head()

Unnamed: 0,Neighbourhood,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Bagel Shop,Bar,Beach,Beach Bar,...,Soccer Stadium,Soup Place,Spa,Steakhouse,Sushi Restaurant,Thai Restaurant,Trail,Train Station,Vape Store,Wine Bar
0,Bayas,0.0,0.055556,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bohor Merah,0.0,0.055556,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bubur @ Temonyong,0.0,0.0,0.0,0.043478,0.0,0.043478,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0
3,Bukit Temin,0.0,0.04,0.0,0.12,0.0,0.04,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Chandek Kura,0.0,0.055556,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
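
To see why the grouped values are fractions, it helps to run the same two steps on a toy table. The venue list below is illustrative only. After one-hot encoding, each row holds a single 1, so the per-neighbourhood mean of a column is the share of that neighbourhood's venues in that category.

```python
import pandas as pd

# Toy venue list with made-up neighbourhood names.
venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'B'],
    'Venue Category': ['Resort', 'Resort', 'Beach', 'Beach'],
})

# One indicator column per category, cast to float for averaging.
onehot = pd.get_dummies(venues['Venue Category']).astype(float)
onehot.insert(0, 'Neighbourhood', venues['Neighbourhood'])

# The mean of 0/1 indicators is the share of venues in each category.
grouped = onehot.groupby('Neighbourhood').mean().reset_index()

print(grouped)
```

Neighbourhood A has two resorts out of three venues, so its 'Resort' value is about 0.67. Each row of the grouped table sums to 1.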


In [26]:
num_top_venues = 5

for hood in langkawi_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = langkawi_grouped[langkawi_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bayas----
              venue  freq
0            Resort  0.17
1  Asian Restaurant  0.11
2     Historic Site  0.11
3    Breakfast Spot  0.06
4     Boat or Ferry  0.06


----Bohor Merah----
              venue  freq
0            Resort  0.17
1  Asian Restaurant  0.11
2     Historic Site  0.11
3    Breakfast Spot  0.06
4     Boat or Ferry  0.06


----Bubur @ Temonyong----
                 venue  freq
0     Malay Restaurant  0.17
1       Farmers Market  0.04
2             Boutique  0.04
3  Rental Car Location  0.04
4                Diner  0.04


----Bukit Temin----
              venue  freq
0  Asian Restaurant  0.12
1            Resort  0.12
2        Restaurant  0.12
3     Historic Site  0.08
4  Malay Restaurant  0.08


----Chandek Kura----
              venue  freq
0            Resort  0.17
1  Asian Restaurant  0.11
2     Historic Site  0.11
3    Breakfast Spot  0.06
4     Boat or Ferry  0.06


----Ewa----
              venue  freq
0            Resort  0.17
1  Asian Restaurant  0.11
2

In [27]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [28]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: # no 'st'/'nd'/'rd' suffix left, so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
local_venues_sorted = pd.DataFrame(columns=columns)
local_venues_sorted['Neighbourhood'] = langkawi_grouped['Neighbourhood']

for ind in np.arange(langkawi_grouped.shape[0]):
    local_venues_sorted.iloc[ind, 1:] = return_most_common_venues(langkawi_grouped.iloc[ind, :], num_top_venues)

local_venues_sorted.head(12)

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bayas,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
1,Bohor Merah,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
2,Bubur @ Temonyong,Malay Restaurant,Gym / Fitness Center,Indian Restaurant,Motel,Diner,Department Store,Mediterranean Restaurant,Food Truck,Café,Business Service
3,Bukit Temin,Restaurant,Asian Restaurant,Resort,Historic Site,Malay Restaurant,History Museum,Garden Center,Diner,Motel,Park
4,Chandek Kura,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
5,Ewa,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
6,Gelam,Seafood Restaurant,Resort,Motel,Restaurant,Café,Hotel,French Restaurant,Spa,Asian Restaurant,Beach
7,Keda Wang Tok Rendong,Seafood Restaurant,Halal Restaurant,Asian Restaurant,Burger Joint,Hotel,Restaurant,Convenience Store,Music Store,Market,Flea Market
8,Kelibang,Malay Restaurant,Hot Spring,Beach,Resort,Wine Bar,Restaurant,Italian Restaurant,Fish & Chips Shop,Motel,Convenience Store
9,Kilim,Malay Restaurant,Restaurant,Hotel,Asian Restaurant,Motel,Vape Store,Boat or Ferry,Housing Development,Bistro,Food Truck
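
The column labels can also be generated without relying on an exception. The helper below, `ordinal`, is a hypothetical addition that returns the English suffix for any positive integer, including cases such as 11th and 21st that a three-item suffix list would not cover.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 10th to 20th (and 110th to 120th, ...) always take 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Same labels as in the cell above, for the ten most common venues.
columns = ['Neighbourhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)
]

print(columns[:4])
```

For ten columns the result matches the labels built in the notebook. The helper only matters if `num_top_venues` is raised above 20.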


### Run k-means clustering with 5 clusters.

In [29]:
# set number of clusters
kclusters = 5

langkawi_grouped_clustering = langkawi_grouped.drop(columns='Neighbourhood') # keep only the numeric category frequencies

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(langkawi_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 2, 0, 0, 0, 2, 2, 2, 2], dtype=int32)
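
Before interpreting the clusters, it can be useful to count how many localities fall into each one, since very small clusters may say more about a single locality than about a group. The sketch below uses only the ten labels printed above, not the full label array, so the counts are partial.

```python
import numpy as np

# The ten labels printed above.
labels = np.array([0, 0, 2, 0, 0, 0, 2, 2, 2, 2])

# minlength guarantees one count per cluster even when a cluster is absent.
sizes = np.bincount(labels, minlength=5)

print(sizes)
```

In the notebook, `np.bincount(kmeans.labels_, minlength=kclusters)` would give the counts for all localities.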

### Merge the grouped dataframe to the existing dataframe.

In [30]:
# add clustering labels
local_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

langkawi_merged = df_loc

# join local_venues_sorted onto df_loc so each locality keeps its latitude/longitude alongside its cluster label and top venues
langkawi_merged = langkawi_merged.join(local_venues_sorted.set_index('Neighbourhood'), on='Local')

langkawi_merged # check the last columns!

Unnamed: 0,Local,Address,Mukim,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Kuah,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.347663,99.791374,0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
2,Kisap,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.24621,99.82671,3,Beach,Harbor / Marina,Mosque,Wine Bar,Football Stadium,Duty-free Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market
3,Pulau Tuba,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.246863,99.829304,3,Mosque,Beach,Wine Bar,French Restaurant,Duty-free Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Food
4,Selat Bagan Pauh,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.25569,99.824,1,Beach,Grocery Store,Wine Bar,Football Stadium,Diner,Duty-free Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market
5,Selat Bagan Nyiur,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.324513,99.798385,2,Malay Restaurant,Thai Restaurant,Seafood Restaurant,Pharmacy,Resort,American Restaurant,Restaurant,Golf Course,Furniture / Home Store,Food Truck
6,Kelibang,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",Kuah,6.4323,99.80104,2,Malay Restaurant,Hot Spring,Beach,Resort,Wine Bar,Restaurant,Italian Restaurant,Fish & Chips Shop,Motel,Convenience Store
7,Padang Lalang,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.4192,99.82195,4,Hot Spring,River,Soup Place,Gym,Asian Restaurant,Hotel,Pharmacy,Fast Food Restaurant,Food,Flea Market
8,Sungai Itau,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.347663,99.791374,0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
9,Ewa,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",Ayer Hangat,6.347663,99.791374,0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
10,Gelam,"Mukim Kedawang, 07000 Langkawi, Kedah Darul Aman",Kedawang,6.29775,99.73946,2,Seafood Restaurant,Resort,Motel,Restaurant,Café,Hotel,French Restaurant,Spa,Asian Restaurant,Beach


### Generate the map in folium with the clusters superimposed.

In [31]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(langkawi_merged['Latitude'], langkawi_merged['Longitude'], langkawi_merged['Local'], langkawi_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Cluster 1

In [32]:
langkawi_merged.loc[langkawi_merged['Cluster Labels'] == 0, langkawi_merged.columns[[1] + list(range(5, langkawi_merged.shape[1]))]]

Unnamed: 0,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
8,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
9,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
15,"Mukim Ulu Melaka, 07000 Langkawi, Kedah Darul ...",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
16,"Mukim Ulu Melaka, 07000 Langkawi, Kedah Darul ...",0,Restaurant,Asian Restaurant,Resort,Historic Site,Malay Restaurant,History Museum,Garden Center,Diner,Motel,Park
17,"Mukim Ulu Melaka, 07000 Langkawi, Kedah Darul ...",0,Restaurant,Asian Restaurant,Resort,Historic Site,Malay Restaurant,History Museum,Garden Center,Diner,Motel,Park
18,"Mukim Ulu Melaka Baru, 07000 Langkawi, Kedah D...",0,Food Truck,Resort,Park,Asian Restaurant,Motel,Garden Center,National Park,Lake,Diner,Duty-free Shop
20,"Mukim Bohor, 07000 Langkawi, Kedah Darul Aman",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
25,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot
27,"Mukim Kedawang, 07000 Langkawi, Kedah Darul Aman",0,Resort,Asian Restaurant,Historic Site,Boat or Ferry,Clothing Store,Malay Restaurant,Motel,Restaurant,Park,Breakfast Spot


### Cluster 2

In [33]:
langkawi_merged.loc[langkawi_merged['Cluster Labels'] == 1, langkawi_merged.columns[[1] + list(range(5, langkawi_merged.shape[1]))]]

Unnamed: 0,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",1,Beach,Grocery Store,Wine Bar,Football Stadium,Diner,Duty-free Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market


### Cluster 3

In [34]:
langkawi_merged.loc[langkawi_merged['Cluster Labels'] == 2, langkawi_merged.columns[[1] + list(range(5, langkawi_merged.shape[1]))]]

Unnamed: 0,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",2,Malay Restaurant,Thai Restaurant,Seafood Restaurant,Pharmacy,Resort,American Restaurant,Restaurant,Golf Course,Furniture / Home Store,Food Truck
6,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",2,Malay Restaurant,Hot Spring,Beach,Resort,Wine Bar,Restaurant,Italian Restaurant,Fish & Chips Shop,Motel,Convenience Store
10,"Mukim Kedawang, 07000 Langkawi, Kedah Darul Aman",2,Seafood Restaurant,Resort,Motel,Restaurant,Café,Hotel,French Restaurant,Spa,Asian Restaurant,Beach
11,"Mukim Kedawang, 07000 Langkawi, Kedah Darul Aman",2,Malay Restaurant,Gym / Fitness Center,Indian Restaurant,Motel,Diner,Department Store,Mediterranean Restaurant,Food Truck,Café,Business Service
12,"Mukim Padang Matsirat, 07100 Langkawi, Kedah D...",2,Malay Restaurant,Resort,Restaurant,Chinese Restaurant,Food Truck,Diner,Café,Hotel,Seafood Restaurant,Fast Food Restaurant
13,"Mukim Padang Matsirat, 07100 Langkawi, Kedah D...",2,Malay Restaurant,Asian Restaurant,Food Truck,Thai Restaurant,Restaurant,History Museum,Diner,Burger Joint,Convenience Store,Department Store
14,"Mukim Padang Matsirat, 07100 Langkawi, Kedah D...",2,Malay Restaurant,Resort,Hot Spring,Beach,Wine Bar,Restaurant,Gym,Fish & Chips Shop,Pool,Italian Restaurant
19,"Mukim Ulu Melaka, 07000 Langkawi, Kedah Darul ...",2,Malay Restaurant,Motel,Business Service,Rental Car Location,Café,Food Truck,Indian Restaurant,Park,Mediterranean Restaurant,Bagel Shop
21,"Mukim Bohor, 07000 Langkawi, Kedah Darul Aman",2,Campground,Soccer Field,Fried Chicken Joint,Scenic Lookout,Football Stadium,Boarding House,Wine Bar,Food Truck,Diner,Duty-free Shop
22,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",2,Seafood Restaurant,Halal Restaurant,Asian Restaurant,Burger Joint,Hotel,Restaurant,Convenience Store,Music Store,Market,Flea Market


### Cluster 4

In [35]:
langkawi_merged.loc[langkawi_merged['Cluster Labels'] == 3, langkawi_merged.columns[[1] + list(range(5, langkawi_merged.shape[1]))]]

Unnamed: 0,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",3,Beach,Harbor / Marina,Mosque,Wine Bar,Football Stadium,Duty-free Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market
3,"Mukim Kuah, 07000 Langkawi, Kedah Darul Aman",3,Mosque,Beach,Wine Bar,French Restaurant,Duty-free Shop,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Food


### Cluster 5

In [36]:
langkawi_merged.loc[langkawi_merged['Cluster Labels'] == 4, langkawi_merged.columns[[1] + list(range(5, langkawi_merged.shape[1]))]]

Unnamed: 0,Address,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,"Mukim Ayer Hangat, 07000 Langkawi, Kedah Darul...",4,Hot Spring,River,Soup Place,Gym,Asian Restaurant,Hotel,Pharmacy,Fast Food Restaurant,Food,Flea Market


## Observation

### Cluster 1 - This cluster spans localities in Mukim Kuah, Mukim Kedawang, Mukim Ayer Hangat, Mukim Ulu Melaka, Mukim Ulu Melaka Baru and Mukim Bohor, and mainly comprises restaurants and historical sites. Although Langkawi is known for its nature, the island is also rich in history and folklore. Several museums are located in this cluster, where 5G has great potential to offer new tourism experiences.

### Cluster 2 - This cluster is located on Pulau Tuba, one of the smaller islands surrounding the main island. Pulau Tuba is known for its agriculture and marine industry. A sizeable number of locals live on the island, working as farmers, as fishermen or at the resort. Pulau Tuba would benefit greatly from 5G, which could improve its agriculture, marine and hospitality industries.

### Cluster 3 - This cluster is the main commercial and tourism centre of Langkawi. The venues in these localities comprise restaurants, hotels, motels, and various shops and businesses offering goods and services. In addition, the Langkawi International Airport is located here, in Mukim Padang Matsirat. 5G in this cluster should be optimised for commerce, hospitality and the airline industry.

### Cluster 4 - This cluster is also located on Pulau Tuba, one of the smaller islands surrounding the main island. As observed for Cluster 2, Pulau Tuba is known for its agriculture and marine industry, and a sizeable number of locals live on the island, working as farmers, as fishermen or at the resort. The same case for 5G applies here: it could improve the island's agriculture, marine and hospitality industries.

### Cluster 5 - This cluster covers a single locality in Mukim Ayer Hangat, in a secluded part of Langkawi that may not be ideal for 5G coverage because of its low activity. The top 3 most common venues are a hot spring, a river and a soup place. Nonetheless, limited 5G coverage can be proposed at this location, as network access in this area would help provide seamless connectivity throughout the island.