## Choropleth Mapping In Singapore
### This notebook shows how I use several Python modules to produce an interactive choropleth map of Singapore.
### The choropleth map shows the average housing price of each town in Singapore (04/22), and the markers show each town's historical average prices.
#### The data I have used comes from the 2019 MasterPlan Land Use Layer and housing prices in Singapore as of 04/22 (I have only used 2017 - 2022).
### Python Modules I have used in this project:
#### Pandas, NumPy, Altair, Folium, GeoPandas and Vega (built into Folium)
### Websites Referenced:
#### Stack Overflow, the Altair, Folium, GeoPandas and Pandas documentation, the Google Geocoding API and Stack Exchange

In [2]:
import pandas as pd
import numpy as np
import folium as fs
import geopandas
import altair as alt

### S1: Importing the data and listing the towns in the data set

In [3]:
#reading the file
f = pd.read_csv('resale-flat-prices-based-on-registration-date-from-jan-2017-onwards.csv')

#keeping only the 04/22 transactions
latest = f.loc[f['month'].str.contains('2022-04')].reset_index(drop=True)
del latest['month']
town = f.drop_duplicates(subset = 'town')
#finding out all the towns in Singapore
for i in town['town']:
    print(i)

ANG MO KIO
BEDOK
BISHAN
BUKIT BATOK
BUKIT MERAH
BUKIT PANJANG
BUKIT TIMAH
CENTRAL AREA
CHOA CHU KANG
CLEMENTI
GEYLANG
HOUGANG
JURONG EAST
JURONG WEST
KALLANG/WHAMPOA
MARINE PARADE
PASIR RIS
PUNGGOL
QUEENSTOWN
SEMBAWANG
SENGKANG
SERANGOON
TAMPINES
TOA PAYOH
WOODLANDS
YISHUN


### Calculating the average price of each town in 04/22

In [4]:
#calculating the LATEST AVERAGE PRICE
#building [average_price, town] rows for a new df (lat + long are added later)
def average_price():
    rows = []
    town = f.drop_duplicates(subset = 'town')
    for i in town['town']:
        resale_average = latest.loc[latest['town'] == str(i)]
        resale_mean = resale_average['resale_price'].mean()
        resale_final = round(resale_mean,1)
        rows.append([resale_final,i])
    return rows

#a town with no 04/22 transactions gets NaN as its average
data = average_price()
data

[[463591.0, 'ANG MO KIO'],
 [482042.3, 'BEDOK'],
 [789241.1, 'BISHAN'],
 [518357.1, 'BUKIT BATOK'],
 [626869.2, 'BUKIT MERAH'],
 [504942.2, 'BUKIT PANJANG'],
 [nan, 'BUKIT TIMAH'],
 [870600.0, 'CENTRAL AREA'],
 [521480.3, 'CHOA CHU KANG'],
 [557928.6, 'CLEMENTI'],
 [440083.3, 'GEYLANG'],
 [490075.4, 'HOUGANG'],
 [533000.0, 'JURONG EAST'],
 [526527.8, 'JURONG WEST'],
 [638819.3, 'KALLANG/WHAMPOA'],
 [555142.9, 'MARINE PARADE'],
 [575000.0, 'PASIR RIS'],
 [573859.3, 'PUNGGOL'],
 [601692.3, 'QUEENSTOWN'],
 [509928.6, 'SEMBAWANG'],
 [549996.5, 'SENGKANG'],
 [539916.7, 'SERANGOON'],
 [543412.8, 'TAMPINES'],
 [498603.7, 'TOA PAYOH'],
 [518821.1, 'WOODLANDS'],
 [471087.6, 'YISHUN']]
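The per-town loop above can also be written as a single `groupby`. The following is a minimal sketch on a made-up three-town table that only borrows the column names of the resale CSV; the rows and the names `demo`, `latest_demo` and `avg` are hypothetical. The `reindex` step keeps towns that have no sales in the chosen month, which is presumably why BUKIT TIMAH shows `nan` above.

```python
import pandas as pd

# Hypothetical miniature of the resale data: real column names, made-up rows
demo = pd.DataFrame({
    'month': ['2022-04', '2022-04', '2022-04', '2022-03'],
    'town': ['BEDOK', 'BEDOK', 'BISHAN', 'BUKIT TIMAH'],
    'resale_price': [400000.0, 500000.0, 800000.0, 900000.0],
})

# Every town that appears anywhere in the data, in first-seen order
all_towns = demo['town'].drop_duplicates().tolist()

# Mean price per town for 04/22 only
latest_demo = demo[demo['month'] == '2022-04']
avg = (latest_demo.groupby('town')['resale_price']
       .mean()
       .round(1)
       .reindex(all_towns)      # towns without 04/22 sales come back as NaN
       .rename_axis('town')
       .reset_index(name='average_price'))

print(avg)
```

The result already has the `town` and `average_price` columns, so it could stand in for the `final` DataFrame built in the next step.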

### Setting up the coordinates of each town: finding the lat and lon with the Google Geocoding API, then converting them into points with GeoPandas

In [5]:
final = pd.DataFrame(data,columns = ['average_price','town']) 
#LATITUDE AND LONGITUDE OF EACH TOWN IN SINGAPORE, in the same order as the towns in final (I had to geocode these manually using the Google Geocoding API)
lat = [1.369115,1.323604,1.352585,1.359029,1.281905,1.377414,1.329411,1.284484,1.38398,1.316181,1.320054,1.361218,1.332857,1.34039,1.324513,1.301969,1.372094,1.398446,1.294166,1.449111,1.386812,1.355357,1.349591,1.334304,1.438192,1.430368]
lon = [103.845434,103.927341,103.835212,103.76368,103.823918,103.77195,103.802078,103.851345,103.746961,103.764938,103.891775,103.886253,103.743552,103.708988,103.857225,103.897082,103.947373,103.907205,103.786127,103.818495,103.891443,103.867871,103.956788,103.856327,103.78896,103.835363]

In [6]:
#implementing the coords into the dataframe
final['latitude'] = lat
final['longitude'] = lon
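Because `lat` and `lon` are plain lists, they only line up with the towns if the row order never changes. One way to guard against that is to key the coordinates by town name instead. This is a sketch under that assumption: `town_coords` and `demo` are hypothetical names, and only three of the coordinate pairs above are copied in.

```python
import pandas as pd

# Hypothetical lookup keyed by town name; values copied from the lists above
town_coords = {
    'ANG MO KIO': (1.369115, 103.845434),
    'BEDOK': (1.323604, 103.927341),
    'BISHAN': (1.352585, 103.835212),
}

# Deliberately out of order to show that row position no longer matters
demo = pd.DataFrame({'town': ['BISHAN', 'ANG MO KIO', 'BEDOK']})

# Fail early if a town has no coordinates, instead of silently misaligning
missing = sorted(set(demo['town']) - set(town_coords))
if missing:
    raise KeyError(f'No coordinates for: {missing}')

demo['latitude'] = demo['town'].map(lambda t: town_coords[t][0])
demo['longitude'] = demo['town'].map(lambda t: town_coords[t][1])

print(demo)
```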

In [7]:
#implementing combined coords into df
final['coordinates'] = final[['latitude','longitude']].values.tolist()
final.head(5)

Unnamed: 0,average_price,town,latitude,longitude,coordinates
0,463591.0,ANG MO KIO,1.369115,103.845434,"[1.369115, 103.845434]"
1,482042.3,BEDOK,1.323604,103.927341,"[1.323604, 103.927341]"
2,789241.1,BISHAN,1.352585,103.835212,"[1.352585, 103.835212]"
3,518357.1,BUKIT BATOK,1.359029,103.76368,"[1.359029, 103.76368]"
4,626869.2,BUKIT MERAH,1.281905,103.823918,"[1.281905, 103.823918]"


In [8]:
#converting df to geodf
geodf = geopandas.GeoDataFrame(data = final, geometry = geopandas.points_from_xy(final.longitude,final.latitude))
geodf

Unnamed: 0,average_price,town,latitude,longitude,coordinates,geometry
0,463591.0,ANG MO KIO,1.369115,103.845434,"[1.369115, 103.845434]",POINT (103.84543 1.36912)
1,482042.3,BEDOK,1.323604,103.927341,"[1.323604, 103.927341]",POINT (103.92734 1.32360)
2,789241.1,BISHAN,1.352585,103.835212,"[1.352585, 103.835212]",POINT (103.83521 1.35258)
3,518357.1,BUKIT BATOK,1.359029,103.76368,"[1.359029, 103.76368]",POINT (103.76368 1.35903)
4,626869.2,BUKIT MERAH,1.281905,103.823918,"[1.281905, 103.823918]",POINT (103.82392 1.28191)
5,504942.2,BUKIT PANJANG,1.377414,103.77195,"[1.377414, 103.77195]",POINT (103.77195 1.37741)
6,,BUKIT TIMAH,1.329411,103.802078,"[1.329411, 103.802078]",POINT (103.80208 1.32941)
7,870600.0,CENTRAL AREA,1.284484,103.851345,"[1.284484, 103.851345]",POINT (103.85134 1.28448)
8,521480.3,CHOA CHU KANG,1.38398,103.746961,"[1.38398, 103.746961]",POINT (103.74696 1.38398)
9,557928.6,CLEMENTI,1.316181,103.764938,"[1.316181, 103.764938]",POINT (103.76494 1.31618)


### Finally, loading the map using Folium, with a choropleth layer to show the distribution of house prices
#### In this cell, I have also made some slight modifications to the GeoJSON data provided. To make the GeoJsonTooltip work, I had to insert each town's average price into the GeoJSON features using a for loop.

In [9]:
#base map centred on Singapore
m = fs.Map(location=[1.290270,103.851959])
choro_test =fs.Choropleth(geo_data = "planning-boundary-area.geojson",
            data = geodf,
            columns = ['town','average_price'],
            key_on = 'feature.properties.PLN_AREA_N',
            fill_color = 'YlGnBu',
            nan_fill_color = "White",
            fill_opacity = 0.7,
            line_opacity = 0.2,
            legend_name = 'Average Price Of Houses',
            line_color = 'black',
            nan_fill_opacity = 0.3
              
             ).add_to(m)
geodf_index = geodf.set_index('town')
#using a for loop to add property prices into the geojson file, allowing the mapping to work
for i in choro_test.geojson.data['features']:
        if i['properties']['PLN_AREA_N'] not in list(final['town']):  
            i['properties']['Average_Town_Price'] = None
        else:
            i['properties']['Average_Town_Price'] = geodf_index.loc[i['properties']['PLN_AREA_N'], 'average_price']
fs.GeoJsonTooltip(['PLN_AREA_N','Average_Town_Price']).add_to(choro_test.geojson)
m
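The property-injection loop can be tried without Folium, since the GeoJSON data is just a nested dictionary. Below is a minimal sketch: the `geojson` dictionary is a hypothetical three-feature stand-in for `choro_test.geojson.data`, and `attach_prices` is an illustrative helper, not part of Folium. It also turns NaN into `None`, because NaN is not a valid JSON value; whether the tooltip needs this in practice is an assumption.

```python
import math

# Hypothetical stand-in for choro_test.geojson.data (geometry omitted)
geojson = {'type': 'FeatureCollection', 'features': [
    {'type': 'Feature', 'properties': {'PLN_AREA_N': 'BISHAN'}, 'geometry': None},
    {'type': 'Feature', 'properties': {'PLN_AREA_N': 'BUKIT TIMAH'}, 'geometry': None},
    {'type': 'Feature', 'properties': {'PLN_AREA_N': 'TUAS'}, 'geometry': None},
]}

# Town -> average price; NaN marks a town with no sales in the month
prices = {'BISHAN': 789241.1, 'BUKIT TIMAH': float('nan')}

def attach_prices(geojson, prices, key='PLN_AREA_N', out='Average_Town_Price'):
    """Write each area's price into its feature properties, or None if unknown."""
    for feature in geojson['features']:
        price = prices.get(feature['properties'][key])
        if price is None or math.isnan(price):
            price = None           # covers both missing areas and NaN averages
        else:
            price = float(price)   # plain float, in case a NumPy scalar was passed
        feature['properties'][out] = price
    return geojson

attach_prices(geojson, prices)
for feature in geojson['features']:
    print(feature['properties'])
```

A dictionary lookup like this also avoids rebuilding `list(final['town'])` for every feature.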

### This cell reuses old code from one of my previous projects. By slightly modifying it, I was able to produce an Altair chart that is then passed to Vega in Folium.

In [10]:
#From an old GitHub project (flat resale prices), modified so that it calculates the average house prices of one town in SG
#super important as it builds the chart shown in each map marker's popup
def average_ftype_overtime_modified(town):
    resale_town = f.loc[f['town'] == str(town)]
    #monthly average resale price keyed by 'YYYY-MM', from 2017-01 to 2022-12
    rm_kv = {}
    for year in range(2017, 2023):
        for month_no in range(1, 13):
            key = f'{year}-{month_no:02d}'
            resale_month = resale_town.loc[resale_town['month'] == key]
            rm_kv[key] = round(resale_month['resale_price'].mean(), 2)
    #filtering out every month with no transactions (NaN mean) before separating into lists
    #(deleting keys while iterating over the dict would only remove the first NaN)
    rm_kv = {key: value for key, value in rm_kv.items() if not np.isnan(value)}
    #Creating df to transfer the values into Altair
    source = pd.DataFrame({
        'Time': list(rm_kv.keys()),
        'Average Price': list(rm_kv.values())})
    #Creating the line chart using Altair.
    #Altair is used because Folium popups can embed Vega charts (from Vincent or Altair)
    a = alt.Chart(source, title = f'Average House Price Of {town}').mark_line().encode(
        alt.X('Time:T'),
        y = 'Average Price'
    ).properties(width = 300, height = 260)
    return a
#Testing the function using one of the towns to see if it works
average_ftype_overtime_modified('CENTRAL AREA')
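The function above filters the full table once per month for every town. An alternative, sketched here on made-up rows, is to compute all monthly averages in one `groupby` and then slice out a town when its chart is needed. The names `demo`, `monthly` and `town_series` are hypothetical; only the column names come from the resale CSV. Months without sales never appear in the result, so no NaN filtering is required.

```python
import pandas as pd

# Hypothetical miniature of the resale data: real column names, made-up rows
demo = pd.DataFrame({
    'month': ['2017-01', '2017-01', '2017-03', '2017-01'],
    'town': ['BEDOK', 'BEDOK', 'BEDOK', 'BISHAN'],
    'resale_price': [300000.0, 310000.0, 330000.0, 500000.0],
})

# One pass over the data: average price per (town, month)
monthly = (demo.groupby(['town', 'month'])['resale_price']
           .mean()
           .round(2)
           .reset_index(name='Average Price')
           .rename(columns={'month': 'Time'}))

def town_series(town):
    """Return the Time / Average Price rows for a single town."""
    rows = monthly.loc[monthly['town'] == town, ['Time', 'Average Price']]
    return rows.reset_index(drop=True)

bedok = town_series('BEDOK')
print(bedok)
```

The returned frame has the same two columns as `source` in the function above, so it could be passed to `alt.Chart` in the same way.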

### Finally, adding markers with graph popups using VegaLite

In [11]:
#markers function implemented into the graph
def markers():
    for i,n in enumerate(final['town']):
        popup = fs.Popup(max_width = 400)
        a = average_ftype_overtime_modified(n)
        v1 = a.to_json()
        marker_test  = fs.Marker(location = [final['latitude'][i],final['longitude'][i]],
                                 popup = popup.add_child(fs.VegaLite(v1,width = 900, height = 300)
                                        ),tooltip = f'Click For More!')
        marker_test.add_to(m)
    return m
markers()

### Switching the tile layer to cartodbpositron to make the map more readable

In [12]:
#Adding cartodbpositron to make the colors stand out more!
fs.TileLayer('cartodbpositron').add_to(m)


<folium.raster_layers.TileLayer at 0x7fdc1d030b20>

In [13]:
m

### Final Thoughts:
#### This is just an overview of how to make an interactive choropleth map using Folium. Although it is mainly done in Python, learning JS is also useful, as it lets you manipulate the GeoJSON files you are given. I hope this helps those who are looking to work with choropleth mapping, especially in Singapore, since there are very few projects available on it. All the best, and I hope you have fun with choropleth mapping as well!
#### Additional Thoughts: I might use Plotly to create a richer visual environment, but that is for another day...