# MIPANSUSUSU 

## Contents
- [Packages](#section1)
- [Datasets Cleaning](#section2)
    - [HDB Resale 2020](#subsection2.1)
    - [MRT Coordinates](#subsection2.2)
    - [Mall Coordinates](#subsection2.3)
    - [BusStop Coordinates](#subsection2.4)
    - [Import All Other Relevant Datasets](#subsection2.5)
- [General Functions](#section3)

## Packages<a id="section1"></a>

In [1]:
import json
import requests
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import urllib

## Datasets Cleaning<a id="section2"></a>

We begin by cleaning the raw datasets to consolidate related data and prepare it for later visualisation and processing.

<div class="alert alert-block alert-success">
<b>List of Key Datasets:</b>
The key datasets we use are summarised below (with links to the online data files where applicable):
<ul>
  <li><b><a href="https://data.gov.sg/dataset/resale-flat-prices">HDB Resale Data</a></b>: Resale price, town, block, street name, postal code, longitude and latitude of HDB blocks</li>
  <li><b>Malls</b>: Shopping mall names, region, longitude and latitude</li>
  <li><b>MRT</b>: MRT station locations, longitude and latitude</li>
  <li><b><a href="https://github.com/cheeaun/busrouter-sg/blob/master/data/3/stops.onemap.json">Buses</a></b>: Bus stop numbers, longitude and latitude</li>
</ul>
</div> 


The <a href="https://docs.onemap.sg/#search">OneMap API</a> was used to look up the latitude and longitude of each location.

Querying the OneMap API for geolocation data is slow, so we ran the queries once and exported the results to CSV files for later processing and visualisation. The variable below is set to 'False' to prevent those queries from rerunning when the kernel is restarted.

Where necessary, we converted JSON files containing raw data to CSV for easier processing, using an <a href="https://www.convertcsv.com/json-to-csv.htm">online conversion site</a>.
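
The same kind of conversion can also be done locally with pandas instead of a website. The sketch below is a minimal illustration that assumes the JSON is a list of flat records. The sample records and field names are made up for the example and do not reflect the layout of the actual raw files.

```python
import io
import json

import pandas as pd

# Hypothetical sample; the real raw files may be structured differently.
raw_json = '[{"name": "A", "lat": 1.30, "lng": 103.85}, {"name": "B", "lat": 1.35, "lng": 103.80}]'

records = json.loads(raw_json)       # list of dicts
df = pd.json_normalize(records)      # one column per key

# Write to an in-memory buffer here; pass a filename to write to disk instead.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Nested JSON would need extra handling (for example the `record_path` argument of `pd.json_normalize`), which this sketch does not cover.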

In [2]:
# Keep this False to skip the slow OneMap queries and reuse the exported CSV files
rerun = False
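
One way a flag like `rerun` can guard a slow query is sketched below. This is an illustration only, not the notebook's own logic: `load_or_query`, `fake_query` and the cache filename are hypothetical names introduced for the example.

```python
import os
import tempfile

import pandas as pd


def load_or_query(rerun, cache_path, query_fn):
    """Return cached results unless rerun is True or no cache exists yet."""
    if rerun or not os.path.exists(cache_path):
        df = query_fn()                      # slow path: run the query
        df.to_csv(cache_path, index=False)   # refresh the cache
        return df
    return pd.read_csv(cache_path)           # fast path: reuse saved results


# Hypothetical stand-in for a slow API query; records how often it is called.
calls = []


def fake_query():
    calls.append(1)
    return pd.DataFrame({"Mall": ["Example Mall"], "Latitude": [1.3], "Longitude": [103.8]})


cache_file = os.path.join(tempfile.mkdtemp(), "example_cache.csv")
first = load_or_query(False, cache_file, fake_query)   # no cache yet, so it queries
second = load_or_query(False, cache_file, fake_query)  # cache exists, so it reads the CSV
print(len(calls))
```

Falling back to the query when the cache file is missing keeps a fresh checkout working even with the flag left at `False`.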

### HDB Resale 2020<a id="subsection2.1"></a>

In [12]:
# Load the block coordinates and the raw resale transactions
HDB_coordinates = pd.read_csv("raw_HDB_coordinates.csv")
HDB_resale_2020_raw = pd.read_csv("raw_resale_flat_prices_2017_2020.csv")

# Attach coordinates to each transaction; the inner join drops rows with no matching town and block
HDB_resale_2020 = HDB_resale_2020_raw.merge(HDB_coordinates, how='inner', on=['town', 'block'])

# Export the full HDB dataframe with latitude and longitude to CSV
HDB_resale_2020.to_csv('clean_HDBresale2020.csv')

HDB_resale_2020

Unnamed: 0,month,town,flat_type,block,STREETreet_name,STREETorey_range,floor_area_sqm,flat_model,lease_commence_date,remaining_lease,resale_price,street_name,lat,long
0,2017-01,ANG MO KIO,2 ROOM,406,ANG MO KIO AVENUE 10,10 TO 12,44.0,Improved,1979,61 years 04 months,232000.0,ANG MO KIO AVE 10,1.362005,103.853880
1,2017-05,ANG MO KIO,2 ROOM,406,ANG MO KIO AVENUE 10,10 TO 12,44.0,Improved,1979,61 years 01 month,235000.0,ANG MO KIO AVE 10,1.362005,103.853880
2,2018-03,ANG MO KIO,2 ROOM,406,ANG MO KIO AVENUE 10,01 TO 03,44.0,Improved,1979,60 years 02 months,202000.0,ANG MO KIO AVE 10,1.362005,103.853880
3,2018-03,ANG MO KIO,2 ROOM,406,ANG MO KIO AVENUE 10,01 TO 03,44.0,Improved,1979,60 years 02 months,210000.0,ANG MO KIO AVE 10,1.362005,103.853880
4,2018-05,ANG MO KIO,2 ROOM,406,ANG MO KIO AVENUE 10,07 TO 09,44.0,Improved,1979,60 years 01 month,220000.0,ANG MO KIO AVE 10,1.362005,103.853880
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
74059,2020-08,TOA PAYOH,EXECUTIVE,240,LORONG 1 TOA PAYOH,01 TO 03,146.0,Maisonette,1986,65 years 03 months,760000.0,LOR 1 TOA PAYOH,1.340876,103.850830
74060,2020-08,TOA PAYOH,EXECUTIVE,101B,LORONG 2 TOA PAYOH,04 TO 06,144.0,Apartment,1993,71 years 08 months,901000.0,LOR 2 TOA PAYOH,1.339599,103.847605
74061,2020-08,WOODLANDS,EXECUTIVE,176,WOODLANDS STREET 13,04 TO 06,184.0,Apartment,1994,72 years 05 months,670888.0,WOODLANDS ST 13,1.433579,103.778353
74062,2020-08,YISHUN,EXECUTIVE,361,YISHUN RING ROAD,01 TO 03,145.0,Maisonette,1988,66 years 10 months,610000.0,YISHUN RING RD,1.428325,103.845908
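
Because the merge is an inner join on `town` and `block`, any resale row without a matching coordinate row is dropped silently. The sketch below shows one way to count such rows with a left join and `indicator=True`. It uses small toy frames with illustrative values rather than the real files.

```python
import pandas as pd

# Toy stand-ins for the resale and coordinate tables (values are illustrative only).
resale = pd.DataFrame({
    "town": ["ANG MO KIO", "ANG MO KIO", "YISHUN"],
    "block": ["406", "999", "361"],
    "resale_price": [232000.0, 300000.0, 610000.0],
})
coords = pd.DataFrame({
    "town": ["ANG MO KIO", "YISHUN"],
    "block": ["406", "361"],
    "lat": [1.362005, 1.428325],
    "long": [103.853880, 103.845908],
})

# A left join with indicator=True labels each resale row as 'both' or 'left_only'.
checked = resale.merge(coords, how="left", on=["town", "block"], indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]
print(f"{len(unmatched)} of {len(resale)} resale rows have no coordinates")
```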


### MRT Coordinates<a id="subsection2.2"></a>

In [8]:
MRT_coords = pd.read_csv("clean_MRT.csv")

MRT_coords

Unnamed: 0,OBJECTID,STN_NAME,STN_NO,X,Y,Latitude,Longitude,COLOR
0,12,ADMIRALTY MRT STATION,NS10,24402.1063,46918.1131,1.440585,103.800998,RED
1,16,ALJUNIED MRT STATION,EW9,33518.6049,33190.0020,1.316433,103.882893,GREEN
2,33,ANG MO KIO MRT STATION,NS16,29807.2655,39105.7720,1.369933,103.849553,RED
3,81,BAKAU LRT STATION,SE3,36026.0821,41113.8766,1.388093,103.905418,OTHERS
4,80,BANGKIT LRT STATION,BP9,21248.2460,40220.9693,1.380018,103.772667,OTHERS
...,...,...,...,...,...,...,...,...
182,175,WOODLANDS SOUTH MRT STATION,TE3,23607.8309,45444.7113,1.427260,103.793863,OTHERS
183,146,WOODLEIGH MRT STATION,NE11,32173.3186,35706.3794,1.339190,103.870808,PURPLE
184,6,YEW TEE MRT STATION,NS5,18438.9791,42158.0124,1.397535,103.747431,RED
185,41,YIO CHU KANG MRT STATION,NS15,29294.1283,40413.0820,1.381756,103.844944,RED


### Mall Coordinates<a id="subsection2.3"></a>

In [34]:
list_of_shopping_mall = [
    '100 AM', '313@Somerset', 'Aperia', 'Balestier Hill Shopping Centre', 'Bugis Cube', 
    'Bugis Junction', 'Bugis+', 'Capitol Piazza', 'Cathay Cineleisure Orchard', 'City Gate', 
    'City Square Mall', 'CityLink Mall', 'Clarke Quay Central', 'Duo', 'Far East Plaza', 'Funan', 
    'Great World City', 'HDB Hub', 'Holland Village Shopping Mall', 'ION Orchard', 'Junction 8', 
    'Knightsbridge', 'Liang Court', 'Liat Towers', 'Lucky Plaza', 'Marina Bay Financial Centre Tower 3',
    'Marina Bay Link Mall', 'Marina Bay Sands', 'Marina One', 'Marina Square', 'Midpoint Orchard', 'Millenia Walk', 
    'Mustafa Shopping Centre', 'Ngee Ann City', 'Orchard Central', 'Orchard Gateway', 'Orchard Plaza', 
    'Orchard Shopping Centre', 'Palais Renaissance', 'Peoples Park Centre', 'Peoples Park Complex', 
    'Plaza Singapura', 'PoMo', 'Raffles City', 'Scotts Square', 'Serangoon Plaza', 'Shaw House and Centre', 
    'Sim Lim Square', 'Singapore Shopping Centre', 'Square 2', 'Suntec City', 'Tanglin Mall', 'Tangs', 
    'Tanjong Pagar Centre', 'Tekka Centre', 'The Centrepoint', 'The Paragon', 'The Poiz',
    'The Shoppes at Marina Bay Sands', 'The South Beach', 'Thomson Plaza', 'United Square, The Kids Learning Mall', 
    'Velocity', 'Wheelock Place', 'Wisma Atria', 'Zhongshan Mall', '112 Katong', 'Bedok Mall', 'Bedok Point', 
    'Century Square', 'Changi Airport', 'Changi City Point', 'City Plaza', 'Djitsun Mall Bedok', 'Downtown East', 
    'East Village', 'Eastpoint Mall', 'Elias Mall', 'Kallang Wave Mall', 'Katong Square', 'Katong V', 
    'Kinex', 'Leisure Park Kallang', 'Loyang Point', 'Our Tampines Hub', 'Parkway Parade', 
    'Paya Lebar Square', 'PLQ Mall', 'Singapore Post Centre', 'Tampines 1', 'Tampines Mall', 'The Flow', 
    'White Sands', '888 Plaza', 'Admiralty Place', 'AMK Hub', 'Beauty World Centre', 'Beauty World Plaza', 
    'Broadway Plaza', 'Buangkok Square', 'Bukit Panjang Plaza', 'Bukit Timah Plaza', 'Causeway Point', 
    'Compass One', 'Djitsun Mall', 'Fajar Shopping Centre', 'Greenridge Shopping Centre', 'Greenwich V', 
    'Heartland Mall', 'Hillion Mall', 'HillV2', 'Hougang 1', 'Hougang Green Shopping Mall', 'Hougang Mall', 
    'Jubilee Square', 'Junction 10', 'Junction 9', 'Keat Hong Shopping Centre', 'KKH The Retail Mall', 
    'Limbang Shopping Centre', 'Lot One', 'Marsiling Mall', 'myVillage @ Serangoon', 'NEX', 'North East', 
    'North West', 'Northpoint City', 'Oasis Terraces', 'Punggol Plaza', 'Rail Mall', 'Rivervale Mall', 
    'Rivervale Plaza', 'Sembawang Shopping Centre', 'Sun Plaza', 'Sunshine Place', 'Teck Whye Shopping Centre', 
    'The Midtown', 'The Seletar Mall', 'Upper Serangoon Shopping Centre', 'Waterway Point', 'West Mall', 
    'Wisteria Mall', 'Woodlands Mart', 'Yew Tee Point', 'Yew Tee Shopping Centre', 'Yew Tee Square', 
    'Alexandra Retail Centre', 'HarbourFront Centre', 'VivoCity', '321 Clementi', 'Alexandra Central', 
    'Anchorpoint', 'Big Box', 'Boon Lay Shopping Centre', 'Fairprice Hub', 'Gek Poh Shopping Centre', 
    'Grantral Mall', 'IMM', 'JCube', 'Jem', 'Jurong Point', 'OD Mall', 'Pioneer Mall', 'Queensway Shopping Centre', 
    'Rochester Mall', 'Taman Jurong Shopping Centre', 'The Clementi Mall', 'The Star Vista', 'Tiong Bahru Plaza', 
    'West Coast Plaza', 'Westgate Mall',
]

len(list_of_shopping_mall)

171

In [35]:
mall_name = []
mall_roadname = []
mall_lat = []
mall_long = []

# Query the OneMap search API once per mall and keep the first result
for query_address in list_of_shopping_mall:
    # Passing params lets requests URL-encode names such as 'Bugis+' or '313@Somerset'
    resp = requests.get(
        'https://developers.onemap.sg/commonapi/search',
        params={'searchVal': query_address, 'returnGeom': 'Y', 'getAddrDetails': 'Y'},
    )
    try:
        data_mall = json.loads(resp.content)
    except json.decoder.JSONDecodeError:
        print('Could not decode a row')
        continue

    if data_mall['found'] != 0:
        first_result = data_mall['results'][0]
        mall_name.append(query_address)
        mall_roadname.append(first_result['ROAD_NAME'])
        mall_lat.append(first_result['LATITUDE'])
        mall_long.append(first_result['LONGITUDE'])

        print(query_address + " ,Lat: " + first_result['LATITUDE'] + " Long: " + first_result['LONGITUDE'])
    else:
        print("No Results")

# Store this information in a dataframe
Mall_coords = pd.DataFrame({
    'Mall': mall_name,
    'RoadName': mall_roadname,
    'Latitude': mall_lat,
    'Longitude': mall_long
})

Mall_coords.to_csv('clean_Mall.csv')

Mall_coords

100 AM ,Lat: 1.2745882179999999 Long: 103.84347070000001
313@Somerset ,Lat: 1.3010065690000001 Long: 103.83824659999999
Aperia ,Lat: 1.30974242 Long: 103.8641016
Balestier Hill Shopping Centre ,Lat: 1.326124169 Long: 103.8437095
Bugis Cube ,Lat: 1.298195005 Long: 103.85565550000001
Bugis Junction ,Lat: 1.300117893 Long: 103.8561916
Bugis+ ,Lat: 1.300007152 Long: 103.8556805
Capitol Piazza ,Lat: 1.293063359 Long: 103.85129350000001
Cathay Cineleisure Orchard ,Lat: 1.301497434 Long: 103.8364448
City Gate ,Lat: 1.302347543 Long: 103.86225999999999
City Square Mall ,Lat: 1.31147745 Long: 103.8567808
CityLink Mall ,Lat: 1.292378607 Long: 103.854663
No Results
Duo ,Lat: 1.299216621 Long: 103.8579968
Far East Plaza ,Lat: 1.3073111259999999 Long: 103.83394
Funan ,Lat: 1.2912440059999999 Long: 103.8499848
Great World City ,Lat: 1.293154929 Long: 103.831928
HDB Hub ,Lat: 1.33218812 Long: 103.8470778
No Results
ION Orchard ,Lat: 1.3040425340000001 Long: 103.83188829999999
Junction 8 ,Lat: 1.35017

Queensway Shopping Centre ,Lat: 1.287617926 Long: 103.8033911
Rochester Mall ,Lat: 1.3054076559999999 Long: 103.78844670000001
Taman Jurong Shopping Centre ,Lat: 1.334844875 Long: 103.720462
The Clementi Mall ,Lat: 1.315496754 Long: 103.76457020000001
The Star Vista ,Lat: 1.30697044 Long: 103.7884203
Tiong Bahru Plaza ,Lat: 1.2864712 Long: 103.8271583
West Coast Plaza ,Lat: 1.303742583 Long: 103.7660929
No Results


Unnamed: 0,Mall,RoadName,Latitude,Longitude
0,100 AM,TRAS STREET,1.2745882179999999,103.84347070000001
1,313@Somerset,ORCHARD ROAD,1.3010065690000001,103.83824659999999
2,Aperia,KALLANG AVENUE,1.30974242,103.8641016
3,Balestier Hill Shopping Centre,BALESTIER ROAD,1.326124169,103.8437095
4,Bugis Cube,NORTH BRIDGE ROAD,1.298195005,103.85565550000001
...,...,...,...,...
151,Taman Jurong Shopping Centre,YUNG SHENG ROAD,1.334844875,103.720462
152,The Clementi Mall,COMMONWEALTH AVENUE WEST,1.315496754,103.76457020000001
153,The Star Vista,VISTA EXCHANGE GREEN,1.30697044,103.7884203
154,Tiong Bahru Plaza,TIONG BAHRU ROAD,1.2864712,103.8271583
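
The query loop concatenates `LATITUDE` and `LONGITUDE` directly into a printed string, which suggests the API returns them as text. If so, the `Latitude` and `Longitude` columns hold strings until they are converted. The sketch below shows one way to convert them, using two rows copied from the output above. Reading `clean_Mall.csv` back with `pd.read_csv` would normally infer floats anyway, so this matters mainly when the dataframe is used in memory.

```python
import pandas as pd

# Two rows from the output above, with coordinates held as strings.
mall_df = pd.DataFrame({
    "Mall": ["100 AM", "Aperia"],
    "Latitude": ["1.2745882179999999", "1.30974242"],
    "Longitude": ["103.84347070000001", "103.8641016"],
})

# errors="coerce" turns anything unparseable into NaN instead of raising.
for col in ["Latitude", "Longitude"]:
    mall_df[col] = pd.to_numeric(mall_df[col], errors="coerce")

print(mall_df.dtypes)
```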


### BusStop Coordinates<a id="subsection2.4"></a>

In [36]:
Bus_coords = pd.read_csv("clean_bus.csv")

Bus_coords

Unnamed: 0,Busstop_no.,Latitude,Longitude
0,10009,1.282102,103.817225
1,10011,1.277738,103.837497
2,10017,1.278320,103.837626
3,10018,1.279008,103.838604
4,10021,1.277448,103.838389
...,...,...,...
4988,3419,1.271277,103.862977
4989,3421,1.266941,103.859648
4990,3411,1.271593,103.863009
4991,3339,1.274985,103.859500
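
The last rows show four-digit stop numbers such as 3419, while the first rows have five digits. This may mean that leading zeros were lost when the column was parsed as integers, since Singapore bus stop codes are normally five digits long. A minimal sketch, assuming that is the cause, reads the column as text and pads it. The inline CSV is a made-up stand-in for `clean_bus.csv`.

```python
import io

import pandas as pd

# Made-up stand-in for clean_bus.csv; the second code starts with a zero.
csv_text = (
    "Busstop_no.,Latitude,Longitude\n"
    "10009,1.282102,103.817225\n"
    "03419,1.271277,103.862977\n"
)

# dtype=str stops pandas from parsing the codes as integers.
bus = pd.read_csv(io.StringIO(csv_text), dtype={"Busstop_no.": str})

# If the zeros were already lost in the file, zfill restores the five-digit form.
bus["Busstop_no."] = bus["Busstop_no."].str.zfill(5)
print(bus["Busstop_no."].tolist())
```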


### Import All Other Relevant Datasets<a id="subsection2.5"></a>

## General Functions<a id="section3"></a>

In [None]:
from math import radians, sin, cos, atan2, sqrt

# Calculate the great-circle (haversine) distance in km between two (lat, long) points
def distance(x1, y1, x2, y2):
    R = 6373.0  # approximate radius of the Earth in km
    lat1 = radians(x1)
    lon1 = radians(y1)
    lat2 = radians(x2)
    lon2 = radians(y2)
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    # Haversine formula
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    dist = R * c
    return dist
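
For comparing one HDB block against many stations at once, the same haversine formula can be written with NumPy so that it accepts arrays. The sketch below is one possible variant, not part of the original notebook: `haversine_km` is a hypothetical name, and the coordinates are copied from the HDB and MRT tables above.

```python
import numpy as np

EARTH_RADIUS_KM = 6373.0  # same constant as distance() above


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; accepts scalars or NumPy arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return EARTH_RADIUS_KM * 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))


# Coordinates taken from the tables above.
hdb_lat, hdb_lon = 1.362005, 103.853880  # block 406, Ang Mo Kio
station_names = np.array(["ADMIRALTY", "ALJUNIED", "ANG MO KIO"])
station_lat = np.array([1.440585, 1.316433, 1.369933])
station_lon = np.array([103.800998, 103.882893, 103.849553])

# One call computes the distance from the block to every station.
dists = haversine_km(hdb_lat, hdb_lon, station_lat, station_lon)
nearest = station_names[np.argmin(dists)]
print(nearest, round(float(dists.min()), 2))
```

Broadcasting avoids a Python-level loop over stations, which helps when the calculation is repeated for tens of thousands of resale rows.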