To obtain the latitude and longitude of each rental location, so that we can plot them later, we use the <code>geopy</code> Python module with its <code>Nominatim</code> geocoder.

This file is run after the property names in <code>final_rent.csv</code> have been edited so that <code>geopy</code> can identify the areas correctly. The edited file is saved as <code>edited_rent.csv</code>.

NOTE: DO NOT EDIT <code>edited_rent.csv</code>.

Note: We chose not to use the <code>Google Maps API</code>, despite it being more powerful, as it is not a free service.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from geopy.geocoders import Nominatim

In [2]:
rent = pd.read_csv('edited_rent.csv')
rent

Unnamed: 0.1,Unnamed: 0,property_name,area (sqft),psf price ($/psf),total price ($)
0,0,Kallang Place,1952,1.90,3709
1,1,East Coast Road,1500,2.67,4000
2,2,East Coast Road,1500,2.67,4000
3,3,Eng Kong Park,1500,2.80,4200
4,4,Jurong West Street 41,1700,2.94,5000
...,...,...,...,...,...
138,138,Bukit Pasoh,1700,14.71,25000
139,139,Telok Ayer,1645,15.81,26000
140,140,Telok Ayer,1645,15.81,26000
141,141,Bukit Pasoh,1830,16.39,30000


In [3]:
name = []
latitude = []
longitude = []
total_price = []

print('Processing...')

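geolocator = Nominatim(user_agent="geolocator1")
for index, row in rent.iterrows():
    # Query with ", Singapore" appended first to bias results towards Singapore.
    location = geolocator.geocode(row['property_name'] + ", Singapore")

    # Fall back to the bare property name if the first query finds nothing.
    if location is None:
        location = geolocator.geocode(row['property_name'])

    # Skip rows that could not be geocoded at all (avoids an AttributeError below).
    if location is None:
        print(f"{index}. {row['property_name']} could not be geocoded")
        continue

    # Reject results that fall outside Singapore's bounding box.
    if location.latitude < 1.203139 or location.latitude > 1.478409:
        print(f"{index}. {row['property_name']} has incorrect latitude")
        continue

    if location.longitude < 103.598186 or location.longitude > 104.049312:
        print(f"{index}. {row['property_name']} has incorrect longitude")
        continue
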
        
    name.append(row['property_name'])
    latitude.append(location.latitude)
    longitude.append(location.longitude)
    total_price.append(row['total price ($)'])

print(f'Processing complete, information from {len(latitude)} locations extracted.')

Processing...
Processing complete, information from 143 locations extracted.
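
Several property names (for example East Coast Road and Telok Ayer) appear more than once in the table, so the loop above sends the same query to the geocoder repeatedly. The following is a minimal sketch, not the method used in this notebook, of how the lookup could be cached per distinct name while keeping the same fallback and bounding-box checks. The names <code>in_singapore</code> and <code>make_cached_lookup</code> are hypothetical helpers introduced for illustration, and <code>geocode</code> is assumed to be any function that maps a query string to a <code>(latitude, longitude)</code> tuple or <code>None</code>.

```python
from typing import Callable, Dict, Optional, Tuple

# Bounding box copied from the latitude/longitude checks in the cell above.
LAT_MIN, LAT_MAX = 1.203139, 1.478409
LON_MIN, LON_MAX = 103.598186, 104.049312

Coords = Tuple[float, float]
Geocode = Callable[[str], Optional[Coords]]


def in_singapore(lat: float, lon: float) -> bool:
    """Return True if the point lies inside the bounding box used above."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def make_cached_lookup(geocode: Geocode) -> Geocode:
    """Wrap a geocoding function so each distinct name is resolved only once.

    `geocode` is an assumed callable: query string -> (lat, lon) or None.
    """
    cache: Dict[str, Optional[Coords]] = {}

    def lookup(name: str) -> Optional[Coords]:
        if name not in cache:
            # Try the Singapore-qualified query first, then the bare name.
            coords = geocode(name + ", Singapore") or geocode(name)
            # Treat results outside the bounding box as failed lookups.
            if coords is not None and not in_singapore(*coords):
                coords = None
            cache[name] = coords
        return cache[name]

    return lookup
```

To use this with <code>geopy</code>, <code>geocode</code> could be a small wrapper around <code>geolocator.geocode</code> that returns <code>(location.latitude, location.longitude)</code> when a result is found and <code>None</code> otherwise. Failed lookups are cached as well, so a name that cannot be resolved is not retried.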


In [4]:
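# Collect the validated results into one table and save them for plotting.
df = pd.DataFrame(list(zip(name, latitude, longitude, total_price)), columns=['Name', 'Lat', 'Lon', 'Price'])
# Caution: on a case-insensitive filesystem this name collides with final_rent.csv.
df.to_csv("FINAL_RENT.csv")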