# Introduction: Business Problem 


In this project, we predict the monthly rental price of a condominium unit. The report is aimed at stakeholders who want to find the best value when renting a condominium in Singapore.

We will use data science techniques to estimate a fair rental price, then recommend the best-value units, along with similar alternatives, to those stakeholders.

# Data 


Based on the definition of our problem, factors that could influence a unit's rental price include:

1. Size of the unit
2. Furnishing level of the unit
3. Location of the unit
4. Proximity of the unit to public transportation
5. Remaining lease of the unit / how new the unit is

and so on.


In this section, we retrieve the street-level location (latitude and longitude) of each unit using the **geocoder** Python library with its ArcGIS provider. For example, if a unit is at 13 Oxley Rise, the latitude and longitude of Oxley Rise are retrieved. Working at street level rather than unit level keeps the location feature from having high cardinality.
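To make the street-level reduction concrete, here is a minimal sketch of how the regular expression used later in this notebook turns a full address into a street name. It uses the standard `re` module instead of pandas so it can run on its own; the helper name `street_level` and the second sample address are hypothetical and only serve as illustration.

```python
import re

# Same pattern as the notebook's str.extract call: find the first whitespace
# character and capture the run of letters and spaces that follows it.
STREET_PATTERN = re.compile(r'[\s]([a-zA-Z\s]+)')

def street_level(address):
    """Return '<street>, Singapore', or None if the pattern does not match."""
    match = STREET_PATTERN.search(address)
    if match is None:
        return None
    return match.group(1) + ', Singapore'

print(street_level('13 Oxley Rise'))       # Oxley Rise, Singapore
print(street_level('21 Yishun Avenue 4'))  # Yishun Avenue , Singapore
```

The second call shows a side effect of the pattern: when a street name ends in a number, the number is dropped but the space before it is kept, which is why keys such as `Yishun Avenue , Singapore` can appear in the geocoding output.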

## Unit Latitude & Longitude

In [44]:
import pandas as pd
import geocoder
import numpy as np

In [45]:
df = pd.read_csv('Data/scraped_df.csv')

In [46]:
# extract only the road name, without the unit number; we do not want to be that 'pin-point', to avoid high cardinality
df['detailed_address'] = df.detailed_address.str.extract(r'[\s]([a-zA-Z\s]+)', expand=False) + ', Singapore'

In [46]:
# geocode each unique street-level address with the ArcGIS provider
all_address = []
for index, address in enumerate(df.detailed_address.unique(), start=1):
    unit = geocoder.arcgis(address)
    print(f'{index}  Coordinate of {address}: {unit.latlng}')
    # latlng is falsy when the lookup fails, so fall back to NaN
    if unit.latlng:
        unit_lat, unit_lng = unit.latlng
    else:
        unit_lat, unit_lng = np.nan, np.nan
    all_address.append([address, unit_lat, unit_lng])

1  Coordinate of Wallich Street, Singapore: [1.2768934719622607, 103.84445782929865]
2  Coordinate of Claymore Drive, Singapore: [1.307183281195904, 103.82951494950314]
3  Coordinate of Paya Lebar Road, Singapore: [1.3333343386056902, 103.88861632290003]
4  Coordinate of Orchard Boulevard, Singapore: [1.3044099853032676, 103.8297984755598]
5  Coordinate of Guillemard Road, Singapore: [1.310916707150167, 103.88425819102845]
6  Coordinate of Sims Avenue, Singapore: [1.3160621422010523, 103.88785072165165]
7  Coordinate of West Coast Vale, Singapore: [1.31152000000003, 103.75963000000007]
8  Coordinate of Oxley Rise, Singapore: [1.2970148285833927, 103.84310990453162]
9  Coordinate of Pasir Ris Grove, Singapore: [1.3719997019260144, 103.9448006739876]
10  Coordinate of Beach Road, Singapore: [1.2984684054481868, 103.8581001983291]
11  Coordinate of Handy Road, Singapore: [1.2995853410236504, 103.84648872462901]
12  Coordinate of Scotts Road, Singapore: [1.3098205396712679, 103.83414487491...
[output truncated]

97  Coordinate of Shanghai Road, Singapore: [1.2952226406406737, 103.82760341457447]
98  Coordinate of Thomson Road, Singapore: [1.3356873403320322, 103.83704745912773]
99  Coordinate of Kensington Park Drive, Singapore: [1.3682887519212072, 103.86965915473552]
100  Coordinate of Thomson Lane, Singapore: [1.3303418629662638, 103.84026991708241]
101  Coordinate of Marne Road, Singapore: [1.3127031835320864, 103.85711257525585]
102  Coordinate of Chwee Chian Road, Singapore: [1.2791571332075529, 103.788733385235]
103  Coordinate of Mount Sophia, Singapore: [1.3024341332440437, 103.84615846803028]
104  Coordinate of Yishun Avenue , Singapore: [1.4362100000000737, 103.83582000000007]
105  Coordinate of Sinaran Drive, Singapore: [1.320026169917651, 103.84423422533]
106  Coordinate of Poh Huat Road, Singapore: [1.366477089134612, 103.88239397623914]
107  Coordinate of Central Boulevard, Singapore: [1.2775104043512195, 103.85581043276464]
108  Coordinate of Hillview Avenue, Singapore: [1.3621...
[output truncated]

192  Coordinate of Club Street, Singapore: [1.2824700000000462, 103.84633000000008]
193  Coordinate of Moulmein Rise, Singapore: [1.3189362407679863, 103.84750950946002]
194  Coordinate of Wolskel Road, Singapore: [1.3469791861962253, 103.86965436765661]
195  Coordinate of Ridgewood Close, Singapore: [1.3173464226151594, 103.77823278949512]
196  Coordinate of Beatty Road, Singapore: [1.3139792384858533, 103.85850869602861]
197  Coordinate of Woodlands Road, Singapore: [1.4233352334286122, 103.75745247253174]
198  Coordinate of Lorong Sarhad, Singapore: [1.2804754479342733, 103.78804044562656]
199  Coordinate of Paterson Hill, Singapore: [1.3008492229504562, 103.82932915844438]
200  Coordinate of Kim Keat Road, Singapore: [1.3267063765765865, 103.85494431452226]
201  Coordinate of Jalan Loyang Besar, Singapore: [1.3786835379714903, 103.96086723977375]
202  Coordinate of Jalan Mutiara, Singapore: [1.2965512174361289, 103.8284240064863]
203  Coordinate of Yio Chu Kang Road, Singapore: [1....
[output truncated]

287  Coordinate of Mohamed Sultan Road, Singapore: [1.2921329935541135, 103.84076847416932]
288  Coordinate of Marymount Terrace, Singapore: [1.3502588290293327, 103.84021589199884]
289  Coordinate of Swiss View, Singapore: [1.3470183052900357, 103.78980909870158]
290  Coordinate of Flora Road, Singapore: [1.3599076529292755, 103.96510102133749]
291  Coordinate of Jalan Jintan, Singapore: [1.3057669910163483, 103.83442133843639]
292  Coordinate of Cuscaden Walk, Singapore: [1.304047056554312, 103.82912151545388]
293  Coordinate of Chancery Lane, Singapore: [1.3205374262592773, 103.83695025342124]
294  Coordinate of Faber Walk, Singapore: [1.3214676171172632, 103.75429194149082]
295  Coordinate of Loyang Avenue, Singapore: [1.3853681097439667, 103.97839541117412]
296  Coordinate of Queensway, Singapore: [1.296480569313264, 103.79981538852024]
297  Coordinate of Jalan Masjid, Singapore: [1.3198504511995075, 103.91240316023848]
298  Coordinate of Saint Michael, Singapore: [1.3255347085259...
[output truncated]

382  Coordinate of Eng Hoon Street, Singapore: [1.2837504926891146, 103.83418770028965]
383  Coordinate of Jalan Bunga Rampai, Singapore: [1.339257935175682, 103.88278435815678]
384  Coordinate of Oxford Road, Singapore: [1.3157221021873868, 103.85269277592782]
385  Coordinate of King Albert Park, Singapore: [1.3344400000000292, 103.77972000000005]
386  Coordinate of Wilkie Terrace, Singapore: [1.3025920518220317, 103.84894679945378]
387  Coordinate of Taman Serasi, Singapore: [1.308363549554312, 103.81954732756837]
388  Coordinate of Mountbatten Road, Singapore: [1.3025481935136725, 103.89817065740849]
389  Coordinate of Sengkang East Avenue, Singapore: [1.3867492830248804, 103.89359740994644]
390  Coordinate of Kovan Rise, Singapore: [1.3576567132760147, 103.88072598906268]
391  Coordinate of Upper Paya Lebar Road, Singapore: [1.3432946639973373, 103.88277137157795]
392  Coordinate of Shan Road, Singapore: [1.3249666739557753, 103.84673968114937]
393  Coordinate of Leedon Road, Singa...
[output truncated]

477  Coordinate of Lorong Puntong, Singapore: [1.3615836209239942, 103.83155336361268]
478  Coordinate of Kee Seng Street, Singapore: [1.274689293765567, 103.8424788970263]
479  Coordinate of Lorong Ong Lye, Singapore: [1.3483182198158334, 103.87886306368532]
480  Coordinate of Jalan Ayer, Singapore: [1.3112548776223558, 103.87277249749042]
481  Coordinate of Hazel Park Terrace, Singapore: [1.372135193726109, 103.76496654179181]
482  Coordinate of Tamarind Road, Singapore: [1.38832296052829, 103.86052850034662]
483  Coordinate of Hemmant Road, Singapore: [1.3115564624933749, 103.89087339209802]
484  Coordinate of Hamilton Road, Singapore: [1.311209153974176, 103.86074118318194]
485  Coordinate of Jalan Rumbia, Singapore: [1.295774574392642, 103.84304422817019]
486  Coordinate of Bartley Road, Singapore: [1.3433490373733106, 103.87876438360418]
487  Coordinate of Compassvale Bow, Singapore: [1.3836119612205564, 103.89068804790406]
488  Coordinate of Sin Ming Walk, Singapore: [1.36477024...
[output truncated]

In [48]:
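# create a dataframe of addresses with their latitude and longitude
latlng = pd.DataFrame(all_address, columns=['address', 'lat', 'long'])
# index=True writes the row index as an unnamed first column, read back later as 'Unnamed: 0'
#latlng.to_csv('Data/latlng.csv', index=True)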
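The geocoding loop in this section stores NaN whenever a lookup returns no coordinates. The offline sketch below illustrates that fallback pattern and how the failed addresses could be listed afterwards. `fake_lookup`, `FAKE_RESULTS`, and `geocode_all` are hypothetical names standing in for `geocoder.arcgis(address).latlng`, so the snippet runs without network access; the coordinates are rounded from the output above.

```python
import math

# Hypothetical stand-in for geocoder.arcgis(address).latlng (no network call);
# unknown addresses return None to mimic a failed lookup.
FAKE_RESULTS = {
    'Oxley Rise, Singapore': [1.2970, 103.8431],
    'Beach Road, Singapore': [1.2985, 103.8581],
}

def fake_lookup(address):
    return FAKE_RESULTS.get(address)

def geocode_all(addresses, lookup):
    """Return [address, lat, lng] rows, using NaN when a lookup fails."""
    rows = []
    for address in addresses:
        latlng = lookup(address)
        if latlng:
            lat, lng = latlng
        else:
            lat, lng = math.nan, math.nan
        rows.append([address, lat, lng])
    return rows

rows = geocode_all(['Oxley Rise, Singapore', 'Nowhere Street, Singapore'], fake_lookup)

# Collect the addresses whose latitude is NaN, i.e. the failed lookups
failed = [row[0] for row in rows if math.isnan(row[1])]
print(failed)  # ['Nowhere Street, Singapore']
```

Listing the failures before saving the table makes it easier to decide whether those streets should be retried or left without coordinates.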

## Merge with Main Dataframe

In [35]:
main_df = pd.read_csv('Data/scraped_df.csv')

merge_df = pd.read_csv('Data/latlng.csv', index_col = 'Unnamed: 0')

# prepare the main dataframe: build the same street-level key that was used for geocoding
main_df['less_detailed_address'] = main_df.detailed_address.str.extract(r'[\s]([a-zA-Z\s]+)', expand=False) + ', Singapore'

In [38]:
# merge the coordinates into the main dataframe
main_df = pd.merge(main_df, merge_df, how='left', left_on='less_detailed_address', right_on='address')
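A left merge keeps every unit even when its street has no entry in the coordinate table, so unmatched units silently end up with missing `lat` and `long`. The sketch below, on hypothetical toy data, shows one way this could be checked using the `validate` and `indicator` options of `pd.merge`; the frame names `units` and `coords` and all values are illustrative assumptions, not the notebook's data.

```python
import pandas as pd

# Hypothetical toy data standing in for main_df and the geocoded lookup table
units = pd.DataFrame({
    'less_detailed_address': ['Oxley Rise, Singapore', 'Oxley Rise, Singapore',
                              'Beach Road, Singapore', 'Unknown Lane, Singapore'],
    'price_month': [5000, 5200, 4100, 3000],
})
coords = pd.DataFrame({
    'address': ['Oxley Rise, Singapore', 'Beach Road, Singapore'],
    'lat': [1.2970, 1.2985],
    'long': [103.8431, 103.8581],
})

# validate='many_to_one' raises if an address appears twice in coords;
# indicator=True adds a '_merge' column marking rows without a match.
merged = pd.merge(units, coords, how='left',
                  left_on='less_detailed_address', right_on='address',
                  validate='many_to_one', indicator=True)

unmatched = merged[merged['_merge'] == 'left_only']
print(unmatched['less_detailed_address'].tolist())
```

With the toy data, only the `Unknown Lane, Singapore` row has no coordinates, and the row count of `merged` equals that of `units`, which confirms the merge did not duplicate any unit.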

In [39]:
# re-arrange the columns for a better representation of a unit
arrange_col = ['detailed_address', 
               'lat', 
               'long',
               'bedrooms',
               'bathrooms',
               'sqft',
               'price/sqft',
               'developer', 
               'district', 
               'built_year',
               'facing', 
               'floor_level', 
               'furnishing',
               'amenities',
               'mrt_distance',
               'mrt_name', 
               'name', 
               'neighbourhood',
               'overlooking_view',
               'property_type', 
               'tenure',
               'total_units',
               'unit_types', 
               'less_detailed_address', 
               'address', 
               'link',
               'picture_url',
               'price_month']

In [40]:
main_df = main_df[arrange_col]
main_df = main_df.drop(['less_detailed_address', 'address'], axis=1)

In [41]:
# save the main dataframe (index=True also writes the row index as an unnamed first column)
# main_df.to_csv('Data/main_df.csv', index=True)