# Predicting Price Of Renting An Apartment In Lagos State.

The objective of this project is to predict the cost of renting an apartment in Lagos State based on the area, number of bedrooms, and number of toilets.
The data set for this project was gathered from https://tolet.ng. The scraper used to gather the data can be found on my GitHub: https://github.com/victorlenerd/toletngscrapper.

## Loading And Cleaning Data.

In [27]:
## Loading Data
import pandas as pd
from collections import Counter
import matplotlib.pyplot as plt

data = pd.read_json('./data.json')

%matplotlib inline

In [28]:
data.tail()

Unnamed: 0,area,no_bath,no_bed,no_toilets,price
9041,"Mafoluku road, Isolo, Mafoluku Oshodi Lagos",1,1,1,280000.0
9042,"Mafoluku road, Mafoluku Oshodi Lagos",1,1,1,280000.0
9043,,0,0,0,
9044,,0,0,0,
9045,,0,0,0,


In [29]:
data.describe()

Unnamed: 0,no_bath,no_bed,no_toilets,price
count,9046.0,9046.0,9046.0,7846.0
mean,2.125138,2.170462,2.404378,3302896.0
std,1.588876,1.581791,1.88939,48223540.0
min,0.0,0.0,0.0,0.01
25%,1.0,1.0,1.0,500000.0
50%,2.0,2.0,3.0,1000000.0
75%,3.0,3.0,4.0,2875000.0
max,10.0,10.0,10.0,4000000000.0


In [30]:
# The scraper picked up some wrong values from DOM elements that shared the same property, which left rows with missing fields. It's best to drop those rows.
data = data.dropna()
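
Before dropping rows, it can help to check how many are affected and in which columns. The following is a minimal sketch on a small made-up frame; `sample`, `missing_per_column`, `cleaned`, and `rows_dropped` are illustrative names and values, not part of the scraped data set.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `data`; the values are illustrative only.
sample = pd.DataFrame({
    "area": ["Ikeja Lagos", None, "Lekki Lagos"],
    "no_bed": [2, 0, 3],
    "price": [500000.0, np.nan, 1500000.0],
})

# Count the missing values in each column before dropping anything.
missing_per_column = sample.isna().sum()

# dropna() removes every row that has at least one missing value.
cleaned = sample.dropna()
rows_dropped = len(sample) - len(cleaned)

print(missing_per_column)
print(rows_dropped)
```

Running the same two checks on `data` would show whether the dropped rows are the ones with an empty `area` and a missing `price`, as the `tail()` output above suggests.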

In [31]:
data.describe()

Unnamed: 0,no_bath,no_bed,no_toilets,price
count,7846.0,7846.0,7846.0,7846.0
mean,2.450166,2.502422,2.772113,3302896.0
std,1.454034,1.43317,1.759633,48223540.0
min,0.0,0.0,0.0,0.01
25%,1.0,1.0,1.0,500000.0
50%,3.0,3.0,3.0,1000000.0
75%,3.0,3.0,4.0,2875000.0
max,10.0,10.0,10.0,4000000000.0


In [32]:
data.tail()

Unnamed: 0,area,no_bath,no_bed,no_toilets,price
9038,- Abule Egba Abule Egba Lagos,2,2,3,500000.0
9039,- Lekki Phase 1 Lekki Lagos,5,5,6,5500000.0
9040,"Awuse Estate, Opebi, Lagos Opebi Ikeja Lagos",5,5,6,2500000.0
9041,"Mafoluku road, Isolo, Mafoluku Oshodi Lagos",1,1,1,280000.0
9042,"Mafoluku road, Mafoluku Oshodi Lagos",1,1,1,280000.0


### Using areas.json to make the area property more structured by converting each address to an area.

In [33]:
## Loading Areas
areas = pd.read_json('./areas.json')

In [34]:
areas_dict = areas.to_dict()

# Column 0 of areas.json holds the area names; collect them into a plain list.
areas_list = list(areas_dict[0].values())

print(areas_list)

['Abule Egba', 'Agege', 'Ajah', 'Ajaokuta', 'Alimosho', 'Amuwo Odofin', 'Apapa', 'Arepo', 'Badagry', 'Egbe Idimu', 'Egbeda', 'Ejigbo', 'Epe', 'Gbagada', 'Ibeju Lekki', 'Iju', 'Ikeja', 'Ikorodu', 'Ikotun Igando', 'Ikoyi', 'Ilaje', 'Ilupeju', 'Ipaja', 'Isolo', 'Ketu', 'Kosofe Ikosi', 'Lagos Island', 'Lekki', 'Maryland', 'Mushin', 'Ogba', 'Ogudu', 'Ojo', 'Ojodu', 'Ojota', 'Okota', 'Orile', 'Oshodi', 'Sangotedo', 'Shagari', 'Shomolu', 'Surulere', 'Victoria Island', 'Yaba']


In [35]:
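def addressToArea(address):
    """Map a raw address string to the first known area name it contains."""
    # Strip the punctuation that appears in the scraped address text.
    address = address.replace(',', '')
    address = address.replace('-', '')
    address = address.replace('/', ' ')

    # Return the first area from areas_list that appears in the address.
    for area in areas_list:
        if area in address:
            return area

    # No known area matched, so keep the cleaned address as-is.
    return address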
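
One thing to watch with a plain substring check is that a short area name can match inside a longer one: `'Ojo' in 'Ojodu Berger'` is true, and `'Ojo'` comes before `'Ojodu'` in `areas_list`. The sketch below shows one possible way around this, using whole-word matching and trying longer names first. It is not the method used in this notebook; `address_to_area_strict` and `_patterns` are hypothetical names, and the area list is shortened for illustration.

```python
import re

# Shortened, illustrative subset of the area names printed above.
areas_list = ["Ikeja", "Lekki", "Ojo", "Ojodu", "Oshodi"]

# Try longer names first and match on word boundaries, so that
# "Ojodu" is not mistaken for "Ojo".
_patterns = [
    (area, re.compile(r"\b" + re.escape(area) + r"\b"))
    for area in sorted(areas_list, key=len, reverse=True)
]


def address_to_area_strict(address):
    """Hypothetical variant of addressToArea using whole-word matching."""
    cleaned = address.replace(",", "").replace("-", "").replace("/", " ")
    for area, pattern in _patterns:
        if pattern.search(cleaned):
            return area
    # No known area matched, so return the cleaned address unchanged.
    return cleaned


print(address_to_area_strict("- Ojodu Berger Ojodu Lagos"))
```

An address that names more than one area, such as "Mafoluku road, Isolo, Mafoluku Oshodi Lagos", still resolves to a single area under either approach, so the choice of which one wins remains a modelling decision.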

In [36]:
data['area'] = data['area'].apply(addressToArea)
data.head()

Unnamed: 0,area,no_bath,no_bed,no_toilets,price
0,Ajah,3,3,4,1200000.0
1,Ikotun Igando,1,1,1,140000.0
2,Ikoyi,1,1,1,227100.0
3,Victoria Island,1,1,1,349850.0
4,Ikoyi,1,1,1,33900.0


In [43]:
# Keep only entries with more than one bedroom

data = data[data['no_bed'] > 1]

In [45]:
data.describe()

Unnamed: 0,no_bath,no_bed,no_toilets,price
count,5866.0,5866.0,5866.0,5866.0
mean,3.018411,3.125298,3.396693,3491016.0
std,1.193333,1.064734,1.494171,52416820.0
min,0.0,2.0,0.0,1.0
25%,2.0,2.0,3.0,700000.0
50%,3.0,3.0,3.0,1500000.0
75%,4.0,4.0,4.0,3300000.0
max,10.0,10.0,10.0,4000000000.0
