## ANALYSIS OF THE NIGERIAN REAL ESTATE MARKET - PART 1

![eal-estate-2.jpg](attachment:eal-estate-2.jpg)

### DATA COLLECTION

The data used in this project is extracted from the website of Nigeria Property Centre via web scraping. Nigeria Property Centre is one of the biggest real estate firms in Nigeria and West Africa.
The website is one of the few in the country with house listings that go back more than 2 years.

The website has listings for houses, flats, land, commercial properties and event centers.

For the purpose of this project, only house listings are used.

Below is a screenshot of the website.

![Screenshot%20from%202020-08-26%2012-30-34.png](attachment:Screenshot%20from%202020-08-26%2012-30-34.png)

### Technologies used

The project dependencies/libraries are listed below:

1. BeautifulSoup
2. Pandas
3. time
4. requests


### Categories to be extracted

The category fields relevant to the project are:

1. Property Description
2. Property Address
3. Listed Time
4. Number of bedroom(s)
5. Number of bathroom(s)
6. Property price
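
One way to keep these six fields consistent is to describe a single listing as a small record type. The sketch below is illustrative only: the `Listing` class and its `as_row` method are hypothetical names that do not appear in the scraper, the field names are chosen to match the dictionary keys used in the scraping code of this notebook, and the sample values are placeholders.

```python
from dataclasses import dataclass, asdict


@dataclass
class Listing:
    """One scraped house listing (hypothetical helper, not part of the scraper)."""
    description: str = "?"   # 1. Property description
    location: str = ""       # 2. Property address
    date: str = "?"          # 3. Listed time
    bedroom: str = ""        # 4. Number of bedroom(s)
    bathroom: str = ""       # 5. Number of bathroom(s)
    price: str = ""          # 6. Property price

    def as_row(self) -> dict:
        # a plain dict can be appended to a list and later passed to pd.DataFrame
        return asdict(self)


# illustrative values only; the raw strings returned by the site may look different
example = Listing(
    description="5 bedroom semi-detached duplex for sale",
    location="Old Ikoyi, Ikoyi, Lagos",
    date="Added on 06 Jun 2020",
    bedroom="5",
    bathroom="5",
    price="250000000",
)
row = example.as_row()
print(list(row.keys()))
```

Declaring the defaults in one place means a listing with a missing field still produces a complete row.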


### Load dependencies

In [1]:
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
import time

### Description of the website's HTML structure

The website has the following number of listings for sale:

    a. All types: 57,402
    b. Flat: 3,623
    c. Land: 17,425
    d. Commercial property: 2,392
    e. Event centers: 6
    f. Houses: 33,960

Since the focus of this project is on house listings alone, I used the advanced filter on the website to show only house listings, so that only the required data is extracted.

The website has a lot of empty tags, deeply nested tags, and tags that share the same class name but display different categories or features, which makes the task of scraping more complicated.
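
To make the nesting concrete, the sketch below runs the same kind of selectors against a tiny hand-written HTML fragment. The fragment is an assumption made for illustration: it imitates the class names targeted in this notebook and is not copied from the site, so the real markup may differ.

```python
from bs4 import BeautifulSoup

# hypothetical fragment imitating one listing; NOT copied from the real site
sample_html = """
<div class="col-md-12">
  <h4 class="content-title">4 bedroom terraced duplex for sale</h4>
  <address class="voffset-bottom-10"><strong><i class="fa fa-map-marker"></i>Old Ikoyi, Ikoyi, Lagos</strong></address>
  <ul class="aux-info">
    <li><i class="fa fa-bed"></i><span>4</span> Bedrooms</li>
    <li><i class="fa fa-bath"></i><span>5</span> Bathrooms</li>
  </ul>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
listing = soup.select_one("div.col-md-12")

# remove the icon so that <strong> contains only the address text
for icon in listing.select("address.voffset-bottom-10 strong i"):
    icon.decompose()
strong = listing.select_one("address.voffset-bottom-10 strong")
address = strong.get_text(strip=True) if strong is not None else "?"

# the icon class is the only unique hook, so select the <span> that follows the icon
bed = listing.select_one(".aux-info li i.fa-bed ~ span:nth-child(2)")
bath = listing.select_one(".aux-info li i.fa-bath ~ span:nth-child(2)")
bedroom = bed.get_text(strip=True) if bed is not None else "?"
bathroom = bath.get_text(strip=True) if bath is not None else "?"

print(address, bedroom, bathroom)
```

`select_one` returns `None` when nothing matches, which makes the "?" fallback explicit. `get_text(strip=True)` is used here instead of `.string` because `.string` returns `None` when a tag has more than one child.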

Below is the code that handles the scraping of the website.

### Code that handles the web scraping

In [2]:
# declare empty list to store data from scraping
nigerian_homes = []


# the filtered url of the target website displaying only house listings
url = "https://nigeriapropertycentre.com/for-sale/houses/showtype?page="

# loop through the (1670) pages of the site with house listings
# first 1 - 700, then 700 - 1670, to prevent IP ban
# note: range() excludes its stop value, so 1671 is needed to reach page 1670
for page in range(700, 1671):

    # Load the webpage content
    r = requests.get(url + str(page))

    # Convert to a beautiful soup object
    soup = bs(r.content, features="html.parser")

    # target container div
    content = soup.find_all("div", class_="col-md-12")
    
    # iterate through the container divs to collect every listing on the page;
    # otherwise only one value would be returned
    for property_ in content:
        # reset the defaults for each listing so that a missing field
        # does not inherit the value found in the previous listing
        address = ""
        bedroom = ""
        bathroom = ""
        property_price = ""

        # to target the property description and replace null values with "?"
        name = property_.find("h4", attrs={"class": "content-title"})
        property_title = name.string if name else "?"
        
        # the location is in a <strong> tag with an <i> icon tag nested in the
        # same <strong> tag. All the icons are selected and removed using
        # BeautifulSoup's decompose method. Then the location is targeted and
        # converted to a string
        location_icon = property_.select("address.voffset-bottom-10 strong i")
        for icon in location_icon:
            icon.decompose()
        location_ = property_.select("address.voffset-bottom-10 strong")
        for loc in location_:
            address = loc.string if loc else "?"
            
        # to target the bedroom and bathroom values. The values are in a <span> with no class name,
        # nested in a <li> tag that also has an <i> icon tag before the values.
        # the only unique identifier is the icon class name, which is used to target the values
        feature_bedroom = property_.select(".aux-info li i.fa-bed ~ span:nth-child(2)")
        for feature in feature_bedroom:
            bedroom = feature.string if feature else "?"
        feature_bathroom = property_.select(".aux-info li i.fa-bath ~ span:nth-child(2)")
        for feature in feature_bathroom:
            bathroom = feature.string if feature else "?"

        # to target the listed date and replace null values with "?"
        posted_date = property_.find("span", attrs={"class": "added-on"})
        period = posted_date.string if posted_date else "?"
        
        # the price values are deeply nested with <i> tags, so the parent container
        # with the class name pull-sm-left is selected, and then the price, which is the
        # second child of the container, is selected
        price = property_.select(".pull-sm-left > span:nth-child(2)")
        for price_tag in price:
            property_price = price_tag.string if price_tag else "?"
    
        # create dictionary and assign the targeted values to keys
        property_info = {
            "description": property_title,
            "location": address,
            "date": period,
            "bedroom": bedroom,
            "bathroom": bathroom,
            "price": property_price        
        }
    
        # store data into list by appending
        nigerian_homes.append(property_info)
    print("Nigerian house prices:",len(nigerian_homes))
    # pause between requests, which helps to prevent an IP ban
    # 5 seconds time delay
    time.sleep(5)

# store appended list in a pandas dataframe
df = pd.DataFrame(nigerian_homes)
#df.head()

# save results in a newly created csv file
df.to_csv("houses_ngnn_two.csv", index=False)

Nigerian house prices: 23
Nigerian house prices: 46
Nigerian house prices: 69
Nigerian house prices: 92
Nigerian house prices: 115
Nigerian house prices: 138
Nigerian house prices: 161
Nigerian house prices: 184
Nigerian house prices: 207
Nigerian house prices: 230
Nigerian house prices: 253
Nigerian house prices: 276
Nigerian house prices: 299
Nigerian house prices: 322
Nigerian house prices: 345
Nigerian house prices: 368
Nigerian house prices: 391
Nigerian house prices: 414
Nigerian house prices: 437
Nigerian house prices: 460
Nigerian house prices: 483
Nigerian house prices: 506
Nigerian house prices: 529
Nigerian house prices: 552
Nigerian house prices: 575
Nigerian house prices: 598
Nigerian house prices: 621
Nigerian house prices: 644
Nigerian house prices: 667
Nigerian house prices: 690
Nigerian house prices: 713
Nigerian house prices: 736
Nigerian house prices: 759
Nigerian house prices: 782
Nigerian house prices: 805
Nigerian house prices: 828
Nigerian house prices: 851
...

Nigerian house prices: 6808
Nigerian house prices: 6831
Nigerian house prices: 6854
Nigerian house prices: 6877
Nigerian house prices: 6900
Nigerian house prices: 6923
Nigerian house prices: 6946
Nigerian house prices: 6969
Nigerian house prices: 6992
Nigerian house prices: 7015
Nigerian house prices: 7038
Nigerian house prices: 7061
Nigerian house prices: 7084
Nigerian house prices: 7107
Nigerian house prices: 7130
Nigerian house prices: 7153
Nigerian house prices: 7176
Nigerian house prices: 7199
Nigerian house prices: 7222
Nigerian house prices: 7245
Nigerian house prices: 7268
Nigerian house prices: 7291
Nigerian house prices: 7314
Nigerian house prices: 7337
Nigerian house prices: 7360
Nigerian house prices: 7383
Nigerian house prices: 7406
Nigerian house prices: 7429
Nigerian house prices: 7452
Nigerian house prices: 7475
Nigerian house prices: 7498
Nigerian house prices: 7521
Nigerian house prices: 7544
Nigerian house prices: 7567
Nigerian house prices: 7590
...

Nigerian house prices: 13409
Nigerian house prices: 13432
Nigerian house prices: 13455
Nigerian house prices: 13478
Nigerian house prices: 13501
Nigerian house prices: 13524
Nigerian house prices: 13547
Nigerian house prices: 13570
Nigerian house prices: 13593
Nigerian house prices: 13616
Nigerian house prices: 13639
Nigerian house prices: 13662
Nigerian house prices: 13685
Nigerian house prices: 13708
Nigerian house prices: 13731
Nigerian house prices: 13754
Nigerian house prices: 13777
Nigerian house prices: 13800
Nigerian house prices: 13823
Nigerian house prices: 13846
Nigerian house prices: 13869
Nigerian house prices: 13892
Nigerian house prices: 13915
Nigerian house prices: 13938
Nigerian house prices: 13961
Nigerian house prices: 13984
Nigerian house prices: 14007
Nigerian house prices: 14030
Nigerian house prices: 14053
Nigerian house prices: 14076
Nigerian house prices: 14099
Nigerian house prices: 14122
Nigerian house prices: 14145
Nigerian house prices: 14168
...

Nigerian house prices: 19918
Nigerian house prices: 19941
Nigerian house prices: 19964
Nigerian house prices: 19987
Nigerian house prices: 20010
Nigerian house prices: 20033
Nigerian house prices: 20056
Nigerian house prices: 20079
Nigerian house prices: 20102
Nigerian house prices: 20125
Nigerian house prices: 20148
Nigerian house prices: 20171
Nigerian house prices: 20194
Nigerian house prices: 20217
Nigerian house prices: 20240
Nigerian house prices: 20263
Nigerian house prices: 20286
Nigerian house prices: 20309
Nigerian house prices: 20332
Nigerian house prices: 20355
Nigerian house prices: 20378
Nigerian house prices: 20401
Nigerian house prices: 20424
Nigerian house prices: 20447
Nigerian house prices: 20470
Nigerian house prices: 20493
Nigerian house prices: 20516
Nigerian house prices: 20539
Nigerian house prices: 20562
Nigerian house prices: 20585
Nigerian house prices: 20608
Nigerian house prices: 20631
Nigerian house prices: 20654
Nigerian house prices: 20677
...
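
The run above covers one of two batches of pages. A minimal sketch of how the page numbers could be split into batches is shown below. `page_batches` is a hypothetical helper, and the batch size of 700 is an assumption based on the comment in the scraping cell. Because `range()` excludes its stop value, the last batch stops at the last page plus one.

```python
def page_batches(first_page, last_page, batch_size):
    """Split an inclusive page interval into consecutive range objects."""
    batches = []
    start = first_page
    while start <= last_page:
        # the stop value of range() is exclusive, hence last_page + 1
        stop = min(start + batch_size, last_page + 1)
        batches.append(range(start, stop))
        start = stop
    return batches


batches = page_batches(1, 1670, 700)
for batch in batches:
    print(batch[0], "to", batch[-1], "->", len(batch), "pages")
```

Each batch can then be scraped in its own session, and no page is requested twice or skipped.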

In [3]:
# see first 5 results in the dataframe
df.head()

| Unnamed: 0 | description | location | date | bedroom | bathroom | price |
|---|---|---|---|---|---|---|
| 0 | ? | | ? | | | |
| 1 | 5 bedroom semi-detached duplex for sale | Old Ikoyi, Ikoyi, Lagos | Added on 06 Jun 2020 | 5.0 | 5.0 | 250000000.0 |
| 2 | 4 bedroom semi-detached duplex for sale | Southern View Estate, Chevron, Second Tollga... | Added on 29 Jul 2020 | 4.0 | 4.0 | 43000000.0 |
| 3 | 4 bedroom terraced duplex for sale | Off Banana Link Road, Old Ikoyi, Ikoyi, Lagos | Added on 29 Jul 2020 | 4.0 | 4.0 | 155000000.0 |
| 4 | 5 bedroom detached duplex for sale | By Southern View Estate, Second Tollgate, La... | Added on 29 Jul 2020 | 5.0 | 5.0 | 65000000.0 |
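
Row 0 in the preview contains only the "?" placeholders, which suggests that not every `col-md-12` container on a page is a listing. A minimal sketch of how such rows could be filtered out is shown below. The `is_listing` helper is hypothetical, and the sample records are shortened, illustrative versions of the rows in the preview.

```python
# illustrative records shaped like the dictionaries built by the scraper
records = [
    {"description": "?", "location": "", "date": "?",
     "bedroom": "", "bathroom": "", "price": ""},
    {"description": "5 bedroom semi-detached duplex for sale",
     "location": "Old Ikoyi, Ikoyi, Lagos", "date": "Added on 06 Jun 2020",
     "bedroom": "5", "bathroom": "5", "price": "250000000"},
]


def is_listing(record):
    """Treat a record as a real listing only if a title was found."""
    return record["description"] not in ("?", "", None)


listings = [record for record in records if is_listing(record)]
print(len(records), "records scraped,", len(listings), "kept")
```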


### Conclusion

The data has been successfully extracted and exported to a CSV file.

Data cleaning, pre-processing and analysis can now be done.
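
As a pointer to the cleaning step, the sketch below parses `date` strings such as "Added on 06 Jun 2020" into `date` objects. The `parse_listed_date` helper is hypothetical, and the "Added on" prefix and day-month-year layout are assumptions based only on the five rows shown in the preview.

```python
from datetime import date, datetime
from typing import Optional


def parse_listed_date(text: str) -> Optional[date]:
    """Convert 'Added on 06 Jun 2020' to a date; return None if it does not match."""
    prefix = "Added on "
    if not text.startswith(prefix):
        return None
    try:
        # %d %b %Y matches a layout such as '06 Jun 2020' (English month names)
        return datetime.strptime(text[len(prefix):].strip(), "%d %b %Y").date()
    except ValueError:
        return None


print(parse_listed_date("Added on 06 Jun 2020"))
print(parse_listed_date("?"))
```

Returning `None` for anything that does not match keeps the "?" placeholders from raising errors during cleaning.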

#### ML Engineer - Ubanna Dan-Ekeh