## Import The Necessary Libraries

#### **What I want to do:**
I want to import the necessary libraries for web scraping, data manipulation, and data visualization, as well as for building and evaluating machine learning models.

#### **Why I want to do it:**
I need these libraries to collect data from websites, process and analyze the data, and prepare the findings for visualization in Power BI.

#### **How I want to do it:**
- `import requests`: This library will allow me to send HTTP requests to retrieve data from web pages.
- `from bs4 import BeautifulSoup`: BeautifulSoup is used for parsing HTML and XML documents, making it easier to extract data from web pages.
- `import pandas as pd`: Pandas is a powerful data manipulation library that helps in handling structured data effectively.
- `import numpy as np`: NumPy is used for numerical operations and handling arrays.
- `import seaborn as sns`: Seaborn is a visualization library based on Matplotlib that provides a high-level interface for drawing attractive statistical graphics.
- `import matplotlib.pyplot as plt`: Matplotlib is a plotting library used to create static, interactive, and animated visualizations in Python.
- `from sklearn.model_selection import train_test_split`: This function is used to split datasets into training and testing sets, which is essential for evaluating machine learning models.
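Before running the import cell, it can help to confirm that each package is installed. The following is a minimal sketch, not part of the original notebook: `REQUIRED_MODULES` and `missing_modules` are hypothetical names introduced here, and the check uses only the standard library.

```python
from importlib.util import find_spec

# Top-level module names behind the imports listed above
REQUIRED_MODULES = ["requests", "bs4", "pandas", "numpy",
                    "seaborn", "matplotlib", "sklearn"]


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required modules are available.")
```

Note that two of the pip package names differ from the module names: `bs4` is installed as `beautifulsoup4` and `sklearn` as `scikit-learn`.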



In [1]:
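# Import the needed libraries

import requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split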

#### **Result**:
I successfully imported the libraries needed to collect and process the data.



---



## Scrape The Data From Nigerian Properties Website

#### **What I want to do:**
I want to scrape property listings from a website that features flats and apartments for rent in Lagos, Nigeria. The goal is to collect data such as property names, prices, addresses, and additional information.

#### **Why I want to do it:**
I want to gather this data to analyze the rental market in Lagos, understand pricing trends, and identify available properties. This information can be useful for potential renters or investors looking to make informed decisions.

#### **How I want to do it:**
- Initialize empty lists: `names`, `prices`, `addresses`, and `info` to store the scraped data.
- Use a `for` loop to iterate over a specified range (in this case, just one page) to construct the URL for the property listings.
- Send a GET request to the constructed URL using `requests.get(url)` and parse the response content with BeautifulSoup.
- Use `soup.find_all()` to extract property names, prices, addresses, and additional information from the HTML structure of the page:
  - `names_raw`: Extracts property names from the specified HTML tags and classes.
  - `prices_raw`: Extracts property prices, which may include currency symbols.
  - `address_raw`: Extracts the addresses of the properties.
  - `info_raw`: Extracts additional information about the properties.
- Clean the `prices` list by dropping entries that consist only of a currency symbol (₦ or $).
- Print the lengths of the lists to ensure the data has been collected correctly.
- Create a DataFrame using Pandas to organize the collected data into a structured format.
- Save the DataFrame to a CSV file named 'Lagos_properties.csv' for further analysis.
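The extraction and cleaning steps above can be tried offline on a small HTML fragment before hitting the live site. This is a sketch under stated assumptions: `SAMPLE_HTML` is a hypothetical fragment that only imitates the tags and classes named in the list, and `parse_listings` is a helper name introduced for illustration. Unlike the notebook cell, it strips surrounding whitespace with `get_text(strip=True)`.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that imitates the tags and classes named above;
# the live page's markup may differ.
SAMPLE_HTML = """
<div>
  <h4 class="content-title">2 bedroom flat</h4>
  <span class="price">₦</span><span class="price">1,500,000</span>
  <address class="voffset-bottom-10">Lekki, Lagos</address>
  <ul class="aux-info"><li>2 Bedrooms</li><li>2 Bathrooms</li></ul>
</div>
"""


def parse_listings(html):
    """Collect the four fields from one page of listing HTML."""
    soup = BeautifulSoup(html, "html.parser")

    def texts(tag, css_class):
        # Text of every matching element, with whitespace trimmed
        return [el.get_text(" ", strip=True)
                for el in soup.find_all(tag, class_=css_class)]

    # Drop entries that are only a currency symbol
    prices = [p for p in texts("span", "price") if p not in ("₦", "$")]
    return {
        "Name": texts("h4", "content-title"),
        "Price": prices,
        "Address": texts("address", "voffset-bottom-10"),
        "Info": texts("ul", "aux-info"),
    }


columns = parse_listings(SAMPLE_HTML)

# All four lists must have the same length before building a DataFrame
lengths = {key: len(values) for key, values in columns.items()}
print(lengths)
```

If the lengths match, the returned dictionary can be passed straight to `pd.DataFrame(columns)`. If they differ, pandas raises a `ValueError`, which is why the lengths are printed first.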



In [None]:
names = []
prices = []
addresses = []
info = []

# range(1, 2) covers page 1 only; raise the upper bound to scrape more pages
for page in range(1, 2):
  url = 'https://nigeriapropertycentre.com/for-rent/flats-apartments/lagos/showtype?page=' + str(page)
  response = requests.get(url)
  soup = BeautifulSoup(response.content, 'html.parser')

  # Property names
  names_raw = soup.find_all('h4', class_='content-title')
  for tag in names_raw:
    names.append(tag.text)

  # Prices (a currency symbol may come back as a separate entry)
  prices_raw = soup.find_all('span', class_='price')
  for tag in prices_raw:
    prices.append(tag.text)

  # Addresses
  address_raw = soup.find_all('address', class_='voffset-bottom-10')
  for tag in address_raw:
    addresses.append(tag.text)

  # Additional information
  info_raw = soup.find_all('ul', class_='aux-info')
  for tag in info_raw:
    info.append(tag.text)



# Drop entries that consist only of a Naira or Dollar sign
prices = [item for item in prices if item not in ['₦', '$']]


print(len(names))
print(len(prices))

print(len(addresses))
print(len(info))

df = pd.DataFrame({'Name':names,'Price':prices,'Address':addresses,'Info':info})


df.to_csv('Lagos_properties.csv')

#### **Result:**
The data was scraped successfully, but I only scraped one page as a test (`range(1, 2)` covers page 1 only). The range can be widened to cover as many pages as the site has available.
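One way to widen the page range is to build the URLs up front and fetch them with a short pause between requests. This is a rough sketch rather than the notebook's own method: `page_urls` and `fetch_pages` are hypothetical helpers, and the one-second delay and ten-second timeout are assumed values, not requirements of the site.

```python
import time

import requests

BASE_URL = ("https://nigeriapropertycentre.com/for-rent/"
            "flats-apartments/lagos/showtype?page=")


def page_urls(first_page, last_page):
    """Build one listing URL per page, both ends inclusive."""
    return [BASE_URL + str(page) for page in range(first_page, last_page + 1)]


def fetch_pages(urls, delay_seconds=1.0, timeout=10):
    """Download each URL in turn and return the raw HTML bytes per page."""
    pages = []
    for url in urls:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # stop early on 4xx/5xx responses
        pages.append(response.content)
        time.sleep(delay_seconds)    # assumed pause between requests
    return pages
```

For example, `fetch_pages(page_urls(1, 5))` would download pages 1 through 5, and each returned item can be handed to `BeautifulSoup(..., 'html.parser')` in the same way as in the scraping cell.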