# 🏙️ Hyderabad Rental Market Data Scraper

### *Author:* Ramasahayam Keerthi Reddy  
**Description:**  
This project scrapes rental listings for **Hyderabad, Telangana** from [Housing.com](https://housing.com) using Python’s `requests` and `BeautifulSoup` libraries.  
The extracted fields include **room count, unit type, property category, locality, society name, price, area, furnishing status, and key amenities**, and are saved as a structured **CSV file** for further analysis.

---



## 🔗 Importing Required Libraries


In [11]:
import requests
from bs4 import BeautifulSoup
import re

## 🌐 Step 1: Generating Page URLs


In [12]:
urls=["https://housing.com/rent/house-for-rent-in-hyderabad-telangana-M2P679xe73u28050522"]

In [13]:
for i in range(2,101):
    urls.append("https://housing.com/rent/flats-for-rent-in-hyderabad-telangana-P679xe73u28050522?page="+str(i))
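
The same list can also be built in a single expression. The sketch below is only an illustration: it reuses the two URLs shown above unchanged, and `build_page_urls` with its parameter names is a hypothetical helper, not part of the original notebook.

```python
FIRST_PAGE = "https://housing.com/rent/house-for-rent-in-hyderabad-telangana-M2P679xe73u28050522"
PAGED_BASE = "https://housing.com/rent/flats-for-rent-in-hyderabad-telangana-P679xe73u28050522"


def build_page_urls(first_page: str, paged_base: str, last_page: int) -> list[str]:
    """Return the first-page URL followed by pages 2..last_page of the paged URL."""
    return [first_page] + [f"{paged_base}?page={n}" for n in range(2, last_page + 1)]


urls = build_page_urls(FIRST_PAGE, PAGED_BASE, last_page=100)
print(len(urls))  # 100
```

Keeping the page count as a parameter makes it easy to test the scraper on a handful of pages before running all 100.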

## 🧾 Step 2: Initializing Data Storage Lists


In [14]:
no_of_rooms=[]
unit_type=[]
property_category=[]
locality=[]
society_name=[]
price=[]
area_sqft=[]
furnishing_status=[]
pool=[]
gym=[]
Parking=[]
Lift=[]
Close_to_Hospital=[]
power_backup=[]
Kids_Area=[]
Security=[]
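
Sixteen separate lists stay aligned only if every cell appends exactly one value per listing. One possible alternative, sketched here under that assumption, keeps the same column names in a single dictionary; `COLUMNS`, `columns`, and `append_row` are hypothetical names introduced for illustration.

```python
COLUMNS = [
    "no_of_rooms", "unit_type", "property_category", "locality", "society_name",
    "price", "area_sqft", "furnishing_status", "pool", "gym", "Parking", "Lift",
    "Close_to_Hospital", "power_backup", "Kids_Area", "Security",
]

# One list per column, keyed by name, instead of sixteen separate variables.
columns = {name: [] for name in COLUMNS}


def append_row(row: dict) -> None:
    """Append one listing; fields missing from `row` become None so columns stay aligned."""
    for name in COLUMNS:
        columns[name].append(row.get(name))


append_row({"no_of_rooms": "2", "unit_type": "BHK", "gym": "Yes"})
print({name: len(values) for name, values in columns.items()})
```

A dictionary of equal-length lists can be passed to `pd.DataFrame` directly, so this layout would not change the final step.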

## 🧠 Step 3: Defining HTTP Request Headers


In [15]:
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "Accept-Encoding": "gzip, deflate, br, zstd",
    "Accept-Language": "en-US,en;q=0.9",
    "Cache-Control": "max-age=0",
    "Referer": "https://in.search.yahoo.com/",
    "Sec-CH-UA": '"Google Chrome";v="141", "Not?A_Brand";v="8", "Chromium";v="141"',
    "Sec-CH-UA-Mobile": "?0",
    "Sec-CH-UA-Platform": '"Windows"',
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "cross-site",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36",
    "Cookie": "Akamai-GRN=aka-0.acf03517.1759478128.378448dd; category=residential; ... (trimmed)"
}
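
Each extraction cell in Step 4 sends these headers and downloads every page again. A minimal sketch of fetching each page once and reusing the HTML follows. `fetch_all`, `get_html`, and `delay_seconds` are hypothetical names, and the download is replaced by a stand-in function so the example runs offline; in the notebook, `get_html` could wrap `requests.get(url, headers=headers, timeout=30).text`.

```python
import time
from typing import Callable, Dict, Iterable


def fetch_all(urls: Iterable[str], get_html: Callable[[str], str],
              delay_seconds: float = 0.0) -> Dict[str, str]:
    """Download each distinct URL once and return a {url: html} cache."""
    cache: Dict[str, str] = {}
    for url in urls:
        if url not in cache:
            cache[url] = get_html(url)
            time.sleep(delay_seconds)  # optional pause between requests
    return cache


# Stand-in for the real download so the sketch runs without network access.
calls = []


def fake_get_html(url: str) -> str:
    calls.append(url)
    return f"<html>{url}</html>"


pages = fetch_all(["page-1", "page-2", "page-1"], fake_get_html)
print(len(pages), len(calls))  # 2 2
```

With the HTML cached, every field can be parsed from the same download, which also guarantees that all columns describe the same set of listings.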

## 🕷️ Step 4: Extracting Data from Housing.com

### - Extracting Number of Rooms and Unit Type

In [6]:
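for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        # The capturing group keeps the delimiter: [text before, "BHK" or "RK", text after]
        parts=re.split(r'(\bBHK\b|\bRK\b)', i.text)
        no_of_rooms.append(parts[0].strip())
        # Cards without BHK/RK get an empty unit type so both lists stay the same length
        unit_type.append(parts[1] if len(parts)>1 else "")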
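
`re.split` keeps the delimiter in its result when the pattern is wrapped in a capturing group, which is why index `0` holds the room count and index `1` the unit type. A small illustration on made-up title strings (the real card text may differ):

```python
import re

pattern = r'(\bBHK\b|\bRK\b)'

parts = re.split(pattern, "2 BHK Flat for rent in Gachibowli")
print(parts)  # ['2 ', 'BHK', ' Flat for rent in Gachibowli']

# With no BHK/RK in the text the result has a single element, so index 1 does not exist.
print(re.split(pattern, "Independent House for rent"))  # ['Independent House for rent']
```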
    

### - Extracting Property Category

In [147]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        property_category.append(i.text.split('for rent')[0].split()[2])

### - Extracting Locality & Society Name

In [149]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        locality.append(i.text.split('  ')[0].split('for rent in')[-1])

In [156]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        society_name.append(i.text.split('₹')[0].split("  ")[-1])

### - Extracting Price and Area

In [159]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        price.append(i.text.split('₹')[1].split("see")[0])

In [162]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        area_sqft.append(i.text.split("breakup")[1].split()[0])

### - Extracting Furnishing Status

In [164]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        furnishing_status.append(i.text.split("Furnishing status")[0].split("sq.ftBuiltup")[-1])
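
The string splits used in the cells above can be collected into one function and tried on a single string before running them over every page. This is a sketch only: `parse_card` is a hypothetical helper, and `sample` is invented text built to contain the same delimiters, not a real Housing.com card.

```python
import re


def parse_card(text: str) -> dict:
    """Extract the Step 4 text fields from one card using the same delimiters."""
    parts = re.split(r'(\bBHK\b|\bRK\b)', text)
    return {
        "no_of_rooms": parts[0].strip(),
        "unit_type": parts[1] if len(parts) > 1 else "",
        "property_category": text.split('for rent')[0].split()[2],
        "locality": text.split('  ')[0].split('for rent in')[-1].strip(),
        "society_name": text.split('₹')[0].split("  ")[-1],
        "price": text.split('₹')[1].split("see")[0],
        "area_sqft": text.split("breakup")[1].split()[0],
        "furnishing_status": text.split("Furnishing status")[0].split("sq.ftBuiltup")[-1],
    }


# Hypothetical card text, constructed only to exercise the delimiters above.
sample = ("2 BHK Flat for rent in Gachibowli  Example Heights₹35,000see price "
          "breakup1200 sq.ftBuiltupSemi FurnishedFurnishing status")
print(parse_card(sample))
```

Most of these lookups raise `IndexError` when a delimiter is missing, so a card with an unexpected layout would stop the loop rather than silently misalign the lists.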

### - Extracting Amenities (Gym, Pool, Parking, etc.)

In [165]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bGym",i.text):  # one or more mentions count as "Yes"
            gym.append("Yes")
        else:
            gym.append("No")

In [167]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bPool",i.text):  # one or more mentions count as "Yes"
            pool.append("Yes")
        else:
            pool.append("No")

In [170]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bLift",i.text):
            Lift.append("Yes")
        else:
            Lift.append("No")

In [171]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bParking",i.text):
            Parking.append("Yes")
        else:
            Parking.append("No")

In [172]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bClose to Hospital",i.text):
            Close_to_Hospital.append("Yes")
        else:
            Close_to_Hospital.append("No")

In [122]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bPower Backup",i.text):
            power_backup.append("Yes")
        else:
            power_backup.append("No")

In [123]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\bKids Area",i.text):
            Kids_Area.append("Yes")
        else:
            Kids_Area.append("No")

In [124]:
for url in urls:
    resp=requests.get(url,headers=headers)
    soup=BeautifulSoup(resp.text,"html.parser")
    a=soup.find_all("div",class_="infoTopContainer")
    for i in a:
        if re.search(r"\b24x7 Security",i.text):  # re.search returns a match object or None, never a list
            Security.append("Yes")
        else:
            Security.append("No")
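
The eight amenity cells differ only in the phrase they look for and the list they fill, so the checks can be driven from one table. The sketch below is an illustration under assumptions: `AMENITY_PATTERNS` and `amenity_flags` are hypothetical names, the patterns are based on the phrases used above, and the sample text is invented.

```python
import re

# Column name -> pattern, based on the phrases searched for in the cells above.
AMENITY_PATTERNS = {
    "gym": r"\bGym",
    "pool": r"\bPool",
    "Lift": r"\bLift",
    "Parking": r"\bParking",
    "Close_to_Hospital": r"\bClose to Hospital",
    "power_backup": r"\bPower Backup",
    "Kids_Area": r"\bKids Area",
    "Security": r"\b24x7 Security",
}


def amenity_flags(text: str) -> dict:
    """Return "Yes"/"No" per amenity, counting one or more mentions as "Yes"."""
    return {
        name: "Yes" if re.search(pattern, text) else "No"
        for name, pattern in AMENITY_PATTERNS.items()
    }


# Hypothetical card text for illustration only.
flags = amenity_flags("Semi Furnished Gym Lift 24x7 Security Gym")
print(flags["gym"], flags["pool"], flags["Security"])  # Yes No Yes
```

Using `re.search` for every amenity treats a phrase that appears twice in a card the same as one that appears once.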

## 🧮 Step 5: Validating Data Lengths


In [183]:
for lst in [no_of_rooms, unit_type, property_category, locality, society_name, price, area_sqft,
            furnishing_status, pool, gym, Parking, Lift, Close_to_Hospital, power_backup, Kids_Area, Security]:
    print(len(lst))


3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
3000
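
Every list holds 3,000 entries, which works out to 30 listings per page on average across the 100 URLs. Comparing sixteen printed numbers by eye can be replaced with an explicit check. This is a minimal sketch using toy data; `check_equal_lengths` is a hypothetical helper, and in the notebook its argument would map column names to the Step 2 lists.

```python
def check_equal_lengths(columns: dict) -> int:
    """Return the shared length, or raise ValueError listing every column's length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return next(iter(lengths.values()), 0)


# Toy data for illustration only.
print(check_equal_lengths({"price": ["35,000", "22,000"], "gym": ["Yes", "No"]}))  # 2
```

`pd.DataFrame` also raises an error when given lists of different lengths, but a check like this reports which column is short.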


## 💾 Step 6: Creating DataFrame and Saving to CSV


In [16]:
import pandas as pd
data = {
    'No_of_Rooms': no_of_rooms,
    'Unit_Type': unit_type,
    'Property_Category': property_category,
    'Locality': locality,
    'Society_Name': society_name,
    'Price_INR': price,
    'Area_sqft': area_sqft,
    'Furnishing_Status': furnishing_status,
    'Has_Pool': pool,
    'Has_Gym': gym,
    'Has_Parking': Parking,
    'Has_Lift': Lift,
    'Close_to_Hospital': Close_to_Hospital,
    'Power_Backup': power_backup,
    'Kids_Play_Area': Kids_Area,
    'Security_24x7': Security
}

df = pd.DataFrame(data)
df.to_csv("webscrap1.csv", index=False)
print("DataFrame created successfully")


DataFrame created successfully
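
The price and area columns are saved as text, so numeric analysis needs a conversion step. The sketch below assumes values look like `35,000` or `1200`; that format is an assumption, not something confirmed by the scraped data. Anything else is returned as `None` rather than guessed at, and `to_int` is a hypothetical helper.

```python
import re
from typing import Optional


def to_int(value: str) -> Optional[int]:
    """Convert "35,000" or " 1200 " to an int; return None for any other format."""
    match = re.fullmatch(r"\s*(\d[\d,]*)\s*", value)
    return int(match.group(1).replace(",", "")) if match else None


print(to_int("35,000"))      # 35000
print(to_int(" 1200 "))      # 1200
print(to_int("On request"))  # None
```

In the notebook this could be applied with `df['Price_INR'].map(to_int)` before calling `to_csv`.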


# ✅ Data Scraping Completed Successfully!
**File Saved As:** `webscrap1.csv`

This notebook scraped 3,000 rental listing records (16 columns each) for **Hyderabad, Telangana** from 100 result pages and saved them as a structured CSV file for analysis.
