# **Problem Statement**:
The objective of this project is to develop a robust data preprocessing pipeline for real estate property listings. The pipeline should encompass both numerical and textual data, focusing on cleaning, transforming, and enriching the dataset to provide a foundation for accurate analysis and modeling.

# **Dataset Description**:

The dataset comprises the following columns:



* **property_name**: Name or title associated with the property listing.

* **property_type**: Type or category of the property (e.g., apartment, villa, penthouse).

* **society**: Name of the housing society or complex where the property is situated.

* **price**: Selling or listing price of the property.

* **price_per_sqft**: Price per square foot of the property.

* **area**: Estimated area of the property in square units.

* **areaWithType**: Combination of area and property type, providing size and type context.

* **bedRoom**: Number of bedrooms in the property.

* **bathroom**: Number of bathrooms in the property.

* **balcony**: Number of balconies attached to the property.

* **additionalRoom**: Count of extra rooms or spaces in the property.

* **address**: Location details of the property, including city and locality.

* **floorNum**: Floor number of the property within the building.

* **facing**: Direction the property is facing (e.g., north, south, east).

* **agePossession**: Age of the property or years since construction.

* **nearbyLocations**: Landmarks or neighborhoods near the property's location.

* **description**: Detailed description of property features and amenities.

* **furnishDetails**: Level of furnishing in the property (e.g., fully furnished, semi-furnished).

* **features**: Special amenities associated with the property (e.g., pool, security).

* **rating**: Rating or feedback score assigned to the property.



# **Data Preprocessing**:



* Handle missing values, correct data types, and remove duplicates.

* Calculate 'area' based on price and price per square foot.

* Encode categorical variables like property type and facing direction.

* Apply further column-specific cleaning (society names, prices, room counts, and so on).

* Export the cleaned data.
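As a rough illustration of the area step: once `price` is expressed in crore and `price_per_sqft` in rupees per square foot (both conversions are done later in this notebook), the area follows from dividing one by the other. The sketch below uses a small hand-made frame; the column name `area_sqft` and the constant `RUPEES_PER_CRORE` are placeholders of mine, not names from the dataset.

```python
import pandas as pd

# Toy rows; the values mirror two listings that appear in this notebook
toy = pd.DataFrame({
    'price': [5.25, 0.95],                 # in crore
    'price_per_sqft': [20115.0, 8878.0],   # rupees per sq.ft.
})

RUPEES_PER_CRORE = 10_000_000  # 1 crore = 10^7 rupees

# area (sq.ft.) = total price in rupees / price per sq.ft.
toy['area_sqft'] = (toy['price'] * RUPEES_PER_CRORE / toy['price_per_sqft']).round()
print(toy)
```

Rounding to whole square feet is a choice made for this sketch only; the real pipeline may keep decimals.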



# **Import Basic Libraries**

In [1]:
import numpy as np
import pandas as pd

In [2]:
pd.set_option('display.max_rows',None)
pd.set_option('display.max_columns',None)

In [3]:
df = pd.read_csv('houses.csv')
df.head(3)

Unnamed: 0,property_name,link,society,price,rate,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating,property_id
0,5 Bedroom House for sale in Sector 70A Gurgaon,https://www.99acres.com/5-bhk-bedroom-independ...,Bptp Visionnaire,5.25 Crore,"₹ 20,115/sq.ft.",(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5 Bedrooms,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5...",K70037724
1,5 Bedroom House for sale in Sector 21A Faridabad,https://www.99acres.com/5-bhk-bedroom-independ...,,5.7 Crore,"₹ 105,751/sq.ft.",(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5 Bedrooms,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",,E69288322
2,10 Bedroom House for sale in Sushant Lok Phase 1,https://www.99acres.com/10-bhk-bedroom-indepen...,,2.1 Crore,"₹ 38,251/sq.ft.",(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10 Bedrooms,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5...",F69536898


# **Basic Checks**

In [4]:
# shape
df.shape

(1036, 21)

In [5]:
# Check for missing values
df.isnull().sum()

property_name        0
link                 0
society            584
price               68
rate                38
area                 0
areaWithType        49
bedRoom             49
bathroom            49
balcony             49
additionalRoom     447
address              5
noOfFloor           69
facing             362
agePossession       49
nearbyLocations    123
description          0
furnishDetails     293
features           362
rating             129
property_id          0
dtype: int64
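Raw counts are easier to compare when expressed as a share of rows. A minimal sketch on a made-up frame (the values are invented; only the column names are borrowed from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame with some gaps
toy = pd.DataFrame({
    'society': ['a', np.nan, np.nan, 'b'],
    'price': ['1 Crore', '50 Lac', np.nan, '2 Crore'],
    'area': ['x', 'y', 'z', 'w'],
})

# isnull() gives booleans; their mean is the fraction of missing rows per column
missing_pct = toy.isnull().mean().mul(100).round(1).sort_values(ascending=False)
print(missing_pct)
```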

In [6]:
# Duplicates
df.duplicated().sum()

13
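The count above only reports duplicated rows. One way to remove them, sketched on a made-up frame, is `drop_duplicates`; resetting the index afterwards is optional and is an assumption of this sketch, not something the notebook does.

```python
import pandas as pd

# Hypothetical listings; the second row repeats the first
toy = pd.DataFrame({
    'property_name': ['A', 'A', 'B'],
    'price': ['1 Crore', '1 Crore', '50 Lac'],
})

n_dupes = int(toy.duplicated().sum())       # rows identical to an earlier row
deduped = toy.drop_duplicates().reset_index(drop=True)
print(n_dupes, len(deduped))
```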

In [7]:
# info
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1036 entries, 0 to 1035
Data columns (total 21 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   property_name    1036 non-null   object
 1   link             1036 non-null   object
 2   society          452 non-null    object
 3   price            968 non-null    object
 4   rate             998 non-null    object
 5   area             1036 non-null   object
 6   areaWithType     987 non-null    object
 7   bedRoom          987 non-null    object
 8   bathroom         987 non-null    object
 9   balcony          987 non-null    object
 10  additionalRoom   589 non-null    object
 11  address          1031 non-null   object
 12  noOfFloor        967 non-null    object
 13  facing           674 non-null    object
 14  agePossession    987 non-null    object
 15  nearbyLocations  913 non-null    object
 16  description      1036 non-null   object
 17  furnishDetails   743 non-null    

# **Drop Unwanted Columns**

In [8]:
# Drop columns that are not needed for analysis -> link, property_id

df.drop(columns=['link','property_id'],inplace=True)

In [9]:
df.head(3)

Unnamed: 0,property_name,society,price,rate,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,Bptp Visionnaire,5.25 Crore,"₹ 20,115/sq.ft.",(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5 Bedrooms,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,,5.7 Crore,"₹ 105,751/sq.ft.",(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5 Bedrooms,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,,2.1 Crore,"₹ 38,251/sq.ft.",(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10 Bedrooms,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **Rename Some columns**

In [10]:
df.rename(columns={'rate':'price_per_sqft'},inplace=True)

In [11]:
df.head(3)

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,Bptp Visionnaire,5.25 Crore,"₹ 20,115/sq.ft.",(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5 Bedrooms,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,,5.7 Crore,"₹ 105,751/sq.ft.",(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5 Bedrooms,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,,2.1 Crore,"₹ 38,251/sq.ft.",(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10 Bedrooms,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **Society**

In [12]:
df['society'].value_counts()

International City by SOBHA Phase 2                  26
Emaar MGF Marbella                                   26
Vipul Tatvam Villa                                   26
International City by Sobha Phase 1                  23
Unitech Uniworld Resorts                             13
DLF City Plots                                       11
DLF City Plots Phase 2                               11
Unitech Espace                                       11
Eldeco Mansionz                                       9
Bptp Visionnaire                                      9
Uppal Southend4.0 ★                                   8
Not Applicable                                        7
Ansals Palam Vihar                                    6
Experion Windchants3.8 ★                              6
Sushant Lok 1 Builder Floors                          6
DLF City Phase 1                                      5
Vatika India Next4.1 ★                                4
nul                                             

In [13]:
df['society'].value_counts().shape

(185,)

In [14]:
import re
df['society']=df['society'].apply(lambda name: re.sub(r'\d+(\.\d+)?\s?★', '', str(name)).strip()).str.lower()

In [15]:
df['society'].value_counts().shape

(183,)
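To see what the regular expression removes, here is a small sketch on three names taken from the value counts above. The pattern targets a trailing rating such as `4.0 ★`; names without the star are left untouched apart from lower-casing.

```python
import re
import pandas as pd

names = pd.Series([
    'Uppal Southend4.0 ★',
    'Experion Windchants3.8 ★',
    'DLF City Phase 1',          # digits but no star: must stay as-is
])

# digits, optional decimal part, optional space, then the star
rating_suffix = re.compile(r'\d+(\.\d+)?\s?★')

cleaned = names.apply(lambda name: rating_suffix.sub('', name).strip()).str.lower()
print(cleaned.tolist())
```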

In [16]:
# Exact-match replace: str.replace would also rewrite 'nan' inside longer names
df['society'] = df['society'].replace('nan', 'independent')
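One detail worth keeping in mind when mapping the `'nan'` placeholder to `'independent'`: `Series.str.replace` works on substrings, whereas `Series.replace` matches whole values. The sketch below uses an invented society name (`'ananda homes'`) purely to show the difference.

```python
import pandas as pd

society = pd.Series(['nan', 'ananda homes'])  # second name is hypothetical

# Substring replacement also touches the 'nan' inside 'ananda'
by_substring = society.str.replace('nan', 'independent', regex=False)

# Whole-value replacement only touches the placeholder itself
by_value = society.replace('nan', 'independent')

print(by_substring.tolist())
print(by_value.tolist())
```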

In [17]:
df.sample(5)

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
540,6 Bedroom House for sale in DLF Phase 2,independent,13 Crore,"₹ 45,710/sq.ft.",(264 sq.m.) Plot Area,Plot area 316(264.22 sq.m.),6 Bedrooms,8 Bathrooms,3+ Balconies,"Pooja Room,Servant Room,Study Room,Store Room","DLF Phase 2, Gurgaon, Haryana",,,1 to 5 Year Old,"['Chintapurni Mandir', 'State bank ATM', 'Esic...",An impeccable opportunity is here for those wh...,"['1 Water Purifier', '1 Fridge', '1 Sofa', '1 ...",,"['Environment5 out of 5', 'Lifestyle5 out of 5..."
169,4 Bedroom House for sale in Ashok Vihar Phase ...,ashok vihar phase 3 gurgaon,95 Lac,"₹ 8,878/sq.ft.",(99 sq.m.) Plot Area,Plot area 1070(99.41 sq.m.),4 Bedrooms,3 Bathrooms,2 Balconies,,"House No 58 Gali No 11, Ashok Vihar Phase 3 Ex...",1 Floors,East,10+ Year Old,"['Palam Vihar Vyapar kendra', 'Palam triangle'...",60 60 gaj ke 2 plot bna rkhe hai 2 side wala p...,,['Park'],"['Environment3 out of 5', 'Lifestyle4 out of 5..."
738,12 Bedroom House for sale in DLF Phase 1,luxury dlf city floors,8.25 Crore,"₹ 45,833/sq.ft.",(167 sq.m.) Plot Area,Plot area 200(167.23 sq.m.),12 Bedrooms,12 Bathrooms,3+ Balconies,Servant Room,"A Block Dlf-1, DLF Phase 1, Gurgaon, Haryana",4 Floors,North,0 to 1 Year Old,"['Sikandarpur metro station', 'Sikanderpur met...",For sale brand new building having stilt and 4...,"['20 Fan', '1 Exhaust Fan', '12 Geyser', '40 L...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
841,3 Bedroom House for sale in Nirvana Country,unitech aspen greens,12 Crore,"₹ 37,037/sq.ft.",(301 sq.m.) Plot Area,Plot area 360(301.01 sq.m.),3 Bedrooms,4 Bathrooms,2 Balconies,"Study Room,Servant Room","Nirvana Country, Gurgaon, Haryana",1 Floors,East,5 to 10 Year Old,"['Radhakrishna Shani Mandir', 'Standard charte...",Park feacing villa for sale in nirvana country...,"['3 Wardrobe', '3 Fan', '1 Exhaust Fan', '3 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Safety5 out of 5', ..."
328,2 Bedroom House for sale in Sector 13 Gurgaon,independent,1.5 Crore,"₹ 10,518/sq.ft.",(132 sq.m.) Plot Area,Plot area 1426(132.48 sq.m.),2 Bedrooms,2 Bathrooms,2 Balconies,"Pooja Room,Study Room","Hno. 26, A Block, Sector 13 Gurgaon, Gurgaon, ...",3 Floors,South-West,10+ Year Old,"['Sheetla Mata Mandir', 'Hsbc bank ATM', 'Hdfc...","The house is located in sanjay gram, sector 13...",,"['Water Storage', 'Waste Disposal']","['Environment4 out of 5', 'Lifestyle5 out of 5..."


# **Price**

In [18]:
df['price'].value_counts()

3.5 Crore           19
5.5 Crore           17
6.5 Crore           16
6 Crore             14
8 Crore             13
5 Crore             13
4.5 Crore           13
50 Lac              13
70 Lac              13
75 Lac              13
85 Lac              12
2.5 Crore           12
8.5 Crore           12
7 Crore             12
4 Crore             11
1.5 Crore           11
60 Lac              10
45 Lac              10
Price on Request    10
90 Lac              10
10 Crore            10
9 Crore             10
2.25 Crore          10
80 Lac               9
1.7 Crore            9
9.5 Crore            9
1.1 Crore            9
12 Crore             9
2 Crore              9
1.6 Crore            8
4.6 Crore            8
1 Crore              8
6.75 Crore           8
11 Crore             8
3.75 Crore           8
1.2 Crore            7
2.1 Crore            7
1.3 Crore            7
65 Lac               7
95 Lac               7
55 Lac               7
40 Lac               7
1.35 Crore           7
12.5 Crore 

In [19]:
df['price'].value_counts().shape

(302,)

In [20]:
df = df[df['price'] != 'Price on Request']

In [21]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25 Crore,"₹ 20,115/sq.ft.",(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5 Bedrooms,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7 Crore,"₹ 105,751/sq.ft.",(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5 Bedrooms,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1 Crore,"₹ 38,251/sq.ft.",(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10 Bedrooms,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5 Crore,"₹ 43,066/sq.ft.",(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21 Bedrooms,21 Bathrooms,3+ Balconies,Servant Room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3 Crore,"₹ 53,763/sq.ft.",(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12 Bedrooms,12 Bathrooms,3+ Balconies,Others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


In [22]:
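def treat_price(x):
    # x is either NaN (a float) or a list such as ['95', 'Lac'] / ['5.25', 'Crore']
    if isinstance(x, float):
        return x
    # Express every price in crore: 1 crore = 100 lac
    if x[1] == 'Lac':
        return round(float(x[0]) / 100, 2)
    return round(float(x[0]), 2)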
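For comparison, the same lac-to-crore conversion can be sketched without a row-wise function by splitting the amount from the unit and multiplying by a per-unit factor. This is an alternative illustration, not the notebook's method; the names `unit_factor` and `price_crore` are mine.

```python
import pandas as pd

prices = pd.Series(['5.25 Crore', '95 Lac', None])

# Column 0 holds the amount, column 1 the unit
parts = prices.str.split(' ', expand=True)
amount = pd.to_numeric(parts[0])

# 1 lac = 0.01 crore; unknown or missing units become NaN
unit_factor = parts[1].map({'Crore': 1.0, 'Lac': 0.01})

price_crore = (amount * unit_factor).round(2)
print(price_crore.tolist())
```

A side effect of the mapping is that any unit other than `Crore` or `Lac` yields NaN instead of being silently read as crore.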

In [23]:
df = df.copy()  # detach from the filtered slice so the assignment below does not raise SettingWithCopyWarning
df['price'] = df['price'].str.split(' ').apply(treat_price)


In [24]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,"₹ 20,115/sq.ft.",(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5 Bedrooms,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,"₹ 105,751/sq.ft.",(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5 Bedrooms,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,"₹ 38,251/sq.ft.",(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10 Bedrooms,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,"₹ 43,066/sq.ft.",(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21 Bedrooms,21 Bathrooms,3+ Balconies,Servant Room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,"₹ 53,763/sq.ft.",(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12 Bedrooms,12 Bathrooms,3+ Balconies,Others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **price_per_sqft**

In [25]:
df['price_per_sqft'].value_counts()

₹ 22,222/sq.ft.     14
₹ 11,111/sq.ft.     11
₹ 33,333/sq.ft.     11
₹ 10,000/sq.ft.     10
₹ 26,667/sq.ft.      6
₹ 31,481/sq.ft.      6
₹ 37,037/sq.ft.      6
₹ 30,556/sq.ft.      6
₹ 26,235/sq.ft.      6
₹ 9,444/sq.ft.       5
₹ 23,333/sq.ft.      5
₹ 5,000/sq.ft.       5
₹ 8,889/sq.ft.       5
₹ 11,100/sq.ft.      5
₹ 8,000/sq.ft.       5
₹ 13,333/sq.ft.      5
₹ 23,148/sq.ft.      5
₹ 7,058/sq.ft.       4
₹ 21,605/sq.ft.      4
₹ 34,722/sq.ft.      4
₹ 9,722/sq.ft.       4
₹ 24,691/sq.ft.      4
₹ 35,000/sq.ft.      4
₹ 18,889/sq.ft.      4
₹ 27,778/sq.ft.      4
₹ 32,407/sq.ft.      4
₹ 16,058/sq.ft.      3
₹ 36,111/sq.ft.      3
₹ 12,000/sq.ft.      3
₹ 6,944/sq.ft.       3
₹ 17,284/sq.ft.      3
₹ 15,625/sq.ft.      3
₹ 25,562/sq.ft.      3
₹ 24,500/sq.ft.      3
₹ 11,250/sq.ft.      3
₹ 14,815/sq.ft.      3
₹ 8,333/sq.ft.       3
₹ 20,000/sq.ft.      3
₹ 60,000/sq.ft.      3
₹ 12,222/sq.ft.      3
₹ 8,235/sq.ft.       3
₹ 25,000/sq.ft.      3
₹ 8,444/sq.ft.       3
₹ 25,926/sq

In [26]:
df['price_per_sqft'].value_counts().shape

(750,)

In [27]:
df['price_per_sqft'] = df['price_per_sqft'].str.split('/').str.get(0).str.replace('₹','').str.replace(',','').str.strip().astype('float')
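The chained string operations are easier to follow on a couple of sample values. The sketch below applies the same steps to two rates from the output above plus one missing entry; `regex=False` is added here only to make the literal matching explicit.

```python
import numpy as np
import pandas as pd

rates = pd.Series(['₹ 20,115/sq.ft.', '₹ 8,878/sq.ft.', np.nan])

parsed = (
    rates.str.split('/').str.get(0)        # '₹ 20,115'
         .str.replace('₹', '', regex=False)
         .str.replace(',', '', regex=False)
         .str.strip()                      # '20115'
         .astype('float')                  # missing values stay NaN
)
print(parsed.tolist())
```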

In [28]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5 Bedrooms,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5 Bedrooms,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10 Bedrooms,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21 Bedrooms,21 Bathrooms,3+ Balconies,Servant Room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12 Bedrooms,12 Bathrooms,3+ Balconies,Others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **bedRoom**

In [30]:
df['bedRoom'].value_counts()

4 Bedrooms     239
5 Bedrooms     195
3 Bedrooms     148
2 Bedrooms      88
6 Bedrooms      82
9 Bedrooms      45
1 Bedroom       38
8 Bedrooms      34
7 Bedrooms      31
12 Bedrooms     29
10 Bedrooms     21
16 Bedrooms     12
13 Bedrooms      4
18 Bedrooms      2
19 Bedrooms      2
11 Bedrooms      2
20 Bedrooms      1
21 Bedrooms      1
36 Bedrooms      1
34 Bedrooms      1
14 Bedrooms      1
Name: bedRoom, dtype: int64

In [34]:
df['bedRoom'].isnull().sum()

49

In [35]:
df = df[~df['bedRoom'].isnull()]

In [36]:
df.shape

(977, 19)

In [None]:
df['bedRoom']= df['bedRoom'].str.split(' ').str.get(0).astype('int')
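An equivalent way to pull out the leading number, shown here as a sketch on sample values, is `str.extract` with a capture group. It behaves the same for strings like `'5 Bedrooms'` and would also apply to the `'4 Bathrooms'` style values; it assumes missing rows have already been dropped, as the notebook does.

```python
import pandas as pd

rooms = pd.Series(['5 Bedrooms', '1 Bedroom', '12 Bedrooms'])

# Capture the digits at the start of each string, then cast to int
counts = rooms.str.extract(r'^(\d+)', expand=False).astype('int')
print(counts.tolist())
```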

In [38]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4 Bathrooms,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4 Bathrooms,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10 Bathrooms,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21 Bathrooms,3+ Balconies,Servant Room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12 Bathrooms,3+ Balconies,Others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **bathroom**

In [39]:
df['bathroom'].value_counts()

4 Bathrooms     233
3 Bathrooms     166
5 Bathrooms     139
2 Bathrooms     114
6 Bathrooms      85
1 Bathroom       65
9 Bathrooms      42
7 Bathrooms      39
8 Bathrooms      29
12 Bathrooms     23
10 Bathrooms     11
16 Bathrooms      8
18 Bathrooms      4
13 Bathrooms      4
11 Bathrooms      4
20 Bathrooms      3
17 Bathrooms      3
14 Bathrooms      2
21 Bathrooms      1
36 Bathrooms      1
34 Bathrooms      1
Name: bathroom, dtype: int64

In [40]:
df['bathroom'].isnull().sum()

0

In [None]:
df['bathroom'] = df['bathroom'].str.split(' ').str.get(0).astype('int')

In [42]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4,3+ Balconies,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4,2 Balconies,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10,3+ Balconies,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21,3+ Balconies,Servant Room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12,3+ Balconies,Others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **balcony**

In [43]:
df['balcony'].value_counts()

3+ Balconies    373
2 Balconies     223
3 Balconies     161
No Balcony      115
1 Balcony       105
Name: balcony, dtype: int64

In [44]:
df['balcony'].isnull().sum()

0

In [None]:
df['balcony'] = df['balcony'].str.split(' ').str.get(0).str.replace('No','0')
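Since the first token of `'No Balcony'` is `'No'` and string matching is case-sensitive, the replacement has to use that exact spelling. A minimal sketch on the five categories listed above, using a whole-value `replace` (my choice for this sketch) so only the `'No'` token is affected:

```python
import pandas as pd

balcony = pd.Series(['3+ Balconies', '2 Balconies', '3 Balconies',
                     'No Balcony', '1 Balcony'])

first_token = balcony.str.split(' ').str.get(0)   # '3+', '2', '3', 'No', '1'
cleaned = first_token.replace('No', '0')          # 'NO' would not match
print(cleaned.tolist())
```

The column stays a string column here because `'3+'` is not a number; how to encode that category is left open.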

In [46]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4,3+,Servant Room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4,2,"Store Room,Pooja Room,Servant Room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10,3+,Servant Room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21,3+,Servant Room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12,3+,Others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **additionalRoom**

In [47]:
df['additionalRoom'].value_counts()

Servant Room                                     82
Pooja Room,Study Room,Servant Room,Store Room    63
Others                                           59
Pooja Room                                       38
Pooja Room,Study Room,Servant Room               34
Pooja Room,Study Room,Servant Room,Others        32
Store Room                                       31
Pooja Room,Servant Room                          24
Study Room                                       19
Study Room,Servant Room                          19
Servant Room,Store Room                          15
Pooja Room,Study Room                            11
Pooja Room,Store Room                            11
Pooja Room,Others                                11
Pooja Room,Store Room,Study Room,Servant Room     9
Servant Room,Pooja Room                           8
Servant Room,Others                               8
Pooja Room,Servant Room,Store Room                7
Study Room,Servant Room,Pooja Room,Store Room     7
Servant Room

In [48]:
df['additionalRoom'].isnull().sum()

388

In [None]:
df['additionalRoom'] = df['additionalRoom'].fillna('not available')

In [None]:
df['additionalRoom'] = df['additionalRoom'].str.lower()
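
The two `additionalRoom` steps (fill missing values, then lowercase) can also be chained in one assignment. This is a minimal sketch on a hypothetical three-row Series, not the notebook's actual data.

```python
import pandas as pd

# Hypothetical sample: one missing entry and two labelled entries
rooms = pd.Series(['Servant Room', None, 'Pooja Room,Study Room'])

# Fill missing values first, then lowercase every label
rooms = rooms.fillna('not available').str.lower()

print(rooms.tolist())  # ['servant room', 'not available', 'pooja room,study room']
```

Assigning the result back avoids relying on `inplace=True` on a selected column, which newer pandas versions warn about.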

In [54]:
df['additionalRoom'].isnull().sum()

0

In [55]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4,3+,servant room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3 Floors,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4,2,"store room,pooja room,servant room","Sector 21A Faridabad, Gurgaon, Haryana",2 Floors,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10,3+,servant room,"Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21,3+,servant room,"Sector 54 Gurgaon, Gurgaon, Haryana",5 Floors,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12,3+,others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5 Floors,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **noOfFloor**

In [56]:
df['noOfFloor'].value_counts()


3 Floors     309
2 Floors     294
1 Floors     181
4 Floors     137
5 Floors       9
0 Floors       6
14 Floors      3
6 Floors       3
11 Floors      2
12 Floors      2
10 Floors      2
16 Floors      1
51 Floors      1
41 Floors      1
27 Floors      1
20 Floors      1
13 Floors      1
21 Floors      1
02 Floors      1
33 Floors      1
Name: noOfFloor, dtype: int64

In [57]:
df['noOfFloor'].isnull().sum()

20

In [None]:
df['noOfFloor']=df['noOfFloor'].str.split().str.get(0)
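
After the split, the floor values are still strings (note the `02 Floors` entry in the counts above) and the 20 missing rows stay missing. If a numeric column is wanted later, one option is `pd.to_numeric`. The sketch below assumes a hypothetical sample Series named `floors`; it is not a step the notebook itself performs.

```python
import pandas as pd

# Hypothetical sample, including a zero-padded value and a missing entry
floors = pd.Series(['3 Floors', '02 Floors', None, '14 Floors'])

# Take the leading token, as in the cell above
first_token = floors.str.split().str.get(0)

# Convert to numbers; anything unparseable or missing becomes NaN
floor_num = pd.to_numeric(first_token, errors='coerce')

print(floor_num.tolist())
```

With `errors='coerce'`, `'02'` parses as `2.0` and the missing entry stays `NaN`, so the column becomes `float64`.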

In [59]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,noOfFloor,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4,3+,servant room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4,2,"store room,pooja room,servant room","Sector 21A Faridabad, Gurgaon, Haryana",2,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10,3+,servant room,"Sushant Lok Phase 1, Gurgaon, Haryana",5,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21,3+,servant room,"Sector 54 Gurgaon, Gurgaon, Haryana",5,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12,3+,others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


In [None]:
df.rename(columns={'noOfFloor':'floorNum'},inplace=True)

In [65]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,floorNum,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4,3+,servant room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4,2,"store room,pooja room,servant room","Sector 21A Faridabad, Gurgaon, Haryana",2,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10,3+,servant room,"Sushant Lok Phase 1, Gurgaon, Haryana",5,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21,3+,servant room,"Sector 54 Gurgaon, Gurgaon, Haryana",5,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12,3+,others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **facing**

In [66]:
df['facing'].value_counts()

East          181
North-East    159
North         113
West           75
North-West     44
South-East     40
South          36
South-West     25
Name: facing, dtype: int64

In [68]:
df['facing'].isnull().sum()


304

In [None]:
df['facing'] = df['facing'].fillna('NA')

In [70]:
df['facing'].isnull().sum()

0

In [72]:
df.head()

Unnamed: 0,property_name,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,floorNum,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,bptp visionnaire,5.25,20115.0,(242 sq.m.) Plot Area,Plot area 290(242.48 sq.m.),5,4,3+,servant room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,independent,5.7,105751.0,(50 sq.m.) Plot Area,Plot area 539(50.07 sq.m.),5,4,2,"store room,pooja room,servant room","Sector 21A Faridabad, Gurgaon, Haryana",2,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,independent,2.1,38251.0,(51 sq.m.) Plot Area,Plot area 61(51 sq.m.),10,10,3+,servant room,"Sushant Lok Phase 1, Gurgaon, Haryana",5,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,independent,5.0,43066.0,(108 sq.m.) Plot Area,Plot area 129(107.86 sq.m.),21,21,3+,servant room,"Sector 54 Gurgaon, Gurgaon, Haryana",5,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,independent,3.0,53763.0,(52 sq.m.) Plot Area,Plot area 62(51.84 sq.m.),12,12,3+,others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


# **area**

In [None]:
df['area'] = round((df['price']*10000000)/df['price_per_sqft'])
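
The area formula can be verified against the first two rows shown in the tables above. The sketch assumes that `price` is quoted in crore rupees, which the 10,000,000 multiplier suggests but the notebook does not state. The names `sample` and `RUPEES_PER_CRORE` are illustrative.

```python
import pandas as pd

# First two rows from the head() output above
sample = pd.DataFrame({
    'price': [5.25, 5.7],                 # assumed to be in crore rupees
    'price_per_sqft': [20115.0, 105751.0],
})

RUPEES_PER_CRORE = 10_000_000  # assumption: 1 crore = 10,000,000 rupees

# area (sq ft) = total price in rupees / price per sq ft
sample['area'] = (sample['price'] * RUPEES_PER_CRORE / sample['price_per_sqft']).round()

print(sample['area'].tolist())  # [2610.0, 539.0]
```

Both values match the `area` column in the `df.head()` output that follows. Rows with a missing `price` or `price_per_sqft` produce a missing `area`.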

In [74]:
df.insert(loc=1,column='property_type',value='house')

In [75]:
df.head()

Unnamed: 0,property_name,property_type,society,price,price_per_sqft,area,areaWithType,bedRoom,bathroom,balcony,additionalRoom,address,floorNum,facing,agePossession,nearbyLocations,description,furnishDetails,features,rating
0,5 Bedroom House for sale in Sector 70A Gurgaon,house,bptp visionnaire,5.25,20115.0,2610.0,Plot area 290(242.48 sq.m.),5,4,3+,servant room,"29b, Sector 70A Gurgaon, Gurgaon, Haryana",3,North-East,0 to 1 Year Old,"['Good Earth City Center 2', 'Kunskapsskolan I...",Do you wish to buy an independent house in sec...,"['1 Wardrobe', '1 Fan', '1 Exhaust Fan', '1 Ge...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment5 out of 5', 'Lifestyle4 out of 5..."
1,5 Bedroom House for sale in Sector 21A Faridabad,house,independent,5.7,105751.0,539.0,Plot area 539(50.07 sq.m.),5,4,2,"store room,pooja room,servant room","Sector 21A Faridabad, Gurgaon, Haryana",2,,5 to 10 Year Old,,"Hi, we have an independent house/villa availab...","['1 Water Purifier', '5 Fan', '1 Exhaust Fan',...","['Private Garden / Terrace', 'Park', 'Visitor ...",
2,10 Bedroom House for sale in Sushant Lok Phase 1,house,independent,2.1,38251.0,549.0,Plot area 61(51 sq.m.),10,10,3+,servant room,"Sushant Lok Phase 1, Gurgaon, Haryana",5,West,0 to 1 Year Old,"['Sector 42-43 metro station', 'Huda city cent...","Monthly rental income is rs1,40,000/- Best opt...","['10 Bed', '3 Fan', '10 Geyser', '2 Light', 'N...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."
3,21 Bedroom House for sale in Sector 54 Gurgaon,house,independent,5.0,43066.0,1161.0,Plot area 129(107.86 sq.m.),21,21,3+,servant room,"Sector 54 Gurgaon, Gurgaon, Haryana",5,North,1 to 5 Year Old,"['Sector 53-54 metro station', 'Sector 54 chow...","129 sq yd plot size. 5 floors built up , fully...","['1 Water Purifier', '21 Fan', '1 Fridge', '1 ...","['Feng Shui / Vaastu Compliant', 'Private Gard...","['Environment4 out of 5', 'Lifestyle5 out of 5..."
4,12 Bedroom House for sale in Sushant Lok Phase 1,house,independent,3.0,53763.0,558.0,Plot area 62(51.84 sq.m.),12,12,3+,others,"1228, Sushant Lok Phase 1, Gurgaon, Haryana",5,West,Within 6 months,"['Sector 42-43 metro station', 'Huda city cent...",Best for investment purpose. Fully furnished b...,"['1 Water Purifier', '1 Fridge', '12 Fan', '1 ...","['Maintenance Staff', 'Water Storage', 'Visito...","['Environment5 out of 5', 'Lifestyle5 out of 5..."


In [77]:
df.shape

(977, 20)

In [78]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 977 entries, 0 to 1035
Data columns (total 20 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   property_name    977 non-null    object 
 1   property_type    977 non-null    object 
 2   society          977 non-null    object 
 3   price            958 non-null    float64
 4   price_per_sqft   958 non-null    float64
 5   area             958 non-null    float64
 6   areaWithType     977 non-null    object 
 7   bedRoom          977 non-null    int64  
 8   bathroom         977 non-null    int64  
 9   balcony          977 non-null    object 
 10  additionalRoom   977 non-null    object 
 11  address          972 non-null    object 
 12  floorNum         957 non-null    object 
 13  facing           977 non-null    object 
 14  agePossession    977 non-null    object 
 15  nearbyLocations  861 non-null    object 
 16  description      977 non-null    object 
 17  furnishDetails 

In [79]:
df.to_csv('houses_cleaned.csv',index=False)