# Analysis of Rental Property in the UAE

The purpose of this project is to get an overview of rental property listings in the UAE. This notebook uses the `pandas` library to load the scraped `csv` file, clean it, transform the data where necessary, and prepare it for visualization.

###  &#9656; Source: [Property Finder](https://www.propertyfinder.ae)

## Column Details:
1. Verification Status
    - Verified
    - Not Verified
2. Agent Type
    - Superagent
    - Not Superagent
3. Property Type
    - Apartment
    - Villa
    - Townhouse
    - Hotel & Hotel Apartment
    - Duplex
    - Penthouse
    - Compound
    - Bungalow
    - Full Floor
    - Half Floor
    - Bulk Rent Unit
4. Rent Amount (AED/YR)
    - Numerical value type ranging from MIN___________ to MAX
5. Features
    - Custom User Input 
6. Listing Level
    - Featured
    - Premium
    - Not Premium
7. Location
    - City
        - Dubai
        - Abu Dhabi
        - Sharjah
        - Ajman
        - Ras Al Khaimah
        - Al Ain
        - Fujairah
        - Umm Al Quwain
    - Area: more than 300 distinct areas
8. No of Bedroom
    - Non-categorical value ranging from 1 to 7+ bedrooms, plus `studio`
9. No of Bathroom
    - Non-categorical value ranging from 1 to 7+ bathrooms
10. Area in sqft
    - Numerical value type ranging from MIN___________ to MAX
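
The MIN and MAX placeholders for the two numeric columns can be filled in once the data is loaded. A minimal sketch, assuming a dataframe with the column names listed here; the three sample rows are copied from the listing preview further down and are illustrative only, so the printed range is not the range of the full dataset:

```python
import pandas as pd

# Illustrative sample only; in the notebook this would be the loaded dataframe `df`
sample = pd.DataFrame({
    "Rent Amount (AED/YR)": [500000.0, 150000.0, 17000.0],
    "Area in sqft": [4016.0, 1309.0, 450.0],
})

# Minimum and maximum of each numeric column
ranges = sample[["Rent Amount (AED/YR)", "Area in sqft"]].agg(["min", "max"])
print(ranges)
```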

### Importing necessary libraries

In [1]:
import pandas as pd

# To avoid warnings
import warnings
warnings.filterwarnings('ignore')

### Loading and Analyzing the data in a dataframe

In [2]:
df = pd.read_csv("Property_Finder_Rent_Items_in_Abu_Dhabi.csv")

In [3]:
df

Unnamed: 0,Verification Status,Agent Type,Property Type,Rent Amount (AED/YR),Features,Listing Level,Location,No of Bedroom,No of Bathroom,Area in sqft
0,Verified,Superagent,Villa,500000.0,Brand New | Spacious | Modern | Garden Suite,Premium,"Harmony 2, Harmony, Tilal Al Ghaf, Dubai",4,5,4016.0
1,Verified,Superagent,Apartment,150000.0,Canal Views | Ready to Move in | Multiple Cheques,Premium,"Elite Business Bay Residence, Business Bay, Dubai",2,3,1309.0
2,Verified,Superagent,Villa,450000.0,Biggest Plot | Corner Villa | New To The Market,Premium,"Murooj Al Furjan, Al Furjan, Dubai",4,5,7767.0
3,Verified,Superagent,Villa,500000.0,Prime Location | Lake Views | Vacant Now,Premium,"Legacy, Jumeirah Park, Dubai",5,6,4689.0
4,Verified,Superagent,Townhouse,225000.0,4-Story Townhouse | Upgraded | Vacant Now,Premium,"Mirabella 1, Mirabella, Jumeirah Village Circl...",4,5,3365.0
...,...,...,...,...,...,...,...,...,...,...
99970,Not Verified,Not Superagent,Apartment,33999.0,Two-Bedroom Apartment for Rent in Ajman Corniche,Not Premium,"Ajman Corniche Road, Ajman",2,2,1450.0
99971,Not Verified,Not Superagent,Apartment,47999.0,Three-Bedroom Apartment for Rent in Al Naimiyah2,Not Premium,"Al Nuaimiya, Ajman",3,2,1550.0
99972,Not Verified,Not Superagent,Apartment,17000.0,Separate kitchen studio available with balcony...,Not Premium,"Muwaileh, Sharjah",studio,1,450.0
99973,Not Verified,Not Superagent,Apartment,29000.0,Brand New Studio with Kitchen &amp;Proper Wash...,Not Premium,"Khalifa City, Abu Dhabi",studio,1,600.0


In [4]:
df.shape  # The dataframe has 10 columns and 99975 rows

(99975, 10)

In [5]:
df.columns  # The name of the columns

Index(['Verification Status', 'Agent Type', 'Property Type',
       'Rent Amount (AED/YR)', 'Features', 'Listing Level', 'Location',
       'No of Bedroom', 'No of Bathroom', 'Area in sqft'],
      dtype='object')

In [6]:
df.info()  # overview of the data

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 99975 entries, 0 to 99974
Data columns (total 10 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Verification Status   99975 non-null  object 
 1   Agent Type            99975 non-null  object 
 2   Property Type         99867 non-null  object 
 3   Rent Amount (AED/YR)  99867 non-null  float64
 4   Features              99867 non-null  object 
 5   Listing Level         99867 non-null  object 
 6   Location              99867 non-null  object 
 7   No of Bedroom         99867 non-null  object 
 8   No of Bathroom        99867 non-null  object 
 9   Area in sqft          99867 non-null  float64
dtypes: float64(2), object(8)
memory usage: 7.6+ MB


> We can see that the _Verification Status_ and _Agent Type_ columns have 99975 non-null values, while each of the remaining columns has 99867.

#### Dealing with the missing values

In [7]:
df.isnull().sum()  # Number of missing values for each column

Verification Status       0
Agent Type                0
Property Type           108
Rent Amount (AED/YR)    108
Features                108
Listing Level           108
Location                108
No of Bedroom           108
No of Bathroom          108
Area in sqft            108
dtype: int64
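
Eight columns share the same count of 108, which suggests the gaps sit in the same rows. One way to check this is sketched below on a small made-up frame (the values are not taken from the dataset); in the notebook the same two expressions would be applied to `df`:

```python
import numpy as np
import pandas as pd

# Made-up frame: one row is missing values in two columns at once
toy = pd.DataFrame({
    "Agent Type": ["Superagent", "Not Superagent", "Superagent"],
    "Property Type": ["Villa", np.nan, "Apartment"],
    "Area in sqft": [4016.0, np.nan, 1309.0],
})

# Number of rows with at least one missing value
rows_with_any_gap = toy.isnull().any(axis=1).sum()
# Largest number of missing values found in a single column
max_gaps_per_column = toy.isnull().sum().max()

# If the two numbers match, every incomplete row is already counted
# in the most-affected column, so no gaps fall outside those rows
print(rows_with_any_gap == max_gaps_per_column)
```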

In [8]:
df.dropna(how="any", inplace=True)  # dropping the rows with missing values

In [9]:
display(df.isnull().sum())

Verification Status     0
Agent Type              0
Property Type           0
Rent Amount (AED/YR)    0
Features                0
Listing Level           0
Location                0
No of Bedroom           0
No of Bathroom          0
Area in sqft            0
dtype: int64

>Now there are no missing values in any column.

In [10]:
df.reset_index(drop=True, inplace=True)  # resetting the row indices and discarding the old ones
df

Unnamed: 0,Verification Status,Agent Type,Property Type,Rent Amount (AED/YR),Features,Listing Level,Location,No of Bedroom,No of Bathroom,Area in sqft
0,Verified,Superagent,Villa,500000.0,Brand New | Spacious | Modern | Garden Suite,Premium,"Harmony 2, Harmony, Tilal Al Ghaf, Dubai",4,5,4016.0
1,Verified,Superagent,Apartment,150000.0,Canal Views | Ready to Move in | Multiple Cheques,Premium,"Elite Business Bay Residence, Business Bay, Dubai",2,3,1309.0
2,Verified,Superagent,Villa,450000.0,Biggest Plot | Corner Villa | New To The Market,Premium,"Murooj Al Furjan, Al Furjan, Dubai",4,5,7767.0
3,Verified,Superagent,Villa,500000.0,Prime Location | Lake Views | Vacant Now,Premium,"Legacy, Jumeirah Park, Dubai",5,6,4689.0
4,Verified,Superagent,Townhouse,225000.0,4-Story Townhouse | Upgraded | Vacant Now,Premium,"Mirabella 1, Mirabella, Jumeirah Village Circl...",4,5,3365.0
...,...,...,...,...,...,...,...,...,...,...
99862,Not Verified,Not Superagent,Apartment,33999.0,Two-Bedroom Apartment for Rent in Ajman Corniche,Not Premium,"Ajman Corniche Road, Ajman",2,2,1450.0
99863,Not Verified,Not Superagent,Apartment,47999.0,Three-Bedroom Apartment for Rent in Al Naimiyah2,Not Premium,"Al Nuaimiya, Ajman",3,2,1550.0
99864,Not Verified,Not Superagent,Apartment,17000.0,Separate kitchen studio available with balcony...,Not Premium,"Muwaileh, Sharjah",studio,1,450.0
99865,Not Verified,Not Superagent,Apartment,29000.0,Brand New Studio with Kitchen &amp;Proper Wash...,Not Premium,"Khalifa City, Abu Dhabi",studio,1,600.0


In [11]:
display(df.shape)

(99867, 10)

>The new shape of the dataframe is 99867 rows and 10 columns.

### Exploring the value types

#### Verification types

In [12]:
verification_types = "Verification Status"
display(df[verification_types].value_counts())

Not Verified    76324
Verified        23543
Name: Verification Status, dtype: int64

#### Agent types

In [13]:
agent_types = "Agent Type"
print(df[agent_types].value_counts())

Not Superagent    70203
Superagent        29664
Name: Agent Type, dtype: int64


#### Property types

In [14]:
property_types = "Property Type"
print(df[property_types].value_counts())

Apartment                  73380
Villa                      19219
Townhouse                   5124
Hotel & Hotel Apartment     1024
Duplex                       549
Penthouse                    389
Compound                     146
Bungalow                      21
Full Floor                     9
Half Floor                     3
Bulk Rent Unit                 3
Name: Property Type, dtype: int64


#### Listing level types

In [15]:
premium_types = "Listing Level"
print(df[premium_types].value_counts())

Not Premium    88657
Featured        6465
Premium         4745
Name: Listing Level, dtype: int64


#### Unique no of Bedrooms

In [16]:
bedroom_count = "No of Bedroom"
print(df[bedroom_count].value_counts())

1                                                                                                      27620
2                                                                                                      25872
3                                                                                                      16761
studio                                                                                                 10795
4                                                                                                       9278
5                                                                                                       6508
6                                                                                                       1872
7                                                                                                        742
7+                                                                                                       400
Al Khalidiya, Abu D
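
The truncated output shows that some entries in this column are not bedroom counts at all. A minimal sketch for listing such values before converting, using `pd.to_numeric` with `errors="coerce"` on a small made-up series (the stray text value is hypothetical):

```python
import pandas as pd

# Made-up values; "some stray text" stands in for a misplaced entry
bedrooms = pd.Series(["1", "2", "studio", "7+", "some stray text", "3"])

# Values that cannot be parsed as numbers become NaN
parsed = pd.to_numeric(bedrooms, errors="coerce")

# The distinct raw values that failed to parse
non_numeric = sorted(bedrooms[parsed.isna()].unique())
print(non_numeric)
```

Labels such as `studio` and `7+` can then be mapped deliberately, and whatever remains can be inspected or dropped.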

>To convert the _No of Bedroom_ column to an integer type, _studio_ is treated as _1_ bedroom and _7+_ as _8_ bedrooms. Rows whose value is neither a number nor one of these two labels are dropped.

In [17]:
# Map the two text labels to numeric strings, then parse the column;
# anything that still is not a number becomes NaN
labels = df["No of Bedroom"].replace({"studio": "1", "7+": "8"})
bedrooms = pd.to_numeric(labels, errors="coerce")

# Keep only rows with a valid bedroom count and store it as an integer
df = df[bedrooms.notna()].copy()
df["No of Bedroom"] = bedrooms[bedrooms.notna()].astype(int)

display(df["No of Bedroom"].value_counts()) # Checking the result of the operation

1    38415
2    25872
3    16761
4     9278
5     6508
6     1872
7      742
8      400
Name: No of Bedroom, dtype: int64

#### Converting the data types to integer for numerical calculations

In [18]:
df["No of Bedroom"] = df["No of Bedroom"].astype(int)
display(df["No of Bedroom"].dtypes)

dtype('int32')

In [19]:
df.reset_index(drop=True, inplace=True)

#### Unique no of Bathrooms

In [20]:
bathroom_count = "No of Bathroom"
print(df[bathroom_count].value_counts())

2     28386
1     21864
3     17917
4     13213
5      7872
6      4698
7      3470
7+     2428
Name: No of Bathroom, dtype: int64


>Converting the _No of Bathroom_ column to an integer type, replacing the value _7+_ with _8_

In [21]:
# Replace "7+" with 8, then parse the remaining numeric strings in one step
df["No of Bathroom"] = pd.to_numeric(df["No of Bathroom"].replace({"7+": "8"}))

In [22]:
df["No of Bathroom"].value_counts()  # Checking the result of the operation

2    28386
1    21864
3    17917
4    13213
5     7872
6     4698
7     3470
8     2428
Name: No of Bathroom, dtype: int64

#### Converting the data types to integer for numerical calculations

In [23]:
df["No of Bathroom"] = df["No of Bathroom"].astype(int)
display(df["No of Bathroom"].dtypes)

dtype('int32')

In [24]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 99848 entries, 0 to 99847
Data columns (total 10 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Verification Status   99848 non-null  object 
 1   Agent Type            99848 non-null  object 
 2   Property Type         99848 non-null  object 
 3   Rent Amount (AED/YR)  99848 non-null  float64
 4   Features              99848 non-null  object 
 5   Listing Level         99848 non-null  object 
 6   Location              99848 non-null  object 
 7   No of Bedroom         99848 non-null  int32  
 8   No of Bathroom        99848 non-null  int32  
 9   Area in sqft          99848 non-null  float64
dtypes: float64(2), int32(2), object(6)
memory usage: 6.9+ MB


### Working with the location data separately

In [25]:
dfl = df["Location"].copy()  # taking an independent copy of the column

In [26]:
dfl

0                 Harmony 2, Harmony, Tilal Al Ghaf, Dubai
1        Elite Business Bay Residence, Business Bay, Dubai
2                       Murooj Al Furjan, Al Furjan, Dubai
3                             Legacy, Jumeirah Park, Dubai
4        Mirabella 1, Mirabella, Jumeirah Village Circl...
                               ...                        
99843                           Ajman Corniche Road, Ajman
99844                                   Al Nuaimiya, Ajman
99845                                    Muwaileh, Sharjah
99846                              Khalifa City, Abu Dhabi
99847                                    Al Nahda, Sharjah
Name: Location, Length: 99848, dtype: object

#### Separating the street, area and city information from the location data.

In [27]:
temp_dic = {}
street = []
area = []
city = []

for address in range(len(dfl)):
    add_set = dfl[address].split(",") # Common Operation
    street.append(",".join(add_set[:-2])) # Street data
    area.append(add_set[-2].strip()) # Area data
    city.append(add_set[-1].strip()) # City data
    

temp_dic['Street'] = street
temp_dic['Area'] = area
temp_dic['City'] = city
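
The same split can be expressed as a small helper applied to the whole column. This is a sketch rather than the notebook's method: `split_location` is a hypothetical name, and it assumes every location has at least an area and a city.

```python
from typing import Tuple

import pandas as pd


def split_location(location: str) -> Tuple[str, str, str]:
    """Split 'street..., area, city' into (street, area, city).

    Assumes at least an area and a city are present.
    """
    parts = [part.strip() for part in location.split(",")]
    street = ", ".join(parts[:-2])  # empty string when there is no street part
    return street, parts[-2], parts[-1]


# Two sample locations taken from the preview above
locations = pd.Series([
    "Harmony 2, Harmony, Tilal Al Ghaf, Dubai",
    "Al Nuaimiya, Ajman",
])

# One row per location, one column per component
split = pd.DataFrame(
    locations.map(split_location).tolist(),
    columns=["Street", "Area", "City"],
    index=locations.index,
)
print(split)
```

Splitting from the right matters here: the number of street components varies, while the last two components are always the area and the city.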

In [28]:
len(temp_dic)

3

In [29]:
temp_dic.keys()

dict_keys(['Street', 'Area', 'City'])

In [30]:
df = df.assign(**temp_dic)  # adding the new columns to the main dataframe

In [31]:
df

Unnamed: 0,Verification Status,Agent Type,Property Type,Rent Amount (AED/YR),Features,Listing Level,Location,No of Bedroom,No of Bathroom,Area in sqft,Street,Area,City
0,Verified,Superagent,Villa,500000.0,Brand New | Spacious | Modern | Garden Suite,Premium,"Harmony 2, Harmony, Tilal Al Ghaf, Dubai",4,5,4016.0,"Harmony 2, Harmony",Tilal Al Ghaf,Dubai
1,Verified,Superagent,Apartment,150000.0,Canal Views | Ready to Move in | Multiple Cheques,Premium,"Elite Business Bay Residence, Business Bay, Dubai",2,3,1309.0,Elite Business Bay Residence,Business Bay,Dubai
2,Verified,Superagent,Villa,450000.0,Biggest Plot | Corner Villa | New To The Market,Premium,"Murooj Al Furjan, Al Furjan, Dubai",4,5,7767.0,Murooj Al Furjan,Al Furjan,Dubai
3,Verified,Superagent,Villa,500000.0,Prime Location | Lake Views | Vacant Now,Premium,"Legacy, Jumeirah Park, Dubai",5,6,4689.0,Legacy,Jumeirah Park,Dubai
4,Verified,Superagent,Townhouse,225000.0,4-Story Townhouse | Upgraded | Vacant Now,Premium,"Mirabella 1, Mirabella, Jumeirah Village Circl...",4,5,3365.0,"Mirabella 1, Mirabella",Jumeirah Village Circle,Dubai
...,...,...,...,...,...,...,...,...,...,...,...,...,...
99843,Not Verified,Not Superagent,Apartment,33999.0,Two-Bedroom Apartment for Rent in Ajman Corniche,Not Premium,"Ajman Corniche Road, Ajman",2,2,1450.0,,Ajman Corniche Road,Ajman
99844,Not Verified,Not Superagent,Apartment,47999.0,Three-Bedroom Apartment for Rent in Al Naimiyah2,Not Premium,"Al Nuaimiya, Ajman",3,2,1550.0,,Al Nuaimiya,Ajman
99845,Not Verified,Not Superagent,Apartment,17000.0,Separate kitchen studio available with balcony...,Not Premium,"Muwaileh, Sharjah",1,1,450.0,,Muwaileh,Sharjah
99846,Not Verified,Not Superagent,Apartment,29000.0,Brand New Studio with Kitchen &amp;Proper Wash...,Not Premium,"Khalifa City, Abu Dhabi",1,1,600.0,,Khalifa City,Abu Dhabi


In [32]:
df.drop(columns='Location', inplace=True)  # dropping the location column to avoid redundancy

In [33]:
df.info()
display(df.shape)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 99848 entries, 0 to 99847
Data columns (total 12 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Verification Status   99848 non-null  object 
 1   Agent Type            99848 non-null  object 
 2   Property Type         99848 non-null  object 
 3   Rent Amount (AED/YR)  99848 non-null  float64
 4   Features              99848 non-null  object 
 5   Listing Level         99848 non-null  object 
 6   No of Bedroom         99848 non-null  int32  
 7   No of Bathroom        99848 non-null  int32  
 8   Area in sqft          99848 non-null  float64
 9   Street                99848 non-null  object 
 10  Area                  99848 non-null  object 
 11  City                  99848 non-null  object 
dtypes: float64(2), int32(2), object(8)
memory usage: 8.4+ MB


(99848, 12)

> Finally, the new shape of the dataframe is _12_ columns and _99848_ rows

In [34]:
df['City'].value_counts()  # data count for each city

Dubai             54941
Abu Dhabi         21632
Sharjah           12021
Ajman              7147
Ras Al Khaimah     2381
Al Ain             1632
Fujairah             86
Umm Al Quwain         8
Name: City, dtype: int64

> Distribution of entries for the 8 cities
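
The counts are easier to compare as percentages. A minimal sketch that rebuilds the counts from the output above and converts them to shares; in the notebook, `df['City'].value_counts(normalize=True)` would give the same proportions directly:

```python
import pandas as pd

# Counts copied from the value_counts() output above
city_counts = pd.Series({
    "Dubai": 54941, "Abu Dhabi": 21632, "Sharjah": 12021, "Ajman": 7147,
    "Ras Al Khaimah": 2381, "Al Ain": 1632, "Fujairah": 86, "Umm Al Quwain": 8,
})

# Share of all listings per city, in percent
city_share = (city_counts / city_counts.sum() * 100).round(2)
print(city_share)
```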

In [35]:
df

Unnamed: 0,Verification Status,Agent Type,Property Type,Rent Amount (AED/YR),Features,Listing Level,No of Bedroom,No of Bathroom,Area in sqft,Street,Area,City
0,Verified,Superagent,Villa,500000.0,Brand New | Spacious | Modern | Garden Suite,Premium,4,5,4016.0,"Harmony 2, Harmony",Tilal Al Ghaf,Dubai
1,Verified,Superagent,Apartment,150000.0,Canal Views | Ready to Move in | Multiple Cheques,Premium,2,3,1309.0,Elite Business Bay Residence,Business Bay,Dubai
2,Verified,Superagent,Villa,450000.0,Biggest Plot | Corner Villa | New To The Market,Premium,4,5,7767.0,Murooj Al Furjan,Al Furjan,Dubai
3,Verified,Superagent,Villa,500000.0,Prime Location | Lake Views | Vacant Now,Premium,5,6,4689.0,Legacy,Jumeirah Park,Dubai
4,Verified,Superagent,Townhouse,225000.0,4-Story Townhouse | Upgraded | Vacant Now,Premium,4,5,3365.0,"Mirabella 1, Mirabella",Jumeirah Village Circle,Dubai
...,...,...,...,...,...,...,...,...,...,...,...,...
99843,Not Verified,Not Superagent,Apartment,33999.0,Two-Bedroom Apartment for Rent in Ajman Corniche,Not Premium,2,2,1450.0,,Ajman Corniche Road,Ajman
99844,Not Verified,Not Superagent,Apartment,47999.0,Three-Bedroom Apartment for Rent in Al Naimiyah2,Not Premium,3,2,1550.0,,Al Nuaimiya,Ajman
99845,Not Verified,Not Superagent,Apartment,17000.0,Separate kitchen studio available with balcony...,Not Premium,1,1,450.0,,Muwaileh,Sharjah
99846,Not Verified,Not Superagent,Apartment,29000.0,Brand New Studio with Kitchen &amp;Proper Wash...,Not Premium,1,1,600.0,,Khalifa City,Abu Dhabi


### <i>Taking the transformed dataframe and exporting as a CSV file for further usage</i>

In [36]:
df.to_csv('Property_Finder_Rent_Items_in_Abu_Dhabi_Transformed.csv', index=False)
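
A quick way to confirm the export is to read the file back and compare it with the dataframe. The sketch below does this on a small made-up frame written to a temporary directory; the file name is hypothetical:

```python
import os
import tempfile

import pandas as pd

# Made-up frame standing in for the transformed dataframe
sample = pd.DataFrame({
    "Property Type": ["Villa", "Apartment"],
    "No of Bedroom": [4, 2],
    "City": ["Dubai", "Dubai"],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_transformed.csv")  # hypothetical file name
    sample.to_csv(path, index=False)  # index=False keeps the row index out of the file
    reloaded = pd.read_csv(path)

# Shape and column order should survive the round trip
print(reloaded.shape == sample.shape)
print(list(reloaded.columns) == list(sample.columns))
```

Note that `read_csv` reads empty fields back as `NaN` by default, so empty strings such as the blank `Street` entries will not come back as empty strings.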