### Importing the necessary libraries

In [1]:
import pandas as pd
import numpy as np

import datetime

### Reading the data file

In [2]:
rent_df = pd.read_csv('../Data/canada_rent.csv')
rent_df

Unnamed: 0,rentfaster_id,city,province,address,latitude,longitude,lease_term,type,price,beds,baths,sq_feet,link,furnishing,availability_date,smoking,cats,dogs
0,468622,Airdrie,Alberta,69 Gateway Dr NE,51.305962,-114.012515,Long Term,Townhouse,2495.0,2 Beds,2.5,1403,/ab/airdrie/rentals/townhouse/2-bedrooms/pet-f...,Unfurnished,Immediate,Non-Smoking,True,True
1,468622,Airdrie,Alberta,69 Gateway Dr NE,51.305962,-114.012515,Long Term,Townhouse,2695.0,3 Beds,2.5,1496,/ab/airdrie/rentals/townhouse/2-bedrooms/pet-f...,Unfurnished,Immediate,Non-Smoking,True,True
2,468622,Airdrie,Alberta,69 Gateway Dr NE,51.305962,-114.012515,Long Term,Townhouse,2295.0,2 Beds,2.5,1180,/ab/airdrie/rentals/townhouse/2-bedrooms/pet-f...,Unfurnished,Immediate,Non-Smoking,True,True
3,468622,Airdrie,Alberta,69 Gateway Dr NE,51.305962,-114.012515,Long Term,Townhouse,2095.0,2 Beds,2.5,1403,/ab/airdrie/rentals/townhouse/2-bedrooms/pet-f...,Unfurnished,November 18,Non-Smoking,True,True
4,468622,Airdrie,Alberta,69 Gateway Dr NE,51.305962,-114.012515,Long Term,Townhouse,2495.0,2 Beds,2.5,1403,/ab/airdrie/rentals/townhouse/2-bedrooms/pet-f...,Unfurnished,Immediate,Non-Smoking,True,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
25766,334285,Yorkton,Saskatchewan,18 Allanbrooke Dr,51.203024,-102.484782,Long Term,Apartment,920.0,Studio,1,400,/sk/yorkton/rentals/apartment/3-bedrooms/pet-f...,Unfurnished,Immediate,Non-Smoking,True,True
25767,334285,Yorkton,Saskatchewan,18 Allanbrooke Dr,51.203024,-102.484782,Long Term,Apartment,1450.0,3 Beds,1,810,/sk/yorkton/rentals/apartment/3-bedrooms/pet-f...,Unfurnished,Immediate,Non-Smoking,True,True
25768,334284,Yorkton,Saskatchewan,363 Northview Rd,51.202667,-102.485765,Long Term,Apartment,1125.0,1 Bed,1,520,/sk/yorkton/rentals/apartment/1-bedroom/pet-fr...,Unfurnished,Immediate,Non-Smoking,True,True
25769,334284,Yorkton,Saskatchewan,363 Northview Rd,51.202667,-102.485765,Long Term,Apartment,920.0,Studio,1,400,/sk/yorkton/rentals/apartment/1-bedroom/pet-fr...,Unfurnished,Immediate,Non-Smoking,True,True


### Investigating the possible values of each categorical column

In [3]:

print('The options for lease_term are : ', rent_df['lease_term'].unique(), '\n')

print('The options for type are : ', rent_df['type'].unique(), '\n')

print('The options for furnishing are : ', rent_df['furnishing'].unique(), '\n')

print('The options for smoking are : ', rent_df['smoking'].unique(), '\n')

print('The options for cats are : ', rent_df['cats'].unique(), '\n')

print('The options for dogs are : ', rent_df['dogs'].unique(), '\n')

print('The options for beds are : ', rent_df['beds'].unique(), '\n')

The options for lease_term are :  ['Long Term' 'Negotiable' 'Short Term' '12 months' 'months' '6 months' nan] 

The options for type are :  ['Townhouse' 'Apartment' 'Main Floor' 'House' 'Basement' 'Duplex'
 'Storage' 'Condo Unit' 'Room For Rent' 'Loft' 'Acreage' 'Office Space'
 'Parking Spot' 'Vacation Home' 'Mobile'] 

The options for furnishing are :  ['Unfurnished' 'Furnished' 'Negotiable' 'Unfurnished, Negotiable'] 

The options for smoking are :  ['Non-Smoking' 'Smoke Free Building' nan 'Negotiable' 'Smoking Allowed'] 

The options for cats are :  [True False nan] 

The options for dogs are :  [True False nan] 

The options for beds are :  ['2 Beds' '3 Beds' 'Studio' '1 Bed' '5 Beds' '4 Beds' nan '6 Beds'
 'none Beds' '8 Beds' '7 Beds' '9 Beds'] 
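
The `unique()` calls above list the categories but not how often each one occurs. A minimal sketch, assuming a small hypothetical `toy_df` in place of `rent_df`, of looping over categorical columns with `value_counts(dropna=False)` so that rare labels and missing values are visible at a glance:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for rent_df
toy_df = pd.DataFrame({
    'lease_term': ['Long Term', 'Long Term', 'Short Term', np.nan],
    'furnishing': ['Unfurnished', 'Furnished', 'Unfurnished', 'Unfurnished'],
})

counts = {}
for col in ['lease_term', 'furnishing']:
    # dropna=False keeps NaN as its own category in the tally
    counts[col] = toy_df[col].value_counts(dropna=False)
    print(f'Counts for {col}:\n{counts[col]}\n')
```

Counts make it easier to judge whether a rare label (such as 'months' in lease_term) is worth keeping as its own category.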



In [4]:
# Assign the result back to the column instead of using inplace=True on a column selection
rent_df['furnishing'] = rent_df['furnishing'].replace({'Unfurnished, Negotiable': 'Negotiable'})


### Cleaning up the dataframe

I will be removing columns that add no value to the regression analysis:
  * the link and rentfaster_id columns.

I will also be working with the latitude and longitude instead of the address, since one-hot encoding the addresses (one 1/0 column per address) would create a massive number of columns.
  * I will be removing the address column.
  * However, I will keep the city and province columns for now.

I will be removing the string 'Beds' (or 'Bed') from the beds column and replacing 'Studio' and 'none Beds' by 0.

I will be replacing the string 'Immediate' in the column availability_date by today's date, in order to convert the column to a datetime type.

I will be dropping all the rows where the column availability_date has the string 'No Vacancy'.

I will be replacing the string 'Negotiable' and 'Call for Availability' in the column availability_date by today's date.

I will be combining similar terms in the furnishing column.  'Unfurnished, Negotiable' will become 'Negotiable'.

I will be replacing similar terms in the smoking column. 'Smoke Free Building' will be replaced by 'Non-Smoking', and 'Negotiable' by 'Smoking Allowed'.

I will be removing rental ads with a price of 0 (in practice, any ad with a price of 100 or less).

In [5]:
# Removing unnecessary columns
rent_df.drop(columns=['link', 'rentfaster_id','address'], inplace=True)

**Filtering the beds column**

In [6]:
# Looking at the beds possibilities
print('Before =>',rent_df['beds'].unique())

# Removing 'Beds' / 'Bed' from the column beds
rent_df['beds'] = rent_df['beds'].replace(r'\s*Beds?\s*', '', regex=True)

# Replacing Studio and none by 0
rent_df['beds'] = rent_df['beds'].replace(r'\b(Studio|none)\b', '0', regex=True)

# Verifying that the string replacement was done correctly
print('After =>',rent_df['beds'].unique())

Before => ['2 Beds' '3 Beds' 'Studio' '1 Bed' '5 Beds' '4 Beds' nan '6 Beds'
 'none Beds' '8 Beds' '7 Beds' '9 Beds']
After => ['2' '3' '0' '1' '5' '4' nan '6' '8' '7' '9']
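
The cleaned values shown above are still strings. A minimal sketch, on a hypothetical toy Series rather than the real column, of applying the same two replacements and then converting to a numeric dtype; the choice of the nullable `Int64` dtype is an assumption, made so that missing values survive the conversion:

```python
import numpy as np
import pandas as pd

# Hypothetical toy values in the same shape as the raw beds column
beds = pd.Series(['2 Beds', 'Studio', '1 Bed', 'none Beds', np.nan])

# Same two replacements as in the cell above, written as assignments
beds = beds.replace(r'\s*Beds?\s*', '', regex=True)
beds = beds.replace(r'\b(Studio|none)\b', '0', regex=True)

# Assumption: a nullable integer dtype is wanted, so NaN survives as <NA>
beds_num = pd.to_numeric(beds, errors='coerce').astype('Int64')
print(beds_num.tolist())
```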




**Filtering the availability_date column**

In [7]:
# Look at the possible strings in the column availability_date
print('Before =>', rent_df['availability_date'].unique())

# Looking at the frequency of occurrence of 'Negotiable' and 'Call for Availability' in the column availability_date
print(rent_df[rent_df['availability_date'].isin(['Negotiable', 'Call for Availability'])].count())

# Looking at the frequency of occurrence of 'No Vacancy' in the column availability_date
print(rent_df[rent_df['availability_date'].isin(['No Vacancy'])].count())


Before => ['Immediate' 'November 18' 'July 01' 'August 01' 'June 15' 'July 31'
 'June 30' 'July 15' 'July 07' 'July 18' 'August 07' 'August 15'
 'September 01' 'Negotiable' 'No Vacancy' 'June 14' 'July 04' 'July 05'
 'August 06' 'July 11' 'June 19' 'Call for Availability' 'September 07'
 'July 12' 'June 17' 'October 03' 'June 25' 'June 18' 'August 23'
 'July 23' 'July 02' 'October 01' 'July 08' 'June 16' 'June 20' 'June 21'
 'June 22' 'June 23' 'June 24' 'June 26' 'June 28' 'June 29' 'July 22'
 'November 01' 'July 03' 'July 09' 'July 06' 'July 10' 'July 16' 'July 17'
 'July 20' 'July 27' 'July 30' 'August 02' 'August 04' 'August 05'
 'August 09' 'August 10' 'August 11' 'August 16' 'August 20' 'August 22'
 'August 25' 'August 26' 'August 29' 'August 30' 'August 31'
 'September 02' 'September 03' 'September 05' 'September 08'
 'September 15' 'September 30' 'October 04' 'October 07' 'October 11'
 'October 16' 'October 19' 'October 31' 'November 15' 'December 01'
 'December 30' nan 'Septem

In [8]:
# Removing ads (removing the entire row) where there is 'No Vacancy'
# .copy() makes the filtered result an independent DataFrame
rent_df = rent_df[~rent_df['availability_date'].isin(['No Vacancy'])].copy()
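
Filtering rows and then modifying a column of the result is the pattern that tends to trigger pandas' "A value is trying to be set on a copy of a slice" warning. A minimal sketch, assuming a hypothetical `toy_df`, of taking an explicit `.copy()` after the filter and assigning the replaced column back:

```python
import pandas as pd

# Hypothetical toy data standing in for rent_df
toy_df = pd.DataFrame({
    'availability_date': ['Immediate', 'No Vacancy', 'July 01'],
    'price': [1200, 900, 1500],
})

# Filter out 'No Vacancy' rows and take an explicit copy of the result
available = toy_df[toy_df['availability_date'] != 'No Vacancy'].copy()

# Assigning the replaced column on the copy leaves toy_df untouched
available['availability_date'] = available['availability_date'].replace(
    {'Immediate': 'June 01'})
```

The value 'June 01' is only a placeholder for illustration; the original frame keeps its 'Immediate' entry.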

In [9]:
# Retrieve today's date in the same 'Month day' format as the column
today = datetime.date.today().strftime('%B %d')

# Replace the placeholder strings by today's date
rent_df['availability_date'] = rent_df['availability_date'].replace(
    {'Immediate': today, 'Negotiable': today, 'Call for Availability': today})

print('After =>', rent_df['availability_date'].unique())

# Convert the column availability_date to a timestamp, assuming the current year
rent_df['availability_date'] = pd.to_datetime(rent_df['availability_date'] + f", {datetime.date.today().year}", format="%B %d, %Y")

rent_df['availability_date']

After => ['February 22' 'November 18' 'July 01' 'August 01' 'June 15' 'July 31'
 'June 30' 'July 15' 'July 07' 'July 18' 'August 07' 'August 15'
 'September 01' 'June 14' 'July 04' 'July 05' 'August 06' 'July 11'
 'June 19' 'September 07' 'July 12' 'June 17' 'October 03' 'June 25'
 'June 18' 'August 23' 'July 23' 'July 02' 'October 01' 'July 08'
 'June 16' 'June 20' 'June 21' 'June 22' 'June 23' 'June 24' 'June 26'
 'June 28' 'June 29' 'July 22' 'November 01' 'July 03' 'July 09' 'July 06'
 'July 10' 'July 16' 'July 17' 'July 20' 'July 27' 'July 30' 'August 02'
 'August 04' 'August 05' 'August 09' 'August 10' 'August 11' 'August 16'
 'August 20' 'August 22' 'August 25' 'August 26' 'August 29' 'August 30'
 'August 31' 'September 02' 'September 03' 'September 05' 'September 08'
 'September 15' 'September 30' 'October 04' 'October 07' 'October 11'
 'October 16' 'October 19' 'October 31' 'November 15' 'December 01'
 'December 30' nan 'September 10' 'June 27' 'October 15' 'July 13'
 'July 14


0       2025-02-22
1       2025-02-22
2       2025-02-22
3       2025-11-18
4       2025-02-22
           ...    
25766   2025-02-22
25767   2025-02-22
25768   2025-02-22
25769   2025-02-22
25770   2025-02-22
Name: availability_date, Length: 24800, dtype: datetime64[ns]
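
The conversion above can be checked on a few values in isolation. A minimal sketch, using a hypothetical toy Series, of mapping the placeholder strings to today's date, appending the current year (an assumption, since the listings carry no year) and parsing with an explicit format; missing entries come out as `NaT`:

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical toy values in the same format as availability_date
dates = pd.Series(['Immediate', 'November 18', 'Negotiable', np.nan])

today = datetime.date.today()
placeholders = ['Immediate', 'Negotiable', 'Call for Availability']

# Map every placeholder to today's date in 'Month day' form
dates = dates.replace({p: today.strftime('%B %d') for p in placeholders})

# Append the current year (assumption: listings carry no year) and parse
parsed = pd.to_datetime(dates + f', {today.year}', format='%B %d, %Y')
print(parsed)
```

One consequence of this assumption is that a listing dated earlier in the calendar year than today is parsed as a date in the past.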

**Combining similar strings in the furnishing column**

In [10]:
# Look at the possible strings in the column furnishing
print('Before =>', rent_df['furnishing'].unique())

rent_df['furnishing'] = rent_df['furnishing'].replace({'Unfurnished, Negotiable': 'Negotiable'})

# Look at the possible strings in the column furnishing
print('After =>', rent_df['furnishing'].unique())



Before => ['Unfurnished' 'Furnished' 'Negotiable']
After => ['Unfurnished' 'Furnished' 'Negotiable']


**Combining similar strings in the smoking column**

In [11]:
# Look at the possible strings in the column smoking
print('Before =>', rent_df['smoking'].unique())

# Replace 'Smoke Free Building' by 'Non-Smoking' and 'Negotiable' by 'Smoking Allowed'
rent_df['smoking'] = rent_df['smoking'].replace(
    {'Smoke Free Building': 'Non-Smoking', 'Negotiable': 'Smoking Allowed'})

# Look at the possible strings in the column smoking
print('After =>', rent_df['smoking'].unique())

Before => ['Non-Smoking' 'Smoke Free Building' nan 'Negotiable' 'Smoking Allowed']
After => ['Non-Smoking' nan 'Smoking Allowed']



**Cleaning up the sq_feet column**

In [12]:
# Keeping only the first token (drops a trailing unit such as 'sq ft')
rent_df['sq_feet'] = rent_df['sq_feet'].str.split(' ', expand=True)[0]

# Dropping a + sign and anything after it
rent_df['sq_feet'] = rent_df['sq_feet'].str.split('+', expand=True)[0]

# Removing thousands separators
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace(',', '', regex=True)

# Keeping the lower bound of a range
rent_df['sq_feet'] = rent_df['sq_feet'].str.split('-', expand=True)[0]

# Dropping the decimal part
rent_df['sq_feet'] = rent_df['sq_feet'].str.split('.', expand=True)[0]

# Removing the ~ sign
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace('~', '', regex=True)

# Removing the 'sf' unit
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace('sf', '', regex=True)

# Removing the word 'Large'
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace('Large', '', regex=True)

# Setting entries containing 'approx' to NaN and stripping 'Approximately'
rent_df['sq_feet'] = rent_df['sq_feet'].replace(r'approx', np.nan, regex=True)
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace('Approximately.', '', regex=True)
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace('Approximately', '', regex=True)

# Setting entries containing 'TOL' to NaN
rent_df['sq_feet'] = rent_df['sq_feet'].replace(r'TOL', np.nan, regex=True)

# Removing any remaining non-digit characters
rent_df['sq_feet'] = rent_df['sq_feet'].str.replace(r'[^\d]', '', regex=True)

# Replacing empty strings by NaN
rent_df['sq_feet'] = rent_df['sq_feet'].replace('', np.nan, regex=True)

# Converting the column to a numeric type
rent_df['sq_feet'] = pd.to_numeric(rent_df['sq_feet'], errors='coerce')
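
As an alternative to the step-by-step chain above, the first number in each entry could be pulled out with a single regular expression. This is only a sketch on hypothetical toy values, not a drop-in replacement: it behaves differently on some inputs (for example, an entry such as 'approx 900' would yield 900 here, whereas the chain above turns it into NaN).

```python
import pandas as pd

# Hypothetical toy values resembling the messy sq_feet entries
raw = pd.Series(['1403', '1,200 sq ft', '850+', '900-1100', '~700', 'Large', None])

# Capture the first run of digits (commas allowed inside the number)
first_number = raw.str.extract(r'(\d[\d,]*)', expand=False)

# Drop thousands separators, then convert; entries without digits become NaN
sq_feet = pd.to_numeric(first_number.str.replace(',', '', regex=False), errors='coerce')
print(sq_feet.tolist())
```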


**Cleaning up the price column**
Removing likely typos and ads with no price ($0) by keeping only prices above $100.

In [13]:
rent_df = rent_df[rent_df['price'] > 100.].copy()

**Cleaning up the baths column**

In [14]:
# Replacing none by 0
rent_df['baths'] = rent_df['baths'].replace(r'\b(none)\b', '0', regex=True)



**Creating new column for pets**

**Removing previous cats/dogs column**

### Investigating whether the dataset contains missing values

In [15]:
rent_df.isna().sum()

city                    0
province                0
latitude                0
longitude               0
lease_term             12
type                    0
price                   0
beds                  128
baths                 130
sq_feet              3936
furnishing              0
availability_date      12
smoking              2656
cats                  143
dogs                  143
dtype: int64

I want to dig into the missing values.

I will look at which property types have no square footage indicated. I might want to drop certain property types (e.g. storage, garage, ...).

In [16]:
(rent_df[rent_df['sq_feet'].isna()].groupby(['type'])['city'].count() / rent_df.groupby(['type'])['city'].count()) *100

type
Acreage           36.363636
Apartment         12.109146
Basement          25.890163
Condo Unit        15.193247
Duplex            20.000000
House             18.306351
Loft              29.166667
Main Floor        19.876543
Mobile            50.000000
Office Space      91.935484
Parking Spot      88.095238
Room For Rent     51.675978
Storage          100.000000
Townhouse         15.350032
Vacation Home           NaN
Name: city, dtype: float64
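
The NaN shown for Vacation Home most likely arises because no Vacation Home row is missing sq_feet, so that type is absent from the numerator and the division has nothing to align with. A minimal sketch, on hypothetical toy data, of an alternative that takes the mean of the boolean `isna()` mask per group and therefore reports 0 for such a type:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for rent_df
toy_df = pd.DataFrame({
    'type': ['House', 'House', 'Storage', 'Storage', 'Vacation Home'],
    'sq_feet': [1500, np.nan, np.nan, np.nan, 900],
})

# Mean of a boolean mask per group = fraction of missing values in that group
missing_pct = toy_df['sq_feet'].isna().groupby(toy_df['type']).mean() * 100
print(missing_pct)
```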

I will be removing all the Storage listings, since none of them include square footage.

I will also be removing the Parking Spot category, since most of its listings are missing the square footage.

I will also be removing the Office Space and Acreage categories to focus mainly on standard housing.


In [17]:
# Removing the property types based on the previous analysis
rent_df = rent_df[~rent_df['type'].isin(['Acreage', 'Office Space', 'Parking Spot', 'Storage'])].copy()

In [18]:
rent_df.isna().sum()

city                    0
province                0
latitude                0
longitude               0
lease_term             12
type                    0
price                   0
beds                    0
baths                   0
sq_feet              3797
furnishing              0
availability_date       8
smoking              2654
cats                    0
dogs                    0
dtype: int64

In [19]:
(rent_df[rent_df['sq_feet'].isna()].groupby(['type'])['city'].count() / rent_df.groupby(['type'])['city'].count()) *100

type
Apartment        12.109146
Basement         25.890163
Condo Unit       15.193247
Duplex           20.000000
House            18.306351
Loft             29.166667
Main Floor       19.876543
Mobile           50.000000
Room For Rent    51.675978
Townhouse        15.350032
Vacation Home          NaN
Name: city, dtype: float64

Replacing the NaN values in sq_feet by the group mean, grouped by building type and number of beds
  * Even after this replacement, some groups have no non-NaN data from which to compute a mean.
  * I will be dropping the remaining rows that still have a NaN value
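
A minimal sketch, on hypothetical toy data, of why some values stay missing after a grouped-mean fill: when every row of a (type, beds) group is NaN, the group mean is itself NaN, so `fillna` has nothing to fill with.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the Loft group has no known square footage at all
toy_df = pd.DataFrame({
    'type': ['House', 'House', 'House', 'Loft', 'Loft'],
    'beds': ['2', '2', '2', '0', '0'],
    'sq_feet': [1000, np.nan, 1200, np.nan, np.nan],
})

# Fill each missing value with the mean of its (type, beds) group
filled = toy_df.groupby(['type', 'beds'])['sq_feet'].transform(
    lambda x: x.fillna(x.mean()))
print(filled.tolist())
```

The House row is filled with the mean of 1000 and 1200, while both Loft rows remain NaN.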

In [20]:
# Replacing Nan
rent_df['sq_feet'] = rent_df.groupby(['type', 'beds'])['sq_feet'].transform(lambda x: x.fillna(x.mean()))

# Verifying that there is no longer nan value for the sq_feet
rent_df.isna().sum()

# Investigating the remaining Nan
group_means = rent_df.groupby(['type', 'beds'])['sq_feet'].mean()

print(group_means[group_means.isna()])

print(rent_df[rent_df['type'] == 'House'].groupby('beds')['sq_feet'].mean())

# Dropping the remaining NaN in sq_feet
rent_df = rent_df[~rent_df['sq_feet'].isna()].copy()

# Converting the column to int
rent_df['sq_feet'] = rent_df['sq_feet'].astype(int)

# Verifying that there is no longer nan value for the sq_feet
rent_df.isna().sum()

type           beds
Duplex         0      NaN
               6      NaN
House          9      NaN
Loft           0      NaN
Mobile         2      NaN
Room For Rent  0      NaN
Name: sq_feet, dtype: float64
beds
0    2056.666667
1     716.833333
2    1086.728395
3    1707.000000
4    2108.715311
5    2262.888158
6    2088.000000
7    4000.000000
8    9133.333333
9            NaN
Name: sq_feet, dtype: float64


city                    0
province                0
latitude                0
longitude               0
lease_term             12
type                    0
price                   0
beds                    0
baths                   0
sq_feet                 0
furnishing              0
availability_date       8
smoking              2653
cats                    0
dogs                    0
dtype: int64

I will be dropping the rows with a NaN value in lease_term, because they represent only a small fraction of the dataset (12 rows).

In [21]:
# Dropping the remaining NaN in lease_term
rent_df = rent_df[~rent_df['lease_term'].isna()].copy()

# Verifying that there is no longer nan value for the lease_term
rent_df.isna().sum()

city                    0
province                0
latitude                0
longitude               0
lease_term              0
type                    0
price                   0
beds                    0
baths                   0
sq_feet                 0
furnishing              0
availability_date       8
smoking              2653
cats                    0
dogs                    0
dtype: int64

I will be replacing the NaN values in smoking by the most frequent smoking category, which is Non-Smoking.

In [22]:
# Look at the occurrence count of each smoking category
print(rent_df.groupby(['smoking'])['city'].count())

# Replace Nan value in smoking
rent_df['smoking'] = rent_df['smoking'].fillna('Non-Smoking')

# Verifying that there is no longer nan value for the smoking
rent_df.isna().sum()

smoking
Non-Smoking        21497
Smoking Allowed      269
Name: city, dtype: int64


city                 0
province             0
latitude             0
longitude            0
lease_term           0
type                 0
price                0
beds                 0
baths                0
sq_feet              0
furnishing           0
availability_date    8
smoking              0
cats                 0
dogs                 0
dtype: int64
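
The fill value above is hard-coded after reading the counts. A minimal sketch, on a hypothetical toy Series, of deriving the most frequent category with `mode()` instead, so the fill stays correct if the data changes:

```python
import numpy as np
import pandas as pd

# Hypothetical toy values standing in for the smoking column
smoking = pd.Series(['Non-Smoking', 'Non-Smoking', 'Smoking Allowed', np.nan])

# mode() ignores NaN by default and returns the most frequent value(s)
most_frequent = smoking.mode().iloc[0]

# Fill the missing entries with that category
smoking_filled = smoking.fillna(most_frequent)
```

If two categories are tied, `mode()` returns both in sorted order and `.iloc[0]` picks the first, which may or may not be the desired tie-break.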

I will be replacing Nan value in availability_date by today's date 

In [23]:
# availability_date is already a datetime column, so fill with a Timestamp (today at midnight)
rent_df['availability_date'] = rent_df['availability_date'].fillna(pd.Timestamp(datetime.date.today()))

# Verifying that there is no longer nan value for the availability_date
rent_df.isna().sum()

city                 0
province             0
latitude             0
longitude            0
lease_term           0
type                 0
price                0
beds                 0
baths                0
sq_feet              0
furnishing           0
availability_date    0
smoking              0
cats                 0
dogs                 0
dtype: int64

#### Converting price column data type to int

In [24]:
rent_df['price'] = rent_df['price'].astype(int)

### Saving the cleaned data to a csv file

In [25]:
# Reset index (reset_index returns a new DataFrame, so the result is assigned back)
rent_df = rent_df.reset_index(drop=True)

# Writing to file 
rent_df.to_csv('../Data/canada_rent_clean.csv', index=False)
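
A CSV file stores dates as plain text, so the datetime dtype of availability_date is not preserved on disk. A minimal sketch, using a hypothetical toy frame and an in-memory buffer as a stand-in for the file path, of resetting the index, writing the data and reading it back with `parse_dates`:

```python
import io

import pandas as pd

# Hypothetical toy frame with a non-contiguous index, as left behind by row filtering
toy_df = pd.DataFrame({
    'price': [2495, 920],
    'availability_date': pd.to_datetime(['2025-02-22', '2025-11-18']),
}, index=[0, 7])

# reset_index returns a new DataFrame, so the result has to be assigned
toy_df = toy_df.reset_index(drop=True)

# Write to an in-memory buffer (stand-in for a file path) and read it back
buffer = io.StringIO()
toy_df.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer, parse_dates=['availability_date'])
```

Since the file is written with `index=False`, resetting the index mainly matters for further work on the DataFrame in memory.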
