# Apartments Analysis

In this project, we analyze apartments for rent in the city of Tel Aviv.  
Our main goal is to understand the Tel Aviv rental market in depth.  

### Introduction and Background
Finding an apartment for rent is not a simple task. It usually requires viewing a large number of apartments 
and accepting certain compromises before making a decision and moving in.  
To that end, we performed an in-depth analysis of the rental market.  
We chose this topic because we find ourselves  
spending a lot of time each day searching for apartments to rent. 

### Data acquisition
For this project, we use data collected from ads published on the Yad2 website.  
All the information was collected using a web crawler.  


### Loading the data  
Our data is stored in a CSV file.  
We'll start by reading the data into a pandas DataFrame and looking at some basic statistics.  

In [91]:
import pandas as pd

In [92]:
# Read in the data
apartments = pd.read_csv('data/apartments.csv', encoding='ISO-8859-8')

#features = ', '.join(apartments.columns.values)

print(f'Apartments dataset has {apartments.shape[0]} entries with {apartments.shape[1]} variables.')
print('the data features are: ')
apartments[0:0]  # empty slice: displays only the column headers

Apartments dataset has 360 entries with 11 variables.
the data features are: 


Title,Area,Price,Rooms,Floor,Square-meter,Area-text,Elevator,Air-condition,Refurbished,Furniture
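
The `encoding` argument matters when reading Hebrew text. As a minimal sketch (not part of the original analysis), the snippet below compares ISO-8859-8 with Windows-1255 (`cp1255`), a closely related Hebrew encoding. That the CSV may originally have been written in Windows-1255 is only an assumption on our part; if it was, a `₪` sign in the file would be read back as `¤` under ISO-8859-8.

```python
# Hebrew letters map to the same bytes in both encodings
text = 'לא צוין'
same_bytes = text.encode('iso-8859-8') == text.encode('cp1255')
print(same_bytes)

# Byte 0xA4 differs: it is the shekel sign in Windows-1255,
# but the generic currency sign in ISO-8859-8
shekel_bytes = '₪'.encode('cp1255')
decoded_as_iso = shekel_bytes.decode('iso-8859-8')
print(shekel_bytes, repr(decoded_as_iso))
```

If currency symbols look wrong after loading, trying `encoding='cp1255'` in `pd.read_csv` may be worth a quick check.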


### Feature Observation
- **Title**: the apartment's address
- **Area**: the part of the city (we divided it into north, south, east, and west)
- **Price**: monthly rent in ILS
- **Rooms**: number of rooms in the apartment
- **Floor**: the floor of the building on which the apartment is located
- **Square-meter**: the size of the apartment in square meters
- **Elevator/Air-condition/Refurbished/Furniture**: additional features that may or may not be present

## Data cleaning


### Handling missing data
Some of the values are missing; they are annotated as 'לא צוין' ("not specified").

In [93]:
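# Remove rows with missing values
# 'לא צוין מחיר' = "price not specified", 'לא צוין' = "not specified"
apartments = apartments[apartments['Price'] != 'לא צוין מחיר']
# .copy() avoids SettingWithCopyWarning when columns are reassigned later
apartments = apartments[apartments['Rooms'] != 'לא צוין'].copy()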

print(f'Now the dataset has {apartments.shape[0]} entries.')

Now the dataset has 343 entries.
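
An alternative way to express the same step is to turn the placeholder strings into `NaN` first and then drop incomplete rows. The sketch below is only illustrative: the `toy` DataFrame and its values are made up, and `MISSING_MARKERS` is a hypothetical name, not something from the original notebook.

```python
import pandas as pd

# Toy data standing in for the real CSV (values are made up for illustration)
toy = pd.DataFrame({
    'Price': ['5,000 ₪', 'לא צוין מחיר', '7,200 ₪', '4,300 ₪'],
    'Rooms': ['3', '2.5', 'לא צוין', '2'],
})

# Placeholder strings used for missing values
# ("not specified" and "price not specified")
MISSING_MARKERS = ['לא צוין', 'לא צוין מחיר']

# Replace every placeholder cell with NaN, then drop rows missing either field
masked = toy.mask(toy.isin(MISSING_MARKERS))
cleaned = masked.dropna(subset=['Price', 'Rooms'])

print(f'Kept {len(cleaned)} of {len(toy)} rows.')
```

Marking values as `NaN` keeps the placeholder list in one place and lets `isna().sum()` report how many values are missing per column before anything is dropped.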


### Handling Data types
Let's look at the variable types.

In [94]:
apartments.dtypes

Title            object
Area             object
Price            object
Rooms            object
Floor            object
Square-meter      int64
Area-text        object
Elevator           bool
Air-condition      bool
Refurbished        bool
Furniture          bool
dtype: object

We can see that some variables, such as **Price**, **Rooms**, and **Floor**, are stored as generic objects (strings) rather than numbers, so we need to convert them.

In [95]:
# Convert columns to appropriate numeric types
# Price: strip thousands separators and currency symbols before casting
apartments['Price'] = apartments['Price'].apply(lambda s: s.replace(',', '').replace('¤', '').replace('₪', '')).astype('float')
apartments['Rooms'] = apartments['Rooms'].astype('float')
# Floor: 'קרקע' means "ground" floor, so map it to 0
apartments.loc[apartments['Floor'] == 'קרקע', 'Floor'] = 0
apartments['Floor'] = apartments['Floor'].astype('int')
apartments.dtypes

Title             object
Area              object
Price            float64
Rooms            float64
Floor              int32
Square-meter       int64
Area-text         object
Elevator            bool
Air-condition       bool
Refurbished         bool
Furniture           bool
dtype: object
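
The conversion above raises an error if a column contains an unexpected string. A more forgiving variant, sketched below, uses `pd.to_numeric` with `errors='coerce'`, which turns unparseable values into `NaN` so they can be inspected afterwards. The helper name `to_number` and the `toy` data are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Toy data standing in for the cleaned dataset (values are made up)
toy = pd.DataFrame({
    'Price': ['5,000 ₪', '7,200 ¤', '4300'],
    'Rooms': ['3', '2.5', '4'],
    'Floor': ['קרקע', '2', '10'],
})

def to_number(series, mapping=None):
    """Hypothetical helper: map known labels, strip symbols, convert to numbers."""
    s = series.astype(str)
    if mapping:
        # Replace whole-cell labels such as 'קרקע' (ground floor) first
        s = s.replace(mapping)
    # Keep only digits, decimal points, and minus signs
    s = s.str.replace(r'[^\d.\-]', '', regex=True)
    # Values that still cannot be parsed become NaN instead of raising
    return pd.to_numeric(s, errors='coerce')

converted = pd.DataFrame({
    'Price': to_number(toy['Price']).astype('float'),
    'Rooms': to_number(toy['Rooms']).astype('float'),
    'Floor': to_number(toy['Floor'], mapping={'קרקע': '0'}),
})

print(converted.dtypes)
print(converted.isna().sum())  # any non-zero count points at unparseable values
```

Checking `isna().sum()` after the conversion shows whether any value was silently coerced, which is the main trade-off compared with the stricter `astype` calls.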