## Explore the rent amount (R$)

### Multivariate statistics

#### Questions 
- Which city has the highest average rent?
- How many bathrooms are there on average in homes with higher rent?
- Do the most expensive properties accept animals?
- Are the most expensive properties furnished?


#### Hypotheses
- São Paulo is the city with the most expensive rent.
- The more bathrooms a property has, the higher the rent.
- Furnished properties have the most expensive rent.

In [155]:
import pandas as pd

In [100]:
df = pd.read_csv('houses_to_rent_v2.csv')

In [101]:
df.rename(columns = {'rent amount (R$)': 'rent_amount'}, inplace = True)

In [102]:
df.columns

Index(['city', 'area', 'rooms', 'bathroom', 'parking spaces', 'floor',
       'animal', 'furniture', 'hoa (R$)', 'rent_amount', 'property tax (R$)',
       'fire insurance (R$)', 'total (R$)'],
      dtype='object')

In [103]:
# Which city has the highest average rent?
# R: São Paulo

df.groupby('city')['rent_amount'].mean().reset_index().sort_values('rent_amount', ascending = False)

             city  rent_amount
4       São Paulo  4652.793783
0  Belo Horizonte  3664.127981
3  Rio de Janeiro  3232.904064
1        Campinas  2364.290739
2    Porto Alegre  2337.699916


In [184]:
# How many bathrooms are there on average in homes with higher rent?
## definition: higher rents are above 5000

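df['rent_amount_classification'] = ['Higher' if x > 5000 else 'Low' for x in df['rent_amount']]

# Boolean mask for the higher-rent rows, reused in the cells below
filter_higher = df['rent_amount_classification'] == 'Higher'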

In [185]:
df['rent_amount_classification'].value_counts()

Low       8153
Higher    2539
Name: rent_amount_classification, dtype: int64

In [186]:
df.groupby('rent_amount_classification')['bathroom'].mean()

#R: About 3.7 bathrooms on average (roughly 4), versus about 1.8 for lower rents

rent_amount_classification
Higher    3.729027
Low       1.772108
Name: bathroom, dtype: float64
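
The classification step above uses a list comprehension. As a minimal sketch, the same rule (rent strictly above 5000 is 'Higher', otherwise 'Low') can be written in vectorized form with `numpy.where`. The `sample` DataFrame below is hypothetical data made up for illustration; it is not taken from `houses_to_rent_v2.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical rents, used only to illustrate the technique
sample = pd.DataFrame({'rent_amount': [1200, 5000, 5001, 15000]})

# Same rule as in the notebook: strictly greater than 5000 is 'Higher'
sample['rent_amount_classification'] = np.where(
    sample['rent_amount'] > 5000, 'Higher', 'Low'
)

print(sample)
```

Note that a rent of exactly 5000 falls in the 'Low' group under this definition.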

In [208]:
# Do the most expensive properties accept animals?
## definition: higher rents are above 5000

aux = df.query('rent_amount_classification == "Higher" & animal == "acept"')['animal'].count()
# print('acept: ' + str(aux))

# aux = df.query('rent_amount_classification == "Higher" & animal == "not acept"')['animal'].count()
# print('not acept: ' + str(aux))

# print('\ntotal: ' + str(df[filter_higher]['animal'].count()))

aux * 100 / df[filter_higher]['animal'].count()

#R: Approximately 82%, yes

82.07955888144939
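
The percentage above is computed by counting one category and dividing by the group total. A possible shortcut, sketched here on hypothetical data, is `value_counts(normalize=True)`, which returns the share of every category at once. The `sample` rows are invented for illustration; only the column names and the `acept` / `not acept` labels follow the notebook.

```python
import pandas as pd

# Hypothetical rows, used only to illustrate the technique
sample = pd.DataFrame({
    'rent_amount_classification': ['Higher', 'Higher', 'Higher', 'Higher', 'Low', 'Low'],
    'animal': ['acept', 'acept', 'acept', 'not acept', 'acept', 'not acept'],
})

# Keep only the higher-rent rows, then compute the share of each category
higher = sample[sample['rent_amount_classification'] == 'Higher']
shares = higher['animal'].value_counts(normalize=True) * 100

print(shares)
```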

In [209]:
# Are the most expensive properties furnished?
## definition: higher rents are above 5000

aux = df.query('rent_amount_classification == "Higher" & furniture == "furnished"')['furniture'].count()
# print('furnished: ' + str(aux))

# aux = df.query('rent_amount_classification == "Higher" & furniture == "not furnished"')['furniture'].count()
# print('not furnished: ' + str(aux))

# print('\ntotal: ' + str(df[filter_higher]['furniture'].count()))

aux * 100 / df[filter_higher]['furniture'].count()

#R: No, only about 35% of them are furnished

34.580543521071284
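
The share above answers whether expensive properties tend to be furnished. The hypothesis "Furnished properties have the most expensive rent" asks the reverse question, which could be checked by comparing rent statistics per furniture status. The following is a minimal sketch on hypothetical data; the values are invented and say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical rows, used only to illustrate the technique
sample = pd.DataFrame({
    'furniture': ['furnished', 'furnished', 'not furnished', 'not furnished', 'not furnished'],
    'rent_amount': [4000, 6000, 2000, 3000, 7000],
})

# Mean and median rent per furniture status; the median is less sensitive to outliers
summary = sample.groupby('furniture')['rent_amount'].agg(['mean', 'median', 'count'])

print(summary)
```

Reporting the median alongside the mean is a design choice: rent distributions are often skewed, so a few very expensive listings can pull the mean upward.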

In [216]:
# The more bathrooms in a property the higher the value of the rent.

df[['rent_amount', 'bathroom']].corr(method = 'pearson')

             rent_amount  bathroom
rent_amount     1.000000  0.668504
bathroom        0.668504  1.000000


In [213]:
df[['rent_amount', 'bathroom']].corr(method = 'spearman')

#R: Yes, there is a strong positive correlation

             rent_amount  bathroom
rent_amount      1.00000   0.71589
bathroom         0.71589   1.00000
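
The pairwise check above can also be extended to every numeric column in a single call. The sketch below selects the numeric columns with `select_dtypes`, computes the Spearman correlation matrix, and keeps only the `rent_amount` column, sorted from strongest to weakest. The `sample` DataFrame is hypothetical and exists only to make the example runnable.

```python
import pandas as pd

# Hypothetical rows, used only to illustrate the technique
sample = pd.DataFrame({
    'city': ['A', 'B', 'C', 'D', 'E'],
    'rent_amount': [1000, 2000, 3000, 4000, 5000],
    'bathroom': [1, 1, 2, 3, 4],
    'area': [30, 50, 45, 90, 120],
})

# Drop non-numeric columns such as 'city' before computing correlations
numeric = sample.select_dtypes(include='number')

# Spearman correlation of every numeric column with rent_amount,
# excluding the trivial correlation of rent_amount with itself
corr_with_rent = (
    numeric.corr(method='spearman')['rent_amount']
    .drop('rent_amount')
    .sort_values(ascending=False)
)

print(corr_with_rent)
```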


### Example of correlation checking

In [220]:
aux = pd.DataFrame({ 'columns': df.columns, 'types': df.dtypes })

In [223]:
# Use a descriptive name so the built-in `list` is not shadowed
int_columns = list(aux[aux['types'] == 'int64']['columns'])

In [233]:
for column in int_columns:
    print(column)
    print(df[['rent_amount', column]].corr(method = 'spearman'), '\n\n')

area
             rent_amount      area
rent_amount     1.000000  0.728095
area            0.728095  1.000000 


rooms
             rent_amount     rooms
rent_amount     1.000000  0.600969
rooms           0.600969  1.000000 


bathroom
             rent_amount  bathroom
rent_amount      1.00000   0.71589
bathroom         0.71589   1.00000 


parking spaces
                rent_amount  parking spaces
rent_amount        1.000000        0.620175
parking spaces     0.620175        1.000000 


hoa (R$)
             rent_amount  hoa (R$)
rent_amount     1.000000  0.355785
hoa (R$)        0.355785  1.000000 


rent_amount
             rent_amount  rent_amount
rent_amount          1.0          1.0
rent_amount          1.0          1.0 


property tax (R$)
                   rent_amount  property tax (R$)
rent_amount            1.00000            0.65923
property tax (R$)      0.65923            1.00000 


fire insurance (R$)
                     rent_amount  fire insurance (R$)
rent_amount    