___
# Data Science - PROJECT 3

___
### *Bruno Kaczelnik, Guilherme Lotaif, Renato Yasuo Chopard Tajima, Thiago Verardo*
___

# Airbnb analysis in the largest US cities

___

## 1. Introduction

In this project we analyze the Airbnb data we have for the United States, restricted to the variables we judge relevant to our study. Different **prediction methods** will be used to estimate `the price of a rental given the characteristics provided`. After applying each method, we compare their results and then draw the project's conclusions.

A little more about the Airbnb digital platform: it is used to rent a house or apartment from other users of the platform. With listings in 192 countries, it is a very large platform; in the United States alone there are approximately 600,000 possible places to rent. With so many options, users choose a place based on the **information provided by the owner**, which can range from how many bathrooms the residence has to whether it has access to a Wi-Fi network.

**Dataset source:** www.kaggle.com/rudymizrahi/airbnb-listings-in-major-us-cities-deloitte-ml

___
## 2. Data mining and dataset characteristics

The dataset used in this project has many columns with information about Airbnb listings in the United States, so we need to go through all of them, clean them up, and keep only the information relevant to our analysis.

**Importing the libraries that will be used:**

In [1]:
# Importing libraries:
import os
import numpy as np
import pandas as pd
import seaborn as sns
import pandas_profiling
import matplotlib.pyplot as plt
from scipy.stats import norm,probplot

plt.style.use('ggplot')

In [2]:
print('We expect to work in the directory')
print(os.getcwd())

We expect to work in the directory
/Users/guilherme/Downloads/Projeto-3-CDD---DPA-master


### - Dataset

In [3]:
# Loading the training file:
df = pd.read_csv('train.csv')

In [4]:
# Checking the size of the dataframe:
linhas, colunas = df.shape
print("The DataFrame has {0} rows by {1} columns.".format(linhas, colunas))

The DataFrame has 74111 rows by 29 columns.


...

#### Cleaning the dataframe for analysis:
This step prepares the dataframe so that it is easier to analyze, avoiding complications or errors that could later get in the way of our predictive models. We will clean up the column names, remove null values from each category, and drop unnecessary columns.

In [5]:
# Stripping whitespace from the column names:
df.columns = [espaços.strip() for espaços in df.columns.tolist()]

In [6]:
# Dropping columns that are not relevant to the study:
df1 = df.drop(["latitude", "longitude", "name", "thumbnail_url", "id", "description", "log_price"], axis=1)
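The `log_price` column listed in the cell above appears to be the natural logarithm of the listing price: frequent values reported later in the profiling output, such as 4.605170185988092 and 5.0106352940962555, map back to round prices of about 100 and 150. A minimal sketch of the conversion, assuming the natural log was used (this is an inference from the values, not something stated in the notebook):

```python
import numpy as np

# Two log_price values taken from the profiling output of this notebook
log_prices = np.array([4.605170185988092, 5.0106352940962555])

# Assumption: log_price = ln(price), so price = exp(log_price)
prices = np.exp(log_prices)
print(np.round(prices, 2))
```

If that assumption holds, predictions made on the log scale can be converted back to prices with the same `np.exp` call.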

In [7]:
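# Dropping rows with null values that could cause problems later.
# Note: this starts again from df, so the columns dropped in the previous cell are still present in df1.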
df1 = df.dropna(axis=0, subset=['bathrooms','first_review','host_has_profile_pic','host_identity_verified',
                               'host_response_rate','host_since','last_review','neighbourhood','review_scores_rating',
                               'zipcode','bedrooms','beds'])
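To see how much each column contributes to the rows removed by `dropna`, the missing values can be counted per column beforehand. The sketch below uses a small made-up frame (`toy` is hypothetical, not the Airbnb data) purely to illustrate the pattern:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a few columns of the real dataset
toy = pd.DataFrame({
    "bathrooms": [1.0, np.nan, 2.0, 1.5],
    "beds": [1.0, 2.0, np.nan, 1.0],
    "city": ["NYC", "LA", "SF", "DC"],
})

# Number of missing values in each column
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Dropping rows that have a null in any of the listed columns
toy_clean = toy.dropna(axis=0, subset=["bathrooms", "beds"])
print("{0} of {1} rows kept".format(len(toy_clean), len(toy)))
```

Running the same two steps on the real dataframe would show which columns are responsible for most of the discarded rows.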

In [10]:
# Expanding the abbreviated "t"/"f" flags used in the file:
df1.loc[(df1["instant_bookable"] == "f"),"instant_bookable"] = "False"
df1.loc[(df1["instant_bookable"] == "t"),"instant_bookable"] = "True"

df1.loc[(df1["host_has_profile_pic"] == "f"),"host_has_profile_pic"] = "False"
df1.loc[(df1["host_has_profile_pic"] == "t"),"host_has_profile_pic"] = "True"


df1.loc[(df1["host_identity_verified"] == "f"),"host_identity_verified"] = "False"
df1.loc[(df1["host_identity_verified"] == "t"),"host_identity_verified"] = "True"
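The six assignments above can also be written as one loop over the flag columns. The following is only a sketch on a made-up frame (`flags` is hypothetical); it maps to real booleans rather than the strings "True"/"False", which is a different choice from the one made in the cell above:

```python
import pandas as pd

# Hypothetical toy data with the same abbreviated flags as the real file
flags = pd.DataFrame({
    "instant_bookable": ["t", "f", "f"],
    "host_has_profile_pic": ["t", "t", "f"],
    "host_identity_verified": ["f", "t", "t"],
})

flag_columns = ["instant_bookable", "host_has_profile_pic", "host_identity_verified"]

# Map the abbreviated flags to booleans in each column;
# any value other than "t" or "f" would become NaN with this mapping
for column in flag_columns:
    flags[column] = flags[column].map({"t": True, "f": False})

print(flags)
```

Boolean columns are usually easier to feed into a model than strings, but either representation works as long as it is applied consistently.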

In [11]:
df1.sample(2)

Unnamed: 0,id,log_price,property_type,room_type,amenities,accommodates,bathrooms,bed_type,cancellation_policy,cleaning_fee,...,latitude,longitude,name,neighbourhood,number_of_reviews,review_scores_rating,thumbnail_url,zipcode,bedrooms,beds
54453,19334358,4.077537,Apartment,Private room,"{Internet,""Wireless Internet"",""Wheelchair acce...",2,1.0,Real Bed,flexible,False,...,33.99974,-118.436138,"Cozy Room with Private Bath, close to Venice",Mar Vista,6,97.0,https://a0.muscache.com/im/pictures/47638b66-e...,90066,1.0,1.0
14231,17206784,5.703782,Condominium,Entire home/apt,"{TV,Internet,""Wireless Internet"",""Air conditio...",3,1.0,Real Bed,flexible,False,...,40.744327,-73.985456,Light filled One bedroom apartment,Flatiron District,2,100.0,https://a0.muscache.com/im/pictures/008e6862-0...,10016,1.0,1.0


### CHECK DESCRIPTION

#### Splitting the attributes in the amenities column:
<br>
To get better results from the prediction methods, we split out each attribute so that it can be analyzed and compared individually. To do that, we gather the words of each listing into a list, and then compare not only the items but also the length of the lists.

In [None]:
# Building a dictionary with the number of times each amenity appears:
dic_amenities = {}
# Counter for the total number of amenity mentions:
contador_amenities = 0

# Looping over the amenities string of each listing:
for qualidades in df.amenities.dropna():
    # Each value looks like {TV,"Cable TV",Internet}: drop the braces and split on commas
    for caracteristica in qualidades.strip('{}').split(','):
        caracteristica = caracteristica.strip('"')
        # Skipping the empty string left by listings with no amenities ("{}"):
        if not caracteristica:
            continue
        if caracteristica not in dic_amenities:
            dic_amenities[caracteristica] = 1
        else:
            dic_amenities[caracteristica] += 1
        contador_amenities += 1
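The idea described above, turning each listing's amenities string into a list so that both the items and the list length can be compared, might look like the sketch below. It assumes each value has the form `{TV,"Cable TV",Internet}` as seen in the sample rows, and it uses a hypothetical helper `parse_amenities` on made-up data:

```python
import csv
from collections import Counter

import pandas as pd

def parse_amenities(raw):
    """Split a string such as '{TV,"Cable TV"}' into a list of amenity names."""
    inner = raw.strip("{}")
    if not inner:
        return []
    # csv.reader handles quoted names, including ones that contain commas
    return next(csv.reader([inner]))

# Hypothetical toy data in the same format as the amenities column
toy = pd.DataFrame({
    "amenities": ['{TV,"Cable TV",Internet}', "{}", '{Internet,"Family/kid friendly"}'],
})

toy["amenities_list"] = toy["amenities"].apply(parse_amenities)
toy["n_amenities"] = toy["amenities_list"].apply(len)

# How many listings mention each amenity
amenity_counts = Counter(name for names in toy["amenities_list"] for name in names)
print(toy[["amenities_list", "n_amenities"]])
print(amenity_counts.most_common(3))
```

The `n_amenities` column gives the list length mentioned in the text, and `amenity_counts` can be used to pick the most common amenities as individual features.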

<br>

### - Descriptive analysis

After filtering, we carry out an exploratory analysis of the data to find the variables that most influence our target and can therefore help us predict the price of a given listing. It is done with the help of pandas_profiling and seaborn.
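Alongside the profiling report, one quick way to rank numeric variables by how strongly they move with the target is a correlation table. The sketch below assumes `log_price` is the target and uses made-up numbers (`toy` is hypothetical, not the Airbnb data):

```python
import pandas as pd

# Hypothetical toy data with a few numeric columns named as in the dataset
toy = pd.DataFrame({
    "log_price": [4.0, 4.5, 5.0, 5.5, 6.0],
    "accommodates": [1, 2, 4, 5, 8],
    "bathrooms": [1.0, 1.0, 1.5, 2.0, 2.5],
    "number_of_reviews": [40, 3, 15, 60, 8],
})

# Absolute Pearson correlation of every numeric column with the target, strongest first
correlations = (
    toy.corr()["log_price"]
    .drop("log_price")
    .abs()
    .sort_values(ascending=False)
)
print(correlations)
```

Correlation only captures linear relationships between numeric columns, so categorical variables such as the city or room type still need to be inspected separately.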


In [12]:
# Using pandas_profiling
# df1 is the dataframe after filtering
pandas_profiling.ProfileReport(df1)

0,1
Number of variables,30
Number of observations,42775
Total Missing (%),0.3%
Total size in memory,9.5 MiB
Average record size in memory,233.0 B

0,1
Numeric,11
Categorical,18
Boolean,1
Date,0
Text (Unique),0
Rejected,0
Unsupported,0

0,1
Distinct count,16
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,3.3071
Minimum,1
Maximum,16
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,2
Median,2
Q3,4
95-th percentile,8
Maximum,16
Range,15
Interquartile range,2

0,1
Standard deviation,2.2063
Coef of variation,0.66712
Kurtosis,6.582
Mean,3.3071
MAD,1.6341
Skewness,2.1198
Sum,141463
Variance,4.8676
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
2,18235,42.6%,
4,7358,17.2%,
3,4662,10.9%,
1,4196,9.8%,
6,3206,7.5%,
5,2110,4.9%,
8,1193,2.8%,
7,640,1.5%,
10,458,1.1%,
9,177,0.4%,

Value,Count,Frequency (%),Unnamed: 3
1,4196,9.8%,
2,18235,42.6%,
3,4662,10.9%,
4,7358,17.2%,
5,2110,4.9%,

Value,Count,Frequency (%),Unnamed: 3
12,174,0.4%,
13,26,0.1%,
14,75,0.2%,
15,35,0.1%,
16,173,0.4%,

0,1
Distinct count,39904
Unique (%),93.3%
Missing (%),0.0%
Missing (n),0

0,1
{},69
"{""translation missing: en.hosting_amenity_49"",""translation missing: en.hosting_amenity_50""}",29
"{""Family/kid friendly""}",19
Other values (39901),42658

Value,Count,Frequency (%),Unnamed: 3
{},69,0.2%,
"{""translation missing: en.hosting_amenity_49"",""translation missing: en.hosting_amenity_50""}",29,0.1%,
"{""Family/kid friendly""}",19,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,""Free parking on premises"",Heating,""Family/kid friendly"",Washer,Dryer,""Smoke detector"",""Carbon monoxide detector"",""First aid kit"",""Safety card"",""Fire extinguisher"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",18,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,Heating,""Family/kid friendly"",Washer,Dryer,""Smoke detector"",""Carbon monoxide detector"",""First aid kit"",""Safety card"",""Fire extinguisher"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",16,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,""Free parking on premises"",Heating,""Family/kid friendly"",Washer,Dryer,""Smoke detector"",""Carbon monoxide detector"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",16,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,Heating,""Family/kid friendly"",Washer,Dryer,""Smoke detector"",""Carbon monoxide detector"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",16,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,Heating,""Family/kid friendly"",Washer,Dryer,""Smoke detector"",""Carbon monoxide detector"",""Fire extinguisher"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",15,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,""Buzzer/wireless intercom"",Heating,""Family/kid friendly"",""Smoke detector"",""Carbon monoxide detector"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",15,0.0%,
"{TV,""Cable TV"",Internet,""Wireless Internet"",""Air conditioning"",Kitchen,Heating,""Family/kid friendly"",""Smoke detector"",""Carbon monoxide detector"",Essentials,Shampoo,""24-hour check-in"",Hangers,""Hair dryer"",Iron,""Laptop friendly workspace""}",12,0.0%,

0,1
Distinct count,17
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.2342
Minimum,0
Maximum,8
Zeros (%),0.3%

0,1
Minimum,0
5-th percentile,1
Q1,1
Median,1
Q3,1
95-th percentile,2
Maximum,8
Range,8
Interquartile range,0

0,1
Standard deviation,0.57687
Coef of variation,0.46739
Kurtosis,22.552
Mean,1.2342
MAD,0.37788
Skewness,3.7057
Sum,52795
Variance,0.33278
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
1.0,33649,78.7%,
2.0,4666,10.9%,
1.5,2216,5.2%,
2.5,846,2.0%,
3.0,618,1.4%,
3.5,235,0.5%,
4.0,168,0.4%,
0.0,107,0.3%,
0.5,92,0.2%,
4.5,62,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,107,0.3%,
0.5,92,0.2%,
1.0,33649,78.7%,
1.5,2216,5.2%,
2.0,4666,10.9%,

Value,Count,Frequency (%),Unnamed: 3
6.0,13,0.0%,
6.5,6,0.0%,
7.0,3,0.0%,
7.5,3,0.0%,
8.0,25,0.1%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Real Bed,41588
Futon,467
Pull-out Sofa,378
Other values (2),342

Value,Count,Frequency (%),Unnamed: 3
Real Bed,41588,97.2%,
Futon,467,1.1%,
Pull-out Sofa,378,0.9%,
Airbed,226,0.5%,
Couch,116,0.3%,

0,1
Distinct count,11
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.2789
Minimum,0
Maximum,10
Zeros (%),9.5%

0,1
Minimum,0
5-th percentile,0
Q1,1
Median,1
Q3,2
95-th percentile,3
Maximum,10
Range,10
Interquartile range,1

0,1
Standard deviation,0.86367
Coef of variation,0.67533
Kurtosis,6.8051
Mean,1.2789
MAD,0.60724
Skewness,1.8745
Sum,54704
Variance,0.74593
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
1.0,28011,65.5%,
2.0,7021,16.4%,
0.0,4047,9.5%,
3.0,2619,6.1%,
4.0,766,1.8%,
5.0,210,0.5%,
6.0,57,0.1%,
7.0,27,0.1%,
8.0,9,0.0%,
10.0,5,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,4047,9.5%,
1.0,28011,65.5%,
2.0,7021,16.4%,
3.0,2619,6.1%,
4.0,766,1.8%,

Value,Count,Frequency (%),Unnamed: 3
6.0,57,0.1%,
7.0,27,0.1%,
8.0,9,0.0%,
9.0,3,0.0%,
10.0,5,0.0%,

0,1
Distinct count,18
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.7898
Minimum,0
Maximum,18
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,1
Q1,1
Median,1
Q3,2
95-th percentile,4
Maximum,18
Range,18
Interquartile range,1

0,1
Standard deviation,1.3124
Coef of variation,0.73328
Kurtosis,17.098
Mean,1.7898
MAD,0.90936
Skewness,3.1387
Sum,76557
Variance,1.7224
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
1.0,24624,57.6%,
2.0,10188,23.8%,
3.0,4238,9.9%,
4.0,1995,4.7%,
5.0,837,2.0%,
6.0,440,1.0%,
7.0,142,0.3%,
8.0,124,0.3%,
10.0,61,0.1%,
9.0,50,0.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1,0.0%,
1.0,24624,57.6%,
2.0,10188,23.8%,
3.0,4238,9.9%,
4.0,1995,4.7%,

Value,Count,Frequency (%),Unnamed: 3
13.0,8,0.0%,
14.0,1,0.0%,
15.0,2,0.0%,
16.0,22,0.1%,
18.0,1,0.0%,

0,1
Distinct count,5
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
strict,22309
moderate,12417
flexible,7970
Other values (2),79

Value,Count,Frequency (%),Unnamed: 3
strict,22309,52.2%,
moderate,12417,29.0%,
flexible,7970,18.6%,
super_strict_30,71,0.2%,
super_strict_60,8,0.0%,

0,1
Distinct count,6
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
NYC,19160
LA,11692
SF,3948
Other values (3),7975

Value,Count,Frequency (%),Unnamed: 3
NYC,19160,44.8%,
LA,11692,27.3%,
SF,3948,9.2%,
DC,2988,7.0%,
Chicago,2537,5.9%,
Boston,2450,5.7%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Mean,0.82396

0,1
True,35245
(Missing),7530

Value,Count,Frequency (%),Unnamed: 3
True,35245,82.4%,
(Missing),7530,17.6%,

0,1
Distinct count,42424
Unique (%),99.2%
Missing (%),0.0%
Missing (n),0

0,1
"Private room in the heart of Little Italy with FREE PARKING :) You will have your own personal door code and may arrive at ANY time you would like and as late as you wish. You can store your luggage in the closet before check-in and after check-out times :) Price adapts to demand, my rates change constantly - every night is different! Put your dates in the calendar, with the number of guests, and what you see is what you'll pay :) The neighborhood is extremely safe even at night :) Restaurants and bars are right round the corner, both upscale and casual. This room is 6 minutes away by car or uber to Chicago's famous Willis (Sears) tower. Uber costs about $7. If you are arriving by car, I am providing free resident parking passes for anywhere in the area. (Yes, that means you can drive a few blocks and still park anywhere you want!) The room will be solely yours for the time you'll stay in the Windy City. Shall you have any concerns or need any tips from the local, I'll always be happy",7
"Welcome to RMH, a co-ed hostel vibe home for exploring travelers or working individuals needing temporary housing. (Guests access is from 3pm to 11am daily). NON-SMOKERS ONLY! Host & small dog live on property. NOTE: Guests' bedroom is dog free. GUARANTEED In Our Home: *You receive a clean home in a safe neighborhood, clean sheets, pillow, towel and covers. **You MUST Read House Rules BEFORE Booking :) ***IMPORTANT ***SMOKERS who book: reservation cancelled upon arrival + NO Refund. The bedroom is very large with enough space for everyone (6 guests) to have peace and quiet. The kitchen and bath are very spacious. Guests will have their own keys and have access to the bedroom in which they stay, kitchen, bathroom, living room and deck. OPTIONAL: (Normal Check IN is 3pm, Check OUT is 10am) If you would like an EARLY CHECK IN before 3pm ($25 fee) Guests trying to check in after 10pm will need to get pre-approval from host and pay the late check in fee of $25. There is no key box and no gu",6
"Hello, I've been running guest house for Koreans visiting U.S. for 3years, and recently decided to run this place for other travelers also. There are 10 room in the house. They are mostly dormitory rooms and couple of couple room and family room. This places are our women's dormitory in third floor. There are three rooms, but no doors. It is basically open space. There are 2 beds in two rooms and 4 in one room. I do not have closet in this room but there are hangers and mini shelves. My travelers usually put their baggage on the floor. There is one full bathroom only for women in 2nd floor, which you will be sharing with other women guests. Right next that bathroom, there is unisex half bathroom. All bathrooms have hair dryers. You cannot use kitchen, but you can use refrigerator. I offer breakfast every morning from 7-10 am. Bread, cereal, fruits, coffee, milk and juice will be served. You can eat take-out food in the kitchen, but please wash dishes that you used and put trash in the",6
Other values (42421),42756

Value,Count,Frequency (%),Unnamed: 3
"Private room in the heart of Little Italy with FREE PARKING :) You will have your own personal door code and may arrive at ANY time you would like and as late as you wish. You can store your luggage in the closet before check-in and after check-out times :) Price adapts to demand, my rates change constantly - every night is different! Put your dates in the calendar, with the number of guests, and what you see is what you'll pay :) The neighborhood is extremely safe even at night :) Restaurants and bars are right round the corner, both upscale and casual. This room is 6 minutes away by car or uber to Chicago's famous Willis (Sears) tower. Uber costs about $7. If you are arriving by car, I am providing free resident parking passes for anywhere in the area. (Yes, that means you can drive a few blocks and still park anywhere you want!) The room will be solely yours for the time you'll stay in the Windy City. Shall you have any concerns or need any tips from the local, I'll always be happy",7,0.0%,
"Welcome to RMH, a co-ed hostel vibe home for exploring travelers or working individuals needing temporary housing. (Guests access is from 3pm to 11am daily). NON-SMOKERS ONLY! Host & small dog live on property. NOTE: Guests' bedroom is dog free. GUARANTEED In Our Home: *You receive a clean home in a safe neighborhood, clean sheets, pillow, towel and covers. **You MUST Read House Rules BEFORE Booking :) ***IMPORTANT ***SMOKERS who book: reservation cancelled upon arrival + NO Refund. The bedroom is very large with enough space for everyone (6 guests) to have peace and quiet. The kitchen and bath are very spacious. Guests will have their own keys and have access to the bedroom in which they stay, kitchen, bathroom, living room and deck. OPTIONAL: (Normal Check IN is 3pm, Check OUT is 10am) If you would like an EARLY CHECK IN before 3pm ($25 fee) Guests trying to check in after 10pm will need to get pre-approval from host and pay the late check in fee of $25. There is no key box and no gu",6,0.0%,
"Hello, I've been running guest house for Koreans visiting U.S. for 3years, and recently decided to run this place for other travelers also. There are 10 room in the house. They are mostly dormitory rooms and couple of couple room and family room. This places are our women's dormitory in third floor. There are three rooms, but no doors. It is basically open space. There are 2 beds in two rooms and 4 in one room. I do not have closet in this room but there are hangers and mini shelves. My travelers usually put their baggage on the floor. There is one full bathroom only for women in 2nd floor, which you will be sharing with other women guests. Right next that bathroom, there is unisex half bathroom. All bathrooms have hair dryers. You cannot use kitchen, but you can use refrigerator. I offer breakfast every morning from 7-10 am. Bread, cereal, fruits, coffee, milk and juice will be served. You can eat take-out food in the kitchen, but please wash dishes that you used and put trash in the",6,0.0%,
"OutpostClub is a network of Coliving locations throughout NewYork. We built it to make it super easy to move to NewYork, and to provide cool convenient places for cheap. Coliving - is a shared housing model - where you share kitchen, living rooms, common spaces with others, having private or shared bedrooms of your choice. Everything is included in the flat fee - utilities, furniture, supplies, events. Book Now, grab your suitcase and move into the House, meet people who will become your friends Coliving Club is a beautifully curated living space in New York with a wide range of amenities such as free workspace with printer and scanner, hi-speed Wi-Fi, fully equipped kitchen, dinner and living space, private backyard, individual safe, coffee, tea, soda, shampoo, shower gel, hand soap, towels and much more! Everything is included in our flat fee. Here we thoroughly picked every detail so that you feel cozy. Expect a great night's sleep with our made in USA memory foam mattresses by Broo",5,0.0%,
"Located at Laurel Hights close to USF, Golden Gate Park, Golden Gate Bridge and Shopping Center. We are steps from MUNI station and 15 minutes Bus ride or Lyft/Uber ride to downtown. The room has a queen sized bed with incredible comfortable mattress which is perfect for a restful night after a day in the city. There is free parking on the street. Free fast Wi-Fi.",4,0.0%,
"The Treat Street Clubhouse is a home you'll never forget. It's more than an Airbnb -- it's a collective of adventurers and unique individuals. We're creating a place that you can't wait to come back to. You'll have your own bunk bed in a shared room. This is a friendly atmosphere with many people and privacy will be a little sparse. With so many people, it may get rowdy at times. Make yourself at home here. There's a stocked kitchen, comfy loft, fridge, speakers, couches, sous vide machines, and the usual stuff. You'll be sharing the home with Duncan, a rugged cartographer, and Zain, a high-powered business tycoon -- both of whom have a proclivity for hyperbole. We also have a resident mascot: Roux, our loving, courageous, part-dingo pooch. We love going on adventures with guests and frequently organize events and outings. We're in the best part of the Mission with a rock climbing gym, incredible food, and the best coffee, all under a 5 minute walk. We're happy to show you the best loc",4,0.0%,
"本地方是个两层楼townhouse温馨之家,一楼我们自己住,二楼五个房间做出租房,有两个洗手间和浴室供您们使用,每个房间都配备双人床 被子枕头床头柜 台灯 书桌椅 电视机Wi-Fi 热水壶 毛巾牙刷 牙膏 拖鞋 餐巾纸 房间干净卫生。大门进出有密码锁 每个房间也都有密码锁 走廊有安装电眼探头,厨房可以煮简单的食物,可以说是麻雀虽小五脏俱全。 地方位于纽约布鲁克林八大道华人集聚地,附近各种餐馆林立,地区治安好交通发达,走路2分钟到小巴站25分钟车程到达曼哈顿唐人街,门口有大巴站坐两站到地铁站也可以走路8分钟到N地铁站或走12分钟到D或R地铁站。 如果您开车过来白天很容易找到车位,晚上8点后如果找不到车位,您可以把车停在我们旁边8大道meter车位上直到第二天早晨7点半后就有很多车位了,欢迎您的到来!",4,0.0%,
"Spacious 2 story 4000 sqft home Enjoy fruit trees swimming pool fountain & gardens lWalking distance to mall, grocery store, movie theater & restaurants. Centrally located to Malibu, Santa Monica,Venice,Hollywood, Beverly Hills, West LA This spacious 2 story 4000 sqft home is fully furnished and beautifully decorated! Coffee and donuts are served every morning;) Please keep in mind that this is a shared space, and that your large room consists of 2 bunk beds, your own dresser, hamper, and closet space. You will have full access to the rest of the home. We provide a safe, clean, and positive environment, perfect for people relocating from out of state, students, and anyone else who needs a landing place while traveling. Near public transportation; ORANGELINE, 405, and 101 FREEWAYS. You will have full access to the rest of the home. We do have private and 2 person and or couples rooms available in the same property. Full kitchen privileges All Premium Cable and movie channels Full se",4,0.0%,
"Come stay in my converted townhome in the heart of Bedford-Stuyvesant. Close to great restaurants, coffeeshops and juice bars, a great place to lay your head after a day or night out on the town. You will have a private room on the middle floor with a full sized bed and clean sheets and towels. There are 4 bathrooms, one on each floor, a living room and a fully equipped kitchen to use. I live on site and will be available to help with anything that you may need or neighborhood suggestions. A great mix of the hip and gentrified new Brooklyn, and the old 'real' Brooklyn. Im just a stones throw form great Jamaican food, cold brew coffee and juice shops. Just a 7 min walk to the G or AC trains and just 30 minutes into downtown manhattan.",4,0.0%,
"A very cozy house located in the heart of Hollywood between two famous streets: Beverly Blvd and Melrose Ave, within a safe and quiet neighborhood, 30 minutes from LAX, 15 minutes from Downtown LA, 15 minutes from Griffith Observatory, 15 minutes from Hollywood Walk of Fame, 15 minutes from Universal Studios, 5 minutes from Paramount Pictures. There are lots of restaurants and shops nearby. My favorite is Osteria La Buca:) A great house in the central area of Los Angeles suitable both for travelers and business trips. It is fully equipped with all you need and freshly renovated. FREE PARKING:)",4,0.0%,

0,1
Distinct count,2489
Unique (%),5.8%
Missing (%),0.0%
Missing (n),0

0,1
2017-01-01,218
2017-09-04,187
2017-01-22,165
Other values (2486),42205

Value,Count,Frequency (%),Unnamed: 3
2017-01-01,218,0.5%,
2017-09-04,187,0.4%,
2017-01-22,165,0.4%,
2017-01-02,146,0.3%,
2017-04-16,130,0.3%,
2017-03-19,122,0.3%,
2016-01-02,119,0.3%,
2017-04-09,113,0.3%,
2016-09-05,113,0.3%,
2016-01-03,113,0.3%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
True,42709
False,66

Value,Count,Frequency (%),Unnamed: 3
True,42709,99.8%,
False,66,0.2%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
True,31304
False,11471

Value,Count,Frequency (%),Unnamed: 3
True,31304,73.2%,
False,11471,26.8%,

0,1
Distinct count,77
Unique (%),0.2%
Missing (%),0.0%
Missing (n),0

0,1
100%,33907
90%,1865
80%,875
Other values (74),6128

Value,Count,Frequency (%),Unnamed: 3
100%,33907,79.3%,
90%,1865,4.4%,
80%,875,2.0%,
70%,390,0.9%,
50%,349,0.8%,
0%,318,0.7%,
99%,314,0.7%,
97%,308,0.7%,
98%,283,0.7%,
96%,267,0.6%,

0,1
Distinct count,3000
Unique (%),7.0%
Missing (%),0.0%
Missing (n),0

0,1
2014-02-14,152
2015-03-30,113
2013-08-07,55
Other values (2997),42455

Value,Count,Frequency (%),Unnamed: 3
2014-02-14,152,0.4%,
2015-03-30,113,0.3%,
2013-08-07,55,0.1%,
2014-09-02,53,0.1%,
2016-01-18,51,0.1%,
2014-07-29,50,0.1%,
2015-03-05,50,0.1%,
2016-08-23,50,0.1%,
2012-08-27,49,0.1%,
2014-09-17,48,0.1%,

0,1
Distinct count,42775
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,11250000
Minimum,941
Maximum,21230903
Zeros (%),0.0%

0,1
Minimum,941
5-th percentile,824980
Q1,6237400
Median,12230000
Q3,16393000
95-th percentile,20025000
Maximum,21230903
Range,21229962
Interquartile range,10156000

0,1
Standard deviation,6086600
Coef of variation,0.54103
Kurtosis,-1.1373
Mean,11250000
MAD,5294800
Skewness,-0.25582
Sum,481213975487
Variance,37046000000000
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
18812926,1,0.0%,
15237330,1,0.0%,
18124838,1,0.0%,
11398939,1,0.0%,
9829717,1,0.0%,
62804,1,0.0%,
19385683,1,0.0%,
16423080,1,0.0%,
7986513,1,0.0%,
14472528,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
941,1,0.0%,
2515,1,0.0%,
2864,1,0.0%,
3152,1,0.0%,
3662,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
21218973,1,0.0%,
21227196,1,0.0%,
21227461,1,0.0%,
21228356,1,0.0%,
21230903,1,0.0%,

0,1
Distinct count,42775
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,37032
Minimum,1
Maximum,74110
Zeros (%),0.0%

0,1
Minimum,1.0
5-th percentile,3819.7
Q1,18452.0
Median,36927.0
Q3,55638.0
95-th percentile,70407.0
Maximum,74110.0
Range,74109.0
Interquartile range,37187.0

0,1
Standard deviation,21392
Coef of variation,0.57766
Kurtosis,-1.2037
Mean,37032
MAD,18534
Skewness,0.0047068
Sum,1584041924
Variance,457620000
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
2047,1,0.0%,
19747,1,0.0%,
21792,1,0.0%,
44319,1,0.0%,
48413,1,0.0%,
36123,1,0.0%,
34074,1,0.0%,
70587,1,0.0%,
38168,1,0.0%,
60695,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1,1,0.0%,
2,1,0.0%,
4,1,0.0%,
5,1,0.0%,
7,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
74102,1,0.0%,
74103,1,0.0%,
74107,1,0.0%,
74108,1,0.0%,
74110,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
False,30289
True,12486

Value,Count,Frequency (%),Unnamed: 3
False,30289,70.8%,
True,12486,29.2%,

0,1
Distinct count,1100
Unique (%),2.6%
Missing (%),0.0%
Missing (n),0

0,1
2017-09-24,1261
2017-09-17,1181
2017-04-30,1018
Other values (1097),39315

Value,Count,Frequency (%),Unnamed: 3
2017-09-24,1261,2.9%,
2017-09-17,1181,2.8%,
2017-04-30,1018,2.4%,
2017-09-18,813,1.9%,
2017-04-23,796,1.9%,
2017-09-25,794,1.9%,
2017-10-01,712,1.7%,
2017-09-16,676,1.6%,
2017-09-04,637,1.5%,
2017-09-28,631,1.5%,

0,1
Distinct count,42775
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,38.666
Minimum,33.706
Maximum,42.39
Zeros (%),0.0%

0,1
Minimum,33.706
5-th percentile,33.994
Q1,34.161
Median,40.678
Q3,40.758
95-th percentile,42.305
Maximum,42.39
Range,8.6846
Interquartile range,6.5968

0,1
Standard deviation,3.0337
Coef of variation,0.078458
Kurtosis,-1.2375
Mean,38.666
MAD,2.6883
Skewness,-0.6493
Sum,1654000
Variance,9.2032
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
34.08031070283319,1,0.0%,
34.135611222594854,1,0.0%,
41.93949540547636,1,0.0%,
42.36093263073878,1,0.0%,
40.65203481033671,1,0.0%,
33.89896210846932,1,0.0%,
38.93647072415913,1,0.0%,
41.937788984236796,1,0.0%,
42.341206364144355,1,0.0%,
40.69539328394064,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
33.70583529438538,1,0.0%,
33.70832107320805,1,0.0%,
33.708497599094855,1,0.0%,
33.7086020517297,1,0.0%,
33.7096642361058,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
42.3884156457878,1,0.0%,
42.388483496611904,1,0.0%,
42.38977244945219,1,0.0%,
42.390247544263616,1,0.0%,
42.39043717872241,1,0.0%,

0,1
Distinct count,643
Unique (%),1.5%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,4.772
Minimum,0
Maximum,7.6004
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,3.7842
Q1,4.3175
Median,4.7449
Q3,5.193
95-th percentile,5.9402
Maximum,7.6004
Range,7.6004
Interquartile range,0.87547

0,1
Standard deviation,0.66879
Coef of variation,0.14015
Kurtosis,0.52269
Mean,4.772
MAD,0.52787
Skewness,0.37712
Sum,204120
Variance,0.44728
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
5.0106352940962555,1430,3.3%,
4.605170185988092,1302,3.0%,
4.31748811353631,1118,2.6%,
4.59511985013459,1034,2.4%,
4.8283137373023015,988,2.3%,
4.174387269895637,965,2.3%,
4.0943445622221,957,2.2%,
3.912023005428147,937,2.2%,
5.298317366548036,933,2.2%,
4.248495242049359,891,2.1%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1,0.0%,
1.6094379124341005,1,0.0%,
2.302585092994046,11,0.0%,
2.4849066497880004,1,0.0%,
2.6390573296152584,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
7.546974117516527,1,0.0%,
7.549609165154532,1,0.0%,
7.575584651557793,3,0.0%,
7.598399329323964,1,0.0%,
7.6004023345004,1,0.0%,

0,1
Distinct count,42775
Unique (%),100.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,-91.425
Minimum,-122.51
Maximum,-71
Zeros (%),0.0%

0,1
Minimum,-122.51
5-th percentile,-122.43
Q1,-118.34
Median,-74.017
Q3,-73.951
95-th percentile,-71.131
Maximum,-71.0
Range,51.511
Interquartile range,44.387

0,1
Standard deviation,21.51
Coef of variation,-0.23527
Kurtosis,-1.6755
Mean,-91.425
MAD,20.442
Skewness,-0.50271
Sum,-3910700
Variance,462.67
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
-77.02341087104014,1,0.0%,
-118.17678950244192,1,0.0%,
-73.99077146521645,1,0.0%,
-73.99048483940189,1,0.0%,
-87.67879674455799,1,0.0%,
-73.95749889207795,1,0.0%,
-73.9541605270487,1,0.0%,
-73.93970559762647,1,0.0%,
-87.6554480138358,1,0.0%,
-73.9311700644353,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-122.51149998987214,1,0.0%,
-122.51090525601585,1,0.0%,
-122.50963522313911,1,0.0%,
-122.50936476530724,1,0.0%,
-122.50933564108672,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
-71.00618210560536,1,0.0%,
-71.00574428525941,1,0.0%,
-71.00461976225016,1,0.0%,
-71.0021227970042,1,0.0%,
-71.00046158748074,1,0.0%,

0,1
Distinct count,42522
Unique (%),99.4%
Missing (%),0.0%
Missing (n),0

0,1
Bunk bed in the Treat Street Clubhouse,8
"Location, Location, Location",5
Spacious Private Room in Brooklyn,5
Other values (42519),42757

Value,Count,Frequency (%),Unnamed: 3
Bunk bed in the Treat Street Clubhouse,8,0.0%,
"Location, Location, Location",5,0.0%,
Spacious Private Room in Brooklyn,5,0.0%,
Your home away from home,4,0.0%,
Venice Beach Cottage,4,0.0%,
Make的小屋（地理位置好，交通方便，洛杉矶市中心，提供机场名牌店景点等接送，包车游玩等服务）,4,0.0%,
SHARED ROOM in VENICE BEACH HOSTEL,4,0.0%,
Kanmore Guest House,4,0.0%,
Brooklyn Oasis,4,0.0%,
Cozy Private Room,3,0.0%,

0,1
Distinct count,590
Unique (%),1.4%
Missing (%),0.0%
Missing (n),0

0,1
Williamsburg,1646
Bedford-Stuyvesant,1421
Mid-Wilshire,944
Other values (587),38764

Value,Count,Frequency (%),Unnamed: 3
Williamsburg,1646,3.8%,
Bedford-Stuyvesant,1421,3.3%,
Mid-Wilshire,944,2.2%,
Hollywood,934,2.2%,
Venice,926,2.2%,
Bushwick,924,2.2%,
Hell's Kitchen,868,2.0%,
Harlem,863,2.0%,
Upper West Side,704,1.6%,
Upper East Side,702,1.6%,

0,1
Distinct count,367
Unique (%),0.9%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,31.789
Minimum,1
Maximum,542
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,1
Q1,5
Median,15
Q3,40
95-th percentile,121
Maximum,542
Range,541
Interquartile range,35

0,1
Standard deviation,44.36
Coef of variation,1.3955
Kurtosis,13.804
Mean,31.789
MAD,29.767
Skewness,3.0285
Sum,1359774
Variance,1967.8
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
1,3192,7.5%,
2,2704,6.3%,
3,2332,5.5%,
4,1923,4.5%,
5,1725,4.0%,
6,1424,3.3%,
7,1332,3.1%,
8,1230,2.9%,
9,1074,2.5%,
10,984,2.3%,

Value,Count,Frequency (%),Unnamed: 3
1,3192,7.5%,
2,2704,6.3%,
3,2332,5.5%,
4,1923,4.5%,
5,1725,4.0%,

Value,Count,Frequency (%),Unnamed: 3
505,1,0.0%,
525,1,0.0%,
530,1,0.0%,
532,1,0.0%,
542,1,0.0%,

0,1
Distinct count,31
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0

0,1
Apartment,27617
House,9875
Condominium,1562
Other values (28),3721

Value,Count,Frequency (%),Unnamed: 3
Apartment,27617,64.6%,
House,9875,23.1%,
Condominium,1562,3.7%,
Townhouse,1056,2.5%,
Loft,791,1.8%,
Other,355,0.8%,
Guesthouse,348,0.8%,
Bed & Breakfast,295,0.7%,
Bungalow,241,0.6%,
Guest suite,98,0.2%,

0,1
Distinct count,53
Unique (%),0.1%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,94.247
Minimum,20
Maximum,100
Zeros (%),0.0%

0,1
Minimum,20
5-th percentile,80
Q1,92
Median,96
Q3,99
95-th percentile,100
Maximum,100
Range,80
Interquartile range,7

0,1
Standard deviation,7.0497
Coef of variation,0.0748
Kurtosis,21.289
Mean,94.247
MAD,4.7698
Skewness,-3.326
Sum,4031400
Variance,49.699
Memory size,334.3 KiB

Value,Count,Frequency (%),Unnamed: 3
100.0,10424,24.4%,
98.0,3672,8.6%,
97.0,3388,7.9%,
96.0,3320,7.8%,
95.0,2986,7.0%,
93.0,2823,6.6%,
99.0,2289,5.4%,
94.0,2199,5.1%,
90.0,1930,4.5%,
92.0,1690,4.0%,

Value,Count,Frequency (%),Unnamed: 3
20.0,49,0.1%,
27.0,1,0.0%,
30.0,2,0.0%,
35.0,1,0.0%,
40.0,37,0.1%,

Value,Count,Frequency (%),Unnamed: 3
96.0,3320,7.8%,
97.0,3388,7.9%,
98.0,3672,8.6%,
99.0,2289,5.4%,
100.0,10424,24.4%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Entire home/apt,24775
Private room,16984
Shared room,1016

Value,Count,Frequency (%),Unnamed: 3
Entire home/apt,24775,57.9%,
Private room,16984,39.7%,
Shared room,1016,2.4%,

0,1
Distinct count,38497
Unique (%),90.0%
Missing (%),10.0%
Missing (n),4273

0,1
https://a0.muscache.com/im/pictures/109405834/9a555e66_original.jpg?aki_policy=small,2
https://a0.muscache.com/im/pictures/23033013/54d62516_original.jpg?aki_policy=small,2
https://a0.muscache.com/im/pictures/61042471/5543b0e0_original.jpg?aki_policy=small,2
Other values (38493),38496
(Missing),4273

Value,Count,Frequency (%),Unnamed: 3
https://a0.muscache.com/im/pictures/109405834/9a555e66_original.jpg?aki_policy=small,2,0.0%,
https://a0.muscache.com/im/pictures/23033013/54d62516_original.jpg?aki_policy=small,2,0.0%,
https://a0.muscache.com/im/pictures/61042471/5543b0e0_original.jpg?aki_policy=small,2,0.0%,
https://a0.muscache.com/im/pictures/70087089/bc66229a_original.jpg?aki_policy=small,2,0.0%,
https://a0.muscache.com/im/pictures/104667326/a7a2b145_original.jpg?aki_policy=small,2,0.0%,
https://a0.muscache.com/im/pictures/28563531/1000de61_original.jpg?aki_policy=small,2,0.0%,
https://a0.muscache.com/im/pictures/95225900/7406e69e_original.jpg?aki_policy=small,1,0.0%,
https://a0.muscache.com/im/pictures/e0e49434-6148-48c3-b0bc-267202af93ff.jpg?aki_policy=small,1,0.0%,
https://a0.muscache.com/im/pictures/1c391895-9f2f-454b-8d77-ecf96ffa790c.jpg?aki_policy=small,1,0.0%,
https://a0.muscache.com/im/pictures/d98f26e8-d3bb-47bd-9cfc-aa2bca9c2ed8.jpg?aki_policy=small,1,0.0%,

0,1
Distinct count,659
Unique (%),1.5%
Missing (%),0.0%
Missing (n),0

0,1
90291,873
11211.0,787
11221,771
Other values (656),40344

Value,Count,Frequency (%),Unnamed: 3
90291,873,2.0%,
11211.0,787,1.8%,
11221,771,1.8%,
94110,624,1.5%,
90046,565,1.3%,
20002,550,1.3%,
20009,481,1.1%,
10019,472,1.1%,
90028,461,1.1%,
20001,458,1.1%,

Unnamed: 0,id,log_price,property_type,room_type,amenities,accommodates,bathrooms,bed_type,cancellation_policy,cleaning_fee,city,description,first_review,host_has_profile_pic,host_identity_verified,host_response_rate,host_since,instant_bookable,last_review,latitude,longitude,name,neighbourhood,number_of_reviews,review_scores_rating,thumbnail_url,zipcode,bedrooms,beds
1,6304928,5.129899,Apartment,Entire home/apt,"{""Wireless Internet"",""Air conditioning"",Kitche...",7,1.0,Real Bed,strict,True,NYC,Enjoy travelling during your stay in Manhattan...,2017-08-05,True,False,100%,2017-06-19,True,2017-09-23,40.766115,-73.98904,Superb 3BR Apt Located Near Times Square,Hell's Kitchen,6,93.0,https://a0.muscache.com/im/pictures/348a55fe-4...,10019,3.0,3.0
2,7919400,4.976734,Apartment,Entire home/apt,"{TV,""Cable TV"",""Wireless Internet"",""Air condit...",5,1.0,Real Bed,moderate,True,NYC,The Oasis comes complete with a full backyard ...,2017-04-30,True,True,100%,2016-10-25,True,2017-09-14,40.80811,-73.943756,The Garden Oasis,Harlem,10,92.0,https://a0.muscache.com/im/pictures/6fae5362-9...,10027,1.0,3.0
4,3808709,4.744932,Apartment,Entire home/apt,"{TV,Internet,""Wireless Internet"",""Air conditio...",2,1.0,Real Bed,moderate,True,DC,"Cool, cozy, and comfortable studio located in ...",2015-05-12,True,True,100%,2015-03-01,True,2017-01-22,38.925627,-77.034596,Great studio in midtown DC,Columbia Heights,4,40.0,,20009,0.0,1.0
5,12422935,4.442651,Apartment,Private room,"{TV,""Wireless Internet"",Heating,""Smoke detecto...",2,1.0,Real Bed,strict,True,SF,Beautiful private room overlooking scenic view...,2017-08-27,True,True,100%,2017-06-07,True,2017-09-05,37.753164,-122.429526,Comfort Suite San Francisco,Noe Valley,3,100.0,https://a0.muscache.com/im/pictures/82509143-4...,94131,1.0,1.0
7,13971273,4.787492,Condominium,Entire home/apt,"{TV,""Cable TV"",""Wireless Internet"",""Wheelchair...",2,1.0,Real Bed,moderate,True,LA,Arguably the best location (and safest) in dow...,2016-12-16,True,True,100%,2013-05-18,False,2017-04-12,34.046737,-118.260439,"Near LA Live, Staple's. Starbucks inside. OWN ...",Downtown,9,93.0,https://a0.muscache.com/im/pictures/61bd05d5-c...,90015,1.0,1.0


___
## 3. Prediction models

o MEAN PREDICTION MODEL (no explanatory variables)

o K-NEAREST NEIGHBORS MODEL (K-Nearest Neighbors Regression)

o LINEAR REGRESSION MODEL (Multiple Linear Regression)

o REGRESSION TREE MODEL (Decision Tree Regression)

In [13]:
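from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

In [15]:
# Dropping columns that are not relevant to the study:
df2 = df.drop(["latitude", "longitude", "name", "thumbnail_url", "id", "description"], axis=1)

In [16]:
# log_price is continuous, so this is a regression problem, not a classification one.
# The first attempt fitted a DecisionTreeClassifier on the raw df and failed with
# "ValueError: could not convert string to float: 'Apartment'", because
# scikit-learn trees only accept numeric inputs.
# For now, keep only the numeric/boolean columns as features (the categorical
# columns still need to be encoded) and leave the target out of X.
# Assumes these columns have no missing values.
X = df2.drop("log_price", axis=1).select_dtypes(include=["number", "bool"])
y = df2["log_price"]

# Splitting the data into training and test sets:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

# Instantiating the regressor:
reg_tree = DecisionTreeRegressor(random_state=100, max_depth=8, min_samples_leaf=4)
# Fitting on the training data:
reg_tree.fit(X_train, y_train)

y_pred = reg_tree.predict(X_test)

# Measuring the error of the regressor on the test set:
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print("RMSE: {:.4f} | R2: {:.4f}".format(rmse, r2_score(y_test, y_pred)))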
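
The `ValueError` mentioned above comes from string columns such as `property_type` reaching a scikit-learn tree, which only accepts numeric inputs. Below is a minimal sketch of one possible way around it, assuming one-hot encoding with `pandas.get_dummies` is acceptable for this study. The small `toy` DataFrame is made-up illustrative data, not rows from the Airbnb dataset; only the column names are borrowed from it.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Made-up illustrative rows (NOT taken from the Airbnb dataset)
toy = pd.DataFrame({
    "property_type": ["Apartment", "House", "Apartment", "Loft"],
    "room_type": ["Entire home/apt", "Private room", "Private room", "Entire home/apt"],
    "accommodates": [4, 2, 1, 3],
    "log_price": [5.0, 4.2, 3.9, 4.8],
})

# One-hot encode the string columns so every feature becomes numeric
X = pd.get_dummies(
    toy.drop("log_price", axis=1),
    columns=["property_type", "room_type"],
    dtype=float,
)
y = toy["log_price"]

# The tree can now be fitted without the string-to-float error
reg = DecisionTreeRegressor(random_state=0).fit(X, y)
print(X.columns.tolist())
```

With columns of very high cardinality (for example `neighbourhood`, with 590 distinct values), one-hot encoding creates many sparse columns, so grouping rare categories first may be worth considering.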

o RANDOM FOREST MODEL (for comparison)
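
As a rough sketch of how the random forest comparison could be set up, the block below fits a `RandomForestRegressor` and reports its test RMSE. The data is synthetic (generated with NumPy) and only stands in for the numeric features and `log_price`; the hyperparameters are illustrative assumptions, not tuned values from this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for numeric features and a log-price-like target
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))
y = 3.5 + 0.2 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 0.1, size=500)

# Same split proportions as the decision tree cell
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

# An ensemble of trees, each trained on a bootstrap sample of the training set
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

rmse_forest = np.sqrt(mean_squared_error(y_test, forest.predict(X_test)))
print("Random forest RMSE: {:.4f}".format(rmse_forest))
```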

___
## 4. Validation process and statistics

[This item depends on the results of the previous modeling steps! Organize them here clearly!]
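
One possible way to organize the validation is to score every model with the same k-fold cross-validation and compare their RMSE against the mean-prediction baseline. The sketch below assumes 5 folds and uses synthetic data in place of the Airbnb features; the model settings are illustrative, and no number printed by it is a result of this project.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for numeric features and a log-price-like target
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))
y = 3.5 + 0.2 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 0.1, size=500)

models = {
    "mean": DummyRegressor(strategy="mean"),  # baseline, ignores the features
    "knn": KNeighborsRegressor(n_neighbors=5),
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(max_depth=8, min_samples_leaf=4, random_state=100),
}

results = {}
for name, model in models.items():
    # scikit-learn returns negated MSE so that higher is always better
    mse = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    results[name] = np.sqrt(mse.mean())
    print("{:<6} RMSE: {:.4f}".format(name, results[name]))
```

A model is only worth keeping if its cross-validated RMSE is clearly below that of the `mean` baseline.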

___
## 5. Conclusion

___
## 6. References