# Dataset
* Website: [Viva Real](https://www.vivareal.com.br/)
* Location: Florianópolis, Santa Catarina, Brazil
* Period: January 2021 (summer)
* Important: prices are higher during the summer, so price predictions based on this data will be inaccurate for the rest of the year.

## Variables
* Product: all property types
* Area: property size (m²)
* Bedrooms: total number of bedrooms
* Bathrooms: total number of bathrooms
* Garage: total number of car spaces
* Address: street address, district and city
* Price: daily or monthly rent

## Preparing the notebook

In [1]:
# Imports
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import requests
from bs4 import BeautifulSoup
from time import sleep
from selenium.webdriver.chrome.options import Options
import pandas as pd

## Web scraping | Selenium
* Only partially automated: moving to the next page did not work by clicking the button, nor by requesting the page URL with Beautiful Soup
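
Because pagination has to be advanced by hand, one way to organize the process is a small loop that pauses until the next page is open in the browser and then scrapes it. The sketch below is an assumption about how that workflow could look, not the notebook's actual code: `collect_pages`, `scrape_page`, and `wait_for_next_page` are hypothetical names, and the scraping and waiting steps are passed in as functions so the loop itself does not depend on Selenium.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def collect_pages(
    scrape_page: Callable[[], List[Row]],
    wait_for_next_page: Callable[[int], None],
    n_pages: int,
) -> List[Row]:
    """Scrape n_pages result pages, pausing before each page after the first."""
    rows: List[Row] = []
    for page in range(1, n_pages + 1):
        if page > 1:
            # Hypothetical manual step: the user opens the next page in the browser.
            wait_for_next_page(page)
        rows.extend(scrape_page())
    return rows
```

In the notebook, `wait_for_next_page` could be as simple as `lambda page: input(f"Open page {page}, then press Enter")`, and `scrape_page` would wrap the element lookups done in the scraping cell.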

In [2]:
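# Defining the driver
# (Selenium 4 expects the driver path to be wrapped in a Service object)
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
sleep(3)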



Current google-chrome version is 97.0.4692
Get LATEST chromedriver version for 97.0.4692 google-chrome
Trying to download new driver from https://chromedriver.storage.googleapis.com/97.0.4692.71/chromedriver_mac64.zip
Driver has been saved in cache [/Users/lpl/.wdm/drivers/chromedriver/mac64/97.0.4692.71]


In [3]:
# Accessing the site and loading the rental listings page
url = "https://www.vivareal.com.br/aluguel/santa-catarina/florianopolis/"
driver.get(url)
sleep(2)

In [4]:
# Creating the dataframe structure
df = pd.DataFrame(columns=['product', 
                           'area', 
                           'room', 
                           'bath', 
                           'garage', 
                           'adress', 
                           'price'])

In [5]:
# Creating web elements
# (find_elements_by_class_name was removed in Selenium 4.3; the By-based lookup is equivalent)
from selenium.webdriver.common.by import By
prod_lists = driver.find_elements(By.CLASS_NAME, 'property-card__title')
area_lists = driver.find_elements(By.CLASS_NAME, 'property-card__detail-area')
room_lists = driver.find_elements(By.CLASS_NAME, 'property-card__detail-room')
bath_lists = driver.find_elements(By.CLASS_NAME, 'property-card__detail-bathroom')
garage_lists = driver.find_elements(By.CLASS_NAME, 'property-card__detail-garage')
adress_lists = driver.find_elements(By.CLASS_NAME, 'property-card__address')
price_lists = driver.find_elements(By.CLASS_NAME, 'property-card__price')
    
# Getting the text from the web elements (at most the first 36 entries of each list)
prod = [n.text for n in prod_lists][:36]
area = [n.text for n in area_lists][:36]
room = [n.text for n in room_lists][:36]
bath = [n.text for n in bath_lists][:36]
garage = [n.text for n in garage_lists][:36]
adress = [n.text for n in adress_lists][:36]
price = [n.text for n in price_lists][:36]

# Checking the length of each list
#lista = [prod, area, room, bath, garage, adress, price]
#for n in lista:
#    print(len(n))
    
# Appending the data to the DataFrame
# (DataFrame.append was removed in pandas 2.0; pd.concat does the same job)
df = pd.concat([df, pd.DataFrame({'product':prod, 
                                  'area':area, 
                                  'room':room, 
                                  'bath':bath, 
                                  'garage':garage, 
                                  'adress':adress, 
                                  'price':price})])
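
The commented-out length check in the cell above matters because slicing every list to `[:36]` can hide a mismatch: if one selector matches more elements than the others, its values shift relative to the other columns. The `area` column in the sample output at the end of the notebook looks like such a case, since each value appears twice (`124 m²`, then `124`) and no longer lines up with the titles. A minimal sketch of the check as a reusable function follows; `check_equal_lengths` is a hypothetical helper name, not part of the notebook.

```python
from typing import Dict, List


def check_equal_lengths(columns: Dict[str, List[str]]) -> int:
    """Return the shared length of all lists, or raise if they differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    # All lengths are equal here; an empty dict counts as length 0.
    return next(iter(lengths.values()), 0)
```

It would be called on the untruncated lists, for example `check_equal_lengths({'product': [n.text for n in prod_lists], 'area': [n.text for n in area_lists]})`, before building the DataFrame.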

## Exporting the dataframe

In [6]:
# Exporting the raw (unprocessed) DataFrame to a local CSV file
df.to_csv('df_org.csv', sep=';')

In [7]:
df.head()

| | product | area | room | bath | garage | adress | price |
|---|---|---|---|---|---|---|---|
| 0 | Apartamento com 2 Quartos para Aluguel, 124m² | 124 m² | 2 Quartos | 3 Banheiros | 2 Vagas | Jurerê Internacional, Florianópolis - SC | R$ 15.000 /mês |
| 1 | Casa com 3 Quartos para Aluguel, 180m² | 124 | 3 Quartos | 3 Banheiros | 2 Vagas | Rua Liberato Carioni, 311 - Lagoa da Conceição... | R$ 15.000 /mês |
| 2 | Ponto comercial/Loja/Box para Aluguel, 80m² | 180 m² | -- Quarto | 1 Banheiro | -- Vaga | Rua Deputado Paulo Preis, 78 - Jurerê, Florian... | R$ 6.690 /mês |
| 3 | Apartamento com 3 Quartos para Aluguel, 120m² | 180 | 3 Quartos | 2 Banheiros | 3 Vagas | Servidão Paulo Simão Martins - Campeche, Flori... | R$ 5.000 /mês |
| 4 | Apartamento com 3 Quartos para Aluguel, 300m² | 80 m² | 3 Quartos | 5 Banheiros | 2 Vagas | Avenida Governador Irineu Bornhausen, 3690 - B... | R$ 9.950 /mês |
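
The scraped columns are still raw strings (`124 m²`, `2 Quartos`, `-- Vaga`, `R$ 15.000 /mês`), so they need to be converted to numbers before any analysis. The helper below is a minimal sketch of that step, assuming only the formats seen in the sample above: `.` is a thousands separator, there are no decimal places, and `--` marks a missing value. `parse_int` is a hypothetical name, not part of the notebook.

```python
import re
from typing import Optional


def parse_int(text: str) -> Optional[int]:
    """Extract the first integer from a scraped string, or None if there is none.

    Assumes '.' is a thousands separator, as in 'R$ 15.000 /mês'.
    """
    match = re.search(r"\d[\d.]*", text)
    if match is None:
        # Strings such as '-- Vaga' carry no number and are treated as missing.
        return None
    return int(match.group().replace(".", ""))
```

With pandas this could be applied column by column, for example `df['price'].map(parse_int)`.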
