### Discussion
First we scrape the site with Scrapy, then we do some data manipulation.

A first look at the site shows that it has many listings, distributed across cities and types of sale. The two main types of sale are:

- Compra (purchase)
- Aluguel (rent)

There are also different types of property. The two main types are:
- Apartamentos (apartments)
- Casas (houses)


For this scraping, we will focus on the categories that help answer the following questions:

- What is the mean price for each city?
- Which city has the most offers?
- List the 5 cheapest and the most expensive offers.
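
As a rough illustration of how these questions could be answered once the data is scraped, the sketch below uses pandas on a tiny hand-made table. The column names `city` and `Venda` (sale price) are assumed to match the scraped fields, the sample values are made up for illustration only, and "the most expensive" is read here as the top 5, which is an assumption.

```python
import pandas as pd

# Hypothetical sample rows, made up for illustration; the real data
# comes from the crawl. Column names are assumed to match the scraped fields.
offers = pd.DataFrame(
    {
        "city": ["São Paulo", "São Paulo", "Curitiba", "Curitiba", "Santos", "São Paulo"],
        "Venda": [1400000, 229000, 224990, 320000, 450000, 1050000],
    }
)

# Mean sale price for each city.
mean_price_by_city = offers.groupby("city")["Venda"].mean()

# City with the most offers (the most frequent value in the 'city' column).
city_with_most_offers = offers["city"].value_counts().idxmax()

# The 5 cheapest and the 5 most expensive offers.
five_cheapest = offers.nsmallest(5, "Venda")
five_most_expensive = offers.nlargest(5, "Venda")

print(mean_price_by_city)
print(city_with_most_offers)
print(five_cheapest)
print(five_most_expensive)
```

The same pattern would apply to rent prices by swapping `Venda` for the rent column.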

The categories used are:

- Sale Type: 'Casa' or 'Apartamento'
- Price:
    - 'Venda'
    - 'Aluguel'
    - 'Condomínio'
    - 'IPTU'
- Location:
    - City
    - Neighborhood
    - Street and number
- Features

Given the questions we need to answer, we should focus on the Price and Location categories, and use the sale type and other features for more specific questions.
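
Because the scraped items carry every value as a string, and some listings have no price at all, the price fields need a small cleaning step before any aggregation. The sketch below is one possible way to do it in plain Python. The helper names (`to_number`, `clean_item`, `is_usable`) are hypothetical, and the key names (`Venda`, `Aluguel`, `Condominio`, `IPTU`, `city`) are assumed to match the keys of the scraped items shown in the crawl log.

```python
from typing import Optional

# Price-related keys, assumed to match the keys of the scraped items.
PRICE_FIELDS = ("Venda", "Aluguel", "Condominio", "IPTU")


def to_number(raw: Optional[str]) -> Optional[float]:
    """Convert a scraped string such as '520000' to a float; return None if it is missing or not numeric."""
    if raw is None:
        return None
    try:
        return float(raw)
    except ValueError:
        return None


def clean_item(item: dict) -> dict:
    """Return a copy of the item with every price field converted to a float or None."""
    cleaned = dict(item)
    for field in PRICE_FIELDS:
        cleaned[field] = to_number(item.get(field))
    return cleaned


def is_usable(item: dict) -> bool:
    """Keep cleaned items that have a city and at least a sale or a rent price."""
    has_price = item.get("Venda") is not None or item.get("Aluguel") is not None
    return bool(item.get("city")) and has_price


# Two items in the shape produced by the crawl: one complete, one with a title only.
complete = {"Condominio": "700", "Venda": "520000", "city": "Salvador", "saletype": "Apartamento"}
title_only = {"commonareas": "Fitness/Sala de Ginástica", "title": "Bosc Eco Residence"}

cleaned_items = [clean_item(complete), clean_item(title_only)]
usable_items = [item for item in cleaned_items if is_usable(item)]
print(usable_items)
```

Dropping items without a city or a price keeps the per-city statistics from being skewed by listings that only advertise a development.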

#### Scraping

This first step imports the libraries needed for this notebook.

In [1]:
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from DogHero.spiders import imovelweb

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns

The next cell runs the crawl.

In [2]:
%%time
process = CrawlerProcess(get_project_settings())
process.crawl('ImovelWebCrawl_final')

process.start() # the script will block here until the crawling is finished

2018-10-30 17:38:26 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: DogHero)
2018-10-30 17:38:26 [scrapy.utils.log] INFO: Versions: lxml 4.2.1.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.9.0, Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 18.0.0 (OpenSSL 1.0.2o  27 Mar 2018), cryptography 2.2.2, Platform Windows-10-10.0.17134-SP0
2018-10-30 17:38:26 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'DogHero', 'FEED_FORMAT': 'csv', 'FEED_URI': 'imovelweb.csv', 'NEWSPIDER_MODULE': 'DogHero.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['DogHero.spiders']}
2018-10-30 17:38:27 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2018-10-30 17:38:30 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermidd

2018-10-30 17:38:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/imoveis-pagina-17.html> (referer: None)
2018-10-30 17:38:45 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/apartamento-de-luxo-no-morro-ipiranga-4-suites-330-2939480007.html> from <GET https://www.imovelweb.com.br/propriedades/apartamento-de-luxo-no-morro-ipiranga-4-suites-2939480007.html#map>
2018-10-30 17:38:45 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/locacao-jundiai-lage-comercial-550-m-sup2--the-one-2938474343.html> from <GET https://www.imovelweb.com.br/propriedades/locacao-jundiai-lage-comercial-550-m2-the-one-2938474343.html>
2018-10-30 17:38:46 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/casa-a-venda-abrokers.-2929975883.html> from <GET https://www.imovelweb.com.br/proprieda

2018-10-30 17:38:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/bosc-eco-residence-2938583759.html>
{'commonareas': 'Fitness/Sala de Ginástica', 'title': 'Bosc Eco Residence'}
2018-10-30 17:38:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-para-venda-salvador-ba-bairro-pituba-2939926561.html>
{'Condominio': '700',
 'Venda': '520000',
 'area_total': '90',
 'area_util': '90',
 'banheiro': '3',
 'city': 'Salvador',
 'commonareas': 'Restaurante',
 'neigghborhood': 'Pituba',
 'privateareas': 'Varanda',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'R. Amazonas 388',
 'suite': '1',
 'title': 'Apartamento para Venda - Salvador / Ba, Bairro Pituba',
 'vaga': '1'}
2018-10-30 17:38:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-de-luxo-no-morro-ipiranga-4-suites-330-2939480007.html> (referer: https://www.imovelweb.com.br/imoveis-p

2018-10-30 17:38:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/chacara-rural-a-venda-zona-rural-pinhalzinho.-2936828642.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-17.html)
2018-10-30 17:38:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/chacara-rural-a-venda-zona-rural-pinhalzinho.-2936828642.html>
{'Venda': '240000',
 'area_total': '1000',
 'area_util': '330',
 'banheiro': '2',
 'city': 'Pinhalzinho',
 'neigghborhood': 'Centro',
 'privateareas': 'Churrasqueira',
 'quarto': '3',
 'saletype': 'Rural',
 'street': 'Rua Variante Americo Pedro Benedetti',
 'title': 'Chácara Rural à Venda, Zona Rural, Pinhalzinho.'}
2018-10-30 17:38:55 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/sobrado-proximo-da-av-engenheiro-caetano-alvares-1002000903.html> from <GET https://www.imovelweb.com.br/propriedades/sobrado-proximo-da-ave

2018-10-30 17:38:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/casa-a-venda-no-jardim-paulista-2934223131.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-18.html)
2018-10-30 17:38:59 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/joao-bettega-home-club-apartamento-com-3-dormitorios-2938952238.html>
{'Venda': '224990',
 'age_imovel': '1',
 'area_total': '95',
 'area_util': '61',
 'banheiro': '2',
 'city': 'Curitiba',
 'commonareas': 'Fitness/Sala de Ginástica',
 'diverseareas': 'Aceita FGTS',
 'neigghborhood': 'Portão',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'R. João Bettega,',
 'suite': '1',
 'title': 'João Bettega Home Club - Apartamento Com 3 Dormitórios E 1 Vaga '
          'Coberta',
 'vaga': '1'}
2018-10-30 17:38:59 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/casa-a-venda-no-jardim-paulista-2934223131.html>
{'IPTU': '1713'

2018-10-30 17:39:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/hub-business-1922205561.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-18.html)
2018-10-30 17:39:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/oportunidade-unica!-2928079620.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-18.html)
2018-10-30 17:39:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/hub-business-1922205561.html>
{'title': 'Hub - Business'}
2018-10-30 17:39:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/oportunidade-unica!-2928079620.html>
{'Aluguel': '2800',
 'Condominio': '1',
 'age_imovel': '5',
 'area_total': '25000',
 'area_util': '83',
 'banheiro': '2',
 'city': 'Santos',
 'commonareas': 'Espaço Gourmet',
 'neigghborhood': 'Marapé',
 'privateareas': 'Permite animais',
 'quarto': '3',
 'saletype': 

2018-10-30 17:39:07 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/cidade-maia-jardim-2939177695.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-13.html)
2018-10-30 17:39:07 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/cidade-maia-jardim-2939177695.html>
{'title': 'Cidade Maia Jardim'}
2018-10-30 17:39:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/cobertura-a-venda-229-m-r$699-mil-recreio-2939012863.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-18.html)
2018-10-30 17:39:08 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/aracoiaba-da-serra-r$20.000-2939705492.html> from <GET https://www.imovelweb.com.br/propriedades/aracoiaba-da-serra-r$-20.000-00-2939705492.html>
2018-10-30 17:39:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/pro

2018-10-30 17:39:11 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/platina-220-residencial-2939509567.html>
{'commonareas': 'Fitness/Sala de Ginástica',
 'privateareas': 'Varanda',
 'title': 'Platina 220 Residencial'}
2018-10-30 17:39:12 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-no-bosque-maia-58-m-2-dorms-com-1-suite-2938298066.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-13.html)
2018-10-30 17:39:12 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-no-bosque-maia-58-m-2-dorms-com-1-suite-2938298066.html>
{'Condominio': '260',
 'IPTU': '100',
 'Venda': '279900',
 'age_imovel': '2',
 'area_total': '58',
 'area_util': '58',
 'banheiro': '2',
 'city': 'Guarulhos',
 'commonareas': 'Fitness/Sala de Ginástica',
 'diverseareas': 'Aceita FGTS',
 'neigghborhood': 'Jardim Flor da Montanha',
 'privateareas': 'Churrasqueira',
 'quar

2018-10-30 17:39:15 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-para-aluguel-em-santana-2932018457.html>
{'Aluguel': '1700',
 'Condominio': '880',
 'IPTU': '290',
 'area_total': '96',
 'area_util': '96',
 'banheiro': '1',
 'city': 'São Paulo',
 'commonareas': 'Fitness/Sala de Ginástica',
 'neigghborhood': 'Santana',
 'privateareas': 'Escritório',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'NOVA CANTAREIRA',
 'suite': '1',
 'title': 'Apartamento para Aluguel - em Santana',
 'vaga': '2'}
2018-10-30 17:39:16 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-praca-da-arvore-2936877476.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-12.html)
2018-10-30 17:39:16 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-praca-da-arvore-2936877476.html>
{'Condominio': '476',
 'IPTU': '1',
 'Venda

2018-10-30 17:39:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/sobrado-residencial-a-venda-jabaquara-sao-paulo.-2936997673.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-16.html)
2018-10-30 17:39:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/wi-house-2925987397.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-16.html)
2018-10-30 17:39:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/sobrado-residencial-a-venda-jabaquara-sao-paulo.-2936997673.html>
{'IPTU': '412',
 'Venda': '1400000',
 'age_imovel': '48',
 'area_total': '400',
 'area_util': '152',
 'banheiro': '4',
 'city': 'São Paulo',
 'commonareas': 'Acesso asfaltado',
 'neigghborhood': 'Jabaquara',
 'privateareas': 'Esgoto',
 'quarto': '5',
 'saletype': 'Casa',
 'street': 'Rua Jaguarão 227',
 'suite': '1',
 'title': 'Sobrado Residencial à Venda, Jabaquara, São Paulo.',

2018-10-30 17:39:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-para-venda-sao-jose-dos-campos-sp-2938668094.html>
{'Condominio': '390',
 'Venda': '250000',
 'area_util': '70',
 'banheiro': '2',
 'city': 'São José dos Campos',
 'neigghborhood': 'Jardim São Dimas',
 'privateareas': 'Varanda',
 'quarto': '2',
 'saletype': 'Apartamento',
 'street': 'Sob Consulta',
 'suite': '1',
 'title': 'Apartamento para Venda - São José Dos Campos / Sp, Bairro São Dimas',
 'vaga': '1'}
2018-10-30 17:39:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/flat-melia-ibirapuera-para-locacao-ref797-2925124310.html#map> (referer: https://www.imovelweb.com.br/imoveis.html)
2018-10-30 17:39:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/flat-melia-ibirapuera-para-locacao-ref797-2925124310.html>
{'Aluguel': '1845',
 'Condominio': '1100',
 'IPTU': '100',
 'area_util': '35

2018-10-30 17:39:28 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/quartier-brooklin-2937025954.html>
{'commonareas': 'Espaço Gourmet', 'title': 'Quartier Brooklin'}
2018-10-30 17:39:28 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/comercial-para-aluguel-em-santana-2936074658.html>
{'Aluguel': '4500',
 'Condominio': '1',
 'IPTU': '452',
 'area_total': '420',
 'area_util': '420',
 'banheiro': '3',
 'city': 'São Paulo',
 'neigghborhood': 'Santana',
 'saletype': 'Comercial',
 'street': 'DUARTE DE AZEVEDO',
 'title': 'Comercial para Aluguel - em Santana',
 'vaga': '3'}
2018-10-30 17:39:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/all-you-need-studio-25-m-sup2--localizado-no-2937846043.html> from <GET https://www.imovelweb.com.br/propriedades/all-you-need-studio-25m-localizado-no-centro-de-2937846043.html#map>
2018-10-30 17:39:30 [scra

2018-10-30 17:39:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-menino-deus-2937581315.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-15.html)
2018-10-30 17:39:34 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-menino-deus-2937581315.html>
{'Condominio': '790',
 'IPTU': '900',
 'Venda': '927000',
 'age_imovel': '17',
 'area_total': '172',
 'area_util': '111',
 'banheiro': '3',
 'city': 'Porto Alegre',
 'commonareas': 'Fitness/Sala de Ginástica',
 'neigghborhood': 'Menino Deus',
 'privateareas': 'Churrasqueira',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'VISCONDE DO HERVAL',
 'suite': '1',
 'title': 'Apartamento à Venda - em Menino Deus',
 'vaga': '2'}
2018-10-30 17:39:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-residencial-a-venda-jardim-prudencia-sao-2939277174.html#

2018-10-30 17:39:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/fechamento-de-mes-faca-sua-proposta-casas-alto-de-2938435610.html>
{'Venda': '6950500',
 'age_imovel': 'Em construção',
 'area_total': '343',
 'area_util': '343',
 'banheiro': '6',
 'city': 'São Paulo',
 'commonareas': 'Espaço Gourmet',
 'neigghborhood': 'Alto de Pinheiros',
 'quarto': '4',
 'saletype': 'Casa',
 'street': 'Rua Professor Fonseca Rodrigues',
 'suite': '4',
 'title': 'Fechamento De Mês Faça Sua Proposta - Casas Alto De Pinheiros!',
 'vaga': '4'}
2018-10-30 17:39:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-edificio-amazonas-centro-2939070172.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-15.html)
2018-10-30 17:39:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-edificio-amazonas-centro-2939070172.html>
{'Condominio': '38

2018-10-30 17:39:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/casa-3-dormitorios-em-bairro-nobre-de-atibaia-2928089241.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-14.html)
2018-10-30 17:39:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/casa-3-dormitorios-em-bairro-nobre-de-atibaia-2928089241.html>
{'Aluguel': '1900',
 'IPTU': '118',
 'age_imovel': '20',
 'area_util': '167',
 'banheiro': '3',
 'city': 'Atibaia',
 'commonareas': 'Acesso asfaltado',
 'neigghborhood': 'Vila Santista',
 'privateareas': 'Piso frio',
 'quarto': '3',
 'saletype': 'Casa',
 'street': 'Rua Floresta 62',
 'suite': '1',
 'title': 'Casa 3 Dormitórios em Bairro Nobre De Atibaia - Locação',
 'vaga': '2'}
2018-10-30 17:39:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/viva-benx-nac-es-unidas-2938780677.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina

2018-10-30 17:39:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/ed.-phoenix-salao-barbearia-2940057900.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-10.html)
2018-10-30 17:39:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/praticamente-uma-pousada-particular-dentro-da-fazenda-2937302089.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-10.html)
2018-10-30 17:39:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/rio-da-prata-casa-duplex-3-quartos-1-suite-em-2939835627.html>
{'Condominio': '100',
 'IPTU': '30',
 'Venda': '650000',
 'age_imovel': '1',
 'area_total': '130',
 'area_util': '130',
 'banheiro': '3',
 'city': 'Rio de Janeiro',
 'neigghborhood': 'Campo Grande',
 'quarto': '3',
 'saletype': 'Casa',
 'street': 'Estrada do Lameirão Pequeno',
 'suite': '1',
 'title': 'Rio Da Prata - Casa Duplex 3 Quartos/1 Suíte em Condomínio -

2018-10-30 17:39:50 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/saint-barth-flamands-301-m-sup2--04-suites-2939907893.html> from <GET https://www.imovelweb.com.br/propriedades/saint-barth-flamands-301m-04-suites-peninsula-2939907893.html#map>
2018-10-30 17:39:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-vila-izabel-1-dormitorio-face-norte-com-2938126587.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-10.html)
2018-10-30 17:39:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-vila-izabel-1-dormitorio-face-norte-com-2938126587.html>
{'Condominio': '319',
 'IPTU': '493',
 'Venda': '320000',
 'age_imovel': '4',
 'area_total': '89',
 'area_util': '51',
 'banheiro': '1',
 'city': 'Curitiba',
 'commonareas': 'Salão de Jogos',
 'neigghborhood': 'Vila Izabel',
 'privateareas': 'Aceita Financiamento

2018-10-30 17:39:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-para-aluguel-em-santana-2934828709.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-11.html)
2018-10-30 17:39:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/casa-a-venda-no-centro-2930480497.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-11.html)
2018-10-30 17:39:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-para-aluguel-em-santana-2934828709.html>
{'Aluguel': '1800',
 'Condominio': '1100',
 'area_util': '110',
 'banheiro': '2',
 'city': 'São Paulo',
 'commonareas': 'Salão de festas',
 'neigghborhood': 'Santana',
 'privateareas': 'Armário embutido',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'Rua Voluntários da Pátria 3714',
 'suite': '1',
 'title': 'Apartamento para Aluguel - em Santana',
 'vaga': '2'}
2018-10-30 17:39:54

2018-10-30 17:39:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/linda-cobertura-duplex-240-m-sup2--4-dorm-5-banheiros-2939651154.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-11.html)
2018-10-30 17:39:59 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/linda-cobertura-duplex-240-m-sup2--4-dorm-5-banheiros-2939651154.html>
{'Venda': '1050000',
 'age_imovel': '2',
 'area_total': '240',
 'area_util': '240',
 'banheiro': '5',
 'city': 'São Paulo',
 'commonareas': 'Fitness/Sala de Ginástica',
 'diverseareas': 'Aceita FGTS',
 'neigghborhood': 'Penha',
 'privateareas': 'Suítes',
 'quarto': '4',
 'saletype': 'Apartamento',
 'street': 'Rua faustino paganini , penha SP',
 'suite': '1',
 'title': 'Linda Cobertura Duplex 240 m&sup2; 4 Dorm 5 Banheiros 1 Suite 2 '
          'Vagas 3 Sacadas Penha',
 'vaga': '2'}
2018-10-30 17:39:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.c

2018-10-30 17:40:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-studio-no-centro-de-curitiba-100-2939454616.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-9.html)
2018-10-30 17:40:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/locacao-de-sala-comercial-86-m-sup2--proximo-a-2937524325.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-11.html)
2018-10-30 17:40:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-studio-no-centro-de-curitiba-100-2939454616.html>
{'Aluguel': '1380',
 'Condominio': '300',
 'IPTU': '45',
 'area_util': '33',
 'banheiro': '1',
 'city': 'Curitiba',
 'commonareas': 'Fitness/Sala de Ginástica',
 'neigghborhood': 'Centro',
 'privateareas': 'Elevador',
 'quarto': '1',
 'saletype': 'Apartamento',
 'street': 'Rua Conselheiro Laurindo 1138',
 'suite': '0',
 'title': 'Apartamento Studio no Ce

2018-10-30 17:40:07 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/open-residence-2938293296.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-8.html)
2018-10-30 17:40:07 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/open-residence-2938293296.html>
{'commonareas': 'Fitness/Sala de Ginástica',
 'privateareas': 'Varanda Gourmet',
 'title': 'Open Residence'}
2018-10-30 17:40:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-frente-ao-mar-2939566148.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-8.html)
2018-10-30 17:40:08 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-frente-ao-mar-2939566148.html>
{'Condominio': '250',
 'IPTU': '90',
 'Venda': '135000',
 'area_total': '45',
 'area_util': '33',
 'banheiro': '1',
 'city': 'Praia Grande',
 'neigghborhood': 'Vila Guilherm

2018-10-30 17:40:11 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/cobertura-para-locacao-sao-paulo-sp-bairro-alto-2938853444.html>
{'Aluguel': '18000',
 'Condominio': '2951',
 'IPTU': '2370',
 'area_total': '650',
 'area_util': '375',
 'banheiro': '1',
 'city': 'São Paulo',
 'commonareas': 'Salão de festas',
 'neigghborhood': 'Alto de Pinheiros',
 'privateareas': 'Suítes',
 'saletype': 'Apartamento',
 'street': 'R MASSACÁ 231',
 'suite': '3',
 'title': 'Cobertura para Locação - São Paulo / Sp, Bairro Alto De Pinheiros',
 'vaga': '5'}
2018-10-30 17:40:11 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/excelente-oportunidade-ed-evora-ap-com-cozinha-1930809393.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-8.html)
2018-10-30 17:40:12 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/excelente-oportunidade-ed-evora-ap-com-cozinha-1930809393.html>
{'com

2018-10-30 17:40:15 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/aptos-de-77-e-94-m-sup2--terraco-gourmet-e-2936993765.html>
{'Condominio': '400',
 'Venda': '384200',
 'age_imovel': '1',
 'area_total': '77',
 'area_util': '77',
 'banheiro': '2',
 'city': 'São Bernardo do Campo',
 'commonareas': 'Fitness/Sala de Ginástica',
 'neigghborhood': 'Baeta Neves',
 'privateareas': 'Suítes',
 'quarto': '2',
 'saletype': 'Apartamento',
 'street': 'Av. Pereira Barreto',
 'suite': '1',
 'title': 'Aptos De 77 E 94 m&sup2; Terraço Gourmet E Churrasqueira, Frente '
          'Sonda Pereira Barreto',
 'vaga': '1'}
2018-10-30 17:40:16 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-para-aluguel-em-auxiliadora-2940067988.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-5.html)
2018-10-30 17:40:16 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartament

2018-10-30 17:40:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/studios-de-18-a-49-m-sup2--no-jardins-ao-lado-do-2938247520.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-5.html)
2018-10-30 17:40:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-na-barra-da-tijuca-peninsula-2939082382.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-5.html)
2018-10-30 17:40:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/studios-de-18-a-49-m-sup2--no-jardins-ao-lado-do-2938247520.html>
{'Condominio': '240',
 'Venda': '229000',
 'age_imovel': 'Breve lançamento',
 'area_total': '18',
 'area_util': '18',
 'banheiro': '1',
 'city': 'São Paulo',
 'commonareas': 'Fitness/Sala de Ginástica',
 'neigghborhood': 'Jardins',
 'quarto': '1',
 'saletype': 'Apartamento',
 'street': 'Avenida Rebouças',
 'suite': '1',
 'title': 'Studios De 18 

2018-10-30 17:40:23 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/fantastico!-apartamento-a-venda-na-vila-mascote-1001630522.html> from <GET https://www.imovelweb.com.br/propriedades/fantastico-!-apartamento-a-venda-na-vila-mascote-1001630522.html#map>
2018-10-30 17:40:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-na-agua-verde-2940120615.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-4.html)
2018-10-30 17:40:24 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-na-agua-verde-2940120615.html>
{'Condominio': '280',
 'Venda': '299000',
 'age_imovel': '38',
 'area_total': '118',
 'area_util': '65',
 'banheiro': '0',
 'city': 'Curitiba',
 'neigghborhood': 'Água Verde',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'Rua Bento Viana 553',
 'suite': '0',
 'title': 'Apartamento 

2018-10-30 17:40:29 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/lindo-apartamento-no-condominio-vitta-club!-2940127869.html>
{'Aluguel': '3000',
 'Condominio': '350',
 'IPTU': '100',
 'Venda': '850000',
 'age_imovel': '8',
 'area_util': '108',
 'banheiro': '4',
 'city': 'Jundiaí',
 'commonareas': 'Espaço Gourmet',
 'diverseareas': 'Andares',
 'neigghborhood': 'Anhangabaú',
 'privateareas': 'Armário de cozinha',
 'quarto': '2',
 'saletype': 'Apartamento',
 'street': 'Rua Barão de Teffe 127',
 'suite': '2',
 'title': 'Lindo Apartamento no Condominio Vitta Club!',
 'vaga': '2'}
2018-10-30 17:40:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/oportunidade-imperdivel-nos-jardins-transamerica-2935113603.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-6.html)
2018-10-30 17:40:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-2-dormitori

2018-10-30 17:40:33 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-jardim-tupanci-2927864810.html>
{'Venda': '1400000',
 'area_util': '240',
 'banheiro': '3',
 'city': 'Barueri',
 'commonareas': 'Fitness/Sala de Ginástica',
 'neigghborhood': 'Jardim Tupanci',
 'privateareas': 'Área de serviço',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'Rua Werner Goldberg 0',
 'suite': '1',
 'title': 'Apartamento - Jardim Tupanci',
 'vaga': '2'}
2018-10-30 17:40:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-02-quartos-via-enseada-home-club-aguas-2939983430.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-6.html)
2018-10-30 17:40:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/quadra-nobre-com-3-quartos-sendo-1-suite-sala-2-2940019684.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-6.html)
2018-10-30 17:40

2018-10-30 17:40:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-na-campeche-2939741058.html>
{'Condominio': '480',
 'IPTU': '980',
 'Venda': '597000',
 'area_total': '140',
 'city': 'Florianópolis',
 'commonareas': 'Aquecimento central',
 'neigghborhood': 'Campeche',
 'privateareas': 'Mobiliado',
 'quarto': '2',
 'saletype': 'Apartamento',
 'street': 'Campeche 1670',
 'suite': '1',
 'title': 'Apartamento à Venda - na Campeche',
 'vaga': '2'}
2018-10-30 17:40:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/avance-vila-romana-rua-marco-aurelio-padrao-paulo-2938758351.html>
{'Condominio': '1000',
 'IPTU': '650',
 'Venda': '1845000',
 'age_imovel': '8',
 'area_util': '152',
 'banheiro': '5',
 'city': 'São Paulo',
 'commonareas': 'Fitness/Sala de Ginástica',
 'diverseareas': 'Andares',
 'neigghborhood': 'Vila Romana',
 'privateareas': 'Suítes',
 'quarto': '4',
 'saletype': 'Apartamen

2018-10-30 17:40:42 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/oferta-apartamento-03-dormitorios-a-venda-em-itapema!-2937798225.html> from <GET https://www.imovelweb.com.br/propriedades/oferta-apartamento-03-dormitorios-a-venda-em-2937798225.html#map>
2018-10-30 17:40:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/casa-a-venda-na-campeche-2939741053.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-3.html)
2018-10-30 17:40:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/casa-a-venda-na-campeche-2939741053.html>
{'Venda': '375000',
 'area_total': '114',
 'city': 'Florianópolis',
 'neigghborhood': 'Campeche',
 'privateareas': 'Área de serviço',
 'quarto': '2',
 'saletype': 'Casa',
 'street': 'Revoar Das Gaivotas 13',
 'suite': '2',
 'title': 'Casa à Venda - na Campeche',
 'vaga': '1'}
2018-10-30 17:40:43 [scrapy.cor

2018-10-30 17:40:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/lille-2938846713.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-7.html)
2018-10-30 17:40:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/lille-2938846713.html>
{'commonareas': 'Espaço Gourmet', 'privateareas': 'Piso frio', 'title': 'Lillè'}
2018-10-30 17:40:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/oferta-apartamento-03-dormitorios-a-venda-em-itapema!-2937798225.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-7.html)
2018-10-30 17:40:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/lindo-apto-coladinho-ao-metro-vila-sonia-e-rico-em-2939328336.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-7.html)
2018-10-30 17:40:47 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www

2018-10-30 17:40:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-na-barra-da-tijuca-2923949437.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-7.html)
2018-10-30 17:40:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/otimo-projeto-vista-livre-e-estuda-proposta-com-2938940614.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-2.html)
2018-10-30 17:40:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-na-barra-da-tijuca-2923949437.html>
{'Condominio': '1200',
 'IPTU': '850',
 'Venda': '1980000',
 'age_imovel': '28',
 'area_total': '386',
 'area_util': '386',
 'banheiro': '2',
 'city': 'Rio de Janeiro',
 'neigghborhood': 'Barra da Tijuca',
 'privateareas': 'Suítes',
 'quarto': '4',
 'saletype': 'Apartamento',
 'street': 'Rua Einstein',
 'suite': '2',
 'title': 'Apartamento à Venda - na Barra Da 

2018-10-30 17:40:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/reserva-horizonte-2935759765.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-2.html)
2018-10-30 17:40:57 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/reserva-horizonte-2935759765.html>
{'commonareas': 'Playground', 'title': 'Reserva Horizonte'}
2018-10-30 17:40:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/quartier-campo-belo-2923335007.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-2.html)
2018-10-30 17:40:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/kit-com-garagem-em-predio-novo-com-elevadores-entrega-2936929421.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-2.html)
2018-10-30 17:40:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/oportunidade-apto-3

2018-10-30 17:41:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/imoveis-pagina-25.html> (referer: None)
2018-10-30 17:41:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/imoveis-pagina-31.html> (referer: None)
2018-10-30 17:41:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/imoveis-pagina-32.html> (referer: None)
2018-10-30 17:41:07 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/edificio-vila-nova-de-gaia-analia-franco-2935958673.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-19.html)
2018-10-30 17:41:07 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/cadoro-escritorios-2926034206.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-23.html)
2018-10-30 17:41:07 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/dsenho-2938369141.html#map> (referer:

2018-10-30 17:41:11 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/portal-vila-das-praias-vila-de-camburi-2935755730.html>
{'commonareas': 'Espaço Gourmet',
 'title': 'Portal Vila Das Praias - Vila De Camburi'}
2018-10-30 17:41:12 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/apartamento-para-aluguel-na-barra-da-tijuca-2938288438.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-32.html)
2018-10-30 17:41:12 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/for-life-maraponga-diversao-2936240383.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-32.html)
2018-10-30 17:41:12 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-para-aluguel-na-barra-da-tijuca-2938288438.html>
{'Aluguel': '3500',
 'Condominio': '1100',
 'IPTU': '299',
 'area_util': '110',
 'banheiro': '1',
 'city': 'Rio de Jane

2018-10-30 17:41:17 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/residencial-jardim-di-hamelin-2939858123.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-34.html)
2018-10-30 17:41:17 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/alto-padrao-r$7.500-m-sup2--vai-perder-2936657369.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-32.html)
2018-10-30 17:41:17 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/residencial-jardim-di-hamelin-2939858123.html>
{'title': 'Residencial Jardim Di Hamelin'}
2018-10-30 17:41:17 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/chegou-a-hora-de-conquistar-o-que-e-seu!apto-1e-2-2935756553.html> from <GET https://www.imovelweb.com.br/propriedades/chegou-a-hora-de-conquistar-o-que-e-seu!!!!apto-1e-2-2935756553.html#map>
2018-10-30 17:41:17 [s

2018-10-30 17:41:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/le-quartier-moema!2-vagas!-70-m-sup2-!jandira-2939328235.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-34.html)
2018-10-30 17:41:22 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/oportunidade-alto-padrao-2937988525.html>
{'Venda': '1100000',
 'age_imovel': 'Breve lançamento',
 'area_total': '226',
 'area_util': '121',
 'banheiro': '4',
 'city': 'Balneário Camboriú',
 'commonareas': 'Espaço Gourmet',
 'diverseareas': 'Andares',
 'neigghborhood': 'Centro',
 'privateareas': 'Suítes',
 'quarto': '3',
 'saletype': 'Apartamento',
 'street': 'RUA 3700 415',
 'suite': '3',
 'title': 'Oportunidade Alto Padrão',
 'vaga': '2'}
2018-10-30 17:41:22 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/le-quartier-moema!2-vagas!-70-m-sup2-!jandira-2939328235.html>
{'Aluguel': '3500',
 'Condominio': '885

2018-10-30 17:41:26 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/studios-910-2938946674.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-33.html)
2018-10-30 17:41:26 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/loja-comercial-bem-localizada!-2938994595.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-33.html)
2018-10-30 17:41:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/studios-910-2938946674.html>
{'commonareas': 'Praça',
 'privateareas': 'Ar condicionado',
 'title': 'Studios 910'}
2018-10-30 17:41:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/loja-comercial-bem-localizada!-2938994595.html>
{'Venda': '1100000',
 'age_imovel': '10',
 'area_total': '401',
 'area_util': '350',
 'banheiro': '1',
 'city': 'Curitiba',
 'commonareas': 'Acesso asfaltado',
 'neigghborhood': 'Água Verde',
 '

2018-10-30 17:41:30 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imovelweb.com.br/propriedades/apartamento-com-2-dormitorios-ao-lado-da-cidade-2939496347.html>
{'Condominio': '450',
 'Venda': '320000',
 'age_imovel': '7',
 'area_util': '50',
 'banheiro': '1',
 'city': 'São Paulo',
 'commonareas': 'Interfone',
 'neigghborhood': 'Rio Pequeno',
 'privateareas': 'Ar condicionado',
 'quarto': '2',
 'saletype': 'Apartamento',
 'street': 'Av. do Rio Pequeno',
 'suite': '0',
 'title': 'Apartamento Com 2 Dormitórios, Ao Lado Da Cidade Universitária, Com '
          'Móveis!',
 'vaga': '1'}
2018-10-30 17:41:30 [scrapy.extensions.logstats] INFO: Crawled 439 pages (at 146 pages/min), scraped 404 items (at 130 items/min)
2018-10-30 17:41:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/boulevard-reboucas-2-quartos-semi-mobiliado-2939781353.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-32.html)
2018-10-30 17:41:31 [scrapy.cor

2018-10-30 17:41:33 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/jurubatuba-empresarial-2939273766.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-31.html)
2018-10-30 17:41:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/excelente-sobrado-no-agua-verde-5-quartos-vagas-2939745678.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:33 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/casa-comercial-com-3-dorm.-2939656903.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-32.html)
2018-10-30 17:41:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/casa-com-3-dorms-guilhermina-praia-grande-r$5-2931607784.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:33 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb

2018-10-30 17:41:33 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/lindissimo-lancamento-na-praia-dos-sonhos-2939627294.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-25.html)
2018-10-30 17:41:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/just-brigadeiro-2937338256.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.imovelweb.com.br/propriedades/terreno-lopes-de-oliveira-sorocaba-r$80.000-2940121537.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-32.html)
2018-10-30 17:41:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/lindissimo-lancamento-na-praia-dos-sonhos-2939627294.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:33 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb

2018-10-30 17:41:34 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-residencial-para-venda-e-locacao-morumbi-2930258255.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:34 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-pituba-salvador-2936419543.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-02-dormitorios-alto-padrao-em-frente-2935138138.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-30.html)
2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/casa-nova-1200m-a-venda-em-alphaville-abrokers-2933648122.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-30.html)
2018-10-30 17:41:34 [scrapy.spidermiddlewar

2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-para-venda-e-locacao-com-120-m-vila-2939324844.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-28.html)
2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/vivere-ecoville-mobiliado-3-quartos-2-vagas-2939112222.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-30.html)
2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-2-ou-3-quartos-more-em-alphaville-melhor-2937122283.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-30.html)
2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/pe-na-areia-praia-de-ingleses-2938648039.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-28.html)
2018-10-30 17:41:34 [scrapy.core.engine] DEBUG: Crawl

2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/casa-para-alugar-condominio-alphaville-04-suites-06-2935623562.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-28.html)
2018-10-30 17:41:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-residencial-a-venda-agua-verde-curitiba.-2937182019.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:35 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/cod-4595.-ibiuna-chacara-ideal-para-lazer-ou-2937628559.html> from <GET https://www.imovelweb.com.br/propriedades/cod-4595.-ibiuna-chacara-ideal-para-lazer-ou-moradia!-2937628559.html#map>
2018-10-30 17:41:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/terreno-residencial-a-venda-377-00-m2-privativa-santa-293

2018-10-30 17:41:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/duplex-em-excelente-localizacao-jardim-paulista-2936396080.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/marville-residence-2939821116.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-27.html)
2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-sao-conrado-2928171810.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-27.html)
2018-10-30 17:41:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-em-meia-praia-bem-localizado-com-otimo-2938475028.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://

2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/spazio-vila-da-gloria-2938009089.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-29.html)
2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-ingleses-2937411868.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-29.html)
2018-10-30 17:41:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/lindo-apartamento-na-vila-guilherme!!!-2938302420.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/cidade-maia-jardim-2931782406.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:35 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento

2018-10-30 17:41:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-residencial-a-venda-praia-mansa-caioba-2932469259.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/quadra-do-mar-2939545582.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/inspira-business-sala-comercial-na-av.-republica-2939758733.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/ultima-unidade-pronto-p-morar-parcela-a-entrada-2940011401.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-26.html)
2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET

2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/brooklin-skymark-2934721478.html> (referer: https://www.imovelweb.com.br/imoveis-pagina-24.html)
2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/fechamento-de-mes-faca-sua-proposta-apartamento-em-2936900467.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-24.html)
2018-10-30 17:41:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-residencial-a-venda-brooklin-sao-paulo.-2937525972.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/excelente-sobrado-no-jardim-luzitania-a-poucos-passos-2936176640.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-24.html)
2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (4

2018-10-30 17:41:36 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.imovelweb.com.br/propriedades/apartamento-60-m-sup2--2-dormitorios-pronto-para-morar-2936268470.html> from <GET https://www.imovelweb.com.br/propriedades/apartamento-60-m-2-dormitorios-pronto-para-morar-2936268470.html#map>
2018-10-30 17:41:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-2-dormitorios-sendo-uma-suite-em-pinheiros-2940045050.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-23.html)
2018-10-30 17:41:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-a-venda-em-moinhos-de-vento-2922947078.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/parque-caminho-das-aroeiras-2938008231.html>: HTTP status code

2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/qi-06-otima-localizacao-perto-de-tudo-111496114.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-22.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/flex-jundiai-2921935384.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-22.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/conquista-campestre-reserva-2938995796.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-22.html)
2018-10-30 17:41:37 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-60-m-sup2--2-dormitorios-pronto-para-morar-2936268470.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:37 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://w

2018-10-30 17:41:37 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/monte-libano-barato!!-apartamento-3-quartos-1-vaga-2937282996.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-na-meia-praia-em-regiao-de-crescimento-e-2938469622.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-21.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/rosas-de-provence-2936854137.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-19.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/cobertura-duplex-2-dorms-1-banheiro-e-1-vaga-de-2939116444.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-21.html)
2018-10-30 17:41:37 [scrapy.spidermiddlewares.httperror] I

2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-residencial-a-venda-centro-guarulhos.-2938317744.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-19.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/reserva-pq-das-cachoeiras-cristais-2935721055.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-19.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/casa-com-piscina-no-bella-citta-3-suites-2939615208.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-19.html)
2018-10-30 17:41:37 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/cobertura-pronta-para-morar.-residencial-soleil-2938862887.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:37 [scrapy.spidermiddlewares.httperro

2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/apartamento-a-venda-no-campo-comprido-2938134322.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-20.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/cobertura-unica-itaim-exclusiva-2933039981.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-20.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/la-vie-agua-verde-2938668867.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-20.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/terrea-!-pratica-!-reformada.-2939505945.html#map> (referer: https://www.imovelweb.com.br/imoveis-pagina-20.html)
2018-10-30 17:41:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/a

2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-41.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-42.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-50.html> (referer: None)
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-43.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-44.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/i

2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-69.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-67.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-76.html> (referer: None)
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-68.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:38 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-77.html> (referer: None)
2018-10-30 17:41:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-70.html>: HTTP status code is not han

2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-101.html> (referer: None)
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-93.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-94.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-102.html> (referer: None)
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-95.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-103.html> (referer: None)
2018-10-30 17:41:39 [scrapy.

2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-125.html> (referer: None)
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-126.html> (referer: None)
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-120.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-127.html> (referer: None)
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-128.html> (referer: None)
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-121.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://w

2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-144.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-152.html> (referer: None)
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-153.html> (referer: None)
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-154.html> (referer: None)
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-146.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-155.html> (referer: None)
2018-10-30 17:41:39 [scrapy.spidermiddlewares.httperror] INFO: Ignoring resp

2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-177.html> (referer: None)
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-178.html> (referer: None)
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-169.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-171.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-179.html> (referer: None)
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-180.html> (referer: None)
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring resp

2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-194.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-204.html> (referer: None)
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-195.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-196.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-205.html> (referer: None)
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-197.html>: HTTP status code is n

2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-228.html> (referer: None)
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-229.html> (referer: None)
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-231.html> (referer: None)
2018-10-30 17:41:40 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-230.html> (referer: None)
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-222.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-221.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring resp

2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-245.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:41 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-254.html> (referer: None)
2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-247.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-248.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:41 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-255.html> (referer: None)
2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-249.html>: HTTP status code is n

2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-269.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:41 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-280.html> (referer: None)
2018-10-30 17:41:41 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-279.html> (referer: None)
2018-10-30 17:41:41 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-281.html> (referer: None)
2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-273.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-274.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:41 [scra

2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-305.html> (referer: None)
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-297.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-306.html> (referer: None)
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-298.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-308.html> (referer: None)
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-307.html> (referer: None)
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring resp

2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-322.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-323.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-321.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-324.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-331.html> (referer: None)
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/propriedades/ap

2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartamento-no-brooklin-165-m-4-dorms-2-suites-3-2929647648.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-336.html> (referer: None)
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/galpao-industrial-2938306444.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-337.html> (referer: None)
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-338.html> (referer: None)
2018-10-30 17:41:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/apartame

2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-357.html> (referer: None)
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/botanico-condominio-parque-2926722535.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-347.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-358.html> (referer: None)
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/propriedades/df-century-plaza-residencial-2938776154.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://ww

2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-377.html> (referer: None)
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-373.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-378.html> (referer: None)
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-374.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-375.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-379.html> (referer: None)
2018-10-30 17:41:43 [scra

2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-397.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-406.html> (referer: None)
2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-405.html> (referer: None)
2018-10-30 17:41:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-407.html> (referer: None)
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-399.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-389.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:43 [scra

2018-10-30 17:41:44 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-430.html> (referer: None)
2018-10-30 17:41:44 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-431.html> (referer: None)
2018-10-30 17:41:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-424.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-423.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:44 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-432.html> (referer: None)
2018-10-30 17:41:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-426.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:44 [scra

2018-10-30 17:41:44 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-456.html> (referer: None)
2018-10-30 17:41:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-447.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:44 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-457.html> (referer: None)
2018-10-30 17:41:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-450.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:44 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-458.html> (referer: None)
2018-10-30 17:41:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-449.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:44 [scra

2018-10-30 17:41:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-473.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-469.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-474.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:45 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-483.html> (referer: None)
2018-10-30 17:41:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-475.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:45 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-

2018-10-30 17:41:45 [scrapy.core.engine] INFO: Spider closed (finished)


Wall time: 3min 18s
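
The visible part of the log is dominated by 403 responses. As a rough way to quantify this, the sketch below tallies status codes from captured log text. It assumes the log is available as a string and that each fetched page produces a `Crawled (NNN)` line like the ones above; `count_statuses` is a hypothetical helper, not part of Scrapy.

```python
import re
from collections import Counter

# Matches the status code in Scrapy engine lines such as
# "... DEBUG: Crawled (403) <GET https://...> (referer: None)".
CRAWLED = re.compile(r"Crawled \((\d{3})\)")


def count_statuses(log_text):
    """Count HTTP status codes found in 'Crawled (NNN)' log lines."""
    return Counter(match.group(1) for match in CRAWLED.finditer(log_text))


# Three lines copied from the log above; only the two "Crawled" lines are counted.
sample_log = """\
2018-10-30 17:41:41 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-255.html> (referer: None)
2018-10-30 17:41:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <403 https://www.imovelweb.com.br/imoveis-pagina-247.html>: HTTP status code is not handled or not allowed
2018-10-30 17:41:42 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.imovelweb.com.br/imoveis-pagina-305.html> (referer: None)
"""

status_counts = count_statuses(sample_log)
print(status_counts)
```

A high share of 403 codes would suggest that many pages were refused by the site, which helps explain why the resulting dataset is small.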


Now let's take a look at the data we scraped.

In [3]:
dataset = pd.read_csv('imovelweb.csv')

In [4]:
dataset.head()

Unnamed: 0,Aluguel,Condominio,IPTU,Venda,age_imovel,area_total,area_util,banheiro,city,commonareas,diverseareas,neigghborhood,privateareas,quarto,saletype,street,suite,title,vaga
0,,,,,,,,,,,,,,,,,,Focus Business Center,
1,,,,,,,,,,,,,,,,,,Vista Arboris Mistral Residencial,
2,,,,,,,,,,Espaço Gourmet,,,Suítes,,,,,Últimas Unidades! Campanário Appartamentti,
3,,155.0,786.0,160000.0,,,20.0,1.0,Setor Industrial,,Posição do Apto,Saan,,0.0,Comercial,SAAN,0.0,Saan - Business Center - Sala - Vende - Se,0.0
4,,,,,,,,,,Espaço Gourmet,,,,,,,,Rox,


In [5]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 421 entries, 0 to 420
Data columns (total 19 columns):
Aluguel          74 non-null float64
Condominio       218 non-null float64
IPTU             184 non-null float64
Venda            263 non-null float64
age_imovel       195 non-null object
area_total       244 non-null float64
area_util        304 non-null float64
banheiro         299 non-null float64
city             305 non-null object
commonareas      299 non-null object
diverseareas     80 non-null object
neigghborhood    305 non-null object
privateareas     290 non-null object
quarto           307 non-null float64
saletype         321 non-null object
street           301 non-null object
suite            277 non-null float64
title            421 non-null object
vaga             304 non-null float64
dtypes: float64(10), object(9)
memory usage: 62.6+ KB


In [6]:
# Drop listings that have neither a rent ('Aluguel') nor a sale ('Venda') price.
dataset = dataset.dropna(axis=0, subset=['Aluguel', 'Venda'], how='all').reset_index(drop=True)

#### Data Manipulation

Since we want to compare prices across listings, we create a new column with the total price.
- For rentals, the total price is the rent ('Aluguel') plus the condo fee ('Condominio'); when no condo fee is listed, the total is the rent alone.

In [23]:
# dataset ['pricetotal'] = pd.to_numeric(dataset['price_original'].str.split(' ', 1, expand = True)[1].str.replace('.',''))+pd.to_numeric(dataset['price_extra'].str.split(' ', 3, expand=True)[2].str.replace('.','').str.replace(',','.').fillna(0))
# dataset.loc[(dataset['filter_purchase']=='Comprar'),'pricetotal'] = dataset[(dataset['filter_purchase']=='Comprar')]['price_original'].str.split(' ', 1, expand = True)[1].str.replace('.','')

# Total cost of a rental: rent plus condo fee.
# A missing condo fee counts as zero, so the total falls back to the rent alone;
# rows without a rent value (sale-only listings) stay NaN.
dataset['price_rent_total'] = dataset['Aluguel'] + dataset['Condominio'].fillna(0)
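
To make the rental rule concrete, here is a small worked example on hypothetical values (the `toy` frame and its numbers are invented for illustration). It uses `fillna(0)` so that a missing condo fee counts as zero, while rows with no rent remain empty.

```python
import numpy as np
import pandas as pd

# Hypothetical listings: rent + condo, rent only, and a sale-only row (no rent).
toy = pd.DataFrame({
    "Aluguel": [1200.0, 900.0, np.nan],
    "Condominio": [300.0, np.nan, 450.0],
})

# Rent plus condo fee; a missing condo fee is treated as 0.
# NaN in 'Aluguel' propagates, so sale-only rows get no rental total.
toy["price_rent_total"] = toy["Aluguel"] + toy["Condominio"].fillna(0)

print(toy)
```

The first row sums both values, the second keeps the rent unchanged, and the third stays NaN because there is no rent to total.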

In [24]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 320 entries, 0 to 319
Data columns (total 20 columns):
Aluguel             74 non-null float64
Condominio          218 non-null float64
IPTU                184 non-null float64
Venda               263 non-null float64
age_imovel          194 non-null object
area_total          243 non-null float64
area_util           303 non-null float64
banheiro            298 non-null float64
city                304 non-null object
commonareas         222 non-null object
diverseareas        80 non-null object
neigghborhood       304 non-null object
privateareas        258 non-null object
quarto              306 non-null float64
saletype            320 non-null object
street              300 non-null object
suite               276 non-null float64
title               320 non-null object
vaga                303 non-null float64
price_rent_total    74 non-null float64
dtypes: float64(11), object(9)
memory usage: 50.1+ KB


In [25]:
dataset.to_csv(path_or_buf='imovelweb_final.csv', index=False)

### Extra thoughts
In the age column ('age_imovel'), we could handle the 'Em construção' (under construction) items, which are not numeric.

In the extra feature columns ('commonareas', 'diverseareas', 'privateareas'), we could collect all the possible items and create one column for each of them, as was done for the numeric features ('quarto', 'suite', 'area_util', etc.).
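
A minimal sketch of that idea for one column, assuming the items inside 'commonareas' are separated by commas (the actual separator in the scraped data is not shown here, so this is an assumption). The `toy` frame and the `common_` prefix are hypothetical.

```python
import pandas as pd

# Hypothetical rows; the comma separator is an assumption about the scraped format.
toy = pd.DataFrame({
    "title": ["A", "B", "C"],
    "commonareas": ["Espaço Gourmet, Piscina", None, "Piscina"],
})

# Normalise spacing around commas, then build one indicator column per item.
# Rows with a missing value get 0 in every indicator column.
flags = (
    toy["commonareas"]
    .str.replace(r"\s*,\s*", ",", regex=True)
    .str.get_dummies(sep=",")
    .add_prefix("common_")
)

toy = toy.join(flags)
print(toy)
```

The same steps could be repeated for 'diverseareas' and 'privateareas', each with its own prefix to avoid column name clashes.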

Also, we could create categorical columns to indicate whether the property is for sale or for rent.
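
One possible way to derive such a column, sketched on hypothetical data: a listing counts as a rental when 'Aluguel' is filled and as a sale when 'Venda' is filled. The column name `offer_type` and its labels are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical listings: rent only, sale only, and one offered both ways.
toy = pd.DataFrame({
    "Aluguel": [1500.0, np.nan, 2000.0],
    "Venda": [np.nan, 300000.0, 450000.0],
})

has_rent = toy["Aluguel"].notna()
has_sale = toy["Venda"].notna()

# Conditions are checked in order, so "both" must come first.
toy["offer_type"] = np.select(
    [has_rent & has_sale, has_rent, has_sale],
    ["both", "rent", "sale"],
    default="unknown",
)
toy["offer_type"] = toy["offer_type"].astype("category")

print(toy)
```

Because rows without any price were dropped earlier, the "unknown" label should not appear in the real dataset; it is kept only as a safe default.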