Analysis of house prices in the Netherlands, based on data collected from Funda.
Install conda on your system. The commands below create the environment from environment.yml and activate it:
conda env create -f environment.yml
conda activate funda
Run the following commands from the repository root to pull data from Funda:
cd scrapy
scrapy crawl funda -O dump.json
Here funda is the name of the spider and dump.json is the file where the collected data is stored (the -O option overwrites the file if it already exists).
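To check that the crawl produced usable output before opening the notebooks, the dump can be loaded with the standard library. This is a minimal sketch, assuming dump.json is the JSON array that Scrapy writes for a .json feed; the helper name load_dump is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_dump(path):
    """Load scraped items from a Scrapy JSON export (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as f:
        items = json.load(f)
    # A .json feed export is expected to be a single JSON array of items.
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of scraped items")
    return items
```

For example, len(load_dump("scrapy/dump.json")) would give the number of collected listings, assuming the crawl was run from the scrapy folder as shown above.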
Run JupyterLab (installed as part of the environment) and navigate to the analysis folder:
jupyter-lab
Notebooks are located in the analysis folder.
The data is filtered to buildings constructed after 1990 with an area of 80-100 m², and to towns with at least 20 such properties.
Neighborhoods are not taken into account (every city has plenty of bad ones), but the top 25% of prices should give an approximate price for a property. In any case, this is only a sandbox for estimating starting bids; the actual sale price will differ.
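The filtering and the top-25% estimate can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code used in the notebooks: the field names year_built, area, town and price are hypothetical (the real names depend on the spider's output), and "top 25%" is read here as the 75th percentile of prices within each town.

```python
from collections import defaultdict
from statistics import quantiles


def top_quartile_by_town(listings, min_year=1990, min_area=80,
                         max_area=100, min_listings=20):
    """Return {town: 75th-percentile price} for the filtered listings.

    Field names (year_built, area, town, price) are assumptions.
    """
    prices_by_town = defaultdict(list)
    for item in listings:
        # Keep buildings constructed after min_year with area in range.
        if (item["year_built"] > min_year
                and min_area <= item["area"] <= max_area):
            prices_by_town[item["town"]].append(item["price"])

    result = {}
    for town, prices in prices_by_town.items():
        # Skip towns with too few matching properties.
        if len(prices) >= min_listings:
            # quantiles(n=4) returns the three quartile cut points;
            # index 2 is the boundary of the top 25%.
            result[town] = quantiles(prices, n=4)[2]
    return result
```

Using the quartile boundary rather than the mean keeps the estimate from being pulled down by cheap listings in weaker neighborhoods, which matches the reasoning above.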