Understanding housing prices

The plan:

First, I will scrape the following web sources for pricing information:

Once I have sufficient pricing data, I will try to understand why prices are the way they are.

Update to the plan:

It turns out most housing websites do a pretty good job of protecting against bots. For lack of time, I'm going to forgo live scraping and instead download each page and parse it locally. Given more time, I'd do a thorough investigation of user-agent strings, rotating IP addresses, VPNs, and alternative scraper technologies.
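As a rough illustration of the download-then-parse approach, here is a minimal sketch using only the Python standard library. It is not the code in parsers.py: the function names, the `html_pages` directory argument, and the idea that prices appear as dollar amounts in the visible page text are all assumptions made for the example.

```python
import re
from html.parser import HTMLParser
from pathlib import Path

# Assumption: prices appear in visible text as dollar amounts such as "$450,000".
PRICE_RE = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)")


class PriceExtractor(HTMLParser):
    """Collects dollar amounts from the visible text of one page."""

    def __init__(self):
        super().__init__()
        self.prices = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore JavaScript and CSS bodies
        for match in PRICE_RE.finditer(data):
            self.prices.append(int(match.group(1).replace(",", "")))


def extract_prices(html):
    """Return every dollar amount found in an HTML string, as integers."""
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    return parser.prices


def parse_saved_pages(directory):
    """Map each saved .html file name in `directory` to the prices found in it."""
    results = {}
    for path in sorted(Path(directory).glob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        results[path.name] = extract_prices(text)
    return results
```

Calling `parse_saved_pages("html_pages")` would then return a dictionary keyed by file name. A real listing page likely needs site-specific selectors rather than a blanket regex, so treat this as a starting point only.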

Installation

This project needs a few extra tools. To get the scraper to work, you'll need chromedriver (the driver for headless Chrome), and its binary has to be on your executable path. On Ubuntu, the answer seems to be to put the executable in /opt and then soft link it into /usr/local/bin. From the terminal this looks like:

sudo mv /opt/google/chromedriver /opt/
sudo ln -fs /opt/chromedriver /usr/local/bin/chromedriver

The reason for this seems to have something to do with the path hierarchy and where Google Chrome sits relative to its headless driver. I haven't replicated these instructions on macOS or Windows, but I imagine they will not be too dissimilar.
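To confirm that the driver ended up on the executable path, a quick check along these lines may help. This is a sketch, not part of the project: the helper names are made up for illustration, and it assumes the binary is called `chromedriver`, as in the commands above.

```python
import shutil
import subprocess


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver if it is on PATH, otherwise None."""
    return shutil.which(name)


def chromedriver_version(path):
    """Ask the driver binary for its version string."""
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    driver_path = find_chromedriver()
    if driver_path is None:
        print("chromedriver was not found on PATH; check the symlink in /usr/local/bin")
    else:
        print(f"Found {driver_path}: {chromedriver_version(driver_path)}")
```

If the lookup returns `None`, the symlink is probably missing or /usr/local/bin is not on your PATH.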

All other dependencies are listed in requirements.txt at the base of the project and can be installed with:

python -m pip install -r requirements.txt
