jmftrindade/redfin_analysis

Description

Dead-simple analysis (e.g., simple histograms) of recently sold homes using Redfin data.

Currently, the analysis covers only specific types of homes (2 to 3 bedrooms, 1.25+ bathrooms, 1,200+ sq ft) sold in the last 90 days, and only in select cities in the Greater Boston area.
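
The selection criteria above could be expressed as a pandas filter. This is a minimal sketch, not the notebook's actual code: the column names (`beds`, `baths`, `sqft`, `sold_date`) and the helper `filter_homes` are hypothetical and would need to match whatever the scraper writes out. The sample rows are made up for illustration.

```python
import pandas as pd

def filter_homes(df, today, days=90):
    """Keep 2-3 BR, 1.25+ BA, 1200+ sqft homes sold within `days` of `today`.

    Column names are assumptions, not the scraper's real schema.
    """
    cutoff = pd.Timestamp(today) - pd.Timedelta(days=days)
    sold = pd.to_datetime(df["sold_date"])
    mask = (
        df["beds"].between(2, 3)      # inclusive on both ends
        & (df["baths"] >= 1.25)
        & (df["sqft"] >= 1200)
        & (sold >= cutoff)
    )
    return df[mask]

# Made-up rows: only the first one satisfies every criterion.
homes = pd.DataFrame({
    "beds": [2, 4, 3, 3],
    "baths": [1.5, 2.0, 1.0, 2.0],
    "sqft": [1300, 2000, 1500, 1250],
    "sold_date": ["2024-05-01", "2024-05-10", "2024-05-15", "2023-01-01"],
})
recent = filter_homes(homes, today="2024-06-01")
```

Passing `today` explicitly, rather than reading the clock inside the function, keeps the 90-day window reproducible when the notebook is rerun later.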

Requirements

For running Jupyter notebooks:

# Install prereqs.
$ pip install wheel
$ pip install ipykernel jupyter

For running the "analysis" in the notebook:

# Data processing.
$ pip install pandas
$ pip install numpy

# Optional if you do some ML.
$ pip install scikit-learn

# For seaborn cumulative distplots (aka CDFs).
$ pip install seaborn statsmodels
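
For context, a cumulative distribution plot shows, for each sale price, the fraction of homes sold at or below that price. The sketch below computes that empirical CDF with NumPy alone, so the idea is visible without a plotting library. `ecdf` is a hypothetical helper, and the prices are made-up values, not Redfin data.

```python
import numpy as np

def ecdf(values):
    """Return the sorted values and the fraction of samples <= each one."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Made-up sale prices for illustration only.
prices = [650_000, 480_000, 720_000, 510_000]
x, y = ecdf(prices)
```

Plotting `x` against `y` as a step line should give the same kind of curve as a cumulative distplot.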

Scrape Redfin Data

The scraper is a slightly modified version of https://github.com/micahsteinberg/redfin-recently-sold-property-scraper.

Before running it, update the IDs of the cities of interest, which are currently hardcoded in the script.

The script expects Redfin city IDs, not neighborhood IDs. For Burlington, MA, for example, use "29663" (https://www.redfin.com/city/29663/MA/Burlington) and not "497396" (https://www.redfin.com/neighborhood/497396/MA/Burlington/Burlington).
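
One way to avoid mixing up the two kinds of ID is to read it from the URL and accept only city URLs. The sketch below assumes the URL shapes shown above (`/city/<id>/...` versus `/neighborhood/<id>/...`). `redfin_city_id` is a hypothetical helper and is not part of the scraper.

```python
from urllib.parse import urlparse

def redfin_city_id(url):
    """Return the city ID from a Redfin city URL, or None for any other URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # City URLs look like /city/<id>/<state>/<name>.
    if len(parts) >= 2 and parts[0] == "city" and parts[1].isdigit():
        return parts[1]
    return None

city = redfin_city_id("https://www.redfin.com/city/29663/MA/Burlington")
neighborhood = redfin_city_id(
    "https://www.redfin.com/neighborhood/497396/MA/Burlington/Burlington"
)
```

Here `city` is `"29663"`, while the neighborhood URL yields `None`, which flags an ID that should not be pasted into the script.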

To run the scraper:

$ python3 main.py

Run Notebook

$ jupyter notebook
