

Slack channel: #datasci-housingreport


This is a project of the Data Science Working Group at Code for San Francisco

The original ask, or how this project got started.

Here is a past report that gives a great overview of how the data has been used in the past.

Paula Chiu is our contact at SF Gov (she's part of our Slack group too).

These DSWG members contribute to this project. You can reach us on Slack at the handles below:

Name Slack Handle
Brian Goggin @bgoggin
Jeff Quinn @jfquinn
Arash Aghevli @arashaghevli
Tyler Field @tyler
Sanat Moningi @sanat
Earl Dos Santos @earldossantos
Juan Carlos Collins @juancarlos
Alwyna Lau @alwynalau
Paula Chiu @pchiu-sf
Geoffrey Pay @gpay
Angelique DeCastro @angeliquedecastro
Caressa Cunningham @caressalc27

Working Plan/Current Priorities

  1. Create a data model that can span several quarters, adjusting for the name mismatch (@brgoggin, @jfquinn, @arashaghevli)
  2. Analyze how long projects take to complete. Determine the relationship between project size and completion time. (@brgoggin, @juancarlos)
  3. Come up with a detailed UI design (@caressalc27, @alwynalau); no longer being worked on as of 10/13/2017
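As a rough illustration of the first priority, here is a minimal sketch of mapping differently named columns from separate quarters onto one canonical name. The aliases blocklot and blklot for APN come from the term dictionary in this README; every other column name, the sample values, and the helper names are hypothetical and are not taken from the actual data files.

```python
# Hypothetical alias table: normalized raw column name -> canonical name.
# "blocklot" and "blklot" are listed in this README's term dictionary;
# the "units" aliases are invented for illustration only.
COLUMN_ALIASES = {
    "blocklot": "apn",
    "blklot": "apn",
    "apn": "apn",
    "units": "units",
    "total_units": "units",
}


def normalize_name(raw):
    """Lowercase a header and turn spaces and hyphens into underscores."""
    return raw.strip().lower().replace("-", "_").replace(" ", "_")


def harmonize_row(row):
    """Return a copy of row keyed by canonical column names.

    Columns without a known alias are kept under their normalized name.
    """
    out = {}
    for raw_name, value in row.items():
        key = normalize_name(raw_name)
        out[COLUMN_ALIASES.get(key, key)] = value
    return out


# Made-up rows standing in for the same project in two quarters.
q1 = {"BLOCKLOT": "0000001", "Total Units": 120}
q2 = {"blklot": "0000001", "units": 110}
print(harmonize_row(q1))
print(harmonize_row(q2))
```

Keeping the aliases in one table means a newly discovered name mismatch only needs a new entry rather than a code change.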

Questions we want to answer

The main question(s) we want to tackle with an interactive visualization would be the following.

  • How long does it take a project to go from start to end?
    • Does it depend on neighborhood? Size of project?
    • What other factors does it depend on?
    • Which project statuses take the longest?
    • Is there a status at which it's common for projects to get cancelled?
  • At some point in a project's lifecycle the number of units is defined, but this number changes towards the end (it usually decreases). Typically, how many units do we lose over the course of a project?
    • What factors tend to lead to this?
  • How many projects are being built per neighborhood?

Keep in mind we want to look at this at the neighborhood and zoning district level, not at an individual project level.
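One possible way to approach the duration question at the neighborhood level is sketched below. It assumes, as a proxy only, that a project's time in the pipeline is the span between the first and last quarterly dataset in which it appears; a project that drops out may have been completed or cancelled, and this sketch does not distinguish the two. The quarter label format follows the dataset names listed in this README. The record layout, the function names, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def quarter_index(label):
    """Convert a label like '2014-Quarter-3' to a running quarter count."""
    year, _, quarter = label.split("-")
    return int(year) * 4 + (int(quarter) - 1)


def quarters_in_pipeline(labels):
    """Number of quarters between first and last appearance, inclusive."""
    indices = [quarter_index(label) for label in labels]
    return max(indices) - min(indices) + 1


def median_duration_by_neighborhood(records):
    """records: iterable of (neighborhood, project_id, quarter_label)."""
    seen = defaultdict(list)
    for neighborhood, project_id, label in records:
        seen[(neighborhood, project_id)].append(label)
    durations = defaultdict(list)
    for (neighborhood, _project_id), labels in seen.items():
        durations[neighborhood].append(quarters_in_pipeline(labels))
    return {name: median(values) for name, values in durations.items()}


# Made-up sample records, not real pipeline data.
sample = [
    ("Mission", "A", "2013-Quarter-1"),
    ("Mission", "A", "2014-Quarter-2"),
    ("Mission", "B", "2015-Quarter-1"),
    ("Mission", "B", "2015-Quarter-4"),
    ("SoMa", "C", "2012-Quarter-4"),
    ("SoMa", "C", "2016-Quarter-2"),
]
print(median_duration_by_neighborhood(sample))
```

The median is used here rather than the mean so that a few very long-running projects do not dominate a neighborhood's figure; grouping by zoning district instead would only require a different first field in each record.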

Other Questions (From previous meeting):

  • How many units are being built per neighborhood per time period?
    • How many of those are affordable?
  • How many projects are being built per neighborhood?
  • How much space designated as "light industrial" is being gained/lost per neighborhood?
  • Projects approved and filed over time:
    • what happens to the planning process per neighborhood
    • when were projects filed/approved/started/completed?
  • Size of project vs speed of getting on market?
  • A way to gauge compliance with Nov 2016's Measure X

How do I access the data?

See data/ for information about analyzing the data. The data is checked into the repository under data/cleaned, and you should not need to download it yourself for most purposes.


The pipeline dataset
The pipeline website
Notes from March 2017 convo with Paula
An obsolete column mapping Google Doc

See data/README.MD for details about the data

Dataset Source
[2009-Quarter-2] [Internal to Planning Department]
[2009-Quarter-3] [Internal to Planning Department]
[2010-Quarter-1] [Internal to Planning Department]
[2010-Quarter-2] [Internal to Planning Department]
[2010-Quarter-3] [Internal to Planning Department]
[2010-Quarter-4] [Internal to Planning Department]
[2011-Quarter-1] [Internal to Planning Department]
[2011-Quarter-2] [Internal to Planning Department]
[2011-Quarter-3] [Internal to Planning Department]
[2011-Quarter-4] [Internal to Planning Department]
2012-Quarter-1 2012-Quarter-1 api
2012-Quarter-2 2012-Quarter-2 api
[2012-Quarter-3] [Internal to Planning Department]
2012-Quarter-4 2012-Quarter-4 api
2013-Quarter-1 2013-Quarter-1 api
2013-Quarter-2 2013-Quarter-2 api
2013-Quarter-3 2013-Quarter-3 api
2013-Quarter-4 2013-Quarter-4 api
2014-Quarter-1 2014-Quarter-1 api
2014-Quarter-2 2014-Quarter-2 api
2014-Quarter-3 2014-Quarter-3 api
2014-Quarter-4 2014-Quarter-4 api
2015-Quarter-1 2015-Quarter-1 api
2015-Quarter-2 2015-Quarter-2 api
2015-Quarter-3 2015-Quarter-3 api
2015-Quarter-4 2015-Quarter-4 api
2016-Quarter-1 2016-Quarter-1 api
2016-Quarter-2 2016-Quarter-2 api

Annual Housing Inventory Reports

Affordable Housing Reports

Useful Term Dictionary

Entitlement Status: 0 = Under Planning Review, -1 = Approved By Planning
APN: Assessor Parcel Number (blocklot, blklot)
MIPS: Managerial, Information, Professional Services. (Same as Office)
CIE: Cultural, Institutional, Educational
PDR: Production, Distribution, Repair
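When working with the raw values, the terms above could be kept in small lookup tables. This is a minimal sketch: the codes and labels come from the dictionary above, while the variable and function names are hypothetical, and it assumes the status may arrive either as an integer or as a numeric string.

```python
# Codes and labels copied from the term dictionary in this README.
ENTITLEMENT_STATUS = {
    0: "Under Planning Review",
    -1: "Approved By Planning",
}

LAND_USE = {
    "MIPS": "Managerial, Information, Professional Services (Office)",
    "CIE": "Cultural, Institutional, Educational",
    "PDR": "Production, Distribution, Repair",
}


def describe_entitlement(code):
    """Return the label for a status code given as an int or numeric string."""
    try:
        return ENTITLEMENT_STATUS[int(code)]
    except (KeyError, ValueError, TypeError):
        # Unrecognized or non-numeric values are reported, not raised.
        return "Unknown ({!r})".format(code)


print(describe_entitlement(0))
print(describe_entitlement("-1"))
print(LAND_USE["PDR"])
```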

Setting up Python Environment

First make sure you have python3 and virtualenv installed.

Run this command to make a virtualenv:

virtualenv --python=$(which python3) VE

Run this command to enter the virtualenv:

source VE/bin/activate

Then run these commands to install the dependencies (the first one uses Homebrew to install GDAL):

brew install gdal --HEAD
pip install -r requirements.txt
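To confirm that the virtualenv is active before installing or running anything, a quick check like the following may help. It is a sketch rather than part of the project; the function name is hypothetical, and it relies only on the standard sys module.

```python
import sys


def in_virtualenv():
    """True when the interpreter is running inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; newer releases (and
    # the built-in venv) make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtualenv active:", in_virtualenv())
print("python version:", sys.version.split()[0])
```

If the first line prints False, run the source VE/bin/activate step again in the current shell.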