# Housing Data System
This repo (name still undecided) pulls data from the Census API, and eventually from other sources too, to make it easy to answer the kinds of questions the Housing Campaign often gets, such as "Did rents in Gulfport really increase by 90% last year?"

## Setup

In [1]:
import pandas as pd
from us import states
from typing import Tuple, List
from pprint import pprint
from housingresearch.systems import CensusClient, Query, ACS_TABLES
from housingresearch.config import settings

Records = List[dict]

Initialize a `CensusClient` explorer.

In [2]:
cen = CensusClient()

## Get a variable across all census tracts in a "place"
Note that "place" is the Census API's term for a city or town (more precisely, an incorporated place or census-designated place). Here I'm querying Bushnell, which has just three tracts.

In [3]:
specs = [
    {
        "table_name": "B25070",
        "survey": "acs5",
        "year": 2019,
        "state": "FL",
        "place": "09625",
        "level": "tract",
    }
]
cen.run_queries(specs)

2022-04-28 14:24:03.335 | INFO     | housingresearch.systems.census.client:get_data_by_tract:100 - Querying Census API.


Found lookup data for table B25070.


Finding max/min from statistics failed. Trying OID enumeration.
Traceback (most recent call last):
  File "/Users/james/Documents/Data_Projects/housing_research/env/lib/python3.9/site-packages/esridump/dumper.py", line 352, in __iter__
    (oid_min, oid_max) = self._get_layer_min_max(oid_field_name)
  File "/Users/james/Documents/Data_Projects/housing_research/env/lib/python3.9/site-packages/esridump/dumper.py", line 187, in _get_layer_min_max
    metadata = self._handle_esri_errors(response, "Could not retrieve min/max oid values")
  File "/Users/james/Documents/Data_Projects/housing_research/env/lib/python3.9/site-packages/esridump/dumper.py", line 107, in _handle_esri_errors
    raise EsriDownloadError("{}: {} {}" .format(
esridump.errors.EsriDownloadError: Could not retrieve min/max oid values: Error performing query operation 
2022-04-28 14:24:16.117 | INFO     | housingresearch.systems.census.client:get_data_by_tract:100 - Querying Census API.


EsriDownloadError was raised.


2022-04-28 14:24:40.730 | INFO     | housingresearch.systems.census.client:get_data_by_tract:108 - Query successful!
2022-04-28 14:24:40.731 | INFO     | housingresearch.systems.census.client:__init__:67 - Query instantiated.
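Since each spec is a plain dict, a year-over-year question (like the Gulfport one above) could start from one spec per year. Here is a minimal sketch: `specs_for_years` and the `Spec` alias are hypothetical helpers, not part of `housingresearch`, and whether `run_queries` handles several specs in one call is an assumption based on its list argument.

```python
from typing import Dict, List, Union

# Hypothetical alias for the spec dicts shown above.
Spec = Dict[str, Union[str, int]]


def specs_for_years(base: Spec, years: List[int]) -> List[Spec]:
    """Return one copy of `base` per year, changing only the "year" key."""
    return [{**base, "year": year} for year in years]


base_spec: Spec = {
    "table_name": "B25070",
    "survey": "acs5",
    "year": 2019,
    "state": "FL",
    "place": "09625",
    "level": "tract",
}

multi_year_specs = specs_for_years(base_spec, [2017, 2018, 2019])
# Assumption: this list could then be passed to cen.run_queries(multi_year_specs).
```

The base spec is copied rather than mutated, so the original 2019 query definition stays usable on its own.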


Confirm that three tracts were returned.

In [4]:
query = cen.queries[0]
print(len(query.api_results))

3


Here's the format the results come back in: a list of dicts, one dict per tract. This notebook calls that the `Records` format (see the `Records` alias in Setup).

In [5]:
pprint(query.api_results[0])

{'B25070_001E': 285.0,
 'B25070_001M': 81.0,
 'B25070_002E': 21.0,
 'B25070_002M': 23.0,
 'B25070_003E': 6.0,
 'B25070_003M': 12.0,
 'B25070_004E': 11.0,
 'B25070_004M': 17.0,
 'B25070_005E': 21.0,
 'B25070_005M': 23.0,
 'B25070_006E': 117.0,
 'B25070_006M': 68.0,
 'B25070_007E': 11.0,
 'B25070_007M': 16.0,
 'B25070_008E': 14.0,
 'B25070_008M': 18.0,
 'B25070_009E': 10.0,
 'B25070_009M': 15.0,
 'B25070_010E': 74.0,
 'B25070_010M': 48.0,
 'B25070_011E': 0.0,
 'B25070_011M': 14.0,
 'county': '119',
 'state': '12',
 'tract': '910402'}
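In ACS variable names, the `E` suffix marks an estimate and the `M` suffix marks its margin of error. A minimal sketch of pairing them up for a single record follows; `split_estimates` is a hypothetical helper, not something `housingresearch` provides, and the sample record is a shortened copy of the output above.

```python
from typing import Dict, Optional, Tuple


def split_estimates(
    record: dict, table_name: str
) -> Tuple[Dict[str, Tuple[float, Optional[float]]], dict]:
    """Split one record into (estimate, margin) pairs and geography fields."""
    prefix = table_name + "_"
    values: Dict[str, Tuple[float, Optional[float]]] = {}
    geography: dict = {}
    for key, value in record.items():
        if key.startswith(prefix) and key.endswith("E"):
            # Look up the matching margin-of-error key, e.g. B25070_001M.
            margin = record.get(key[:-1] + "M")
            values[key[len(prefix):-1]] = (value, margin)
        elif not key.startswith(prefix):
            # Anything outside the table prefix is a geography identifier.
            geography[key] = value
    return values, geography


sample = {
    "B25070_001E": 285.0, "B25070_001M": 81.0,
    "B25070_002E": 21.0, "B25070_002M": 23.0,
    "county": "119", "state": "12", "tract": "910402",
}
values, geography = split_estimates(sample, "B25070")
```

Keeping each margin next to its estimate makes it harder to report a small-tract number without noticing how wide its uncertainty is.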


Here's what we see if we convert that list of dicts to a table. 

In [6]:
df = pd.DataFrame.from_records(query.api_results)

In [7]:
df["year"] = query.year
df["place"] = query.place
df["table"] = query.table_name
df.columns = [col.replace(query.table_name + "_", "") for col in df.columns]
df

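| | 001E | 002E | 003E | 004E | 005E | 006E | 007E | 008E | 009E | 010E | ... | 008M | 009M | 010M | 011M | state | county | tract | year | place | table |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 285.0 | 21.0 | 6.0 | 11.0 | 21.0 | 117.0 | 11.0 | 14.0 | 10.0 | 74.0 | ... | 18.0 | 15.0 | 48.0 | 14.0 | 12 | 119 | 910402 | 2019 | 9625 | B25070 |
| 1 | 413.0 | 0.0 | 27.0 | 58.0 | 44.0 | 56.0 | 23.0 | 6.0 | 76.0 | 62.0 | ... | 11.0 | 61.0 | 47.0 | 37.0 | 12 | 119 | 910500 | 2019 | 9625 | B25070 |
| 2 | 376.0 | 0.0 | 18.0 | 27.0 | 30.0 | 29.0 | 17.0 | 68.0 | 34.0 | 122.0 | ... | 76.0 | 36.0 | 83.0 | 24.0 | 12 | 119 | 910601 | 2019 | 9625 | B25070 |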
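As a rough illustration of how rows like these could feed a Housing Campaign question, here is a sketch that computes the share of renter households paying 30% or more of income on rent. It rests on two assumptions that should be checked against the table's lookup data: that `007E` through `010E` are the "30 percent or more" brackets of B25070 and `011E` is "not computed", and that 30% is the burden threshold of interest. `rent_burdened_share` is a hypothetical helper, and the sample row copies the first tract's estimates from the record printed earlier.

```python
from typing import Optional

# Assumed B25070 brackets at or above 30% of income (verify against lookup data).
BURDENED_COLUMNS = ["007E", "008E", "009E", "010E"]


def rent_burdened_share(row: dict) -> Optional[float]:
    """Share of households with a computed rent ratio that pay 30% or more."""
    # Exclude households whose rent-to-income ratio was not computed.
    computed = row["001E"] - row["011E"]
    if computed <= 0:
        return None
    return sum(row[col] for col in BURDENED_COLUMNS) / computed


first_tract = {
    "001E": 285.0, "007E": 11.0, "008E": 14.0,
    "009E": 10.0, "010E": 74.0, "011E": 0.0,
}
share = rent_burdened_share(first_tract)
```

This ignores the margins of error, which are large relative to the estimates in tracts this small, so any share computed this way is best treated as indicative only.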
