# Grabbing data with cenpy

Cenpy (`sen - pie`) is a package that exposes APIs from the US Census Bureau and makes it easy to pull down and work with Census data in Pandas. The package has two core modules, `base` and `explorer`, which do different things. Let's look at `explorer` first.

In [1]:
import cenpy as c
import pandas

On import, `explorer` requests all currently available APIs from the Census Bureau's [API listing](http://www.census.gov/data/developers/data-sets.html). In the future, it will also be able to read a `JSON` collection describing the databases from disk, if asked.

Explorer has two functions, `available` and `explain`. `available` will provide a list of the identifiers of all the APIs that `cenpy` knows about. If run with `verbose=True`, it instead returns a dictionary that maps each identifier to the title of its database. It's a good idea to *not* process this directly, and instead use it to explore currently available APIs. Here, I'll just show the first five entries:

In [2]:
{k:v for i, (k,v) in enumerate(c.explorer.available(verbose=True).items()) if i < 5}

{'ACSSF5Y2013': '2009-2013 American Community Survey 5-Year Estimates',
 'POPESTagespecialPR': 'Vintage 2014 Population Estimates: Puerto Rico Commonwealth and Municipios Annual Resident Population Estimates by Age Groups and Sex',
 'POPESTprm2014': 'Vintage 2014 Population Estimates: Puerto Rico Municipios Total Population',
 'ftd15ImpExpHist': '2015 International Trade: Historical Imports and Exports',
 'ftdImpExpHist': '2014 International Trade: Historical Imports and Exports'}

The `explain` command provides the title and full description of the datasource. If run in verbose mode, the function returns the full `json` listing of the API. 

In [3]:
c.explorer.explain('2011acs5')

{'2011 American Community Survey: 5-Year Estimates': 'The American Community Survey (ACS) is an ongoing survey that provides data every year -- giving communities the current information they need to plan investments and services. The ACS covers a broad range of topics about social, economic, demographic, and housing characteristics of the U.S. population.  Summary files include the following geographies: nation, all states (including DC and Puerto Rico), all metropolitan areas, all congressional districts (114th congress), all counties, all places, and all tracts and block groups.  Summary files contain the most detailed cross-tabulations, many of which are published down to block groups. The data are population and housing counts. There are over 64,000 variables in this dataset.'}

To actually connect to a database resource, you create a `Connection`. A `Connection` works like a *very* simplified connection from the `sqlalchemy` world. The `Connection` class has a method, `query`, that constructs a query string and requests it from the Census server. The JSON response is then parsed and returned to the user as a pandas dataframe.

In [4]:
conn = c.base.Connection('DecennialSF12010')

In [5]:
conn

Connection to 2010 Decennial: Summary File 1 (ID: http://api.census.gov/data/id/DecennialSF12010)

That may have taken longer than you'd've expected. This is because, when the `Connection` constructor is called, it populates the connection object with a bit of metadata that makes it possible to construct queries without referring to the census handbooks. 

For instance, a connection's `variables` represent all available search parameters for a given dataset. 

In [6]:
conn.variables.head()

Unnamed: 0,concept,label,predicateOnly,predicateType
AIANHH,Geographic Summary Level,GEO PLACE HOLDER,,
AIANHHCC,Geographic Characteristics,GEO PLACE HOLDER,,
AIANHHFP,Geographic Characteristics,GEO PLACE HOLDER,,
AIHHTLI,Geographic Characteristics,GEO PLACE HOLDER,,
AITS,Geographic Characteristics,GEO PLACE HOLDER,,


This dataframe is populated just like the Census's table describing the variables on the corresponding [API website](http://api.census.gov/data/2010/sf1/variables.html). Fortunately, this means that you can modify and filter this dataframe just like any regular pandas dataframe, so working out the exact codes to use in your query is easy.

I've added a function, `varslike`, that globs variables that fit a regular expression pattern. It can use the builtin python `re` module, in addition to the `fnmatch` module. It also can use any filtering function you want. 

So, you can extract the rows of the variables table by passing the list of names that match your expression to `df.loc` (the older `df.ix` indexer has been removed from pandas):

In [7]:
conn.variables.loc[conn.varslike('H011[AB]')]

Unnamed: 0,concept,label,predicateOnly,predicateType
H011A0001,H11A. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Population in occupied housing units with a ho...,,
H011A0002,H11A. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Owned with a mortgage or a loan,,
H011A0003,H11A. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Owned free and clear,,
H011A0004,H11A. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Renter occupied,,
H011B0001,H11B. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Population in occupied housing units with a ho...,,
H011B0002,H11B. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Owned with a mortgage or a loan,,
H011B0003,H11B. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Owned free and clear,,
H011B0004,H11B. TOTAL POPULATION IN OCCUPIED HOUSING UNI...,Renter occupied,,
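
As a rough illustration of the kind of matching `varslike` performs, the sketch below filters a list of variable names with the `re` module, the `fnmatch` module, or an arbitrary predicate. The helper name `match_variables` and its arguments are hypothetical, and anchoring the regular expression at the start of the name (`re.match`) is an assumption. This is not cenpy's own implementation.

```python
import fnmatch
import re


def match_variables(names, pattern=None, engine="re", predicate=None):
    """Return the names selected by a regex, a glob, or a custom predicate.

    Hypothetical helper: a sketch of the idea, not cenpy's own code.
    """
    if predicate is not None:
        # Any callable that maps a name to True/False can act as the filter.
        return [name for name in names if predicate(name)]
    if engine == "re":
        # Assumption: the regex is anchored at the start of the name.
        compiled = re.compile(pattern)
        return [name for name in names if compiled.match(name)]
    if engine == "fnmatch":
        # Shell-style globbing, where * matches any run of characters.
        return fnmatch.filter(names, pattern)
    raise ValueError("unknown engine: {!r}".format(engine))


names = ["H011A0001", "H011B0004", "H011C0001", "H0020001", "NAME"]
regex_hits = match_variables(names, "H011[AB]")
glob_hits = match_variables(names, "H00[012]*", engine="fnmatch")
custom_hits = match_variables(names, predicate=lambda name: name.endswith("0001"))
```

Either way, the result is a plain list of index labels, which is what makes it usable for selecting rows of the variables table.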


Likewise, the different levels of geographic scale are determined from the metadata in the overall API listing and recorded. 

However, many Census products have multiple possible geographical indexing systems, like the deprecated `fips` code system and the new *Geographical Names Information System*, `gnis`. Thus, the `geographies` property is a dictionary of dataframes, where each key is the name of the identifier system and the value is the dataframe describing the identifier system. 

For the 2010 census, only `fips` and `gnis` systems are available. 

In [8]:
conn.geographies.keys()

dict_keys(['fips', 'gnis'])

In [9]:
conn.geographies['fips']

Unnamed: 0,geoLevelId,name,optionalWithWCFor,requires,wildcard
0,40,state,,,
1,50,county,state,[state],[state]
2,60,county subdivision,,"[state, county]",
3,67,subminor civil subdivision,,"[state, county, county subdivision]",
4,101,block,tract,"[state, county, tract]",[tract]
5,140,tract,county,"[state, county]",[county]
6,150,block group,tract,"[state, county, tract]",[tract]
7,160,place,state,[state],[state]
8,230,alaska native regional corporation,state,[state],[state]
9,280,american indian area/alaska native area/hawaii...,state,[state],[state]


In [10]:
conn.geographies['gnis']

Unnamed: 0,geoLevelId,name
0,170,consolidated city
1,50,county
2,60,county subdivision
3,160,place
4,40,state
5,67,subminor civil subdivision


Note that some geographies in the `fips` system have a **required** filter to prevent drawing too much data. This will get passed to the `query` method later. 
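
One way to read the `requires` column: before sending a query, check that the filter names every parent geography the unit needs. The sketch below hard-codes a few rows from the `fips` table above. The `REQUIRES` dictionary and the `missing_filters` function are hypothetical names for illustration, not part of cenpy, and the sketch ignores the `wildcard` and `optionalWithWCFor` columns.

```python
# A few rows of the `requires` column from the fips table, copied by hand.
REQUIRES = {
    "state": [],
    "county": ["state"],
    "tract": ["state", "county"],
    "block group": ["state", "county", "tract"],
    "place": ["state"],
}


def missing_filters(geo_unit, geo_filter):
    """List the required parent geographies that geo_filter does not supply."""
    # geo_unit looks like 'place:*'; only the part before the colon matters here.
    level = geo_unit.split(":", 1)[0]
    return [parent for parent in REQUIRES[level] if parent not in geo_filter]


ok = missing_filters("place:*", {"state": "04"})
incomplete = missing_filters("tract:*", {"state": "04"})
```

An empty list means the filter is complete; anything else names the levels still missing.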

So, let's just grab the housing information from the 2010 Census Summary File 1. Using the variables table above, we can pick out the subset of fields we want. Since the variables table is indexed by the identifiers, we can use the index of the filtered dataframe as query parameters.

In addition, adding the `NAME` field smart-fills the table with the name of the geographic entity being pulled from the Census.

In [11]:
cols = conn.varslike('H00[012]*', engine='fnmatch')

In [12]:
cols.append('NAME')

In [13]:
cols

['H00010001',
 'H0020001',
 'H0020002',
 'H0020003',
 'H0020004',
 'H0020005',
 'H0020006',
 'NAME']

Now the query. The query is constructed just like the API query, and takes the following arguments:

1. cols - list of columns desired from the database, maps to the census API's `get=`
2. geo_unit - string denoting the unit of study to pull, maps to the census API's `for=`
3. geo_filter - dictionary containing groupings of geo_units, if required, maps to `in=`

To be specific, a full query tells the server *what* columns to pull for *what* underlying geography within *what* aggregation units. It's structured using these heterogeneous datatypes so it's easy to change the smallest units quickly, while providing sufficient granularity to change the filters and columns as you go.

The query below grabs the names and the housing variables selected above for every Census place in Arizona.


In [14]:
data = conn.query(cols, geo_unit = 'place:*', geo_filter = {'state':'04'})


Once constructed, the query executes as fast as your internet connection will move. This query has:

In [15]:
data.shape

(451, 10)

10 columns and 451 rows. So, rather fast.

For validity and ease of use, we store the last executed query on the connection object. If you're wary of your Census API key being shown in plaintext, never print this property!

In [16]:
conn.last_query

'http://api.census.gov/data/2010/sf1?get=H00010001,H0020001,H0020002,H0020003,H0020004,H0020005,H0020006,NAME&for=place:*&in=state:04'
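
To make the mapping between the `query` arguments and the URL concrete, the sketch below rebuilds a shortened version of the query string above by hand. The function name `build_query_url` is hypothetical, and joining several filters with `+` is an assumption rather than something cenpy is known to do. No API key is added.

```python
def build_query_url(base_url, cols, geo_unit, geo_filter=None):
    """Assemble a Census API URL from columns, a unit, and optional filters."""
    # get= takes the comma-separated column names.
    url = "{}?get={}".format(base_url, ",".join(cols))
    # for= takes the unit being pulled, e.g. 'place:*'.
    url += "&for={}".format(geo_unit)
    if geo_filter:
        # in= takes the enclosing geographies, e.g. 'state:04'.
        parts = ["{}:{}".format(k, v) for k, v in geo_filter.items()]
        url += "&in={}".format("+".join(parts))
    return url


url = build_query_url(
    "http://api.census.gov/data/2010/sf1",
    ["H00010001", "H0020001", "NAME"],
    geo_unit="place:*",
    geo_filter={"state": "04"},
)
```

Here `geo_unit` ends up in `for=` and `geo_filter` in `in=`, matching the stored query.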

So, you have a dataframe with the information requested, plus the fields specified in the `geo_filter` and `geo_unit`.

Below, I've filtered it to only show places where the number of housing units (`H00010001`) is larger than 40 thousand.

Pretty neat!

In [17]:
data[data['H00010001'] > 40000]

Unnamed: 0,H00010001,H0020001,H0020002,H0020003,H0020004,H0020005,H0020006,NAME,state,place
63,94404,94404,0,0,0,0,94404,Chandler city,4,12000
146,74907,74907,0,0,0,0,74907,Gilbert town,4,27400
148,90505,90505,0,0,0,0,90505,Glendale city,4,27820
224,201173,201173,0,0,0,0,201173,Mesa city,4,46000
266,64818,64818,0,0,0,0,64818,Peoria city,4,54050
268,590149,590149,0,0,0,0,590149,Phoenix city,4,55000
328,124001,124001,0,0,0,0,124001,Scottsdale city,4,65000
366,52586,52586,0,0,0,0,52586,Surprise city,4,71510
375,73462,73462,0,0,0,0,73462,Tempe city,4,73000
394,229762,229762,0,0,0,0,229762,Tucson city,4,77000


And, just in case you're liable to forget your FIPS codes, the explorer module can look up FIPS code listings for you.

In [18]:
c.explorer.fips_table('place', in_state='AZ')

reading http://www2.census.gov/geo/docs/reference/codes/files/st04_az_places.txt


Unnamed: 0,0,1,2,3,4,5,6
0,AZ,4,730,Aguila CDP,Census Designated Place,S,Maricopa County
1,AZ,4,870,Ajo CDP,Census Designated Place,S,Pima County
2,AZ,4,940,Ak Chin CDP,Census Designated Place,S,Pima County
3,AZ,4,1090,Ak-Chin Village CDP,Census Designated Place,S,Pinal County
4,AZ,4,1170,Alamo Lake CDP,Census Designated Place,S,La Paz County
5,AZ,4,1560,Ali Chuk CDP,Census Designated Place,S,Pima County
6,AZ,4,1570,Ali Chukson CDP,Census Designated Place,S,Pima County
7,AZ,4,1620,Ali Molina CDP,Census Designated Place,S,Pima County
8,AZ,4,1920,Alpine CDP,Census Designated Place,S,Apache County
9,AZ,4,1990,Amado CDP,Census Designated Place,S,Santa Cruz County
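
Note that the tables here show FIPS codes as integers, so leading zeros are lost (`4` rather than `04`). A minimal sketch of restoring them, assuming the standard widths of two digits for a state code and five for a place code. The helper name `place_geoid` is hypothetical.

```python
def place_geoid(state, place):
    """Build a 7-character place GEOID from numeric state and place codes."""
    # State FIPS codes are 2 digits wide and place codes are 5 digits wide.
    return str(state).zfill(2) + str(place).zfill(5)


aguila = place_geoid(4, 730)
phoenix = place_geoid(4, 55000)
```

The padded form is what a query filter such as `{'state':'04'}` expects.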


## GEO & TIGER Integration

The Census TIGER geometry API is substantively different from every other API, in that it's an ArcGIS REST API. But, I've tried to expose a consistent interface. It works like this:

In [19]:
import cenpy.tiger as tiger

In [20]:
tiger.available()

[{'name': 'AIANNHA', 'type': 'MapServer'},
 {'name': 'CBSA', 'type': 'MapServer'},
 {'name': 'Hydro_LargeScale', 'type': 'MapServer'},
 {'name': 'Hydro', 'type': 'MapServer'},
 {'name': 'Labels', 'type': 'MapServer'},
 {'name': 'Legislative', 'type': 'MapServer'},
 {'name': 'Places_CouSub_ConCity_SubMCD', 'type': 'MapServer'},
 {'name': 'PUMA_TAD_TAZ_UGA_ZCTA', 'type': 'MapServer'},
 {'name': 'Region_Division', 'type': 'MapServer'},
 {'name': 'School', 'type': 'MapServer'},
 {'name': 'Special_Land_Use_Areas', 'type': 'MapServer'},
 {'name': 'State_County', 'type': 'MapServer'},
 {'name': 'tigerWMS_ACS2013', 'type': 'MapServer'},
 {'name': 'tigerWMS_ACS2014', 'type': 'MapServer'},
 {'name': 'tigerWMS_ACS2015', 'type': 'MapServer'},
 {'name': 'tigerWMS_ACS2016', 'type': 'MapServer'},
 {'name': 'tigerWMS_Census2010', 'type': 'MapServer'},
 {'name': 'tigerWMS_Current', 'type': 'MapServer'},
 {'name': 'tigerWMS_ECON2012', 'type': 'MapServer'},
 {'name': 'tigerWMS_PhysicalFeatures', 'type': 

In some cases, it makes quite a bit of sense to "attach" a map server to your connection. In the case of the US Census 2010 we've been using, there is an obvious data product match in `tigerWMS_Census2010`. So, let's attach it to the connection.

In [21]:
conn.set_mapservice('tigerWMS_Census2010')

In [22]:
conn.mapservice

<cenpy.tiger.TigerConnection at 0x7fdb464bbfd0>

Neat! This is the same as calling:

`tiger.TigerConnection('tigerWMS_Census2010')`

but it also attaches that object to the connection you've been using. The connection's representation also updates with this information:

In [23]:
conn

Connection to 2010 Decennial: Summary File 1(ID: http://api.census.gov/data/id/DecennialSF12010)
With MapServer: Census 2010 WMS

An ESRI MapServer is a big thing, and `cenpy` doesn't support all of its features. Since `cenpy` is designed to support retrieval of data from the US Census, we only support `GET` statements for defined geographic units, and ignore the various other functionalities in the service.

To work with a service, note that any map server is composed of layers:

In [24]:
conn.mapservice.layers

{0: (ESRILayer) Public Use Microdata Areas,
 1: (ESRILayer) Public Use Microdata Areas Labels,
 2: (ESRILayer) Traffic Analysis Districts,
 3: (ESRILayer) Traffic Analysis Districts Labels,
 4: (ESRILayer) Traffic Analysis Zones,
 5: (ESRILayer) Traffic Analysis Zones Labels,
 6: (ESRILayer) Urban Growth Areas,
 7: (ESRILayer) Urban Growth Areas Labels,
 8: (ESRILayer) ZIP Code Tabulation Areas,
 9: (ESRILayer) ZIP Code Tabulation Areas Labels,
 10: (ESRILayer) Tribal Census Tracts,
 11: (ESRILayer) Tribal Census Tracts Labels,
 12: (ESRILayer) Tribal Block Groups,
 13: (ESRILayer) Tribal Block Groups Labels,
 14: (ESRILayer) Census Tracts,
 15: (ESRILayer) Census Tracts Labels,
 16: (ESRILayer) Census Block Groups,
 17: (ESRILayer) Census Block Groups Labels,
 18: (ESRILayer) Census Blocks,
 19: (ESRILayer) Census Blocks Labels,
 20: (ESRILayer) Unified School Districts,
 21: (ESRILayer) Unified School Districts Labels,
 22: (ESRILayer) Secondary School Districts,
 23: (ESRILayer) Sec

These layers are what actually implement query operations. For now, let's focus on the same "class" of units we were using before, Census Designated Places:

In [25]:
conn.mapservice.layers[36]

(ESRILayer) Census Designated Places
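
Scanning the printed listing for an ID gets tedious. Below is a small sketch of looking a layer up by name, using a plain dictionary of names as a stand-in for `conn.mapservice.layers`. The real values are `ESRILayer` objects, so actual code would need to read the name from each layer, and how to do that is not shown here. `find_layers` is a hypothetical helper.

```python
# Stand-in for the layer listing: ID -> layer name, copied from the output above.
layer_names = {
    14: "Census Tracts",
    15: "Census Tracts Labels",
    16: "Census Block Groups",
    36: "Census Designated Places",
}


def find_layers(names, text):
    """Return the IDs whose layer name contains text, ignoring case."""
    return [i for i, name in names.items() if text.lower() in name.lower()]


tract_ids = find_layers(layer_names, "tracts")
place_ids = find_layers(layer_names, "designated places")
```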

A query function is implemented both at the mapservice level and the layer level. At the mapservice level, a layer ID is required in order to complete the query. 

Mapservice queries are driven by SQL. So, to grab all of the geodata that fits the CDPs we pulled before, you could start to construct it like this. 

First, just like the main connection, each layer has a set of variables: 

In [26]:
conn.mapservice.layers[36].variables

Unnamed: 0,alias,domain,length,name,type
0,MTFCC,,5.0,MTFCC,esriFieldTypeString
1,OID,,,OID,esriFieldTypeDouble
2,GEOID,,7.0,GEOID,esriFieldTypeString
3,STATE,,2.0,STATE,esriFieldTypeString
4,PLACE,,5.0,PLACE,esriFieldTypeString
5,BASENAME,,100.0,BASENAME,esriFieldTypeString
6,NAME,,100.0,NAME,esriFieldTypeString
7,LSADC,,2.0,LSADC,esriFieldTypeString
8,FUNCSTAT,,1.0,FUNCSTAT,esriFieldTypeString
9,PLACECC,,2.0,PLACECC,esriFieldTypeString


Our prior query grabbed the places in AZ. So, we could use a SQL query that focuses on that. 

I try to pack the geometries into containers that people are used to using. Without knowing if GEOS is installed on a user's computer, I use `PySAL` as the target geometry type.

If you do have GEOS, that means you can use Shapely or GeoPandas. To choose your backend, the query function takes two arguments, `pkg` and `gpize`. The `pkg` argument lets you choose among the three types of Python objects to output.

PySAL is the default. If you select Shapely, the result will just be a pandas dataframe with Shapely geometries instead of PySAL geometries. If you choose GeoPandas (or pass the `gpize` option), cenpy will try to convert the pandas dataframe into a GeoPandas dataframe.

In [27]:
geodata = conn.mapservice.query(layer=36, where='STATE = 04')

In [28]:
geodata.head()

Unnamed: 0,AREALAND,AREAWATER,BASENAME,CBSAPCI,CENTLAT,CENTLON,FUNCSTAT,GEOID,HU100,INTPTLAT,...,NECTAPCI,OBJECTID,OID,PLACE,PLACECC,PLACENS,POP100,STATE,UR,geometry
0,13352095,0,Topawa,N,31.807822,-111.830486,S,474680,135,31.807822,...,N,19866,280403717476697,74680,U1,2582880,299,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb443d6...
1,42998662,2536,Dilkon,N,35.3606097,-110.3155452,S,419280,361,35.3529051,...,N,19926,280401240982653,19280,U1,2408670,1184,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb443d6...
2,13063355,14833,Rio Verde,N,33.7266382,-111.6761487,S,460250,1647,33.7265919,...,N,19939,280401230483010,60250,U1,2409181,1811,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb5b918...
3,16150048,0,Sacaton Flats Village,N,33.0558972,-111.6589922,S,461800,168,33.0558972,...,N,19830,280403850591491,61800,U2,2612143,541,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb5b8c0...
4,4067735,0,Aguila,N,33.9375158,-113.1664832,S,400730,304,33.9375158,...,N,19887,280403717476713,730,U1,2582720,798,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb5b8d8...
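
For orientation, an ArcGIS REST layer query is an HTTP request against the layer's `query` endpoint, with the SQL clause passed as the `where` parameter. The sketch below only builds such a URL; it makes no request and is not cenpy's code. The base URL for the Census 2010 service is an assumption, and `layer_query_url` is a hypothetical name.

```python
from urllib.parse import urlencode


def layer_query_url(base_url, layer, where, out_fields="*"):
    """Build the URL for a where-clause query against one map service layer."""
    params = {
        "where": where,           # SQL-style filter, e.g. "STATE = 04"
        "outFields": out_fields,  # '*' asks for every attribute field
        "returnGeometry": "true",
        "f": "json",              # response format
    }
    return "{}/{}/query?{}".format(base_url, layer, urlencode(params))


# Assumed base URL for the Census 2010 map service.
base = "http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/tigerWMS_Census2010/MapServer"
url = layer_query_url(base, 36, "STATE = 04")
```

`urlencode` takes care of escaping the spaces and the `=` inside the `where` clause.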


To join the geodata to the other data, use pandas functions:

In [29]:
import pandas as pd

In [30]:
newdata = pd.merge(data, geodata, left_on='place', right_on='PLACE')

In [31]:
newdata.head()

Unnamed: 0,H00010001,H0020001,H0020002,H0020003,H0020004,H0020005,H0020006,NAME_x,state,place,...,NECTAPCI,OBJECTID,OID,PLACE,PLACECC,PLACENS,POP100,STATE,UR,geometry
0,304,304,0,0,0,0,304,Aguila CDP,4,730,...,N,19887,280403717476713,730,U1,2582720,798,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb5b8d8...
1,2175,2175,0,0,0,0,2175,Ajo CDP,4,870,...,N,24756,280401254189026,870,U1,2407704,3304,4,M,<pysal.cg.shapes.Polygon object at 0x7fdb40ae8...
2,11,11,0,0,0,0,11,Ak Chin CDP,4,940,...,N,20442,280403717476626,940,U1,2582721,30,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb41dab...
3,256,256,0,0,0,0,256,Ak-Chin Village CDP,4,1090,...,N,22362,280401260231698,1090,U1,2407705,862,4,M,<pysal.cg.shapes.Polygon object at 0x7fdb414e2...
4,31,31,0,0,0,0,31,Alamo Lake CDP,4,1170,...,N,23031,280403717388977,1170,U2,2582722,25,4,R,<pysal.cg.shapes.Polygon object at 0x7fdb4101d...
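
One caveat with this kind of join: the key columns need matching types and formatting, and pandas may refuse to merge, or silently find no matches, when one side holds integers and the other zero-padded strings. Below is a small sketch of normalizing the keys and checking the result. The frames hold a few values copied from the tables above, and the string-typed `PLACE` column is an assumption made to illustrate the mismatch.

```python
import pandas as pd

# Stand-in frames with a few values from the tables above.
data = pd.DataFrame({"place": [730, 870], "H00010001": [304, 2175]})
geodata = pd.DataFrame({"PLACE": ["00730", "74680"],
                        "BASENAME": ["Aguila", "Topawa"]})

# Bring both keys to zero-padded 5-character strings before joining.
data["place"] = data["place"].astype(str).str.zfill(5)
geodata["PLACE"] = geodata["PLACE"].astype(str).str.zfill(5)

# how='left' keeps every row of `data`; indicator=True adds a `_merge`
# column showing which rows found a match in `geodata`.
merged = pd.merge(data, geodata, left_on="place", right_on="PLACE",
                  how="left", indicator=True)
unmatched = merged[merged["_merge"] != "both"]
```

Inspecting `unmatched` is a quick way to see which rows of the tabular data did not find a geometry.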


So, that's how you get your geodata in addition to your regular data!

## OK, that's one API, does it work for others?

We'll try the 2012 County Business Patterns dataset (`CBP2012`):

In [32]:
conn2 = c.base.Connection('CBP2012')

Alright, let's look at the available columns:

In [33]:
filt2 = ['Statistics' in x for x in conn2.variables['concept']]

In [34]:
conn2.variables[filt2]

Unnamed: 0,concept,label,predicateOnly,predicateType
EMP,Employer Statistics,Total Number of Employees,,int
EMPSZES,Employer Statistics,Employment size of establishment,,string
EMPSZES_TTL,Employer Statistics,Title of Employment size of establishment,,string
EMP_F,Employer Statistics,Flag for Number of employees,,string
EMP_N,Employer Statistics,Noise flag for Total Mid-March Employees,,int
EMP_N_F,Employer Statistics,Flag for Noise field for Total Mid-March Emplo...,,string
ESTAB,Employer Statistics,Total number of Establishments,,int
ESTAB_F,Employer Statistics,Flag for Total number of Establishments,,int
LFO,Employer Statistics,Legal form of organization,,string
LFO_TTL,Employer Statistics,Title of legal form of organization,,


To show the required predicates, we can construct yet another filter. Note that *required* means that the query **will fail** if these are not passed as keyword arguments. They don't have to specify a single value, though, so they can be left as a wild card, like we did with `place:*` in the prior query:

In [35]:
conn2.variables

Unnamed: 0,concept,label,predicateOnly,predicateType
COUNTY,Selectable Geographies,FIPS County Code,,
CSA,Geographic Characteristics,Combined Statistical Area,,int
EMP,Employer Statistics,Total Number of Employees,,int
EMPSZES,Employer Statistics,Employment size of establishment,,string
EMPSZES_TTL,Employer Statistics,Title of Employment size of establishment,,string
EMP_F,Employer Statistics,Flag for Number of employees,,string
EMP_N,Employer Statistics,Noise flag for Total Mid-March Employees,,int
EMP_N_F,Employer Statistics,Flag for Noise field for Total Mid-March Emplo...,,string
ESTAB,Employer Statistics,Total number of Establishments,,int
ESTAB_F,Employer Statistics,Flag for Total number of Establishments,,int


Like before, geographies are shown with their requirements. Here, the only geography is the `fips` geography. 

In [36]:
conn2.geographies.keys()

dict_keys(['fips'])

In [37]:
conn2.geographies['fips']

Unnamed: 0,geoLevelDisplay,geoLevelId,name,optionalWithWCFor,requires,wildcard
0,,1,us,,,
1,,2,state,,,
2,3.0,3,county,state,[state],[state]
3,809.0,8,metropolitan statistical area/micropolitan sta...,,,


Next, a note on passing additional arguments to the query. Any "extra" required predicates beyond `get`, `for`, and `in` are passed to `query` as keyword arguments. These are caught and added to the query string following the API specification.

Some datasets need such a predicate. The Economic Census, for example, requires an `OPTAX` field, which identifies the "type of operation or tax status code" along which to slice the data. Just like the other arguments, it maps to a keyword in the API string, and a wildcard represents a slice of all possible values.

The query below passes no extra predicates: it asks for every column matching `ESTAB*` for all counties in California (`fips = 06`).
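
A rough sketch of how keyword arguments could be folded into a query string. The function `with_predicates` is hypothetical and works on a bare query string rather than a full URL. Only the `OPTAX` name comes from the text above.

```python
def with_predicates(query_string, **predicates):
    """Append extra predicates to a query string as key=value terms."""
    # Each keyword becomes one more '&key=value' term; '*' acts as a wildcard.
    extra = "".join("&{}={}".format(k, v) for k, v in predicates.items())
    return query_string + extra


plain = with_predicates("get=ESTAB&for=county:*&in=state:06")
sliced = with_predicates("get=ESTAB&for=county:*&in=state:06", OPTAX="*")
```

With no keywords the query string is returned unchanged, so the same code path serves datasets with and without required predicates.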

In [38]:
cols = conn2.varslike('ESTAB*', engine='fnmatch')

In [39]:
data2 = conn2.query(cols=cols, geo_unit='county:*', geo_filter={'state':'06'})


In [40]:
data2.head()

Unnamed: 0,ESTAB,ESTAB_F,state,county
0,635,,6,999
1,36700,,6,1
2,43,,6,3
3,801,,6,5
4,4615,,6,7


And so you get a table of establishment counts for California counties. Since we're using counties as our unit of analysis, we could grab the geodata for counties.

In [41]:
conn2.set_mapservice('State_County')

But, there are quite a few layers in this MapService:

In [42]:
len(conn2.mapservice.layers)

71

Oof. If you ever want to check out the web interface to see what it looks like, you can retrieve the URLs of most objects using:

In [43]:
conn2.mapservice._baseurl

'http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/State_County/MapServer'

Anyway, we know counties don't really change all that much. So, let's just pick a counties layer and pull it down for California:

In [44]:
geodata2 = conn2.mapservice.query(layer=1, where='STATE = 06')

In [45]:
newdata2 = pd.merge(data2, geodata2, left_on='county', right_on='COUNTY')

In [46]:
newdata2.head()

Unnamed: 0,ESTAB,ESTAB_F,state,county,AREALAND,AREAWATER,BASENAME,CENTLAT,CENTLON,COUNTY,...,GEOID,INTPTLAT,INTPTLON,LSADC,MTFCC,NAME,OBJECTID,OID,STATE,geometry
0,36700,,6,1,1914242789,212979931,Alameda,37.6506226,-121.9176449,1,...,6001,37.6471385,-121.912488,6,G4020,Alameda County,2098,27590141293924,6,<pysal.cg.shapes.Polygon object at 0x7fdb28b50...
1,43,,6,3,1912292633,12557304,Alpine,38.5971043,-119.8206026,3,...,6003,38.6217831,-119.7983522,6,G4020,Alpine County,1317,27590289634197,6,<pysal.cg.shapes.Polygon object at 0x7fdb291d4...
2,801,,6,5,1539933575,29470568,Amador,38.4466174,-120.6516693,5,...,6005,38.4435501,-120.6538563,6,G4020,Amador County,2724,27590143912562,6,<pysal.cg.shapes.Polygon object at 0x7fdb27aee...
3,4615,,6,7,4238423334,105325812,Butte,39.6665788,-121.6007017,7,...,6007,39.6659588,-121.6019188,6,G4020,Butte County,2237,27590417130535,6,<pysal.cg.shapes.Polygon object at 0x7fdb28006...
4,891,,6,9,2641820834,43806026,Calaveras,38.2044678,-120.5546688,9,...,6009,38.1838996,-120.5614415,6,G4020,Calaveras County,347,27590202403841,6,<pysal.cg.shapes.Polygon object at 0x7fdb2a353...


And that's all there is to it! Geodata and tabular data from the Census APIs in one place.

File an issue if you have concerns!