# Census Data Extraction and Processing

## Purpose

This notebook shows how to acquire and process Census data programmatically. The same data can be obtained manually by visiting [data.census.gov](https://data.census.gov/cedsci/), but downloading it and reshaping it by hand can be time consuming. We therefore demonstrate how to pull data from the Census in an easily reproducible way and process it into the format needed for analysis. Different uses require different formats, so we demonstrate just one case, and hope it provides enough exposure for you to feel comfortable altering the code to fit your own use cases.

## Approach

The US Census provides [APIs](https://www.census.gov/data/developers/data-sets.html) for programmatically accessing Census data. We could use those directly, but that would require us to write a large amount of code. Instead we will use an existing Python library called [`censusdata`](https://github.com/jtleider/censusdata) to obtain the data, and then write some code of our own to process it.

 

**Census API KEY** 

To download large amounts of data or use the Census API frequently, you must obtain a Census API key. To request one, go [here](https://api.census.gov/data/key_signup.html) and enter your information; a key will then be sent to you by email.

**Keep your key private.** No one else should use your API key, as it is linked to your email address and is used to monitor your usage of the Census API.

An API key is just a random string of numbers and letters. It will look something like this `96e87430410c12340dc57c6e556edf5c25905d70`. Below (where relevant) we will highlight where you can add your key when downloading data.
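One common way to keep the key out of the notebook itself is to read it from an environment variable. The sketch below assumes you have stored the key in a variable named `CENSUS_API_KEY`; both that name and the helper `load_census_api_key` are illustrative choices, not something the Census or `censusdata` prescribes.

```python
import os


def load_census_api_key(env_var="CENSUS_API_KEY"):
    """Return the key stored in the given environment variable, or None.

    The variable name is an assumption made for this example; use whatever
    name you chose when storing your key.
    """
    key = os.environ.get(env_var, "").strip()
    return key or None


census_api_key = load_census_api_key()
if census_api_key is None:
    print("No API key found; requests will be made without a key.")
else:
    print("API key loaded from the environment.")
```

The resulting `census_api_key` value can then be supplied wherever this notebook mentions adding your key.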


## Code

**Imports**

In [1]:
import pandas as pd
import censusdata

pd.set_option(
    "display.expand_frame_repr", False
)  # These options just help us see more of the data in the dataframe
pd.set_option("display.precision", 2)

## 1. Searching for data

The `censusdata` library has a full list of all tables and variables available in the Census API. This information comes from the US Census and is organized into fields that can be used for searching.

Two key fields to know about are **concept** and **label**. **Concept** is the overall topic that a table (or group of tables) is concerned with, while **label** is the actual variable label of a column. For added context, see below: this is the background metadata used by the library for searching. Here we see an example of a single variable, `S0804_C04_068E`:

```    
"S0804_C04_068E": {
      "label": "Estimate!!Public transportation (excluding taxicab)!!Workers 16 years and over who did not work from home!!TIME ARRIVING AT WORK!!5:00 a.m. to 5:29 a.m.",
      "concept": "MEANS OF TRANSPORTATION TO WORK BY SELECTED CHARACTERISTICS FOR WORKPLACE GEOGRAPHY",
      "predicateType": "float",
      "group": "S0804",
      "limit": 0,
      "attributes": "S0804_C04_068EA,S0804_C04_068M,S0804_C04_068MA"
}
```


If you're not sure whether to use concept or label, try both, and see what you get back as your result (examples below).

Let's use concept and label to search for income in the Detailed Tables for ACS 2019 estimates and see what we get back.

### Example 1

**Let's search for income in the detailed tables, using the 5-year ACS and the 2019 estimates**

Note: For a reference on detailed vs. subject tables etc., see this [link](https://www.census.gov/data/developers/data-sets/acs-5year.html)

In [3]:
censusdata.search("acs5", 2019, "concept", "INCOME", tabletype="detail")[:100] #limiting to just the first 100

[('B05010_001E',
  'RATIO OF INCOME TO POVERTY LEVEL IN THE PAST 12 MONTHS BY NATIVITY OF CHILDREN UNDER 18 YEARS IN FAMILIES AND SUBFAMILIES BY LIVING ARRANGEMENTS AND NATIVITY OF PARENTS',
  'Estimate!!Total:'),
 ('B05010_002E',
  'RATIO OF INCOME TO POVERTY LEVEL IN THE PAST 12 MONTHS BY NATIVITY OF CHILDREN UNDER 18 YEARS IN FAMILIES AND SUBFAMILIES BY LIVING ARRANGEMENTS AND NATIVITY OF PARENTS',
  'Estimate!!Total:!!Under 1.00:'),
 ('B05010_003E',
  'RATIO OF INCOME TO POVERTY LEVEL IN THE PAST 12 MONTHS BY NATIVITY OF CHILDREN UNDER 18 YEARS IN FAMILIES AND SUBFAMILIES BY LIVING ARRANGEMENTS AND NATIVITY OF PARENTS',
  'Estimate!!Total:!!Under 1.00:!!Living with two parents:'),
 ('B05010_004E',
  'RATIO OF INCOME TO POVERTY LEVEL IN THE PAST 12 MONTHS BY NATIVITY OF CHILDREN UNDER 18 YEARS IN FAMILIES AND SUBFAMILIES BY LIVING ARRANGEMENTS AND NATIVITY OF PARENTS',
  'Estimate!!Total:!!Under 1.00:!!Living with two parents:!!Both parents native'),
 ('B05010_005E',
  'RATIO OF INC

Looking at the output from the code block above, we see that we get a list of variables/fields, each identified by a table ID and a variable code:
* `B05010_001E` --> Table ID `B05010` and variable code `001E`
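
The split described in the bullet above can be done in code. This is a minimal sketch; `split_variable_id` is a hypothetical helper written for this notebook, not a `censusdata` function, and it assumes the table ID is everything before the first underscore.

```python
def split_variable_id(variable_id):
    """Split a variable ID into (table ID, variable code) at the first underscore."""
    table_id, _, variable_code = variable_id.partition("_")
    return table_id, variable_code


print(split_variable_id("B05010_001E"))
print(split_variable_id("B06010PR_002E"))
# Subject-table IDs contain two underscores; everything after the first
# one is kept together as the variable code.
print(split_variable_id("S0804_C04_068E"))
```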

Now let's search using "label" instead

In [4]:
censusdata.search("acs5", 2019, "label", "INCOME", tabletype="detail")[:100]

[('B06010PR_002E',
  'PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS (IN 2019 INFLATION-ADJUSTED DOLLARS) IN PUERTO RICO',
  'Estimate!!Total:!!No income'),
 ('B06010PR_003E',
  'PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS (IN 2019 INFLATION-ADJUSTED DOLLARS) IN PUERTO RICO',
  'Estimate!!Total:!!With income:'),
 ('B06010PR_004E',
  'PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS (IN 2019 INFLATION-ADJUSTED DOLLARS) IN PUERTO RICO',
  'Estimate!!Total:!!With income:!!$1 to $9,999 or loss'),
 ('B06010PR_005E',
  'PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS (IN 2019 INFLATION-ADJUSTED DOLLARS) IN PUERTO RICO',
  'Estimate!!Total:!!With income:!!$10,000 to $14,999'),
 ('B06010PR_006E',
  'PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS (IN 2019 INFLATION-ADJUSTED DOLLARS) IN PUERTO RICO',
  'Estimate!!Total:!!With income:!!$15,000 to $24,999'),
 ('B06010PR_007E',
  'PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS (IN 20

We get different results than we did with the concept search. There will be overlap between the two, and income is a broad search term. In general, concept is likely the better place to start.
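
To see how much the two searches overlap, the variable IDs in each result list can be compared as sets. The sketch below is one way to do that, using a hypothetical helper `compare_search_results` and a few tuples copied from the outputs above in place of live results. In practice you would pass in the full lists returned by `censusdata.search` rather than the first 100 entries.

```python
def compare_search_results(concept_results, label_results):
    """Compare two lists of (variable ID, concept, label) tuples by variable ID."""
    concept_ids = {row[0] for row in concept_results}
    label_ids = {row[0] for row in label_results}
    return {
        "both": sorted(concept_ids & label_ids),
        "concept_only": sorted(concept_ids - label_ids),
        "label_only": sorted(label_ids - concept_ids),
    }


# Sample rows taken from the search outputs shown earlier.
POVERTY_CONCEPT = (
    "RATIO OF INCOME TO POVERTY LEVEL IN THE PAST 12 MONTHS BY NATIVITY OF "
    "CHILDREN UNDER 18 YEARS IN FAMILIES AND SUBFAMILIES BY LIVING "
    "ARRANGEMENTS AND NATIVITY OF PARENTS"
)
PR_CONCEPT = (
    "PLACE OF BIRTH BY INDIVIDUAL INCOME IN THE PAST 12 MONTHS "
    "(IN 2019 INFLATION-ADJUSTED DOLLARS) IN PUERTO RICO"
)
concept_sample = [
    ("B05010_001E", POVERTY_CONCEPT, "Estimate!!Total:"),
    ("B06010PR_002E", PR_CONCEPT, "Estimate!!Total:!!No income"),
]
label_sample = [
    ("B06010PR_002E", PR_CONCEPT, "Estimate!!Total:!!No income"),
]

comparison = compare_search_results(concept_sample, label_sample)
for group, ids in comparison.items():
    print(group, ids)
```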

### Example 2: Getting Data from Census Table B05006
If we are looking for this specific table, we presumably already know that it deals with `PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES`.

Let's just search using concept.

In [13]:
censusdata.search(
    "acs5",
    2019,
    "concept",
    "PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES",
    tabletype="detail",
)[:100]

[('B05006_001E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:'),
 ('B05006_002E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:!!Europe:'),
 ('B05006_003E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:!!Europe:!!Northern Europe:'),
 ('B05006_004E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland'),
 ('B05006_005E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:!!Europe:!!Northern Europe:!!Denmark'),
 ('B05006_006E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:!!Europe:!!Northern Europe:!!Norway'),
 ('B05006_007E',
  'PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION IN THE UNITED STATES',
  'Estimate!!Total:!!Europe:!!Northern Europe:!!Sweden'),
 ('B05006_008E',
  'PLACE OF BI

If we scroll through, it looks like all the info is from table `B05006`.

We already knew we wanted table `B05006`, so another way to get variable information for a table is to request it directly using the `censusdata.censustable` function.

**Get the table info based on table name, year, and survey**

The `censusdata.censustable` function expects:
* survey source label: acs5, acs3, acs1, or sf1
* a year
* and table code/ID

In [64]:
table_info = censusdata.censustable("acs5", 2019, "B05006")

The `table_info` variable now holds all the information for that table, which matches what we found in the search. The `censusdata` package also provides a helper function to make this more readable.

In [66]:
censusdata.printtable(table_info)

Variable     | Table                          | Label                                                    | Type 
-------------------------------------------------------------------------------------------------------------------
B05006_001E  | PLACE OF BIRTH FOR THE FOREIGN | !! Estimate Total:                                       | int  
B05006_002E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! Estimate Total: Europe:                            | int  
B05006_003E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! Estimate Total: Europe: Northern Europe:        | int  
B05006_004E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Ire | int  
B05006_005E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Den | int  
B05006_006E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Nor | int  
B05006_007E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern 

## 2. Selecting Our Geography

Census data can come in many different [geographies](https://www.census.gov/programs-surveys/geography/geographies.html) (national, state, county, tract, block group, etc. )

Some geographies are so numerous that we can't easily request or download all of them at once. For example, imagine downloading data for every block group across the country: that would be 217,740 block groups with numerous columns for each. So when we request data from the Census we have to be intentional about what geography we want.

The `censusdata` library can also help us pick our geography and get information for it. We combine different "geography elements" to drill down to what we want.

For example, if we wanted all block groups in a county, we would request our data with geography elements for a specific state, then a specific county, and then ask for all block groups within that state and county.
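
The idea of stacking geography elements can be sketched in plain Python before involving the library. The helper below, `geo_elements`, is hypothetical: it only builds the list of `(level, code)` pairs, zero-padding numeric FIPS codes and using `"*"` for "all". The resulting list is what would be handed to `censusdata.censusgeo`. The level name `"block group"` is an assumption about the spelling the library expects; the examples in this notebook only use `state`, `county`, and `tract`.

```python
# Widths of the FIPS codes for the levels used in this notebook.
FIPS_WIDTHS = {"state": 2, "county": 3, "tract": 6}


def geo_elements(*levels):
    """Build a list of (level, code) pairs from broadest to narrowest.

    A code of None or "*" means "all units at this level".
    """
    elements = []
    for name, code in levels:
        if code is None or code == "*":
            elements.append((name, "*"))
        else:
            width = FIPS_WIDTHS.get(name, 0)
            elements.append((name, str(code).zfill(width)))
    return elements


# All tracts in Texas (state FIPS 48).
print(geo_elements(("state", 48), ("tract", None)))

# All block groups in one county: state, then county, then "all".
# "block group" is an assumed level name, see the note above.
print(geo_elements(("state", "36"), ("county", 47), ("block group", None)))
```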

We'll start with a few examples. 

### Example 1: Let's get the geography information for every state in the country / acs5 / 2019

We create our list of geography elements. In this case we have one element, which searches for all states:
```
[
    ('state', *) <-- Here we use an asterisk to indicate that we want all states in the country
]
```

In [18]:
censusdata.geographies(censusdata.censusgeo([("state", "*")]), "acs5", 2019)

{'Alabama': censusgeo((('state', '01'),)),
 'Alaska': censusgeo((('state', '02'),)),
 'Arizona': censusgeo((('state', '04'),)),
 'Arkansas': censusgeo((('state', '05'),)),
 'California': censusgeo((('state', '06'),)),
 'Colorado': censusgeo((('state', '08'),)),
 'Delaware': censusgeo((('state', '10'),)),
 'District of Columbia': censusgeo((('state', '11'),)),
 'Connecticut': censusgeo((('state', '09'),)),
 'Florida': censusgeo((('state', '12'),)),
 'Georgia': censusgeo((('state', '13'),)),
 'Idaho': censusgeo((('state', '16'),)),
 'Hawaii': censusgeo((('state', '15'),)),
 'Illinois': censusgeo((('state', '17'),)),
 'Indiana': censusgeo((('state', '18'),)),
 'Iowa': censusgeo((('state', '19'),)),
 'Kansas': censusgeo((('state', '20'),)),
 'Kentucky': censusgeo((('state', '21'),)),
 'Louisiana': censusgeo((('state', '22'),)),
 'Maine': censusgeo((('state', '23'),)),
 'Maryland': censusgeo((('state', '24'),)),
 'Massachusetts': censusgeo((('state', '25'),)),
 'Michigan': censusgeo((('stat

Each entry in this output pairs a state name with its geography. If you know Census FIPS info, you will recognize the numbers as the FIPS codes for those states.

### Example 2: Let's get all counties in the US

```
[
    ('county', *) <-- Again we use an asterisk to indicate that we want all counties in the country
]
```

In [19]:
censusdata.geographies(censusdata.censusgeo([("county", "*")]), "acs5", 2019)

{'Fayette County, Illinois': censusgeo((('state', '17'), ('county', '051'))),
 'Logan County, Illinois': censusgeo((('state', '17'), ('county', '107'))),
 'Saline County, Illinois': censusgeo((('state', '17'), ('county', '165'))),
 'Lake County, Illinois': censusgeo((('state', '17'), ('county', '097'))),
 'Massac County, Illinois': censusgeo((('state', '17'), ('county', '127'))),
 'Cass County, Illinois': censusgeo((('state', '17'), ('county', '017'))),
 'Huntington County, Indiana': censusgeo((('state', '18'), ('county', '069'))),
 'White County, Indiana': censusgeo((('state', '18'), ('county', '181'))),
 'Jay County, Indiana': censusgeo((('state', '18'), ('county', '075'))),
 'Shelby County, Indiana': censusgeo((('state', '18'), ('county', '145'))),
 'Sullivan County, Indiana': censusgeo((('state', '18'), ('county', '153'))),
 'Tippecanoe County, Indiana': censusgeo((('state', '18'), ('county', '157'))),
 'Hamilton County, Indiana': censusgeo((('state', '18'), ('county', '057'))),
 '

Counties are about the limit of what we can request directly from the Census API without applying more geography elements.

### Example 3: All tracts in Texas


As discussed, we can't grab all the tracts for the whole country at once, but we can request all the tracts in a state.

Note our new list of geographic elements.
```
[
    ('state', 48), <-- This is the FIPS code for Texas (a state)
    ('tract', *) <-- Here we use an asterisk to indicate that we want all tracts
]
```

In [21]:
censusdata.geographies(
    censusdata.censusgeo([("state", "48"), ("tract", "*")]), "acs5", 2019
)

{'Census Tract 133.05, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '013305'))),
 'Census Tract 133.09, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '013309'))),
 'Census Tract 134.02, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '013402'))),
 'Census Tract 135, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '013500'))),
 'Census Tract 126.13, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '012613'))),
 'Census Tract 133.06, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '013306'))),
 'Census Tract 102.01, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '010201'))),
 'Census Tract 108, Cameron County, Texas': censusgeo((('state', '48'), ('county', '061'), ('tract', '010800'))),
 'Census Tract 105, Cameron County, Texas': censusgeo((('state', '48')


### Example 4: All tracts in NYC

For this example we can first search for all counties in NY state (FIPS code = 36) to find the FIPS codes for New York, Bronx, Kings, Richmond and Queens counties.


In [25]:
nys = censusdata.geographies(
    censusdata.censusgeo([("state", "36"), ("county", "*")]), "acs5", 2019
)

In [32]:
print(nys['Kings County, New York'])
print(nys['Queens County, New York'])
print(nys['New York County, New York'])
print(nys['Bronx County, New York'])
print(nys['Richmond County, New York'])

Summary level: 050, state:36> county:047
Summary level: 050, state:36> county:081
Summary level: 050, state:36> county:061
Summary level: 050, state:36> county:005
Summary level: 050, state:36> county:085


So we need county fips codes: ['047', '081', '061', '005', '085']

In [109]:
kings_geo =  censusdata.censusgeo([("state", "36"), ("county","047"), ('tract',"*")])
queens_geo =  censusdata.censusgeo([("state", "36"), ("county","081"), ('tract',"*")])
ny_geo = censusdata.censusgeo([("state", "36"), ("county","061"), ('tract',"*")])
bronx_geo =  censusdata.censusgeo([("state", "36"), ("county","005"), ('tract',"*")])
richmond_geo = censusdata.censusgeo([("state", "36"), ("county","085"), ('tract',"*")])

We will store these nyc county geos in a list

In [111]:
nyc_tracts = [kings_geo, queens_geo, ny_geo, bronx_geo, richmond_geo]

We will show how to use this later in this notebook when downloading data
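
The five geographies above differ only in the county code, so the same element lists could also be generated in a loop rather than typed out by hand. This is a sketch under that assumption; the dictionary name `NYC_COUNTY_FIPS` is made up for the example, and the codes are the ones printed above. Each list would then be passed to `censusdata.censusgeo`, exactly as in the hand-written version.

```python
# County FIPS codes for the five NYC counties, as printed above.
NYC_COUNTY_FIPS = {
    "Kings": "047",
    "Queens": "081",
    "New York": "061",
    "Bronx": "005",
    "Richmond": "085",
}

# One list of geography elements per county: all tracts in that county.
nyc_tract_elements = {
    name: [("state", "36"), ("county", fips), ("tract", "*")]
    for name, fips in NYC_COUNTY_FIPS.items()
}

for name, elements in nyc_tract_elements.items():
    print(name, elements)
```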

## 3. Downloading Data

We will now request data from table `B05006`, first for all counties in Texas and later for all counties in the US. If you recall, above we requested all the info for `B05006` and saved it in our `table_info` variable.

### Example 1: Manually Selecting Variables

To download specific variables of interest we can pass the variable-column code to the `censusdata.download` function, along with some info that is probably becoming familiar at this point. 

We will request `["B05006_001E", "B05006_003E"]`, which are **Estimate!!Total:** and **Estimate!!Total:!!Europe:!!Northern Europe:**, respectively.

See below:

In [55]:
censusdata.download(
    "acs5", #<- survey
    2019, #<- year 
    censusdata.censusgeo([("state", "48"), ("county", "*")]), #<- all counties in Texas (state FIPS 48)
    ["B05006_001E", "B05006_003E"],  # <-our test variables
)

Unnamed: 0,B05006_001E,B05006_003E
"San Jacinto County, Texas: Summary level: 050, state:48> county:407",1701,45
"Upshur County, Texas: Summary level: 050, state:48> county:459",1539,53
"Waller County, Texas: Summary level: 050, state:48> county:473",7181,47
"Wilson County, Texas: Summary level: 050, state:48> county:493",1841,12
"Hockley County, Texas: Summary level: 050, state:48> county:219",1940,9
...,...,...
"Brown County, Texas: Summary level: 050, state:48> county:049",1676,14
"Hall County, Texas: Summary level: 050, state:48> county:191",350,0
"Franklin County, Texas: Summary level: 050, state:48> county:159",602,0
"Frio County, Texas: Summary level: 050, state:48> county:163",3454,13
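
Each row label in the output above packs the place name, summary level, and FIPS codes into one string. If the data later needs to be joined to other sources, it can help to pull those pieces apart. The sketch below uses a hypothetical helper, `parse_geo_label`, which is not part of `censusdata`. It assumes the labels follow exactly the `name: Summary level: NNN, level:code> level:code` pattern shown in the output, and it builds a `geoid` by concatenating the codes from broadest to narrowest.

```python
def parse_geo_label(label):
    """Split a row label such as
    'Frio County, Texas: Summary level: 050, state:48> county:163'
    into its name, summary level, and FIPS codes.
    """
    name, _, rest = label.partition(": Summary level: ")
    sumlevel, _, hierarchy = rest.partition(", ")
    # hierarchy looks like 'state:48> county:163'
    codes = dict(piece.strip().split(":") for piece in hierarchy.split(">"))
    parsed = {"name": name, "sumlevel": sumlevel, "geoid": "".join(codes.values())}
    parsed.update(codes)
    return parsed


print(parse_geo_label("Frio County, Texas: Summary level: 050, state:48> county:163"))
```

Applied to a downloaded dataframe, this would be something like `[parse_geo_label(str(geo)) for geo in df.index]`, assuming the index entries render as the strings shown above.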


### Example 2: Selecting Variables from Table Info

To download all of the columns in the table we can use the `table_info` data we created earlier in this notebook.

As a reminder, we called `censusdata.censustable('acs5', 2019, 'B05006')` to get the table info. The result is a dictionary keyed by variable ID, with information about each column. To get just the column IDs we can do the following.

In [90]:
table_info_column_ids = list(table_info.keys())

In [91]:
censusdata.download(
    "acs5", #<- survey
    2019, #<- year 
    censusdata.censusgeo([("state", "48"), ("county", "*")]), #<- all counties in Texas (state FIPS 48)
    table_info_column_ids # <-our  variables
).sample(100)

Unnamed: 0,B05006_001E,B05006_002E,B05006_003E,B05006_004E,B05006_005E,B05006_006E,B05006_007E,B05006_008E,B05006_009E,B05006_010E,...,B05006_159E,B05006_160E,B05006_161E,B05006_162E,B05006_163E,B05006_164E,B05006_165E,B05006_166E,B05006_167E,B05006_168E
"San Jacinto County, Texas: Summary level: 050, state:48> county:407",1701,93,45,0,0,0,0,45,0,45,...,63,0,0,0,15,0,0,47,47,0
"Upshur County, Texas: Summary level: 050, state:48> county:459",1539,157,53,38,0,0,0,15,0,15,...,0,0,0,0,0,0,0,17,17,0
"Waller County, Texas: Summary level: 050, state:48> county:473",7181,194,47,0,0,0,0,47,36,11,...,106,0,0,0,0,30,0,52,52,0
"Wilson County, Texas: Summary level: 050, state:48> county:493",1841,89,12,0,0,0,0,12,11,1,...,17,0,0,0,0,0,0,53,53,0
"Hockley County, Texas: Summary level: 050, state:48> county:219",1940,20,9,2,1,0,0,6,0,6,...,0,0,0,0,0,0,0,5,5,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Brown County, Texas: Summary level: 050, state:48> county:049",1676,42,14,0,2,0,0,12,4,0,...,60,0,0,14,0,5,0,54,54,0
"Hall County, Texas: Summary level: 050, state:48> county:191",350,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
"Franklin County, Texas: Summary level: 050, state:48> county:159",602,100,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
"Frio County, Texas: Summary level: 050, state:48> county:163",3454,21,13,4,0,0,0,9,0,9,...,0,70,0,9,0,9,0,25,25,0


Above we got only estimate columns (every column name ends with an `E`), but sometimes you get both estimates and margins of error. If we run the same command with year 2015, we get many `M` columns, which hold margins of error.

Note: It seems that data prior to 2016 has margins of error while 2016 or after does not.

We can see this below by calling the `censusdata.censustable` function again with year 2015.

```
OrderedDict([('B05006_001E',
              {'label': 'Total:',
               'concept': 'B05006. Place of Birth for the Foreign-Born Population in the United States',
               'predicateType': 'int'}),
             ('B05006_001M',
              {'label': 'Margin of Error for!!Total:',
               'concept': 'B05006. Place of Birth for the Foreign-Born Population in the United States',
               'predicateType': 'int'}),
...
```

In [80]:
# Calling with year = 2015
censusdata.censustable("acs5", 2015, "B05006")

OrderedDict([('B05006_001E',
              {'label': 'Total:',
               'concept': 'B05006. Place of Birth for the Foreign-Born Population in the United States',
               'predicateType': 'int'}),
             ('B05006_001M',
              {'label': 'Margin of Error for!!Total:',
               'concept': 'B05006. Place of Birth for the Foreign-Born Population in the United States',
               'predicateType': 'int'}),
             ('B05006_002E',
              {'label': 'Europe:',
               'concept': 'B05006. Place of Birth for the Foreign-Born Population in the United States',
               'predicateType': 'int'}),
             ('B05006_002M',
              {'label': 'Margin of Error for!!Europe:',
               'concept': 'B05006. Place of Birth for the Foreign-Born Population in the United States',
               'predicateType': 'int'}),
             ('B05006_003E',
              {'label': 'Europe:!!Northern Europe:',
               'concept': 'B05006. Pla

So if you want to filter out the margins of error, you can use a for loop to keep only the estimates:

In [84]:
just_estimates = []
for variable in censusdata.censustable("acs5", 2015, "B05006"):
    if variable.endswith("E"):  # estimate IDs end in "E"; margin-of-error IDs end in "M"
        just_estimates.append(variable)

just_estimates[:5]

['B05006_001E', 'B05006_002E', 'B05006_003E', 'B05006_004E', 'B05006_005E']
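
If the margins of error are needed later rather than discarded, an alternative is to pair each estimate with its margin column. The sketch below assumes the naming pattern seen in the 2015 listing above, where an estimate `..._001E` has a matching margin `..._001M`. The function name `pair_estimates_with_margins` is made up for this example.

```python
def pair_estimates_with_margins(variable_ids):
    """Map each estimate ID (ending in 'E') to its margin ID (ending in 'M').

    The margin is None when no matching 'M' variable is present.
    """
    available = set(variable_ids)
    pairs = {}
    for variable in variable_ids:
        if variable.endswith("E"):
            margin = variable[:-1] + "M"
            pairs[variable] = margin if margin in available else None
    return pairs


sample_ids = ["B05006_001E", "B05006_001M", "B05006_002E", "B05006_002M", "B05006_003E"]
for estimate, margin in pair_estimates_with_margins(sample_ids).items():
    print(estimate, "->", margin)
```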

Now that we have just estimates in our `just_estimates` variable we can request it as we did above

In [95]:
censusdata.download(
    "acs5", 2015, censusdata.censusgeo([("county", "*")]), just_estimates
).sample(100)

Unnamed: 0,B05006_001E,B05006_002E,B05006_003E,B05006_004E,B05006_005E,B05006_006E,B05006_007E,B05006_008E,B05006_009E,B05006_010E,...,B05006_153E,B05006_154E,B05006_155E,B05006_156E,B05006_157E,B05006_158E,B05006_159E,B05006_160E,B05006_161E,B05006_162E
"Pulaski County, Indiana: Summary level: 050, state:18> county:131",91.0,31.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.0,7.0,0.0
"Marion County, Iowa: Summary level: 050, state:19> county:125",866.0,313.0,26.0,16.0,0.0,16.0,0.0,10.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,12.0,12.0,0.0
"Grant County, Arkansas: Summary level: 050, state:05> county:053",263.0,23.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Androscoggin County, Maine: Summary level: 050, state:23> county:001",3074.0,464.0,103.0,65.0,35.0,30.0,0.0,38.0,0.0,0.0,...,3.0,0.0,20.0,0.0,0.0,0.0,0.0,843.0,843.0,0.0
"Tangipahoa Parish, Louisiana: Summary level: 050, state:22> county:105",2658.0,259.0,74.0,51.0,0.0,51.0,0.0,0.0,0.0,0.0,...,15.0,10.0,13.0,5.0,0.0,0.0,0.0,16.0,16.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Madison County, Texas: Summary level: 050, state:48> county:313",895.0,6.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,14.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,3.0,0.0
"Plymouth County, Massachusetts: Summary level: 050, state:25> county:023",41731.0,7097.0,2198.0,1225.0,464.0,693.0,68.0,712.0,73.0,97.0,...,177.0,231.0,157.0,298.0,61.0,24.0,15.0,1710.0,1700.0,10.0
"Greene County, Tennessee: Summary level: 050, state:47> county:059",1152.0,168.0,25.0,21.0,4.0,17.0,0.0,4.0,0.0,0.0,...,8.0,0.0,0.0,0.0,0.0,2.0,0.0,57.0,57.0,0.0
"Ochiltree County, Texas: Summary level: 050, state:48> county:357",2200.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,9.0,0.0,0.0,0.0,0.0,0.0,0.0


Note: To use your API key, pass it to the download call through the `key` argument, as follows:
```
censusdata.download('acs5', 2019,
                         censusdata.censusgeo([('county', '*')]),
                         just_estimates,
                        key='##########################################'
                        )
```

### Putting it all together 

Now that we know how to:
* search for the data we want
* limit to the geographies we need
* and download data 

We can run through a few complete examples where we will get full dataframes and convert the column codes to contextual labels. 

We will also share some functions for helping deal with how the census groups different variables. 

### Example 1 - All counties in the US for B05006

In [159]:
all_counties = censusdata.download(
    "acs5", #<- survey
    2019, #<- year 
    censusdata.censusgeo([("county", "*")]), #<- all counties 
    table_info_column_ids # <-our  variables
)

**Getting Text Columns**

Our columns right now are variable codes, which are confusing to work with, so let's replace them with text labels. We loop over the columns and use the `table_info` variable to look up the text for each one.

In [164]:
contextual_columns = []
for column in all_counties.columns:
    contextual_columns.append(table_info[column]['label'])
all_counties.columns = contextual_columns

In [165]:
all_counties.head()

Unnamed: 0,Estimate!!Total:,Estimate!!Total:!!Europe:,Estimate!!Total:!!Europe:!!Northern Europe:,Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland,Estimate!!Total:!!Europe:!!Northern Europe:!!Denmark,Estimate!!Total:!!Europe:!!Northern Europe:!!Norway,Estimate!!Total:!!Europe:!!Northern Europe:!!Sweden,Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):,"Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!United Kingdom, excluding England and Scotland",Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!England,...,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Colombia,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Ecuador,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Guyana,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Peru,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Uruguay,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Venezuela,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Other South America,Estimate!!Total:!!Americas:!!Northern America:,Estimate!!Total:!!Americas:!!Northern America:!!Canada,Estimate!!Total:!!Americas:!!Northern America:!!Other Northern America
"Fayette County, Illinois: Summary level: 050, state:17> county:051",277.0,33.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,20.0,20.0,0.0
"Logan County, Illinois: Summary level: 050, state:17> county:107",468.0,109.0,13.0,0.0,0.0,0.0,0.0,13.0,0.0,13.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Saline County, Illinois: Summary level: 050, state:17> county:165",241.0,70.0,7.0,0.0,0.0,0.0,0.0,7.0,0.0,7.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Lake County, Illinois: Summary level: 050, state:17> county:097",131398.0,25548.0,2484.0,369.0,106.0,45.0,105.0,1747.0,917.0,733.0,...,856.0,232.0,0.0,487.0,16.0,472.0,6.0,1520.0,1520.0,0.0
"Massac County, Illinois: Summary level: 050, state:17> county:127",146.0,4.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,21.0,0.0,11.0,11.0,0.0


We could then save this out to csv

In [None]:
all_counties.to_csv('all_counties_B05006.csv')
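
The `!!`-delimited labels used as column names above are long. As an optional step before exporting, they could be shortened to their last segment. The sketch below is one possible approach, not something `censusdata` provides: `short_label` and `shorten_labels` are hypothetical helpers, and any label whose short form would collide with another keeps its full text so that column names stay unique.

```python
from collections import Counter


def short_label(label):
    """Keep only the last '!!' segment and drop a trailing colon."""
    return label.split("!!")[-1].rstrip(":")


def shorten_labels(labels):
    """Shorten each label, keeping the full label where the short form is ambiguous."""
    shortened = [short_label(label) for label in labels]
    counts = Counter(shortened)
    return [
        short if counts[short] == 1 else full
        for short, full in zip(shortened, labels)
    ]


sample_labels = [
    "Estimate!!Total:",
    "Estimate!!Total:!!Europe:",
    "Estimate!!Total:!!Europe:!!Northern Europe:",
    "Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland",
]
print(shorten_labels(sample_labels))
```

With a dataframe, the result could be assigned back in the same way as before, for example `all_counties.columns = shorten_labels(all_counties.columns)`.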

### Example 2 - All tracts in NYC for B05006

Earlier we created `nyc_tracts`, a list containing one geography for each county in NYC. To use it to download tracts for the entire city we can:
* iterate over the `nyc_tracts` list
    * download the data for each county
* combine all the data together

In [115]:
county_frames = []
for county in nyc_tracts:
    print('Downloading: ', county)
    county_frames.append(censusdata.download('acs5', 2019, county, table_info_column_ids))
# DataFrame.append was removed in pandas 2.0, so combine the pieces with pd.concat
all_nyc_data = pd.concat(county_frames)

Downloading:  Summary level: 140, state:36> county:047> tract:*
Downloading:  Summary level: 140, state:36> county:081> tract:*
Downloading:  Summary level: 140, state:36> county:061> tract:*
Downloading:  Summary level: 140, state:36> county:005> tract:*
Downloading:  Summary level: 140, state:36> county:085> tract:*


Below we can see a sample of rows from the `all_nyc_data` dataframe. We now have table `B05006` for every tract in NYC.

In [119]:
all_nyc_data.sample(100)

Unnamed: 0,B05006_001E,B05006_002E,B05006_003E,B05006_004E,B05006_005E,B05006_006E,B05006_007E,B05006_008E,B05006_009E,B05006_010E,...,B05006_159E,B05006_160E,B05006_161E,B05006_162E,B05006_163E,B05006_164E,B05006_165E,B05006_166E,B05006_167E,B05006_168E
"Census Tract 998.01, Queens County, New York: Summary level: 140, state:36> county:081> tract:099801",3355,45,7,7,0,0,0,0,0,0,...,68,39,165,0,0,45,0,0,0,0
"Census Tract 503, Kings County, New York: Summary level: 140, state:36> county:047> tract:050300",568,153,39,0,0,0,0,39,39,0,...,0,0,0,0,0,0,0,13,13,0
"Census Tract 287, Kings County, New York: Summary level: 140, state:36> county:047> tract:028700",697,95,11,0,0,0,0,11,11,0,...,0,55,129,0,0,0,0,0,0,0
"Census Tract 1529.01, Queens County, New York: Summary level: 140, state:36> county:081> tract:152901",3028,236,91,91,0,0,0,0,0,0,...,57,0,42,161,0,0,0,0,0,0
"Census Tract 65.01, Queens County, New York: Summary level: 140, state:36> county:081> tract:006501",1256,327,44,0,0,0,0,44,11,0,...,49,39,0,0,0,13,0,21,21,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Census Tract 369, Kings County, New York: Summary level: 140, state:36> county:047> tract:036900",1657,49,20,0,0,0,0,20,11,9,...,0,0,305,0,0,0,0,0,0,0
"Census Tract 143, Queens County, New York: Summary level: 140, state:36> county:081> tract:014300",1752,507,17,17,0,0,0,0,0,0,...,181,26,18,47,0,0,0,14,14,0
"Census Tract 9, Richmond County, New York: Summary level: 140, state:36> county:085> tract:000900",322,61,32,0,0,0,0,32,16,16,...,0,27,0,0,0,0,0,8,0,8
"Census Tract 279, Queens County, New York: Summary level: 140, state:36> county:081> tract:027900",3907,210,64,15,0,0,0,7,7,0,...,1320,692,0,95,0,107,0,0,0,0


**Attach contextual columns**

In [121]:
contextual_columns = []
for column in all_nyc_data.columns:
    contextual_columns.append(table_info[column]['label'])
all_nyc_data.columns = contextual_columns

In [122]:
all_nyc_data.head()

Unnamed: 0,Estimate!!Total:,Estimate!!Total:!!Europe:,Estimate!!Total:!!Europe:!!Northern Europe:,Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland,Estimate!!Total:!!Europe:!!Northern Europe:!!Denmark,Estimate!!Total:!!Europe:!!Northern Europe:!!Norway,Estimate!!Total:!!Europe:!!Northern Europe:!!Sweden,Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):,"Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!United Kingdom, excluding England and Scotland",Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!England,...,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Colombia,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Ecuador,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Guyana,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Peru,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Uruguay,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Venezuela,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Other South America,Estimate!!Total:!!Americas:!!Northern America:,Estimate!!Total:!!Americas:!!Northern America:!!Canada,Estimate!!Total:!!Americas:!!Northern America:!!Other Northern America
"Census Tract 228, Kings County, New York: Summary level: 140, state:36> county:047> tract:022800",966,62,0,0,0,0,0,0,0,0,...,21,28,0,0,1,0,0,52,52,0
"Census Tract 1134, Kings County, New York: Summary level: 140, state:36> county:047> tract:113400",661,0,0,0,0,0,0,0,0,0,...,17,55,58,0,0,0,0,0,0,0
"Census Tract 1156, Kings County, New York: Summary level: 140, state:36> county:047> tract:115600",717,7,0,0,0,0,0,0,0,0,...,0,0,167,0,0,8,0,0,0,0
"Census Tract 1178, Kings County, New York: Summary level: 140, state:36> county:047> tract:117800",796,32,32,0,0,0,0,32,0,32,...,7,23,228,0,0,0,0,0,0,0
"Census Tract 878, Kings County, New York: Summary level: 140, state:36> county:047> tract:087800",1332,0,0,0,0,0,0,0,0,0,...,0,0,227,0,0,0,0,0,0,0


**Export Data**

In [None]:
all_nyc_data.to_csv('all_counties_B05006.csv')
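
When the exported file is read back later, the geography labels are stored in the first, unnamed column. A minimal sketch of the round trip is shown here, using an in-memory buffer and a made-up two-row frame instead of the real file. Passing `index_col=0` to `pd.read_csv` restores that first column as the index.

```python
import io

import pandas as pd

# Made-up frame standing in for the exported data
sample = pd.DataFrame({"Estimate!!Total:": [966, 661]}, index=["Tract A", "Tract B"])

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
sample.to_csv(buffer)
buffer.seek(0)

# index_col=0 turns the first CSV column back into the dataframe index
restored = pd.read_csv(buffer, index_col=0)
print(restored)
```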

## Extra: Subsetting census data for easier analysis

We will use the `all_counties` dataframe generated earlier in this section. This dataframe holds all the columns in the `B05006` table.

One challenge with this table is that regions, sub regions and countries are all mixed together.

In the listing below, `!! Estimate Total:` is the total for the whole table, `!! !! Estimate Total: Europe: ` is the total for Europe, and further down `!! !! !! !! Estimate Total: Europe: Northern Europe: Sweden` is the figure for Sweden alone. These are only examples; the table contains many other regions, sub regions and countries.

```
Variable     | Table                          | Label                                                    | Type 
-------------------------------------------------------------------------------------------------------------------
B05006_001E  | PLACE OF BIRTH FOR THE FOREIGN | !! Estimate Total:                                       | int  
B05006_002E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! Estimate Total: Europe:                            | int  
B05006_003E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! Estimate Total: Europe: Northern Europe:        | int  
B05006_004E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Ire | int  
B05006_005E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Den | int  
B05006_006E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Nor | int  
B05006_007E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Swe | int  
B05006_008E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Uni | int  
B05006_009E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! !! Estimate Total: Europe: Northern Europe:  | int  
B05006_010E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! !! Estimate Total: Europe: Northern Europe:  | int  
B05006_011E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! !! Estimate Total: Europe: Northern Europe:  | int  
B05006_012E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Oth 
```


See full table below:

In [123]:
censusdata.printtable(table_info)

Variable     | Table                          | Label                                                    | Type 
-------------------------------------------------------------------------------------------------------------------
B05006_001E  | PLACE OF BIRTH FOR THE FOREIGN | !! Estimate Total:                                       | int  
B05006_002E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! Estimate Total: Europe:                            | int  
B05006_003E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! Estimate Total: Europe: Northern Europe:        | int  
B05006_004E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Ire | int  
B05006_005E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Den | int  
B05006_006E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern Europe: Nor | int  
B05006_007E  | PLACE OF BIRTH FOR THE FOREIGN | !! !! !! !! Estimate Total: Europe: Northern 

Now let's say we need to create 3 different dataframes:
1. Totals for each continent
2. Totals for each sub region
3. Totals for each country

We've written some code below to help pull out our desired data subsets. We will request data based on "level". In the printout above, the number of `!!` markers shows how deeply a variable is nested: a single `!!` marks the highest-level grouping, while `!! !! !! !!` marks a deep subgrouping.

In the cell block below we have code to help segment data based on these different groupings. 

In [127]:
# Helper Functions
def filter_data_by_level(
    cols, level, include_higher_level=False
):
    """
    Select the columns whose labels contain a specific number of levels.
    
    Parameters:
        cols: column labels from the dataframe
        level: number of occurrences of '!!' a label must contain
        include_higher_level: if True, also return columns whose number of
            '!!' occurrences is greater than level; if False, return only
            the columns that match level exactly.
            
    Returns:
        List of column labels at the requested level (or deeper)
    """
    keep_cols = []
    for col in cols:
        if col.count("!!") == level or (
            include_higher_level and col.count("!!") > level
        ):
            keep_cols.append(col)
    return keep_cols
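
To see how the level counting behaves, the rule can be run on a few labels copied from the column headers shown earlier. This is a small sketch: `select_by_depth` is a hypothetical, compact restatement of `filter_data_by_level`, included only so the cell runs on its own.

```python
# Labels copied from the B05006 column headers shown earlier
sample_labels = [
    "Estimate!!Total:",
    "Estimate!!Total:!!Europe:",
    "Estimate!!Total:!!Europe:!!Northern Europe:",
    "Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland",
]

# The "level" of a label is the number of '!!' separators it contains
depths = {label: label.count("!!") for label in sample_labels}
for label, depth in depths.items():
    print(depth, label)


def select_by_depth(labels, level, include_higher_level=False):
    """Hypothetical compact equivalent of filter_data_by_level."""
    return [
        label
        for label in labels
        if label.count("!!") == level
        or (include_higher_level and label.count("!!") > level)
    ]


# Exact match: only the continent-level label
exact = select_by_depth(sample_labels, level=2)

# With include_higher_level=True: the continent label plus everything nested under it
at_or_deeper = select_by_depth(sample_labels, level=2, include_higher_level=True)
```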


### Example 1: Get all Continents

Scrolling back up to the table, we see that continent labels contain two instances of `!!`. So in our function call below we will set `level=2`.

In [131]:
filter_data_by_level(all_counties.columns,level=2)

['Estimate!!Total:!!Europe:',
 'Estimate!!Total:!!Asia:',
 'Estimate!!Total:!!Africa:',
 'Estimate!!Total:!!Oceania:',
 'Estimate!!Total:!!Americas:']

We can pass this list inside brackets to our dataframe to get a dataframe containing only the continents.

In [134]:
all_counties[filter_data_by_level(all_counties.columns,level=2)]

Unnamed: 0,Estimate!!Total:!!Europe:,Estimate!!Total:!!Asia:,Estimate!!Total:!!Africa:,Estimate!!Total:!!Oceania:,Estimate!!Total:!!Americas:
"San Jacinto County, Texas: Summary level: 050, state:48> county:407",93,9,52,0,1547
"Upshur County, Texas: Summary level: 050, state:48> county:459",157,139,29,30,1184
"Waller County, Texas: Summary level: 050, state:48> county:473",194,294,186,4,6503
"Wilson County, Texas: Summary level: 050, state:48> county:493",89,163,11,10,1568
"Hockley County, Texas: Summary level: 050, state:48> county:219",20,50,38,0,1832
...,...,...,...,...,...
"Brown County, Texas: Summary level: 050, state:48> county:049",42,221,22,0,1391
"Hall County, Texas: Summary level: 050, state:48> county:191",1,8,0,0,341
"Franklin County, Texas: Summary level: 050, state:48> county:159",100,0,0,5,497
"Frio County, Texas: Summary level: 050, state:48> county:163",21,135,89,0,3209

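The full labels are long, so it may be convenient to shorten them once a single level has been selected. The sketch below is one possible approach, not part of the original workflow: `short_name` is a hypothetical helper that keeps only the last `!!` segment and drops the trailing colon. The values are the first two rows of the output above, under placeholder row labels.

```python
import pandas as pd

# Two continent columns, with values taken from the first two rows shown above
continent_cols = ["Estimate!!Total:!!Europe:", "Estimate!!Total:!!Asia:"]
sample = pd.DataFrame(
    [[93, 9], [157, 139]],
    columns=continent_cols,
    index=["County A", "County B"],  # placeholder row labels
)


def short_name(label):
    """Hypothetical helper: keep the last '!!' segment and drop a trailing colon."""
    return label.split("!!")[-1].rstrip(":")


# rename accepts a function and applies it to every column label
continents = sample.rename(columns=short_name)
print(continents)
```

At deeper levels it is worth checking that the shortened names are still unique (for example with `continents.columns.is_unique`) before relying on them.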

### Example 2: Get All Subregions

In [141]:
filter_data_by_level(all_counties.columns, level=3)

['Estimate!!Total:!!Europe:!!Northern Europe:',
 'Estimate!!Total:!!Europe:!!Western Europe:',
 'Estimate!!Total:!!Europe:!!Southern Europe:',
 'Estimate!!Total:!!Europe:!!Eastern Europe:',
 'Estimate!!Total:!!Europe:!!Europe, n.e.c.',
 'Estimate!!Total:!!Asia:!!Eastern Asia:',
 'Estimate!!Total:!!Asia:!!South Central Asia:',
 'Estimate!!Total:!!Asia:!!South Eastern Asia:',
 'Estimate!!Total:!!Asia:!!Western Asia:',
 'Estimate!!Total:!!Asia:!!Asia,n.e.c.',
 'Estimate!!Total:!!Africa:!!Eastern Africa:',
 'Estimate!!Total:!!Africa:!!Middle Africa:',
 'Estimate!!Total:!!Africa:!!Northern Africa:',
 'Estimate!!Total:!!Africa:!!Southern Africa:',
 'Estimate!!Total:!!Africa:!!Western Africa:',
 'Estimate!!Total:!!Africa:!!Africa, n.e.c.',
 'Estimate!!Total:!!Oceania:!!Australia and New Zealand Subregion:',
 'Estimate!!Total:!!Oceania:!!Fiji',
 'Estimate!!Total:!!Oceania:!!Micronesia',
 'Estimate!!Total:!!Oceania:!!Oceania, n.e.c.',
 'Estimate!!Total:!!Americas:!!Latin America:',
 'Estimate!!

In [142]:
all_counties[filter_data_by_level(all_counties.columns,level=3)]

Unnamed: 0,Estimate!!Total:!!Europe:!!Northern Europe:,Estimate!!Total:!!Europe:!!Western Europe:,Estimate!!Total:!!Europe:!!Southern Europe:,Estimate!!Total:!!Europe:!!Eastern Europe:,"Estimate!!Total:!!Europe:!!Europe, n.e.c.",Estimate!!Total:!!Asia:!!Eastern Asia:,Estimate!!Total:!!Asia:!!South Central Asia:,Estimate!!Total:!!Asia:!!South Eastern Asia:,Estimate!!Total:!!Asia:!!Western Asia:,"Estimate!!Total:!!Asia:!!Asia,n.e.c.",...,Estimate!!Total:!!Africa:!!Northern Africa:,Estimate!!Total:!!Africa:!!Southern Africa:,Estimate!!Total:!!Africa:!!Western Africa:,"Estimate!!Total:!!Africa:!!Africa, n.e.c.",Estimate!!Total:!!Oceania:!!Australia and New Zealand Subregion:,Estimate!!Total:!!Oceania:!!Fiji,Estimate!!Total:!!Oceania:!!Micronesia,"Estimate!!Total:!!Oceania:!!Oceania, n.e.c.",Estimate!!Total:!!Americas:!!Latin America:,Estimate!!Total:!!Americas:!!Northern America:
"San Jacinto County, Texas: Summary level: 050, state:48> county:407",45,17,17,14,0,0,0,9,0,0,...,0,52,0,0,0,0,0,0,1500,47
"Upshur County, Texas: Summary level: 050, state:48> county:459",53,100,4,0,0,80,4,55,0,0,...,0,0,29,0,30,0,0,0,1167,17
"Waller County, Texas: Summary level: 050, state:48> county:473",47,137,0,10,0,163,10,121,0,0,...,0,39,38,66,4,0,0,0,6451,52
"Wilson County, Texas: Summary level: 050, state:48> county:493",12,77,0,0,0,100,63,0,0,0,...,0,0,0,0,0,0,10,0,1515,53
"Hockley County, Texas: Summary level: 050, state:48> county:219",9,2,0,9,0,10,19,21,0,0,...,0,0,19,0,0,0,0,0,1827,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Brown County, Texas: Summary level: 050, state:48> county:049",14,12,16,0,0,59,59,103,0,0,...,0,0,22,0,0,0,0,0,1337,54
"Hall County, Texas: Summary level: 050, state:48> county:191",0,1,0,0,0,0,0,8,0,0,...,0,0,0,0,0,0,0,0,341,0
"Franklin County, Texas: Summary level: 050, state:48> county:159",0,91,9,0,0,0,0,0,0,0,...,0,0,0,0,5,0,0,0,497,0
"Frio County, Texas: Summary level: 050, state:48> county:163",13,0,0,8,0,17,100,9,9,0,...,0,0,26,0,0,0,0,0,3184,25


### Example 3: Get All Countries

This will be a little harder because of the various subgroups but with some extra code we should be able to do it.

In the previous output we see that Fiji and Micronesia sit at the same level as the sub regions. Since these are countries, we need to add them to our countries list.

In [160]:
missing_countries = [
    "Estimate!!Total:!!Oceania:!!Fiji",
    "Estimate!!Total:!!Oceania:!!Micronesia",
]

In [166]:
countries = filter_data_by_level(all_counties.columns, level=4, include_higher_level=True) + missing_countries

In [167]:
all_counties[countries].shape

(3220, 142)

In [168]:
countries = all_counties[countries]

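See below for the dataframe of countries we have generated. Some aggregate groupings may have slipped through because of how the Census formats its labels (for example, `United Kingdom (inc. Crown Dependencies):` appears alongside its constituent parts), but this gets us most of the way there.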

In [170]:
countries

Unnamed: 0,Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland,Estimate!!Total:!!Europe:!!Northern Europe:!!Denmark,Estimate!!Total:!!Europe:!!Northern Europe:!!Norway,Estimate!!Total:!!Europe:!!Northern Europe:!!Sweden,Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):,"Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!United Kingdom, excluding England and Scotland",Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!England,Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!Scotland,Estimate!!Total:!!Europe:!!Northern Europe:!!Other Northern Europe,Estimate!!Total:!!Europe:!!Western Europe:!!Austria,...,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Ecuador,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Guyana,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Peru,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Uruguay,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Venezuela,Estimate!!Total:!!Americas:!!Latin America:!!South America:!!Other South America,Estimate!!Total:!!Americas:!!Northern America:!!Canada,Estimate!!Total:!!Americas:!!Northern America:!!Other Northern America,Estimate!!Total:!!Oceania:!!Fiji,Estimate!!Total:!!Oceania:!!Micronesia
"Fayette County, Illinois: Summary level: 050, state:17> county:051",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6.0,...,0.0,0.0,0.0,0.0,0.0,0.0,20.0,0.0,0.0,0.0
"Logan County, Illinois: Summary level: 050, state:17> county:107",0.0,0.0,0.0,0.0,13.0,0.0,13.0,0.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Saline County, Illinois: Summary level: 050, state:17> county:165",0.0,0.0,0.0,0.0,7.0,0.0,7.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Lake County, Illinois: Summary level: 050, state:17> county:097",369.0,106.0,45.0,105.0,1747.0,917.0,733.0,97.0,112.0,173.0,...,232.0,0.0,487.0,16.0,472.0,6.0,1520.0,0.0,0.0,0.0
"Massac County, Illinois: Summary level: 050, state:17> county:127",0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,21.0,0.0,11.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Crockett County, Tennessee: Summary level: 050, state:47> county:033",0.0,0.0,0.0,0.0,10.0,5.0,5.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Lake County, Tennessee: Summary level: 050, state:47> county:095",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Knox County, Tennessee: Summary level: 050, state:47> county:093",19.0,14.0,0.0,290.0,574.0,248.0,283.0,43.0,0.0,0.0,...,37.0,0.0,77.0,4.0,264.0,0.0,610.0,7.0,0.0,0.0
"Benton County, Washington: Summary level: 050, state:53> county:005",16.0,37.0,10.0,20.0,362.0,183.0,149.0,30.0,42.0,9.0,...,8.0,21.0,75.0,0.0,5.0,0.0,737.0,0.0,0.0,0.0

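One way to catch the aggregate groupings that slipped through is to look at how the labels end. In the column headers shown in this notebook, labels for groupings end with a colon (for example `...!!Northern Europe:`), while labels for individual places do not. The sketch below relies on that observation, which is an assumption that should be checked against the full table. The labels are copied from the output above.

```python
# Labels copied from the countries output above
sample_columns = [
    "Estimate!!Total:!!Europe:!!Northern Europe:!!Ireland",
    "Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):",
    "Estimate!!Total:!!Europe:!!Northern Europe:!!United Kingdom (inc. Crown Dependencies):!!England",
    "Estimate!!Total:!!Oceania:!!Fiji",
]

# Assumption: a trailing ':' marks an aggregate that has further breakdowns beneath it
leaf_columns = [col for col in sample_columns if not col.endswith(":")]

for col in leaf_columns:
    print(col)
```

This filter keeps catch-all entries such as `Other Northern Europe`, and it keeps England while dropping the United Kingdom aggregate, so the result still needs a manual review for your use case.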

That concludes the census examples. This code can be re-purposed to get other census datasets and then filter to the information that is most useful and relevant for your work. 

# End