# Census Data Tools

In [2]:
import morpc

# morpc.census

MORPC works regularly with census data, including but not limited to ACS 5-year and 1-year estimates, the Decennial Census, PEP, and census geographies. This module is useful for gathering and organizing census data for use in various workflows. Those workflows are linked where appropriate.

### ACS functions and variables

acs_get() is a low-level wrapper for Census API requests that returns the results as a pandas DataFrame. If necessary, it splits the request into several smaller requests to stay within the 50-variable limit imposed by the API.
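The splitting step can be illustrated with a minimal sketch. This is not the code inside acs_get(); the helper name `chunk_variables` and the choice to repeat GEO_ID in every batch (so the partial results could be joined afterwards) are assumptions made for illustration.

```python
def chunk_variables(variables, limit=50, key="GEO_ID"):
    """Split a variable list into batches of at most `limit` names.

    Hypothetical helper, not part of morpc. The key field is repeated at
    the start of every batch so the partial results share a join column.
    """
    others = [v for v in variables if v != key]
    if not others:
        return [[key]]
    size = limit - 1  # one slot in each batch is reserved for the key
    return [[key] + others[i:i + size] for i in range(0, len(others), size)]


# 100 names in total: GEO_ID, NAME, 49 estimates and 49 margins of error
variables = (
    ["GEO_ID", "NAME"]
    + [f"B01001_{n:03d}E" for n in range(1, 50)]
    + [f"B01001_{n:03d}M" for n in range(1, 50)]
)
chunks = chunk_variables(variables)

for number, chunk in enumerate(chunks, start=1):
    print(f"Request #{number}: {len(chunk)} variables")
```

Each batch would then become the "get" value of one request, for example `",".join(chunk)`.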

The resulting DataFrame is indexed by GEO_ID (regardless of whether it was requested) and omits fields that were not requested but that the API returns automatically with each request (e.g. "state", "county").
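To make the indexing and column filtering concrete, here is a rough sketch of how a raw response could be turned into such a table. The Census API returns a JSON array of arrays whose first row is the header and whose values are strings. The function name `response_to_frame` is hypothetical, the sample rows are hand-written from the example output in this section rather than fetched live, and the sketch assumes GEO_ID is present in the response.

```python
import pandas as pd


def response_to_frame(rows, requested):
    """Build a GEO_ID-indexed frame holding only the requested fields.

    Hypothetical helper, not the morpc implementation.
    """
    header, *records = rows
    df = pd.DataFrame(records, columns=header)
    # Drop fields such as "state" and "county" that were not requested
    keep = [name for name in requested if name != "GEO_ID"]
    return df.set_index("GEO_ID")[keep]


# Hand-written stand-in for the decoded JSON body of a response
rows = [
    ["GEO_ID", "NAME", "B01001_001E", "state", "county"],
    ["0500000US39041", "Delaware County, Ohio", "226296", "39", "041"],
    ["0500000US39049", "Franklin County, Ohio", "1321820", "39", "049"],
]
frame = response_to_frame(rows, ["GEO_ID", "NAME", "B01001_001E"])
# Values arrive as strings, so convert the estimate column explicitly
frame["B01001_001E"] = pd.to_numeric(frame["B01001_001E"])
print(frame)
```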

In [3]:
url = 'https://api.census.gov/data/2022/acs/acs1'
params = {
    "get": "GEO_ID,NAME,B01001_001E",
    "for": "county:049,041",
    "in": "state:39"
}

In [4]:
acs = morpc.census.acs_get(url, params)

Total variables requested: 3
Starting request #1. 3 variables remain.


In [5]:
acs

GEO_ID,NAME,B01001_001E
0500000US39041,"Delaware County, Ohio",226296
0500000US39049,"Franklin County, Ohio",1321820


### Using morpc-censusacs-fetch as an input

When using ACS data, we will generally be consuming data produced by the [morpc-censusacs-fetch](https://github.com/morpc/morpc-censusacs-fetch) workflow. By default, that script saves its output in its output_data folder: ./morpc-censusacs-fetch/output_data/

Run that script according to the documentation and then use acs_generate_dimension_table() downstream. 

#### Load the data using morpc.frictionless.load_data()

In [6]:
data, resource, schema = morpc.frictionless.load_data('../../morpc-censusacs-fetch/output_data/morpc-acs5-2023-us-B01001.resource.yaml', verbose=False)

morpc.load_data | INFO | Loading Frictionless Resource file at location ..\..\morpc-censusacs-fetch\output_data\morpc-acs5-2023-us-B01001.resource.yaml
morpc.load_data | INFO | Loading data, resource file, and schema from their source locations
morpc.load_data | INFO | --> Data file: ..\..\morpc-censusacs-fetch\output_data\morpc-acs5-2023-us-B01001.csv
morpc.load_data | INFO | --> Resource file: ..\..\morpc-censusacs-fetch\output_data\morpc-acs5-2023-us-B01001.resource.yaml
morpc.load_data | INFO | --> Schema file: ..\..\morpc-censusacs-fetch\output_data\morpc-acs5-2023-us-B01001.schema.yaml
morpc.load_data | INFO | Loading data.


In [7]:
data

Unnamed: 0,GEO_ID,SUMLEVEL,NAME,B01001_001E,B01001_001M,B01001_002E,B01001_002M,B01001_003E,B01001_003M,B01001_004E,...,B01001_045E,B01001_045M,B01001_046E,B01001_046M,B01001_047E,B01001_047M,B01001_048E,B01001_048M,B01001_049E,B01001_049M
0,0100000US,10,United States,332387540,-555555555,164545087,6966,9688436,4185,10296243,...,5576237,15826,7978348,17513,5461052,16334,3631914,12460,4050652,15097
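The value -555555555 in the B01001_001M column above is not a real margin of error. The Census Bureau uses large negative numbers as annotation codes, and this one marks an estimate that is controlled, so no sampling-error figure applies. Below is a minimal sketch of masking such codes before doing arithmetic on the data. The list of codes is an assumption to check against the Census documentation, and the small DataFrame is a stand-in built from the values shown above.

```python
import numpy as np
import pandas as pd

# Assumed list of Census annotation codes; verify before relying on it
ANNOTATION_CODES = [
    -222222222, -333333333, -555555555,
    -666666666, -888888888, -999999999,
]

# Stand-in for the first few columns of the loaded table
sample = pd.DataFrame({
    "GEO_ID": ["0100000US"],
    "B01001_001E": [332387540],
    "B01001_001M": [-555555555],
    "B01001_002E": [164545087],
    "B01001_002M": [6966],
})

# Replace annotation codes with NaN so they cannot leak into sums or ratios
cleaned = sample.replace(ANNOTATION_CODES, np.nan)
print(cleaned)
```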


#### Using ACS_ID_FIELDS to get the ID field names

In [9]:
idFields = [field["name"] for field in morpc.census.ACS_ID_FIELDS['us']]

In [10]:
morpc.census.acs_generate_universe_table(data.set_index("GEO_ID"), "B01001_001")

GEOID,Universe,Universe MOE
,332387540,-555555555


#### Create a dimension table with the data and the dimension names

In [11]:
dim_table = morpc.census.acs_generate_dimension_table(data.set_index("GEO_ID"), schema, idFields=idFields, dimensionNames=["Sex", "Age group"])

In [12]:
dim_table.loc[dim_table['Variable type'] == 'Estimate'].head()

Unnamed: 0,GEOID,Variable,Value,Sex,Age group,Variable type
0,,B01001_001E,332387540,,,Estimate
2,,B01001_002E,164545087,Male,,Estimate
4,,B01001_003E,9688436,Male,Under 5 years,Estimate
6,,B01001_004E,10296243,Male,5 to 9 years,Estimate
8,,B01001_005E,11032019,Male,10 to 14 years,Estimate
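Because the dimension table is in long format, subsets can be selected by filtering on the dimension columns. The sketch below rebuilds the five rows shown above as a stand-in DataFrame and sums the male estimates for the three youngest age groups. It assumes the blank cells are missing values; if the real table stores empty strings instead, the filter would need adjusting.

```python
import pandas as pd

# Stand-in for the rows shown above (GEOID column omitted)
dim = pd.DataFrame({
    "Variable": ["B01001_001E", "B01001_002E", "B01001_003E",
                 "B01001_004E", "B01001_005E"],
    "Value": [332387540, 164545087, 9688436, 10296243, 11032019],
    "Sex": [None, "Male", "Male", "Male", "Male"],
    "Age group": [None, None, "Under 5 years", "5 to 9 years",
                  "10 to 14 years"],
    "Variable type": ["Estimate"] * 5,
})

# Keep only rows where every dimension is filled in, which skips the
# grand total and the all-ages male subtotal and avoids double counting
leaf = dim.loc[
    (dim["Variable type"] == "Estimate")
    & dim["Sex"].notna()
    & dim["Age group"].notna()
]

males_under_15 = leaf.groupby("Sex")["Value"].sum()
print(males_under_15)
```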
