# Census Data Tools

MORPC works regularly with census data, including but not limited to ACS 5-year and 1-year estimates, the Decennial Census, PEP, and geographies. The following module is useful for gathering and organizing census data for processes in various workflows. Those workflows are linked where appropriate.

In [None]:
import morpc

## API functions and variables

`api_get()` is a low-level wrapper for Census API requests that returns the results as a pandas DataFrame. If necessary, it splits the request into several smaller requests to work around the 50-variable limit imposed by the API.

The resulting DataFrame is indexed by GEOID (regardless of whether it was requested) and omits fields that were not requested but that the API returns automatically with each request (e.g. "state", "county").

In [None]:
url = 'https://api.census.gov/data/2022/acs/acs1'
params = {
    "get": "GEO_ID,NAME,B01001_001E",
    "for": "county:049,041",
    "in": "state:39"
}

In [None]:
api = morpc.census.api_get(url, params)

In [None]:
api
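
The request splitting described above can be sketched with plain pandas. This is only an illustration of the idea, not the code inside `api_get()`: `chunk_variables` is a hypothetical helper, the variable names are placeholders, and the per-chunk "responses" are DataFrames of zeros standing in for real API results. Using a chunk size of 49 assumes one slot is reserved so that GEO_ID can be requested with every chunk.

```python
import pandas as pd


def chunk_variables(variables, limit=50):
    """Split a list of variable names into consecutive chunks of at most `limit` items."""
    return [variables[i:i + limit] for i in range(0, len(variables), limit)]


# 120 placeholder variable names (not real Census variables).
variables = [f"VAR_{i:03d}" for i in range(1, 121)]

# Assumption: keep one slot free per request for GEO_ID, so use 49 per chunk.
chunks = chunk_variables(variables, limit=49)

# Stand-ins for the per-chunk API responses: every chunk shares the same GEO_ID index.
geoids = pd.Index(["0500000US39041", "0500000US39049"], name="GEO_ID")
frames = [pd.DataFrame(0, index=geoids, columns=chunk) for chunk in chunks]

# Stitch the chunks back together column-wise; rows align on GEO_ID.
combined = pd.concat(frames, axis="columns")
print([len(chunk) for chunk in chunks], combined.shape)
```

Because every chunk is indexed by the same GEO_ID values, the column-wise concatenation needs no explicit merge key.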

## American Community Survey (ACS) Data Class

When using ACS data, we will generally be working with data produced by the [morpc-censusacs-fetch](https://github.com/morpc/morpc-censusacs-fetch) workflow. By default, that script saves its output in its output_data folder: ./morpc-censusacs-fetch/output_data/

The Census ACS Fetch script leverages the `acs_data` class from `morpc.census`.


### Create an initial object that represents a variable group in the ACS data API

The class takes 3 arguments:

1. variable group number
2. the year
3. the type of survey (1 or 5 year estimates)

In [None]:
from morpc.census import acs_data

In [None]:
acs = acs_data('B01001', '2023', '5')

The initial call queries the Census API for the variable definitions and builds a dictionary of the variables available in the group; see `acs.VARS`.

In [None]:
acs.VARS

The initial call also fetches a list of dimensions from a cached JSON file at ./morpc/census/acs_variable_group.json, which is stored in morpc.census.ACS_VAR_GROUPS.

#### Manual verification of variable dimension names

The list of dimensions is created automatically from the Census variable labels and needs to be verified before use. If the dimension names have not been verified, they will not be stored. Open the JSON file and check that it lists the correct number of dimensions and that they are in the correct order, then change the verification field to `true`.

In [None]:
acs.DIMENSIONS
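
To see why the automatically derived dimensions need a manual check, it helps to look at how a Census variable label is structured. Labels separate nesting levels with `!!`, and recent vintages add a trailing colon to levels that have children. The sketch below splits a label into its levels; `label_to_dimensions` is a hypothetical helper written for this example and is not the parser used by `morpc.census`.

```python
def label_to_dimensions(label):
    """Split a Census variable label into (variable type, list of dimension values)."""
    # Levels are separated by "!!"; strip the trailing ":" some vintages append.
    parts = [part.rstrip(":").strip() for part in label.split("!!")]
    # The first level is the variable type (e.g. "Estimate"); the rest are dimension values.
    return parts[0], parts[1:]


var_type, dims = label_to_dimensions("Estimate!!Total:!!Male:!!Under 5 years")
print(var_type, dims)
```

The label only yields the dimension values ("Total", "Male", "Under 5 years"). The dimension names (for example "Sex" and "Age group") are not part of the label, which is why they have to be confirmed by hand.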

### Query the API for the desired variables and geography

The `.query()` method queries the API and caches the data in memory under `acs.DATA`. At the same time, it creates a frictionless schema that corresponds to the data.

#### scope:
Pre-defined sumlevels and scopes for commonly queried geographies; see `morpc.census.SCOPES`.

In [None]:
acs = acs.query(for_param='county:*', in_param='state:39')

### For custom queries, use the for and in parameters to pass to the API query

#### for_param:
(Optional) The geographies to query. "state:\*" represents all states; "state:39" represents Ohio.

#### in_param:
(Optional) A filter for the for parameter. In combination, the two let you request small geographies inside larger ones.

> Examples: for_param="county:\*", in_param="state:39" gets all counties in Ohio.
> for_param="tract:\*", in_param='state:39,county:041,049' gets all census tracts in Delaware and Franklin Counties.
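
For reference, the for and in filters end up as query-string parameters on the Census API request. The following is a minimal sketch using only the standard library. It assumes each level of the parent hierarchy is sent as its own `in` parameter, a form the Census API accepts; how `morpc.census` assembles the request internally is not shown in this document and may differ. `build_census_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_census_url(base_url, get_vars, for_param, in_params=()):
    """Assemble a Census API request URL (hypothetical helper, for illustration only)."""
    query = [("get", ",".join(get_vars)), ("for", for_param)]
    # One "in" entry per parent geography level, e.g. state first, then county.
    query += [("in", clause) for clause in in_params]
    # Keep ':', '*' and ',' readable instead of percent-encoding them.
    return f"{base_url}?{urlencode(query, safe=':*,')}"


# All census tracts in Delaware (041) and Franklin (049) Counties, Ohio (39).
url = build_census_url(
    "https://api.census.gov/data/2023/acs/acs5",
    get_vars=["GEO_ID", "NAME", "B01001_001E"],
    for_param="tract:*",
    in_params=["state:39", "county:041,049"],
)
print(url)
```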

### Filter the variables using the get parameter

#### get_param:
(Optional) If you want to return a subset of variables, they can be passed here as a list.

### Dimension Tables

When the query is called, the class builds a table that includes the dimensions. This table can be used to get quick summaries of the data.

In [None]:
acs.DIM_TABLE.LONG

In [None]:
acs.DIM_TABLE.WIDE

In [None]:
acs.DIM_TABLE.PERCENT
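
The three views can be approximated with plain pandas to show how they relate to each other. This is a sketch under stated assumptions, not the implementation behind `DIM_TABLE`: the estimates are made-up numbers, the `dims` lookup is built by hand, and the percent view is assumed to express each variable as a share of the group total (`B01001_001E`).

```python
import pandas as pd

# Toy estimates indexed by GEO_ID (made-up numbers, for illustration only).
data = pd.DataFrame(
    {
        "B01001_001E": [400, 200],  # Total
        "B01001_002E": [100, 100],  # Total / Male
        "B01001_026E": [300, 100],  # Total / Female
    },
    index=pd.Index(["0500000US39041", "0500000US39049"], name="GEO_ID"),
)

# Hand-built lookup from variable to dimension values (hypothetical).
dims = pd.DataFrame(
    {"DIM_0": ["Total", "Total", "Total"], "DIM_1": [None, "Male", "Female"]},
    index=pd.Index(["B01001_001E", "B01001_002E", "B01001_026E"], name="variable"),
)

# Long: one row per (geography, variable) pair, with the dimension values attached.
long_table = (
    data.reset_index()
    .melt(id_vars="GEO_ID", var_name="variable", value_name="estimate")
    .join(dims, on="variable")
)

# Wide: one row per variable, one column per geography.
wide_table = long_table.pivot(index="variable", columns="GEO_ID", values="estimate")

# Percent: each value as a share of the group total for the same geography.
percent_table = wide_table.div(wide_table.loc["B01001_001E"], axis="columns") * 100
print(percent_table)
```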

### Save raw data (not dim table) as a frictionless resource with schema

After querying the data, save the data as a frictionless resource with reasonable descriptors. 

In [None]:
acs.save(output_dir='./temp_data/')

In [None]:
acs.SCHEMA

In [None]:
acs.RESOURCE
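
To make the idea of a schema that mirrors the data more concrete, the sketch below derives a Table Schema style descriptor (a list of field names and types) from a DataFrame's dtypes. It uses plain pandas rather than the frictionless library or `morpc`, the values are placeholders, and `infer_table_schema` is a hypothetical helper; the descriptor held in `acs.SCHEMA` may contain more detail.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype


def infer_table_schema(df):
    """Build a minimal Table Schema style descriptor from a DataFrame (illustrative only)."""
    fields = []
    # Include the index as a regular field so the key column is described too.
    for name, dtype in df.reset_index().dtypes.items():
        if is_integer_dtype(dtype):
            field_type = "integer"
        elif is_float_dtype(dtype):
            field_type = "number"
        else:
            field_type = "string"
        fields.append({"name": name, "type": field_type})
    return {"fields": fields, "primaryKey": df.index.name}


# Placeholder values, not real ACS estimates.
data = pd.DataFrame(
    {
        "NAME": ["Delaware County, Ohio", "Franklin County, Ohio"],
        "B01001_001E": [1, 2],
        "B01001_001M": [0.5, 1.5],
    },
    index=pd.Index(["0500000US39041", "0500000US39049"], name="GEO_ID"),
)

schema = infer_table_schema(data)
print(schema)
```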

## Load data from cached file

In [None]:
import morpc

In [None]:
acs = morpc.census.ACS('B25010', '2023', '5').load(scope='region15-tracts', dirname='./temp_data/')

## Georeference the data to map

Add geometries by joining GEOS to DATA.

In [None]:
acs.GEOS

In [None]:
import geopandas as gpd
acs.DATA = gpd.GeoDataFrame(acs.DATA.join(acs.GEOS), geometry='geometry')

In [None]:
acs.DATA.plot(column='B01001_002E')

## Use the built-in .explore() method to view a map of all the columns in the data

In [None]:
acs.explore(table='PERCENTS')

In [None]:
acs.MAP

## Using the rest_api module to fetch geometry data from Census API

In [None]:
import morpc.rest_api as rest_api
import morpc.census as census

In [None]:

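# Look up the layer URL for the 2024 ACS county subdivisions layer
url = rest_api.get_layer_url(2024, 'county subdivisions', survey='ACS')

# Limit the request to Franklin County, Ohio (state 39, county 049)
query = "STATE = '39' and COUNTY = '049'"

resource = rest_api.resource(
    name='morpc-franklin-tracts',
    url=url,
    where=query,
    max_record_count=500
)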

In [None]:
gdf = rest_api.gdf_from_resource(resource)

## The functions below should still work, but the plan is to fold them into the ACS class

#### Load the data using frictionless.load_data()

In [None]:
data, resource, schema = morpc.frictionless.load_data('./temp_data/morpc-acs5-2023-state-B01001.resource.yaml', verbose=False)

#### Use ACS_ID_FIELDS to get the ID fields

In [None]:
morpc.census.acs_generate_universe_table(data.set_index("GEO_ID"), "B01001_001")

#### Create a dimension table with the data and the dimension names

In [None]:
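# NOTE: idFields is not defined earlier in this notebook; define it (presumably from morpc.census.ACS_ID_FIELDS) before running this cell.
dim_table = morpc.census.acs_generate_dimension_table(data.set_index("GEO_ID"), schema, idFields=idFields, dimensionNames=["Sex", "Age group"])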

In [None]:
dim_table.loc[dim_table['Variable type'] == 'Estimate'].head()

### Build ACS Variable Group JSON for Dimension names

In [None]:
import morpc

In [None]:
acs = morpc.census.ACS('B02001', 2023, 5).query(for_param='county:*', in_param='state:39')

In [None]:
acs.map(table='PERCENTS', verbose=True)

In [1]:
from morpc import census



In [2]:
acs = census.ACS(group="B12001", year='2023', survey="5").scope('tract', 'region15')

MESSAGE | morpc.census.fetch_geos | Combining geometries...


In [4]:
acs.DIM_TABLE.WIDE

| | | | | GEO_ID | 1400000US39041010100 | 1400000US39041010200 | 1400000US39041010420 | 1400000US39041010421 | 1400000US39041010422 | 1400000US39041010520 | 1400000US39041010530 | 1400000US39041011101 | 1400000US39041011102 | 1400000US39041011200 | ... | 1400000US39159050303 | 1400000US39159050304 | 1400000US39159050401 | 1400000US39159050402 | 1400000US39159050501 | 1400000US39159050502 | 1400000US39159050601 | 1400000US39159050602 | 1400000US39159050701 | 1400000US39159050702 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | | | | NAME | Census Tract 101; Delaware County; Ohio | Census Tract 102; Delaware County; Ohio | Census Tract 104.20; Delaware County; Ohio | Census Tract 104.21; Delaware County; Ohio | Census Tract 104.22; Delaware County; Ohio | Census Tract 105.20; Delaware County; Ohio | Census Tract 105.30; Delaware County; Ohio | Census Tract 111.01; Delaware County; Ohio | Census Tract 111.02; Delaware County; Ohio | Census Tract 112; Delaware County; Ohio | ... | Census Tract 503.03; Union County; Ohio | Census Tract 503.04; Union County; Ohio | Census Tract 504.01; Union County; Ohio | Census Tract 504.02; Union County; Ohio | Census Tract 505.01; Union County; Ohio | Census Tract 505.02; Union County; Ohio | Census Tract 506.01; Union County; Ohio | Census Tract 506.02; Union County; Ohio | Census Tract 507.01; Union County; Ohio | Census Tract 507.02; Union County; Ohio |
| | | | | REFERENCE_YEAR | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | ... | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 | 2023 |
| DIM_0 | DIM_1 | DIM_2 | DIM_3 | DIM_4 | | | | | | | | | | | | | | | | | | | | | |
| Total | | | | | 4670.0 | 5073.0 | 2443.0 | 2240.0 | 4845.0 | 6081.0 | 2701.0 | 2462.0 | 4788.0 | 3186.0 | ... | 7235.0 | 2146.0 | 4686.0 | 3148.0 | 4262.0 | 1959.0 | 8524.0 | 3654.0 | 3479.0 | 2494.0 |
| Total | Female | | | | 2300.0 | 2477.0 | 1203.0 | 1178.0 | 2490.0 | 2893.0 | 1382.0 | 1156.0 | 2518.0 | 1723.0 | ... | 3868.0 | 1037.0 | 3554.0 | 1398.0 | 2187.0 | 1031.0 | 4187.0 | 1659.0 | 1817.0 | 1297.0 |
| Total | Female | Divorced | | | 240.0 | 463.0 | 212.0 | 145.0 | 290.0 | 313.0 | 203.0 | 204.0 | 363.0 | 182.0 | ... | 396.0 | 117.0 | 664.0 | 225.0 | 320.0 | 179.0 | 197.0 | 188.0 | 72.0 | 127.0 |
| Total | Female | Never married | | | 1053.0 | 795.0 | 391.0 | 188.0 | 632.0 | 483.0 | 431.0 | 298.0 | 611.0 | 526.0 | ... | 1139.0 | 217.0 | 1604.0 | 341.0 | 557.0 | 313.0 | 636.0 | 476.0 | 238.0 | 326.0 |
| Total | Female | Now married | | | 938.0 | 1080.0 | 551.0 | 665.0 | 1391.0 | 1801.0 | 516.0 | 546.0 | 1448.0 | 805.0 | ... | 2288.0 | 599.0 | 1019.0 | 756.0 | 1097.0 | 420.0 | 3207.0 | 911.0 | 1426.0 | 730.0 |
| Total | Female | Now married | Married, spouse absent | | 17.0 | 167.0 | 20.0 | 121.0 | 94.0 | 63.0 | 49.0 | 39.0 | 28.0 | 58.0 | ... | 214.0 | 0.0 | 462.0 | 42.0 | 57.0 | 0.0 | 173.0 | 16.0 | 14.0 | 36.0 |
| Total | Female | Now married | Married, spouse absent | Other | 8.0 | 67.0 | 0.0 | 58.0 | 45.0 | 55.0 | 33.0 | 4.0 | 19.0 | 13.0 | ... | 137.0 | 0.0 | 357.0 | 4.0 | 24.0 | 0.0 | 23.0 | 0.0 | 14.0 | 3.0 |
| Total | Female | Now married | Married, spouse absent | Separated | 9.0 | 100.0 | 20.0 | 63.0 | 49.0 | 8.0 | 16.0 | 35.0 | 9.0 | 45.0 | ... | 77.0 | 0.0 | 105.0 | 38.0 | 33.0 | 0.0 | 150.0 | 16.0 | 0.0 | 33.0 |
| Total | Female | Now married | Married, spouse present | | 921.0 | 913.0 | 531.0 | 544.0 | 1297.0 | 1738.0 | 467.0 | 507.0 | 1420.0 | 747.0 | ... | 2074.0 | 599.0 | 557.0 | 714.0 | 1040.0 | 420.0 | 3034.0 | 895.0 | 1412.0 | 694.0 |
| Total | Female | Widowed | | | 69.0 | 139.0 | 49.0 | 180.0 | 177.0 | 296.0 | 232.0 | 108.0 | 96.0 | 210.0 | ... | 45.0 | 104.0 | 267.0 | 76.0 | 213.0 | 119.0 | 147.0 | 84.0 | 81.0 | 114.0 |
