# 2024/06/05 LODES User Manual
_Author: Meaghan Freund_

LODES (LEHD Origin-Destination Employment Statistics) is a collection of data files organized by state. The data consists of the number of workers tied to a home geocode and/or a work geocode, with additional categories that characterize the types of workers.

There are three types of data files:
- RAC: Residence Area Characteristic data details the jobs totaled from home Census blocks
- WAC: Workplace Area Characteristic data details the jobs totaled from work Census blocks
- OD: Origin-Destination data details the jobs totaled from both the home and work Census blocks

For the following queries, the ADRIO maker relies on the Origin-Destination data files to show the movement from a home Census geoid to a work Census geoid. LODES files are organized at block granularity, but a user can query the movement at coarser granularities, such as state or county.
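
To illustrate how block-level records can be rolled up to a coarser granularity, here is a minimal sketch. It is not the ADRIO maker's implementation. It assumes each record is a (home block, work block, job count) triple and relies on the standard structure of Census GEOIDs, where the leading 2, 5, 11, and 12 digits of a 15-digit block code identify the state, county, tract, and block group. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict

# Number of leading GEOID digits that identify each Census granularity.
GEOID_LENGTH = {"state": 2, "county": 5, "tract": 11, "block_group": 12, "block": 15}


def coarsen(geoid: str, granularity: str) -> str:
    """Truncate a 15-digit block GEOID to the prefix for the given granularity."""
    return geoid[:GEOID_LENGTH[granularity]]


def aggregate_od(rows, granularity):
    """Sum job counts per (home, work) pair after coarsening both geocodes."""
    totals = defaultdict(int)
    for home_block, work_block, jobs in rows:
        key = (coarsen(home_block, granularity), coarsen(work_block, granularity))
        totals[key] += jobs
    return dict(totals)


# Hypothetical block-level rows: (home block, work block, job count).
block_rows = [
    ("040010001001000", "040010001001001", 3),
    ("040010001001002", "040030002001000", 2),
    ("040030002001005", "080010003001000", 1),
    ("040010001002000", "040030002002000", 4),
]

county_totals = aggregate_od(block_rows, "county")
state_totals = aggregate_od(block_rows, "state")
print(county_totals)
print(state_totals)
```

Keying on string prefixes keeps leading zeros intact (for example '04' for Arizona), which would be lost if the geocodes were parsed as integers.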

## Queries and uses of the LODES ADRIO Template

Below is the basic setup for calling the LODES ADRIO template, which varies depending on the granularity the user is seeking. In this case, the call builds ADRIOs at state granularity for the states given in the input.

At state granularity, the output counts all workers that move from one state to another, including those who live and work in the same state.

In [2]:
from epymorph.geo.adrio import adrio_maker_library
from epymorph.data_shape import Shapes
from epymorph.geo.dynamic import DynamicGeo
from epymorph.geo.spec import DynamicGeoSpec, Year, attrib
from epymorph.geography.us_census import (BlockGroupScope, CountyScope,
                                          StateScope, TractScope)
default_num = 4

spec = DynamicGeoSpec(
    attributes=[
        attrib('label', str, Shapes.N),
        attrib('geoid', str, Shapes.N),
        attrib('commuters', int, Shapes.NxN),
        attrib('commuters_29_under', int, Shapes.NxN),
        attrib('commuters_30_to_54', int, Shapes.NxN),
        attrib('commuters_55_over', int, Shapes.NxN),
        attrib('commuters_1250_under_earnings', int, Shapes.NxN),
        attrib('commuters_1251_to_3333_earnings', int, Shapes.NxN),
        attrib('commuters_3333_over_earnings', int, Shapes.NxN),
        attrib('commuters_goods_producing_industry', int, Shapes.NxN),
        attrib('commuters_trade_transport_utility_industry', int, Shapes.NxN),
        attrib('commuters_other_industry', int, Shapes.NxN),
        attrib('all_jobs', int, Shapes.NxN),
        attrib('primary_jobs', int, Shapes.NxN),
        attrib('all_private_jobs', int, Shapes.NxN),
        attrib('private_primary_jobs', int, Shapes.NxN),
        attrib('all_federal_jobs', int, Shapes.NxN),
        attrib('federal_primary_jobs', int, Shapes.NxN)
    ],
    time_period=Year(2015),
    scope=StateScope.in_states_by_code(["AZ", "CO", "NV", "NM"]),
    source={
        'label': 'LODES:name',
        'geoid': 'LODES',
        'commuters': 'LODES',
        'commuters_29_under': 'LODES',
        'commuters_30_to_54': 'LODES',
        'commuters_55_over': 'LODES',
        'commuters_1250_under_earnings': 'LODES',
        'commuters_1251_to_3333_earnings': 'LODES',
        'commuters_3333_over_earnings': 'LODES',
        'commuters_goods_producing_industry': 'LODES',
        'commuters_trade_transport_utility_industry': 'LODES',
        'commuters_other_industry': 'LODES',
        'all_jobs': 'LODES',
        'primary_jobs': 'LODES',
        'all_private_jobs': 'LODES',
        'private_primary_jobs': 'LODES',
        'all_federal_jobs': 'LODES',
        'federal_primary_jobs': 'LODES'
    }
)

geo = DynamicGeo.from_library(spec, adrio_maker_library)

A user may swap in a different scope to change the granularity and input, using the 'replace' function from 'dataclasses'. The example below organizes matrices at county granularity with a state input. This means that the program takes all counties in each given state and counts the workers moving between them.

In [2]:
from dataclasses import replace

spec = replace(spec, scope=CountyScope.in_states_by_code(["AZ", "CO", "NV", "NM"]))
geo = DynamicGeo.from_library(spec, adrio_maker_library)

A user may also request specific geoids for an ADRIO. The example below is again organized at county granularity, but only between the county codes given in the input.

In [None]:
from dataclasses import replace

spec = replace(spec, scope=CountyScope.in_counties(
    ["04001", "04003", "04005", "04009"]))
geo = DynamicGeo.from_library(spec, adrio_maker_library)

# Query Types

## Label
The first type of query outputs the proper names of the locations, which makes it easier to see how each matrix is being read and output. This only works at state and county granularities and will raise an error at finer granularities.

In [4]:
print(f"label: {geo['label'][0:default_num]}")

label: ['Arizona' 'Colorado' 'Nevada' 'New Mexico']


## Geoid

The 'geoid' query lists all of the geoids that are included in an ADRIO: either only the geoids given in the input, or every geoid of the chosen granularity within the given states.

In [5]:
print(f"geoid: {geo['geoid'][0:default_num]}")

geoid: ['04' '08' '32' '35']


## Total Commuters

The first matrix query is the 'commuters' call, which outputs a matrix indexed by the given locations at the chosen granularity, where each entry is the total number of commuters between a pair of locations.

In [6]:
print(f"total commuters:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['commuters'][i]}")

total commuters:

work geocode: Arizona [2550132    1202    3552    6813]
work geocode: Colorado [   2582 2405258     535    4824]
work geocode: Nevada [  13263     382 1179411     409]
work geocode: New Mexico [  8100   5557    361 764244]
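
As a small worked example of reading this matrix, the sketch below copies the values printed above into plain Python lists and computes, for each row, the fraction of the row total that sits on the diagonal, that is, commuters whose home and work state are the same. The helper name is hypothetical, and nothing here depends on whether rows represent home or work locations, since the diagonal is the same either way.

```python
# Labels and total-commuter matrix copied from the output above.
labels = ["Arizona", "Colorado", "Nevada", "New Mexico"]
commuters = [
    [2550132, 1202, 3552, 6813],
    [2582, 2405258, 535, 4824],
    [13263, 382, 1179411, 409],
    [8100, 5557, 361, 764244],
]


def same_location_share(matrix):
    """For each row i, return matrix[i][i] divided by the total of row i."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


shares = same_location_share(commuters)
for name, share in zip(labels, shares):
    print(f"{name}: {share:.4f} of the row total is on the diagonal")
```

With a NumPy array such as the one returned by the query, the same quantities could be computed from the array's diagonal and row sums instead of list comprehensions.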


# Query Workers by Job Type

Under the Origin-Destination data, each file holds multiple attributes. Files are specified by year, state, whether residence is in-state or out-of-state, file type, and job type. Users can specify the job type for the matrices. The job types are:
- JT00: All Jobs
- JT01: Primary Jobs
- JT02: All Private Jobs
- JT03: Private Primary Jobs
- JT04: All Federal Jobs
- JT05: Federal Primary Jobs

With the exception of the 'all jobs' query, the queries only display the total number of workers under that job type.
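
The sketch below pairs each job type code with the attribute name used in the spec earlier in this manual, and shows how the file parameters listed above could combine into a file name. The naming pattern (lowercase state, 'od', 'main' for in-state residence or 'aux' for out-of-state residence, job type, year) is an assumption based on the publicly documented LODES file names, and 'od_filename' is a hypothetical helper, not part of epymorph.

```python
# Job type codes mapped to the attribute names used in the spec.
JOB_TYPE_ATTRIBUTES = {
    "JT00": "all_jobs",
    "JT01": "primary_jobs",
    "JT02": "all_private_jobs",
    "JT03": "private_primary_jobs",
    "JT04": "all_federal_jobs",
    "JT05": "federal_primary_jobs",
}


def od_filename(state: str, job_type: str, year: int, part: str = "main") -> str:
    """Build an OD file name under the assumed LODES naming pattern.

    part: "main" for residence in-state, "aux" for residence out-of-state.
    """
    if job_type not in JOB_TYPE_ATTRIBUTES:
        raise ValueError(f"unknown job type: {job_type}")
    if part not in ("main", "aux"):
        raise ValueError(f"unknown part: {part}")
    return f"{state.lower()}_od_{part}_{job_type}_{year}.csv.gz"


for code, attribute in JOB_TYPE_ATTRIBUTES.items():
    print(code, attribute, od_filename("AZ", code, 2015))
```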

## All Jobs

All jobs is the default call and is currently the only one that can break workers down into different categories, as described in the 'Query Workers by Categories' section. The other job types include only the total workers under that job type.

In [None]:
print(f"total workers for all jobs:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['all_jobs'][i]}")

## Primary Jobs

In [4]:
print(f"total workers only under primary jobs:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['primary_jobs'][i]}")

total workers only under primary jobs:

work geocode: Arizona [2388534    1099    3244    6528]
work geocode: Colorado [   2318 2251857     466    4417]
work geocode: Nevada [  12181     349 1085723     380]
work geocode: New Mexico [  7517   5101    331 715373]
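
To compare job types, the sketch below subtracts the primary-jobs matrix printed above from the total-commuters matrix printed earlier. It assumes that the 'commuters' output corresponds to the all jobs (JT00) counts, since all jobs is described as the default call. The values are copied from the two outputs, and the variable names are hypothetical.

```python
# Values copied from the 'commuters' output (assumed to be all jobs, JT00).
all_jobs = [
    [2550132, 1202, 3552, 6813],
    [2582, 2405258, 535, 4824],
    [13263, 382, 1179411, 409],
    [8100, 5557, 361, 764244],
]

# Values copied from the 'primary_jobs' output above.
primary_jobs = [
    [2388534, 1099, 3244, 6528],
    [2318, 2251857, 466, 4417],
    [12181, 349, 1085723, 380],
    [7517, 5101, 331, 715373],
]

# Element-wise difference: counts in all jobs that are not in primary jobs.
non_primary = [
    [a - p for a, p in zip(all_row, primary_row)]
    for all_row, primary_row in zip(all_jobs, primary_jobs)
]

for row in non_primary:
    print(row)
```

Every difference is non-negative for these values, which is consistent with primary jobs being a subset of all jobs.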


## All Private Jobs


In [None]:
print(f"total workers under all private jobs:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['all_private_jobs'][i]}")

## Private Primary Jobs

In [None]:
print(f"total workers only under private primary jobs:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['private_primary_jobs'][i]}")

## All Federal Jobs

In [None]:
print(f"total workers under all federal jobs:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['all_federal_jobs'][i]}")

## Federal Primary Jobs

In [None]:
print(f"total workers only under federal primary jobs:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]} {geo['federal_primary_jobs'][i]}")

# Query Workers by Categories

Beyond total commuters, LODES holds data about the types of workers that commute from location to location, categorized by age, monthly earnings, and the industry they work in.

## Commuters By Age

LODES records the age of commuting employees in three ranges:
- Workers 29 years old and younger
- Workers 30 years old to 54 years old
- Workers 55 years old and older

The three ranges can be called as separate queries:

### 29 and Younger

In [None]:
print(f"total commuters that are 29 or younger:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_29_under'][i]}")

### 30 to 54 Years Old

In [None]:
print(f"total commuters that are ages between 30 to 54:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_30_to_54'][i]}")

### 55 and Older

In [None]:
print(f"total commuters that are 55 or older:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_55_over'][i]}")
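
Because the three age ranges cover every possible age, one sanity check is to confirm that the three age matrices add up to the total commuters matrix. The sketch below shows that check on small made-up matrices. The numbers are hypothetical and chosen only to illustrate the arithmetic, and the check assumes the age queries and the total come from the same job type and year.

```python
def elementwise_sum(*matrices):
    """Add any number of equally shaped matrices cell by cell."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*matrices)]


# Hypothetical 2x2 matrices standing in for the three age queries.
age_29_under = [[30, 2], [1, 20]]
age_30_to_54 = [[50, 3], [2, 40]]
age_55_over = [[20, 1], [1, 15]]

# Hypothetical total-commuters matrix for the same two locations.
total = [[100, 6], [4, 75]]

combined = elementwise_sum(age_29_under, age_30_to_54, age_55_over)
print("age ranges sum to total:", combined == total)
```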

## Commuters by Earnings

LODES datasets report earnings as monthly amounts, split into three ranges in the same way as the age categories:

- Workers earning $1250 or under per month
- Workers earning $1251 to $3333 per month
- Workers earning more than $3333 per month

As with the ages, these three categories can be called as three separate queries:

### $1250 and Under Earned Monthly

In [None]:
print(f"total commuters that earn $1250 or less monthly:\n")

for i in range(default_num):
    print(
        f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_1250_under_earnings'][i]}")

### Between $1251 and $3333 Earned Monthly

In [None]:
print(f"total commuters that earn between $1251-$3333 monthly:\n")

for i in range(default_num):
    print(
        f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_1251_to_3333_earnings'][i]}")

### Above $3333 Earned Monthly

In [None]:
print(f"total commuters that earn greater than $3333 monthly:\n")

for i in range(default_num):
    print(
        f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_3333_over_earnings'][i]}")
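
To make the boundaries between the three ranges explicit, the sketch below maps a monthly earnings amount to the attribute name of the matching query. The function is hypothetical and only illustrates the thresholds listed in this section. It treats any amount above $1250 and up to $3333 as the middle range.

```python
def earnings_category(monthly_earnings: float) -> str:
    """Return the attribute name for the range containing monthly_earnings."""
    if monthly_earnings < 0:
        raise ValueError("monthly earnings cannot be negative")
    if monthly_earnings <= 1250:
        return "commuters_1250_under_earnings"
    if monthly_earnings <= 3333:
        return "commuters_1251_to_3333_earnings"
    return "commuters_3333_over_earnings"


for amount in (900, 1250, 1251, 3333, 3334):
    print(amount, earnings_category(amount))
```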

## Commuters By Industry

LODES also tracks the broad industry that commuters work in. As with age and earnings, LODES separates industry into three categories. These categories are not ranges, however, since industry is not as cut and dried as earnings or age.

LODES splits commuters by the following industries:
- Workers that work in Goods Producing industry sectors
- Workers that work in Trade, Transportation, and Utility industry sectors
- Workers that work in all other service industry sectors not covered by the two categories above


### Goods Producing Industry

In [None]:
print(f"total commuters that work in Goods Producing industry sectors:\n")

for i in range(default_num):
    print(
        f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_goods_producing_industry'][i]}")

### Trade, Transport, Utility Industries

In [None]:
print(f"total commuters that work in Trade, Transportation, and Utility industry sectors:\n")

for i in range(default_num):
    print(
        f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_trade_transport_utility_industry'][i]}")

### Any Other Industry

In [None]:
print(f"total commuters that work in all other services industry sectors:\n")

for i in range(default_num):
    print(f"work geocode: {geo['geoid'][i]}\n  {geo['commuters_other_industry'][i]}")
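
Once the three industry queries are available, a common next step is to express them as shares of the commuters between one pair of locations. The sketch below does this for a single made-up origin-destination pair. The counts are hypothetical and 'category_shares' is an illustrative helper, not part of epymorph.

```python
def category_shares(counts):
    """Convert a mapping of category -> count into category -> fraction."""
    total = sum(counts.values())
    if total == 0:
        # No commuters for this pair, so every share is zero.
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


# Hypothetical counts for one home/work pair, keyed by attribute name.
industry_counts = {
    "commuters_goods_producing_industry": 20,
    "commuters_trade_transport_utility_industry": 30,
    "commuters_other_industry": 50,
}

industry_shares = category_shares(industry_counts)
for name, share in industry_shares.items():
    print(f"{name}: {share:.2f}")
```

The same helper would apply to the age and earnings categories, since each of them also splits commuters into three groups.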