# 2024/06/05 LODES User Manual
_Author: Meaghan Freund_

LODES (LEHD Origin-Destination Employment Statistics) is a collection of data files organized by state. The data consist of the number of workers who commute from a home geocode, to a work geocode, or between the two, broken down into various worker categories depending on the type of file being viewed.

There are three types of data files:
- RAC: Residence Area Characteristic data details the jobs totaled from home Census blocks
- WAC: Workplace Area Characteristic data details the jobs totaled from work Census blocks
- OD: Origin-Destination data details the jobs totaled from both the home and work Census blocks

For the below queries, the ADRIO maker relies on the Origin-Destination data files to properly show the movement from a home Census block to a work Census block. 
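The step from block-level Origin-Destination rows to a location-by-location matrix can be sketched as follows. This is a minimal illustration, not the ADRIO maker's actual implementation: the sample rows are made up, `aggregate_to_counties` is a hypothetical helper, and the row/column orientation is chosen only for this sketch. It relies on the fact that a 15-digit Census block geocode begins with the 5-digit state and county code.

```python
# Illustrative (made-up) OD rows: (home block geocode, work block geocode, job count).
od_rows = [
    ("040019426001000", "040019426001001", 3),
    ("040019426001000", "040030001001000", 1),
    ("040030001001000", "040030001001002", 5),
    ("040030001001000", "040019426001001", 2),
]


def aggregate_to_counties(rows, counties):
    """Sum block-level job counts into a county-by-county matrix.

    In this sketch, rows index the home county and columns the work county.
    """
    index = {geoid: i for i, geoid in enumerate(counties)}
    matrix = [[0] * len(counties) for _ in counties]
    for home_block, work_block, jobs in rows:
        # The first five digits of a block geocode identify state + county.
        home, work = home_block[:5], work_block[:5]
        # Skip rows whose home or work county is outside the requested scope.
        if home in index and work in index:
            matrix[index[home]][index[work]] += jobs
    return matrix


counties = ["04001", "04003"]
matrix = aggregate_to_counties(od_rows, counties)
for geoid, row in zip(counties, matrix):
    print(geoid, row)
```

Truncating the geocode to other lengths (2 digits for state, 11 for tract, 12 for block group) would give the other granularities in the same way.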

## Queries and uses of the LODES ADRIO Template

The LODES call can be adjusted depending on which states, or which finer granularities within them, users want to see.

### Example 1: getting all instances of counties from given states

In [1]:
from pathlib import Path

import numpy as np

from epymorph.data_shape import Shapes
from epymorph.geo.spec import DynamicGeoSpec, Year, attrib
from epymorph.geography.us_census import (BlockGroupScope, CountyScope,
                                          StateScope, TractScope)

spec = DynamicGeoSpec(
    attributes=[
        attrib('label', str, Shapes.N),
        attrib('commuters', int, Shapes.NxN),
        attrib('commuters_29_under', int, Shapes.NxN),
        attrib('commuters_30_to_54', int, Shapes.NxN),
        attrib('commuters_55_over', int, Shapes.NxN),
        attrib('commuters_1250_under_earnings', int, Shapes.NxN),
        attrib('commuters_1251_to_3333_earnings', int, Shapes.NxN),
        attrib('commuters_3333_over_earnings', int, Shapes.NxN),
        attrib('commuters_goods_producing_industry', int, Shapes.NxN),
        attrib('commuters_trade_transport_utility_industry', int, Shapes.NxN),
        attrib('commuters_other_industry', int, Shapes.NxN),
        attrib('all_jobs', int, Shapes.NxN),
        attrib('primary_jobs', int, Shapes.NxN),
        attrib('all_private_jobs', int, Shapes.NxN),
        attrib('private_primary_jobs', int, Shapes.NxN),
        attrib('all_federal_jobs', int, Shapes.NxN),
        attrib('federal_primary_jobs', int, Shapes.NxN)
    ],
    time_period=Year(2015),
    scope=CountyScope.in_states_by_code(["AZ", "CO", "NV", "NM"]),
    source={
        'label': 'LODES:name',
        'home_geoid': 'LODES',
        'work_geoid': 'LODES',
        'commuters': 'LODES',
        'commuters_29_under': 'LODES',
        'commuters_30_to_54': 'LODES',
        'commuters_55_over': 'LODES',
        'commuters_1250_under_earnings': 'LODES',
        'commuters_1251_to_3333_earnings': 'LODES',
        'commuters_3333_over_earnings': 'LODES',
        'commuters_goods_producing_industry': 'LODES',
        'commuters_trade_transport_utility_industry': 'LODES',
        'commuters_other_industry': 'LODES',
        'all_jobs': 'LODES',
        'primary_jobs': 'LODES',
        'all_private_jobs': 'LODES',
        'private_primary_jobs': 'LODES',
        'all_federal_jobs': 'LODES',
        'federal_primary_jobs': 'LODES'
    }
)

### Example 2: only using specific county geoids given by the user

In [None]:
from pathlib import Path

import numpy as np

from epymorph.data_shape import Shapes
from epymorph.geo.spec import DynamicGeoSpec, Year, attrib
from epymorph.geography.us_census import (BlockGroupScope, CountyScope,
                                          StateScope, TractScope)

spec = DynamicGeoSpec(
    attributes=[
        attrib('label', str, Shapes.N),
        attrib('geoid', str, Shapes.N),
        attrib('commuters', int, Shapes.NxN),
        attrib('commuters_29_under', int, Shapes.NxN),
        attrib('commuters_30_to_54', int, Shapes.NxN),
        attrib('commuters_55_over', int, Shapes.NxN),
        attrib('commuters_1250_under_earnings', int, Shapes.NxN),
        attrib('commuters_1251_to_3333_earnings', int, Shapes.NxN),
        attrib('commuters_3333_over_earnings', int, Shapes.NxN),
        attrib('commuters_goods_producing_industry', int, Shapes.NxN),
        attrib('commuters_trade_transport_utility_industry', int, Shapes.NxN),
        attrib('commuters_other_industry', int, Shapes.NxN),
        attrib('all_jobs', int, Shapes.NxN),
        attrib('primary_jobs', int, Shapes.NxN),
        attrib('all_private_jobs', int, Shapes.NxN),
        attrib('private_primary_jobs', int, Shapes.NxN),
        attrib('all_federal_jobs', int, Shapes.NxN),
        attrib('federal_primary_jobs', int, Shapes.NxN)
    ],
    time_period=Year(2021),
    scope=CountyScope.in_counties(["04001", "04003", "04005", "04009"]),
    source={
        'label': 'LODES:name',
        'geoid': 'LODES',
        'commuters': 'LODES',
        'commuters_29_under': 'LODES',
        'commuters_30_to_54': 'LODES',
        'commuters_55_over': 'LODES',
        'commuters_1250_under_earnings': 'LODES',
        'commuters_1251_to_3333_earnings': 'LODES',
        'commuters_3333_over_earnings': 'LODES',
        'commuters_goods_producing_industry': 'LODES',
        'commuters_trade_transport_utility_industry': 'LODES',
        'commuters_other_industry': 'LODES',
        'all_jobs': 'LODES',
        'primary_jobs': 'LODES',
        'all_private_jobs': 'LODES',
        'private_primary_jobs': 'LODES',
        'all_federal_jobs': 'LODES',
        'federal_primary_jobs': 'LODES'
    }
)

In [2]:
from epymorph.geo.adrio import adrio_maker_library
from epymorph.geo.dynamic import DynamicGeo

geo = DynamicGeo.from_library(spec, adrio_maker_library)

# Query Types

## Label
The first type of query returns the label of each location in the geo. The labels appear in the same order as the positions of every matrix queried below, which makes the matrix output easier to read.

In [3]:
print(f"label: {geo['label'][0:5]}")

label: ['04001' '04003' '04005' '04007' '04009']


## Total Commuters

The first matrix query is the "commuters" call. It outputs a matrix whose positions correspond to the requested locations at the requested granularity, and whose values are the total number of commuters between each pair of locations.
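As a reading aid, the sketch below uses a small made-up matrix (not LODES data) to show how positions relate to labels. The names `labels` and `commuters` are stand-ins for `geo['label']` and `geo['commuters']`. Diagonal entries pair a location with itself, and every other entry is travel between two different locations.

```python
# Illustrative (made-up) 3x3 commuter matrix; positions follow the order of `labels`.
labels = ["04001", "04003", "04005"]
commuters = [
    [70, 2, 11],
    [1, 210, 4],
    [9, 3, 500],
]

# Diagonal entries: people whose home and work locations are the same.
same_location = {labels[i]: commuters[i][i] for i in range(len(labels))}

# Everything off the diagonal is travel between two different locations.
total = sum(sum(row) for row in commuters)
cross_location = total - sum(same_location.values())

print(same_location)
print(f"commuters crossing between locations: {cross_location}")
```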

In [4]:
print(f"total commuters:\n")

for i in range(5):
    print(f"work geocode: {geo['label'][i]}\n  {geo['commuters'][i]}")

total commuters:

work geocode: 04001
  [7313   39 1109   12   30   12    6 1211   19 2177  244   95    8   56
   11    1    0    5   14    0    0    0    0    0    0    0    0    0
    0    0   18    0    0    0    0    0   18    0    0    0    0    0
    0    0    0    2    0    0    0   57    1    0    0    0    0    0
    0   35   19    0    0    0    0    1    0    0    0    0    0    0
    0    2   10    0    0    0    0    3    0    1   22    0    0    0
    0    0    0    0    0    0    2    0    0    2    0    1  502   16
    0  101    0    1    0    0    0   22    1    0    2    1    0    1
   15 2155    0    1    0    6    1   54  675    2   22   10   32    7
    6    0   14]
work geocode: 04003
  [   21 21713    67    52   453    80    13  1045    93    64  2731   307
   535   121    92     0     0     0     0     0     0     0     1     0
     0     0     0     0     0     0     0     0     0     1     0     0
     3     0     0     0     0     0     0     0     0     2   

# Query Workers by Job Type

Each Origin-Destination data file covers one combination of attributes: year, state, whether the residence is in-state or out-of-state, file type, and job type. Users can specify the job type used for the matrices. The job types are:
- JT00: All Jobs
- JT01: Primary Jobs
- JT02: All Private Jobs
- JT03: Private Primary Jobs
- JT04: All Federal Jobs
- JT05: Federal Primary Jobs
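To make the file attributes concrete, here is a hedged sketch that assembles an OD file name from them. It assumes the naming pattern `[st]_od_[part]_[type]_[year].csv.gz` described in the LODES technical documentation, where `main` holds in-state residences and `aux` holds out-of-state residences. `od_file_name` is a hypothetical helper and not part of epymorph, so check the pattern against the LODES release you are using.

```python
# Job type codes as listed in this manual.
JOB_TYPES = {
    "JT00": "All Jobs",
    "JT01": "Primary Jobs",
    "JT02": "All Private Jobs",
    "JT03": "Private Primary Jobs",
    "JT04": "All Federal Jobs",
    "JT05": "Federal Primary Jobs",
}


def od_file_name(state, year, job_type="JT00", part="main"):
    """Build an OD file name (assumed pattern, see the note above)."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type}")
    if part not in ("main", "aux"):
        raise ValueError(f"unknown part: {part}")
    return f"{state.lower()}_od_{part}_{job_type}_{year}.csv.gz"


print(od_file_name("AZ", 2015))
print(od_file_name("NM", 2015, job_type="JT03", part="aux"))
```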

## All Jobs

All Jobs is the default call and is currently the only job type that can be broken down into categories of workers, as shown in the category sections below.

## Primary Jobs

## All Private Jobs

## Private Primary Jobs

## All Federal Jobs

## Federal Primary Jobs

# Query Workers by Categories

Beyond total commuters, LODES holds data about the types of workers who commute from location to location, categorized by age, monthly earnings, and the industry they work in.
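In the published LODES OD layout, each of the three groupings splits the same total job count, so the parts of any one grouping should add up to the total. The sketch below checks that property for a single home/work pair. The counts are made up, `inconsistent_groups` is a hypothetical helper, and it assumes the ADRIO passes the LODES counts through unchanged under the attribute names used in this manual.

```python
# Attribute names from this manual, grouped by the category they belong to.
GROUPS = {
    "age": ["commuters_29_under", "commuters_30_to_54", "commuters_55_over"],
    "earnings": [
        "commuters_1250_under_earnings",
        "commuters_1251_to_3333_earnings",
        "commuters_3333_over_earnings",
    ],
    "industry": [
        "commuters_goods_producing_industry",
        "commuters_trade_transport_utility_industry",
        "commuters_other_industry",
    ],
}


def inconsistent_groups(counts):
    """Return the names of groupings whose parts do not add up to the total."""
    total = counts["commuters"]
    return [
        name for name, attrs in GROUPS.items()
        if sum(counts[a] for a in attrs) != total
    ]


# Illustrative (made-up) counts for a single home/work pair.
pair = {
    "commuters": 100,
    "commuters_29_under": 25,
    "commuters_30_to_54": 55,
    "commuters_55_over": 20,
    "commuters_1250_under_earnings": 30,
    "commuters_1251_to_3333_earnings": 40,
    "commuters_3333_over_earnings": 30,
    "commuters_goods_producing_industry": 10,
    "commuters_trade_transport_utility_industry": 35,
    "commuters_other_industry": 55,
}

print(inconsistent_groups(pair))
```

A non-empty result would point to a mismatch between a category matrix and the total, for example an attribute wired to the wrong source column.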

## Commuters By Age

LODES reports the ages of commuting employees in three ranges:
- Workers 29 years old and younger
- Workers 30 years old to 54 years old
- Workers 55 years old and older

The three ranges can be called as separate queries:

### 29 and Younger

In [3]:
print(f"total commuters that are 29 or younger:\n")

for i in range(5):
    print(f"work geocode: {geo['label'][i]}\n  {geo['commuters_29_under'][i]}")

total commuters that are 29 or younger:

work geocode: 04001
  [866   9 150   2   3   5   0 239   7 332  57  25   4  15   5   1   0   0
   3   0   0   0   0   0   0   0   0   0   0   0   3   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   1   0   0   0   9   1   0   0   0
   0   0   0   2   5   0   0   0   0   0   0   0   0   0   0   0   0   0
   2   0   0   0   0   3   0   0   4   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0  56   1   0  20   0   1   0   0   0   7   0   0
   1   1   0   0   3 261   0   0   0   0   0   5  96   0   2   4   7   1
   0   0   1]
work geocode: 04003
  [   7 4126   13   14  102   27    2  248   11   14  571   61  127   20
   20    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    1    0    0    0    0    0
    0    0    0    1    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0


### 30 to 54 Years Old

In [None]:
print(f"total commuters that are ages between 30 to 54:\n")

for i in range(5):
    print(f"work geocode: {geo['label'][i]}\n  {geo['commuters_30_to_54'][i]}")

### 55 and Older

In [None]:
print(f"total commuters that are 55 or older:\n")

for i in range(5):
    print(f"work geocode: {geo['label'][i]}\n  {geo['commuters_55_over'][i]}")

## Commuters by Earnings

LODES datasets categorize earnings by monthly salary in three ranges, similar to the age categories:

- Workers earning $1250 or under per month
- Workers earning $1251 to $3333 per month
- Workers earning $3333+ per month
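The boundaries of the three ranges can be expressed as a small lookup. This is only a sketch for illustration: `earnings_attribute` is a hypothetical helper, not an epymorph function, and it returns the attribute name from this manual that would count a worker with the given monthly earnings.

```python
def earnings_attribute(monthly_earnings):
    """Map monthly earnings in dollars to the matching attribute name."""
    if monthly_earnings < 0:
        raise ValueError("earnings cannot be negative")
    if monthly_earnings <= 1250:
        return "commuters_1250_under_earnings"
    if monthly_earnings <= 3333:
        return "commuters_1251_to_3333_earnings"
    return "commuters_3333_over_earnings"


for amount in (900, 1250, 1251, 3333, 3334):
    print(amount, earnings_attribute(amount))
```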

In the same manner as the age ranges, these three categories can be called as three separate queries:

### $1250 and Under Earned Monthly

In [None]:
print(f"total commuters that earn $1250 or less monthly:\n")

for i in range(5):
    print(
        f"work geocode: {geo['label'][i]}\n  {geo['commuters_1250_under_earnings'][i]}")

### Between $1251 and $3333 Earned Monthly

In [None]:
print(f"total commuters that earn between $1251-$3333 monthly:\n")

for i in range(5):
    print(
        f"work geocode: {geo['label'][i]}\n  {geo['commuters_1251_to_3333_earnings'][i]}")

### Above $3333 Earned Monthly

In [None]:
print(f"total commuters that earn greater than $3333 monthly:\n")

for i in range(5):
    print(
        f"work geocode: {geo['label'][i]}\n  {geo['commuters_3333_over_earnings'][i]}")

## Commuters By Industry

LODES also tracks the broad industry in which commuters work. As with age and earnings, LODES separates industry into three categories. Unlike those, the industry categories are groupings of sectors rather than numeric ranges, since industry is not as clear-cut as earnings or age.

LODES splits commuters by the following industries:
- Workers that work in Goods Producing industry sectors
- Workers that work in Trade, Transportation, and Utility industry sectors
- Workers that work in all other service industry sectors not covered by the two categories above


### Goods Producing Industry

In [None]:
print(f"total commuters that work in Goods Producing industry sectors:\n")

for i in range(5):
    print(
        f"work geocode: {geo['label'][i]}\n  {geo['commuters_goods_producing_industry'][i]}")

### Trade, Transport, Utility Industries

In [None]:
print(f"total commuters that work in Trade, Transportation, and Utility industry sectors:\n")

for i in range(5):
    print(
        f"work geocode: {geo['label'][i]}\n  {geo['commuters_trade_transport_utility_industry'][i]}")

### Any Other Industry

In [None]:
print(f"total commuters that work in all other services industry sectors:\n")

for i in range(5):
    print(f"work geocode: {geo['label'][i]}\n  {geo['commuters_other_industry'][i]}")