Merge pull request #136 from USEPA/release-v1.1.0
Release v1.1.0
bl-young authored Jun 16, 2023
2 parents 70158b1 + f8f6212 commit 084a311
Showing 91 changed files with 24,921 additions and 23,717 deletions.
7 changes: 5 additions & 2 deletions .github/workflows/conda-secondary_cntxt.yml
@@ -4,7 +4,10 @@
name: Generate single inventory (conda sec_ctxt)

on:
workflow_dispatch: # manual trigger only
pull_request:
branches: [master]
types: [opened, reopened, ready_for_review]
workflow_dispatch:
inputs:
year:
description: "Year"
@@ -37,7 +40,7 @@ jobs:
run: |
conda env update --file env_sec_ctxt.yaml --name base
- name: Generate inventory files
- name: Generate inventory files with secondary context enabled
env:
YEAR: ${{ github.event.inputs.year }}
INVENTORY: ${{ github.event.inputs.inventory }}
13 changes: 6 additions & 7 deletions .github/workflows/generate_all_inventories.yml
@@ -3,11 +3,12 @@
name: Generate All Inventories

on:
# push:
# branches: [master]
pull_request:
branches: [master]
types: [opened, reopened, ready_for_review]
schedule:
- cron: '0 6 14 * *' # Runs 14th of every month at 6:00 UTC
workflow_dispatch: # manual trigger only
workflow_dispatch:

jobs:
build:
@@ -16,9 +17,9 @@ jobs:
fail-fast: false

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v3
with:
python-version: "3.10"

@@ -31,8 +32,6 @@ jobs:
- name: Install package and dependencies
run: |
pip install .
# FEDEFL required for TRI validation
pip install git+https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List#egg=fedelemflowlist
- name: Generate inventory files
run: |
10 changes: 4 additions & 6 deletions .github/workflows/generate_combined_inventory.yml
@@ -15,9 +15,9 @@ jobs:
fail-fast: false

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v3
with:
python-version: "3.10"

@@ -30,10 +30,8 @@ jobs:
- name: Install package and dependencies
run: |
pip install .
# FEDEFL required for TRI validation
pip install git+https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List#egg=fedelemflowlist
- name: Generate inventory files
- name: Combine inventory files
run: |
pytest -m combined
@@ -42,7 +40,7 @@ jobs:
uses: actions/upload-artifact@v3
with:
# Artifact name
name: StEWI Inventory files
name: StEWI Combined inventory files
# A file, directory or wildcard patter that describes what to upload
path: | # uses local user data dir for ubuntu
~/.local/share/stewi/facility/*
8 changes: 3 additions & 5 deletions .github/workflows/generate_select_inventories.yml
@@ -21,9 +21,9 @@ jobs:
fail-fast: false

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v3
with:
python-version: "3.10"

@@ -35,8 +35,6 @@ jobs:
- name: Install package and dependencies
run: |
pip install .
# FEDEFL required for TRI validation
pip install git+https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List#egg=fedelemflowlist
- name: Generate inventory files
env:
@@ -51,7 +49,7 @@ jobs:
uses: actions/upload-artifact@v3
with:
# Artifact name
name: StEWI Inventory files
name: "${{ github.event.inputs.inventory }}"
# A file, directory or wildcard patter that describes what to upload
path: | # uses local user data dir for ubuntu
~/.local/share/stewi/facility/*
10 changes: 5 additions & 5 deletions .github/workflows/python-package.yml
@@ -6,7 +6,7 @@ name: Python CI/CD tests

on:
push:
branches: [master, develop]
# branches: [master, develop]
paths-ignore: # prevents workflow execution when only these types of files are modified
- '**.md' # wildcards prevent file in any repo dir from trigering workflow
- '**.bib'
@@ -15,7 +15,7 @@ on:
- '.gitignore'
pull_request:
branches: [master, develop]
types: [opened, reopened] # excludes syncronize to avoid redundant trigger from commits on PRs
types: [opened, reopened, ready_for_review] # excludes syncronize to avoid redundant trigger from commits on PRs
paths-ignore:
- '**.md'
- '**.bib'
@@ -31,14 +31,14 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
py-version: ['3.7', '3.8', '3.9', '3.10']
py-version: ['3.8', '3.9', '3.10', '3.11']

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3

# general Python setup
- name: Set up Python ${{ matrix.py-version }}
uses: actions/setup-python@v2
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.py-version }}

45 changes: 28 additions & 17 deletions README.md
@@ -11,18 +11,19 @@ StEWI consists of a core module, `stewi`, that digests and provides the USEPA in
and `chemicalmatcher`, provide commons IDs for facilities and flows across inventories, which is used by the `stewicombo` module
to combine the data, and optionally remove overlaps and remove double counting of groups of chemicals based on user preferences.

StEWI v1 was peer-reviewed internally at USEPA and externally through _Applied Sciences_. An article describing StEWI was published in a special issue of Applied Sciences: [Advanced Data Engineering for Life Cycle Applications](https://doi.org/10.3390/app12073447).
StEWI v1 was peer-reviewed internally at USEPA and externally through _Applied Sciences_.
An article describing StEWI was published in a special issue of Applied Sciences: [Advanced Data Engineering for Life Cycle Applications](https://doi.org/10.3390/app12073447).

## USEPA Inventories Covered By Data Reporting Year (current version)

|Source|2008|2009|2010|2011|2012|2013|2014|2015|2016|2017|2018|2019|
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|[Discharge Monitoring Reports](https://echo.epa.gov/tools/data-downloads/icis-npdes-dmr-and-limit-data-set)* | | | | | | |x|x|x|x|x|x|
|[Greenhouse Gas Reporting Program](https://www.epa.gov/ghgreporting) | | | |x|x|x|x|x|x|x|x|x|
|[Emissions & Generation Resource Integrated Database](https://www.epa.gov/energy/emissions-generation-resource-integrated-database-egrid) | | | | | | |x| |x| |x|x|
|[National Emissions Inventory](https://www.epa.gov/air-emissions-inventories/national-emissions-inventory-nei)** | | | |x|i|i|x|i|i|x|i| |
|[RCRA Biennial Report](https://www.epa.gov/hwgenerators/biennial-hazardous-waste-report)* | |x| |x| |x| |x| |x| |x|
|[Toxic Release Inventory](https://www.epa.gov/toxics-release-inventory-tri-program)* |x|x|x|x|x|x|x|x|x|x|x|x|
|Source|2011|2012|2013|2014|2015|2016|2017|2018|2019|2020|2021|
|---|---|---|---|---|---|---|---|---|---|---|---|
|[Discharge Monitoring Reports](https://echo.epa.gov/tools/data-downloads/icis-npdes-dmr-and-limit-data-set)* |x|x|x|x|x|x|x|x|x|x|x|
|[Greenhouse Gas Reporting Program](https://www.epa.gov/ghgreporting) |x|x|x|x|x|x|x|x|x|x|x|
|[Emissions & Generation Resource Integrated Database](https://www.epa.gov/energy/emissions-generation-resource-integrated-database-egrid) | | | |x| |x| |x|x|x|x|
|[National Emissions Inventory](https://www.epa.gov/air-emissions-inventories/national-emissions-inventory-nei)** |x|i|i|x|i|i|x|i|i|x| |
|[RCRA Biennial Report](https://www.epa.gov/hwgenerators/biennial-hazardous-waste-report)* |x| |x| |x| |x| |x| | |
|[Toxic Release Inventory](https://www.epa.gov/toxics-release-inventory-tri-program)* |x|x|x|x|x|x|x|x|x|x|x|

*Earlier data exist and are accessible but have not been validated

@@ -34,7 +35,8 @@ The core `stewi` module produces the following output formats:

[Flow-By-Facility](./format%20specs/FlowByFacility.md): Each row represents the total amount of release or waste of a single type in a given year from the given facility.

[Flow-By-Process](./format%20specs/FlowByProcess.md): Each row represents the total amount of release or waste of a single type in a given year from a specific process within the given facility. Applicable only to NEI and GHGRP.
[Flow-By-Process](./format%20specs/FlowByProcess.md): Each row represents the total amount of release or waste of a single type in a given year from a specific process within the given facility.
Applicable only to NEI and GHGRP.

[Facility](./format%20specs/Facility.md): Each row represents a unique facility in a given inventory and given year

@@ -64,7 +66,8 @@ Processing of the DMR uses the custom search option of the [Water Pollutant Load
- Estimation: On - estimates loads when monitoring data are not reported for one or more monitoring periods in a reporting year
- Nutrient Aggregation: On - Nitrogen and Phosphorous flows are converted to N and P equivalents

For validation, the sum of facility releases (excluding N & P) are compared against reported state totals. Some validation issues are expected due to differences in default parameters used by the water pollutant loading tool for calculating state totals.
For validation, the sum of facility releases (excluding N & P) are compared against reported state totals.
Some validation issues are expected due to differences in default parameters used by the water pollutant loading tool for calculating state totals.
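
The validation step described above can be pictured with a minimal sketch. It assumes hypothetical record fields (`state`, `flow`, `amount`), hypothetical flow names for the excluded nutrients, and an arbitrary 5% tolerance; none of these are taken from the StEWI code, which may structure its validation differently.

```python
from collections import defaultdict


def sum_by_state(records, excluded_flows=("Nitrogen", "Phosphorus")):
    """Sum facility releases by (state, flow), skipping excluded flows.

    The excluded flow names are hypothetical placeholders for N and P.
    """
    totals = defaultdict(float)
    for rec in records:
        if rec["flow"] in excluded_flows:
            continue
        totals[(rec["state"], rec["flow"])] += rec["amount"]
    return dict(totals)


def compare_to_reference(totals, reference, tolerance=0.05):
    """Return {(state, flow): relative difference} where tolerance is exceeded.

    The 5% default tolerance is an assumption chosen for illustration.
    """
    issues = {}
    for key, ref_amount in reference.items():
        if ref_amount == 0:
            continue  # relative difference is undefined for a zero reference
        rel_diff = abs(totals.get(key, 0.0) - ref_amount) / ref_amount
        if rel_diff > tolerance:
            issues[key] = rel_diff
    return issues


# Illustrative data only, not real DMR values
example_records = [
    {"state": "OH", "flow": "Copper", "amount": 10.0},
    {"state": "OH", "flow": "Copper", "amount": 5.0},
    {"state": "OH", "flow": "Nitrogen", "amount": 100.0},
    {"state": "TX", "flow": "Zinc", "amount": 8.0},
]
state_totals = {("OH", "Copper"): 15.0, ("TX", "Zinc"): 10.0}

facility_sums = sum_by_state(example_records)
issues = compare_to_reference(facility_sums, state_totals)
print(issues)
```

A flagged entry here only means the two totals disagree by more than the chosen tolerance; as the README notes, some disagreement is expected because the state totals are calculated with different default parameters.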

### eGRID

@@ -74,7 +77,9 @@ For validation, the sum of facility releases are compared against reported U.S.
### GHGRP

GHGRP data are sourced from EPA's [Envirofacts API](https://enviro.epa.gov/)
For validation, the sum of facility releases by subpart are compared against reported U.S. totals by subpart and flow. The validation of some flows (HFC, HFE, and PFCs) are reported in carbon dioxide equivalents. Mixed reporting of these flows in the source data in units of mass or carbon dioxide equivalents results in validation issues.
For validation, the sum of facility releases by subpart are compared against reported U.S. totals by subpart and flow.
The validation of some flows (HFC, HFE, and PFCs) are reported in carbon dioxide equivalents.
Mixed reporting of these flows in the source data in units of mass or carbon dioxide equivalents results in validation issues.

### NEI

@@ -93,7 +98,8 @@ For validation, the sum of facility releases are compared to national totals by

## Combined Inventories

`stewicombo` module combines inventory data from within and across selected inventories by matching facilities in the [Facility Registry Service](https://www.epa.gov/frs) and chemical flows using the [Substance Registry Service](https://sor.epa.gov/sor_internet/registry/substreg/LandingPage.do).
`stewicombo` module combines inventory data from within and across selected inventories by matching facilities in the [Facility Registry Service](https://www.epa.gov/frs) and
chemical flows using the [Substance Registry Service](https://sor.epa.gov/sor_internet/registry/substreg/LandingPage.do).
If the `remove_overlap` parameter is set to True (default), `stewicombo` combines records using the following default logic:
- Records that share a common compartment, SRS ID and FRS ID _within_ an inventory are summed.
- Records that share a common compartment, SRS ID and FRS ID _across_ an inventory are assessed by compartment preference (see `INVENTORY_PREFERENCE_BY_COMPARTMENT`).
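
The two rules listed above can be sketched roughly as follows. This is an illustration only, not the `stewicombo` implementation: the record fields, the `EXAMPLE_PREFERENCE` mapping (a stand-in for `INVENTORY_PREFERENCE_BY_COMPARTMENT`, whose actual contents are not shown in this diff), and the alphabetical fallback are all assumptions.

```python
from collections import defaultdict

# Hypothetical preference order per compartment; the real mapping may differ
EXAMPLE_PREFERENCE = {"air": ["NEI", "TRI"], "water": ["DMR", "TRI"]}


def combine(records, preference=EXAMPLE_PREFERENCE):
    """Combine records by (FRS ID, SRS ID, compartment).

    Step 1 sums duplicates within each inventory; step 2 keeps one inventory
    per key according to the compartment preference order.
    """
    # Step 1: sum records that share a key within the same inventory
    within = defaultdict(float)
    for rec in records:
        key = (rec["frs_id"], rec["srs_id"], rec["compartment"], rec["source"])
        within[key] += rec["amount"]

    # Step 2: group the per-inventory sums so overlaps across inventories show up
    across = defaultdict(dict)
    for (frs_id, srs_id, compartment, source), amount in within.items():
        across[(frs_id, srs_id, compartment)][source] = amount

    combined = {}
    for key, by_source in across.items():
        order = preference.get(key[2], [])
        # First preferred inventory that reported this flow; otherwise fall back
        # to the alphabetically first source (an arbitrary choice for this sketch)
        chosen = next((s for s in order if s in by_source), sorted(by_source)[0])
        combined[key] = (chosen, by_source[chosen])
    return combined


# Illustrative data only
example_records = [
    {"frs_id": "F1", "srs_id": "S1", "compartment": "air", "source": "NEI", "amount": 2.0},
    {"frs_id": "F1", "srs_id": "S1", "compartment": "air", "source": "NEI", "amount": 3.0},
    {"frs_id": "F1", "srs_id": "S1", "compartment": "air", "source": "TRI", "amount": 4.0},
    {"frs_id": "F1", "srs_id": "S2", "compartment": "water", "source": "TRI", "amount": 1.0},
]
combined = combine(example_records)
print(combined)
```

In this example the two NEI air records are summed first, and the overlapping TRI air record is then dropped because NEI ranks ahead of TRI for air in the hypothetical preference mapping.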
Expand All @@ -106,9 +112,9 @@ If the `remove_overlap` parameter is set to True (default), `stewicombo` combine
## Installation Instructions

Install a release directly from github using pip. From a command line interface, run:
> pip install git+https://github.com/USEPA/standardizedinventories.git@v1.0.5#egg=StEWI
> pip install git+https://github.com/USEPA/standardizedinventories.git@v1.1.0#egg=StEWI
where you can replace 'v1.0.5' with the version you wish to use under [Releases](https://github.com/USEPA/standardizedinventories/releases).
where you can replace 'v1.1.0' with the version you wish to use under [Releases](https://github.com/USEPA/standardizedinventories/releases).

Alternatively, to install from the most current point on the repository:
```
@@ -131,11 +137,16 @@ or
pip install . -r requirements.txt -r rcrainfo_requirements.txt
```

### Secondary Context Installation Steps
In order to enable calculation and assignment of urban/rural secondary contexts, please refer to
[esupy's README.md](https://github.com/USEPA/esupy/tree/main#installation-instructions-for-optional-geospatial-packages) for installation instructions,
which may require a copy of the [`env_sec_ctxt.yaml`](https://github.com/USEPA/standardizedinventories/blob/master/env_sec_ctxt.yaml) file included here.

## Data Products
Output of StEWI can be accessed for selected releases without having to run StEWI. See the [Data Product Links](https://github.com/USEPA/standardizedinventories/wiki/DataProductLinks) page for direct links to StEWI output files in Apache parquet format.
Output of StEWI can be accessed for selected releases without having to run StEWI.
See the [Data Product Links](https://github.com/USEPA/standardizedinventories/wiki/DataProductLinks) page for direct links to StEWI output files in Apache parquet format.

## Wiki

See the [Wiki](https://github.com/USEPA/standardizedinventories/wiki) for instructions on installation and use and for
citation and contact information.
