Updates the PUDL dependency from v2022.11.30 to v2023.12.01, which includes a number of changes to the database structure and naming conventions (see the PUDL release notes)
Changes source of PUDL database download to AWS rather than Zenodo, providing faster access to PUDL data releases
PUDL’s CEMS database now includes data from AK, HI, and PR, which should improve hourly emissions data coverage for plants in AK and HI
A cleaned and standardized version of the EPA-EIA power sector data crosswalk is now included in the PUDL database, meaning we no longer have to manually load and standardize this data
Emissions control equipment data from EIA-860 is now included in the PUDL database, meaning we no longer have to manually load and standardize this data
Leading zeros removed from boiler_ids, which should improve mapping between boiler tables
The EIA-923 generation and fuel allocation process is now fully integrated into PUDL
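To illustrate why removing leading zeros from boiler_ids can improve mapping between boiler tables, here is a minimal sketch. The helper name `strip_leading_zeros` and the rule of keeping a single "0" for all-zero IDs are assumptions made for illustration; the notes above do not describe how PUDL performs this normalization.

```python
def strip_leading_zeros(boiler_id: str) -> str:
    """Hypothetical helper: drop leading zeros from an ID string.

    Assumption: an ID made up only of zeros is kept as a single "0"
    rather than becoming an empty string.
    """
    stripped = boiler_id.lstrip("0")
    return stripped if stripped else "0"


# Two tables that report the same boiler with different zero padding.
table_a_ids = ["01", "2", "B1"]
table_b_ids = ["1", "002", "B1"]

# Without normalization only "B1" matches; with it, all three boilers match.
raw_matches = set(table_a_ids) & set(table_b_ids)
normalized_matches = {strip_leading_zeros(i) for i in table_a_ids} & {
    strip_leading_zeros(i) for i in table_b_ids
}
```

Only leading zeros are removed, so an ID such as "B01" is left unchanged.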
Fixes an issue where certain plants in NY state were being assigned the wrong BA code.
NOX and SO2 emissions factors: added new factors for boiler configurations that had not previously been included in the table.
Balancing Areas: Added retirement dates for the CFE (July 2018), GLHB (September 2022), and GRIF (November 2023) balancing areas
Added new EPA-EIA plant and unit crosswalks based on 2022 data
Added several new mappings between utilities and balancing areas
Infrastructure Updates
Updates Python dependency from 3.10 to 3.11
Refactors and packages the OGE codebase so that functions, reference tables, and data from OGE can be imported into other projects. This package will go live on PyPI soon. (#323)
Re-organizes location of data files. The data/manual files have been renamed to reference_tables and moved to src/oge, while all downloads, output files, and result files will now be saved in the user’s home directory in a folder called open_grid_emissions_data (#324)
Adds support for pipenv environment management in addition to conda (#313)
Changes the PUDL and gridemissions dependencies to forks within the singularity-energy organization, rather than forks that lived in individual authors’ GitHub accounts.
Moves documentation from separately-maintained repo into the OGE repo (#303)
Changes code formatting from black to ruff and adds formatting checks that must pass before merging code (#317)
Other bug/data quality fixes
Ensures that the EPA-EIA power sector data crosswalk (PSDC) is as complete as possible by combining the PUDL-standardized PSDC, plant code mappings from eGRID, and our own manual crosswalking.
Adds handling for negative fuel consumption reported in EIA-923
Stops dropping missing and zero values, to help ensure complete timeseries
Previously, we dropped CEMS data for units that reported only steam generation and no electricity generation. Based on an updated understanding of this data, we no longer drop it from OGE.
Fixes a bug in the EIA-923 generation and fuel allocation process that caused certain reported fuel consumption data to be dropped for plants that retire mid-year
Updates manual timestamp corrections to EIA-930 data: CAISO data for 2022 onward (#300) and TEPC data for 2021 onward (#322)
Adds new data validation checks
Flags when different plant primary fuel identification methods result in different primary fuel assignments: Exports the primary_fuel_table with all intermediate columns to outputs to help with validation. Adds a new validation check to flag when the plant primary fuel assigned by the pipeline does not match the capacity-based primary fuel assignment. (#296)
Flags when subplants contain only a single combined cycle component: Combined cycle generators contain a steam part (CA) and a turbine part (CT) that are linked together. Thus, in theory, a subplant group that contains one part of a combined cycle plant should always contain the other part as well. This PR adds a test that checks that both parts exist in a subplant if one exists. Besides CT and CA prime movers, there are also CS prime movers, which represent a "single shaft" combined cycle unit where the steam and turbine parts share a single generator. These prime movers are allowed to be by themselves in a subplant, as are CC prime movers, which represent a "total unit." This PR also adds a prime_mover_code column to the subplant crosswalk table to help validate this. (#297)
Checks for complete monthly data within a single year: Checks that 12 monthly “report_date”s exist for each plant/subplant, and also checks that the number of missing monthly datapoints matches the number of missing datapoints in the input data from CEMS and EIA-923.
Checks for complete hourly timestamps within a single year or single month: If the period is a 'year', checks that the length of the timeseries is 8760 (for a non-leap year) or 8784 (for a leap year). If the period is a 'month', checks that the length of the timeseries is equal to the length of the complete date_range between the earliest and latest timestamp in a month. (#299)
Exports a new output table that identifies whether input data (and non-zero input data) exists for each plant in EIA-923 and/or CEMS.
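The hourly timestamp completeness check described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the OGE implementation: the function names `expected_hours_in_year` and `is_complete_hourly` are hypothetical, and the sketch assumes the timestamps are naive hourly datetimes with no duplicates.

```python
import calendar
from datetime import datetime, timedelta


def expected_hours_in_year(year: int) -> int:
    """Return 8784 for a leap year and 8760 otherwise."""
    return 8784 if calendar.isleap(year) else 8760


def is_complete_hourly(timestamps: list[datetime], period: str = "year") -> bool:
    """Hypothetical check that an hourly timeseries has no missing hours.

    Assumes naive, hourly, de-duplicated timestamps that all fall within
    a single year (period="year") or a single month (period="month").
    """
    if not timestamps:
        return False
    if period == "year":
        # Compare the series length against the hours in that calendar year.
        return len(timestamps) == expected_hours_in_year(timestamps[0].year)
    if period == "month":
        # Compare the series length against the number of hourly steps
        # between the earliest and latest timestamp, inclusive.
        earliest, latest = min(timestamps), max(timestamps)
        expected = (latest - earliest) // timedelta(hours=1) + 1
        return len(timestamps) == expected
    raise ValueError("period must be 'year' or 'month'")


# Example: a full non-leap year, then the same year with one hour missing.
start = datetime(2023, 1, 1)
full_year = [start + timedelta(hours=i) for i in range(8760)]
year_with_gap = full_year[:100] + full_year[101:]
```

As in the description, the 'month' case only detects gaps between the earliest and latest timestamps present; it does not confirm that the series spans the whole calendar month.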