Rename MCOE and plant part list assets #2904

Merged
bendnorman merged 7 commits into rename-core-assets from rename-mcoe-assets on Oct 6, 2023

Conversation

bendnorman
Member

@bendnorman bendnorman commented Sep 29, 2023

PR Overview

This PR:

  • Applies new naming convention to MCOE and PPL assets
  • Removes references to mcoe from asset names because the assets don't fully calculate MCOE yet. See the discussion in Apply naming convention to output assets #2788
  • Adds a Package.get_sorted_resources() method so we can order the tables in the data dictionary and datasette.
  • Adds a dictionary of helper functions called JINJA_FILTERS to pudl.metadata.helpers that can be added to jinja environments. I added this because I was getting a malformed cross ref sphinx error on intermediate table names because of the preceding underscore.
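
The cross-reference problem in the last bullet can be illustrated with a minimal sketch. The filter name `rst_safe_label` and its rewriting rule are hypothetical and are not the actual contents of `JINJA_FILTERS`; the sketch only shows how a dictionary of plain functions can be registered on an object with a dict-like `filters` attribute, which is how `jinja2.Environment` exposes its filters.

```python
def rst_safe_label(table_name: str) -> str:
    """Return a label that never starts with an underscore (hypothetical rule)."""
    stripped = table_name.lstrip("_")
    if stripped != table_name:
        # Keep intermediate tables distinguishable from their public twins.
        return f"intermediate-{stripped}"
    return stripped


# Hypothetical contents; the real JINJA_FILTERS may hold different functions.
JINJA_FILTERS = {"rst_safe_label": rst_safe_label}


def register_filters(environment) -> None:
    """Add the filters to any object with a dict-like ``filters`` attribute."""
    environment.filters.update(JINJA_FILTERS)
```

A template could then write `{{ resource.name | rst_safe_label }}` wherever a table name is used as a cross-reference target.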

PR Checklist

  • Merge the most recent version of the branch you are merging into (probably dev).
  • All CI checks are passing. Run tests locally to debug failures
  • Make sure you've included good docstrings.
  • For major data coverage & analysis changes, run data validation tests
  • Include unit tests for new functions and classes.
  • Defensive data quality/sanity checks in analyses & data processing functions.
  • Update the release notes and reference the PR and related issues.
  • Do your own explanatory review of the PR to help the reviewer understand what's going on and identify issues preemptively.

@codecov

codecov bot commented Oct 3, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (7a7a441) 88.5% compared to head (9f578b3) 88.5%.

Additional details and impacted files
@@                Coverage Diff                 @@
##           rename-core-assets   #2904   +/-   ##
==================================================
  Coverage                88.5%   88.5%           
==================================================
  Files                      90      90           
  Lines                   10808   10819   +11     
==================================================
+ Hits                     9570    9580   +10     
- Misses                   1238    1239    +1     
Files                                     Coverage Δ
src/pudl/analysis/allocate_gen_fuel.py    91.3% <ø> (ø)
src/pudl/analysis/mcoe.py                 97.4% <ø> (ø)
src/pudl/analysis/plant_parts_eia.py      96.5% <100.0%> (ø)
src/pudl/metadata/helpers.py              97.8% <100.0%> (+<0.1%) ⬆️
src/pudl/metadata/resources/eia.py        100.0% <ø> (ø)
src/pudl/metadata/resources/eia860.py     100.0% <ø> (ø)
src/pudl/metadata/resources/mcoe.py       100.0% <ø> (ø)
src/pudl/output/eia.py                    59.0% <100.0%> (ø)
src/pudl/output/pudltabl.py               89.1% <ø> (ø)
src/pudl/metadata/classes.py              86.4% <90.0%> (-0.1%) ⬇️


compute_kind="Python",
io_manager_key="pudl_sqlite_io_manager",
description=f"{agg_freqs[freq].title()} heat rate estimates by generation unit. Generation "
Member Author

I added the descriptions to the assets because their metadata will be removed from pudl.metadata.resources when we deprecate PudlTabl.

Member

Is this true for other assets created through asset factories, or is there something special about this set?

Member Author

This is true for other intermediate assets that are currently written to the database because we provide access to them via PudlTabl. These assets will use the default IO manager when we deprecate PudlTabl.

There are a handful of other output intermediate assets that are being written to the database that don't have descriptions specified in the asset decorators. I can remove these descriptions for now and handle it when we deprecate PudlTabl.
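
As a rough illustration of attaching descriptions in an asset factory, the sketch below builds the keyword arguments for one asset per aggregation frequency. The `AGG_FREQS` values, the `build_asset_kwargs` helper, and the asset name are assumptions made for this example; only `compute_kind`, `io_manager_key`, and the first sentence of the description appear in the diff under review.

```python
# Assumption: frequency codes map to labels like this. The diff only shows
# that a mapping named agg_freqs exists.
AGG_FREQS = {"AS": "yearly", "MS": "monthly"}


def build_asset_kwargs(freq: str, agg_freqs: dict = AGG_FREQS) -> dict:
    """Build decorator keyword arguments for one aggregation frequency."""
    label = agg_freqs[freq]
    return {
        # Hypothetical asset name, used only for illustration.
        "name": f"_out_eia__{label}_example_heat_rate",
        "compute_kind": "Python",
        "io_manager_key": "pudl_sqlite_io_manager",
        # Only the first sentence of the description is visible in the diff.
        "description": f"{label.title()} heat rate estimates by generation unit.",
    }


asset_kwargs = {freq: build_asset_kwargs(freq) for freq in AGG_FREQS}
```

Keeping the description next to the asset definition means it would survive if the matching entry in pudl.metadata.resources were removed.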

Comment on lines +2028 to +2033
if exclude_intermediate_resources:
    [
        resource
        for resource in self.resources
        if not resource.name.startswith("_")
    ]
Member Author

Added this if we want to exclude intermediate assets from datasette. I think we should include them for now so the database, data dictionary and datasette are all consistent.
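
A minimal sketch of how such a filter could sit inside a sorting helper is below. The `Resource` stand-in class and the sort order (public tables first, intermediate tables last, alphabetical within each group) are assumptions for illustration, not the behavior of the actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Stand-in for a metadata resource; only the name matters here."""

    name: str


def get_sorted_resources(resources, exclude_intermediate_resources=False):
    """Optionally drop intermediate resources, then sort the rest."""
    if exclude_intermediate_resources:
        # Intermediate resources are identified by a leading underscore.
        resources = [r for r in resources if not r.name.startswith("_")]
    # Assumed ordering: public first, intermediate last, alphabetical within.
    return sorted(resources, key=lambda r: (r.name.startswith("_"), r.name))


tables = [
    Resource("_out_eia__yearly_generators"),
    Resource("out_eia__yearly_plant_parts"),
    Resource("out_eia__yearly_generators"),
]
all_names = [r.name for r in get_sorted_resources(tables)]
public_names = [r.name for r in get_sorted_resources(tables, True)]
```

With the flag left at its default, the intermediate table is still listed, which matches the preference above for keeping the database, data dictionary and datasette consistent.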

@@ -183,12 +183,10 @@
     for freq in AGG_FREQS
 }
 | {
-    f"mcoe_generators_{freq}": {
+    f"out_eia__{freq}_generators": {
Member Author

We decided to add all generator attributes to this table so it's a one-stop shop for users.

@@ -114,7 +114,7 @@ def out_eia__yearly_plants(
     },
     compute_kind="Python",
 )
-def out_eia__yearly_generators(
+def _out_eia__yearly_generators(
Member Author

I made this an intermediate table because the old mcoe_generators_{freq} table (now out_eia__{freq}_generators) has all of the same attributes plus the valuable derived attributes.

@bendnorman
Member Author

@ella the migrations changed on rename-core-assets when I merged dev in, so you'll probably need to recreate your db.

Member

@e-belfer e-belfer left a comment


One blocking question re the plant_parts asset name, otherwise just some comments and questions for clarification. Fast ETL worked perfectly for me out of the box.

     io_manager_key="pudl_sqlite_io_manager",
     compute_kind="Python",
 )
-def plant_parts_eia_asset(
-    mega_generators_eia: pd.DataFrame,
+def out_eia__plant_parts(
Member

This asset name doesn't seem to fit with our existing model - doesn't it need some kind of asset type (probably yearly?).

Member Author

Asset type can be optional. I didn't think there was a logical frequency or type for the table, though I could be wrong. @cmgosnell and @katie-lamb what do you think?

Member

I believe the plant parts are assigned annually, but correct me if I'm wrong @cmgosnell

Member

it's an annual table! (it technically could be generated as a monthly table but it would be ginombo and rn this is mostly being used to link up to annual ferc data)

Member Author

thanks! It has been renamed to out_eia__yearly_plant_parts.
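
The naming question in this thread can be made concrete with a small parser. The pattern below is an assumption inferred from the names that appear in this PR (an optional leading underscore for intermediate assets, a layer, a source, a double underscore, an optional frequency, and a description); it is not the project's official definition of the convention.

```python
import re

# Assumed shape of an asset name, inferred from examples in this PR.
NAME_PATTERN = re.compile(
    r"^(?P<intermediate>_)?(?P<layer>out|core)_(?P<source>[a-z0-9]+)__"
    r"(?:(?P<frequency>yearly|monthly)_)?(?P<description>[a-z0-9_]+)$"
)


def parse_asset_name(name: str) -> dict:
    """Split an asset name into its assumed components."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the assumed convention")
    parts = match.groupdict()
    # Convert the optional underscore into a boolean flag.
    parts["intermediate"] = parts["intermediate"] is not None
    return parts


renamed = parse_asset_name("out_eia__yearly_plant_parts")
original = parse_asset_name("out_eia__plant_parts")
```

Under this assumed pattern both names parse, which reflects the point that the asset type is optional; the renamed version simply carries `yearly` as its frequency.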

@bendnorman
Member Author

This is ready for another review @e-belfer!

@e-belfer e-belfer self-requested a review October 5, 2023 18:10
Member

@e-belfer e-belfer left a comment


Looks good and fast ETL runs as is.

@bendnorman bendnorman merged commit 1b5100e into rename-core-assets Oct 6, 2023
11 checks passed
@bendnorman bendnorman deleted the rename-mcoe-assets branch October 6, 2023 19:58
3 participants