Dbf xbrl mapping #2088

Merged
merged 11 commits into from
Nov 29, 2022

Conversation

zaneselvans
@zaneselvans zaneselvans commented Nov 22, 2022

What this PR does

  • Adds pre-commit and Git configuration (.pre-commit-config.yaml, .gitattributes) to ensure that all CSVs (and other text files) have Unix-style line endings in the repository, so we don't get full-file diffs whenever we edit a CSV using Excel / LibreOffice.
  • Applies that normalization to all our existing CSVs.
  • Renames some of the values & columns in the DBF to XBRL row mapping CSV to be more generic, since the nomenclature of the Plant in Service table doesn't apply across all the tables we are using it to transform.
  • Adds a new CSV, dbf_to_xbrl_tables.csv, that associates old DBF tables with new XBRL tables, so we know which of them are related to each other. This will supplant the table names dictionary we have in pudl.extract.ferc1 right now, and is necessary because there are one-to-many and many-to-one relationships between these tables.
  • Corrects & adds details to our FERC Form 1 data dictionary table descriptions, based on what I discovered while trying to map the DBF & XBRL tables to each other.
  • Removes the old row-mapping metadata for the plant_in_service_ferc1 table, since it has been replaced by the new DBF + XBRL alignment process & mapping.
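
To illustrate why a flat table-name dictionary can't capture one-to-many and many-to-one relationships, here is a minimal sketch of reading an association CSV into lookups in both directions. The column names (`dbf_table`, `xbrl_table`) and the table names are hypothetical placeholders, not the actual contents of dbf_to_xbrl_tables.csv, and this is not the code PUDL uses.

```python
import csv
import io
from collections import defaultdict

# Hypothetical example rows; the real CSV's columns and table names may differ.
EXAMPLE_CSV = """dbf_table,xbrl_table
dbf_table_a,xbrl_table_a1
dbf_table_a,xbrl_table_a2
dbf_table_b,xbrl_table_bc
dbf_table_c,xbrl_table_bc
"""


def load_table_links(text):
    """Build DBF -> XBRL and XBRL -> DBF lookups from association rows."""
    dbf_to_xbrl = defaultdict(list)
    xbrl_to_dbf = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        # Each row is one link, so a table may appear on several rows.
        dbf_to_xbrl[row["dbf_table"]].append(row["xbrl_table"])
        xbrl_to_dbf[row["xbrl_table"]].append(row["dbf_table"])
    return dict(dbf_to_xbrl), dict(xbrl_to_dbf)


dbf_to_xbrl, xbrl_to_dbf = load_table_links(EXAMPLE_CSV)
```

Because each row is a single link, one DBF table can point at several XBRL tables and several DBF tables can point at the same XBRL table, which a one-key-one-value dictionary cannot express.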

PR Checklist

Before requesting a review of your pull request, please make sure you've done the
following:

  • Merge the most recent version of dev (or the appropriate upstream branch) into
    your branch and resolve any merge conflicts. You may need to do this several
    times over the course of a PR as dev changes frequently.
  • Verify that all of the CI checks on your PR are passing. See
    Running Tests with Tox
    for details on how to run the full test suite locally if you need to debug a
    particular failure.
  • Ensure that the docstrings for any new modules, classes, functions, or methods are
    descriptive enough for developers and users to understand your code.
  • If you expanded data coverage or changed the outputs, ensure that the full
    data validation tests
    pass locally on a fresh DB.
  • If you've added new functions or classes, ensure that they have at least basic
    unit tests.
  • If you've added new analyses, make sure they include defensive sanity checks that
    will catch unexpected data issues.
  • Update the
    release notes
    to reflect your changes. Make sure to reference the PR and any related issues.
  • Do your own review of the PR. Add comments highlighting areas where you have
    questions you'd like reviewers to answer, known issues, solutions you're
    unsatisfied with, or other things that deserve special attention from the
    reviewer.

zaneselvans and others added 7 commits November 21, 2022 18:36
Corrected an incorrect mapping of the billed vs. unbilled electricity
revenue tables.

Also changed the `mixed-line-ending` pre-commit hook to always enforce
Unix style LF endings, rather than Windows CRLF endings, since switching
back and forth when someone edits a CSV in Excel causes whole-file diffs
when only a few lines have actually changed, and that's not helpful.
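
As a rough sketch of the normalization described in this commit message (not the `mixed-line-ending` hook's actual implementation), the following converts Windows CRLF and lone CR endings to Unix LF. The function names are hypothetical and chosen only for illustration.

```python
def has_non_unix_endings(data: bytes) -> bool:
    """Return True if the content contains any carriage return."""
    return b"\r" in data


def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings, then any remaining lone CR, to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# A CSV saved with Windows line endings, e.g. after editing in a spreadsheet.
windows_csv = b"col_a,col_b\r\n1,2\r\n"
unix_csv = normalize_to_lf(windows_csv)
```

Working on bytes rather than decoded text avoids Python's universal-newline translation, so the comparison reflects what is actually stored in the file.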
@codecov
codecov bot commented Nov 22, 2022

Codecov Report

Base: 85.1% // Head: 85.1% // No change to project coverage 👍

Coverage data is based on head (9fd011f) compared to base (ba4a3ad).
Patch coverage: 100.0% of modified lines in pull request are covered.

Additional details and impacted files
@@          Coverage Diff          @@
##             dev   #2088   +/-   ##
=====================================
  Coverage   85.1%   85.1%           
=====================================
  Files         72      72           
  Lines       8202    8202           
=====================================
  Hits        6981    6981           
  Misses      1221    1221           
Impacted Files                 Coverage Δ
src/pudl/transform/ferc1.py    94.7% <100.0%> (ø)

@zaneselvans zaneselvans linked an issue Nov 22, 2022 that may be closed by this pull request
22 tasks
@zaneselvans zaneselvans marked this pull request as ready for review November 29, 2022 14:58
@zaneselvans zaneselvans added labels Nov 29, 2022:
  • ferc1: Anything having to do with FERC Form 1
  • metadata: Anything having to do with the content, formatting, or storage of metadata. Mostly datapackages.
  • xbrl: Related to the FERC XBRL transition
@cmgosnell cmgosnell left a comment

this looks mostly good. this is a future suggestion, but I think for this kind of pr it would have been helpful to have a lil list of the things you did and why bc there are a few disparate things going on in here. but generally...

  • lots of csv's touched bc of the gitattributes/pre-commit changes
  • more table descriptions in our ferc db data dictionaries
  • renaming xbrl_column_stem to xbrl_factoid
  • renaming some of the row_types in the dbf to xbrl map.
  • anything else?

I have a few clarifying questions but generally this looks just fine to me.

@zaneselvans
@cmgosnell I added a description to the PR. Sorry I left that out. This is a mishmash.

@zaneselvans zaneselvans merged commit 9a5b3d1 into dev Nov 29, 2022
@cmgosnell cmgosnell deleted the dbf-xbrl-mapping branch November 29, 2022 18:54
Successfully merging this pull request may close these issues.

Map FERC 1 DBF rows to XBRL columns for targeted tables
2 participants