update the 860m doi #3189

cmgosnell · 2023-12-22T22:49:46Z

Overview

Closes #3186.

What problem does this address?
We had to make a new Zenodo archive for eia860m, with a year's worth of monthly files zipped together into one resource, because of file upload limits. So this PR updates the DOI to point at that archive.

What did you change?
Basically nothing except the DOI, because @e-belfer added a little logic to the datastore to work with partitions that are lists of partitions. The Excel extractor and the datastore combined already know how to grab a file out of a zipped file, since so many of our one-partition resources contain many files. The main place where this happens is `load_excel_file`.
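
As a rough illustration of what "partitions that are lists of partitions" means for a lookup, here is a minimal sketch. `partition_matches` is a hypothetical helper, not the datastore's actual code, and the `year_month` values are made up for the example.

```python
def partition_matches(resource_parts: dict, **filters) -> bool:
    """Return True if a resource's partitions satisfy every requested filter.

    Hypothetical helper for illustration only; not PUDL's datastore code.
    """
    for key, wanted in filters.items():
        value = resource_parts.get(key)
        if isinstance(value, list):
            # Bundled resource: the partition holds several values.
            if wanted not in value:
                return False
        elif value != wanted:
            # Single-valued partition: require an exact match.
            return False
    return True


# Made-up partitions: one zipped resource covering two months, one single file.
bundled = {"year_month": ["2023-01", "2023-02"]}
single = {"year_month": "2022-12"}

print(partition_matches(bundled, year_month="2023-02"))
print(partition_matches(single, year_month="2022-12"))
```

Under this assumption, a request for one month resolves to the bundled resource whose list contains that month, while single-valued partitions keep matching as before.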

We could remove the first `try` in `load_excel_file`, because the old eia860m archive of individual files was actually the edge case.
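
The try-then-fall-back pattern described above can be sketched roughly as follows. This is not the real `load_excel_file`: a plain dict stands in for the datastore, and `load_file_bytes`, `make_zip`, and the file names are all hypothetical.

```python
import io
import zipfile


def load_file_bytes(resources: dict[str, bytes], archive_name: str, filename: str) -> bytes:
    """Return the bytes of `filename` from a stand-in datastore (a dict).

    Hypothetical sketch; not PUDL's load_excel_file.
    """
    try:
        # Old layout: each monthly file is its own resource.
        return resources[filename]
    except KeyError:
        # Bundled layout: the file lives inside a zipped resource.
        with zipfile.ZipFile(io.BytesIO(resources[archive_name])) as archive:
            return archive.read(filename)


def make_zip(files: dict[str, bytes]) -> bytes:
    """Build an in-memory zip archive so the example needs no files on disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Made-up names and contents, for illustration only.
resources = {"eia860m-2023.zip": make_zip({"january_2023.xlsx": b"fake workbook bytes"})}
data = load_file_bytes(resources, "eia860m-2023.zip", "january_2023.xlsx")
```

If every archive is bundled, the `try` branch never succeeds and only the zip lookup is needed, which is the simplification suggested above.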

Testing

How did you make sure this worked? How can a reviewer verify this?
I ran the fast ETL locally. Before that, I thought I was going to have to muck with the Excel extractor, so I set up a little notebook test, and the simplest setup gave me the eia860m outputs:

from pudl.extract.eia860m import Extractor
from pudl.metadata.sources import SOURCES
from pudl.workspace.datastore import Datastore
from pudl.workspace.setup import PudlPaths

# Point the datastore at the local PUDL input cache.
ds = Datastore(local_cache_path=PudlPaths().pudl_input)
self = Extractor(ds=ds)
# Extract every working eia860m year_month partition.
raw_eia860m_dfs = self.extract(year_month=SOURCES["eia860m"]["working_partitions"]["year_month"])

To-do list

Make sure full ETL runs & `make pytest-integration-full` passes locally

For major data coverage & analysis changes, [run data validation tests](https://catalystcoop-pudl.readthedocs.io/en/latest/dev/testing.html#data-validation)

If updating analyses or data processing functions: make sure to update or write data validation tests

Update the [release notes](../docs/release_notes.rst): reference the PR and related issues.

Review the PR yourself and call out any questions or issues you have

it seems to all just work which is tres fun but makes sense after looking at it

e-belfer

Looks good! I'll do the final validation in the CEMS branch.

zaneselvans · 2023-12-26T16:28:09Z

The change to allow lists in the partitions has broken the docs build script, which reads those partitions to generate the dataset docs.

e-belfer · 2023-12-26T16:29:00Z

> The change to allow lists in the partitions has broken the docs build script, which reads those partitions to generate the dataset docs.

Already on it!

zaneselvans

The documentation build script isn't expecting to find lists in the partitions, and so is failing when it attempts to build the data source specific docs pages using the Jinja templates. It needs to be updated to accommodate the new metadata structure associated with the newly bundled raw archives.
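
One way the docs build could accommodate list-valued partitions is to flatten them before handing them to the templates. The sketch below is only an assumption about the shape of such a fix, not the change actually made in this PR; `flatten_partition_values` and the sample values are hypothetical.

```python
def flatten_partition_values(values: list) -> list:
    """Flatten a mix of scalar and list-valued partitions into one sorted list.

    Hypothetical helper for illustration; not the docs build script's code.
    """
    flat = []
    for value in values:
        if isinstance(value, list):
            # Bundled archive: one resource covers several partition values.
            flat.extend(value)
        else:
            flat.append(value)
    # Deduplicate and sort so templates can report a clean range.
    return sorted(set(flat))


# Made-up partition values mixing both layouts.
mixed = ["2022-12", ["2023-01", "2023-02"]]
flat = flatten_partition_values(mixed)
```

With a flat, sorted list, template code that expects simple values (for example, to print the first and last partition) would not need to know whether the raw archive was bundled.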

update the 860m doi

ee9b3bf

cmgosnell requested a review from e-belfer December 22, 2023 22:49

cmgosnell linked an issue Dec 22, 2023 that may be closed by this pull request

EIA 860M: Retool extraction to handle listed partitions #3186

Closed

e-belfer approved these changes Dec 26, 2023

e-belfer marked this pull request as ready for review December 26, 2023 15:58

zaneselvans self-requested a review December 26, 2023 16:28

zaneselvans requested changes Dec 26, 2023

Fix docs build

61bc757

e-belfer merged commit 542ce85 into cems-extraction Dec 26, 2023
12 of 13 checks passed

e-belfer deleted the eia860m-extraction branch December 26, 2023 17:08

cmgosnell mentioned this pull request Dec 29, 2023

cleanup the excel extractor to no longer look for a single file #3200

Closed

zaneselvans added the eia860, zenodo, and excel labels Feb 3, 2024