Create EIA API v2 fuel price archiving script #1762

Closed
2 tasks done
Tracked by #1708
zaneselvans opened this issue Jul 18, 2022 · 2 comments
Labels: eia923 (Anything having to do with EIA Form 923), new-data (Requests for integration of new data)

@zaneselvans
Member

zaneselvans commented Jul 18, 2022

The EIA API contains data that's not directly available from the spreadsheets they publish, including aggregate fuel prices which include redacted fuel deliveries to IPPs/merchant generators. The API itself isn't completely reliable, and we really shouldn't have to download the same data over and over again.

To provide reliable access to this information within PUDL:

  • Create a Python script that downloads the entire history of aggregate fuel prices ($/mmbtu) and quantities delivered (mmbtu), broken down by time step, geographic area, fuel type, and industrial sector. This script can be modeled after the EPA CEMS scraper, which works outside of Scrapy (since it's pulling from the API directly). Store the archived data in as close to its original form as possible, probably as a single zipped JSON file. Ideally, the script should be written so that we can easily add other data series from the EIA API to the archive if we need them in the future.
  • Integrate this script into the pudl-scrapers repo alongside the other non-scrapy script that we use to download the EPA CEMS data from their janky FTP server.
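
A minimal sketch of the archiving step described in the tasks above, not the actual implementation. It assumes a hypothetical `fetch_page(offset, length)` callable that returns one page of raw records as a list of dicts; a real one would wrap a request to the EIA API. The names `PageFetcher`, `fetch_all`, and `archive_series` are illustrative only. Each named series becomes one JSON member of a single zip file, so adding another series later only means adding another entry to the dict.

```python
import json
import zipfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical type: returns one page of raw records for (offset, length).
# A real fetcher would call the EIA API; this sketch leaves that out.
PageFetcher = Callable[[int, int], List[dict]]


def fetch_all(fetch_page: PageFetcher, page_size: int = 5000) -> List[dict]:
    """Request pages until one comes back shorter than page_size."""
    records: List[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        records.extend(page)
        if len(page) < page_size:
            return records
        offset += page_size


def archive_series(
    series: Dict[str, PageFetcher], out_path: Path, page_size: int = 5000
) -> None:
    """Write the unmodified records of each series as <name>.json in one zip."""
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, fetch_page in series.items():
            records = fetch_all(fetch_page, page_size)
            zf.writestr(f"{name}.json", json.dumps(records))
```

Keeping the fetcher injectable means the pagination and archive logic can be tested without touching the API, which matters given that the API is not completely reliable.
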
@zaneselvans added the eia923 and new-data labels Jul 18, 2022
@TrentonBush
Member

There are multiple possible sources of this data: the EIA API, the EIA API bulk data downloader, and the EIA Electric Power Monthly (EPM). I already compared the API vs. EPM in #1712.

After looking into the API and bulk downloader, there are differences there as well:

  • The API has many more fuel categories (45) than the bulk download (7). But many of the extra categories are irrelevant (renewables, nuclear, hydro, etc.) and I'm not sure how many have actual data.
  • The API has one additional sector: Electric Power Sector Non-CHP
  • The GUI API explorer doesn't let you select quarterly resolution for fuel receipts and costs, but it is present in both the bulk data and the actual API.

I assume the extra fuel categories are mostly nulls, but I'll check. Under that assumption, I think the operational advantages of the bulk download outweigh the few extra categories from the API.
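
A minimal sketch of that null check, assuming the records have already been loaded as a list of dicts. The field names `fuel` and `price` are placeholders, not confirmed API field names, and `null_share_by_category` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, Iterable


def null_share_by_category(
    records: Iterable[dict],
    category_key: str = "fuel",   # assumed field name
    value_key: str = "price",     # assumed field name
) -> Dict[str, float]:
    """Return the fraction of records with a missing value, per category."""
    counts = defaultdict(lambda: [0, 0])  # category -> [null count, total count]
    for rec in records:
        tally = counts[rec[category_key]]
        tally[1] += 1
        if rec.get(value_key) is None:
            tally[0] += 1
    return {cat: nulls / total for cat, (nulls, total) in counts.items()}
```

Categories with a share at or near 1.0 carry little or no actual data and would not be missed in the bulk download.
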

@zaneselvans
Member Author

@TrentonBush Is this issue done?
