Update OGE to work with PUDL v.2022.11.30 and integrate 2021 data #259
Remaining steps to do:
|
When running the pipeline with the newly downloaded EIA-930 data, exporting monthly consumed emission factors for SEC now fails with an error about negative emission factors. This suggests that the 2020 data may have been retroactively revised. The root cause is described in #214 and was patched in #221, where we created a list of BAs with this issue and told the pipeline to use the reported demand values from EIA-930 for them instead of calculating net demand from generation and interchange. SEC was not previously on this list, but it now appears to exhibit the same symptoms. The larger fix is captured in #220, but in the meantime there are a couple of ways we can patch this.
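For anyone reading along, here is a minimal sketch of the kind of patch #221 describes, not the actual OGE code. The set name, the function, and the example numbers are all hypothetical, and it assumes the EIA-930 sign convention in which positive total interchange is a net export, so that calculated demand is net generation minus total interchange.

```python
# Hypothetical list of BAs whose calculated net demand is unreliable.
# SEC is included here only to illustrate the proposed patch.
BAS_USING_REPORTED_DEMAND = {"SEC"}


def hourly_demand(ba, net_generation_mwh, total_interchange_mwh, reported_demand_mwh):
    """Return the demand to use for one BA in one hour.

    Assumes positive total interchange means a net export, so that
    calculated demand = net generation - total interchange.
    """
    if ba in BAS_USING_REPORTED_DEMAND:
        # Fall back to the demand value reported in EIA-930.
        return reported_demand_mwh
    return net_generation_mwh - total_interchange_mwh


# An hour where reported exports exceed generation: the calculated demand
# goes negative, which is what would produce a negative consumed factor.
calculated = hourly_demand("XYZ", 100.0, 120.0, 15.0)
patched = hourly_demand("SEC", 100.0, 120.0, 15.0)
```

In this toy hour the BA that is not on the list ends up with a negative calculated demand, while the listed BA uses the reported value instead.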
|
Patch negative efs
So we are currently running into an issue where the consumed emission calculation in step 18 is returning missing values for all hours in all regions through April of 2021 (for the 2021 year run). I think I've traced the source of this issue to
for the first 2,900 datetimes in the for loop (corresponding with first four months), the
The comment on this code suggests that this error would only be raised when we don't have full data for all of the BAs. However, I'm not sure why we would be missing data now when the current release of OGE was not. The new issue has to result from either 1) a change to the raw downloaded data from EIA or 2) a change to the way we are processing the data. I'm not noticing any code changes that would explain this, so I'm kind of stumped about the cause. To further trace the source, I actually tried running the consumed emission calculation using the
Another thing we may want to investigate: is this maybe a result of our manual timestamp cleaning? Gailin, I know that you looked into this already, but I'm wondering if something was corrected in the 930 balance files that we aren't catching, and that's leading to issues with several months of the data.
There are a few things related to the cleaning of the 930 data that I'm noticing that I had questions about (may or may not be related to the above issue):
|
One other thing I'm noticing about this issue: in 2021, it is affecting all data from 1/1/2021 - 4/30/2021. In 2020, it is only affecting data for the month of March. It's strange that the impact is so neatly cut off by month, which makes me wonder if there is a clue there: is there some step that we are doing that affects data on a month-by-month basis?
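One way to confirm the month-aligned pattern would be to tally the share of missing hourly values per month. This is a rough sketch in plain Python so it runs without dependencies; the function name and the toy data (missing through April, then a constant placeholder value) are made up for illustration and are not OGE output.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def missing_share_by_month(records):
    """Map (year, month) to the fraction of missing values in that month.

    `records` is an iterable of (datetime, value) pairs, where a missing
    value is either None or NaN.
    """
    total = defaultdict(int)
    missing = defaultdict(int)
    for timestamp, value in records:
        key = (timestamp.year, timestamp.month)
        total[key] += 1
        if value is None or math.isnan(value):
            missing[key] += 1
    return {key: missing[key] / total[key] for key in sorted(total)}


# Toy data: hourly values for January through June 2021 (181 days),
# with every hour through the end of April set to NaN.
start = datetime(2021, 1, 1)
records = []
for hour in range(181 * 24):
    timestamp = start + timedelta(hours=hour)
    records.append((timestamp, math.nan if timestamp.month <= 4 else 0.4))

shares = missing_share_by_month(records)
```

A share of exactly 1.0 for some months and 0.0 for the rest would point toward a step that operates on whole months, rather than scattered bad hours.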
As a side-note, because the interchange post physics-based cleaning is >1, it won't be set to zero in our filter for imputed ones. Also, the interchange is 1 in the balance files direct from EIA, so it's not noise introduced by |
The prior six months are needed for the rolling filter used by
No need to filter to 2021 values, since we only run the computation on dates in the OGE generation, which is limited to |
Actually, rewriting this check to make it more general will fix the HGMA bug and seems like the lowest impact option; I'll do that |
Run interchange check *after* converting nans to 0s
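To illustrate why the ordering in this commit matters, here is a small sketch. The check shown (net interchange summed over all BAs should be close to zero) is a hypothetical stand-in, not the actual check in the consumed calculation; the point is only that any comparison involving NaN evaluates to False, so a check run before the NaN-to-zero conversion can fail even when the non-missing data are consistent.

```python
import math


def nans_to_zero(values):
    """Replace NaN entries with 0.0, leaving other values unchanged."""
    return [0.0 if math.isnan(v) else v for v in values]


def interchange_balances(net_interchange_by_ba, tol=1e-6):
    """Hypothetical check: net interchange summed over BAs should be ~0."""
    return abs(sum(net_interchange_by_ba)) <= tol


# Two BAs trade 50 MWh with each other; a third has no reported value.
raw = [50.0, -50.0, math.nan]

# The sum is NaN, and `nan <= tol` is False, so the check fails.
check_before = interchange_balances(raw)

# After converting NaNs to zeros, the same data pass the check.
check_after = interchange_balances(nans_to_zero(raw))
```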
Fix consumed calc
This PR should close #163 |
This PR closes #258 by updating OGE to work with PUDL v2022.11.30, and integrating 2021 data.
NOTE: We should review and merge #246 into this branch before merging this branch into development.