Negative EFs for some BAs #214

ewezerek · 2022-09-02T22:57:08Z

Perhaps this can be explained by a more detailed reading of the OGEI methodology; some BAs have negative EFs. For example, for the annual 2020 data, SCL, PGE, and PSEI all have negative values. I would expect the minimum EF to be 0, which would represent a 100% renewable mix. Are these negative values errors? If not, would an entity really be able to report negative emissions from their use of carbon-free electricity procured from one of these BAs, or should the EFs be rounded up to 0?

grgmiller · 2022-09-02T23:31:55Z

Thanks for bringing this to our attention! This appears to be a bug somewhere in our pipeline.

Why can emissions factors be negative?

In certain rare cases, it is possible for a plant to have a negative emission rate. This is usually the result of a plant having positive emissions but negative net generation (which means that the plant is consuming more electricity from the grid than it generates). This may happen if a plant is idling or on standby: it is not exporting electricity to the grid but is still consuming some fuel. In the hourly data, we've also sometimes seen negative emission factors during plant startup, in an hour when the plant has started consuming fuel but has not yet started exporting electricity to the grid (#155). Often, though, this is only an issue in plant-level data, since it is rare for an entire region to have negative net generation once the data is aggregated to the regional level.

In general, though, reporting a negative emission rate can be confusing, so to be consistent with the methodology used in eGRID, we should probably set negative emission rates to zero.
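To make the arithmetic concrete, here is a minimal sketch of how a negative plant-level rate can arise and how clipping to zero would handle it. The plant names, column names, and numbers are all illustrative assumptions, not values or identifiers from the OGE pipeline.

```python
import pandas as pd

# Illustrative values only; none of these come from an OGE output file.
plants = pd.DataFrame(
    {
        "plant": ["running", "idling"],
        "co2_mass_lb": [50_000.0, 2_000.0],    # both plants burn some fuel
        "net_generation_mwh": [100.0, -5.0],   # idling plant draws from the grid
    }
)

# Positive emissions divided by negative net generation gives a negative rate.
plants["co2_rate_lb_per_mwh"] = plants["co2_mass_lb"] / plants["net_generation_mwh"]

# Setting negative rates to zero, as proposed above.
plants["co2_rate_clipped"] = plants["co2_rate_lb_per_mwh"].clip(lower=0)

# At the regional level the negative net generation is usually swamped.
regional_rate = plants["co2_mass_lb"].sum() / plants["net_generation_mwh"].sum()
```

In this toy case the idling plant gets a rate of -400 lb/MWh, while the two-plant regional rate stays positive, which matches the observation that the problem mostly shows up in plant-level data.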

This specific issue

The specific issue that you've reported seems to be unrelated to negative net generation, but instead seems to be a bug somewhere in the data pipeline.

Focusing on the 2020 PGE data:

Looking at the power sector data for PGE, it appears that there is no negative net generation reported in the hourly data, so that should not be the cause of the negative emissions.
The hourly carbon accounting file for PGE likewise has no negative values for any of the consumed emission rates, so it seems that this is not an issue with the consumed emissions calculation in general.
The negative EFs only show up in the monthly and annually-aggregated PGE values, which suggests an issue with the aggregation function (how could we get negative values when we're only aggregating positive hourly values?)

Possible sources to investigate:

Is the calculated net_consumed_mwh value somehow becoming negative as a result of the MRIO calculation?
Could this somehow be the result of output_data.round_table?

@gailin-p any other ideas?

TODO

Add a validation test to check for negative emissions rates before exporting the data
Before exporting generated emissions rates, replace any rates < 0 with 0
Fix bug in aggregated consumed outputs
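A rough sketch of what the validation check in the first TODO item could look like. The function name, the column-suffix convention, and the example data are hypothetical; the real output tables may name their rate columns differently.

```python
import pandas as pd


def find_negative_rates(df: pd.DataFrame, rate_suffix: str = "_rate_lb_per_mwh") -> dict:
    """Return {column: count of negative values} for every rate column.

    Assumes (hypothetically) that rate columns share a common suffix.
    """
    negative_counts = {}
    for col in df.columns:
        if col.endswith(rate_suffix):
            n_negative = int((df[col] < 0).sum())
            if n_negative > 0:
                negative_counts[col] = n_negative
    return negative_counts


# Illustrative table with one bad value.
example = pd.DataFrame(
    {
        "ba_code": ["AAA", "BBB", "CCC"],
        "generated_co2_rate_lb_per_mwh": [850.0, -12.0, 0.0],
        "consumed_co2_rate_lb_per_mwh": [700.0, 640.0, 15.0],
    }
)
problems = find_negative_rates(example)
```

A check like this could run just before export and either raise or log a warning when the returned dictionary is non-empty.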

grgmiller · 2022-09-03T16:59:44Z

After investigating this a bit further, here's a list of where we have negative emission factors in our data (just focused on 2020 for now):

Power sector data, hourly: ['AEC' 'IID' 'NEVP' 'NYIS' 'OHMS']
Power sector data, monthly: none
Power sector data, annual: none
Carbon accounting data, hourly: ['CPLW']
Carbon accounting data, monthly: ['PACW' 'PGE' 'PSEI' 'SCL' 'TPWR']
Carbon accounting data, annual: ['PGE' 'PSEI' 'SCL' 'TPWR']

For the power sector data, it seems like we just need to replace negative emission rates with zero, consistent with eGRID.

The source of the issue for carbon accounting data appears unrelated to the issue with the power sector data, since there is no overlap in the regions where the negative data exists between the power sector data (which is used as an input for the consumed emissions calculation) and the carbon accounting data.

I think I may have identified the source of this error: in consumed.output_results(), we calculate net_consumed_mwh as net_generation_mwh + total_interchange. However, it appears that the sign convention for total interchange is that negative total interchange is imports and positive total interchange is exports. Thus, net_consumed_mwh should be calculated as net_generation_mwh - total_interchange.
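A small worked example of the sign issue, as a sketch. The first case uses the cleaned EIA-930 values for one CPLW hour that are quoted later in this thread (573 MWh generation, 81 MWh total interchange, 492 MWh demand); the second case uses made-up numbers for an importing BA. Variable names are for illustration only.

```python
# Case 1: exporting BA (cleaned EIA-930 values quoted later in this thread).
net_generation_mwh = 573.0
total_interchange_mwh = 81.0   # positive = exports under this sign convention
demand_mwh = 492.0

buggy_net_consumed = net_generation_mwh + total_interchange_mwh   # 654.0
fixed_net_consumed = net_generation_mwh - total_interchange_mwh   # 492.0

# Case 2: importing BA (illustrative numbers, not real data).
importer_generation_mwh = 100.0
importer_interchange_mwh = -300.0   # negative = imports

buggy_importer = importer_generation_mwh + importer_interchange_mwh   # -200.0
fixed_importer = importer_generation_mwh - importer_interchange_mwh   # 400.0
```

With the subtraction, the exporting case reproduces the reported demand, and the importing case no longer goes negative, which is consistent with the negative annual values showing up for BAs that import heavily.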

@gailin-p since you're more familiar with the consumed emissions code, I wanted to clarify a few details:

It looks like in the eia930_elec.csv data at least, the sign convention on total interchange is negative means imports. Does our code intentionally reverse this sign convention at any point?
When running the consumed emissions matrix calculation, is that calculation using the correct sign convention, or is it reversing imports and exports?
Since in theory demand = generation - total interchange, why don't we just use the demand value instead of calculating net_consumed_mwh? Shouldn't the cleaned EIA-930 demand be equal to net_consumed_mwh? Or do the total interchange values get changed during the consumed emissions calculation?
I see that the code refers to both TI and ID when referring to interchange. What is the difference between these values? Is ID interchange between two specific BAs?

grgmiller · 2022-09-03T18:43:28Z

@gailin-p So I tested updating the calculation of net_consumed_mwh to be net_generation_mwh - total_interchange, and that seems to have mostly done the trick. There are now no negative EFs in the annual consumed data, the monthly data includes some negative values for EEI, and the hourly data only includes negative values for CPLW.

Looking at the CPLW values, the only negative values are extremely small (on the order of 2e-15 in magnitude), so updating our output_data.round_table() function to use the median value of a column rather than the minimum should fix this. For CPLW, the typical value is in the hundreds of lbs, but because this small negative value is present, all of the values are being rounded to 15 significant figures.
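The sketch below illustrates why the choice of reference value matters. It does not reproduce output_data.round_table(); it assumes, hypothetically, that the number of decimal places for a column is derived from the order of magnitude of a single reference value, and the helper name and sample data are invented for illustration.

```python
import math

import pandas as pd


def decimals_for_sig_figs(reference_value: float, sig_figs: int = 3) -> int:
    """Decimal places needed to keep `sig_figs` significant figures
    of `reference_value` (hypothetical helper, not the OGE function)."""
    if pd.isna(reference_value) or reference_value == 0:
        return sig_figs
    magnitude = math.floor(math.log10(abs(reference_value)))
    return max(sig_figs - 1 - magnitude, 0)


# Typical values in the hundreds plus one tiny negative value.
rates = pd.Series([412.3456, 398.7654, 405.1111, -2e-15])

decimals_from_min = decimals_for_sig_figs(rates.min())        # driven by -2e-15
decimals_from_median = decimals_for_sig_figs(rates.median())  # driven by ~402

rounded = rates.round(decimals_from_median)
```

Under these assumptions, the minimum-based choice asks for many decimal places, so nothing is effectively rounded, while the median-based choice rounds to whole numbers and the tiny negative value collapses to zero.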

Two things I wanted to follow up on though:

When digging around in the CPLW data, I noticed that for one hour (2020-06-23 11:00:00+00:00), the net_consumed_mwh is now reported as -69.6 MWh. However, looking at the cleaned data in eia930_elec.csv for the same hour, it appears that net generation was 573 MWh, demand was 492 MWh, and total interchange was 81 MWh (which in this sign convention means CPLW was exporting 81 MWh). I'm just wondering where the -69 MWh is coming from (when it should be +492)?
Is it even physically possible for net_consumed_mwh to be negative? Wouldn't this mean that a region is exporting more electricity than it generates?
I'm still wondering how we can be getting negative consumed emission rates in the monthly values for EEI when there are no negative consumed EFs in the hourly data. My suspicion is that it has to do with how the aggregation computes total emissions as consumed rate * net_consumed_mwh and then, after aggregating, recalculates the rate as consumed emissions / net_consumed_mwh. I haven't confirmed this, but I'm wondering whether, if both the consumed rate and net consumed mwh are negative, we end up with a positive value for consumed emissions, and then after aggregating we divide positive emissions by negative net consumed to get a negative rate. Is it possible for the consumed rate to be negative? If so, what does this mean?

TODO:

Update output_data.round_table() to use the median column value instead of the minimum
Update stoplight documentation of consumed emissions calculation to provide more detail about how this calculation works (once we've figured out this issue)

ewezerek · 2022-09-06T03:38:33Z

Thank you for the open documentation around your methodology. Looks like your team is working through this issue.

On a general note, as you develop the electricity EF dataset, consider that users have historically trusted eGRID as the go-to resource and it has the authority/brand recognition of EPA as the data provider. Similarly, carbon accountants refer to GHG Protocol as the go-to resource for emissions inventorying. The GHG Protocol offers a “Built on GHG Protocol” mark "for GHG Protocol to recognize products that have been developed in conformance with a GHG Protocol standard. Those that acquire the mark will benefit from the GHG Protocol’s reputation as the gold standard for GHG accounting." Consider partnering with EPA/asking for them to verify your methodology to reduce the friction of users potentially switching electricity EF data sources.

gailin-p · 2022-09-06T14:55:19Z

Some answers to Greg's specific questions:

It looks like in the eia930_elec.csv data at least, the sign convention on total interchange is negative means imports. Does our code intentionally reverse this sign convention at any point?

No, it's our aggregation code that's incorrect

When running the consumed emissions matrix calculation, is that calculation using the correct sign convention, or is it reversing imports and exports?

The matrix calc is correct.

open-grid-emissions/src/consumed.py

Line 162 in 799cac2

Imp = (-ID).clip(min=0) # trade matrix reports exports - we want imports
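As an illustration of what that line does, here is a minimal sketch on a made-up three-BA trade matrix. It assumes ID is a NumPy array (the `min=` keyword is the NumPy spelling of clip) and that `ID[i, j] > 0` means BA i exports to BA j; the index orientation is an assumption for this example, not something confirmed by the code excerpt.

```python
import numpy as np

# Hypothetical trade matrix: ID[i, j] > 0 means BA i exports to BA j,
# so the matrix is antisymmetric (ID[j, i] == -ID[i, j]).
ID = np.array(
    [
        [0.0, 50.0, -20.0],
        [-50.0, 0.0, 10.0],
        [20.0, -10.0, 0.0],
    ]
)

# Flip the sign so imports are positive, then drop the export entries.
Imp = (-ID).clip(min=0)

# Total imports into each BA.
total_imports = Imp.sum(axis=1)
```

Here BA 0 imports 20 from BA 2, BA 1 imports 50 from BA 0, and BA 2 imports 10 from BA 1; every export entry is zeroed out.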

Since in theory demand = generation - total interchange, why don't we just use the demand value instead of calculating net_consumed_mwh? Shouldn't the cleaned EIA-930 demand be equal to net_consumed_mwh? Or do the total interchange values get changed during the consumed emissions calculation?

We use our output generation data to calculate net_consumed_mwh, which is not equal to 930 generation. It's the 930 generation that fulfills demand = generation - total interchange. As a side note, that equality will always hold for 930 data after gridemissions cleaning, even when it doesn't hold for raw 930 data.

We could use 930 generation to calculate net_consumed_mwh. This would mean that in the consumed calculation we would only be using our data for generated emission intensity. I originally chose to use our net_generation number to calculate net_consumed_mwh because in general we trust our generation over 930-reported generation, but it does mean that we're mixing data sources

I see that the code refers to both TI and ID when referring to interchange. What is the difference between these values? Is ID interchange between two specific BAs?

Yes, ID is for BA-specific interchange, while TI is total interchange. This naming convention is from EIA's V1 API; we use it because it's used in gridemissions, which we use for 930 cleaning.

When digging around in the CPLW data, I noticed that for one hour (2020-06-23 11:00:00+00:00), the net_consumed_mwh is now reported as -69.6 MWh. However, looking at the cleaned data in eia930_elec.csv for the same hour, it appears that net generation was 573 MWh, demand was 492 MWh, and total interchange was 81 MWh (which in this sign convention means CPLW was exporting 81 MWh). I'm just wondering where the -69 MWh is coming from (when it should be +492)?

I will investigate this. The issue may be that we're not using 930 data for net generation. (see above)

Is it even physically possible for net_consumed_mwh to be negative? Wouldn't this mean that a region is exporting more electricity than it generates?

It is not physically possible, but it can occur in our results because we're sourcing interchange data from 930 and generation data from our power system results. One solution would be to run the gridemissions physics-based cleaning on our net generation results before the consumed emissions calculation, which would ensure that interchange is always physically possible. A simpler solution would be to limit interchange to min(interchange, net_generation).
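The simpler option could look roughly like this. The arrays hold illustrative numbers, and the variable names are placeholders rather than the names used in consumed.py.

```python
import numpy as np

# Illustrative hourly values; positive interchange = exports.
net_generation_mwh = np.array([120.0, 10.0, 300.0])
total_interchange_mwh = np.array([40.0, 80.0, -50.0])

# Without a cap, hour 2 exports more than it generates.
uncapped_net_consumed = net_generation_mwh - total_interchange_mwh

# Cap exports at net generation; imports (negative values) are unaffected.
capped_interchange = np.minimum(total_interchange_mwh, net_generation_mwh)
capped_net_consumed = net_generation_mwh - capped_interchange
```

The cap only bites in hours where reported exports exceed our own net generation, so net consumption bottoms out at zero instead of going negative. It does not reconcile the two data sources the way the physics-based cleaning would.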

I'm still wondering how we can be getting negative consumed emission rates in the monthly values for EEI when there are no negative consumed EFs in the hourly data. My suspicion is that it has to do with how the aggregation computes total emissions as consumed rate * net_consumed_mwh and then, after aggregating, recalculates the rate as consumed emissions / net_consumed_mwh. I haven't confirmed this, but I'm wondering whether, if both the consumed rate and net consumed mwh are negative, we end up with a positive value for consumed emissions, and then after aggregating we divide positive emissions by negative net consumed to get a negative rate. Is it possible for the consumed rate to be negative? If so, what does this mean?

As discussed above, consumed_mwh should never be negative, so fixing that may fix this.

gailin-p · 2022-09-07T02:09:30Z

After fixing the sign switch in the calculation of net_consumed_mwh, we found that:

Some BAs still had negative hourly rates. The largest magnitude of these was 1e-11, indicating that they are probably zero rates in BAs exporting all their generation or generating only zero-carbon energy. These values will round to zero during output.
EEI still has negative rates after aggregation, an issue caused by negative net_consumed_mwh in hours when our net_generation_mwh is less than the EIA-930 interchange. During aggregation, we weight rates by net_consumed_mwh so that if, for example, a BA consumes more during the day, the consumed emission rates during the day are weighted more heavily than those at night. This doesn’t make sense if consumed_mwh is sometimes negative: first, because we know that those negative values are the erroneous result of combining 930 data with our own data, and second, because it can introduce negative rates.

We see negative net_consumed_mwh during 2020 in SPA, CPLW, SEC, GCPD, AZPS, and EEI. In SEC, the issue is isolated to a few hours. In the rest of the BAs, we assume inconsistencies between our net_generation_mwh and EIA-930 interchange result in negative net_consumed_mwh during enough hours that the resulting aggregations are inaccurate. To patch this issue, we will use EIA-930 demand (after gridemissions cleaning) to weight consumed emission rates during aggregation to monthly and annual time periods.
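To show how negative weights can turn all-positive hourly rates into a negative aggregate, and how weighting by demand avoids it, here is a minimal two-hour sketch. The column names, the helper function, and the numbers are illustrative assumptions, not the actual aggregation code.

```python
import pandas as pd

# Two illustrative hours: both hourly rates are positive, but one hour
# has a (physically impossible) negative net_consumed_mwh.
hourly = pd.DataFrame(
    {
        "consumed_rate_lb_per_mwh": [100.0, 900.0],
        "net_consumed_mwh": [10.0, -5.0],
        "demand_mwh": [10.0, 6.0],   # stand-in for cleaned EIA-930 demand
    }
)


def weighted_rate(df: pd.DataFrame, weight_col: str) -> float:
    """Aggregate rate = sum(rate * weight) / sum(weight)."""
    emissions = (df["consumed_rate_lb_per_mwh"] * df[weight_col]).sum()
    return emissions / df[weight_col].sum()


rate_by_net_consumed = weighted_rate(hourly, "net_consumed_mwh")  # -700.0
rate_by_demand = weighted_rate(hourly, "demand_mwh")              # 400.0
```

Because demand is non-negative, the demand-weighted result always lies between the smallest and largest hourly rate, whereas a single negative weight can push the aggregate outside that range.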

In the future, we may implement gridemissions physics-based cleaning after calculation of net_generation_mwh, ensuring that interchange would always be consistent with generation. See #220

gailin-p · 2022-09-07T02:48:21Z

In power_sector_data outputs, we will handle negative rates by setting them to zero (following eGRID's methodology). Before rounding, we confirmed that these events are localized.

BA	fuel	# negative CO2 rate events in 2020
AEC	coal	34
IID	natural_gas	95
NEVP	coal	7
NYIS	petroleum	490
OHMS	coal	4

The magnitude of these events can be very large (below -1,000,000 lb CO2/MWh), but they are rare and limited to fossil fuel types, which suggests that they occur during plant startup.

The most widespread issues occur in NYIS, where there are multiple negative rate events in August.

grgmiller · 2022-09-08T15:50:08Z

This issue has been fixed with our latest patch release.

Thank you again for flagging this!

grgmiller added the bug (Something isn't working) and data outputs (New output files or data fields that should be added) labels Sep 2, 2022

grgmiller assigned gailin-p Sep 3, 2022

gailin-p mentioned this issue Sep 7, 2022

Re-assess cleaning of EIA-930 data #220

Open

gailin-p mentioned this issue Sep 7, 2022

Gailin/fix consumed sign bug #221

Merged

grgmiller closed this as completed Sep 8, 2022

This was referenced Dec 20, 2022

Update OGE for to work with PUDL v.2022.11.30 and integrate 2021 data #259

Merged

Further patching the negative consumed EF issue #263

Closed
