FERC regressions #204

alanawlsn · 2018-09-28T18:46:51Z

Implement regressions within FERC O&M dataset that will allow us to attribute fixed vs. variable cost components at the (FERC) plant level.

zaneselvans · 2018-09-28T19:52:08Z

Also, the scikit-learn Python library has a nice page talking about its collection of linear models, with some background on each one.

The end goal of looking at these costs is really to build some kind of model of the cost per MWh on a per plant basis, that depends on... what? Like, if we wanted to predict the marginal cost of electricity for a given plant, what would we need? I think the relevant inputs (independent variables, X) end up being something like:

plant capacity (MW)
plant heat rate (mmBTU/MWh)
capacity factor (unitless)
fuel heat content (mmBTU/unit)
fuel price ($/mmBTU or $/unit)
primary fuel type (categorical -- coal or gas)
plant technology (categorical)
year of construction (number? or is it categorical?)

And the output we're trying to obtain is just the cost of a marginal unit of electricity production ($/MWh).

Especially given how little information we have about what goes into all of those different FERC cost categories, and whether or not the utilities are reporting those cost categories in a really standard way, I wonder if we might have better luck just using these simpler inputs / output in the regression?

These inputs would allow terms that are a function of plant capacity (i.e. $/MW installed), as well as capacity factor, and of course also true fixed costs. The regression wouldn't be confounded by fuel cost volatility, which is probably one of the larger sources of variance, since it would have all the information required to get the fuel cost component right (heat rate, fuel price, fuel heat content). We could even just leave in all the different fuels with their different prices and heat contents separated if we wanted to (since some coal plants use a non-trivial fraction of gas or oil).

Does that seem sensible?

zaneselvans · 2018-09-28T19:52:15Z

Thinking about this some more... don't we really just need to figure out a model for the non-fuel costs since we know pretty much exactly how the fuel costs contribute to the overall cost per MWh? Then we just need to fit the remaining non-fuel costs per MWh to a function of the plant capacity, capacity factor, plant & fuel type, and year/decade of construction. We can leave out the fuel price, heat rate, heat content, etc. since we can assemble that function from scratch, and (I think) have little reason to believe that those variables would have much influence on the other plant costs. Would we want to include an interaction term to find a dependence on (capacity * capacity factor) (aka net generation) in addition to capacity and capacity factor independently? Or are we interested in just the capacity & net generation terms?
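A rough sketch of that idea, assuming the fuel component (heat rate in mmBTU/MWh times fuel price in $/mmBTU) has already been subtracted out: fit the remaining non-fuel cost to capacity, capacity factor, and their interaction. This uses plain NumPy least squares rather than any particular scikit-learn model, and the data are synthetic, built from known coefficients so the fit can be checked. None of the numbers come from FERC or EIA.

```python
import numpy as np

# Invented plant-year inputs.
capacity_mw = np.array([100.0, 200.0, 300.0, 400.0, 500.0, 600.0])
capacity_factor = np.array([0.3, 0.8, 0.5, 0.6, 0.4, 0.7])

# Design matrix: intercept, capacity, capacity factor, and the
# interaction term (capacity * capacity factor), which is proportional
# to net generation: net_gen_mwh = capacity_mw * capacity_factor * 8760.
X = np.column_stack([
    np.ones_like(capacity_mw),
    capacity_mw,
    capacity_factor,
    capacity_mw * capacity_factor,
])

# Synthetic non-fuel cost ($/year) generated from assumed coefficients:
# $1M fixed, $20k per MW, $500k per unit capacity factor, and $3/MWh.
true_coef = np.array([1.0e6, 2.0e4, 5.0e5, 3.0 * 8760])
nonfuel_cost = X @ true_coef

# Ordinary least squares fit of the non-fuel cost model.
coef, *_ = np.linalg.lstsq(X, nonfuel_cost, rcond=None)
intercept, per_mw, per_cf, per_mw_cf = coef

# Convert the interaction coefficient back to a per-MWh cost.
variable_per_mwh = per_mw_cf / 8760
print(f"fixed: {intercept:,.0f} $/yr, per MW: {per_mw:,.0f} $/MW-yr, "
      f"variable: {variable_per_mwh:.2f} $/MWh")
```

Dropping the capacity factor column would leave just the capacity and net generation terms, which is the simpler alternative raised in the question above.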

michaelpburt · 2018-09-28T20:36:18Z

Hi Zane, This may not be what you are looking for, but many ISOs publish guidance on what the VOM (variable operations and maintenance) costs are on a per-technology basis (e.g. supercritical coal, subcritical coal, CT, CC, hydro, etc.). This guidance is used widely as an input in marginal cost models. See page 22 of this PJM manual: https://www.pjm.com/-/media/documents/manuals/archive/m15/m15v28-cost-development-guidelines-10-18-2016.ashx

In my experience, VOM is usually sufficient to encompass all costs beyond fuel input and carbon & MATS compliance related costs. Those costs include things like fly ash, urea, chlorine, or other inputs into scrubbers and such. I am not sure what the magnitude of those costs is, but I bet there is some pretty good documentation out there. Off the top of my head, I think they are around $1-$3 per MWh for big nasty coal plants.

zaneselvans · 2018-10-17T09:14:44Z

@michaelpburt I definitely don't completely understand the calculations that PJM is describing in that document, but the sense I got was that there's an acceptable VOM number that generators can include in their prices based on the technology of the generator, and that that number may be different from the actual variable expenses they've experienced. Is that right? Is that to compensate for typical expenses that just haven't been experienced by a generator yet? Like how you know the cost of maintenance on a new car isn't $0/mi even if it might look that way for the first few years of operation? Are the expenses small enough (relative to fuel) and/or uniform enough across different plants of a given technology that it's not really worth trying to extract the particular per-plant expenses? Is the effect of expected but as yet unrealized O&M large enough that these categorical estimates are more useful than real per-plant expenditures?

gschivley · 2019-03-20T19:52:00Z

@alanawlsn and @zaneselvans Let me know if there's anything I can do to help with the cost calculations.

zaneselvans · 2019-03-21T23:44:07Z

Okay, I've merged together annualized records from FERC and EIA on the basis of their report_year, plant_id_pudl and primary fuel type, and plotted some of the more interesting values which are available in both datasets (capacity, fuel cost, total heat content of fuel consumed, net generation), as well as some derived values (heat rate, fuel cost per MWh and mmBTU, capacity factor) against each other, separated out for the coal and gas portions of each of the power plants. The results are below.

One thing that I noted: there were only about 1450 records shared between the two datasets, which seems kind of small (this data is for 2009-2017, the years which we have for both of them). Now in retrospect I realize this is probably (yet again) an artifact of the NA values that are common in the EIA data wiping out a bunch of the aggregated values.
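For illustration, here is a small pandas sketch of that kind of merge and the derived values, including one plausible way a missing EIA value could knock a record out. The thread does not confirm that this is the actual mechanism. Only the three key columns are named above; the other column names and all values are hypothetical.

```python
import numpy as np
import pandas as pd

keys = ["report_year", "plant_id_pudl", "primary_fuel"]

# Invented annual FERC-side records.
ferc = pd.DataFrame({
    "report_year": [2016, 2016, 2017],
    "plant_id_pudl": [1, 2, 1],
    "primary_fuel": ["coal", "gas", "coal"],
    "capacity_mw": [500.0, 300.0, 500.0],
    "net_generation_mwh": [2_190_000.0, 1_314_000.0, 2_628_000.0],
})

# Invented annual EIA-side records, one with a missing heat content.
eia = pd.DataFrame({
    "report_year": [2016, 2016, 2017],
    "plant_id_pudl": [1, 2, 1],
    "primary_fuel": ["coal", "gas", "coal"],
    "total_heat_content_mmbtu": [21_900_000.0, np.nan, 26_280_000.0],
})

# Keep only plant-years present in both datasets.
merged = ferc.merge(eia, on=keys, how="inner")

# Derived values; 8760 hours per year ignores leap years.
merged["capacity_factor"] = (
    merged["net_generation_mwh"] / (merged["capacity_mw"] * 8760)
)
merged["heat_rate_mmbtu_per_mwh"] = (
    merged["total_heat_content_mmbtu"] / merged["net_generation_mwh"]
)

# A NaN input propagates into the derived value, so dropping incomplete
# rows shrinks the usable set even though the keys matched.
usable = merged.dropna(subset=["heat_rate_mmbtu_per_mwh"])
print(len(merged), "merged records,", len(usable), "with a usable heat rate")
```

Counting rows before and after the `dropna` step is a quick way to see how much of the shortfall comes from missing values rather than from unmatched keys.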

Generally it looks better than I expected it would. Thoughts? @alanawlsn @gschivley @cmgosnell

gschivley · 2019-03-22T13:10:22Z

I'm not as familiar with FERC data - who is required to report to them? The plots do show a nice agreement.

Each record in the FERC Form 1 corresponds to a particular type of fuel. Many plants -- especially coal plants -- use more than one fuel, with gas and/or diesel serving as startup fuels. In order to be able to classify the type of plant based on relative proportions of fuel consumed or fuel costs it is useful to aggregate these per-fuel records into a single record for each plant. Fuel cost (in nominal dollars) and fuel heat content (in mmBTU) are calculated for each fuel based on the cost and heat content per unit, and the number of units consumed, and then summed by fuel type (there can be more than one record for a given type of fuel in each plant because we are simplifying the fuel categories). The per-fuel records are then pivoted to create one column per fuel type. The total is summed and stored separately, and the individual fuel costs & heat contents are divided by that total, to yield fuel proportions. Based on those proportions and a minimum threshold that's passed in, a "primary" fuel type is then assigned to the plant-year record and given a string label. Also required for FERC non-fuel OpEx regressions in #204
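The aggregation described above can be sketched in plain Python for a single plant-year, using fuel cost as the basis (heat content would work the same way). The function name, record fields, and threshold default are hypothetical and are not taken from the PUDL code; the real implementation pivots a table rather than looping over records.

```python
from collections import defaultdict


def primary_fuel_by_cost(fuel_records, min_fraction=0.5):
    """Return (primary fuel label or None, cost fraction per fuel type).

    fuel_records: per-fuel records for ONE plant-year, each with
    'fuel_type', 'cost_per_unit' and 'units_consumed'.
    """
    # Total cost per simplified fuel type; a fuel type may appear in
    # more than one record, so the costs are summed.
    cost_by_fuel = defaultdict(float)
    for rec in fuel_records:
        cost_by_fuel[rec["fuel_type"]] += (
            rec["cost_per_unit"] * rec["units_consumed"]
        )

    total = sum(cost_by_fuel.values())
    if total <= 0:
        return None, {}

    # Divide each fuel's cost by the plant total to get proportions.
    fractions = {fuel: cost / total for fuel, cost in cost_by_fuel.items()}

    # Assign a primary fuel only if the largest share meets the threshold.
    top_fuel, top_fraction = max(fractions.items(), key=lambda kv: kv[1])
    primary = top_fuel if top_fraction >= min_fraction else None
    return primary, fractions


# Invented records: two coal entries plus gas used as a startup fuel.
records = [
    {"fuel_type": "coal", "cost_per_unit": 40.0, "units_consumed": 1000.0},
    {"fuel_type": "coal", "cost_per_unit": 50.0, "units_consumed": 1000.0},
    {"fuel_type": "gas", "cost_per_unit": 5.0, "units_consumed": 2000.0},
]
primary, fractions = primary_fuel_by_cost(records, min_fraction=0.5)
print(primary, fractions)
```

Returning `None` when no fuel reaches the threshold keeps mixed-fuel plants from being mislabeled, at the cost of leaving some plant-years unclassified.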

cmgosnell · 2021-09-02T14:07:15Z

Closing because this is no longer relevant. We've generally learned that it is difficult to impossible to categorize the specific O&M lines in FERC as fixed and variable O&M. We have been employing NEMS' breakdown of fixed and variable O&M. See example here.

alanawlsn self-assigned this Sep 28, 2018

zaneselvans added the ferc1 (anything having to do with FERC Form 1) and analysis (data analysis tasks that involve actually using PUDL to figure things out, like calculating MCOE) labels Sep 28, 2018

zaneselvans added this to the PUDL v0.1 alpha release milestone Sep 28, 2018

cmgosnell mentioned this issue Mar 20, 2019

Apply DataZipper to FERC / EIA Plant Mapping #212

Closed

cmgosnell mentioned this issue Apr 15, 2019

Infer plant characteristics for missing FERC plants #272

Open

zaneselvans modified the milestones: 0.1.0, 0.2.0 Jun 28, 2019

zaneselvans modified the milestones: 0.3.0, future_release Sep 23, 2019

cmgosnell removed this from the future_release milestone Oct 4, 2019

cmgosnell closed this as completed Sep 2, 2021
