
Sanity check EIA fuel price predictions visually #1720

Closed · 3 of 4 tasks · Tracked by #1708
zaneselvans opened this issue Jun 27, 2022 · 12 comments
Labels: analysis (Data analysis tasks that involve actually using PUDL to figure things out, like calculating MCOE.) · data-repair (Interpolating or extrapolating data that we don't actually have.) · eia923 (Anything having to do with EIA Form 923)

zaneselvans (Member) commented Jun 27, 2022:

The model error metric gives an extremely condensed (zero-dimensional) view of whether we're predicting fuel prices correctly. Develop some additional visualizations that let us see whether things look right, or at least good enough (a starter plotting sketch follows the lists below).

Compare by state & fuel type:

  • Reported data
  • HistGBR predictions of reported data
  • Previously implemented groupby-weighted-median aggregations
  • Aggregated fuel prices obtained from the EIA API

Visualizations

  • 1-D histograms
  • 2-D histograms (to show dispersion between models)
  • Break down error metrics by fuel
  • Monthly time series
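
A minimal sketch of the 1-D histogram comparison, assuming a dataframe `frc` with hypothetical column names `fuel_cost_per_mmbtu` (reported), `fuel_cost_per_mmbtu_predicted` (HistGBR), `fuel_cost_per_mmbtu_wm` (weighted median), `state`, and `fuel_group_eiaepm`; this is an illustration, not the actual notebook code:

```python
import matplotlib.pyplot as plt

def compare_price_distributions(frc, state, fuel, max_price=20.0, bins=50):
    """Overlay reported vs. modeled fuel price histograms for one state & fuel."""
    df = frc.loc[(frc.state == state) & (frc.fuel_group_eiaepm == fuel)]
    fig, ax = plt.subplots()
    for col, label in [
        ("fuel_cost_per_mmbtu", "reported"),
        ("fuel_cost_per_mmbtu_predicted", "HistGBR"),
        ("fuel_cost_per_mmbtu_wm", "weighted median"),
    ]:
        # Step histograms make it easy to see where the distributions diverge.
        ax.hist(df[col].dropna(), bins=bins, range=(0, max_price),
                histtype="step", label=label)
    ax.set_xlabel("fuel cost [$/MMBTU]")
    ax.set_ylabel("count")
    ax.set_title(f"{state} / {fuel}")
    ax.legend()
    return fig
```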
zaneselvans added the eia923, analysis, and data-repair labels and self-assigned this on Jun 27, 2022
zaneselvans (Member, Author) commented:

For states with high fuel prices, the model often seems to be influenced pretty strongly by other markets with lower prices, and ends up systematically off. It also predicts high-price excursions, even in states where they don't seem to happen:

[two images: per-state fuel price distributions, reported vs. predicted]
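
One way to quantify that "systematically off" pattern is an MMBTU-weighted mean prediction error per state; a sketch, with the same assumed column names as above:

```python
import numpy as np

def weighted_bias_by_state(frc, fuel="natural_gas"):
    """MMBTU-weighted mean (predicted - reported) price error per state."""
    df = frc.loc[
        (frc.fuel_group_eiaepm == fuel)
        & frc.fuel_cost_per_mmbtu.notna()
        & frc.fuel_cost_per_mmbtu_predicted.notna()
        & frc.fuel_received_mmbtu.notna()
    ].copy()
    df["err"] = df.fuel_cost_per_mmbtu_predicted - df.fuel_cost_per_mmbtu
    # Negative values mean the model is biased low in that state.
    return (
        df.groupby("state")
        .apply(lambda g: np.average(g["err"], weights=g["fuel_received_mmbtu"]))
        .sort_values()
    )
```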

zaneselvans (Member, Author) commented:

Looking at some 2-d histograms, even when the price distribution looks pretty similar overall, there's not necessarily great 1-to-1 correlation between individual reported and predicted prices.

[image: 2-D histogram of individual reported vs. predicted prices]
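
The 2-D histograms here can be reproduced with something like this sketch (column names again assumed; weighting by MMBTU keeps big deliveries from being drowned out):

```python
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

def reported_vs_predicted(frc, max_price=20.0, bins=100):
    """MMBTU-weighted 2-D histogram of reported vs. predicted prices."""
    df = frc.dropna(subset=[
        "fuel_cost_per_mmbtu", "fuel_cost_per_mmbtu_predicted", "fuel_received_mmbtu",
    ])
    fig, ax = plt.subplots()
    ax.hist2d(
        df["fuel_cost_per_mmbtu"],
        df["fuel_cost_per_mmbtu_predicted"],
        bins=bins,
        range=[(0, max_price), (0, max_price)],
        weights=df["fuel_received_mmbtu"],
        norm=LogNorm(),  # log color scale, since bin densities span orders of magnitude
    )
    ax.plot([0, max_price], [0, max_price], ls="--", color="w", lw=1)  # 1:1 reference
    ax.set_xlabel("reported fuel cost [$/MMBTU]")
    ax.set_ylabel("predicted fuel cost [$/MMBTU]")
    return fig
```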

zaneselvans (Member, Author) commented:

I looked at all of the per-state natural_gas price distributions and saw that the model predictions were biased low across the board. Some of them had the right range, many were low, but they were basically never high, which seemed weird.

But then I looked at the coal prices, and of course there the opposite was true -- all the predictions are biased high. How can we make the model differentiate more strongly between the fuel types? Would it make sense to train separate models for each fuel type? (See the sketch below.) The long tail of high price excursions, which I think comes entirely from gas and petroleum, also shows up in the coal price distributions.

[four images: per-state natural gas and coal price distributions]
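
A minimal sketch of that "one model per fuel type" idea, fitting an independent HistGradientBoostingRegressor per fuel group so coal pricing can't leak into gas or petroleum predictions. It assumes the features are already numeric or encoded, and the column names follow the rest of this thread; it is just the idea, not what's in the notebook:

```python
from sklearn.ensemble import HistGradientBoostingRegressor

def fit_per_fuel_models(frc, feature_cols):
    """Fit one HistGBR per fuel_group_eiaepm, weighted by heat content delivered."""
    models = {}
    for fuel, group in frc.groupby("fuel_group_eiaepm", observed=True):
        train = group.dropna(
            subset=feature_cols + ["fuel_cost_per_mmbtu", "fuel_received_mmbtu"]
        )
        model = HistGradientBoostingRegressor(loss="absolute_error")
        model.fit(
            train[feature_cols],
            train["fuel_cost_per_mmbtu"],
            sample_weight=train["fuel_received_mmbtu"],
        )
        models[fuel] = model
    return models
```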

zaneselvans (Member, Author) commented Jun 27, 2022:

One thing that seems to be going on is that it predicts coal- and natural-gas-like price distributions for petroleum, and seems to bring petroleum-like high-price outliers into the coal+gas distributions. There are way fewer records / MMBTU representing petroleum, so it'll be hard for the model to integrate petroleum-specific pricing information unless it knows it only applies to the other petroleum records.

[three images: petroleum, coal, and gas price distributions]

zaneselvans (Member, Author) commented:

It seems to have this same issue with or without the weighting by received MMBTU, and it persists even when I narrow the features down to just fuel type, state, report month, and elapsed days.

I feel like I must be doing something wrong. Isn't the idea with this kind of regression that it can identify different regimes in one variable (like fuel type) that indicate different ranges of desirable predictions?

Would it help to have an anchoring value that can be modulated by other variables? E.g. a state/regional/national average price for the type of fuel each record contains?
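
A sketch of that anchoring idea: attach a state/fuel/month median price to every record as a feature the trees can then modulate. Column names are assumptions consistent with this thread, and in real use the median would need to come from training data only, to avoid leaking the answer:

```python
def add_anchor_price(frc):
    """Add a per-(state, fuel, month) median price column as a model feature."""
    frc = frc.copy()
    frc["state_fuel_month_median_price"] = (
        frc.groupby(["state", "fuel_group_eiaepm", "report_month"], observed=True)
        ["fuel_cost_per_mmbtu"]
        .transform("median")  # NB: includes each record's own price; split train/test first in practice
    )
    return frc
```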

zaneselvans (Member, Author) commented:

Restricting the model to just NG records (with very few features), the predictions get a bit less blobby and more diagonal:

[image: 2-D histogram for the NG-only model]

TrentonBush (Member) commented:

That model certainly looks bad! What's your model score?

Also, I'm confused by the y-axis on these histograms. The dataset only has 500k points, but the y-axes are scaled from 1e6 to 1e9? I tried to reproduce the Arkansas coal histogram but get something quite different (with whatever model I had in a notebook from last week):
[image: Arkansas coal histogram reproduction]

And the correlation is much more linear:
[image: reported vs. predicted scatter plot]

zaneselvans (Member, Author) commented Jun 27, 2022:

The numbers are larger than the number of samples because they're weighted by MMBTU. It seems crazy to me that a unit train (~10,000 tons of coal) would get the same prominence as 1 mcf of natural gas or 1 bbl of diesel fuel oil, which is what the unweighted distribution would show, right?
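
For concreteness, that weighting is just histogram weights, so the y-axis measures MMBTU delivered rather than record count (a sketch, assuming a dataframe `frc` as elsewhere in this thread):

```python
import numpy as np

mask = frc["fuel_cost_per_mmbtu"].notna() & frc["fuel_received_mmbtu"].notna()
counts, edges = np.histogram(
    frc.loc[mask, "fuel_cost_per_mmbtu"],
    bins=50,
    range=(0, 20),
    weights=frc.loc[mask, "fuel_received_mmbtu"],  # y-axis is MMBTU, not deliveries
)
```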

Your scatter plot looks way way better! Let me re-run with a single model in GridSearchCV and see what the score looks like now. I think the last one I saw was -0.48. What kind of scores are you getting?

TrentonBush (Member) commented:

The one I have from last week uses these features:

```python
["fuel_group_eiaepm", "state", "report_month", "plant_id_eia", "elapsed_days", "fuel_mmbtu_per_unit"]
```

and these hyperparams:

```python
param_grid = {
    "hist_gbr__max_depth": [7],
    "hist_gbr__max_leaf_nodes": [2**7],
    "hist_gbr__learning_rate": [0.1],
    "hist_gbr__min_samples_leaf": [25],
}
```

to get a neg_median_absolute_error of -0.4155
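
For anyone reproducing this: the "hist_gbr__" prefixes imply an sklearn Pipeline with a step named hist_gbr. A sketch of what such a setup might look like (the ordinal encoding of the categoricals is an assumption, not necessarily what the actual notebook does):

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

categorical = ["fuel_group_eiaepm", "state"]
numeric = ["report_month", "plant_id_eia", "elapsed_days", "fuel_mmbtu_per_unit"]

pipe = Pipeline([
    # Trees don't need scaling, but the categorical columns must become numeric codes.
    ("encode", ColumnTransformer(
        [("cats",
          OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
          categorical)],
        remainder="passthrough",
    )),
    ("hist_gbr", HistGradientBoostingRegressor()),
])

search = GridSearchCV(pipe, param_grid, scoring="neg_median_absolute_error", cv=5)
# search.fit(frc[categorical + numeric], frc["fuel_cost_per_mmbtu"])
```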

zaneselvans (Member, Author) commented:

Well it turns out my problem was a mixup between pandas and numpy indexing, and when the indexes are aligned, everything is fine. 🙄

[seven images: corrected per-state price distributions and reported vs. predicted correlation plots]

zaneselvans (Member, Author) commented Jun 28, 2022:

I was curious how the quality of predictions varies across fuels, and it looks like we do well on all the major ones: all types of coal and natural gas, plus DFO/RFO. Not so great on petcoke. But overall this seems great!

```python
import pandas as pd
from sklearn.metrics import mean_absolute_percentage_error

def fuel_price_wmape(df, predict_col):
    """MMBTU-weighted mean absolute percentage error for one group of records."""
    return mean_absolute_percentage_error(
        df["fuel_cost_per_mmbtu"],
        df[predict_col],
        sample_weight=df["fuel_received_mmbtu"],
    )

def err_by_fuel(frc):
    """Compare weighted-median vs. HistGBR error, grouped by fuel code and fuel group."""
    fuel_cols = ["energy_source_code", "fuel_group_eiaepm"]
    out_df = pd.DataFrame()
    for col in fuel_cols:
        valid_rows = (
            frc["fuel_cost_per_mmbtu"].notna()
            & frc["fuel_received_mmbtu"].notna()
            & frc[col].notna()
        )
        gb = frc.loc[valid_rows].groupby(col, observed=True)
        wm_err = gb.apply(fuel_price_wmape, predict_col="fuel_cost_per_mmbtu_wm").to_frame(name="wm_wmape")
        hgbr_err = gb.apply(fuel_price_wmape, predict_col="fuel_cost_per_mmbtu_predicted").to_frame(name="hgbr_wmape")
        total_mmbtu = gb["fuel_received_mmbtu"].sum().to_frame(name="total_mmbtu")
        out_df = pd.concat([out_df, pd.concat([wm_err, hgbr_err, total_mmbtu], axis="columns")])
    return out_df.sort_values("total_mmbtu", ascending=False)

err_by_fuel(frc_predicted)
```
```
                 wm_wmape     hgbr_wmape   total_mmbtu
coal             0.129684     0.0563751    1.45131e+11
SUB              0.122974     0.0533036    7.05948e+10
BIT              0.135227     0.0596738    6.90624e+10
NG               0.0940054    0.0999395    5.65777e+10
natural_gas      0.0940054    0.0999395    5.65777e+10
LIG              0.146808     0.0535479    5.39209e+09
petroleum        0.0358797    0.085129     1.40623e+09
PC               0.0931735    0.207044     1.18256e+09
petroleum_coke   0.0931735    0.207044     1.18256e+09
RFO              0.0342987    0.0830832    9.01874e+08
DFO              0.0369909    0.0847551    4.85247e+08
WC               0.123332     0.109256     7.31621e+07
other_gas        0.000248288  0.233354     2.10886e+07
JF               0.000102664  0.0707134    1.52616e+07
SGP              0            0.132342     1.48044e+07
SC               0.00262676   0.1042       8.2274e+06
OG               0.000371954  0.455647     5.69508e+06
WO               0.553216     0.798909     2.83575e+06
KER              0.00184038   0.30512      1.01084e+06
PG               0.00529242   0.622841     589095
NULL             0            0.736635     84378.6
NULL             0            0.736635     84378.6
```

(Fuels appear once per grouping column, which is why NG/natural_gas, PC/petroleum_coke, and NULL show up twice with identical values.)

Looking at just the fuels with more than 1e8 MMBTU reported (SUB through DFO):

```python
big_fuels = ["SUB", "BIT", "NG", "LIG", "PC", "RFO", "DFO"]
valid_rows = (
    frc_predicted["fuel_cost_per_mmbtu"].notna()
    & frc_predicted["fuel_received_mmbtu"].notna()
    & frc_predicted["fuel_cost_per_mmbtu_predicted"].notna()
    & frc_predicted["energy_source_code"].isin(big_fuels)  # one combined mask keeps indexes aligned
)
fuel_price_wmape(frc_predicted.loc[valid_rows], predict_col="fuel_cost_per_mmbtu_predicted")
# Result: 0.06948276509771906
```

Within the coal category, we seem to do equally well on all the coal types, and better than the weighted medians. With petroleum and some of the other fuels, the HGBR has higher relative error. In the weighted median approach, a number of fuel deliveries have prices that are exactly identical to the weighted median (any time there's only a single delivery of a given fuel in a given state and month) -- hence the perfect 1:1 lines in the middle of these plots...

Weighted Medians vs. Hist Gradient Boosted Regression (SUB)

[two images: SUB weighted-median vs. HistGBR comparison]

Weighted Medians vs. Hist Gradient Boosted Regression (DFO)

[two images: DFO weighted-median vs. HistGBR comparison]

zaneselvans (Member, Author) commented:

See continued work in #1767
