Proposal: Separate load-uncertainty scenario in the optimiser (paralleling the existing PV10 approach) #3612

knackerbrot · 2026-03-21T02:18:45Z

TL;DR

The optimiser's residual battery valuation at rate_min_forward creates a narrow margin (0.64 c/kWh on my tariff) between charge cost and residual credit. When the load forecast underestimates — even modestly — all candidate SoC levels show residual battery, the marginal value of each stored kWh collapses to that 0.64c, and the optimiser becomes indifferent between charge levels. PV uncertainty already has a dedicated pessimistic scenario (pv10) and weighting (pv_metric10_weight). Load uncertainty has no equivalent first-class treatment, and load_scaling10 at 1.1× is too mild to compensate. I'm proposing a dedicated load-high scenario derived from observed historical variance, blended into the metric with a weight derived from the tariff structure — no new user parameters needed.

Setup context

Fronius GEN24 10kW hybrid, BYD 27.65 kWh usable, Predbat v8.34.7
Synergy EV Add-On TOU (Perth, Australia): super off-peak 8.62c (9am–3pm), off-peak 23.69c, overnight 19.38c, peak 53.84c, no export payment
days_previous: 7, load_scaling: 1.0, load_scaling10: 1.1, pv_metric10_weight: 0.15, metric_battery_cycle: 0, metric_min_improvement: 0, debug_enable: on

Observed behaviour

Predbat consistently charges to 75–85% during the super off-peak window rather than 100%, then imports overnight at 2–6× the super off-peak rate. The best_charge_window debug data shows charge targets oscillating wildly between planning cycles — e.g., on March 20:

| Perth time | SoC | Highest target in remaining SOP window |
| --- | --- | --- |
| 10:57 | ~34% | 20.15 kWh |
| 11:07 | ~35% | 27.07 kWh |
| 11:37 | ~44% | 1.935 kWh (reserve) |
| 12:07 | ~53% | 21.65 kWh |
| 12:17 | ~55% | 26.13 kWh |
| 14:27 | ~80% | 1.935 kWh (reserve) |

The targets swing from near-full to reserve every few cycles. The battery reached 84% mostly from passive solar, not from planned grid charging. load_inday_adjustment for that day ended at 137.6%.

Root cause in the code

The residual valuation margin

In compute_metric (plan.py L1258):

rate_min = rate_min_forward.get(minutes_now + end_record, rate_min) / inverter_loss / battery_loss
battery_value = soc_at_end * metric_battery_value_scaling * max(rate_min, 1.0, rate_export_min)
metric = cost - battery_value

rate_min_forward (fetch.py L1721) takes min(rate_array[minute:]) — the cheapest future rate. On any TOU tariff with a recurring cheap window, this always resolves to the cheapest slot. On my tariff:

rate_min = 8.62 / 0.96 / 0.97 ≈ 9.26 c/kWh
charge cost = 8.62 c/kWh
margin = 0.64 c/kWh per stored kWh
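The arithmetic above can be reproduced with a small sketch. It is not Predbat's code: the list-based rate table and the function body are illustrative, and the hours assigned to the overnight, peak and off-peak rates are assumptions (only the 9am–3pm super off-peak window is stated in this post).

```python
INVERTER_LOSS = 0.96  # loss factors as used in the calculation above
BATTERY_LOSS = 0.97


def rate_min_forward(rates):
    """For each slot, return the cheapest rate from that slot onward."""
    forward = [0.0] * len(rates)
    cheapest = float("inf")
    for i in range(len(rates) - 1, -1, -1):
        cheapest = min(cheapest, rates[i])
        forward[i] = cheapest
    return forward


# Hourly import rates in c/kWh. Only the 09:00-15:00 super off-peak
# window comes from the post; the other hour boundaries are assumed.
day = [19.38] * 9 + [8.62] * 6 + [53.84] * 6 + [23.69] * 3
rates = day * 2  # two days, so a cheap window always lies ahead on day one

forward = rate_min_forward(rates)
rate_min_adjusted = forward[12] / INVERTER_LOSS / BATTERY_LOSS
margin = rate_min_adjusted - 8.62
print(f"rate_min = {rate_min_adjusted:.2f} c/kWh, margin = {margin:.2f} c/kWh")
```

Because a cheap window recurs every day, the forward minimum is 8.62c from any slot on the first day, so the residual credit sits only a loss factor above the charge cost.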

Why the optimiser becomes indifferent

When the load forecast is even slightly low, the minute-by-minute simulation shows the battery not running out at any candidate SoC. Every candidate ends the forecast window with residual energy. In this regime, the marginal metric difference between adjacent SoC levels is:

Δmetric ≈ Δsoc × (rate_charge − rate_min_adjusted) = −Δsoc × 0.64c

Over the full range from 80% to 100% (5.53 kWh), the metric improves by only about 3.5c in total. This is within the noise of re-planning every 5 minutes with slightly different inputs, producing the oscillation pattern above.

When the forecast is accurate and the simulation shows the battery running out, the picture is completely different — the marginal kWh avoids importing at 19–54c, creating metric differences of 50+ cents. The system works well in this regime. It fails silently when forecasts are slightly wrong.
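The two regimes can be illustrated with a toy metric. This is a deliberately simplified sketch, not the minute-by-minute simulation: it assumes all stored energy is bought at the super off-peak rate, any shortfall is imported at the overnight rate, and charging losses are ignored. The function names are hypothetical.

```python
CHARGE_RATE = 8.62    # c/kWh, super off-peak
RESIDUAL_RATE = 9.26  # c/kWh, loss-adjusted rate_min (rounded)
IMPORT_RATE = 19.38   # c/kWh, overnight rate paid once the battery is empty


def toy_metric(charged_kwh, load_kwh):
    """Charge cost plus shortfall imports, minus the residual battery credit."""
    shortfall = max(load_kwh - charged_kwh, 0.0)
    residual = max(charged_kwh - load_kwh, 0.0)
    cost = charged_kwh * CHARGE_RATE + shortfall * IMPORT_RATE
    return cost - residual * RESIDUAL_RATE


def metric_gain(low_kwh, high_kwh, load_kwh):
    """How far the metric drops by charging to high_kwh instead of low_kwh."""
    return toy_metric(low_kwh, load_kwh) - toy_metric(high_kwh, load_kwh)


soc80, soc100 = 22.12, 27.65  # 80% and 100% of 27.65 kWh

gain_low_load = metric_gain(soc80, soc100, load_kwh=21.0)   # never runs out
gain_high_load = metric_gain(soc80, soc100, load_kwh=27.0)  # 80% runs out
print(f"{gain_low_load:.1f}c vs {gain_high_load:.1f}c")
```

With the lower load the gain from charging to 100% is a few cents; with the higher load it is more than 50c, even at the cheapest of the expensive import rates.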

The asymmetry that isn't captured

On a high-differential TOU tariff, the cost of forecast error is deeply asymmetric:

Over-charge by 5 kWh: Pay 5 × 8.62c = 43c, displaced by tomorrow's solar. Net waste: 5 × 0.64c = 3.2c.
Under-charge by 5 kWh: Import at overnight/peak rates. Net cost: 5 × 10.76c to 5 × 45.22c = 54–226c.

The downside of under-charging is roughly 17–70× the downside of over-charging. The optimiser doesn't see this because load enters the simulation as a single point estimate; the only variation is the 1.1× load_scaling10 applied in the pv10 run.
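As a quick check of the asymmetry figures, the two cases can be computed directly. The helper names are illustrative; the rates are the ones quoted for this tariff.

```python
CHARGE = 8.62            # c/kWh, super off-peak
RESIDUAL_MARGIN = 0.64   # c/kWh, loss-adjusted residual credit minus charge cost
OVERNIGHT, PEAK = 19.38, 53.84


def overcharge_waste(kwh):
    """Net loss when surplus stored energy is later displaced by solar."""
    return kwh * RESIDUAL_MARGIN


def undercharge_cost(kwh, import_rate):
    """Extra cost of importing later instead of charging in the cheap window."""
    return kwh * (import_rate - CHARGE)


waste = overcharge_waste(5)
low = undercharge_cost(5, OVERNIGHT)
high = undercharge_cost(5, PEAK)
print(f"over-charge: {waste:.1f}c, under-charge: {low:.1f}c to {high:.1f}c")
print(f"ratio: {low / waste:.1f}x to {high / waste:.1f}x")
```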

Existing mechanisms and why they're insufficient

Predbat already has tools that partially address this, which I want to acknowledge:

pv10 pathway: Runs a pessimistic scenario with lower PV and load_scaling10 (1.1×) load, blended at pv_metric10_weight. This is architecturally the right idea, but it conflates PV and load uncertainty into one scenario, and 10% load inflation with 15% blending weight is far too mild. On my system, actual load regularly diverges 30–40% from the 7-day average.
load_inday_adjustment: Reactive correction based on today's actual-vs-predicted load. Damped (metric_inday_adjust_damping: 0.95) and weighted toward yesterday's adjustment early in the day. By the time it corrects meaningfully, the cheap window has often passed. It also only corrects the mean — it doesn't inform the optimiser about the range of plausible outcomes.
metric_battery_value_scaling: Manual knob to inflate residual battery value. Addresses the symptom (optimiser doesn't value stored energy enough) but requires the user to understand the internal mechanism and tune it per tariff. Also affects all charge windows equally, not just cheap ones.
load_scaling: Static multiplier. Same issue — requires user tuning, and if set too high, over-charges on genuinely low-load days.

These are all useful tools, but they all require the user to compensate for a structural issue in the optimiser. The pv10 system shows the right architectural pattern: model the uncertainty explicitly and let the optimiser account for it. Load just needs the same treatment.

Proposal

Core idea

Add a dedicated load-high scenario to the optimiser, paralleling the existing pv10 pathway, with the pessimistic load profile derived from historical variance and the blending weight derived from the tariff structure.

Implementation

1. Derive a high-load profile from existing days_previous data.

The data is already there: Predbat fetches 7 days of load history and averages them. Instead of discarding the per-day variance, retain it. The high-load profile could be the 80th or 90th percentile day, or simply the second-highest of the 7 days after modal filtering. This provides a realistic pessimistic case that's grounded in the household's actual usage patterns.

This computation would sit in fetch.py alongside the existing load_minutes processing, producing a load_minutes_high that parallels load_minutes.
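A minimal sketch of the "second-highest day" option, assuming each historical day is available as a plain list of per-slot kWh values. That layout, the function name and the sample data are hypothetical; the real load_minutes structure in fetch.py may look quite different.

```python
def high_load_profile(daily_profiles, rank_from_top=2):
    """Return the profile of the day with the rank_from_top-th largest total."""
    if not 1 <= rank_from_top <= len(daily_profiles):
        raise ValueError("rank_from_top must be between 1 and the number of days")
    ranked = sorted(daily_profiles, key=sum, reverse=True)
    return ranked[rank_from_top - 1]


# Seven hypothetical days, each as four 6-hour load buckets in kWh.
history = [
    [3.0, 6.0, 7.0, 5.0],     # 21 kWh
    [3.0, 5.0, 6.0, 5.0],     # 19 kWh
    [4.0, 8.0, 9.0, 6.0],     # 27 kWh
    [3.5, 7.0, 8.0, 5.5],     # 24 kWh
    [5.0, 10.0, 12.0, 8.0],   # 35 kWh
    [3.0, 6.5, 7.5, 5.0],     # 22 kWh
    [5.5, 11.0, 13.0, 8.5],   # 38 kWh
]

load_high = high_load_profile(history)
print(load_high, sum(load_high))
```

Picking a whole day keeps the intra-day shape realistic, whereas a per-slot percentile would stitch together the worst moments of different days.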

2. Run a third simulation per candidate SoC in optimise_charge_limit.

The infrastructure already exists — launch_run_prediction_charge runs parallel simulations with a pv10 flag. Add a load_high flag (or repurpose the existing mechanism):

# plan.py L1456-1461, existing:
hanres   = launch_run_prediction_charge(try_soc, ..., pv10=False, ...)  # median
hanres10 = launch_run_prediction_charge(try_soc, ..., pv10=True, ...)   # low PV + 1.1× load

# proposed addition:
hanres_lh = launch_run_prediction_charge(try_soc, ..., load_high=True, ...)  # median PV + high load

In prediction.py (L399-405), the load_high flag would select load_minutes_step_high instead of load_minutes_step, analogous to how pv10=True selects pv_forecast_minute10_step.

3. Blend the high-load cost into compute_metric.

After the existing pv10 blending (plan.py L1266-1270), add:

if metric_load_high > metric:
    load_penalty = (metric_load_high - metric) * load_risk_weight
    metric += load_penalty
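Wrapped as a function, the blending step looks like this. The function name and the example metrics are hypothetical, chosen to resemble the 80% versus 100% case described earlier (a gap of a few cents in the median run, about 53c in the high-load run).

```python
def blend_load_penalty(metric, metric_load_high, load_risk_weight):
    """Add a weighted share of the extra cost seen in the high-load scenario."""
    if metric_load_high > metric:
        metric += (metric_load_high - metric) * load_risk_weight
    return metric


weight = 0.67
# (median metric, high-load metric) per candidate, in cents; made-up values.
candidates = {"80%": (100.0, 153.0), "100%": (96.5, 100.0)}

blended = {
    name: blend_load_penalty(median, high, weight)
    for name, (median, high) in candidates.items()
}
gap = blended["80%"] - blended["100%"]
print(blended, f"gap = {gap:.1f}c")
```

The penalty only ever raises the metric, so a candidate that copes with the high-load day is never punished for it.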

4. Derive load_risk_weight from the tariff — no user parameter.

load_risk_weight = 1 - (rate_min / rate_average)

This naturally scales with rate differential:

My tariff (8.62c min, 26.2c avg): weight = 0.67 — strong conservatism
Intelligent Octopus Go (7.5p min, 18.75p avg): weight = 0.60
Agile (varies, but typically 5–15p min, ~20p avg): weight = 0.25–0.75
Flat tariff (32c min = 32c avg): weight = 0 — no change to current behaviour

The weight reflects the tariff's inherent asymmetry. High-differential tariffs get proportionally more load pessimism because the cost of under-charging is proportionally higher. Flat tariffs see zero change.
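The weight formula as a small function. The clamp to the range 0 to 1 is an added assumption, not part of the proposal as written; without it a negative minimum rate (possible on Agile) would push the weight above 1.

```python
def load_risk_weight(rate_min, rate_average):
    """Tariff-derived weight: 0 on a flat tariff, approaching 1 as the cheap rate nears zero."""
    if rate_average <= 0:
        return 0.0
    weight = 1.0 - rate_min / rate_average
    return min(max(weight, 0.0), 1.0)  # clamp is an assumption, see lead-in


for name, rate_min, rate_avg in [
    ("Synergy EV Add-On", 8.62, 26.2),
    ("Intelligent Octopus Go", 7.5, 18.75),
    ("Flat", 32.0, 32.0),
]:
    print(f"{name}: {load_risk_weight(rate_min, rate_avg):.2f}")
```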

Compute cost

~50% more simulation time per planning cycle (3 scenarios per SoC instead of 2). On typical systems this means going from ~30–60s to ~45–90s per plan, well within the 5-minute cycle.

What this achieves

On my system yesterday, the high-load scenario (using the second-highest day from the past week, roughly 35 kWh) would have shown the battery running out at 80% SoC. The metric difference between 80% and 100% in that scenario would be 50+ cents (avoided overnight imports), and even at a weight of 0.67, the blended penalty of ~35c would dominate the ~3.5c base difference. The optimiser would clearly select 100%.

On a low-variance day where the high-load scenario is close to the median, the penalty would be near-zero and the optimiser would behave exactly as it does now — free to charge to less than 100% when it's genuinely optimal.

Who else this helps

This isn't specific to Australian TOU tariffs. The structural fragility (optimiser indifference when rate_min / rate_charge ≈ 1) affects anyone on a tariff where the cheap charging rate is close to the loss-adjusted residual valuation. In practice, that's most TOU tariffs including Octopus Go, IOG, Cosy, Flux, and Agile (where the cheapest slots are very cheap relative to average). It likely contributes to the volume of "what should I set load_scaling to?" questions in the community.

The broader benefit is reducing user tuning. Rather than expecting users to understand load_scaling, load_scaling10, pv_metric10_weight, and metric_battery_value_scaling and how they interact with their specific tariff, the optimiser would handle the asymmetry structurally. Users on flat tariffs see no change. Users on differential tariffs get the conservatism their tariff demands, automatically.

Incremental path

Phase 1: Derive load_minutes_high from the per-day historical data already available. Small change in fetch.py.
Phase 2: Add the third simulation and metric blending. Moderate change in plan.py and prediction.py.
Phase 3 (future, if warranted): Extend to multiple load quantiles for finer-grained risk modelling.

Happy to contribute code. Would welcome feedback on the design before starting.

gcoan (Collaborator) · 2026-03-21T07:48:15Z

Thank you for your idea and such a comprehensive analysis and proposal.

I, for one, would like to see a reduction in Predbat tuning controls, simplifying wherever possible rather than adding more. I occasionally see users trying things like load scaling 2.0 to make the plan better in cheap charging periods, which can't be the right answer.

My only question about this idea is how it fits with Load ML for predicting load. A lot of people, myself included, have changed to forecasting load this way and have turned off in-day adjustment as a result.


knackerbrot (Author) · 2026-03-21T09:14:27Z

Good question — I hadn't considered LoadML since I'm not using it. Without LoadML, generating a high-load profile is straightforward: you already have 7 days of per-day load profiles, so you just take the 80th percentile day instead of the average. LoadML is a different story — it produces a single point estimate, and changing it to produce quantile forecasts is a meaningful ML change that I wouldn't be confident making myself.

However, the optimiser-side changes are the same regardless of where the load profiles come from. So I'd suggest we implement and test against the non-LoadML path first — deriving P20/P50/P80 from the existing days_previous historical data — and validate the concept on a real system before anyone needs to touch LoadML. That work is relatively straightforward and I could take it on. LoadML quantile support can follow once the approach is proven.

In thinking through gcoan's question and how the optimiser should use multiple load scenarios, I've also arrived at a cleaner approach than what I originally proposed.

Expected cost instead of penalty-based blending

Rather than adding a "penalty" from the P80 scenario with a tariff-derived weight (as I originally proposed), a straight probability-weighted expected cost is simpler and needs no new parameters at all.

The idea: have LoadML (or the historical fallback) produce a three-point estimate — P20, P50, and P80. These divide the probability space into bands: 20% chance of being at or below P20, 60% in the middle, 20% at or above P80. The formula in compute_metric becomes:

metric = 0.2 * metric_P20 + 0.6 * metric_P50 + 0.2 * metric_P80

The weights (0.2, 0.6, 0.2) aren't tuning parameters — they follow from the definition of P20 and P80. If the model produces a P80 forecast, by definition actual load exceeds it 20% of the time, so a weighting of 0.2 is appropriate. The asymmetry then comes from the cost differences between scenarios, not from any imposed weight.

How it captures the asymmetric risk

Here's a concrete example from my system. It's noon during the super off-peak window (8.62c/kWh). The optimiser is deciding whether to charge to 80% or 100%. It runs three simulations for each option:

Charge to 80% (22.1 kWh):

| | P20 (light day) | P50 (typical day) | P80 (heavy day) |
| --- | --- | --- | --- |
| Load until next cheap window | ~16 kWh | ~21 kWh | ~27 kWh |
| Battery runs out? | No, 6 kWh left | No, barely: 1 kWh left | Yes, ~5 kWh short |
| Grid import needed | 0 | 0 | ~5 kWh at 19–54c |
| Extra cost vs ideal | 0 | 0 | ~100c |
| Probability weight | × 0.2 | × 0.6 | × 0.2 |

Expected extra cost at 80% = (0.2 × 0) + (0.6 × 0) + (0.2 × 100) = 20c

Charge to 100% (27.65 kWh):

| | P20 (light day) | P50 (typical day) | P80 (heavy day) |
| --- | --- | --- | --- |
| Load until next cheap window | ~16 kWh | ~21 kWh | ~27 kWh |
| Battery runs out? | No, 11.6 kWh left | No, 6.6 kWh left | No, barely: 0.6 kWh left |
| Grid import needed | 0 | 0 | 0 |
| Extra cost vs ideal | ~3.5c (excess stored energy displaced by tomorrow's solar) | 0 | 0 |
| Probability weight | × 0.2 | × 0.6 | × 0.2 |

Expected extra cost at 100% = (0.2 × 3.5) + (0.6 × 0) + (0.2 × 0) = 0.7c

100% wins by 19.3c. This falls out from the probability weighting alone — no tuning parameter involved.

The worst case for over-charging (100% on a light day) wastes 3.5c. The worst case for under-charging (80% on a heavy day) costs ~100c in expensive grid imports, so the worst-case downsides differ by roughly 30:1. Even after probability weighting (each tail gets 20%), the expected cost of under-charging (20c) dwarfs the expected cost of over-charging (0.7c). The tariff's rate differentials determine how strong this effect is.
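The comparison above can be written as a short sketch. The per-scenario extra costs are taken directly from the two tables; the dictionary layout and function name are illustrative only and do not reflect how compute_metric is structured.

```python
WEIGHTS = {"P20": 0.2, "P50": 0.6, "P80": 0.2}


def expected_cost(cost_by_scenario, weights=WEIGHTS):
    """Probability-weighted cost across the three load scenarios."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("scenario weights must sum to 1")
    return sum(weights[name] * cost_by_scenario[name] for name in weights)


# Extra cost versus ideal, in cents, per candidate target and load scenario.
extra_cost = {
    "80%": {"P20": 0.0, "P50": 0.0, "P80": 100.0},
    "100%": {"P20": 3.5, "P50": 0.0, "P80": 0.0},
}

expected = {target: expected_cost(costs) for target, costs in extra_cost.items()}
best = min(expected, key=expected.get)
print(expected, "->", best)
```

The same function applies unchanged when the inputs are full simulation metrics rather than extra costs, since a constant offset does not change which candidate is cheapest.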

On a flat tariff, the P80 scenario wouldn't produce expensive grid imports (all rates are the same), so the three scenarios would produce similar costs and the optimiser's behaviour wouldn't change.

Why the current optimiser misses this

Today's optimiser runs only the P50 load simulation. In that single scenario, both 80% and 100% show the battery not running out. The metric difference is tiny, around 3c (the 0.64c/kWh residual margin from the original post). In practice the optimiser oscillates between them every 5 minutes because small changes in load prediction shift the result back and forth.

With three scenarios, the P80 case breaks the tie decisively. Even though it only gets 20% weight, a 20% chance of a 100c penalty (= 20c expected cost) is much larger than the entire P50 metric difference. The optimiser gets a clear signal.

Why P20 matters too, not just P80

My tariff makes the P80 case dramatic, but P20 matters for other scenarios. On export-heavy tariffs (Flux, Agile export, SEG), if the optimiser over-charges and actual load turns out to be light (P20), the battery is full when the sun comes up — clipping solar and losing export revenue. The P20 simulation captures this cost, pulling the expected cost toward lower charging. Similarly, on Agile import where rates can go very low or negative at unpredictable times, over-charging during one cheap window means missing an even cheaper window later — P20 captures this by showing less battery is needed, leaving headroom.

Both risks are handled by the same mechanism — the tariff determines which direction matters more.

Proposed implementation and phasing

Phase 1 — Test with non-LoadML data (I could do this):

Derive P20 and P80 load profiles from the existing days_previous historical data. With 7 days of history, the P80 profile is roughly the 2nd-highest day and P20 is the 2nd-lowest. This is maybe 20-30 lines in fetch.py.
Plumb the three load profiles through the simulation infrastructure. Run three simulations per candidate SoC, using median PV for all three (PV uncertainty is already handled by the existing PV10 mechanism and doesn't need to be cross-multiplied with load scenarios).
Replace the metric blending in compute_metric with the probability-weighted expected cost: 0.2 * P20 + 0.6 * P50 + 0.2 * P80. About 10 lines of code.
Test on my system, measure the improvement over a few weeks using my counterfactual cost analysis.
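The first step can be sketched as follows, to show why P80 lands on the 2nd-highest day and P20 on the 2nd-lowest when there are 7 days. Days are ranked by total load and each quantile is mapped to the nearest index of q × (n − 1). The list-of-lists layout, the names and the sample data are assumptions for illustration.

```python
QUANTILES = (("P20", 0.2), ("P50", 0.5), ("P80", 0.8))


def quantile_day_index(n_days, quantile):
    """Index of the day nearest to the given quantile, days sorted ascending."""
    if n_days < 1 or not 0.0 <= quantile <= 1.0:
        raise ValueError("need at least one day and a quantile in [0, 1]")
    return int(quantile * (n_days - 1) + 0.5)  # round to nearest index


def scenario_profiles(daily_profiles):
    """Pick whole-day P20/P50/P80 profiles, ranking days by total load."""
    ranked = sorted(daily_profiles, key=sum)
    n_days = len(ranked)
    return {name: ranked[quantile_day_index(n_days, q)] for name, q in QUANTILES}


# Seven hypothetical days, each as two half-day load buckets in kWh.
history = [
    [9.0, 12.0], [8.0, 11.0], [12.0, 15.0], [10.0, 14.0],
    [15.0, 20.0], [9.5, 12.5], [16.0, 22.0],
]

profiles = scenario_profiles(history)
print({name: sum(profile) for name, profile in profiles.items()})
```

With fewer days of history the three indices move closer together, so the spread between scenarios shrinks automatically for users on short days_previous settings.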

This gives us concrete evidence — does it produce stable charge targets? Does it charge appropriately during cheap windows? Does it avoid over-charging on genuinely low-load days? — before asking anyone to touch LoadML. The Phase 1 code also becomes the permanent fallback for users not running LoadML.

Phase 2 — LoadML quantile forecasts (needs help):

Once Phase 1 proves the concept, add quantile outputs (P20, P50, P80) to LoadML. The model change is conceptually small — the output layer goes from 1 value to 3, and each output trains with a different loss function (quantile loss at τ=0.2, τ=0.5, τ=0.8). Quantile loss is a standard ML forecasting technique — instead of penalising over- and under-prediction equally, it penalises one direction more than the other, teaching the model to predict a value at the desired percentile. But getting the training gradient right in the hand-rolled NumPy backpropagation needs someone who understands that code well.
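For reference, a minimal sketch of the quantile (pinball) loss and the gradient it would contribute at an output. This is the standard textbook form, not LoadML's code; the constant-fitting demo and the sample loads are made up to show that minimising the loss recovers the requested quantile.

```python
def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss for one sample at quantile level tau."""
    error = y_true - y_pred
    return tau * error if error >= 0 else (tau - 1.0) * error


def pinball_grad(y_true, y_pred, tau):
    """Subgradient of the pinball loss with respect to y_pred."""
    # Under-prediction pushes the output up with strength tau;
    # over-prediction pushes it down with strength (1 - tau).
    return -tau if y_true > y_pred else 1.0 - tau


def fit_constant(samples, tau, lr=0.01, steps=5000):
    """Fit one constant to the samples by gradient descent on the pinball loss."""
    estimate = 0.0
    for _ in range(steps):
        grad = sum(pinball_grad(y, estimate, tau) for y in samples) / len(samples)
        estimate -= lr * grad
    return estimate


loads = [float(v) for v in range(1, 11)]  # hypothetical loads, 1..10 kWh
p20 = fit_constant(loads, tau=0.2)
p80 = fit_constant(loads, tau=0.8)
print(f"P20 ~ {p20:.2f} kWh, P80 ~ {p80:.2f} kWh")
```

In a three-output network each output would receive this gradient with its own tau in place of the squared-error gradient; the rest of the backpropagation would be unchanged in principle, though how that maps onto the existing NumPy code is exactly the part that needs someone familiar with it.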

LoadML's quantile forecasts would be better than the Phase 1 historical percentiles because they'd be shaped by time-of-day and temperature — the P80 for a hot afternoon would be wider than the P80 for 3am baseload. I'd suggest producing all three outputs in one go rather than incrementally — the architecture change is the same whether it's 2 or 3 outputs, and having to invalidate saved models twice would be painful for users.

On reducing tuning controls

Completely agree that's the goal. If this works well, load_scaling, load_scaling10, and arguably metric_battery_value_scaling all become much less important — potentially candidates for deprecation. The optimiser would adapt to the user's tariff structure and load variability automatically. Users on flat tariffs see no change; users on Agile/IOG/TOU get the right level of conservatism without touching anything.
