Next Bill Estimate

delabrcd edited this page Jun 7, 2026 · 2 revisions

Next-Bill Estimate

How the dashboard predicts your next bill before it's issued. All of this is a forward estimate: it is never persisted and never fed to GET /api/verify (which only cross-checks stored numbers against the bill PDFs). The logic is pure, lives in app/src/lib/prediction.ts, and is covered by unit tests whose expected values were calculated by hand.

Cost basis, as everywhere: Bill.currentCharges (parsed from the PDF), never totalDueAmount. See Data Accuracy.

The short version

predicted bill  =  Σ over the 4 cost components ( fixed $/day · days  +  variable $/unit · usage )

where usage is weather-normalized (degree-days → expected kWh/therms) and each component's fixed $/day and variable $/unit are tracked by a small Kalman filter that learns from your bill history and adapts as rates drift.

Why the naive method isn't enough

The original estimate (issue #9, the "calendar" method) was: same-calendar-month-last-year usage × current trailing-12-month all-in $/unit. It is cheap and robust but has two structural blind spots, both of which made it systematically under-predict on a real, rising-rate account:

  1. A flat $/unit rate ignores the fixed customer charge. Gas delivery in particular is mostly a fixed daily charge. In summer your therms fall to nearly zero, so the effective $/therm explodes (the bill barely changes while usage collapses), yet $/day actually drops. A blended annual $/therm is therefore wrong in both directions: it underprices a near-zero-usage summer bill and overprices a high-usage winter one. (A degree-day-only estimate, issue #44, missed this too and was reverted.)
  2. Trailing-12 rates lag a rising market. Averaging a year of rates badly underprices the next bill when rates are climbing.
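
The first blind spot can be seen with a few lines of arithmetic. The sketch below uses a made-up gas tariff ($1.00/day fixed plus $1.00/therm) and made-up monthly usage; none of these numbers come from a real account, and the function names are illustrative rather than anything in prediction.ts.

```typescript
// Hypothetical gas tariff and usage, for illustration only.
const FIXED_PER_DAY = 1.0;   // $/day customer charge (assumed)
const RATE_PER_THERM = 1.0;  // $/therm variable rate (assumed)
const DAYS_IN_PERIOD = 30;

const JANUARY_THERMS = 100;
const JULY_THERMS = 5;

// Twelve monthly usages (therms), January through December.
const monthlyTherms: number[] = [
  JANUARY_THERMS, 90, 70, 40, 20, 8, JULY_THERMS, 5, 10, 35, 70, 95,
];

// What the bill actually is under the tariff above: fixed part plus variable part.
function actualCharge(therms: number): number {
  return FIXED_PER_DAY * DAYS_IN_PERIOD + RATE_PER_THERM * therms;
}

const totalCost = monthlyTherms.reduce((sum, t) => sum + actualCharge(t), 0);
const totalTherms = monthlyTherms.reduce((sum, t) => sum + t, 0);

// Trailing-12-month all-in rate: every dollar divided by every therm.
const blendedRate = totalCost / totalTherms;

// Flat-rate estimate: usage times the blended rate, with no fixed term.
function flatEstimate(therms: number): number {
  return blendedRate * therms;
}

console.log("July    flat vs actual:", flatEstimate(JULY_THERMS).toFixed(2), actualCharge(JULY_THERMS).toFixed(2));
console.log("January flat vs actual:", flatEstimate(JANUARY_THERMS).toFixed(2), actualCharge(JANUARY_THERMS).toFixed(2));
```

With these numbers the flat rate prices July at about $8 against an actual $35, and January at about $166 against an actual $130. The second blind spot (lagging rates) is not modelled here because the tariff is held constant.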

The model (issue #67)

Three independent pieces, each pure and testable:

1. Weather-normalized usage

Fit usage against degree-days over each bill period (see Weather and Degree-Days), then project onto climatological-normal degree-days for the predicted next-bill window:

electric:  kWh    ≈ baseload_elec + slopeH·HDD + slopeC·CDD     (heating + cooling)
gas:       therms ≈ baseload_gas  + slopeH·HDD                  (heating only)

This separates weather-independent baseload (fridge, lights, standby) from weather-driven load, so a hot summer or cold snap moves the estimate the right way. The normals come from the cached WeatherDaily history — no live network call is required for an estimate. (Reuses the same fit machinery as the 12-month Range and Customization, issue #52.)
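
As a rough sketch of the gas case, the fit and projection could look like the following. It assumes a plain ordinary-least-squares fit with one regressor and treats baseload as a per-period constant, exactly as the formula above is written; the real fit in prediction.ts may differ (for example in how it handles period length), and the electric fit needs a second regressor for CDD. All type and function names here are hypothetical.

```typescript
interface GasPeriod {
  hdd: number;    // heating degree-days summed over the bill period
  therms: number; // metered usage for the same period
}

interface GasFit {
  baseload: number; // therms per period at zero HDD
  slopeH: number;   // therms per heating degree-day
}

// Ordinary least squares for: therms ≈ baseload + slopeH · HDD.
// Returns null when the fit is degenerate (too few points, or HDD never varies).
function fitGasUsage(periods: GasPeriod[]): GasFit | null {
  const n = periods.length;
  if (n < 2) return null;

  const meanHdd = periods.reduce((s, p) => s + p.hdd, 0) / n;
  const meanTherms = periods.reduce((s, p) => s + p.therms, 0) / n;

  let sxx = 0;
  let sxy = 0;
  for (const p of periods) {
    sxx += (p.hdd - meanHdd) * (p.hdd - meanHdd);
    sxy += (p.hdd - meanHdd) * (p.therms - meanTherms);
  }
  if (sxx === 0) return null;

  const slopeH = sxy / sxx;
  return { baseload: meanTherms - slopeH * meanHdd, slopeH };
}

// Project the fit onto climatological-normal HDD for the next-bill window.
// Usage cannot be negative, so the result is clamped at zero.
function projectGasUsage(fit: GasFit, normalHdd: number): number {
  return Math.max(0, fit.baseload + fit.slopeH * normalHdd);
}
```

A caller would fit on past bill periods, then pass the normal HDD total for the upcoming window; a null fit is the signal to use a simpler method instead.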

2. Per-component rates, tracked by a Kalman filter

The four cost components — electric supply, electric delivery, gas supply, gas delivery — are priced separately, each as a fixed daily charge plus a variable per-unit rate:

component $  ≈  fixed_per_day · days  +  rate_per_unit · usage
  • The fixed term captures the customer/service charge, so a near-zero-usage summer gas bill is correctly priced as "mostly the fixed charge" instead of a blown-up $/therm. (This is the key structural fix over a flat trailing-12 $/therm.)

Instead of a fixed-window least-squares fit, each component's [fixed_per_day, rate_per_unit] is the latent state of a small Kalman filter. Each bill is a linear observation amount = days · fixed + usage · rate + noise; the state follows a random walk, so the filter treats the rate as something that drifts over time and re-estimates it with every bill — weighting recent bills more, automatically, with principled uncertainty. This tracks rising rates without any hand-tuned lookback window, and (unlike the older approach) needs no separate bias-correction term — the drifting-state model removes the systematic under-bias on its own.
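
One predict-and-update step of such a filter can be written out by hand, since the state has only two entries. This is a minimal sketch of a textbook random-walk Kalman update, not the code behind kalmanComponentRate(): the names are hypothetical, and it assumes the same process noise q is added to both state variances and that q and r are plain variances, which may not match how the real defaults are scaled.

```typescript
// State mean [fixed $/day, rate $/unit] plus its symmetric 2x2 covariance.
interface RateState {
  fixedPerDay: number;
  ratePerUnit: number;
  p00: number; // variance of fixedPerDay
  p01: number; // covariance between the two
  p11: number; // variance of ratePerUnit
}

interface BillObservation {
  days: number;
  usage: number;
  amount: number; // observed component cost for the period
}

function kalmanStep(s: RateState, obs: BillObservation, q: number, r: number): RateState {
  // Predict: a random walk keeps the mean and inflates the covariance by q.
  const p00 = s.p00 + q;
  const p01 = s.p01;
  const p11 = s.p11 + q;

  // Observation row H = [days, usage], so amount ≈ H · state + noise.
  const h0 = obs.days;
  const h1 = obs.usage;

  // Innovation: how far the bill is from what the current state predicts.
  const innovation = obs.amount - (h0 * s.fixedPerDay + h1 * s.ratePerUnit);

  // P·Hᵀ and the innovation variance S = H·P·Hᵀ + r.
  const ph0 = p00 * h0 + p01 * h1;
  const ph1 = p01 * h0 + p11 * h1;
  const innovationVar = h0 * ph0 + h1 * ph1 + r;

  // Kalman gain K = P·Hᵀ / S.
  const k0 = ph0 / innovationVar;
  const k1 = ph1 / innovationVar;

  // Update mean and covariance; P − K·(P·Hᵀ)ᵀ equals (I − K·H)·P for symmetric P.
  return {
    fixedPerDay: s.fixedPerDay + k0 * innovation,
    ratePerUnit: s.ratePerUnit + k1 * innovation,
    p00: p00 - k0 * ph0,
    p01: p01 - k0 * ph1,
    p11: p11 - k1 * ph1,
  };
}
```

Feeding the bills through kalmanStep in date order, starting from a vague prior (large p00 and p11), leaves the latest state as the current rate estimate. Because q is added before every update, older bills are discounted automatically, which is what lets the estimate follow a drifting rate.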

Two filter knobs, process noise q (how fast a rate may drift) and observation noise r (how noisy a single bill is), sit on a broad, robust plateau: accuracy changes little across a wide range of values. The defaults are q = 0.15, r = 0.10. A trend/velocity state was tested and rejected (it over-extrapolates rate hikes and overshoots); the level-only random walk wins.

The four components sum to the period's currentCharges on real bills, so summing the four predictions reconstructs the bill.
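
Putting the pieces together, the total is just the sum of four "fixed · days + rate · usage" terms, with electric components priced on kWh and gas components on therms. The sketch below uses hypothetical names and made-up rates; it only illustrates the arithmetic.

```typescript
type CostComponent = "electricSupply" | "electricDelivery" | "gasSupply" | "gasDelivery";

interface ComponentRate {
  fixedPerDay: number; // $/day
  ratePerUnit: number; // $/kWh or $/therm
}

interface ExpectedUsage {
  kwh: number;
  therms: number;
}

const ALL_COMPONENTS: CostComponent[] = [
  "electricSupply", "electricDelivery", "gasSupply", "gasDelivery",
];

function predictTotal(
  rates: Record<CostComponent, ComponentRate>,
  days: number,
  usage: ExpectedUsage,
): number {
  // Each component is priced on the usage of its own fuel.
  const usageByComponent: Record<CostComponent, number> = {
    electricSupply: usage.kwh,
    electricDelivery: usage.kwh,
    gasSupply: usage.therms,
    gasDelivery: usage.therms,
  };

  let total = 0;
  for (const c of ALL_COMPONENTS) {
    total += rates[c].fixedPerDay * days + rates[c].ratePerUnit * usageByComponent[c];
  }
  return total;
}

// Made-up rates for a 30-day window with 600 kWh and 20 therms expected.
const exampleRates: Record<CostComponent, ComponentRate> = {
  electricSupply: { fixedPerDay: 0, ratePerUnit: 0.1 },
  electricDelivery: { fixedPerDay: 0.5, ratePerUnit: 0.05 },
  gasSupply: { fixedPerDay: 0, ratePerUnit: 0.6 },
  gasDelivery: { fixedPerDay: 1.0, ratePerUnit: 0.4 },
};
const exampleTotal = predictTotal(exampleRates, 30, { kwh: 600, therms: 20 });
console.log(exampleTotal.toFixed(2));
```

Here the four terms are $60, $45, $12 and $38, so the example total comes to $155.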

Confidence band

low/high come from the spread (≈ ±1σ) of recent costs (falling back to ±15%), so the band reflects how variable the bills have recently been.
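
A band of this kind could be computed as below. The text does not spell out the details, so this sketch makes several assumptions: the band is centred on the prediction, σ is the sample standard deviation of recent bill amounts, and the ±15% fallback applies when there are too few recent costs (the minSamples threshold is invented) or they show no spread. Names are hypothetical.

```typescript
interface CostBand {
  low: number;
  high: number;
}

function confidenceBand(predicted: number, recentCosts: number[], minSamples: number = 3): CostBand {
  // Fallback band: ±15% of the prediction.
  const fallback: CostBand = { low: predicted * 0.85, high: predicted * 1.15 };
  const n = recentCosts.length;
  if (n < minSamples || n < 2) return fallback;

  const mean = recentCosts.reduce((s, c) => s + c, 0) / n;
  const sumSquares = recentCosts.reduce((s, c) => s + (c - mean) * (c - mean), 0);
  const sigma = Math.sqrt(sumSquares / (n - 1)); // sample standard deviation

  if (!(sigma > 0)) return fallback;

  // ±1σ around the prediction; a bill cannot be negative.
  return { low: Math.max(0, predicted - sigma), high: predicted + sigma };
}
```

For example, recent costs of 100, 110 and 120 have a sample standard deviation of 10, so a prediction of 115 would get a band of 105 to 125.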

Fallback

When there isn't enough history to fit reliably (roughly fewer than 18 bills, or the per-component costs or degree-days are missing), the estimate falls back to the calendar (#9) method automatically. New self-hosters therefore still get a sensible number from day one, and the weather-and-rate model takes over once a couple of years of bills exist.
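
The selection rule can be pictured as a small guard in front of the two estimators. This is only a sketch: the threshold of 18 is the "roughly" figure from the text, the choice to count only bills that have both component costs and degree-days is an assumption, and every name below is hypothetical.

```typescript
interface HistoryBillSummary {
  hasComponentCosts: boolean; // all four component costs were parsed
  hasDegreeDays: boolean;     // degree-days are available for the bill period
}

type EstimateMethod = "seasonal" | "calendar";

const MIN_BILLS_FOR_MODEL = 18; // approximate threshold, per the text above

function chooseEstimateMethod(bills: HistoryBillSummary[]): EstimateMethod {
  // Assumption: only bills with complete inputs count toward the threshold.
  const usable = bills.filter((b) => b.hasComponentCosts && b.hasDegreeDays);
  return usable.length >= MIN_BILLS_FOR_MODEL ? "seasonal" : "calendar";
}
```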

How it was validated

The model was developed against a walk-forward back-test on a real multi-year account — for each bill, train only on prior bills and compare the prediction to the actual currentCharges. Versus the calendar method this roughly halved the error (mean absolute percentage error fell from ~15% to ~7%) and removed the systematic under-bias. Rejected along the way: a seasonal rate multiplier on top of a recent-window fit (double-counts the season and over-corrects); a trend/velocity Kalman state (over-extrapolates rate hikes); and recent-years-only weather normals (no measurable effect — the residual bias was rate-trend, not weather). The random-walk Kalman filter also beat an explicit recent-window-fit-plus-bias-correction approach, while being simpler (no window length or bias term to tune).
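
A walk-forward back-test of this shape is easy to express generically. The sketch below is not the project's test harness: the names are hypothetical, and the predictor is passed in as a function so that any method (calendar, Kalman, or otherwise) can be scored the same way.

```typescript
interface ScoredBill {
  currentCharges: number;
}

// A predictor sees only the bills before the target and returns an estimate,
// or null when it cannot produce one.
type BillPredictor<B extends ScoredBill> = (priorBills: B[]) => number | null;

// Mean absolute percentage error over a walk-forward pass.
// Bills must be in chronological order; the first minTrain bills are never scored.
function walkForwardMape<B extends ScoredBill>(
  bills: B[],
  predict: BillPredictor<B>,
  minTrain: number,
): number | null {
  let errorSum = 0;
  let scored = 0;

  bills.forEach((target, i) => {
    if (i < minTrain || target.currentCharges === 0) return;
    const estimate = predict(bills.slice(0, i)); // train on prior bills only
    if (estimate === null) return;
    errorSum += Math.abs(estimate - target.currentCharges) / Math.abs(target.currentCharges);
    scored += 1;
  });

  return scored > 0 ? errorSum / scored : null;
}

// Trivial baseline predictor: repeat the most recent bill.
function repeatLastBill(prior: ScoredBill[]): number | null {
  const last = prior[prior.length - 1];
  return last ? last.currentCharges : null;
}
```

Running two predictors through the same function on the same bill history gives directly comparable error figures, which is the kind of comparison described above.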

Where it lives

  • app/src/lib/prediction.ts: estimateNextBillSeasonal() (this model), estimateNextBill() (the #9 fallback), the degree-day usage fits, and the per-component Kalman rate estimator (kalmanComponentRate()). All pure.
  • app/src/lib/queries.ts: getOverview() calls the seasonal estimate and falls back to #9.
  • app/test/: hand-calculated unit tests for the Kalman update, the full estimate, the degenerate-fit fallback, and the too-little-history fallback path.
