Releases: Parisi-Labs/forecastops

v0.2.1

13 Jun 01:25
0508f4e

v0.2.1 Pre-release

Patch release adding the agent-facing diagnosis layer and fixing benchmark scoring.

Added

  • fops.diagnose(run_id) (and fops diagnose <run_id>) — a compact, machine-readable run diagnosis: overall metrics, skill vs. benchmark, worst horizons / series / regimes, validation, and artifact URIs. Worst-series is aggregated server-side over the full forecast artifact.
  • Richer capture traces — the root forecast.run span now carries the run's semantic context (adapter, group, cutoff/target ranges, horizon, series/points counts, validation status, artifact URIs).
  • fops backtest CLI command, and a group option on fops capture.

Fixed

  • compare() no longer computes benchmark-side pinball from the model's quantile columns.
  • sMAPE now receives a skill score against the benchmark.
  • The local UI reports the installed package version instead of a hardcoded 0.1.0.
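
The two scoring fixes above can be illustrated with a minimal sketch. It is not the fops implementation: the function names, the dictionary layout for quantile columns, and the skill definition (1 minus the ratio of model error to benchmark error, a common convention for lower-is-better metrics) are all assumptions made for illustration. The point it shows is that the benchmark's pinball loss is computed from the benchmark's own quantile forecasts, not the model's.

```python
def pinball_loss(actual, quantile_forecast, q):
    """Mean pinball (quantile) loss for one quantile level q in (0, 1)."""
    total = 0.0
    for y, yhat in zip(actual, quantile_forecast):
        diff = y - yhat
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(actual)


def mean_pinball(actual, quantile_columns):
    """Average pinball loss over a {quantile: forecasts} mapping (hypothetical layout)."""
    losses = [pinball_loss(actual, fc, q) for q, fc in quantile_columns.items()]
    return sum(losses) / len(losses)


def skill_score(model_error, benchmark_error):
    """Relative improvement over the benchmark; positive means the model is better."""
    if benchmark_error == 0:
        return float("nan")  # undefined when the benchmark error is zero
    return 1.0 - model_error / benchmark_error


actual = [10.0, 12.0, 11.0]
model_quantiles = {0.1: [9.0, 11.0, 10.0], 0.9: [11.0, 13.0, 12.0]}
benchmark_quantiles = {0.1: [7.0, 9.0, 8.0], 0.9: [13.0, 15.0, 14.0]}

# Each side is scored against its own quantile forecasts.
model_loss = mean_pinball(actual, model_quantiles)
benchmark_loss = mean_pinball(actual, benchmark_quantiles)
skill = skill_score(model_loss, benchmark_loss)
```

The same `skill_score` helper applies unchanged to any lower-is-better metric, which is how a metric such as sMAPE could be given a skill value once a benchmark-side sMAPE is available.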

See CHANGELOG.md for details.

⚠️ Pre-release: still on the 0.x line; APIs may change.

v0.2.0

12 Jun 17:54
c5a1a16

v0.2.0 Pre-release

Second release of ForecastOps — the experiment loop and diagnostics layer on top of the 0.1.0 capture/evaluate foundation.

Added

  • fops.backtest(...) — evaluate a rolling-origin forecast panel as one grouped backtest, with per-cutoff and aggregate (mean/std) metrics.
  • Experiment groups: capture(group="...") tags related runs into a named group, with a Groups view and a group detail page showing per-metric mean ± std and stability across runs.
  • New metrics — sMAPE (scale-free ratio) and pinball/quantile loss (averaged over yhat_p<level> columns).
  • Regime slicing — metrics sliced by any categorical columns kept via a schema's extra_columns (region, holiday_flag, event_type, …), so error-by-regime breakdowns appear automatically.
  • Diagnostics cockpit on the run detail page — residual distribution, error by horizon, per-series worst offenders, and per-regime breakdowns.
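
As a rough illustration of what a grouped rolling-origin backtest reports, the sketch below computes sMAPE per cutoff and then the mean and standard deviation across cutoffs. It does not call fops: the `panel` layout and function names are hypothetical, the sMAPE formula is one common ratio-form definition, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def smape(actual, forecast):
    """Symmetric MAPE as a ratio in [0, 2]; a 0/0 term counts as zero error."""
    terms = []
    for y, yhat in zip(actual, forecast):
        denom = (abs(y) + abs(yhat)) / 2.0
        terms.append(0.0 if denom == 0 else abs(y - yhat) / denom)
    return mean(terms)


def backtest_summary(panel):
    """Score each cutoff, then aggregate across cutoffs.

    panel maps a cutoff label to an (actual, forecast) pair (hypothetical layout).
    """
    per_cutoff = {cutoff: smape(a, f) for cutoff, (a, f) in panel.items()}
    values = list(per_cutoff.values())
    # Sample standard deviation needs at least two cutoffs.
    spread = stdev(values) if len(values) > 1 else 0.0
    return per_cutoff, {"mean": mean(values), "std": spread}


panel = {
    "2024-01-31": ([100.0, 100.0], [100.0, 100.0]),
    "2024-02-29": ([100.0, 100.0], [50.0, 150.0]),
}
per_cutoff, aggregate = backtest_summary(panel)
```

A large std relative to the mean suggests the model's accuracy depends heavily on the forecast origin, which is the kind of instability the per-cutoff view is meant to expose.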

Changed

  • The Nixtla adapter parses <model>-lo-<level>/<model>-hi-<level> prediction-interval columns into interval bounds and per-level quantile columns, so coverage, interval width, and pinball loss work for statsforecast/neuralforecast outputs.
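
The column convention above can be sketched as follows. This is an illustration, not the adapter's code: the helper name is hypothetical, and it assumes each level denotes a central prediction interval, so a level of 90 corresponds to the 0.05 and 0.95 quantiles. How the result is mapped onto the store's quantile column names is left out.

```python
import re

# Matches names such as "AutoARIMA-lo-90" or "my-model-hi-80".
_INTERVAL_COLUMN = re.compile(
    r"^(?P<model>.+)-(?P<side>lo|hi)-(?P<level>\d+(?:\.\d+)?)$"
)


def interval_column_to_quantile(column):
    """Return (model, side, quantile) for an interval column, else None."""
    match = _INTERVAL_COLUMN.match(column)
    if match is None:
        return None  # e.g. a point-forecast column such as "AutoARIMA"
    level = float(match.group("level"))
    # A central interval at this level leaves equal probability in each tail.
    tail = (1.0 - level / 100.0) / 2.0
    quantile = tail if match.group("side") == "lo" else 1.0 - tail
    return match.group("model"), match.group("side"), round(quantile, 6)


lower = interval_column_to_quantile("AutoARIMA-lo-90")
upper = interval_column_to_quantile("AutoARIMA-hi-90")
point = interval_column_to_quantile("AutoARIMA")
```

Once both bounds are expressed as quantiles, coverage and interval width come from the lo/hi pair, and pinball loss can be evaluated at each derived quantile level.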

Existing local stores are migrated in place to add the group columns. See CHANGELOG.md for details.

⚠️ Pre-release: still on the 0.x line; APIs may change.

v0.1.0

11 Jun 20:58

v0.1.0 Pre-release

Initial public pre-release of ForecastOps — a local-first observability and evaluation layer for production forecasts.

Highlights

  • fops.capture() normalizes forecasts from existing pipelines into local Parquet artifacts with a DuckDB run index; evaluate, compare, and diff compute horizon-aware metrics, benchmark skill, and run-to-run deltas.
  • Local read-only UI (fops ui) with Runs, run detail (per-series forecast inspector), Projects (error trends across captures), and Compare views.
  • Static HTML reports (fops report) and an fops CLI for capture, lint, evaluate, diff, and report workflows.
  • Optional OpenTelemetry export of aggregate metrics and capture traces — off by default, never includes raw forecast points.

Privacy defaults

  • UI binds to 127.0.0.1 and refuses other hosts unless --allow-remote is passed
  • No outbound network calls; raw forecast points never leave the local store
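
A loopback-only default of this kind can be sketched in a few lines. The function below is hypothetical and is not how fops ui is implemented; in particular, accepting any loopback address (not only 127.0.0.1) and the literal name "localhost" is an assumption.

```python
import ipaddress


def resolve_bind_host(host="127.0.0.1", allow_remote=False):
    """Return the host to bind to, refusing non-loopback hosts by default."""
    if host == "localhost":
        return host
    try:
        is_loopback = ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat it as remote.
        is_loopback = False
    if not is_loopback and not allow_remote:
        raise ValueError(
            f"refusing to bind to {host!r}; pass allow_remote=True to override"
        )
    return host
```

Making the remote case an explicit opt-in means a mistyped host fails loudly instead of silently exposing the UI on the network.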

See CHANGELOG.md for the full release notes, including the pre-release hardening pass (merge safety, DuckDB lock handling, query pushdown, real OTel export).

⚠️ Pre-release: APIs may change before a stable 0.x line is established.