Skip to content

Scottlr/Spanfold

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

177 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Spanfold

Spanfold

Temporal interval comparison for application state.

When a predicate changes — a service goes down, a threshold is crossed, a status flips — Spanfold records that period as a window. When you have multiple sources reporting the same condition, Spanfold tells you exactly where they agreed, diverged, lagged, or left gaps.

Install

dotnet add package Spanfold

Quick Start

// 1. Define: what condition are you tracking, and for which key?
var pipeline = Spanfold.Spanfold
    .For<MonitorEvent>()
    .RecordWindows()
    .TrackWindow("Outage",
        key:       e => e.ServiceId,
        predicate: e => e.Status == "down");

// 2. Ingest events from each source
pipeline.Ingest(providerAEvents, source: "provider-a");
pipeline.Ingest(providerBEvents, source: "provider-b");

// 3. Compare: who saw what, when, and for how long?
var result = pipeline.History
    .Compare("Outage audit")
    .Target("provider-a", s => s.Source("provider-a"))
    .Against("provider-b", s => s.Source("provider-b"))
    .Within(scope => scope.Window("Outage"))
    .Using(c => c.Overlap().Residual().Missing().Coverage())
    .Run();

// result.OverlapRows   — periods both sources agreed on
// result.ResidualRows  — periods A reported that B missed
// result.MissingRows   — periods B reported that A missed

What It Solves

You have multiple systems — monitoring providers, pipeline stages, model versions, detectors — all reporting observations about the same thing. You want to know:

  • When did each source think this condition was true, and for how long?
  • Where did sources agree, diverge, lag, or leave gaps?
  • What was knowable at a specific point in time, without leaking future data?

A latest-value store tells you the current state. An event log tells you what happened. Spanfold tells you when each source believed what, and where those beliefs differed.


Why Not SQL Interval Joins or Ad Hoc Code?

Interval comparison looks simple until you account for partial overlaps, gaps within coverage, multiple windows on the same key, lead/lag timing, live windows, and known-at filtering. Each question adds another query or another special case.

SQL interval join — overlap only:

SELECT a.service_id,
       GREATEST(a.start, b.start) AS overlap_start,
       LEAST(a.end,   b.end)   AS overlap_end
FROM   outage_windows_a a
JOIN   outage_windows_b b
  ON   a.service_id = b.service_id
 AND   a.start < b.end
 AND   a.end   > b.start
-- Residual (A-only) needs a NOT EXISTS query.
-- Gap detection, lead/lag, coverage % each need another query.
-- Known-at filtering (no future leakage) needs timestamp join bookkeeping.
-- Live/provisional windows need finality state tracked separately.

Spanfold — all of the above in one comparison plan:

var result = pipeline.History
    .Compare("Outage audit")
    .Target("provider-a", s => s.Source("provider-a"))
    .Against("provider-b", s => s.Source("provider-b"))
    .Within(scope => scope.Window("Outage"))
    .Using(c => c
        .Overlap()   // both agreed
        .Residual()  // A saw, B missed
        .Missing()   // B saw, A missed
        .Gap()       // empty periods inside observed scope
        .LeadLag()   // transition timing drift between sources
        .Coverage()) // magnitude and coverage percentage
    .Run();
// Known-at filtering and live horizons are built into the comparison model.

Predicates Are Not Limited To Booleans

A window opens when a predicate changes. The predicate can evaluate anything — the window tracks the period it held.

Predicate type Example
Boolean isUp == true
Threshold cpuPercent > 80
Enum / status status == "degraded"
Numeric range latencyMs >= 500 && latencyMs < 2000
Any value change alert level changed, model output changed, annotation updated

Comparator Families

Comparator What it measures
Overlap Duration where both sides agreed
Residual Target-only duration (what the comparison side missed)
Missing Comparison-only duration (what the target missed)
Coverage Magnitude and coverage percentage
Gap Empty spaces inside an observed scope
Symmetric diff Disagreement in both directions
Containment Whether one period stays inside another
Lead / Lag Transition timing drift between sources
As-of Point-in-time lookup without future leakage

Comparator reference · Comparison guide


Use Cases

Monitoring provider outage comparison

You have two or more monitoring providers watching the same service. When an outage occurs, each provider may report it at a slightly different time, recover at a different time, or miss it entirely. Spanfold records each provider's outage windows and emits structured rows showing exactly where they agreed, where one reported a period the other didn't, and how large each discrepancy was.

Provider outage comparison

Pipeline stage divergence

A processing pipeline passes events through multiple stages — ingestion, enrichment, classification, alerting. When a condition appears at one stage but not another, or arrives late, or disappears before the final stage, Spanfold records a window at each stage and compares them to show where state diverged, lagged, or dropped.

Pipeline stage divergence

Backtesting without future leakage

Auditing a past decision means using only what was knowable at the time — not data that arrived later. Spanfold's known-at filtering separates when a state was observed from when it was available to the system, so backtests, replays, and decision-point audits do not accidentally include future observations even when replaying historical data.

No-future-leakage backtesting


Core Concepts

Windows — a half-open period where a predicate held for a key. Can be closed, still open, or clipped to a live horizon.

Sources and lanes — where an observation came from: providers, monitors, gateways, pipeline stages, model versions, or any other reporting lane.

Segments and tags — segments split a window when analytical context changes; tags attach metadata without splitting.

Comparisons — a staged plan: target side, comparison side, scope, normalization, and comparator families. Produces structured temporal evidence.

Known-at safety — separates when a state happened from when it was observable. Prevents future leakage in backtests and replays.

Live horizons — an explicit cutoff for evaluating still-open windows. Preserves provisional row metadata so live and final rows are distinguishable.


Why Not Just X?

Latest-state tracking — answers what is true now. Spanfold answers when it was true, who saw it, and whether another lane missed it.

Event sourcing — stores durable facts and rebuilds state. Spanfold analyzes the periods where that state held after those facts have been interpreted.

Stream processors — handle online computation, routing, enrichment, and aggregation. Spanfold is narrower: it records interpreted state windows and compares their temporal evidence.

Observability and metrics tools — aggregate time into counters, histograms, and dashboards. Spanfold keeps individual windows and emits comparison rows with full temporal structure.

A database with interval storage — can persist windows, but will not provide staged comparison plans, normalization, live finality, known-at filtering, diagnostics, or deterministic exports.


.NET Package

The .NET package has the original C# API surface, CLI, testing helpers, samples, benchmarks, and package documentation.

dotnet add package Spanfold         # core recording and comparison
dotnet add package Spanfold.Testing # optional: fixtures, snapshots, assertions

.NET package README

Python Package

A typed Python package that tracks the core C# API surface with idiomatic snake-case names, dataclass models, deterministic exports, testing helpers, and fixture CLI commands.

Python package README


Repository Layout

packages/
  dotnet/
    src/
    tests/
    samples/
    benchmarks/
    docs/
    Spanfold.slnx
  python/
    src/
    tests/
    samples/
    docs/
    pyproject.toml
docs/
  index.html
  assets/

Working With Packages

Run the .NET reference tests:

dotnet test packages/dotnet/Spanfold.slnx

Run the Python port tests:

cd packages/python
python -m pip install -e ".[dev]"
pytest

Documentation

About

Spanfold is a composable C# library for stateful event windows and hierarchical rollups.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors