
fix: bind @op and emit sidecar events for history-range queries #1198

Merged
bplatz merged 6 commits into main from fix/history-ranges
Apr 29, 2026

Conversation

Contributor

@bplatz bplatz commented Apr 24, 2026

Fixes two stacked bugs that left history-range queries ("from": "ledger@t:1", "to": "ledger@t:latest") returning wrong results on indexed ledgers:

  1. The @op variable always bound to null — the only call site that would populate it (Binding::from_object_with_t_op) had zero callers.
  2. Retracts and superseded asserts that lived in the history sidecar were never emitted — the binary cursor only read current-state base rows, and the sidecar was only consulted for point-in-time (AS OF t) replay.

Together these meant the reporter's query

{
  "from": "ledger:main@t:1",
  "to":   "ledger:main@t:latest",
  "select": ["?v", "?t", "?op"],
  "where": [{
    "@id": "<concept>",
    "skos:notation": {"@value":"?v","@t":"?t","@op":"?op"}
  }]
}

returned one row (the current asserted value) with ?op = null instead of the full timeline of assert / retract events.
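With both fixes in place, the same query returns the full event timeline. Using the data from the first regression test described below, that is three rows:

[["Alice",       1, "assert"],
 ["Alice Smith", 2, "assert"],
 ["Alice",       2, "retract"]]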

What changed

New module: fluree-db-query/src/binary_history.rs

BinaryHistoryScanOperator is a thin wrapper around BinaryScanOperator. When ctx.history_mode is false it transparently delegates. When true, it runs a dedicated three-source merge in open():

  1. History sidecar entries (HistEntryV2) for each matching leaflet — assert + retract events with explicit op.
  2. Base leaflet rows whose t falls in [from_t, index_t] — emitted as op = assert.
  3. Overlay / novelty flakes in (index_t, to_t] — carry their own flake.op.

The merged Vec<Flake> is handed to the existing BinaryScanOperator via a new prime_history_flakes pub(crate) helper, which routes it through the already-history-aware flakes_to_bindings pipeline (binary_scan.rs:~704 copies flake.op onto the emitted Binding::Lit).
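A minimal sketch of that delegate-or-merge control flow (the types and source-reading helpers here are simplified stand-ins, not the real binary_scan.rs API):

struct Flake {
    t: u64,
    op: bool, // true = assert, false = retract
}

struct Ctx {
    history_mode: bool,
}

struct BinaryHistoryScanOperator {
    primed: Option<Vec<Flake>>,
}

impl BinaryHistoryScanOperator {
    fn open(&mut self, ctx: &Ctx) -> Result<(), String> {
        if !ctx.history_mode {
            // Non-history queries delegate straight through to the inner scan.
            return self.open_inner(ctx);
        }
        // History mode: eagerly merge the three flake sources.
        let mut merged: Vec<Flake> = Vec::new();
        merged.extend(self.sidecar_entries(ctx));    // 1. HistEntryV2 assert/retract events
        merged.extend(self.base_rows_in_range(ctx)); // 2. base rows in [from_t, index_t], op = assert
        merged.extend(self.novelty_flakes(ctx));     // 3. overlay/novelty flakes in (index_t, to_t]
        merged.sort_by_key(|f| f.t);
        // Handed to the inner BinaryScanOperator via prime_history_flakes,
        // which drains it through the history-aware flakes_to_bindings path.
        self.primed = Some(merged);
        Ok(())
    }

    fn open_inner(&mut self, _ctx: &Ctx) -> Result<(), String> { Ok(()) }
    fn sidecar_entries(&self, _ctx: &Ctx) -> Vec<Flake> { Vec::new() }
    fn base_rows_in_range(&self, _ctx: &Ctx) -> Vec<Flake> { Vec::new() }
    fn novelty_flakes(&self, _ctx: &Ctx) -> Vec<Flake> { Vec::new() }
}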

Narrowing

Leaves and leaflets are narrowed by the pattern's bound components so we only touch the relevant slice of the index:

  • subject bound → branch.find_leaves_for_subject(s_id) on SPOT
  • predicate-only bound → PSOT branch, with leaflet p_const / o_type_const skip
  • sidecar segment skipped when history_len == 0 or history_max_t < from_t

A subject-history query for one subject touches only the leaves containing that subject's s_id — no broad database scan.
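Roughly, the narrowing decision looks like this (hypothetical types; the real logic walks the SPOT/PSOT branch structures directly):

enum Narrowing {
    SubjectLeaves(u64),   // walk only the SPOT leaves containing this s_id
    PredicateBranch(u64), // walk the PSOT branch, skipping leaflets via p_const / o_type_const
    FullWalk,
}

fn choose_narrowing(s_id: Option<u64>, p_id: Option<u64>) -> Narrowing {
    match (s_id, p_id) {
        (Some(s), _) => Narrowing::SubjectLeaves(s),
        (None, Some(p)) => Narrowing::PredicateBranch(p),
        (None, None) => Narrowing::FullWalk,
    }
}

// The sidecar segment is consulted only when it can contain relevant events:
fn sidecar_relevant(history_len: usize, history_max_t: u64, from_t: u64) -> bool {
    history_len > 0 && history_max_t >= from_t
}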

Wiring

Every scan construction site now produces the history-aware wrapper:

  • ScanDatasetBuilder::build in dataset_operator.rs (the main planner path)
  • execute_pattern / execute_pattern_at / execute_pattern_with_overlay / execute_pattern_with_overlay_at in lib.rs (used by staging, policy, config resolver)

Non-history queries pay one extra vtable hop per next_batch (~1–2 ns, negligible against batch sizes of 100+ rows).

Helper-visibility changes

  • BinaryScanOperator::prime_history_flakes — new pub(crate) method that primes the operator to drain a pre-collected Vec<Flake> via the existing flake pipeline.
  • extract_bound_terms_snapshot, build_filter_from_snapshot_sids, filter_flakes_by_policy — bumped from fn to pub(crate) fn for reuse from binary_history.rs.
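For orientation, the new helper's shape is roughly the following (simplified; the field name and exact signature are illustrative, not the real binary_scan.rs code):

struct Flake; // stand-in for the real flake type

struct BinaryScanOperator {
    primed_history: Option<Vec<Flake>>, // hypothetical field name
}

impl BinaryScanOperator {
    /// Prime the operator to drain a pre-collected flake set: subsequent
    /// next_batch calls route it through flakes_to_bindings instead of
    /// reading the binary cursor.
    pub(crate) fn prime_history_flakes(&mut self, flakes: Vec<Flake>) {
        self.primed_history = Some(flakes);
    }
}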

Cost model

All heavy lifting happens in open(), so that's where the fuel guardrail lives:

  • Persisted pass: 1000 micro-fuel per leaflet touched — same rate BinaryCursor::next_batch uses for non-history scans.
  • Novelty pass: 1 micro-fuel per matched novelty flake, charged during the walk. A captured FuelExceededError short-circuits the callback before the next flake is pushed. Pattern filter is applied inside the callback so we only charge for flakes we'd actually retain — otherwise a wide predicate on a crowded novelty slice would charge for every flake touched.
  • Per-flake emit: 1 micro-fuel per flake is charged downstream in flakes_to_bindings when next_batch drains the collected vec.

Together these prevent unbounded eager work before the first batch emission.
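A condensed sketch of where those charges land, with a stand-in Tracker and the rates listed above:

struct Tracker {
    used: u64,
    limit: u64,
}

impl Tracker {
    fn consume_fuel(&mut self, amount: u64) -> Result<(), &'static str> {
        self.used += amount;
        if self.used > self.limit {
            return Err("FuelExceededError");
        }
        Ok(())
    }
}

fn open_history(tracker: &mut Tracker, leaflets_touched: u64, novelty_matches: u64) -> Result<(), &'static str> {
    for _ in 0..leaflets_touched {
        // Persisted pass: 1000 micro-fuel per leaflet, the same rate
        // BinaryCursor::next_batch uses for non-history scans.
        tracker.consume_fuel(1000)?;
    }
    for _ in 0..novelty_matches {
        // Novelty pass: 1 micro-fuel per flake the pattern filter retains;
        // an error here short-circuits before the next flake is pushed.
        tracker.consume_fuel(1)?;
    }
    // The per-flake emit charge (1 each) lands later, in flakes_to_bindings,
    // as next_batch drains the collected vec.
    Ok(())
}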

Tests

Six regression tests in fluree-db-api/tests/it_query_history_range.rs:

  1. history_range_emits_sidecar_events_with_op — reporter's scenario: reindex + upsert + reindex, verify all three events ((Alice, 1, assert), (Alice Smith, 2, assert), (Alice, 2, retract)) emit with correct @t and @op.
  2. history_range_novelty_only — unindexed ledger, verifies the novelty-only path also binds @op.
  3. history_range_op_constant_filter_assert — "@op": "assert" as a constant filter returns only asserts.
  4. history_range_op_constant_filter_retract — "@op": "retract" returns only retracts.
  5. history_range_sidecar_plus_novelty_boundary — reindex at t=1, transact t=2, query spanning index_t; exercises the to_t > index_t novelty-merge path.
  6. history_range_subject_unbound — predicate-only history across multiple subjects.

Known limitations / follow-ups

Documented in the binary_history.rs module header:

  • Multi-pattern history with joins. When a history query has two or more triple patterns, the first becomes the history-aware scan, but patterns 2+ are driven by batched_subject_probe_binary in join.rs, which still reads current-state base rows only. ?s values joined in from a history-aware outer scan will fetch today's label, not their historical labels. The fix needs a per-pattern history_semantics flag (propagated from the parser's existing t_var/op_var tracking) plus a history-aware branch in the join probe.

  • Per-leaf sidecar pruning. open_leaf_handle(..., need_replay=true) fetches the whole sidecar blob up front. For leaves whose directory doesn't yet reveal any leaflet with history_max_t >= from_t, this is wasted I/O on local/cached reads. Fixing cleanly needs either a two-pass open (dir first, sidecar on demand) or leaf-level history_max_t on LeafEntry.

  • Streaming emit. collect_history_flakes materialises the matched event set into a Vec<Flake> before next_batch drains it. Subject- or predicate-bound queries match a small set so this is cheap; broad unconstrained history queries over wide time ranges could benefit from leaflet-at-a-time streaming.

  • BinaryRangeProvider::range hardcodes op: true (binary_range.rs:~495). Non-query callers that pass RangeOptions::history_mode: true (SHACL, reasoner, etc.) would get collapsed-state. No active caller does this today per grep, but it's a sharp edge worth cleaning up separately.

@bplatz bplatz requested review from aaj3f and zonotope April 24, 2026 14:49
Contributor

@zonotope zonotope left a comment


I have two concerns. I think the fuel calculation could be a real issue for large queries, but I think the fix is easy.

The other concern is architectural. I think it would be worth pushing the history mode decision to the planner to simplify execution, but that change might be too big to do at this point. If that is the case, then we can save it for a later refactor.

Other than that, 👍🏾

Approved.

Comment thread: fluree-db-query/src/binary_history.rs (Outdated)
// it, a broad history query could do unbounded column /
// sidecar loads in `open()` before the caller ever sees
// the first batch.
ctx.tracker.consume_fuel(1000)?;
Contributor


The semantics of BinaryCursor::next_batch are different here. That charges per call, and each call processes a single batch, so a leaflet larger than the batch size means multiple calls and multiple fuel charges. In the history case, since all the work is done up front in open(), we charge fuel at most once per leaflet regardless of how many rows it has; large leaflets that require a lot of work incur the same cost as small ones that don't. Perhaps as a quick fix we could multiply the fuel charge by the row count. We have access to the row count before we do any work, so we could charge before incurring any cost and short-circuit before the overrun. The tradeoff is that we could overcharge on some queries.
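Something along these lines, as a rough sketch (leaflet.row_count() is a hypothetical accessor; the directory exposes the row count before any work is done):

// Proposed quick fix: scale the per-leaflet charge by row count so an
// oversized leaflet can short-circuit the overrun up front.
let rows = leaflet.row_count().max(1) as u64;
ctx.tracker.consume_fuel(1000 * rows)?;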

Contributor Author


Addressed in c203a39

Comment thread: fluree-db-query/src/binary_history.rs

use crate::var_registry::VarId;

/// Scan operator that activates a dedicated history-range walk when
/// `ctx.history_mode` is true.
Contributor


A history mode flag on the context spreads control flow over multiple layers of execution, which increases code complexity. We have to decide whether or not to set history mode in the context, then we have to change behavior at each point we read the flag. That makes the decision tree for query execution much more complex.

I think it would be simpler to remove the concept of history mode from the execution context and instead make it a planner concern. That way we would make the decision once, and then build a separate operator tree for history queries rather than increasing the complexity of an individual operator by making it apply in both cases. That would make the decision tree less "complected". You can take me out of Clojure, but you can't take Clojure out of me ;)

That might be a bigger refactor than this branch warrants, so I'm fine with saving it for later if it isn't feasible to do now.

Contributor Author


Sounds good. I looked at the change and it's big enough for another PR, so I can work on that separately. Thanks for pointing it out.

@bplatz bplatz merged commit facccdf into main Apr 29, 2026
10 of 12 checks passed
@bplatz bplatz deleted the fix/history-ranges branch April 29, 2026 10:59
