[bench] Filter PreparedBoolExpression cache regression bench #1232

@aaj3f

Description

Context

Follow-up bench from PR #1228 review (coverage gap #5). The PreparedBoolExpression (memory: mem:fact-01kmtf7e63dh83atzzsr92t8bx) eliminates per-row cacheability analysis on hot filter paths. A regression that accidentally re-runs the analysis per row would silently slow filter-heavy queries — no panic, no error, just "FILTER queries got 2× slower in production for no obvious reason."
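
To make the failure mode concrete, here is a minimal std-only sketch. It does not use the real PreparedBoolExpression API: `Analyzer`, `filter_prepared`, and `filter_regressed` are hypothetical stand-ins, and the "analysis" is just a counter. Both shapes return identical results, so only the number of analysis runs (and therefore the timing) tells them apart.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the cacheability analysis (not Fluree's API).
/// It only counts how many times the analysis runs.
struct Analyzer {
    runs: Cell<usize>,
}

impl Analyzer {
    fn new() -> Self {
        Analyzer { runs: Cell::new(0) }
    }

    /// Pretends to analyse a filter and returns a ready-to-use predicate.
    fn analyze(&self, threshold: i64) -> impl Fn(i64) -> bool {
        self.runs.set(self.runs.get() + 1);
        move |v| v > threshold
    }
}

/// Intended shape: analyse once, then evaluate the prepared form per row.
fn filter_prepared(a: &Analyzer, rows: &[i64], threshold: i64) -> usize {
    let pred = a.analyze(threshold);
    rows.iter().filter(|&&v| pred(v)).count()
}

/// Regressed shape: the analysis silently re-runs for every row.
fn filter_regressed(a: &Analyzer, rows: &[i64], threshold: i64) -> usize {
    rows.iter().filter(|&&v| a.analyze(threshold)(v)).count()
}

fn main() {
    let rows: Vec<i64> = (0..1_000).collect();
    let (good, bad) = (Analyzer::new(), Analyzer::new());

    // Same answer either way, which is why the regression is silent.
    assert_eq!(
        filter_prepared(&good, &rows, 499),
        filter_regressed(&bad, &rows, 499)
    );
    println!(
        "prepared: {} analysis run(s), regressed: {} analysis run(s)",
        good.runs.get(),
        bad.runs.get()
    );
}
```

Because correctness tests cannot distinguish the two functions, a timing bench is the only practical guard.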

Scope

Add a query-side micro-bench in fluree-db-query/benches/ (or fluree-db-api/benches/ if the bench needs the full Fluree stack to construct the PreparedBoolExpression realistically):

  • Setup: a small populated ledger (gen::people or gen::bsbm).
  • Two scenarios:
    • filter_prepared_cache_hot: a SPARQL FILTER query whose predicate is amenable to prepared-form caching, run repeatedly. The bench measures the per-row cost when the cache is hit.
    • filter_prepared_cache_cold: same query construction, but force a cache-miss shape so the bench measures the analysis cost. (Optional second scenario; skip if simulating a cache miss is awkward — the hot scenario is the priority.)
  • Throughput metric: rows/sec or ns/row.
  • Use the chassis pattern: bench_runtime(), current_profile(), current_scale(), and next_ledger_alias().
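
As a rough illustration of the throughput metric only, the sketch below derives ns/row and rows/sec from a timed filter loop using nothing but the standard library. The real bench would presumably go through the chassis helpers and the project's bench harness, whose signatures are not shown here, so `Throughput`, `throughput`, and `time_filter` are hypothetical names and the in-memory `Vec<i64>` stands in for a populated ledger.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Both candidate metrics from the Scope list, derived from one measurement.
struct Throughput {
    ns_per_row: f64,
    rows_per_sec: f64,
}

/// Converts a row count and elapsed time into ns/row and rows/sec.
/// Assumes `rows > 0` and a non-zero `elapsed`.
fn throughput(rows: u64, elapsed: Duration) -> Throughput {
    let secs = elapsed.as_secs_f64();
    Throughput {
        ns_per_row: secs * 1e9 / rows as f64,
        rows_per_sec: rows as f64 / secs,
    }
}

/// Runs the filter `iters` times over `rows` and returns the match count of
/// the last pass together with the total elapsed time.
fn time_filter(rows: &[i64], iters: u32, pred: impl Fn(i64) -> bool) -> (usize, Duration) {
    let start = Instant::now();
    let mut matched = 0;
    for _ in 0..iters {
        // black_box keeps the optimizer from hoisting or eliding the loop body.
        matched = black_box(rows).iter().filter(|&&v| pred(v)).count();
    }
    (black_box(matched), start.elapsed())
}

fn main() {
    let rows: Vec<i64> = (0..100_000).collect();
    let iters = 50;
    let (matched, elapsed) = time_filter(&rows, iters, |v| v % 2 == 0);
    let t = throughput(rows.len() as u64 * iters as u64, elapsed);
    println!(
        "matched {matched} rows per pass, {:.2} ns/row, {:.0} rows/sec",
        t.ns_per_row, t.rows_per_sec
    );
}
```

Reporting per-row cost rather than total wall time keeps the numbers comparable if the ledger scale changes between profiles.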

Acceptance

  • Bench compiles and passes a --test run at tiny scale.
  • regression-budget.json has an entry.
  • BENCHMARKING.md's "Current benches" table gets a row.
  • Bench-gate CI job picks it up.

References

Out of scope

The cache-miss scenario is nice-to-have; if it can't be constructed cleanly without forcing internal-API access, skip it and just bench the hot scenario.
