
Conversation


@willkelly willkelly commented Nov 28, 2025

Summary

This PR significantly improves the performance of the memory package's fact operations through a series of targeted optimizations. The changes are purely internal: the public API and data model are untouched.

Benchmark Results (16KB payload)

| Operation | Before | After | Improvement |
| --- | --- | --- | --- |
| set fact | ~1.2ms | 846µs | 30% faster |
| update fact | ~906µs | 688µs | 24% faster |
| retract fact | ~468µs | 360µs | 23% faster |
| get fact | ~117µs | ~51µs | 56% faster |

What Was Optimized

1. LRU memoization for merkle reference hashing

Added a bounded LRU cache (1000 entries) for refer() results. Provides ~25x speedup on cache hits when the same content is referenced multiple times within or across transactions.
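
A minimal sketch of the pattern, assuming `refer` is the merkle-reference export (the commit log below confirms the 1000-entry bound and the JSON.stringify key); the helper name and eviction details are illustrative, not the PR's actual code:

```ts
import { refer } from "merkle-reference";

type Ref = ReturnType<typeof refer>;

const MAX_ENTRIES = 1000;
const cache = new Map<string, Ref>();

export const memoizedRefer = (value: unknown): Ref => {
  // JSON.stringify (~7µs for a 16KB value) is far cheaper than the
  // SHA-256 computation it lets us skip (~170µs).
  const key = JSON.stringify(value);
  const hit = cache.get(key);
  if (hit !== undefined) {
    // Delete + re-insert moves the key to the end of the Map's
    // insertion order, which doubles as the LRU queue.
    cache.delete(key);
    cache.set(key, hit);
    return hit;
  }
  const result = refer(value);
  if (cache.size >= MAX_ENTRIES) {
    // Evict the least recently used entry: the first key in order.
    cache.delete(cache.keys().next().value!);
  }
  cache.set(key, result);
  return result;
};
```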

2. SQLite prepared statement caching

Caches 7 frequently-used SQL statements per database connection using a WeakMap for memory safety. Eliminates redundant statement preparation overhead. Result: 2.2x faster queries, ~20% faster writes.
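
Roughly, the caching shape looks like the sketch below. The `getPreparedStatement` name comes from the commit log; the `Database`/`Statement` types are written against a Deno SQLite binding such as `@db/sqlite`, and the rest is an illustrative reconstruction:

```ts
import type { Database, Statement } from "@db/sqlite";

// Keying on the connection object means the cache entry (and its
// statements) become collectable as soon as the connection does.
const statements = new WeakMap<Database, Map<string, Statement>>();

export const getPreparedStatement = (
  db: Database,
  sql: string,
): Statement => {
  let perConnection = statements.get(db);
  if (!perConnection) {
    perConnection = new Map();
    statements.set(db, perConnection);
  }
  let statement = perConnection.get(sql);
  if (!statement) {
    // Prepare once; every later call with the same SQL reuses it.
    statement = db.prepare(sql);
    perConnection.set(sql, statement);
  }
  return statement;
};
```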

3. Datum/fact hash reordering

The merkle-reference library caches sub-objects by identity during traversal. By computing the datum hash BEFORE the fact hash, the subsequent refer(assertion) call hits the cache when it encounters the payload sub-object. This is a simple reorder with a comment explaining why order matters. Result: ~16% faster sets.
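
In code, the reorder amounts to something like this sketch; the assertion shape is taken from the commit log below, and `refer` is the merkle-reference export:

```ts
import { refer } from "merkle-reference";

// Shape per the commit log: { the, of, is: payload, cause }.
declare const assertion: {
  the: string;
  of: string;
  is: Record<string, unknown>; // the 16KB payload
  cause: unknown;
};

// Order matters: refer() caches sub-objects by identity during traversal,
// so hashing the datum first means refer(assertion) reuses that hash when
// it reaches the `is` field instead of hashing the payload a second time.
const datum = refer(assertion.is);
const fact = refer(assertion);
```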

4. Batch label lookups

Previously getLabels() performed N individual SELECT queries for N facts. Now uses a single batched query with SQLite's json_each() function. Reduces N queries to 1 query regardless of transaction size.
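
The SQL is quoted in the commit log below; the surrounding TypeScript is an assumed shape (`getLabels` and the `Database` binding here are illustrative):

```ts
import type { Database } from "@db/sqlite";

const GET_LABELS_BATCH = `
  SELECT state.* FROM state
  WHERE state.the = :the
    AND state.of IN (SELECT value FROM json_each(:ofs))
`;

const getLabels = (db: Database, the: string, ofs: string[]) =>
  // One round trip regardless of N: the array is serialized to JSON and
  // expanded server-side by json_each(). In the PR this statement also
  // goes through the prepared-statement cache.
  db.prepare(GET_LABELS_BATCH).all({ the, ofs: JSON.stringify(ofs) });
```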

5. Use stored fact hash instead of recomputing

In conflict detection, we were reading a fact from the database (which includes the stored hash), discarding the hash, then calling refer() to recompute it. Now recall() returns the stored hash directly. Result: 7-16% faster across all operations.
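
A before/after sketch of that path; `RevisionWithFact` appears in the commit log, but `recall`'s signature and the comparison helper are hypothetical reconstructions:

```ts
// The stored hash rides along with the revision instead of being
// discarded on read.
type RevisionWithFact = { the: string; of: string; fact: string };

declare function recall(the: string, of: string): RevisionWithFact | null;

const hasConflict = (the: string, of: string, expected: string): boolean => {
  const revision = recall(the, of);
  // Before: refer(revision).toString() recomputed the hash (~50-170µs).
  // After: compare against the hash already read from the fact column.
  return revision !== null && revision.fact !== expected;
};
```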

Why These Changes Are Safe

  1. No API changes - All optimizations are internal implementation details
  2. No data model changes - Same merkle hashes, same storage format, same behavior
  3. All 31 existing tests pass - Full test coverage confirms correctness
  4. Caches are bounded - LRU eviction and WeakMap prevent memory leaks
  5. Pure performance - Same inputs produce same outputs, just faster

Bottleneck Analysis

Investigation showed that SQLite is NOT the bottleneck (~2-5µs for raw INSERT). Time breakdown for a 16KB fact write:

| Component | Time | % |
| --- | --- | --- |
| Merkle hashing | ~370µs | 44% |
| JSON operations | ~280µs | 33% |
| Commit log | ~150µs | 18% |
| SQLite | ~35µs | 4% |

The optimizations target the hashing and database overhead where practical, while respecting the merkle DAG architecture that provides content-addressability, conflict detection, and deduplication.

Test plan

  • All 31 space-test.ts tests pass
  • Benchmarks show consistent improvement
  • No changes to public API or data format
  • Memory usage bounded by LRU cache limits

🤖 Generated with Claude Code


Summary by cubic

Optimized memory fact operations with internal caching and SQL statement reuse, yielding ~30% faster writes and ~56% faster reads on 16KB payloads. No API or data model changes.

  • Performance

    • Memoized merkle refer() with a bounded LRU (1000 entries) for repeated content.
    • Cached SQLite prepared statements per connection (WeakMap); ~2.2x faster queries and ~20% faster writes.
    • Computed datum hash before fact hash to reuse merkle sub-object cache (~16% faster sets).
    • Batched label lookups via json_each(), reducing N queries to 1.
    • Used stored fact hash in conflict detection instead of recomputing.
  • New Features

    • Added comprehensive benchmarks and a deno task: `deno task bench`.
    • Included isolation benchmarks to spotlight hashing vs. SQLite costs.

Written for commit d066037. Summary will update automatically on new commits.


@cubic-dev-ai cubic-dev-ai bot left a comment


3 issues found across 8 files

Prompt for AI agents (all 3 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


```xml
<file name="packages/memory/test/benchmark.ts">

<violation number="1" location="packages/memory/test/benchmark.ts:65">
Warm-up calls `session.transact` without ever checking the returned Result, so initialization failures are silently ignored and every benchmark can run with invalid state.</violation>

<violation number="2" location="packages/memory/test/benchmark.ts:65">
The workflow benchmark drops every Result from `session.transact` and `session.query`, so any failure in create/read/update/retract steps is ignored and the workflow timing becomes meaningless.</violation>

<violation number="3" location="packages/memory/test/benchmark.ts:233">
Each query in the "get fact (10 specific docs)" benchmark discards the Result from `session.query`, so read failures are silently ignored and the benchmark can time no-op iterations.</violation>
</file>
```

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR


@seefeldb seefeldb left a comment


this looks great! cc @ubik2 who might find this really interesting as well.

willkelly and others added 7 commits December 2, 2025 17:48
Add comprehensive benchmarks to measure write and read performance,
including isolation benchmarks to identify specific bottlenecks.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add bounded LRU cache (1000 entries) to memoize refer() results in
reference.ts. refer() is a pure function computing SHA-256 hashes,
which was identified as the primary bottleneck via isolation benchmarks.

Benchmark results for cache hits:
- 3x refer() calls: 44µs vs ~500µs uncached (27x faster)
- 10x unclaimed refs: 2.5µs (400k ops/sec)

The memoization benefits real-world usage patterns:
- Repeated entity access (queries, updates on same docs)
- unclaimed({ the, of }) patterns called multiple times
- Multi-step transaction flows referencing same content

Implementation:
- reference.ts: LRU cache using Map with bounded eviction
- Updated imports in fact.ts, access.ts, error.ts, entity.ts to
  use memoized refer() from ./reference.ts instead of merkle-reference

The cache uses JSON.stringify as key (~7µs for 16KB) which is ~25x
faster than the SHA-256 hash computation (~170µs for 16KB).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Implemented prepared statement caching to eliminate redundant statement
preparation overhead on every database operation. Uses a WeakMap-based
cache per database connection to ensure proper cleanup and memory safety.

Changes:
- Added PreparedStatements type and getPreparedStatement() helper
- Cached 7 frequently-used SQL statements (EXPORT, CAUSE_CHAIN, GET_FACT,
  IMPORT_DATUM, IMPORT_FACT, IMPORT_MEMORY, SWAP)
- Removed manual finalize() calls as statements are reused
- Added finalizePreparedStatements() to close() for cleanup
- Updated all database query functions to use cached statements

Benchmark results (before → after):
- Single GET query: 117.5µs → 53.4µs (54.6% faster / 2.2x speedup)
- Single UPDATE: 906.6µs → 705.8µs (22.1% faster)
- Batch retract (10): 2.5ms → 1.9ms (24.0% faster)
- Query from 1000 docs: 89.6µs → 66.7µs (25.5% faster)
- Batch SET (100): 99.4ms → 88.1ms (11.4% faster)
- Batch SET (10): 8.6ms → 7.9ms (8.1% faster)
- Single SET: 1.2ms → 1.1ms (8.3% faster)

Overall, the optimization provides consistent improvements across all
operations with particularly strong gains in read-heavy workloads.
All 31 existing tests pass without modifications.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…t caching

The merkle-reference library caches sub-objects by identity during traversal.
By computing the datum hash BEFORE the fact hash, the subsequent refer(assertion)
call hits the cache when it encounters the payload sub-object, avoiding redundant
hashing of the same 16KB payload twice.

Before: refer(assertion) then refer(datum) - payload hashed twice
After:  refer(datum) then refer(assertion) - payload hash reused via WeakMap

This ordering matters because:
1. refer(datum) hashes the payload and caches it by object identity
2. refer(assertion) traverses {the, of, is: payload, cause} - when it reaches
   the 'is' field, the payload object reference hits the WeakMap cache

Benchmark results (16KB payload):
- set fact (single): 1.1ms → 924.7µs (16% faster)
- retract fact (single): 483.8µs → 462.4µs (4% faster)
- update fact (single): ~705µs → ~723µs (within noise)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Previously, getLabels() performed N individual SELECT queries to look up
labels for N facts in a transaction. This adds latency proportional to
the number of facts being processed.

Now uses a single batched query with SQLite's json_each() function to
handle an array of 'of' values:

  SELECT ... WHERE state.the = :the AND state.of IN (SELECT value FROM json_each(:ofs))

This reduces N queries to 1 query regardless of transaction size.

Changes:
- Added GET_LABELS_BATCH query constant using json_each() for IN clause
- Added 'getLabelsBatch' to prepared statement cache
- Rewrote getLabels() to collect 'of' values and execute single batch query

The optimization benefits workloads with label facts (access control,
classification). Benchmarks show ~4% improvement on batch operations,
with larger gains expected in label-heavy workloads.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
In conflict detection, we were reading a fact from the database and then
calling refer(actual) to compute its hash for comparison. But the fact
hash is already stored in the database (row.fact) - we were discarding it
and recomputing it unnecessarily.

Changes:
- Added RevisionWithFact<T> type that includes the stored fact hash
- Updated recall() to return row.fact in the revision
- Use revision.fact directly instead of refer(actual).toString()
- Strip 'fact' field from error reporting to maintain API compatibility

This eliminates a refer() call (~50-170µs) on the conflict detection path,
which is taken for duplicate detection and first insertions.

Benchmark results:
- set fact (single): ~1.0ms → 846µs (15% faster)
- update fact (single): ~740µs → 688µs (7% faster)
- retract fact (single): ~428µs → 360µs (16% faster)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix Reference type annotations for memoized refer()
- Validate Result in benchmarks to catch silent failures
- Apply deno fmt

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@willkelly willkelly force-pushed the memory_optimizations branch from 032200e to 74f990f on December 3, 2025 00:51
This makes it work automatically with `deno bench`
@willkelly willkelly merged commit 456ae7c into main Dec 3, 2025
10 checks passed
@willkelly willkelly deleted the memory_optimizations branch December 3, 2025 01:31