Speed up delta calculation on schema insert/update by jviotti · Pull Request #767 · sourcemeta/one

jviotti · 2026-03-20T19:10:37Z

Signed-off-by: Juan Cruz Viotti jv@jviotti.com

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>

augmentcode · 2026-03-20T19:16:05Z

🤖 Augment PR Summary

Summary: This PR optimizes BuildState::save() to avoid rebuilding the entire on-disk state when only a small set of schema outputs/resolver entries change.

Changes:

Adds a can_patch mode that reuses the existing hash table capacity and copies the prior slot table, applying deletions in-place.
Detects which overlay/resolver overlay keys are updates vs inserts, and updates existing slots rather than rewriting all disk entries.
Introduces combined “old pool + appended pool” offsets so new strings can be appended without rewriting the old string pool.
Speeds up persisted schema-index generation by reusing the previously persisted schema table and recomputing only affected schemas when patching.
Writes the state to a temporary file and renames it into place, reducing risk of partial writes.

Technical Notes: Correctness now depends on careful handling of slot key/data offsets across two pools (old mapped pool and newly appended pool) and on keeping the persisted schema index consistent with incremental deletions/updates.

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. 4 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-03-20T19:16:06Z

src/build/state.cc

+      while (slots[index * SLOT_SIZE + SLOT_OCCUPIED] != 0) {
+        const auto *slot{slots.data() + index * SLOT_SIZE};
+        if (read_field<std::uint64_t>(slot, SLOT_HASH) == hash &&
+            slot_key(slot, this->string_pool) == entry_key) {


In write_new_slot’s update probe, slot_key(slot, this->string_pool) assumes all probed slots’ keys live in the old pool; after earlier updates/inserts in this save, a probed slot can have a key offset into the appended pool, which can lead to invalid reads if a hash collision forces a key-compare.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-03-20T19:16:06Z

src/build/state.cc

+        const auto new_offset{key_offset - old_pool_size};
+        return {pool.data() + new_offset, key_length};
+      }
+      if (this->string_pool != nullptr) {


read_slot_key falls back to this->string_pool even when can_patch is false; in the rebuild path, slot key offsets refer to the newly-built pool, so this can produce incorrect keys or out-of-bounds reads during schema index generation.

Severity: high

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-03-20T19:16:06Z

src/build/state.cc

+      const std::uint8_t *base;
+      if (can_patch && offset >= old_pool_size) {
+        base =
+            reinterpret_cast<const std::uint8_t *>(pool.data()) - old_pool_size;


read_slot_pool_string computes base = pool.data() - old_pool_size, which forms a pointer before the start of the pool allocation; even if the later base+offset lands back in-range, creating/using an out-of-bounds base pointer is undefined behavior in C++.

Severity: high

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-03-20T19:16:06Z

src/build/state.cc

-      const auto sentinel_position{after.find(sentinel_marker)};
-      if (sentinel_position == std::string_view::npos) {
-        continue;
+        save_schema_index.erase(


When patching from persisted_schema_table, this erases the entire schema record for any deleted target under that schema; if a single per-schema artifact is removed (but others remain), the persisted schema index may incorrectly drop the whole schema and later consumers could treat it as missing rather than “present but missing targets”.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

cubic-dev-ai

No issues found across 1 file

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>

github-actions

Benchmark Index (community)

Details

Benchmark suite	Current: `fbe2b46`	Previous: `3fb41b6`	Ratio
`Add one schema (0 existing)`	`21` ms	`21` ms	`1`
`Add one schema (100 existing)`	`25` ms	`26` ms	`0.96`
`Add one schema (1000 existing)`	`75` ms	`86` ms	`0.87`
`Add one schema (10000 existing)`	`623` ms	`686` ms	`0.91`
`Update one schema (1 existing)`	`18` ms	`19` ms	`0.95`
`Update one schema (101 existing)`	`26` ms	`28` ms	`0.93`
`Update one schema (1001 existing)`	`75` ms	`82` ms	`0.91`
`Update one schema (10001 existing)`	`634` ms	`637` ms	`1.00`
`Cached rebuild (1 existing)`	`10` ms	`11` ms	`0.91`
`Cached rebuild (101 existing)`	`12` ms	`12` ms	`1`
`Cached rebuild (1001 existing)`	`27` ms	`26` ms	`1.04`
`Cached rebuild (10001 existing)`	`195` ms	`188` ms	`1.04`
`Index 100 schemas`	`116` ms	`118` ms	`0.98`
`Index 1000 schemas`	`965` ms	`1154` ms	`0.84`
`Index 10000 schemas`	`13462` ms	`14786` ms	`0.91`

This comment was automatically generated by workflow using github-action-benchmark.

github-actions

Benchmark Index (enterprise)

Details

Benchmark suite	Current: `fbe2b46`	Previous: `3fb41b6`	Ratio
`Add one schema (0 existing)`	`23` ms	`23` ms	`1`
`Add one schema (100 existing)`	`29` ms	`29` ms	`1`
`Add one schema (1000 existing)`	`82` ms	`84` ms	`0.98`
`Add one schema (10000 existing)`	`691` ms	`700` ms	`0.99`
`Update one schema (1 existing)`	`21` ms	`21` ms	`1`
`Update one schema (101 existing)`	`28` ms	`28` ms	`1`
`Update one schema (1001 existing)`	`81` ms	`83` ms	`0.98`
`Update one schema (10001 existing)`	`666` ms	`620` ms	`1.07`
`Cached rebuild (1 existing)`	`12` ms	`12` ms	`1`
`Cached rebuild (101 existing)`	`14` ms	`14` ms	`1`
`Cached rebuild (1001 existing)`	`33` ms	`28` ms	`1.18`
`Cached rebuild (10001 existing)`	`233` ms	`193` ms	`1.21`
`Index 100 schemas`	`124` ms	`127` ms	`0.98`
`Index 1000 schemas`	`1065` ms	`1069` ms	`1.00`
`Index 10000 schemas`	`13988` ms	`13933` ms	`1.00`

This comment was automatically generated by workflow using github-action-benchmark.

Speed up delta calculation on schema insert/update

07f499b

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>

augmentcode bot reviewed Mar 20, 2026

View reviewed changes

cubic-dev-ai bot reviewed Mar 20, 2026

View reviewed changes

Fixes

fbe2b46

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>

github-actions bot reviewed Mar 20, 2026

View reviewed changes

jviotti merged commit 780a4bc into main Mar 20, 2026
6 checks passed

jviotti deleted the optimisations-delta branch March 20, 2026 19:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Speed up delta calculation on schema insert/update#767

Speed up delta calculation on schema insert/update#767
jviotti merged 2 commits intomainfrom
optimisations-delta

jviotti commented Mar 20, 2026

Uh oh!

augmentcode bot commented Mar 20, 2026

Uh oh!

augmentcode bot left a comment

Uh oh!

augmentcode bot Mar 20, 2026

Uh oh!

augmentcode bot Mar 20, 2026

Uh oh!

augmentcode bot Mar 20, 2026

Uh oh!

augmentcode bot Mar 20, 2026

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

github-actions bot left a comment

Uh oh!

github-actions bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jviotti commented Mar 20, 2026

Uh oh!

augmentcode bot commented Mar 20, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Benchmark Index (community)

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Benchmark Index (enterprise)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant