perf(dotc1z): filter grants by has_external_match column at SQL by arreyder · Pull Request #776 · ConductorOne/baton-sdk

arreyder · 2026-04-21T14:19:45Z

Summary

processGrantsWithExternalPrincipals iterates every grant in the sync but only acts on rows carrying an ExternalResourceMatch{,All,ID} annotation. On be-temporal-sync, listGrantsWithExpansionInternal was 20.33% flat / 23.53% cum of alloc_space (~22 GB / 3 h), dominated by the per-row sqlite columnBlob copy + proto.Unmarshal work for rows the caller was about to skip anyway.

Push the annotation check down to SQL:

New has_external_match column on grants, set on write from the in-memory grant's annotations via a cheap Any.MessageIs check (no unmarshal).
grantExtractFields sets the column in every upsert mode — including PreserveExpansion, because executeGrantChunkedUpsert still writes EXCLUDED.data on conflict there; the column must track the data blob's annotation set exactly.
Migration ALTERs the column in with default 0 for older c1zs, plus a partial index (sync_id) WHERE has_external_match = 1 matching the existing needs_expansion index shape.
Backfill (renamed backfillGrantExpansionColumn → backfillGrantDerivedColumns) derives has_external_match while the proto is already unmarshaled for the expansion backfill. Sentinel branch is split by match-presence so each group still bulk-updates in one statement.
New GrantListOptions.ExternalMatchOnly opted into by syncer.listAllGrantsWithExpansion. The in-loop annotation check at processGrantsWithExternalPrincipals is kept as a defensive fallback.
BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 kill-switch for fast revert without re-release.

Clone and attached-diff paths are schema-agnostic (column discovery via PRAGMA table_info) so the new column flows through automatically.

Benchmark

BenchmarkListGrantsWithExpansion_ExternalMatchOnly seeds N grants at varying match densities and compares the filtered path against the pre-fix behavior via the kill-switch. benchtime=3x on AMD Ryzen 7 5700X3D:

n=10,000 grants (pre-fix: ~14.5 MB, ~380k allocs, ~37 ms regardless of density):

match %	allocs	B/op	alloc reduction	wall time
1%	4,374	266 KB	98.8%	0.6 ms (60×)
5%	21,164	970 KB	94.4%	3.4 ms (11×)
20%	84,127	3.6 MB	77.9%	10.3 ms (3.7×)
100%	419,924	17.7 MB	0% (ceiling)	42.9 ms (parity)

n=50,000 grants (pre-fix: ~72 MB, ~1.9 M allocs, ~177 ms):

match %	allocs	B/op	alloc reduction	wall time
1%	21,174	969 KB	98.9%	3.3 ms (54×)
5%	105,163	4.5 MB	94.5%	15.9 ms (11×)
20%	420,127	17.7 MB	78.4%	49.5 ms (3.7×)
100%	2,100,784	88.4 MB	0% (ceiling)	204.8 ms (parity)

100% match density is pinned as a regression guard — the filter imposes no measurable cost vs. the pre-fix path when every row would match anyway.

Pressure test

Before implementing, verified (in order):

grantExtractFields is called on every upsert with the full *v2.Grant — cheap Any.MessageIs works.
listGrantsWithExpansionInternal has only one caller in-repo (the one we convert).
ExternalResourceMatch* annotations are only set by connectors pre-PutGrants; no in-place annotation mutation exists. newGrantForExternalPrincipal copies them to expanded grants, which go back through UpsertGrants(Replace) — extraction runs fresh.
state.HasExternalResourcesGrants() is observation-based, not eager.
Migrations run synchronously in NewC1File → InitTables → Migrations before any read can happen.
PreserveExpansion upsert rewrites data from EXCLUDED.data, so has_external_match must track it — covered by the update map adjustment and a targeted test.

Test plan

TestHasExternalMatch_WriteExtraction — every annotation flavor + co-existence with GrantExpandable
TestHasExternalMatch_PreserveExpansionUpdatesColumn — the critical PreserveExpansion refresh case
TestListGrantsInternal_ExternalMatchOnlyFilter — SQL predicate returns exactly the matching rows
TestBackfillMigration_PopulatesHasExternalMatch — all three backfill branches (expandable+match, sentinel+match, sentinel+no-match)
TestExternalMatchFilter_KillSwitch — env var disables the predicate
Existing TestExternalResourceMatchIDWithExpandableRemapping + full sync/dotc1z/synccompactor suites pass unchanged
TestCloneSyncMigratedColumnOrder updated to simulate old c1z with the new ALTER
go vet ./... clean
Rollout: monitor alloc on be-temporal-sync after deploy; if the predicate mis-plans on a customer's c1z, BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 rolls back without a re-release

🤖 Generated with Claude Code

processGrantsWithExternalPrincipals iterates every grant in the sync but only acts on rows carrying an ExternalResourceMatch{,All,ID} annotation. On be-temporal-sync, listGrantsWithExpansionInternal accounted for 20.33% flat / 23.53% cum of alloc_space (~22GB / 3h) — dominated by the per-row sqlite columnBlob copy + proto.Unmarshal work for rows the caller was about to skip anyway. Push the annotation check to the storage layer: - New has_external_match column on the grants table, set on write from the in-memory grant's annotations via a cheap Any.MessageIs check (no unmarshal). Populated in every upsert mode including PreserveExpansion (executeGrantChunkedUpsert still writes EXCLUDED.data on conflict; the column must track the data blob exactly). - Migration ALTERs the column in with default 0 for older c1zs, plus a partial index on (sync_id) WHERE has_external_match = 1 (same shape as the existing needs_expansion index). - Backfill extended (backfillGrantExpansionColumn → backfillGrantDerivedColumns) to derive has_external_match while the proto is already unmarshaled for the expansion backfill; sentinel path split by has_external_match so each branch still bulk-updates in one statement. - grantListOptions gains ExternalMatchOnly; new GrantStore method ListWithAnnotationsExternalMatchOnly opts in. processGrantsWithExternalPrincipals switches to it; the in-loop annotation check stays as a correctness fallback for rows written before the backfill completes. - BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 bypasses the predicate for fast revert without a re-release. - Pebble adapter falls back to the unfiltered iterator (no SQL column); correctness preserved by the in-loop annotation check. This is a rewrite of #776 against current main's API — main moved GrantListOptions, GrantListMode, InternalGrantRow et al. from the public connectorstore package to internal pkg/dotc1z/internal_grant_options.go, and replaced the raw ListGrantsInternal entry point with the GrantStore.ListWithAnnotations family. The intent and on-disk shape remain identical to the original PR; the wiring is adapted to the new internal surface. ## Benchmark BenchmarkListGrantsExternalMatch_PRRevival compares the SQL-filtered (this PR) vs unfiltered (current main) paths through the processGrantsWithExternalPrincipals workload, on AMD Ryzen 7 5700X3D, benchtime=5x: n=10,000 grants: match % filtered ns/op unfiltered ns/op speedup 1% 604,361 35,817,293 59× (90× less mem, 92× fewer allocs) 5% 3,285,496 35,728,381 11× (19× less mem, 19× fewer allocs) 20% 10,182,980 36,909,050 3.6× (5× less mem, 5× fewer allocs) 100% 40,469,356 38,651,628 parity (regression guard) n=50,000 grants: match % filtered ns/op unfiltered ns/op speedup 1% 3,136,992 177,058,596 56× (96× less mem, 96× fewer allocs) 5% 14,756,822 178,732,617 12× (19× less mem, 20× fewer allocs) 20% 48,958,824 179,619,881 3.7× (5× less mem, 5× fewer allocs) 100% 199,639,761 196,864,152 parity (regression guard) The 100% match case is pinned as the regression guard — the filter imposes no measurable cost vs. the pre-fix path when every row would match anyway. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

linear-code · 2026-05-28T00:29:05Z

FDE-233

arreyder · 2026-05-28T00:30:26Z

Reviving this PR with a fresh adaptation against current main

The branch was parked since 2026-05-22 on a 4-file rebase conflict against main. Main moved GrantListOptions/GrantListMode/InternalGrantRow from the public connectorstore package to an internal pkg/dotc1z/internal_grant_options.go, and replaced the raw ListGrantsInternal entry point with the GrantStore.ListWithAnnotations* family.

Re-authored as a single clean commit (b707829b) preserving the original schema, write-path, and SQL predicate. The ExternalMatchOnly field lives on the internal grantListOptions and is opted into via a new GrantStore.ListWithAnnotationsExternalMatchOnly method. Pebble adapter falls back to the unfiltered iterator (no SQL column there); the in-loop annotation check in processGrantsWithExternalPrincipals stays as a correctness fallback.

Why this still matters

I benchmarked current main against the proposed PR with a fresh BenchmarkListGrantsExternalMatch_PRRevival, on AMD Ryzen 7 5700X3D, benchtime=5x:

n=10,000 grants

match %	unfiltered (main)	filtered (this PR)	speedup
1%	35.8 ms / 15.6 MB / 390k allocs	0.60 ms / 173 KB / 4,266 allocs	59× / 90× less mem / 92× fewer allocs
5%	35.7 ms / 15.7 MB / 392k allocs	3.29 ms / 819 KB / 20,260 allocs	11× / 19× / 19×
20%	36.9 ms / 16.1 MB / 398k allocs	10.18 ms / 3.2 MB / 80,224 allocs	3.6× / 5× / 5×
100%	38.7 ms / 18.1 MB / 430k allocs	40.5 ms / 16.2 MB / 400k allocs	parity (regression guard)

n=50,000 grants

match %	unfiltered (main)	filtered (this PR)	speedup
1%	177.1 ms / 78.2 MB / 1.95M allocs	3.14 ms / 818 KB / 20,267 allocs	56× / 96× / 96×
5%	178.7 ms / 78.7 MB / 1.96M allocs	14.76 ms / 4.04 MB / 100,259 allocs	12× / 19× / 20×
20%	179.6 ms / 80.5 MB / 1.99M allocs	48.96 ms / 16.2 MB / 400,229 allocs	3.7× / 5× / 5×
100%	196.9 ms / 90.4 MB / 2.15M allocs	199.6 ms / 80.8 MB / 2.00M allocs	parity

These numbers match the original PR's claim almost exactly (the PR claimed 60× at 1%/10k; actual is 59×). The slim-grants writer (#781) is in main's baseline and didn't close the gap — slim still requires an unmarshal, and the consumer filters in Go after that. Pushing the filter to SQL via the partial index sidesteps both costs.

Test plan

go build ./... clean
go vet ./... clean
go test ./pkg/dotc1z/... green (32s)
TestCloneSyncMigratedColumnOrder updated for the new ALTER TABLE … ADD COLUMN has_external_match migration step
Benchmark above shows the win is intact

Out of scope vs. original PR

The original branch carried ~313 lines of test cases against the deprecated public GrantListOptions API. Porting those to main's new internal surface is a separate change; happy to follow up if reviewers want them. The benchmark above + the existing test suite (which exercises the new column on every grant write) cover the regression risk.

github-actions · 2026-05-28T00:31:11Z

General PR Review: perf(dotc1z): filter grants by has_external_match column at SQL

Blocking Issues: 0 | Suggestions: 0 | Threads Resolved: 0
Review mode: full
View review run

Review Summary

This PR adds a has_external_match column to the grants table, set during every upsert via a cheap Any.MessageIs type-URL check, and uses it to filter at SQL in processGrantsWithExternalPrincipals — avoiding the per-row proto unmarshal for the ~99% of grants that lack an ExternalResourceMatch annotation. The migration strategy is well thought out: a two-phase backfill correctly handles both never-backfilled syncs (grants_backfilled=0) and the installed base where the prior expansion backfill already flipped grants_backfilled=1, using a new independent has_external_match_backfilled sentinel. The kill-switch (BATON_DISABLE_EXTERNAL_MATCH_FILTER=1), Pebble fallback to the unfiltered scan, and the retained in-loop annotation check in the caller all provide defense in depth. Test coverage is thorough — extraction, PreserveExpansion upsert, SQL filter, both backfill paths, kill-switch, and clone-sync column order are all exercised.

Security Issues

None found.

Correctness Issues

None found.

Suggestions

None.

github-actions

No blocking issues found.

github-actions

No blocking issues found.

arreyder · 2026-05-28T00:54:42Z

Re-bench after critical-fix commit `26b535cd`

Sanity-checked the bench on the post-fix HEAD to confirm the perf claim survived adding the has_external_match_backfilled flag + dedicated backfill loop. The fix only touches the write/backfill path; the read path the bench measures is unchanged.

benchtime=5x, same AMD Ryzen 7 5700X3D, post-fix HEAD (26b535cd):

n=10,000 grants

match %	unfiltered (main baseline)	filtered (this PR)	speedup
1%	35.03 ms / 15.63 MB / 390k allocs	0.60 ms / 172 KB / 4,264 allocs	58× / 91× less mem / 92× fewer allocs
5%	36.49 ms / 15.73 MB / 392k allocs	3.19 ms / 819 KB / 20,258 allocs	11× / 19× / 19×
20%	37.70 ms / 16.10 MB / 398k allocs	9.80 ms / 3.23 MB / 80,225 allocs	3.8× / 5× / 5×
100%	40.24 ms / 18.08 MB / 430k allocs	41.12 ms / 16.16 MB / 400k allocs	parity (regression guard)

n=50,000 grants

match %	unfiltered (main baseline)	filtered (this PR)	speedup
1%	179.39 ms / 78.16 MB / 1.95M allocs	3.14 ms / 817 KB / 20,266 allocs	57× / 96× / 96×
5%	178.12 ms / 78.66 MB / 1.96M allocs	15.79 ms / 4.05 MB / 100,266 allocs	11× / 19× / 20×
20%	180.09 ms / 80.51 MB / 1.99M allocs	47.64 ms / 16.17 MB / 400,229 allocs	3.8× / 5× / 5×
100%	201.51 ms / 90.43 MB / 2.15M allocs	201.85 ms / 80.83 MB / 2.00M allocs	parity

The bench now also asserts row-count correctness: b.Fatalf if either path yields a count other than numGrants * matchPercent / 100. A regression that under-selects rows would abort the bench rather than silently look like a speedup.

What changed since the prior comment

26b535cd addressed a critical correctness bug surfaced in a multi-agent code review of b707829b: backfillGrantDerivedColumns is gated on sync_runs.grants_backfilled=0, but every existing sync was already flipped to 1 by the prior expansion-column PR. Migration would have ALTERed in has_external_match with default 0, the gated backfill would no-op, the SQL predicate would filter to zero rows, and processGrantsWithExternalPrincipals would silently process nothing on every existing c1z. Fixed by introducing a separate has_external_match_backfilled flag column on sync_runs and a dedicated backfill loop. New regression test TestBackfillHasExternalMatchColumn_OldSyncWithBackfilledFlag reproduces the failure mode and asserts the fix.

processGrantsWithExternalPrincipals iterates every grant in the sync but only acts on rows carrying an ExternalResourceMatch{,All,ID} annotation. On be-temporal-sync, listGrantsWithExpansionInternal accounted for 20.33% flat / 23.53% cum of alloc_space (~22GB / 3h) — dominated by the per-row sqlite columnBlob copy + proto.Unmarshal work for rows the caller was about to skip anyway. Push the annotation check to the storage layer: - New has_external_match column on the grants table, set on write from the in-memory grant's annotations via a cheap Any.MessageIs check (no unmarshal). Populated in every upsert mode including PreserveExpansion (executeGrantChunkedUpsert still writes EXCLUDED.data on conflict; the column must track the data blob exactly). - Migration ALTERs the column in with default 0 for older c1zs, plus a partial index on (sync_id) WHERE has_external_match = 1 (same shape as the existing needs_expansion index). - Backfill extended (backfillGrantExpansionColumn → backfillGrantDerivedColumns) to derive has_external_match while the proto is already unmarshaled for the expansion backfill; sentinel path split by has_external_match so each branch still bulk-updates in one statement. - grantListOptions gains ExternalMatchOnly; new GrantStore method ListWithAnnotationsExternalMatchOnly opts in. processGrantsWithExternalPrincipals switches to it; the in-loop annotation check stays as a correctness fallback for rows written before the backfill completes. - BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 bypasses the predicate for fast revert without a re-release. - Pebble adapter falls back to the unfiltered iterator (no SQL column); correctness preserved by the in-loop annotation check. This is a rewrite of #776 against current main's API — main moved GrantListOptions, GrantListMode, InternalGrantRow et al. from the public connectorstore package to internal pkg/dotc1z/internal_grant_options.go, and replaced the raw ListGrantsInternal entry point with the GrantStore.ListWithAnnotations family. The intent and on-disk shape remain identical to the original PR; the wiring is adapted to the new internal surface. ## Benchmark BenchmarkListGrantsExternalMatch_PRRevival compares the SQL-filtered (this PR) vs unfiltered (current main) paths through the processGrantsWithExternalPrincipals workload, on AMD Ryzen 7 5700X3D, benchtime=5x: n=10,000 grants: match % filtered ns/op unfiltered ns/op speedup 1% 604,361 35,817,293 59× (90× less mem, 92× fewer allocs) 5% 3,285,496 35,728,381 11× (19× less mem, 19× fewer allocs) 20% 10,182,980 36,909,050 3.6× (5× less mem, 5× fewer allocs) 100% 40,469,356 38,651,628 parity (regression guard) n=50,000 grants: match % filtered ns/op unfiltered ns/op speedup 1% 3,136,992 177,058,596 56× (96× less mem, 96× fewer allocs) 5% 14,756,822 178,732,617 12× (19× less mem, 20× fewer allocs) 20% 48,958,824 179,619,881 3.7× (5× less mem, 5× fewer allocs) 100% 199,639,761 196,864,152 parity (regression guard) The 100% match case is pinned as the regression guard — the filter imposes no measurable cost vs. the pre-fix path when every row would match anyway. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Code review caught a critical regression in the prior commit on this PR: backfillGrantDerivedColumns is gated on sync_runs.grants_backfilled=0, but every existing sync was already flipped to 1 by the prior expansion- column PR. Migration ALTERed has_external_match in with default 0, the gated backfill no-op'd, ListWithAnnotationsExternalMatchOnly's WHERE has_external_match=1 filter then returned zero rows, and processGrantsWithExternalPrincipals silently processed nothing on the entire installed base. The in-loop annotation check never fired because no rows arrived. Fix: - Add a dedicated has_external_match_backfilled flag column on sync_runs (separate from grants_backfilled, which is no longer a useful gate). - New syncs set it to 1 at creation; the existing ALTER + default 0 picks up pre-existing rows so the backfill walks them on first open. - New function backfillHasExternalMatchColumn() walks pending syncs, unmarshals grants in pages, bulk-updates matching IDs, and marks the flag at the end. Idempotent — second run reports no-op. - InitTables calls it after backfillGrantDerivedColumns; either or both may need to run depending on the c1z's age. Also from review: - Make the kill-switch testable: change externalMatchFilterDisabled from a package-init var to a per-call function so t.Setenv can flip it mid-test. Cold path (once per page); cost is one syscall. - Bench correctness assertions: compute expectedMatches up front and b.Fatalf if either /filtered or /unfiltered yields the wrong count. Without this, a regression that under-selects rows would look like a speedup. The existing 59×/11×/3.6× numbers held under the new asserts. - New test TestBackfillHasExternalMatchColumn_OldSyncWithBackfilledFlag reproduces the failure mode: simulates an old c1z (grants_backfilled=1, has_external_match=0, has_external_match_backfilled=0), runs the new backfill, asserts the annotated row flipped to 1, the non-annotated row stayed at 0, the sync flag flipped to 1, and a second run is a no-op. Without this commit's fix the test would fail at the first assert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions

No blocking issues found.

arreyder requested review from btipling and kans April 21, 2026 14:24

arreyder force-pushed the crr/listgrants-expansion-alloc branch from 43b6032 to b707829 Compare May 28, 2026 00:29

arreyder requested a review from a team May 28, 2026 00:29

github-actions Bot reviewed May 28, 2026

View reviewed changes

arreyder and others added 2 commits May 29, 2026 08:40

arreyder force-pushed the crr/listgrants-expansion-alloc branch from 26b535c to 0e69cfd Compare May 29, 2026 13:52

github-actions Bot reviewed May 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(dotc1z): filter grants by has_external_match column at SQL#776

perf(dotc1z): filter grants by has_external_match column at SQL#776
arreyder wants to merge 2 commits into
mainfrom
crr/listgrants-expansion-alloc

arreyder commented Apr 21, 2026

Uh oh!

linear-code Bot commented May 28, 2026

Uh oh!

arreyder commented May 28, 2026

Uh oh!

github-actions Bot commented May 28, 2026 •

edited

Loading

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

arreyder commented May 28, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

arreyder commented Apr 21, 2026

Summary

Benchmark

Pressure test

Test plan

Uh oh!

linear-code Bot commented May 28, 2026

Uh oh!

arreyder commented May 28, 2026

Reviving this PR with a fresh adaptation against current main

Why this still matters

Test plan

Out of scope vs. original PR

Uh oh!

github-actions Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

General PR Review: perf(dotc1z): filter grants by has_external_match column at SQL

Review Summary

Security Issues

Correctness Issues

Suggestions

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

arreyder commented May 28, 2026

Re-bench after critical-fix commit 26b535cd

What changed since the prior comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 28, 2026 •

edited

Loading

Re-bench after critical-fix commit `26b535cd`