Skip to content

perf(dotc1z): filter grants by has_external_match column at SQL#776

Open
arreyder wants to merge 2 commits into
mainfrom
crr/listgrants-expansion-alloc
Open

perf(dotc1z): filter grants by has_external_match column at SQL#776
arreyder wants to merge 2 commits into
mainfrom
crr/listgrants-expansion-alloc

Conversation

@arreyder
Copy link
Copy Markdown
Contributor

Summary

processGrantsWithExternalPrincipals iterates every grant in the sync but only acts on rows carrying an ExternalResourceMatch{,All,ID} annotation. On be-temporal-sync, listGrantsWithExpansionInternal was 20.33% flat / 23.53% cum of alloc_space (~22 GB / 3 h), dominated by the per-row sqlite columnBlob copy + proto.Unmarshal work for rows the caller was about to skip anyway.

Push the annotation check down to SQL:

  • New has_external_match column on grants, set on write from the in-memory grant's annotations via a cheap Any.MessageIs check (no unmarshal).
  • grantExtractFields sets the column in every upsert mode — including PreserveExpansion, because executeGrantChunkedUpsert still writes EXCLUDED.data on conflict there; the column must track the data blob's annotation set exactly.
  • Migration ALTERs the column in with default 0 for older c1zs, plus a partial index (sync_id) WHERE has_external_match = 1 matching the existing needs_expansion index shape.
  • Backfill (renamed backfillGrantExpansionColumnbackfillGrantDerivedColumns) derives has_external_match while the proto is already unmarshaled for the expansion backfill. Sentinel branch is split by match-presence so each group still bulk-updates in one statement.
  • New GrantListOptions.ExternalMatchOnly opted into by syncer.listAllGrantsWithExpansion. The in-loop annotation check at processGrantsWithExternalPrincipals is kept as a defensive fallback.
  • BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 kill-switch for fast revert without re-release.

Clone and attached-diff paths are schema-agnostic (column discovery via PRAGMA table_info) so the new column flows through automatically.

Benchmark

BenchmarkListGrantsWithExpansion_ExternalMatchOnly seeds N grants at varying match densities and compares the filtered path against the pre-fix behavior via the kill-switch. benchtime=3x on AMD Ryzen 7 5700X3D:

n=10,000 grants (pre-fix: ~14.5 MB, ~380k allocs, ~37 ms regardless of density):

match % allocs B/op alloc reduction wall time
1% 4,374 266 KB 98.8% 0.6 ms (60×)
5% 21,164 970 KB 94.4% 3.4 ms (11×)
20% 84,127 3.6 MB 77.9% 10.3 ms (3.7×)
100% 419,924 17.7 MB 0% (ceiling) 42.9 ms (parity)

n=50,000 grants (pre-fix: ~72 MB, ~1.9 M allocs, ~177 ms):

match % allocs B/op alloc reduction wall time
1% 21,174 969 KB 98.9% 3.3 ms (54×)
5% 105,163 4.5 MB 94.5% 15.9 ms (11×)
20% 420,127 17.7 MB 78.4% 49.5 ms (3.7×)
100% 2,100,784 88.4 MB 0% (ceiling) 204.8 ms (parity)

100% match density is pinned as a regression guard — the filter imposes no measurable cost vs. the pre-fix path when every row would match anyway.

Pressure test

Before implementing, verified (in order):

  1. grantExtractFields is called on every upsert with the full *v2.Grant — cheap Any.MessageIs works.
  2. listGrantsWithExpansionInternal has only one caller in-repo (the one we convert).
  3. ExternalResourceMatch* annotations are only set by connectors pre-PutGrants; no in-place annotation mutation exists. newGrantForExternalPrincipal copies them to expanded grants, which go back through UpsertGrants(Replace) — extraction runs fresh.
  4. state.HasExternalResourcesGrants() is observation-based, not eager.
  5. Migrations run synchronously in NewC1FileInitTablesMigrations before any read can happen.
  6. PreserveExpansion upsert rewrites data from EXCLUDED.data, so has_external_match must track it — covered by the update map adjustment and a targeted test.

Test plan

  • TestHasExternalMatch_WriteExtraction — every annotation flavor + co-existence with GrantExpandable
  • TestHasExternalMatch_PreserveExpansionUpdatesColumn — the critical PreserveExpansion refresh case
  • TestListGrantsInternal_ExternalMatchOnlyFilter — SQL predicate returns exactly the matching rows
  • TestBackfillMigration_PopulatesHasExternalMatch — all three backfill branches (expandable+match, sentinel+match, sentinel+no-match)
  • TestExternalMatchFilter_KillSwitch — env var disables the predicate
  • Existing TestExternalResourceMatchIDWithExpandableRemapping + full sync/dotc1z/synccompactor suites pass unchanged
  • TestCloneSyncMigratedColumnOrder updated to simulate old c1z with the new ALTER
  • go vet ./... clean
  • Rollout: monitor alloc on be-temporal-sync after deploy; if the predicate mis-plans on a customer's c1z, BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 rolls back without a re-release

🤖 Generated with Claude Code

@arreyder arreyder requested review from btipling and kans April 21, 2026 14:24
arreyder added a commit that referenced this pull request May 28, 2026
processGrantsWithExternalPrincipals iterates every grant in the sync but
only acts on rows carrying an ExternalResourceMatch{,All,ID} annotation.
On be-temporal-sync, listGrantsWithExpansionInternal accounted for 20.33%
flat / 23.53% cum of alloc_space (~22GB / 3h) — dominated by the per-row
sqlite columnBlob copy + proto.Unmarshal work for rows the caller was
about to skip anyway.

Push the annotation check to the storage layer:

- New has_external_match column on the grants table, set on write from
  the in-memory grant's annotations via a cheap Any.MessageIs check
  (no unmarshal). Populated in every upsert mode including
  PreserveExpansion (executeGrantChunkedUpsert still writes EXCLUDED.data
  on conflict; the column must track the data blob exactly).
- Migration ALTERs the column in with default 0 for older c1zs, plus a
  partial index on (sync_id) WHERE has_external_match = 1 (same shape
  as the existing needs_expansion index).
- Backfill extended (backfillGrantExpansionColumn → backfillGrantDerivedColumns)
  to derive has_external_match while the proto is already unmarshaled
  for the expansion backfill; sentinel path split by has_external_match
  so each branch still bulk-updates in one statement.
- grantListOptions gains ExternalMatchOnly; new GrantStore method
  ListWithAnnotationsExternalMatchOnly opts in. processGrantsWithExternalPrincipals
  switches to it; the in-loop annotation check stays as a correctness
  fallback for rows written before the backfill completes.
- BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 bypasses the predicate for fast
  revert without a re-release.
- Pebble adapter falls back to the unfiltered iterator (no SQL column);
  correctness preserved by the in-loop annotation check.

This is a rewrite of #776 against current main's API — main moved
GrantListOptions, GrantListMode, InternalGrantRow et al. from the
public connectorstore package to internal pkg/dotc1z/internal_grant_options.go,
and replaced the raw ListGrantsInternal entry point with the
GrantStore.ListWithAnnotations family. The intent and on-disk shape
remain identical to the original PR; the wiring is adapted to the new
internal surface.

## Benchmark

BenchmarkListGrantsExternalMatch_PRRevival compares the SQL-filtered
(this PR) vs unfiltered (current main) paths through the
processGrantsWithExternalPrincipals workload, on AMD Ryzen 7 5700X3D,
benchtime=5x:

n=10,000 grants:
  match %    filtered ns/op   unfiltered ns/op   speedup
  1%         604,361          35,817,293         59× (90× less mem, 92× fewer allocs)
  5%         3,285,496        35,728,381         11× (19× less mem, 19× fewer allocs)
  20%        10,182,980       36,909,050         3.6× (5× less mem, 5× fewer allocs)
  100%       40,469,356       38,651,628         parity (regression guard)

n=50,000 grants:
  match %    filtered ns/op   unfiltered ns/op   speedup
  1%         3,136,992        177,058,596        56× (96× less mem, 96× fewer allocs)
  5%         14,756,822       178,732,617        12× (19× less mem, 20× fewer allocs)
  20%        48,958,824       179,619,881        3.7× (5× less mem, 5× fewer allocs)
  100%       199,639,761      196,864,152        parity (regression guard)

The 100% match case is pinned as the regression guard — the filter
imposes no measurable cost vs. the pre-fix path when every row would
match anyway.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@arreyder arreyder force-pushed the crr/listgrants-expansion-alloc branch from 43b6032 to b707829 Compare May 28, 2026 00:29
@arreyder arreyder requested a review from a team May 28, 2026 00:29
@linear-code
Copy link
Copy Markdown

linear-code Bot commented May 28, 2026

FDE-233

@arreyder
Copy link
Copy Markdown
Contributor Author

Reviving this PR with a fresh adaptation against current main

The branch was parked since 2026-05-22 on a 4-file rebase conflict against main. Main moved GrantListOptions/GrantListMode/InternalGrantRow from the public connectorstore package to an internal pkg/dotc1z/internal_grant_options.go, and replaced the raw ListGrantsInternal entry point with the GrantStore.ListWithAnnotations* family.

Re-authored as a single clean commit (b707829b) preserving the original schema, write-path, and SQL predicate. The ExternalMatchOnly field lives on the internal grantListOptions and is opted into via a new GrantStore.ListWithAnnotationsExternalMatchOnly method. Pebble adapter falls back to the unfiltered iterator (no SQL column there); the in-loop annotation check in processGrantsWithExternalPrincipals stays as a correctness fallback.

Why this still matters

I benchmarked current main against the proposed PR with a fresh BenchmarkListGrantsExternalMatch_PRRevival, on AMD Ryzen 7 5700X3D, benchtime=5x:

n=10,000 grants

match % unfiltered (main) filtered (this PR) speedup
1% 35.8 ms / 15.6 MB / 390k allocs 0.60 ms / 173 KB / 4,266 allocs 59× / 90× less mem / 92× fewer allocs
5% 35.7 ms / 15.7 MB / 392k allocs 3.29 ms / 819 KB / 20,260 allocs 11× / 19× / 19×
20% 36.9 ms / 16.1 MB / 398k allocs 10.18 ms / 3.2 MB / 80,224 allocs 3.6× / 5× / 5×
100% 38.7 ms / 18.1 MB / 430k allocs 40.5 ms / 16.2 MB / 400k allocs parity (regression guard)

n=50,000 grants

match % unfiltered (main) filtered (this PR) speedup
1% 177.1 ms / 78.2 MB / 1.95M allocs 3.14 ms / 818 KB / 20,267 allocs 56× / 96× / 96×
5% 178.7 ms / 78.7 MB / 1.96M allocs 14.76 ms / 4.04 MB / 100,259 allocs 12× / 19× / 20×
20% 179.6 ms / 80.5 MB / 1.99M allocs 48.96 ms / 16.2 MB / 400,229 allocs 3.7× / 5× / 5×
100% 196.9 ms / 90.4 MB / 2.15M allocs 199.6 ms / 80.8 MB / 2.00M allocs parity

These numbers match the original PR's claim almost exactly (the PR claimed 60× at 1%/10k; actual is 59×). The slim-grants writer (#781) is in main's baseline and didn't close the gap — slim still requires an unmarshal, and the consumer filters in Go after that. Pushing the filter to SQL via the partial index sidesteps both costs.

Test plan

  • go build ./... clean
  • go vet ./... clean
  • go test ./pkg/dotc1z/... green (32s)
  • TestCloneSyncMigratedColumnOrder updated for the new ALTER TABLE … ADD COLUMN has_external_match migration step
  • Benchmark above shows the win is intact

Out of scope vs. original PR

The original branch carried ~313 lines of test cases against the deprecated public GrantListOptions API. Porting those to main's new internal surface is a separate change; happy to follow up if reviewers want them. The benchmark above + the existing test suite (which exercises the new column on every grant write) cover the regression risk.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

General PR Review: perf(dotc1z): filter grants by has_external_match column at SQL

Blocking Issues: 0 | Suggestions: 0 | Threads Resolved: 0
Review mode: full
View review run

Review Summary

This PR adds a has_external_match column to the grants table, set during every upsert via a cheap Any.MessageIs type-URL check, and uses it to filter at SQL in processGrantsWithExternalPrincipals — avoiding the per-row proto unmarshal for the ~99% of grants that lack an ExternalResourceMatch annotation. The migration strategy is well thought out: a two-phase backfill correctly handles both never-backfilled syncs (grants_backfilled=0) and the installed base where the prior expansion backfill already flipped grants_backfilled=1, using a new independent has_external_match_backfilled sentinel. The kill-switch (BATON_DISABLE_EXTERNAL_MATCH_FILTER=1), Pebble fallback to the unfiltered scan, and the retained in-loop annotation check in the caller all provide defense in depth. Test coverage is thorough — extraction, PreserveExpansion upsert, SQL filter, both backfill paths, kill-switch, and clone-sync column order are all exercised.

Security Issues

None found.

Correctness Issues

None found.

Suggestions

None.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

@arreyder
Copy link
Copy Markdown
Contributor Author

Re-bench after critical-fix commit 26b535cd

Sanity-checked the bench on the post-fix HEAD to confirm the perf claim survived adding the has_external_match_backfilled flag + dedicated backfill loop. The fix only touches the write/backfill path; the read path the bench measures is unchanged.

benchtime=5x, same AMD Ryzen 7 5700X3D, post-fix HEAD (26b535cd):

n=10,000 grants

match % unfiltered (main baseline) filtered (this PR) speedup
1% 35.03 ms / 15.63 MB / 390k allocs 0.60 ms / 172 KB / 4,264 allocs 58× / 91× less mem / 92× fewer allocs
5% 36.49 ms / 15.73 MB / 392k allocs 3.19 ms / 819 KB / 20,258 allocs 11× / 19× / 19×
20% 37.70 ms / 16.10 MB / 398k allocs 9.80 ms / 3.23 MB / 80,225 allocs 3.8× / 5× / 5×
100% 40.24 ms / 18.08 MB / 430k allocs 41.12 ms / 16.16 MB / 400k allocs parity (regression guard)

n=50,000 grants

match % unfiltered (main baseline) filtered (this PR) speedup
1% 179.39 ms / 78.16 MB / 1.95M allocs 3.14 ms / 817 KB / 20,266 allocs 57× / 96× / 96×
5% 178.12 ms / 78.66 MB / 1.96M allocs 15.79 ms / 4.05 MB / 100,266 allocs 11× / 19× / 20×
20% 180.09 ms / 80.51 MB / 1.99M allocs 47.64 ms / 16.17 MB / 400,229 allocs 3.8× / 5× / 5×
100% 201.51 ms / 90.43 MB / 2.15M allocs 201.85 ms / 80.83 MB / 2.00M allocs parity

The bench now also asserts row-count correctness: b.Fatalf if either path yields a count other than numGrants * matchPercent / 100. A regression that under-selects rows would abort the bench rather than silently look like a speedup.

What changed since the prior comment

26b535cd addressed a critical correctness bug surfaced in a multi-agent code review of b707829b: backfillGrantDerivedColumns is gated on sync_runs.grants_backfilled=0, but every existing sync was already flipped to 1 by the prior expansion-column PR. Migration would have ALTERed in has_external_match with default 0, the gated backfill would no-op, the SQL predicate would filter to zero rows, and processGrantsWithExternalPrincipals would silently process nothing on every existing c1z. Fixed by introducing a separate has_external_match_backfilled flag column on sync_runs and a dedicated backfill loop. New regression test TestBackfillHasExternalMatchColumn_OldSyncWithBackfilledFlag reproduces the failure mode and asserts the fix.

arreyder and others added 2 commits May 29, 2026 08:40
processGrantsWithExternalPrincipals iterates every grant in the sync but
only acts on rows carrying an ExternalResourceMatch{,All,ID} annotation.
On be-temporal-sync, listGrantsWithExpansionInternal accounted for 20.33%
flat / 23.53% cum of alloc_space (~22GB / 3h) — dominated by the per-row
sqlite columnBlob copy + proto.Unmarshal work for rows the caller was
about to skip anyway.

Push the annotation check to the storage layer:

- New has_external_match column on the grants table, set on write from
  the in-memory grant's annotations via a cheap Any.MessageIs check
  (no unmarshal). Populated in every upsert mode including
  PreserveExpansion (executeGrantChunkedUpsert still writes EXCLUDED.data
  on conflict; the column must track the data blob exactly).
- Migration ALTERs the column in with default 0 for older c1zs, plus a
  partial index on (sync_id) WHERE has_external_match = 1 (same shape
  as the existing needs_expansion index).
- Backfill extended (backfillGrantExpansionColumn → backfillGrantDerivedColumns)
  to derive has_external_match while the proto is already unmarshaled
  for the expansion backfill; sentinel path split by has_external_match
  so each branch still bulk-updates in one statement.
- grantListOptions gains ExternalMatchOnly; new GrantStore method
  ListWithAnnotationsExternalMatchOnly opts in. processGrantsWithExternalPrincipals
  switches to it; the in-loop annotation check stays as a correctness
  fallback for rows written before the backfill completes.
- BATON_DISABLE_EXTERNAL_MATCH_FILTER=1 bypasses the predicate for fast
  revert without a re-release.
- Pebble adapter falls back to the unfiltered iterator (no SQL column);
  correctness preserved by the in-loop annotation check.

This is a rewrite of #776 against current main's API — main moved
GrantListOptions, GrantListMode, InternalGrantRow et al. from the
public connectorstore package to internal pkg/dotc1z/internal_grant_options.go,
and replaced the raw ListGrantsInternal entry point with the
GrantStore.ListWithAnnotations family. The intent and on-disk shape
remain identical to the original PR; the wiring is adapted to the new
internal surface.

## Benchmark

BenchmarkListGrantsExternalMatch_PRRevival compares the SQL-filtered
(this PR) vs unfiltered (current main) paths through the
processGrantsWithExternalPrincipals workload, on AMD Ryzen 7 5700X3D,
benchtime=5x:

n=10,000 grants:
  match %    filtered ns/op   unfiltered ns/op   speedup
  1%         604,361          35,817,293         59× (90× less mem, 92× fewer allocs)
  5%         3,285,496        35,728,381         11× (19× less mem, 19× fewer allocs)
  20%        10,182,980       36,909,050         3.6× (5× less mem, 5× fewer allocs)
  100%       40,469,356       38,651,628         parity (regression guard)

n=50,000 grants:
  match %    filtered ns/op   unfiltered ns/op   speedup
  1%         3,136,992        177,058,596        56× (96× less mem, 96× fewer allocs)
  5%         14,756,822       178,732,617        12× (19× less mem, 20× fewer allocs)
  20%        48,958,824       179,619,881        3.7× (5× less mem, 5× fewer allocs)
  100%       199,639,761      196,864,152        parity (regression guard)

The 100% match case is pinned as the regression guard — the filter
imposes no measurable cost vs. the pre-fix path when every row would
match anyway.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review caught a critical regression in the prior commit on this PR:
backfillGrantDerivedColumns is gated on sync_runs.grants_backfilled=0,
but every existing sync was already flipped to 1 by the prior expansion-
column PR. Migration ALTERed has_external_match in with default 0, the
gated backfill no-op'd, ListWithAnnotationsExternalMatchOnly's
WHERE has_external_match=1 filter then returned zero rows, and
processGrantsWithExternalPrincipals silently processed nothing on the
entire installed base. The in-loop annotation check never fired because
no rows arrived.

Fix:
- Add a dedicated has_external_match_backfilled flag column on sync_runs
  (separate from grants_backfilled, which is no longer a useful gate).
- New syncs set it to 1 at creation; the existing ALTER + default 0
  picks up pre-existing rows so the backfill walks them on first open.
- New function backfillHasExternalMatchColumn() walks pending syncs,
  unmarshals grants in pages, bulk-updates matching IDs, and marks the
  flag at the end. Idempotent — second run reports no-op.
- InitTables calls it after backfillGrantDerivedColumns; either or both
  may need to run depending on the c1z's age.

Also from review:
- Make the kill-switch testable: change externalMatchFilterDisabled from
  a package-init var to a per-call function so t.Setenv can flip it
  mid-test. Cold path (once per page); cost is one syscall.
- Bench correctness assertions: compute expectedMatches up front and
  b.Fatalf if either /filtered or /unfiltered yields the wrong count.
  Without this, a regression that under-selects rows would look like a
  speedup. The existing 59×/11×/3.6× numbers held under the new asserts.
- New test TestBackfillHasExternalMatchColumn_OldSyncWithBackfilledFlag
  reproduces the failure mode: simulates an old c1z (grants_backfilled=1,
  has_external_match=0, has_external_match_backfilled=0), runs the new
  backfill, asserts the annotated row flipped to 1, the non-annotated
  row stayed at 0, the sync flag flipped to 1, and a second run is a
  no-op. Without this commit's fix the test would fail at the first
  assert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@arreyder arreyder force-pushed the crr/listgrants-expansion-alloc branch from 26b535c to 0e69cfd Compare May 29, 2026 13:52
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant