fix(data-access): drop dangling belongs_to: Brand on BrandSemrushProject by rainer-friederich · Pull Request #1617 · adobe/spacecat-shared

rainer-friederich · 2026-05-22T08:58:36Z

What

The BrandSemrushProject schema declares belongs_to: Brand, but this package does not ship a Brand entity — no Brand model or collection is registered in entity.registry.js. With the reference in place, every BrandSemrushProject instantiation throws Collection BrandCollection not found from base.model.js's eager reference resolution in reference.js#toAccessorConfigs:126. The result: every spacecat-api-service /v2/orgs/:org/brands/:brand/semrush/* route 500s the moment it hits a real DB row.

The bug is invisible to the unit tests because test/unit/util.js#createElectroMocks stubs getCollection() to always return a placeholder — reproducing it requires the real EntityRegistry, which only happens at runtime.
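
A minimal sketch of why a permissive stub hides this class of bug. `StrictRegistry`, `permissiveRegistry`, and `resolveReferences` are hypothetical stand-ins written for illustration; they are not the package's real `EntityRegistry` or test utilities, and only the error message text is taken from the description above.

```javascript
// Hypothetical stand-in for a registry that only knows registered collections.
class StrictRegistry {
  constructor(collections) {
    this.collections = new Map(Object.entries(collections));
  }

  getCollection(name) {
    if (!this.collections.has(name)) {
      throw new Error(`Collection ${name} not found`);
    }
    return this.collections.get(name);
  }
}

// A test double in the spirit of the stub described above: it never throws.
const permissiveRegistry = {
  getCollection: () => ({ placeholder: true }),
};

// Eagerly resolves every referenced collection, as a model constructor might.
function resolveReferences(registry, referencedEntities) {
  return referencedEntities.map((entity) => registry.getCollection(`${entity}Collection`));
}

const strictRegistry = new StrictRegistry({
  BrandSemrushProjectCollection: {},
});

// With the stub, the dangling reference goes unnoticed.
const stubbed = resolveReferences(permissiveRegistry, ['Brand']);

// With a registry that only knows registered entities, resolution throws.
let strictError = null;
try {
  resolveReferences(strictRegistry, ['Brand']);
} catch (e) {
  strictError = e;
}
```

Under these assumptions the stubbed call succeeds while the strict call fails, which is the gap between the unit tests and the runtime behaviour reported here.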

Fix

Replace .addReference('belongs_to', 'Brand') with the two things it produced internally (see schema.builder.js#addReference):

An explicit UUID-validated brandId attribute.
.addAllIndex(['brandId']).

This preserves the BrandSemrushProjectCollection.allByBrandId(brandId) accessor that the semrush handlers depend on (spacecat-api-service: src/support/semrush/handlers/prompts.js). The only thing lost is the navigation accessor getBrand() on the model side, which nothing consumes today (and which could not have worked without a Brand entity in the first place).

When/if a Brand entity is added to this package, the attribute + addAllIndex block can be replaced by .addReference('belongs_to', 'Brand') again, which additionally yields getBrand().
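
As a rough illustration of the "explicit UUID-validated brandId attribute" mentioned in this section, the sketch below shows one way such an attribute definition could look. The `isValidUUID` function here is a simplified stand-in written for this example (the package's real validator is not reproduced), and the attribute object shape is an assumption, not the schema builder's documented input.

```javascript
// Simplified stand-in for a UUID validator; accepts RFC 4122 versions 1-5.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;
const isValidUUID = (value) => typeof value === 'string' && UUID_PATTERN.test(value);

// Hypothetical attribute definition: a required string that must be a UUID.
const brandIdAttribute = {
  type: 'string',
  required: true,
  validate: (value) => isValidUUID(value),
};

// Illustrative values only.
const accepted = brandIdAttribute.validate('123e4567-e89b-12d3-a456-426614174000');
const rejected = brandIdAttribute.validate('not-a-uuid');
```

The point of declaring the attribute by hand is that validation no longer depends on a registered Brand entity.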

How I hit this

Discovered while running the semrush proxy locally against the mysticat-data-service docker stack with real BrandSemrushProject rows seeded from dev. Every GET /v2/orgs/.../semrush/{prompts,projects} 500'd with a stack trace ending in EntityRegistry.getCollection (entity.registry.js:154). After applying this fix (initially as a node_modules patch), the same routes returned 200 with the seeded rows enriched correctly.

Test plan

npm test -w packages/spacecat-shared-data-access — 2050 passing, coverage thresholds met
npm run lint -w packages/spacecat-shared-data-access — clean
Updated the schema test's describe block label from "auto-generated by belongs_to Brand" to "auto-generated by addAllIndex(["brandId"])" — same attribute assertions still pass

Related

spacecat-api-service follow-up: bump to the new shared-lib version once this merges, so the semrush proxy works against real rows on dev/stage/prod.
A separate, larger PR could introduce a proper Brand entity here (model, collection, schema, tests, registry registration); that would unlock getBrand() navigation but is independent of this crash fix.

🤖 Generated with Claude Code

The BrandSemrushProject schema declared `belongs_to: Brand`, but this package does not ship a Brand entity (no Brand model or collection is registered in entity.registry.js). Every BrandSemrushProject instantiation therefore threw "Collection BrandCollection not found" from base.model.js's eager reference resolution in reference.js#toAccessorConfigs:126, 500-ing every spacecat-api-service /v2/orgs/:org/brands/:brand/semrush/* route end-to-end.

The bug was invisible to the unit tests because test/unit/util.js#createElectroMocks stubs `getCollection()` to always return a placeholder; reproducing it requires the real EntityRegistry, which only happens at runtime.

Replace the reference with the two things it produced internally (see schema.builder.js#addReference): an explicit UUID-validated `brandId` attribute and an `addAllIndex(['brandId'])`. This preserves the `BrandSemrushProjectCollection.allByBrandId(brandId)` accessor that the semrush handlers depend on (see spacecat-api-service: src/support/semrush/handlers/prompts.js). The only thing lost is the navigation accessor `getBrand()` on the model side, which nothing currently consumes (a `Brand` entity would need to exist for that to work in the first place).

When/if a Brand entity is added to this package, the attribute + addAllIndex block can be replaced by `addReference('belongs_to', 'Brand')` again, which additionally yields `getBrand()`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

MysticatBot

Hey @rainer-friederich,

Strengths

Correct root-cause identification and fix (brand-semrush-project.schema.js:29-54): The addReference('belongs_to', 'Brand') call invokes reference.js#toAccessorConfigs at instantiation time, calling registry.getCollection('BrandCollection') which throws because no Brand entity is registered. This is a real 500-in-production fix, well-traced through the stack.
Validation contract preserved (brand-semrush-project.schema.js:48-51): The brandId attribute retains UUID validation via isValidUUID(value) with required: true, matching exactly what addReference would have generated internally.
Minimal blast radius: Only the schema declaration and its unit test change. No model, collection, or handler code is touched. The allByBrandId accessor that consumers depend on continues to exist.
Forward compatibility documented: The comment explains a clear path back to addReference('belongs_to', 'Brand') once the Brand entity is registered in this package.

Issues

Important (Should Fix)

Index type mismatch - addAllIndex vs the original belongs_to index (brand-semrush-project.schema.js:55)

The comment block claims the fix "mirrors what addReference('belongs_to', 'Brand') would have produced internally," but the index structures are fundamentally different:

addReference('belongs_to', 'Brand') produces a GSI with pk: { composite: ['brandId'] } and sk: { composite: ['updatedAt'] }, type BELONGS_TO. This partitions by brandId - efficient O(1) lookups for a single brand's projects.
.addAllIndex(['brandId']) produces a GSI with pk: { template: 'ALL_BRAND_SEMRUSH_PROJECTS' } and sk: { composite: ['brandId'] }, type ALL. This puts all records in a single partition sorted by brandId - effectively a scan filtered by sort-key prefix.

Why it matters: Although both produce an allByBrandId accessor, the query semantics differ. With the ALL index, querying for a specific brand's projects requires scanning the entire partition. With the BELONGS_TO index, it is a direct partition-key lookup. At scale, the ALL-index approach concentrates all records in one partition (DynamoDB's 10 GB partition limit, write throttling), and performance degrades proportionally to total record count.

How to fix: Use .addIndex({ composite: ['brandId'] }, { composite: ['updatedAt'] }) instead of .addAllIndex(['brandId']). This creates an index with brandId as partition key and updatedAt as sort key - semantically equivalent to what belongs_to would have produced, without requiring a registered Brand entity. The allByBrandId and allByBrandIdAndUpdatedAt accessors will both be generated.
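
The difference between the two index shapes can be sketched with plain objects. The shapes below are copied from the description in this review; `accessorNames` is a hypothetical naming rule inferred from the accessor names quoted in this thread, not the library's actual generator.

```javascript
// Index shapes as described in the review; plain objects, not the library's internal types.
const allIndex = {
  type: 'ALL',
  pk: { template: 'ALL_BRAND_SEMRUSH_PROJECTS' },
  sk: { composite: ['brandId'] },
};

const partitionedIndex = {
  type: 'BELONGS_TO',
  pk: { composite: ['brandId'] },
  sk: { composite: ['updatedAt'] },
};

const capitalize = (s) => s.charAt(0).toUpperCase() + s.slice(1);

// Hypothetical naming rule: one "allBy..." accessor per prefix of the composite keys.
function accessorNames(index) {
  const keys = [...(index.pk.composite ?? []), ...(index.sk.composite ?? [])];
  return keys.map((_, i) => `allBy${keys.slice(0, i + 1).map(capitalize).join('And')}`);
}

const allNames = accessorNames(allIndex);
const partitionedNames = accessorNames(partitionedIndex);
```

Under this assumed rule both shapes expose `allByBrandId`, but only the partitioned shape also exposes `allByBrandIdAndUpdatedAt`, and only the partitioned shape uses `brandId` as the partition key.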

Minor (Nice to Have)

Verbose inline comment (brand-semrush-project.schema.js:29-47): 18 lines of comment for a 4-line code change. The bug explanation is thoroughly documented in the PR description and commit message. Trim to 3-4 lines: FK to Brand entity not shipped by this package, mirrors addReference output, replaceable once Brand entity is registered.
Test describe label is inaccurate (brand-semrush-project.schema.test.js:130): The label "auto-generated by addAllIndex" is misleading since addAllIndex does not generate the brandId attribute - it only creates an index referencing the already-declared attribute. Something like brandId attribute (explicit FK to brands table) would be accurate regardless of index implementation.

Recommendations

Add a test that verifies the allByBrandId accessor is actually produced by the schema build. The current test validates the attribute contract but does not assert the index/accessor combination, which is what downstream consumers depend on.

Assessment

Ready to merge? No - with one fix needed.

The Important finding (index type mismatch) changes query performance characteristics compared to the original belongs_to behavior. The fix is a one-line change from .addAllIndex(['brandId']) to .addIndex({ composite: ['brandId'] }, { composite: ['updatedAt'] }). The minor issues are cosmetic and should not block merge.

Next Steps

Replace .addAllIndex(['brandId']) with .addIndex({ composite: ['brandId'] }, { composite: ['updatedAt'] }) to produce the correct GSI structure.
Optionally trim the inline comment and fix the test label.

Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 0m 41s | Cost: $3.93 | Commit: 3c57c08aecf8b6e3bcbeec6f4b8a17e6820c88b7

aliciadriani

PR Review

Author: @rainer-friederich
Scope: 2 files, +27/-5, schema fix only.

Summary

Minimal, correct crash fix. BrandSemrushProject declared belongs_to: Brand but spacecat-shared-data-access has no Brand entity registered — so every instantiation threw Collection BrandCollection not found from the schema builder's eager reference resolver, 500ing every /semrush/* route that touched a real DB row.

The fix manually expands the schema to add the brandId attribute plus addAllIndex(['brandId']), preserving the allByBrandId accessor and findBySlice that the semrush handlers depend on. The only thing dropped is getBrand(), which was always broken (no Brand entity to resolve against) and — verified — has no consumers anywhere in the org.

Verification performed

✅ CI: all 4 checks green (CLA, Semantic Release, Kodiak, Test)
✅ HEAD verified: local clone at 3c57c08 matches PR head
✅ Test suite: 2050 passed, 0 failed, 0 skipped
✅ Schema instantiation: exercised by "initializes the BrandSemrushProject instance correctly" — the Collection BrandCollection not found crash path is gone
✅ getBrand() consumer search: org-wide search across adobe and adobe-rnd returned zero callers of BrandSemrushProject.getBrand(). All 100 getBrand-named hits were unrelated (getBrandForOrgSite, getBrandById, getBrandSlug, getBrandConfig, etc.). index.d.ts in this PR also explicitly omits getBrand() from the TypeScript surface with an explanatory comment, so no typed consumer could have depended on it.

Must Fix

None.

Should Fix

None.

Nits / clarifications worth recording

Two small clarifications on how the schema actually behaves — not blocking, but worth noting in the PR record so future readers don't have to re-derive them:

Index-type equivalence is not exact. addReference('belongs_to', 'Brand') would have produced a BELONGS_TO-type index (pk: brandId, sk: updatedAt). The PR uses addAllIndex(['brandId']) which is an ALL-type index (pk: fixed entity template, sk: brandId). Structurally different.

This has no runtime effect here because BrandSemrushProject is PostgREST-backed (mysticat-data-service), and the PostgREST query path in queryByIndexKeys applies all key fields as SQL WHERE filters regardless of index type. So the manual expansion is functionally equivalent for this entity, just not structurally identical to what the macro would emit. If anyone later migrates this entity to DynamoDB, the index shape would need a second look.

findBySlice is a real indexed lookup, not a partition scan + filter. The PostgREST path in #queryPage calls #applyKeyFilters(query, keys), which maps all three keys to SQL WHERE conditions. The actual query is:

   SELECT * FROM brand_to_semrush_projects
   WHERE brand_id = ? AND semrush_location_id = ? AND language = ?
   LIMIT 1

The DB's uq_brand_to_semrush_slice UNIQUE(brand_id, semrush_location_id, language) covers all three columns, so this is a proper indexed key lookup. No in-memory filtering or partition scan involved.
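
The key-to-filter mapping described above can be sketched as follows. `buildKeyLookup` and `toSnakeCase` are hypothetical helpers written for this illustration; the package's private `#applyKeyFilters` is not reproduced, the camelCase-to-snake_case column mapping is an assumption, and the sample key values are invented.

```javascript
// Hypothetical helper: converts camelCase attribute names to snake_case column names.
const toSnakeCase = (key) => key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);

// Builds a parameterized single-row lookup where every key becomes a WHERE condition.
function buildKeyLookup(table, keys) {
  const columns = Object.keys(keys);
  const where = columns.map((key, i) => `${toSnakeCase(key)} = $${i + 1}`).join(' AND ');
  return {
    text: `SELECT * FROM ${table} WHERE ${where} LIMIT 1`,
    values: columns.map((key) => keys[key]),
  };
}

// Illustrative key values only.
const lookup = buildKeyLookup('brand_to_semrush_projects', {
  brandId: '123e4567-e89b-12d3-a456-426614174000',
  semrushLocationId: 2840,
  language: 'en',
});
```

Because every key lands in the WHERE clause, the database can serve the lookup from the unique constraint regardless of how the schema-level index is typed.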

What's good

The long comment block in the schema is the right call. It explains the problem, what was lost, what's preserved, and the exact migration path back to addReference if a Brand entity is ever added. Prevents a future contributor from "fixing" it back to the broken state.
The fix is exactly as minimal as it should be — no scope creep, no unrelated changes.
addAllIndex(['semrushProjectId']) is correctly preserved alongside the new brandId index.
The test label update accurately reflects what actually generates the attribute now.

Verdict: ✅ Safe to merge.

Addresses MysticatBot review on #1617:

1. Swap `.addAllIndex(['brandId'])` for `.addIndex({ composite: ['brandId'] }, { composite: ['updatedAt'] })`. The new shape mirrors what `addReference('belongs_to', 'Brand')` would have produced internally (pk: composite['brandId'], sk: composite['updatedAt']) rather than the ALL-typed template-pk index the previous form generated. Same `allByBrandId` accessor is produced; additionally yields `allByBrandIdAndUpdatedAt` / `findByBrandIdAndUpdatedAt` which the original belongs_to would also produce.
2. Trim the inline comment from 18 lines to 7; the full bug context stays in the PR description / commit message.
3. Relabel the brandId describe block in the schema test from "auto-generated by addAllIndex(['brandId'])" to "explicit FK to brands table": the attribute is declared explicitly; the index only references it.
4. Add composite-key accessor smoke tests: `allByBrandIdAndUpdatedAt` and `findByBrandIdAndUpdatedAt`. The existing single-key tests already cover `allByBrandId` / `findByBrandId`; the new tests catch a regression where the sort-key composite is accidentally dropped without breaking the single-key forms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-22T09:26:19Z

This PR will trigger a patch release when merged.

rainer-friederich · 2026-05-22T09:26:27Z

Thanks for the review. Addressed all three points in 7e84b4ee:

Important — index shape mismatch. Swapped .addAllIndex(['brandId']) for .addIndex({ composite: ['brandId'] }, { composite: ['updatedAt'] }). This mirrors the (pk: composite['brandId'], sk: composite['updatedAt']) shape that addReference('belongs_to', 'Brand') would have produced internally, rather than the ALL-typed (pk: template, sk: composite['brandId']) shape the previous form generated. Same allByBrandId accessor is produced; additionally yields allByBrandIdAndUpdatedAt / findByBrandIdAndUpdatedAt which the original belongs_to would also produce.

One caveat on the "ALL index = partition concentration" concern: v3 of this package is PostgREST/Postgres-backed (DynamoDB v2 is retired), so the GSI shape doesn't translate to a partition concept at the storage layer — the actual hot-path index lives in the brand_to_semrush_projects SQL migration. But the structural argument for mirroring the belongs_to output exactly still stands: addIndex is the more honest expression of intent, and it keeps the door open for a future DynamoDB-backed consumer without a schema rewrite.

Minor — verbose comment. Trimmed from 18 lines to 7. Full bug context kept in the PR description and commit message.

Minor — test describe label. Relabeled brandId attribute (auto-generated by addAllIndex(["brandId"])) → brandId attribute (explicit FK to brands table). Correct now; the attribute is declared explicitly, the index only references it.

Recommendation — accessor test. The existing auto-generated index accessors block in brand-semrush-project.collection.test.js already covers allByBrandId / findByBrandId / allBySemrushProjectId / findBySemrushProjectId (existence + delegation). I added a new assertion for allByBrandIdAndUpdatedAt / findByBrandIdAndUpdatedAt so a future regression that drops the sort-key composite is caught.
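
A minimal sketch of the kind of existence check described above, assuming nothing about the real test file. The accessor names are the ones quoted in this thread; `missingAccessors` and both collection objects are hypothetical stubs, not the package's collection class or its test helpers.

```javascript
// Accessor names quoted in the thread.
const expectedAccessors = [
  'allByBrandId',
  'findByBrandId',
  'allByBrandIdAndUpdatedAt',
  'findByBrandIdAndUpdatedAt',
];

// Returns the accessor names that are not functions on the given collection.
function missingAccessors(collection, names) {
  return names.filter((name) => typeof collection[name] !== 'function');
}

const noop = async () => [];

// Hypothetical collection built from an index that includes the updatedAt sort key.
const withSortKey = Object.fromEntries(expectedAccessors.map((name) => [name, noop]));

// Hypothetical collection where the sort-key composite was dropped.
const withoutSortKey = { allByBrandId: noop, findByBrandId: noop };

const healthy = missingAccessors(withSortKey, expectedAccessors);
const regressed = missingAccessors(withoutSortKey, expectedAccessors);
```

In this sketch the single-key accessors survive the regression, so only a check on the composite names would flag it.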

2051 tests passing, lint clean. Ready for re-review.

MysticatBot

Hey @rainer-friederich,

Strengths

All prior review findings addressed - the index shape, comment verbosity, test label, and accessor coverage were all resolved cleanly in a single follow-up commit.
Index shape now mirrors belongs_to semantics (brand-semrush-project.schema.js:34): .addIndex({ composite: ['brandId'] }, { composite: ['updatedAt'] }) produces the correct (pk: brandId, sk: updatedAt) GSI structure, matching what addReference('belongs_to', 'Brand') would generate internally. This also yields the allByBrandIdAndUpdatedAt / findByBrandIdAndUpdatedAt accessors for free.
Comment trimmed to essential context (brand-semrush-project.schema.js:21-27): 7 lines covering the why, the workaround, and the path back to addReference - exactly right for a future reader.
Regression coverage for composite-key accessors (brand-semrush-project.collection.test.js:55-63): The new assertion catches a future regression where the updatedAt sort-key composite is accidentally dropped.
Correct root-cause fix: Replacing addReference('belongs_to', 'Brand') with an explicit brandId attribute + index eliminates the Collection BrandCollection not found crash at model instantiation while preserving the accessor surface that downstream handlers depend on.

Assessment

Ready to merge? Yes

The crash fix is correct, minimal, and all prior review feedback has been addressed. The 2051 tests pass, lint is clean, and the schema now accurately reflects intent without depending on a non-existent Brand entity.

Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 3m 22s | Cost: $0.75 | Commit: 7e84b4eea2ea6a945cc72191edb32892576dcc5a

## [@adobe/spacecat-shared-data-access-v3.70.1](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-data-access-v3.70.0...@adobe/spacecat-shared-data-access-v3.70.1) (2026-05-22)

### Bug Fixes

* **data-access:** drop dangling belongs_to: Brand on BrandSemrushProject ([#1617](#1617)) ([b52d815](b52d815))

solaris007 · 2026-05-22T10:10:40Z

🎉 This PR is included in version @adobe/spacecat-shared-data-access-v3.70.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

Picks up adobe/spacecat-shared#1617, which drops the dangling `belongs_to: Brand` reference on BrandSemrushProject in favour of an explicit (brandId, updatedAt) index.

Without this, every BrandSemrushProject instantiation threw "Collection BrandCollection not found" at runtime because no Brand entity is registered in the data-access package; the failure 500-ed every /v2/orgs/.../semrush/* route the moment a real row was returned from PostgREST. Now the allByBrandId / findBySlice paths work end-to-end against the mysticat-data-service stack.

Verified locally: 179 semrush-related tests passing; GET /v2/orgs/.../semrush/projects returns the seeded brand_to_semrush_projects rows with enrichment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rainer-friederich requested a review from MysticatBot May 22, 2026 09:12

MysticatBot requested changes May 22, 2026

MysticatBot added the ai-reviewed Reviewed by AI label May 22, 2026

MysticatBot requested changes May 22, 2026

aliciadriani self-requested a review May 22, 2026 09:23

aliciadriani previously approved these changes May 22, 2026

rainer-friederich dismissed aliciadriani’s stale review via 7e84b4e May 22, 2026 09:26

rainer-friederich requested a review from MysticatBot May 22, 2026 09:27

MysticatBot approved these changes May 22, 2026

Merge branch 'main' into fix/brand-semrush-project-no-brand-entity

c3f1427

rainer-friederich merged commit b52d815 into main May 22, 2026
5 checks passed

rainer-friederich deleted the fix/brand-semrush-project-no-brand-entity branch May 22, 2026 09:56

solaris007 added the released label May 22, 2026

rainer-friederich mentioned this pull request May 22, 2026

feat(serenity): rename /semrush → /serenity + post-launch hardening adobe/spacecat-api-service#2467

Open

rainer-friederich merged 3 commits into main from fix/brand-semrush-project-no-brand-entity
