feat: make the review prompt directive instead of descriptive#8
Merged
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the mongodb-query-index-check review prompt to be more procedural (directive) so the agent consistently compares before/after query shapes and catches common index regressions (especially partial-filter mismatches).
Changes:
- Added an explicit “Common change patterns to flag” checklist (partial filters, sorts vs ESR, shard-key prefix,
$orbranches, negations, regex anchoring, ESR pitfalls). - Added concrete ESR worked examples to anchor expected reasoning and comment specificity.
- Rewrote the “Decide findings” workflow step into a strict per-query procedure (reconstruct before/after, enumerate all indexes, run checklist, then severity-grade).
Comments suppressed due to low confidence (3)
mongodb-query-index-check/prompts/review.md:55
- The guidance on negation operators is too absolute:
$ne/$nin/$not/$exists:falsecan still use an index in MongoDB (often with poor selectivity / high read-return ratio), so saying they "typically force a scan even when the field is indexed" is misleading and may push the reviewer toward incorrect high/critical findings. Consider rephrasing to emphasize reduced selectivity / inability to use compound-key suffixes rather than implying an unavoidable collection scan.
5. **An equality filter became a negation operator** (`$ne`, `$nin`, `$not`, `$exists: false`). Negation operators can't use indexes the same way equality does — they typically force a scan even when the field is indexed. Flag.
mongodb-query-index-check/prompts/review.md:58
- The checklist item about a "range filter moved before an equality filter" is ambiguous and risks being incorrect: MongoDB doesn't care about key order in the query object, and ESR is about index key order + predicate types on those keys. If the intent is "a predicate on an earlier index key changed from equality to range, preventing use of later keys (including sort)", it would be clearer to state that explicitly so the procedure doesn't produce false positives.
7. **A range filter moved before an equality filter in a compound query.** ESR ordering: equality first, then sort, then range. Reversing it neutralises a compound index. Flag.
mongodb-query-index-check/prompts/review.md:110
- Step 2 references
$MONGO_INDEXES_DIR/<collection>.ts, but earlier the prompt says index files are snake_case (e.g.request_queues.ts) and step 1 also uses<collection-snake-case>.ts. This inconsistency could cause the agent to look for the wrong path; suggest changing step 2 to<collection-snake-case>.tsfor consistency.
2. **Read all indexes for the collection from `$MONGO_INDEXES_DIR/<collection>.ts`.** Don't stop at the first index that looks like a match. Enumerate every `ensureIndex` call in the file and score each one against the after-query: (a) does the filter form a prefix of the index keys (ESR — equality first), (b) does the `partialFilterExpression` match the query, (c) does any sort/range field appear at the right position?
3. **Walk through every pattern in "Common change patterns to flag" above** for this query, not just the rubric. The patterns are the *checklist*; the rubric is the *severity*.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| ## Common change patterns to flag | ||
|
|
||
| Most regressions in real PRs match one of these shapes. Scan for each pattern explicitly — don't rely on the severity rubric alone to catch them. For every MongoDB call you analyse, **reconstruct both the before-state and the after-state** of the query (filter object, sort, projection) and compare them against the indexes. A query that was perfectly fine before the PR can be broken by a small change. |
Real-world test on apify-core: out of 16 deliberately-introduced index
regressions, only ~half got flagged. The misses clustered on a few
recognisable patterns:
* `removedAt: null` dropped from filters that previously matched a
`partialFilterExpression` index (most common).
* New `sort: { <field>: ±1 }` on a field not in the chosen index.
* Shard-key prefix removed from a sharded-collection query.
* New `$or` branch without index coverage.
The previous prompt mentioned these in the severity rubric, but the
rubric is a classification scheme — not a checklist Claude works
through. So Claude often picked the first index that "looked right"
and skipped subtle regressions.
Three changes to the prompt:
1. Add a "Common change patterns to flag" section that enumerates the
7 most-missed patterns with concrete examples. Claude is now
instructed to scan for each one explicitly.
2. Add a "Concrete ESR examples" block — small templated reasoning
chains showing the level of specificity expected in inline comments.
3. Rewrite step 2 of the Workflow ("Decide findings") into a 4-step
per-query procedure: (a) reconstruct before- and after-state of the
query, (b) enumerate **all** indexes for the collection rather than
stopping at the first match, (c) walk through every pattern in the
"Common change patterns" list, (d) pick the best index and grade
the gap.
The before/after reconstruction is the key change — the
partial-filter-dropped regression can only be caught if Claude
explicitly compares the two states.
No new envsubst placeholders. Allowlist unchanged.
37d894f to
301875c
Compare
fnesveda
pushed a commit
that referenced
this pull request
May 18, 2026
🤖 I have created a release *beep* *boop* --- ## [1.1.0](v1.0.0...v1.1.0) (2026-05-18) ### Features * add mongodb-query-index-check action ([#3](#3)) ([e288951](e288951)) * add python-package-check composite action ([#11](#11)) ([cafe9c0](cafe9c0)) * bump max-turns default to 100 and stream full Claude output ([#7](#7)) ([812c5cb](812c5cb)) * expand allowed-tools list for mongodb-query-index-check ([#6](#6)) ([42e0fe2](42e0fe2)) * make the review prompt directive instead of descriptive ([#8](#8)) ([910af2a](910af2a)) * mention [@mtrunkat](https://github.com/mtrunkat) in the review summary on findings ([#12](#12)) ([2f0becd](2f0becd)) ### Bug Fixes * move state files into workspace and address bash sandbox denials ([#9](#9)) ([6e2aa05](6e2aa05)) * Stop using `@octokit/rest` in scripts ([#10](#10)) ([232b613](232b613)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
mtrunkat
pushed a commit
that referenced
this pull request
May 18, 2026
Two prompt rewrites prompted by inline review on #17: * **Pattern #3 (shard-key dropped).** Old wording read ambiguously as "flag when somebody drops the unique index" rather than "flag when a query omits the shard-key field". Rewritten to make the intent explicit (drop = field missing from filter / `$match`), and to name the canonical source for the shard key — the `ShardAwareCollection(<name>, { shardKey: '<key>', … })` constructor, which lives outside `$MONGO_INDEXES_DIR`. The `// For sharding …` index comment is now described as a signal rather than the source. Adds the explicit caveat that some sharded collections (e.g. `Datasets`, sharded on `_id`) won't have that signal at all and should be left un-flagged when the status is uncertain. * **Pattern #8 (sharded skip / readConcern).** Rewritten from "match these specific calls" to "here's the underlying mechanism, here are known examples, reason about new cases too": sharded chunks can hold orphan docs → `SHARDING_FILTER` stage fetches every candidate to drop them → `readConcern: 'available'` bypasses it. The named examples (skip, countDocuments-auto-handled, wide aggregation ranges) now sit under that explanation rather than being the rule itself, which lets the model spot novel index-only-on-shard regressions instead of just the patterns we already enumerated. Cursor-pagination guidance now consistently uses the sort key (`startedAt < lastStartedAt`), matching the concrete example below.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Real-world test against apify-core PR #27953 (which contains 16 deliberately-introduced index regressions): Claude flagged only ~half of them. The misses clustered on a few recognisable patterns:
removedAt: nulldropped from a filter that previously matched apartialFilterExpressionindex — most common misssort: { <field>: ±1 }added on a field not in the chosen indexuserId) dropped from a query against a sharded collection$orbranch with no index coverageThese shapes are covered by the severity rubric in the existing prompt — but the rubric is descriptive, not procedural. Claude often picked the first index that "looked right" against the after-state of the query and never compared it to the before-state, so partial-filter regressions slipped through.
What changes
Three additions to
prompts/review.md, all in service of making the review more directive:1. New "Common change patterns to flag" section
Seven explicit patterns Claude must scan for on every MongoDB call:
sortadded on a field outside the chosen index's ESR position$orbranch$ne,$nin,$not,$exists: false)$regexon an indexed fieldEach pattern names the failure mode and the specific thing to check (e.g. read the candidate index's
partialFilterExpressionand verify the after-query satisfies every key).2. New "Concrete ESR examples" block
Five worked examples — one OK, four flag-worthy — showing the exact reasoning style and specificity expected in inline comments. Anchors Claude's output against the apify-core schema (
Act2Runs,userId,partialFilterExpression, etc.).3. Rewritten step 2 of the Workflow
Instead of "apply the severity rubric", it now spells out a 4-step procedure per query:
The before/after reconstruction is the key bit — the partial-filter-dropped regression can only be caught if Claude explicitly compares the two states.
What stays the same
Test plan
python3 -c "import yaml; yaml.safe_load(...)"—action.yamlvalidpnpm run lint— cleanpnpm run type-check— cleanpnpm run test— 35/35$VARplaceholders in the promptFollow-ups
$existsinteractions, cross-package collection imports (mongoClient.Xvsdb.Xvs destructured).Generated by Claude Code