feat(platform): wire context length governance into context builder#1388
Conversation
📝 WalkthroughWalkthroughThis PR implements configurable per-user and per-team context length limitations through the governance feature-flag system. It adds UI controls with preset buttons for context token limits in the FeatureFlagsEditor, introduces Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Resolve feature flags in startAgentChat and propagate maxContextTokens through the generation pipeline. The governance limit acts as an upper bound via Math.min(governanceLimit, agentDefault) so it can only make context stricter, never more permissive. - Call resolveFeatureFlags() in startAgentChat after budget check - Forward maxContextTokens through runAgentGeneration to generateAgentResponse - Apply governance limit to all three buildStructuredContext calls - Add maxContextTokens to ContextBuilderOptions for budget calculation - Apply governance limit to checkAndSummarizeIfNeeded threshold - Enforce minimum floor of 4096 tokens in featureFlagRuleSchema - Add feature_flags validation in upsertPolicy mutation - Build FeatureFlagsEditor component with scope/target rules table - Add Feature Controls tab to governance settings page - Add i18n keys to en.json and de.json (tabs + featureFlags namespace)
98ee82e to
aa4b410
Compare
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
services/platform/app/features/settings/governance/components/feature-flags-editor.tsx (1)
269-284:⚠️ Potential issue | 🟠 MajorAlign input constraints with schema minimum (4096).
The UI still allows values below the backend floor (
min={0}), which can produce avoidable save errors for valid user actions.💡 Proposed fix
<Input label={t('featureFlags.maxContextTokens')} type="number" value={draft.maxContextTokens ?? ''} onChange={(e) => updateDraft({ maxContextTokens: e.target.value ? Number(e.target.value) : undefined, }) } disabled={cannotManage} size="sm" placeholder="e.g. 50000" - min={0} + min={4096} + step={1} />As per coding guidelines, "ALWAYS share validation schemas between client and server using Zod... DO NOT duplicate validation logic."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/platform/app/features/settings/governance/components/feature-flags-editor.tsx` around lines 269 - 284, The number input for maxContextTokens allows values below the backend minimum; update the Input in feature-flags-editor (the Input with value={draft.maxContextTokens ?? ''}, onChange calling updateDraft) to use the shared schema minimum instead of hardcoded 0: import the MIN (or equivalent) constant from the shared Zod schema used by the server and set Input's min prop to that constant (and adjust placeholder/help text if needed); this ensures client-side constraints match the server validation and avoids duplicate validation logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsx`:
- Around line 98-100: The test is asserting the raw glyph '\u2718', which is
fragile; instead find the row/cell for the "default" feature and assert the
accessible state (aria-label, title, or cell text) that represents the
disabled/off state. Locate the row by using screen.getByText('default') and call
.closest('tr') (or the container element used in the table), then use
within(row).getByRole('cell', { name: /disabled|off|false/i }) or
within(row).getByLabelText(/disabled|off/i) (or check the cell's
title/aria-label) to assert the rule state rather than matching the glyph
character.
- Around line 63-67: Mocks for mockedUseGovernancePolicy are being cast with "as
never", which hides type drift; replace those casts with "satisfies
ReturnType<typeof useGovernancePolicy>" so the mock object is type-checked
against the actual hook return type while allowing partial fields. Update each
mockReturnValue call that currently ends with "as never" (e.g., the
mockedUseGovernancePolicy mock in this test and the other occurrences you noted)
to provide the same object but append "satisfies ReturnType<typeof
useGovernancePolicy>" instead, ensuring imports reference useGovernancePolicy
and adjusting only the cast style without changing the mock shape.
In
`@services/platform/app/features/settings/governance/components/feature-flags-editor.tsx`:
- Around line 288-307: The preset buttons rendered from CONTEXT_TOKEN_PRESETS
lack semantic state; update the Button elements (the map that uses
CONTEXT_TOKEN_PRESETS and calls updateDraft) to include an aria-pressed
attribute set to true when draft.maxContextTokens === preset.value and false
otherwise, so assistive tech can detect the active preset; keep the existing
visual border logic and ensure aria-pressed is present alongside
disabled={cannotManage}.
In `@services/platform/convex/governance/feature_enforcement.test.ts`:
- Line 10: Replace the unsafe cast "const mockCtx = {} as never" with a typed
fixture that uses TypeScript's satisfies operator to model only the minimal
shape needed for calls like ctx.db.query().withIndex().first(); define mockCtx
as an object with a db property exposing query() returning an object whose
withIndex() returns an object with first(): Promise<...> (or a stubbed value)
and declare it using "satisfies" against the expected partial type so TypeScript
will check shape without requiring full interfaces; update tests to use this
mockCtx in place of the previous cast.
In `@services/platform/convex/lib/agent_chat/internal_actions.ts`:
- Around line 415-418: There is a duplicate maxContextTokens key in the config
object passed to generateAgentResponse; edit the object in internal_actions.ts
(the call to generateAgentResponse that includes maxContextTokens,
finalInstructions, and toolsSummary) and remove the redundant maxContextTokens
entry so the object only defines maxContextTokens once.
In `@services/platform/convex/lib/context_management/check_and_summarize.test.ts`:
- Around line 16-103: The mockEstimateContextSize mock retains call history
across tests causing fragile toHaveBeenCalledWith assertions; add a beforeEach
or afterEach hook in the check_and_summarize.test.ts suite that clears the mock
(e.g., call mockEstimateContextSize.mockClear() or
mockEstimateContextSize.mockReset()) so each it-block starts with a fresh mock
state; reference the mock by name (mockEstimateContextSize) and place the hook
alongside the existing describe/it blocks for checkAndSummarizeIfNeeded.
In `@services/platform/convex/lib/context_management/context_builder.test.ts`:
- Around line 20-51: The current test only compares token counts and can pass if
both paths behave identically; update the test to assert trimming occurred by
checking the builder output's trim flags: after calling
ContextBuilder.addContext(...) and build(), assert that resultWithCap indicates
the 'large' context was trimmed (e.g.,
resultWithCap.contexts['large'].wasTrimmed or resultWithCap.wasTrimmed === true)
and that resultWithoutCap shows no trimming for the same context, while keeping
the totalTokens comparison using resultWithCap and resultWithoutCap.
In `@services/platform/messages/de.json`:
- Around line 3623-3661: The de.json file contains duplicate definitions for
governance.tabs and governance.featureFlags which will silently overwrite
earlier entries; locate the existing governance.tabs and governance.featureFlags
objects (keys "tabs" and "featureFlags") and either remove this duplicate block
(the one containing "System-Prompt", "Budgets", "Funktionskontrolle", etc.) or
merge any new/unique translations from this block into the original
governance.tabs/governance.featureFlags definitions so there is only one
canonical entry for each key.
In `@services/platform/messages/en.json`:
- Around line 3707-3712: The governance translations object contains duplicate
keys ("tabs" and "featureFlags.enabled") causing earlier values to be
overwritten; edit the governance block to remove duplicates and merge any
differing translations so each key appears only once — ensure there is a single
"tabs" object containing all required tab entries (systemPrompt, budgets,
featureControls, usage) and a single "featureFlags" object with one "enabled"
entry, preserving the intended strings from the most correct/complete
occurrences.
---
Outside diff comments:
In
`@services/platform/app/features/settings/governance/components/feature-flags-editor.tsx`:
- Around line 269-284: The number input for maxContextTokens allows values below
the backend minimum; update the Input in feature-flags-editor (the Input with
value={draft.maxContextTokens ?? ''}, onChange calling updateDraft) to use the
shared schema minimum instead of hardcoded 0: import the MIN (or equivalent)
constant from the shared Zod schema used by the server and set Input's min prop
to that constant (and adjust placeholder/help text if needed); this ensures
client-side constraints match the server validation and avoids duplicate
validation logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a4b4ea1d-0e6f-44d4-86c6-6b963128d759
📒 Files selected for processing (15)
services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsxservices/platform/app/features/settings/governance/components/feature-flags-editor.test.tsxservices/platform/app/features/settings/governance/components/feature-flags-editor.tsxservices/platform/convex/governance/feature_enforcement.test.tsservices/platform/convex/lib/agent_chat/internal_actions.tsservices/platform/convex/lib/agent_response/generate_response.tsservices/platform/convex/lib/agent_response/types.tsservices/platform/convex/lib/context_management/check_and_summarize.test.tsservices/platform/convex/lib/context_management/check_and_summarize.tsservices/platform/convex/lib/context_management/context_builder.test.tsservices/platform/convex/lib/context_management/context_builder.tsservices/platform/lib/shared/schemas/governance.test.tsservices/platform/lib/shared/schemas/governance.tsservices/platform/messages/de.jsonservices/platform/messages/en.json
| mockedUseGovernancePolicy.mockReturnValue({ | ||
| data: null, | ||
| isLoading: false, | ||
| } as never); | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the test file
find . -type f -name "feature-flags-editor.test.tsx" | head -5Repository: tale-project/tale
Length of output: 258
🏁 Script executed:
# Get the actual file size to determine how to read it
wc -l services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsxRepository: tale-project/tale
Length of output: 166
🏁 Script executed:
# Read the specific lines mentioned in the review comment
sed -n '60,70p;90,100p;115,125p;150,160p' services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsxRepository: tale-project/tale
Length of output: 1183
🏁 Script executed:
# Find the useGovernancePolicy hook definition to understand its return type
find . -type f \( -name "*.ts" -o -name "*.tsx" \) -exec grep -l "export.*useGovernancePolicy\|function useGovernancePolicy" {} \;Repository: tale-project/tale
Length of output: 130
🏁 Script executed:
# Examine the useGovernancePolicy hook definition
cat -n services/platform/app/features/settings/governance/hooks/queries.tsRepository: tale-project/tale
Length of output: 1040
🏁 Script executed:
# Also check what types are being used, search for type definitions
grep -A 5 -B 5 "useGovernancePolicy\|GovernancePolicy" services/platform/app/features/settings/governance/hooks/queries.ts | head -50Repository: tale-project/tale
Length of output: 301
🏁 Script executed:
# Find the useConvexQuery hook to understand its return type
find . -type f -name "use-convex-query.ts" -o -name "use-convex-query.tsx"Repository: tale-project/tale
Length of output: 110
🏁 Script executed:
# Check the useConvexQuery hook definition
cat -n services/platform/app/hooks/use-convex-query.ts | head -80Repository: tale-project/tale
Length of output: 1681
🏁 Script executed:
# Read the complete test file to understand the context better
cat -n services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsxRepository: tale-project/tale
Length of output: 5290
🏁 Script executed:
# Verify the structure of UseQueryResult to confirm satisfies pattern would work
rg "interface UseQueryResult|type UseQueryResult" --type tsRepository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Double-check that the satisfies approach compiles by examining TypeScript behavior
# Check if there are any other similar patterns in the codebase for reference
rg "satisfies.*UseQueryResult|satisfies.*ReturnType" services/platform/app --type tsx --type tsRepository: tale-project/tale
Length of output: 88
🏁 Script executed:
# Check UseQueryResult from react-query
rg "UseQueryResult" --type ts | head -20Repository: tale-project/tale
Length of output: 263
🏁 Script executed:
# Check for satisfies patterns in test files
rg "satisfies" services/platform/app --type=ts | head -20Repository: tale-project/tale
Length of output: 752
🏁 Script executed:
# Check the actual return type that useGovernancePolicy provides
# Since useConvexQuery returns UseQueryResult<FunctionReturnType<Func>>,
# we need to understand what FunctionReturnType would be for the governance.queries.getPolicy function
find . -type f -name "*.ts" -path "*/convex/governance/*" | head -10Repository: tale-project/tale
Length of output: 682
🏁 Script executed:
# Find the governance queries definition
find . -type f -name "queries.ts" -path "*/convex/governance/*"Repository: tale-project/tale
Length of output: 109
🏁 Script executed:
# Also check the schema to understand the return type
cat services/platform/convex/governance/schema.ts | head -100Repository: tale-project/tale
Length of output: 1366
🏁 Script executed:
# Check what version of React Query is being used to understand UseQueryResult structure
grep -r "react-query\|@tanstack/react-query" services/platform/package.json services/platform/app/package.json 2>/dev/null | grep -i versionRepository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Look at actual usage patterns in the codebase for mocking hooks
rg "mockReturnValue.*satisfies|mockReturnValue.*as" services/platform/app --type ts -B 2 -A 2 | head -40Repository: tale-project/tale
Length of output: 625
🏁 Script executed:
# Final verification: confirm UseQueryResult structure requires more properties
rg "interface UseQueryResult|type UseQueryResult" --type ts -A 10 | head -30Repository: tale-project/tale
Length of output: 43
Replace as never casts with satisfies for proper type validation in mocks.
Using as never suppresses type validation and masks real type drift if the hook contract changes. Use satisfies ReturnType<typeof useGovernancePolicy> to enforce compile-time compatibility while allowing partial mock objects.
Pattern
mockedUseGovernancePolicy.mockReturnValue({
data: null,
isLoading: false,
-} as never);
+} satisfies ReturnType<typeof useGovernancePolicy>);Also applies to lines 94, 120, 152.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsx`
around lines 63 - 67, Mocks for mockedUseGovernancePolicy are being cast with
"as never", which hides type drift; replace those casts with "satisfies
ReturnType<typeof useGovernancePolicy>" so the mock object is type-checked
against the actual hook return type while allowing partial fields. Update each
mockReturnValue call that currently ends with "as never" (e.g., the
mockedUseGovernancePolicy mock in this test and the other occurrences you noted)
to provide the same object but append "satisfies ReturnType<typeof
useGovernancePolicy>" instead, ensuring imports reference useGovernancePolicy
and adjusting only the cast style without changing the mock shape.
| expect(screen.getByText('default')).toBeInTheDocument(); | ||
| expect(screen.getByText('\u2718')).toBeInTheDocument(); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Avoid glyph-only assertion for rule state.
Asserting '\u2718' is fragile. Prefer checking the specific column/cell semantics (e.g., role-based lookup or labeled state) so UI icon changes don’t break behavior tests.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@services/platform/app/features/settings/governance/components/__tests__/feature-flags-editor.test.tsx`
around lines 98 - 100, The test is asserting the raw glyph '\u2718', which is
fragile; instead find the row/cell for the "default" feature and assert the
accessible state (aria-label, title, or cell text) that represents the
disabled/off state. Locate the row by using screen.getByText('default') and call
.closest('tr') (or the container element used in the table), then use
within(row).getByRole('cell', { name: /disabled|off|false/i }) or
within(row).getByLabelText(/disabled|off/i) (or check the cell's
title/aria-label) to assert the rule state rather than matching the glyph
character.
| <HStack gap={1} className="mt-2" wrap> | ||
| {CONTEXT_TOKEN_PRESETS.map((preset) => ( | ||
| <Button | ||
| key={preset.value} | ||
| type="button" | ||
| variant="secondary" | ||
| size="sm" | ||
| onClick={() => | ||
| updateDraft({ maxContextTokens: preset.value }) | ||
| } | ||
| disabled={cannotManage} | ||
| className={ | ||
| draft.maxContextTokens === preset.value | ||
| ? 'border-primary' | ||
| : '' | ||
| } | ||
| > | ||
| {preset.label} | ||
| </Button> | ||
| ))} |
There was a problem hiding this comment.
Expose preset selected state with aria-pressed.
Preset selection is currently conveyed only via border color. Add semantic state so assistive technologies can identify which preset is active.
💡 Proposed fix
<Button
key={preset.value}
type="button"
variant="secondary"
size="sm"
+ aria-pressed={draft.maxContextTokens === preset.value}
onClick={() =>
updateDraft({ maxContextTokens: preset.value })
}As per coding guidelines, "AVOID using color alone to convey information."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <HStack gap={1} className="mt-2" wrap> | |
| {CONTEXT_TOKEN_PRESETS.map((preset) => ( | |
| <Button | |
| key={preset.value} | |
| type="button" | |
| variant="secondary" | |
| size="sm" | |
| onClick={() => | |
| updateDraft({ maxContextTokens: preset.value }) | |
| } | |
| disabled={cannotManage} | |
| className={ | |
| draft.maxContextTokens === preset.value | |
| ? 'border-primary' | |
| : '' | |
| } | |
| > | |
| {preset.label} | |
| </Button> | |
| ))} | |
| <HStack gap={1} className="mt-2" wrap> | |
| {CONTEXT_TOKEN_PRESETS.map((preset) => ( | |
| <Button | |
| key={preset.value} | |
| type="button" | |
| variant="secondary" | |
| size="sm" | |
| aria-pressed={draft.maxContextTokens === preset.value} | |
| onClick={() => | |
| updateDraft({ maxContextTokens: preset.value }) | |
| } | |
| disabled={cannotManage} | |
| className={ | |
| draft.maxContextTokens === preset.value | |
| ? 'border-primary' | |
| : '' | |
| } | |
| > | |
| {preset.label} | |
| </Button> | |
| ))} |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@services/platform/app/features/settings/governance/components/feature-flags-editor.tsx`
around lines 288 - 307, The preset buttons rendered from CONTEXT_TOKEN_PRESETS
lack semantic state; update the Button elements (the map that uses
CONTEXT_TOKEN_PRESETS and calls updateDraft) to include an aria-pressed
attribute set to true when draft.maxContextTokens === preset.value and false
otherwise, so assistive tech can detect the active preset; keep the existing
visual border logic and ensure aria-pressed is present alongside
disabled={cannotManage}.
|
|
||
| const { resolveFeatureFlags } = await import('./feature_enforcement'); | ||
|
|
||
| const mockCtx = {} as never; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify current cast usage and inspect resolveFeatureFlags signature.
rg -nP '\bas\s+never\b' services/platform/convex/governance/feature_enforcement.test.ts
rg -nP 'export\s+(async\s+)?function\s+resolveFeatureFlags\s*\(' services/platform/convex/governance/feature_enforcement.ts -C3Repository: tale-project/tale
Length of output: 360
🏁 Script executed:
#!/bin/bash
# Find GenericQueryCtx definition
rg -nP 'type GenericQueryCtx|interface GenericQueryCtx' services/platform/convex -C5
# Check for other ctx mocking patterns in test files
rg -nP 'mockCtx|ctx.*=.*\{' services/platform/convex --glob='*.test.ts' -A1 -B1 | head -40Repository: tale-project/tale
Length of output: 3404
🏁 Script executed:
#!/bin/bash
# Search for createMockCtx definition
rg -nP 'export.*function createMockCtx|const createMockCtx' services/platform/convex --glob='*.ts'
# Look at the full feature_enforcement.test.ts to see how mockCtx is used
cat services/platform/convex/governance/feature_enforcement.test.ts | head -60Repository: tale-project/tale
Length of output: 1537
🏁 Script executed:
#!/bin/bash
# Look at the resolveFeatureFlags implementation to see what ctx methods it uses
cat services/platform/convex/governance/feature_enforcement.ts | head -100
# Also check what readPolicyConfig signature expects
rg -nP 'export.*function readPolicyConfig|export.*const readPolicyConfig' services/platform/convex/governance/helpers.ts -A5Repository: tale-project/tale
Length of output: 2447
🏁 Script executed:
#!/bin/bash
# Check what readPolicyConfig actually needs from ctx
cat services/platform/convex/governance/helpers.ts | head -50
# Check if there are other test files that properly mock GenericQueryCtx
rg -l 'GenericQueryCtx' services/platform/convex --glob='*.test.ts' | head -5Repository: tale-project/tale
Length of output: 1588
🏁 Script executed:
#!/bin/bash
# Check what GenericQueryCtx actually requires (from Convex library)
rg -nP 'GenericQueryCtx' services/platform/convex/_generated -A5 | head -30
# Also check if there are any other uses of satisfies in test files for guidance
rg -nP 'satisfies' services/platform/convex --glob='*.test.ts' | head -10Repository: tale-project/tale
Length of output: 1423
Replace as never with a properly typed fixture using satisfies.
Line 10 uses as never which suppresses type safety and can hide interface drift. Create a mock object with satisfies that includes only the minimal required shape for ctx.db.query().withIndex().first() access, aligning with the established pattern in the codebase and the learnings from prior reviews on test fixtures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/governance/feature_enforcement.test.ts` at line 10,
Replace the unsafe cast "const mockCtx = {} as never" with a typed fixture that
uses TypeScript's satisfies operator to model only the minimal shape needed for
calls like ctx.db.query().withIndex().first(); define mockCtx as an object with
a db property exposing query() returning an object whose withIndex() returns an
object with first(): Promise<...> (or a stubbed value) and declare it using
"satisfies" against the expected partial type so TypeScript will check shape
without requiring full interfaces; update tests to use this mockCtx in place of
the previous cast.
| maxContextTokens, | ||
| instructions: finalInstructions, | ||
| toolsSummary, | ||
| maxContextTokens, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify duplicate property occurrence in the same call-site object.
rg -n "maxContextTokens" services/platform/convex/lib/agent_chat/internal_actions.ts -C3Repository: tale-project/tale
Length of output: 931
🏁 Script executed:
sed -n '395,425p' services/platform/convex/lib/agent_chat/internal_actions.ts | cat -nRepository: tale-project/tale
Length of output: 1523
🏁 Script executed:
sed -n '395,430p' services/platform/convex/lib/agent_chat/internal_actions.ts | head -40Repository: tale-project/tale
Length of output: 1509
🏁 Script executed:
sed -n '405,425p' services/platform/convex/lib/agent_chat/internal_actions.tsRepository: tale-project/tale
Length of output: 968
🏁 Script executed:
sed -n '390,425p' services/platform/convex/lib/agent_chat/internal_actions.tsRepository: tale-project/tale
Length of output: 1446
Remove duplicated maxContextTokens key in generateAgentResponse config object.
Lines 415 and 418 both define maxContextTokens in the same object literal. This is a TypeScript compile-time error (TS1117) that blocks typecheck and build.
Proposed fix
noCacheToolNames: agentConfig.noCacheToolNames,
maxContextTokens,
instructions: finalInstructions,
toolsSummary,
- maxContextTokens,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| maxContextTokens, | |
| instructions: finalInstructions, | |
| toolsSummary, | |
| maxContextTokens, | |
| maxContextTokens, | |
| instructions: finalInstructions, | |
| toolsSummary, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/lib/agent_chat/internal_actions.ts` around lines 415
- 418, There is a duplicate maxContextTokens key in the config object passed to
generateAgentResponse; edit the object in internal_actions.ts (the call to
generateAgentResponse that includes maxContextTokens, finalInstructions, and
toolsSummary) and remove the redundant maxContextTokens entry so the object only
defines maxContextTokens once.
| describe('checkAndSummarizeIfNeeded — maxContextTokens', () => { | ||
| it('uses default model context limit when maxContextTokens is not set', async () => { | ||
| mockEstimateContextSize.mockResolvedValue({ | ||
| totalTokens: 50000, | ||
| usagePercent: 39.1, | ||
| }); | ||
|
|
||
| const result = await checkAndSummarizeIfNeeded(mockCtx, { | ||
| threadId: 'thread_1', | ||
| contextMessagesTokens: 5000, | ||
| currentPromptTokens: 100, | ||
| existingSummary: undefined, | ||
| }); | ||
|
|
||
| expect(mockEstimateContextSize).toHaveBeenCalledWith( | ||
| mockCtx, | ||
| expect.objectContaining({ | ||
| modelContextLimit: 128000, | ||
| }), | ||
| ); | ||
| expect(result.estimate.needsSummarization).toBe(false); | ||
| }); | ||
|
|
||
| it('uses governance limit when lower than model limit', async () => { | ||
| mockEstimateContextSize.mockResolvedValue({ | ||
| totalTokens: 6000, | ||
| usagePercent: 75.0, | ||
| }); | ||
|
|
||
| await checkAndSummarizeIfNeeded(mockCtx, { | ||
| threadId: 'thread_1', | ||
| contextMessagesTokens: 5000, | ||
| currentPromptTokens: 100, | ||
| existingSummary: undefined, | ||
| modelContextLimit: 128000, | ||
| maxContextTokens: 8192, | ||
| }); | ||
|
|
||
| expect(mockEstimateContextSize).toHaveBeenCalledWith( | ||
| mockCtx, | ||
| expect.objectContaining({ | ||
| modelContextLimit: 8192, | ||
| }), | ||
| ); | ||
| }); | ||
|
|
||
| it('uses model limit when governance limit is higher', async () => { | ||
| mockEstimateContextSize.mockResolvedValue({ | ||
| totalTokens: 50000, | ||
| usagePercent: 39.1, | ||
| }); | ||
|
|
||
| await checkAndSummarizeIfNeeded(mockCtx, { | ||
| threadId: 'thread_1', | ||
| contextMessagesTokens: 5000, | ||
| currentPromptTokens: 100, | ||
| existingSummary: undefined, | ||
| modelContextLimit: 32000, | ||
| maxContextTokens: 128000, | ||
| }); | ||
|
|
||
| expect(mockEstimateContextSize).toHaveBeenCalledWith( | ||
| mockCtx, | ||
| expect.objectContaining({ | ||
| modelContextLimit: 32000, | ||
| }), | ||
| ); | ||
| }); | ||
|
|
||
| it('triggers summarization earlier with low governance limit', async () => { | ||
| mockEstimateContextSize.mockResolvedValue({ | ||
| totalTokens: 6000, | ||
| usagePercent: 73.2, | ||
| }); | ||
|
|
||
| const result = await checkAndSummarizeIfNeeded(mockCtx, { | ||
| threadId: 'thread_1', | ||
| contextMessagesTokens: 5000, | ||
| currentPromptTokens: 100, | ||
| existingSummary: undefined, | ||
| modelContextLimit: 128000, | ||
| maxContextTokens: 8192, | ||
| }); | ||
|
|
||
| // 6000 / 8192 = 0.732 which exceeds SUMMARIZATION_THRESHOLD (0.65) | ||
| expect(result.estimate.needsSummarization).toBe(true); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Reset mockEstimateContextSize between tests to avoid cross-test contamination.
Shared call history can make toHaveBeenCalledWith assertions pass for the wrong reason when test order changes.
Suggested fix
import { describe, it, expect, vi } from 'vitest';
@@
const mockCtx = {} as never;
describe('checkAndSummarizeIfNeeded — maxContextTokens', () => {
+ beforeEach(() => {
+ mockEstimateContextSize.mockReset();
+ });
+
it('uses default model context limit when maxContextTokens is not set', async () => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/lib/context_management/check_and_summarize.test.ts`
around lines 16 - 103, The mockEstimateContextSize mock retains call history
across tests causing fragile toHaveBeenCalledWith assertions; add a beforeEach
or afterEach hook in the check_and_summarize.test.ts suite that clears the mock
(e.g., call mockEstimateContextSize.mockClear() or
mockEstimateContextSize.mockReset()) so each it-block starts with a fresh mock
state; reference the mock by name (mockEstimateContextSize) and place the hook
alongside the existing describe/it blocks for checkAndSummarizeIfNeeded.
| it('caps budget when maxContextTokens is lower than model limit', () => { | ||
| const largeContent = 'token '.repeat(10000); | ||
|
|
||
| const builderWithoutCap = new ContextBuilder({ | ||
| agentType: 'chat', | ||
| modelContextLimit: 128000, | ||
| }); | ||
| builderWithoutCap.addContext( | ||
| 'large', | ||
| largeContent, | ||
| ContextPriority.LOW_RELEVANCE, | ||
| { canTrim: true }, | ||
| ); | ||
| const resultWithoutCap = builderWithoutCap.build(); | ||
|
|
||
| const builderWithCap = new ContextBuilder({ | ||
| agentType: 'chat', | ||
| modelContextLimit: 128000, | ||
| maxContextTokens: 8192, | ||
| }); | ||
| builderWithCap.addContext( | ||
| 'large', | ||
| largeContent, | ||
| ContextPriority.LOW_RELEVANCE, | ||
| { canTrim: true }, | ||
| ); | ||
| const resultWithCap = builderWithCap.build(); | ||
|
|
||
| expect(resultWithCap.totalTokens).toBeLessThanOrEqual( | ||
| resultWithoutCap.totalTokens, | ||
| ); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Strengthen the cap test to assert trimming behavior explicitly.
toBeLessThanOrEqual can pass even if both code paths behave the same. Add explicit wasTrimmed expectations to validate that the governance cap is the cause.
Suggested test hardening
const resultWithoutCap = builderWithoutCap.build();
@@
const resultWithCap = builderWithCap.build();
+ expect(resultWithoutCap.wasTrimmed).toBe(false);
+ expect(resultWithCap.wasTrimmed).toBe(true);
expect(resultWithCap.totalTokens).toBeLessThanOrEqual(
resultWithoutCap.totalTokens,
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| it('caps budget when maxContextTokens is lower than model limit', () => { | |
| const largeContent = 'token '.repeat(10000); | |
| const builderWithoutCap = new ContextBuilder({ | |
| agentType: 'chat', | |
| modelContextLimit: 128000, | |
| }); | |
| builderWithoutCap.addContext( | |
| 'large', | |
| largeContent, | |
| ContextPriority.LOW_RELEVANCE, | |
| { canTrim: true }, | |
| ); | |
| const resultWithoutCap = builderWithoutCap.build(); | |
| const builderWithCap = new ContextBuilder({ | |
| agentType: 'chat', | |
| modelContextLimit: 128000, | |
| maxContextTokens: 8192, | |
| }); | |
| builderWithCap.addContext( | |
| 'large', | |
| largeContent, | |
| ContextPriority.LOW_RELEVANCE, | |
| { canTrim: true }, | |
| ); | |
| const resultWithCap = builderWithCap.build(); | |
| expect(resultWithCap.totalTokens).toBeLessThanOrEqual( | |
| resultWithoutCap.totalTokens, | |
| ); | |
| }); | |
| it('caps budget when maxContextTokens is lower than model limit', () => { | |
| const largeContent = 'token '.repeat(10000); | |
| const builderWithoutCap = new ContextBuilder({ | |
| agentType: 'chat', | |
| modelContextLimit: 128000, | |
| }); | |
| builderWithoutCap.addContext( | |
| 'large', | |
| largeContent, | |
| ContextPriority.LOW_RELEVANCE, | |
| { canTrim: true }, | |
| ); | |
| const resultWithoutCap = builderWithoutCap.build(); | |
| const builderWithCap = new ContextBuilder({ | |
| agentType: 'chat', | |
| modelContextLimit: 128000, | |
| maxContextTokens: 8192, | |
| }); | |
| builderWithCap.addContext( | |
| 'large', | |
| largeContent, | |
| ContextPriority.LOW_RELEVANCE, | |
| { canTrim: true }, | |
| ); | |
| const resultWithCap = builderWithCap.build(); | |
| expect(resultWithoutCap.wasTrimmed).toBe(false); | |
| expect(resultWithCap.wasTrimmed).toBe(true); | |
| expect(resultWithCap.totalTokens).toBeLessThanOrEqual( | |
| resultWithoutCap.totalTokens, | |
| ); | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/lib/context_management/context_builder.test.ts`
around lines 20 - 51, The current test only compares token counts and can pass
if both paths behave identically; update the test to assert trimming occurred by
checking the builder output's trim flags: after calling
ContextBuilder.addContext(...) and build(), assert that resultWithCap indicates
the 'large' context was trimmed (e.g.,
resultWithCap.contexts['large'].wasTrimmed or resultWithCap.wasTrimmed === true)
and that resultWithoutCap shows no trimming for the same context, while keeping
the totalTokens comparison using resultWithCap and resultWithoutCap.
| "tabs": { | ||
| "systemPrompt": "System-Prompt", | ||
| "budgets": "Budgets", | ||
| "featureControls": "Funktionskontrolle", | ||
| "usage": "Nutzung" | ||
| }, | ||
| "featureFlags": { | ||
| "title": "Funktionskontrolle", | ||
| "description": "Funktionen pro Benutzer, Team oder Rolle aktivieren oder deaktivieren.", | ||
| "enabled": "Aktiviert", | ||
| "webSearch": "Websuche", | ||
| "codeExecution": "Code-Ausf\u00fchrung", | ||
| "fileUpload": "Datei-Upload", | ||
| "maxContextTokens": "Max. Kontext-Token", | ||
| "maxContextTokensPlaceholder": "z.B. 32768", | ||
| "maxContextTokensHelp": "Maximale Token f\u00fcr das KI-Kontextfenster pro Anfrage. Niedrigere Werte senken die Kosten. Leer lassen f\u00fcr kein Limit. Minimum: 4096.", | ||
| "scope": "Bereich", | ||
| "target": "Ziel", | ||
| "actions": "Aktionen", | ||
| "role": "Rolle", | ||
| "user": "Benutzer", | ||
| "team": "Team", | ||
| "allUsers": "Alle Benutzer", | ||
| "searchUsers": "Benutzer suchen...", | ||
| "noUsersFound": "Keine Benutzer gefunden", | ||
| "selectUser": "Benutzer w\u00e4hlen...", | ||
| "searchTeams": "Teams suchen...", | ||
| "noTeamsFound": "Keine Teams gefunden", | ||
| "selectTeam": "Team w\u00e4hlen...", | ||
| "confirm": "Best\u00e4tigen", | ||
| "addRule": "Regel hinzuf\u00fcgen", | ||
| "addRuleTitle": "Funktionskontrollregel hinzuf\u00fcgen", | ||
| "editRuleTitle": "Funktionskontrollregel bearbeiten", | ||
| "editRule": "Regel {index} bearbeiten", | ||
| "removeRule": "Regel {index} entfernen", | ||
| "emptyState": "Keine Funktionskontrollregeln konfiguriert. F\u00fcge eine Regel hinzu, um Funktionen zu verwalten.", | ||
| "saved": "Funktionskontrolle gespeichert", | ||
| "saveFailed": "Funktionskontrolle konnte nicht gespeichert werden" | ||
| }, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import json
from pathlib import Path
from collections import Counter
path = Path("services/platform/messages/de.json")
text = path.read_text(encoding="utf-8")
def pairs_hook(pairs):
return pairs # preserve duplicate keys in object pair lists
data = json.loads(text, object_pairs_hook=pairs_hook)
dups = set()
def walk(node, p):
if isinstance(node, list):
is_object_pairs = (
len(node) > 0 and
isinstance(node[0], tuple) and
len(node[0]) == 2 and
isinstance(node[0][0], str)
)
if is_object_pairs:
keys = [k for k, _ in node]
counts = Counter(keys)
for k, c in counts.items():
if c > 1:
dups.add(".".join(p + [k]))
for k, v in node:
walk(v, p + [k])
else:
for idx, v in enumerate(node):
walk(v, p + [str(idx)])
walk(data, [])
print("Duplicate key paths:")
for k in sorted(dups):
print(" -", k)
PYRepository: tale-project/tale
Length of output: 127
Remove duplicate governance.tabs and governance.featureFlags key definitions to prevent silent overwrite.
The added block at lines 3623–3661 introduces governance.tabs and governance.featureFlags keys that are already defined elsewhere in the object. JSON parsers use the last occurrence, so the newly added translations will be overwritten and never used at runtime.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/messages/de.json` around lines 3623 - 3661, The de.json
file contains duplicate definitions for governance.tabs and
governance.featureFlags which will silently overwrite earlier entries; locate
the existing governance.tabs and governance.featureFlags objects (keys "tabs"
and "featureFlags") and either remove this duplicate block (the one containing
"System-Prompt", "Budgets", "Funktionskontrolle", etc.) or merge any new/unique
translations from this block into the original
governance.tabs/governance.featureFlags definitions so there is only one
canonical entry for each key.
| "tabs": { | ||
| "systemPrompt": "System Prompt", | ||
| "budgets": "Budgets", | ||
| "featureControls": "Feature controls", | ||
| "usage": "Usage" | ||
| }, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import json, collections
from pathlib import Path
path = Path("services/platform/messages/en.json")
dups = []
def hook(pairs):
counts = collections.Counter(k for k, _ in pairs)
for k, c in counts.items():
if c > 1:
dups.append(k)
return dict(pairs)
json.loads(path.read_text(), object_pairs_hook=hook)
print("Duplicate keys detected:", sorted(set(dups)))
PYRepository: tale-project/tale
Length of output: 105
🏁 Script executed:
# First, view the relevant sections of en.json to confirm the duplicates
cat -n services/platform/messages/en.json | sed -n '3700,3730p'Repository: tale-project/tale
Length of output: 1558
🏁 Script executed:
# Also view the other occurrence around line 3795
cat -n services/platform/messages/en.json | sed -n '3790,3810p'Repository: tale-project/tale
Length of output: 763
🏁 Script executed:
# Check what locale files exist
fd -e json services/platform/messages/Repository: tale-project/tale
Length of output: 474
🏁 Script executed:
fd . 'services/platform/messages/' --type f -e jsonRepository: tale-project/tale
Length of output: 245
🏁 Script executed:
# Check if other locale files exist and sample their structure
ls -la services/platform/messages/ | grep -E '\.(json|ts)$'Repository: tale-project/tale
Length of output: 370
🏁 Script executed:
# Check if de.json has the governance section with tabs and featureFlags
python3 << 'PYTHON'
import json
from pathlib import Path
# Read en.json and de.json
en_file = Path("services/platform/messages/en.json")
de_file = Path("services/platform/messages/de.json")
en_data = json.loads(en_file.read_text())
de_data = json.loads(de_file.read_text())
# Check governance structure
if "governance" in en_data:
print("=== EN.JSON governance structure ===")
gov_keys = list(en_data["governance"].keys())
print(f"Top-level keys: {gov_keys}")
print(f"Number of 'tabs': {list(en_data['governance'].keys()).count('tabs')}")
print(f"Number of 'featureFlags': {list(en_data['governance'].keys()).count('featureFlags')}")
if "governance" in de_data:
print("\n=== DE.JSON governance structure ===")
gov_keys = list(de_data["governance"].keys())
print(f"Top-level keys: {gov_keys}")
print(f"Number of 'tabs': {list(de_data['governance'].keys()).count('tabs')}")
print(f"Number of 'featureFlags': {list(de_data['governance'].keys()).count('featureFlags')}")
else:
print("\n=== DE.JSON ===")
print("No 'governance' key found in de.json")
# Check for missing keys
print("\n=== Key sync check ===")
en_gov_keys = set(en_data.get("governance", {}).keys()) if "governance" in en_data else set()
de_gov_keys = set(de_data.get("governance", {}).keys()) if "governance" in de_data else set()
missing_in_de = en_gov_keys - de_gov_keys
extra_in_de = de_gov_keys - en_gov_keys
if missing_in_de:
print(f"Keys in en.json governance but missing in de.json: {missing_in_de}")
if extra_in_de:
print(f"Keys in de.json governance but not in en.json: {extra_in_de}")
if not missing_in_de and not extra_in_de:
print("All governance keys are synchronized between en.json and de.json")
PYTHONRepository: tale-project/tale
Length of output: 622
🏁 Script executed:
# Use the original detection method to confirm exact locations of duplicate keys
python3 << 'PYTHON'
import json
from pathlib import Path
from collections import defaultdict
path = Path("services/platform/messages/en.json")
content = path.read_text()
# Parse with tracking to find duplicate keys
duplicate_info = defaultdict(list)
def hook(pairs):
for i, (k, v) in enumerate(pairs):
duplicate_info[k].append((i, v))
return dict(pairs)
json.loads(content, object_pairs_hook=hook)
# Show keys with multiple values
print("=== Duplicate keys found ===")
for key in sorted(duplicate_info.keys()):
if len(duplicate_info[key]) > 1:
print(f"\nKey '{key}' appears {len(duplicate_info[key])} times:")
for idx, (pos, val) in enumerate(duplicate_info[key]):
if isinstance(val, dict):
print(f" Occurrence {idx+1}: {list(val.keys())[:3]}... (dict)")
else:
val_str = str(val)[:50]
print(f" Occurrence {idx+1}: {val_str}")
PYTHONRepository: tale-project/tale
Length of output: 50373
🏁 Script executed:
# Extract exact line numbers for the duplicates by searching in raw file
echo "=== Location of first 'tabs' key in governance ==="
grep -n '"tabs"' services/platform/messages/en.json | head -5
echo -e "\n=== Location of 'enabled' key ==="
grep -n '"enabled"' services/platform/messages/en.json | grep -A1 -B1 3716Repository: tale-project/tale
Length of output: 356
Remove duplicate JSON keys in governance translations.
tabs is declared twice (lines 3707 and 3795) and featureFlags.enabled is declared twice (lines 3716 and 3722). Earlier values are silently overwritten, making this error-prone.
💡 Proposed fix
- "tabs": {
- "systemPrompt": "System Prompt",
- "budgets": "Budgets",
- "featureControls": "Feature controls",
- "usage": "Usage"
- },
"featureFlags": {
"title": "Feature controls",
"description": "Enable or disable features per user, team, or role.",
- "enabled": "Enabled",
"webSearch": "Web search",
"codeExecution": "Code execution",
"fileUpload": "File upload",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/messages/en.json` around lines 3707 - 3712, The governance
translations object contains duplicate keys ("tabs" and "featureFlags.enabled")
causing earlier values to be overwritten; edit the governance block to remove
duplicates and merge any differing translations so each key appears only once —
ensure there is a single "tabs" object containing all required tab entries
(systemPrompt, budgets, featureControls, usage) and a single "featureFlags"
object with one "enabled" entry, preserving the intended strings from the most
correct/complete occurrences.
- Remove duplicate maxContextTokens property in internal_actions.ts - Remove unused maxContextTokens destructuring in generate_response.ts - Delete duplicate __tests__/feature-flags-editor.test.tsx (wrong import paths)
Summary
resolveFeatureFlags()intostartAgentChatto resolve per-user/team/role context limits from governance policiesmaxContextTokensthrough the generation pipeline (runAgentGeneration->generateAgentResponse->buildStructuredContext)Math.min(governanceLimit, agentDefault)in all threebuildStructuredContextcalls,ContextBuilder.build(), andcheckAndSummarizeIfNeededfeatureFlagRuleSchemato prevent unusable configurationsfeature_flagsvalidation inupsertPolicymutationFeatureFlagsEditoradmin UI with scope/target rules table, toggle switches for web search/code execution/file upload, number input with preset buttons (8K-128K) for max context tokensen.jsonandde.jsonTest plan
resolveFeatureFlagspriority chain (user > team > role > default), partial rules, disabled policy, empty rulesContextBuilder.build()withmaxContextTokens— verifies budget capping behaviorcheckAndSummarizeIfNeededwith governance limit — verifies summarization triggers earlierfeatureFlagRuleSchemaminimum floor (rejects <4096)FeatureFlagsEditorcomponent tests: empty state, rules table, add button, enabled toggle, loading statecheckAccessibility()) for empty state and rules tablebun run --filter @tale/platform test --project server)bun run --filter @tale/platform test --project client)bun run --filter @tale/platform lint)bun run --filter @tale/platform typecheck)Closes #1362
Summary by CodeRabbit
Release Notes
New Features
Tests
Translations