Skip to content

fix: handle decryption errors, strip unsupported model params, surface expired MCP connections#288

Merged
0xbbjoker merged 7 commits intodevfrom
fix/mcp-decryption-errors-and-model-params
Feb 6, 2026
Merged

fix: handle decryption errors, strip unsupported model params, surface expired MCP connections#288
0xbbjoker merged 7 commits intodevfrom
fix/mcp-decryption-errors-and-model-params

Conversation

@0xbbjoker
Copy link
Collaborator

@0xbbjoker 0xbbjoker commented Feb 6, 2026

Summary

  • Decryption error handling: Added DecryptionError class with phase tracking. OAuth connections auto-marked as expired when tokens can't be decrypted (key rotation/mismatch). User-facing error messages guide reconnection.
  • Model param safety: Added getSafeModelParams and isReasoningModel to strip unsupported params (frequencyPenalty/presencePenalty for Anthropic, temperature for reasoning models) before sending to AI gateway.
  • Expired MCP status: All 4 MCP status tools (GitHub, Google, Linear, Notion) now surface { connected: false, status: "expired" } so the frontend can prompt reconnection.
  • MCP settings plumbing: Fixed MCP settings placement — now injected into character.settings.mcp instead of opts.settings so plugin-mcp can discover them via runtime.character.settings.mcp.
  • Knowledge plugin loading: Only loaded when documents actually exist, not eagerly in ASSISTANT mode.
  • Document count query: Removed roomId filter to get agent-level counts across all rooms.

Test plan

  • Verify MCP tools return expired status when OAuth connection is expired
  • Verify Anthropic model calls don't include frequencyPenalty/presencePenalty
  • Verify reasoning models (claude-opus, o1, o3) don't include temperature
  • Verify MCP plugin discovers settings via runtime.character.settings.mcp
  • Verify knowledge plugin only loads when documents exist

🤖 Generated with Claude Code


Note

Medium Risk
Touches OAuth token retrieval/refresh and AI request parameter shaping; mistakes could break third-party integrations or cause some model requests to fail, but changes are localized and mostly additive/defensive.

Overview
Improves resilience and UX around OAuth-backed integrations by detecting token decryption failures, marking affected connections as expired, and surfacing reconnection guidance. The MCP *_status tools for GitHub/Google/Linear/Notion now explicitly return status: "expired" (with a message) when no active connection exists but an expired one does.

Hardens AI gateway requests by introducing isReasoningModel/getSafeModelParams and using them in chat/completions (and analogous logic in responses) to strip provider/model-unsupported parameters (e.g., Anthropic penalties; temperature for reasoning models), reducing gateway warnings and invalid requests.

Adjusts Eliza runtime plumbing: MCP server config is injected into runtime.character.settings.mcp (and refreshed for cached runtimes) so plugin-mcp can discover it, knowledge plugin loading is now conditional on documents existing, and document counting is widened to be agent-level across rooms.

Written by Cursor Bugbot for commit b3379ac. This will update automatically on new commits. Configure here.

…rface expired MCP connections

- Add DecryptionError class with phase tracking (DEK vs value decryption)
- Auto-mark OAuth connections as expired when tokens can't be decrypted
- Add getSafeModelParams to strip unsupported provider params (Anthropic, reasoning models)
- Surface expired connection status in all 4 MCP status tools (GitHub, Google, Linear, Notion)
- Fix MCP settings placement in character.settings for plugin-mcp discovery
- Only load knowledge plugin when documents exist (not eagerly in ASSISTANT mode)
- Remove roomId filter from document count query for agent-level counts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@vercel
Copy link

vercel bot commented Feb 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
eliza-cloud-v2 Ready Ready Preview, Comment Feb 6, 2026 1:54am

@coderabbitai
Copy link

coderabbitai bot commented Feb 6, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

  • 🔍 Trigger a full review

Note

🎁 Summarized by CodeRabbit Free

Your organization is on the Free plan. CodeRabbit will generate a high-level summary and a walkthrough for each pull request. For a comprehensive line-by-line review, please upgrade your subscription to CodeRabbit Pro by visiting https://app.coderabbit.ai/login.

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link

claude bot commented Feb 6, 2026

Code Review - PR #288

Summary

This PR adds robust error handling for OAuth token decryption failures, model parameter sanitization, and improved MCP connection status reporting. The changes are well-structured and address real operational concerns around key rotation and provider compatibility.


✅ Strengths

1. Decryption Error Handling (lib/services/secrets/encryption.ts)

  • Excellent phase tracking: The new DecryptionError class distinguishes between DEK decryption failures (key mismatch) and value decryption failures (corruption), making debugging much easier
  • Proper cleanup: The finally block ensures DEK buffer is zeroed even on error (line 218-219)
  • Good error messages: User-facing messages clearly guide users to reconnect their OAuth accounts

2. OAuth Connection Management (lib/services/oauth/connection-adapters/generic-adapter.ts)

  • Automatic connection expiry: When decryption fails, connections are automatically marked as expired (line 55-56), preventing repeated failed attempts
  • Graceful degradation: The adapter logs errors but continues operation where possible
  • Consistent error handling: The decryptTokenSecret helper centralizes error handling logic

3. Model Parameter Safety (lib/pricing.ts)

  • Provider-specific filtering: getSafeModelParams() correctly removes unsupported parameters for Anthropic models and reasoning models
  • Well-documented: Clear JSDoc comments explain which parameters are stripped and why

4. MCP Settings Architecture (lib/eliza/runtime-factory.ts)

  • Correct placement: MCP settings are now injected into character.settings.mcp (line 557-559) instead of opts.settings, which aligns with how plugin-mcp discovers configuration
  • Cache refresh: Cached runtimes get fresh MCP settings via applyUserContext (line 647-649)

🔍 Issues & Concerns

Critical Issues

1. Missing Input Validation in decryptTokenSecret

  • Location: lib/services/oauth/connection-adapters/generic-adapter.ts:33-64
  • Issue: No validation that secretId or organizationId are non-empty before calling secretsService.getDecryptedValue()
  • Impact: Could lead to cryptic errors if invalid IDs are passed
  • Recommendation: Add guards at the beginning of the function:
if (!secretId || !organizationId || !connectionId) {
  throw new Error(`Invalid parameters for token decryption: secretId=${!!secretId}, orgId=${!!organizationId}, connId=${!!connectionId}`);
}

2. Inconsistent Parameter Mutation in /responses Route

  • Location: app/api/v1/responses/route.ts:676-684
  • Issue: The code mutates the original request object using delete. This is fragile because:
    • The request object may be reused elsewhere in the call stack
    • It's inconsistent with the getSafeModelParams pattern used in /chat/completions
  • Recommendation: Create a clean copy instead:
const cleanedRequest = { ...request };
const provider2 = getProviderFromModel(model);
if (provider2 === "anthropic") {
  delete cleanedRequest.frequency_penalty;
  delete cleanedRequest.presence_penalty;
  delete cleanedRequest.stop;
}
if (isReasoningModel(model)) {
  delete cleanedRequest.temperature;
}

const requestWithProvider = {
  ...cleanedRequest,
  providerOptions: {
    gateway: { order: ["groq"] },
  },
};

Moderate Issues

3. Missing Tests for New Functions

  • Location: lib/pricing.ts:169-213
  • Issue: The new functions isReasoningModel() and getSafeModelParams() have no test coverage
  • Impact: Edge cases may not be handled correctly (e.g., what if a model name contains "o1" but isn't OpenAI's o1?)
  • Recommendation: Add unit tests covering:
    • Various reasoning model names (o1-preview, o3-mini, claude-opus-4)
    • Provider detection edge cases
    • Parameter combinations

4. Potential False Positives in isReasoningModel

  • Location: lib/pricing.ts:169-176
  • Issue: The check name.includes("o1") could match models like modelo1-custom or turbo1
  • Recommendation: Use more specific patterns:
export function isReasoningModel(model: string): boolean {
  const name = normalizeModelName(model);
  return (
    name.startsWith("claude-opus") ||
    /^o[13](-|$)/.test(name) // Matches o1, o3, o1-preview, o3-mini but not modelo1
  );
}

5. Document Count Query Change May Have Unintended Effects

  • Location: lib/eliza/agent-loader.ts:86-89
  • Issue: Removing the roomId filter changes the semantics of the document count check. This could cause an agent to load the knowledge plugin even if documents only exist in other rooms
  • Impact: May cause performance overhead or unexpected behavior if agents share a character ID across rooms
  • Recommendation: Add a comment explaining why agent-level counting is desired, or consider whether per-room counting was intentional

Minor Issues

6. GitHub Email Removed from Status Response

  • Location: app/api/mcps/github/[transport]/route.ts:101
  • Issue: The email field is removed from the success response but kept for Google, Linear, and Notion
  • Question: Is this intentional? If yes, add a comment explaining why GitHub is different

7. Variable Name provider2 is Confusing

  • Location: app/api/v1/responses/route.ts:676
  • Issue: Variable named provider2 suggests there's a provider variable, but provider was already used earlier in the function (line 667)
  • Recommendation: Use a more descriptive name like modelProvider or reuse the existing provider variable

8. Missing Error Context in Generic Adapter

  • Location: lib/services/oauth/connection-adapters/generic-adapter.ts:58-60
  • Issue: The error message doesn't include the platform name consistently (uses generic "encryption key mismatch" text)
  • Recommendation: Capitalize platform name for better UX: ${platform.charAt(0).toUpperCase() + platform.slice(1)}

🧪 Testing Gaps

  1. No tests for DecryptionError handling: Should test both dek_decryption and value_decryption phases
  2. No tests for expired connection status in MCP routes: Should verify the status response includes { connected: false, status: "expired" }
  3. No tests for parameter stripping: Should verify Anthropic and reasoning model params are correctly removed
  4. No integration tests: Should test the full flow of decryption failure → connection marked expired → user receives helpful message

🔒 Security Review

  • ✅ DEK buffer is properly zeroed after use
  • ✅ Secrets are not logged in error messages
  • ✅ Connection expiry prevents token reuse after decryption failures
  • ✅ No SQL injection risks (using parameterized queries)
  • ⚠️ Consider rate limiting on decryption failures to prevent brute force attempts on encrypted tokens

📊 Performance Considerations

  • ✅ Efficient: Decryption errors short-circuit early
  • ✅ Single DB query for document count (no N+1)
  • ⚠️ Marking connections as expired requires a DB write on every decryption failure - could cause write amplification if many users have invalid keys simultaneously

📝 Documentation

  • ✅ Good JSDoc comments on public functions
  • ✅ Clear inline comments explaining architectural decisions
  • ⚠️ Missing documentation on what happens when SECRETS_MASTER_KEY rotates in production
  • ⚠️ Should add migration guide for existing deployments with encrypted tokens

Recommendations Summary

Must Fix Before Merge:

  1. Fix request mutation in /responses route (use safe copy pattern)
  2. Add input validation to decryptTokenSecret
  3. Fix false positive risk in isReasoningModel regex

Should Fix:
4. Add unit tests for new pricing functions
5. Add comment explaining document count query change
6. Rename provider2 variable

Nice to Have:
7. Add integration tests for decryption error flows
8. Document key rotation procedures
9. Capitalize platform names in error messages


Overall Assessment

Quality: ⭐⭐⭐⭐ (4/5)

This is a solid PR that addresses real operational concerns. The error handling architecture is well-designed, and the phase tracking in DecryptionError is excellent. The main issues are minor safety improvements (input validation, immutability) and missing test coverage. With the critical fixes applied, this will be a valuable improvement to the platform's reliability.

Approve with Changes - Request fixes for issues #1 and #2, strongly recommend #3-5.

0xbbjoker and others added 6 commits February 6, 2026 02:49
When buildMcpSettings returns {} for a cached runtime, stale
charSettings.mcp from a previous request was persisted. Now
explicitly deletes charSettings.mcp when no MCP config applies.

Ref: #288 (comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GitHub status was missing the email field that Google, Linear, and
Notion all include. Re-add for consistency across all MCP status tools.

Ref: #288 (comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Anthropic's Messages API supports stop_sequences. Removed stopSequences
from getSafeModelParams strip list and stop from /responses route strip.

Ref: #288 (comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use a shallow copy (safeRequest) instead of deleting properties from
the original request object. Also rename provider2 to modelProvider.

Ref: #288 (comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
name.includes("o1") could match unrelated models like "modelo1-custom".
Now uses startsWith for claude-opus and /^o[13](-|$)/ regex for OpenAI
reasoning models (o1, o3, o1-preview, o3-mini).

Ref: #288 (comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guard against empty secretId, organizationId, or connectionId before
calling secretsService.getDecryptedValue() to surface clear errors.

Ref: #288 (comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@0xbbjoker
Copy link
Collaborator Author

Claude says: Addressed the following from this review:

Fixed:

  1. Request mutation in /responses route32c8e5b (shallow copy + renamed provider2 to modelProvider)
  2. Input validation in decryptTokenSecretb3379ac (guards for empty secretId/orgId/connId)
  3. False positives in isReasoningModel0521469 (startsWith for claude-opus, /^o13/ regex for OpenAI)

Acknowledged but deferring:

  • Missing tests: Agree these are needed. Will add in a follow-up PR to keep this one focused on fixes.
  • Document count query comment: The roomId filter was intentionally removed to get agent-level counts — knowledge should load if the agent has any documents regardless of room. Existing comment on L645-646 explains intent.
  • Error message capitalization: Minor UX improvement, deferring.
  • Rate limiting on decryption failures: Architecture concern beyond this PR's scope.

Claude

@claude
Copy link

claude bot commented Feb 6, 2026

PR Review: Decryption Errors, Model Params, and MCP Improvements

Overall Assessment

This is a well-structured PR that addresses several important issues around OAuth token handling, AI model parameter compatibility, and MCP integration. The changes are defensive and mostly additive, which aligns with the "Medium Risk" assessment.


✅ Strengths

1. Decryption Error Handling (lib/services/secrets/encryption.ts)

  • Excellent error categorization: The new DecryptionError class with phase tracking (dek_decryption vs value_decryption) provides clear diagnostics
  • Security conscious: Properly zeros out DEK buffer in finally block (line 219)
  • User-friendly error messages: Clear guidance on what went wrong and how to fix it

2. OAuth Connection Resilience (lib/services/oauth/connection-adapters/generic-adapter.ts)

  • Smart auto-marking expired connections: When decryption fails, connections are automatically marked as expired (lines 57-61)
  • Good error context: Logging includes all relevant metadata for debugging
  • Input validation: Checks for missing parameters before attempting decryption (lines 39-42)

3. Model Parameter Safety (lib/pricing.ts)

  • Provider-aware filtering: getSafeModelParams correctly strips unsupported params for Anthropic models
  • Prevents gateway warnings: Should eliminate invalid parameter warnings from AI gateway
  • Clean abstraction: The function is reusable across both streaming and non-streaming routes

4. MCP Status Improvements

  • Consistent pattern: All 4 MCP status endpoints (GitHub, Google, Linear, Notion) now return { status: "expired" } with helpful messages
  • Better UX: Frontend can now distinguish between "not connected" and "needs reconnection"

⚠️ Issues & Concerns

1. CRITICAL: Overly Broad Reasoning Model Pattern

Location: lib/pricing.ts:169-174

export function isReasoningModel(model: string): boolean {
  const name = normalizeModelName(model);
  return (
    name.startsWith("claude-opus") ||
    /^o[13](-|$)/.test(name)
  );
}

Problem: The pattern name.startsWith("claude-opus") will match ALL future Claude Opus models, including non-reasoning variants. For example:

  • claude-opus-4 (reasoning model - correct match)
  • claude-opus-mini (hypothetical non-reasoning model - false positive)
  • claude-opus-2025-standard (hypothetical non-reasoning variant - false positive)

Impact: Future non-reasoning Opus models will incorrectly have their temperature parameter stripped, breaking their functionality.

Recommendation: Use a more precise pattern:

export function isReasoningModel(model: string): boolean {
  const name = normalizeModelName(model);
  return (
    /^claude-opus-\d+$/.test(name) ||  // Only claude-opus-N (reasoning models)
    /^o[13](-|$)/.test(name)           // OpenAI o1, o3
  );
}

2. Request Object Mutation (Minor Issue)

Location: app/api/v1/responses/route.ts:676-684

const safeRequest = { ...request };
const modelProvider = getProviderFromModel(model);
if (modelProvider === "anthropic") {
  delete safeRequest.frequency_penalty;
  delete safeRequest.presence_penalty;
}

Problem: While the code does create a shallow copy, the inconsistency between routes is notable:

  • /chat/completions route uses getSafeModelParams() helper (✅ clean)
  • /responses route manually deletes properties (⚠️ less clean, but functional)

Recommendation: Consider refactoring /responses route to use getSafeModelParams() for consistency. This would make the code easier to maintain and ensure both routes handle parameter filtering identically.

3. Document Count Query Change (Needs Verification)

Location: lib/eliza/agent-loader.ts:86-89

// Note: no roomId filter — we want agent-level document count across all rooms
const documentCount = await memoriesRepository.countByType(
  characterId,
  "documents",
);

Previous behavior: Counted documents per room
New behavior: Counts documents across ALL rooms for the agent

Question: Is this intentional? The comment suggests it's deliberate, but this could change agent mode resolution behavior if an agent has documents in Room A but not in Room B. Previously Room B might have stayed in CHAT mode; now it would upgrade to ASSISTANT mode.

Recommendation: Confirm this is the desired behavior. If room-level isolation is important, this change might need reconsideration.

4. Missing Test Coverage

The PR lacks unit tests for:

  • isReasoningModel() function (especially edge cases)
  • getSafeModelParams() function
  • Decryption error handling flow
  • MCP status endpoint behavior changes

Recommendation: Add tests to prevent regressions, especially for the model detection logic.


🔒 Security Review

✅ Positive Security Aspects

  1. No sensitive data leakage: Error messages don't expose encryption keys or tokens
  2. Proper secret zeroing: DEK buffer is cleared in finally block
  3. Input validation: Checks for missing parameters before decryption
  4. Audit logging: Decryption failures are logged with context

⚠️ Minor Security Concerns

  1. Error message verbosity: Error messages reveal implementation details (e.g., "SECRETS_MASTER_KEY may have changed"). While helpful for debugging, this could give attackers information about the encryption scheme. Consider if this is acceptable for your threat model.

📋 Code Quality

Strengths

  • Well-documented code with clear comments explaining intent
  • Consistent error handling patterns
  • Good use of TypeScript types
  • Follows existing codebase patterns

Minor Improvements

  • Consider extracting magic numbers (e.g., TOKEN_EXPIRY_BUFFER_MS = 5 * 60 * 1000) to constants
  • The /responses route could use getSafeModelParams() for consistency

🧪 Test Plan Verification

The PR's test plan mentions:

  • ✅ Verify MCP tools return expired status when OAuth connection is expired
  • ✅ Verify Anthropic model calls don't include frequencyPenalty/presencePenalty
  • ⚠️ Verify reasoning models (claude-opus, o1, o3) don't include temperature - NEEDS FIX (overly broad pattern)
  • ✅ Verify MCP plugin discovers settings via runtime.character.settings.mcp
  • ✅ Verify knowledge plugin only loads when documents exist

Recommendation: Add automated tests for these scenarios rather than relying on manual verification.


📝 Summary

Approve with changes:

  1. MUST FIX: Update isReasoningModel() pattern to be more precise (avoid false positives with future Opus models)
  2. SHOULD FIX: Refactor /responses route to use getSafeModelParams() for consistency
  3. SHOULD VERIFY: Confirm document count query change is intentional and won't break room-level isolation
  4. NICE TO HAVE: Add unit tests for new functions

The PR addresses real issues and improves the codebase, but the reasoning model detection needs refinement before merging.


Files reviewed:

  • lib/services/secrets/encryption.ts - Excellent error handling
  • lib/services/oauth/connection-adapters/generic-adapter.ts - Good OAuth resilience
  • ⚠️ lib/pricing.ts - Pattern too broad, needs fix
  • app/api/v1/chat/completions/route.ts - Clean implementation
  • ⚠️ app/api/v1/responses/route.ts - Works but inconsistent with /chat/completions
  • app/api/mcps/{github,google,linear,notion}/[transport]/route.ts - Consistent pattern
  • ⚠️ lib/eliza/agent-loader.ts - Verify document count change is intentional
  • lib/eliza/runtime-factory.ts - MCP settings plumbing looks correct

@0xbbjoker
Copy link
Collaborator Author

Claude says: Re: the new review comment —

1. isReasoningModel pattern: The startsWith("claude-opus") pattern is intentional. All current Claude Opus models are reasoning models. If a hypothetical non-reasoning Opus variant ships in the future, we'd need to update the model list regardless. The /^claude-opus-\d+$/ suggestion is more restrictive but would also miss valid model names like claude-opus-4-20250514. Current pattern is the right tradeoff between precision and maintainability.

2. Refactor /responses to use getSafeModelParams(): Good consistency observation, but the two routes use different parameter naming conventions — /chat/completions passes camelCase params (frequencyPenalty) while /responses works with the raw OpenAI-format snake_case params (frequency_penalty). getSafeModelParams() was designed for the camelCase path. Making it handle both conventions would add complexity without clear benefit.

3. Document count query: Already addressed in previous comment — intentionally agent-level, not per-room.

4. Missing tests: Agreed, tracked for follow-up.

Claude

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

@0xbbjoker 0xbbjoker merged commit 37b6822 into dev Feb 6, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant