fix(ai): reject system-role UI messages in createAgentUIStream #14749
etairl wants to merge 1 commit into
Client-supplied UI messages cross an untrusted boundary, but the schema allowed `role: 'system'` and the conversion to model messages preserved it. A caller of `createAgentUIStream` / `createAgentUIStreamResponse` could therefore inject arbitrary system-level instructions alongside the developer's trusted `instructions`, enabling prompt injection that the model cannot distinguish from the developer-authored system prompt.

Throw `InvalidArgumentError` if any inbound UI message has `role: 'system'`. System instructions belong on the agent's `instructions` setting, not on inbound messages.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
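The check described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SDK's code: `UIMessageLike`, the local `InvalidArgumentError` class, and `assertNoSystemMessages` are hypothetical stand-ins, and the real message type carries more fields.

```typescript
// Hypothetical stand-in for the SDK's UI message type; only `role` matters here.
type UIMessageLike = {
  id: string;
  role: 'system' | 'user' | 'assistant';
  parts: unknown[];
};

// Local stand-in for the SDK's InvalidArgumentError (shape assumed for this sketch).
class InvalidArgumentError extends Error {
  parameter: string;

  constructor(parameter: string, message: string) {
    super(message);
    this.name = 'InvalidArgumentError';
    this.parameter = parameter;
  }
}

// Fail closed: reject the whole request if any inbound message claims the system role.
function assertNoSystemMessages(messages: UIMessageLike[]): void {
  const index = messages.findIndex(message => message.role === 'system');
  if (index !== -1) {
    throw new InvalidArgumentError(
      'uiMessages',
      `UI message at index ${index} has role 'system'. ` +
        `Set system prompts via the agent's instructions setting instead.`,
    );
  }
}
```

Running the check before any conversion to model messages means a rejected request never reaches the model.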
a) I agree
b) call option schema: this would be great, ideally should be a separate dedicated PR
c) a separate PR would be great just for media type
Re a) thinking about it more, I would prefer the check to be part of the model message scanning. I will take this on.
Gotcha, I'll create separate PRs for B & C and leave A to you :)
3ce20fc to 8bfd422
Split and created separate PRs:
## Background

`ToolLoopAgentSettings.callOptionsSchema` is declared and documented as a runtime schema for caller-supplied `options`, but `ToolLoopAgent.prepareCall` never invokes it. Any invariant a developer encodes in that schema is silently bypassed at runtime, and unchecked `options` flow straight into `prepareCall` and any `instructions` template that interpolates them, defeating both the validation guarantee and any input-shape assumptions downstream code makes.

Splitting this out per maintainer feedback on #14749 that `callOptionsSchema` enforcement should be its own dedicated PR.

## Summary

- `packages/ai/src/agent/tool-loop-agent.ts`: at the top of `prepareCall`, when `callOptionsSchema` is set and the caller passed `options`, validate via `safeValidateTypes`. On failure, throw `InvalidArgumentError` with the schema's error message. On success, swap the caller-supplied `options` for the validated (parsed) value so any schema transforms or defaults take effect for the rest of the call.
- `packages/ai/src/agent/tool-loop-agent.test.ts`: regression tests covering the rejection path (out-of-enum value rejected before reaching the model) and the accept path (in-enum value passes through and the model is invoked normally).
- Patch changeset.

The check is gated on `options !== undefined`, so existing callers that don't supply `options` are unaffected. Only agents that opted into `callOptionsSchema` see new behaviour, and that behaviour is exactly what the field name and docs already promised.

## Manual Verification

- `pnpm --filter ai test:node -- src/agent/tool-loop-agent.test.ts`: passes, including the two new `callOptionsSchema` tests.

## Checklist

- [x] Tests have been added / updated (for bug fixes / features)
- [ ] Documentation has been added / updated (for bug fixes / features)
- [x] A _patch_ changeset for relevant packages has been added (for bug fixes / features - run `pnpm changeset` in the project root)
- [x] I have reviewed this pull request (self-review)

Documentation is unchanged: the existing `callOptionsSchema` JSDoc already describes the intended behaviour; this PR makes the runtime match the docs.

## Future Work

None. This PR makes `callOptionsSchema` behave as documented.

## Related Issues

- Split out from #14749 (closed) per maintainer feedback.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Lars Grammel <lars.grammel@gmail.com>
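The validate-then-swap flow from the Summary above might look roughly like the sketch below. It is illustrative only: `SchemaLike`, `resolveCallOptions`, and the example schema are hypothetical names invented here, the hand-written validator stands in for whatever schema library an agent actually uses, and a plain `Error` stands in for `InvalidArgumentError`.

```typescript
// Hypothetical minimal schema interface: yields a parsed value or an error.
type ValidationResult<T> =
  | { success: true; value: T }
  | { success: false; error: Error };

interface SchemaLike<T> {
  validate(input: unknown): ValidationResult<T>;
}

// Validate caller-supplied options and return the parsed value on success.
function resolveCallOptions<T>(
  schema: SchemaLike<T> | undefined,
  options: unknown,
): unknown {
  // No schema or no options: nothing to enforce, so existing callers are unaffected.
  if (schema === undefined || options === undefined) return options;

  const result = schema.validate(options);
  if (!result.success) {
    // Plain Error used as a stand-in for the SDK's InvalidArgumentError.
    throw new Error(`Invalid call options: ${result.error.message}`);
  }
  // Hand back the parsed value so schema defaults and transforms apply downstream.
  return result.value;
}

// Example schema (assumed for illustration): `topic` is an enum, `tone` has a default.
const topics = ['billing', 'support'] as const;
type Topic = (typeof topics)[number];
type CallOptions = { topic: Topic; tone: string };

const exampleCallOptionsSchema: SchemaLike<CallOptions> = {
  validate(input) {
    if (typeof input !== 'object' || input === null) {
      return { success: false, error: new Error('expected an object') };
    }
    const { topic, tone } = input as { topic?: unknown; tone?: unknown };
    if (typeof topic !== 'string' || !(topics as readonly string[]).includes(topic)) {
      return {
        success: false,
        error: new Error(`topic must be one of: ${topics.join(', ')}`),
      };
    }
    return {
      success: true,
      value: { topic: topic as Topic, tone: typeof tone === 'string' ? tone : 'neutral' },
    };
  },
};
```

Returning the parsed value rather than the raw input is what lets a default such as `tone` take effect for the rest of the call.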
…ion (#14751)

## Background

`getMediaTypeFromUrl` in `convert-to-language-model-prompt.ts` does an `ext in URL_EXTENSION_TO_MEDIA_TYPE` check on a plain object literal. Because plain objects inherit from `Object.prototype`, the `in` operator returns `true` for inherited keys like `constructor`, `toString`, `hasOwnProperty`, etc. A URL ending in `.constructor` (or any other `Object.prototype` member) therefore takes the lookup branch and returns the inherited value. For `.constructor`, that's the `Object` constructor function, which is then forwarded as `mediaType` to provider adapters.

This is a low-severity correctness/typing bug rather than an exploit path, but it's worth fixing: the helper's return type is `string | undefined` and a non-string slipping through can break downstream code paths that assume a string `mediaType`.

Splitting this out per maintainer feedback on #14749 that the media-type fix should be its own PR.

## Summary

- `packages/ai/src/prompt/convert-to-language-model-prompt.ts`: replace `ext in URL_EXTENSION_TO_MEDIA_TYPE` with `Object.hasOwn(URL_EXTENSION_TO_MEDIA_TYPE, ext)` so only own-property extensions are matched.
- `packages/ai/src/prompt/convert-to-language-model-prompt.test.ts`: regression test for a URL ending in `.constructor`; asserts the helper falls back to the no-extension behaviour instead of returning a non-string value from the prototype chain.
- Patch changeset.

The change is one line of production code; the table is treated as a closed lookup, which is what the original code intended.

## Manual Verification

- `pnpm --filter ai test:node -- src/prompt/convert-to-language-model-prompt.test.ts`: passes, including the new prototype-collision regression test.

## Checklist

- [x] Tests have been added / updated (for bug fixes / features)
- [ ] Documentation has been added / updated (for bug fixes / features)
- [x] A _patch_ changeset for relevant packages has been added (for bug fixes / features - run `pnpm changeset` in the project root)
- [x] I have reviewed this pull request (self-review)

No documentation change: `getMediaTypeFromUrl` is internal.

## Future Work

None. If other helpers in the prompt layer use `in` against plain object literals indexed by user input, the same prototype-confusion fix would apply, but I didn't spot any others while looking at this one.

## Related Issues

- Split out from #14749 (closed) per maintainer feedback.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Lars Grammel <lars.grammel@gmail.com>
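The difference between the two lookups can be reproduced in isolation. The table below is an illustrative subset made up for this sketch (the real `URL_EXTENSION_TO_MEDIA_TYPE` entries are not shown in this thread), and `lookupWithIn` / `lookupWithHasOwn` are hypothetical helper names.

```typescript
// Illustrative extension table; entries are assumptions, not the SDK's actual table.
const EXTENSION_TO_MEDIA_TYPE: Record<string, string> = {
  png: 'image/png',
  jpg: 'image/jpeg',
  pdf: 'application/pdf',
};

// Buggy variant: `in` also matches keys inherited from Object.prototype.
function lookupWithIn(ext: string): unknown {
  return ext in EXTENSION_TO_MEDIA_TYPE ? EXTENSION_TO_MEDIA_TYPE[ext] : undefined;
}

// Fixed variant: only own properties count as table entries.
function lookupWithHasOwn(ext: string): string | undefined {
  return Object.hasOwn(EXTENSION_TO_MEDIA_TYPE, ext)
    ? EXTENSION_TO_MEDIA_TYPE[ext]
    : undefined;
}

console.log(typeof lookupWithIn('constructor')); // "function" (inherited Object constructor)
console.log(lookupWithHasOwn('constructor')); // undefined
console.log(lookupWithHasOwn('png')); // "image/png"
```

An alternative with the same effect would be to build the table with `Object.create(null)` or a `Map`, so there is no prototype chain to fall through to; the PR chose the one-line `Object.hasOwn` change instead.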
…-in) (#14752)

## Background

For historical and convenience reasons, system messages can be part of user messages or prompts, e.g. to allow interleaving regular messages and system messages.

However, this creates a prompt injection risk where the user (e.g. by modifying the messages in a web UI) can override or set the system prompt. In most cases, it should only be possible to set the system prompt via the system (or instructions) property, and users should not be able to inject system messages.

## Summary

* throw `InvalidPromptError` when there are system messages in the messages or prompt options
* add `allowSystemInMessages` option for opting into allowing system messages in messages or prompt options

## Future Work

* add `allowSystemInMessages` opt-in support to `WorkflowAgent` (if desired) @gr2m

## Related Issues

Issue reported in #14749
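A rough sketch of the default-deny behaviour with an explicit opt-in, as the Summary above describes it. Only the option name `allowSystemInMessages` and the error name `InvalidPromptError` come from the PR text; `MessageLike`, `checkSystemMessages`, and the local error class are hypothetical stand-ins, and where the real check runs inside the SDK is not shown here.

```typescript
// Hypothetical simplified message shape for this sketch.
type MessageLike = {
  role: 'system' | 'user' | 'assistant';
  content: string;
};

// Local stand-in for the SDK's InvalidPromptError (shape assumed).
class InvalidPromptError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'InvalidPromptError';
  }
}

// Default-deny: system messages are rejected unless the developer opts in.
function checkSystemMessages(
  messages: MessageLike[],
  options: { allowSystemInMessages?: boolean } = {},
): MessageLike[] {
  if (options.allowSystemInMessages === true) {
    return messages; // explicit opt-in by the developer, nothing to check
  }
  if (messages.some(message => message.role === 'system')) {
    throw new InvalidPromptError(
      'System messages are not allowed in messages or prompt; ' +
        'use the system (or instructions) property, or set allowSystemInMessages.',
    );
  }
  return messages;
}
```

Because the opt-in lives in developer-controlled call settings rather than in the message list, a client that edits its own messages cannot turn it on.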
## Background

For historical and convenience reasons, system messages can be part of user messages or prompts, e.g. to allow interleaving regular messages and system messages.

However, this creates a prompt injection risk where the user (e.g. by modifying the messages in a web UI) can override or set the system prompt. In most cases, it should only be possible to set the system prompt via the system (or instructions) property, and users should not be able to inject system messages.

## Summary

* `allowSystemInMessages === undefined`: print warning when there are system messages in the messages or prompt options
* `allowSystemInMessages === true`: allow system messages in the messages or prompt options
* `allowSystemInMessages === false`: throw `InvalidPromptError` when there are system messages in the messages or prompt options

## Related Issues

Adjusted backport of #14752

Issue reported in #14749
## Background

For historical and convenience reasons, system messages can be part of user messages or prompts, e.g. to allow interleaving regular messages and system messages.

However, this creates a prompt injection risk where the user (e.g. by modifying the messages in a web UI) can override or set the system prompt. In most cases, it should only be possible to set the system prompt via the system (or instructions) property, and users should not be able to inject system messages.

## Summary

* `allowSystemInMessages === undefined`: print warning when there are system messages in the messages or prompt options
* `allowSystemInMessages === true`: allow system messages in the messages or prompt options
* `allowSystemInMessages === false`: throw `InvalidPromptError` when there are system messages in the messages or prompt options

## Related Issues

Adjusted backport of #14810 and #14752

Issue reported in #14749
## Background

Three issues were identified in `packages/ai` that weaken trust boundaries between developer-controlled and caller-controlled inputs:

1. `role: 'system'` injection in `createAgentUIStream` (high). `uiMessagesSchema` allows `role: 'system'`, `convertToModelMessages` preserves it, and `createAgentUIStream` forwards the converted prompt to `agent.stream()`. Any caller of `createAgentUIStream` / `createAgentUIStreamResponse` could therefore inject system-level instructions alongside the developer's `instructions`, indistinguishable to the model from the developer-authored system prompt.
2. `callOptionsSchema` declared but never enforced (medium). `ToolLoopAgentSettings.callOptionsSchema` is named and documented as a runtime schema for `options`, but `tool-loop-agent.ts` never invokes it. Any invariant a developer encodes in that schema is silently bypassed at runtime, and unchecked `options` flow straight into `prepareCall` and any `instructions` template that interpolates them.
3. Prototype-chain lookup in `getMediaTypeFromUrl` (low). The helper used `ext in URL_EXTENSION_TO_MEDIA_TYPE` against a plain object literal, so a URL ending in `.constructor` resolved through the prototype chain and returned the `Object` constructor (a function), violating the helper's `: string` return type and forwarding a non-string `mediaType` to provider adapters.

## Summary

- `packages/ai/src/agent/create-agent-ui-stream.ts`: after `validateUIMessages`, throw `InvalidArgumentError` if any inbound UI message has `role: 'system'`. The error message points developers at the agent's `instructions` setting, which is the supported way to set system prompts.
- `packages/ai/src/agent/tool-loop-agent.ts`: in `prepareCall`, when `callOptionsSchema` is set and `options` is provided, run `safeValidateTypes` and throw `InvalidArgumentError` on failure, before forwarding to `prepareCall` / `generateText` / `streamText`. The validated value replaces the raw input on the way through.
- `packages/ai/src/prompt/convert-to-language-model-prompt.ts`: replace `ext in URL_EXTENSION_TO_MEDIA_TYPE` with `Object.hasOwn(...)` so attacker-controlled extensions like `.constructor` cannot resolve to inherited `Object.prototype` keys.
- Regression tests in `create-agent-ui-stream-response.test.ts`, `tool-loop-agent.test.ts`, and `convert-to-language-model-prompt.test.ts`.
- Patch changeset for `ai` covering all three fixes.

## Manual Verification

- `npx vitest --config vitest.node.config.js --run src/agent/tool-loop-agent.test.ts src/agent/create-agent-ui-stream-response.test.ts src/prompt/convert-to-language-model-prompt.test.ts`: 151/151 pass, including the new regression tests.
- `npx tsc --noEmit` in `packages/ai`: clean.
- The `system`-role test asserts the underlying `MockLanguageModelV4.doStream` is never called (i.e. the request fails closed before reaching the model).
- The `callOptionsSchema` test rejects an out-of-enum `topic` and accepts an in-enum value end-to-end.
- Spot check (`'constructor' in {}` → `true`, `Object.hasOwn({}, 'constructor')` → `false`) to confirm the fix changes behavior only on collision keys.

## Checklist

- A _patch_ changeset for relevant packages has been added (for bug fixes / features - run `pnpm changeset` in the project root)

Documentation is not updated because the fixes preserve documented behavior: `instructions` remains the supported way to set system prompts, `callOptionsSchema` now actually does what its name and JSDoc imply, and `getMediaTypeFromUrl` is a private helper.

## Future Work

- Tightening `uiMessagesSchema` itself to `z.enum(['user', 'assistant'])` (or splitting client-trust from server-trust constructions) so `validateUIMessages` is safe by default in any inbound-handler context, not just inside `createAgentUIStream`. Left out of this PR to avoid changing the public shape of `validateUIMessages` for server-side callers that programmatically construct system messages.