Skip to content

fix: reject newline characters in SSE event and id fields#15187

Open
subhashdasyam wants to merge 2 commits intofastapi:masterfrom
subhashdasyam:fix/sse-newline-injection
Open

fix: reject newline characters in SSE event and id fields#15187
subhashdasyam wants to merge 2 commits intofastapi:masterfrom
subhashdasyam:fix/sse-newline-injection

Conversation

@subhashdasyam
Copy link

Summary

format_sse_event() in fastapi/sse.py interpolated the event and id fields
into SSE wire-format bytes without stripping or rejecting newline characters.
A \n or \r\n in either field injects an additional SSE field-line into the
stream, enabling event-type spoofing and fabricated data: payloads in browser
EventSource clients.

The data and comment fields in the same function are correctly protected via
splitlines(). The omission for event and id is an inconsistency — not a
design decision.

Vulnerability class: SSE Protocol Injection (CRLF injection into wire format)
CWE: CWE-116 — Improper Encoding for Output in a Different Plaintext Context
OWASP: A03:2021 — Injection
Prior art: Hono GHSA-p6xx-57qc-3wxr — identical bug class, same fix pattern

Inconsistency before this fix

Field Newline handling Safe?
data splitlines() loop ✅ Yes
comment splitlines() loop ✅ Yes
event None ❌ No
id Only \0 check ❌ No
retry int type (no strings) ✅ Yes

What the injection looks like

A request like GET /stream?type=chat%0Adata:%20{"cmd":"exec"} causes
format_sse_event to produce:

event: chat
data: {"cmd":"exec"}
data: "real server data"

The browser's EventSource receives a chat event with a fabricated data:
field prepended to the real payload. Frontends that process event.data
line-by-line (common in streaming AI/MCP UIs) receive a corrupted payload.

Changes

  • fastapi/sse.py

    • Add _check_event_no_newline() validator; apply to ServerSentEvent.event
      via AfterValidator (mirrors the existing _check_id_no_null pattern)
    • Extend _check_id_no_null() to also reject \n and \r
    • Strip \r/\n from event and id in format_sse_event() as
      defense-in-depth (low-level function; callable without going through
      the Pydantic model validators)
  • tests/test_sse.py

    • 7 new regression tests covering LF/CR rejection in ServerSentEvent.event,
      ServerSentEvent.id, and the format_sse_event() low-level function

Test results

All 25 tests pass (pytest tests/test_sse.py -v), including the 7 new ones.

Severity

Low at framework level / Medium in affected applications. The vulnerability
only activates when a developer routes attacker-influenced input into the event=
or id= fields — most correct uses pass static strings. The fix is a one-line
guard that prevents the framework from silently accepting malformed input.

Copilot AI review requested due to automatic review settings March 21, 2026 14:57
@codspeed-hq
Copy link

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 20 untouched benchmarks


Comparing subhashdasyam:fix/sse-newline-injection (06eccbf) with master (12bbd94)1

Open in CodSpeed

Footnotes

  1. No successful run was found on master (64feaec) during the generation of this report, so 12bbd94 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR closes an SSE protocol injection vector by preventing CR/LF injection via the event and id fields in format_sse_event() / ServerSentEvent, aligning their handling with the already-safe data and comment behavior.

Changes:

  • Add validation to reject CR/LF in ServerSentEvent.event and extend ServerSentEvent.id validation to reject CR/LF.
  • Add defense-in-depth sanitization in format_sse_event() by stripping CR/LF from event and id before encoding.
  • Add regression tests covering newline/CR rejection and low-level formatting sanitization.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
fastapi/sse.py Adds newline/CR validation for event/id and strips CR/LF in format_sse_event() to prevent SSE field-line injection.
tests/test_sse.py Adds regression tests to ensure newline/CR are rejected at the model level and sanitized at the formatter level.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@subhashdasyam subhashdasyam marked this pull request as draft March 21, 2026 15:00
@subhashdasyam subhashdasyam marked this pull request as ready for review March 21, 2026 15:01
`format_sse_event()` interpolated the `event` and `id` fields into SSE
wire-format bytes without stripping or rejecting newline characters. A
`\n` or `\r\n` in either field injects an additional SSE field-line into
the stream — enabling event-type spoofing and fabricated data payloads
in browser EventSource clients.

The `data` and `comment` fields were already protected via `splitlines()`;
the omission for `event` and `id` was an inconsistency, not a design
decision. The identical bug class was fixed in Hono (GHSA-p6xx-57qc-3wxr).

Changes:
- Add `_check_event_no_newline()` validator and apply it to
  `ServerSentEvent.event` via `AfterValidator`
- Extend `_check_id_no_null()` to also reject `\n` and `\r`
- Strip `\r`/`\n` in `format_sse_event()` for `event` and `id` as
  defense-in-depth (low-level function, bypasses model validators)
- Add 7 regression tests covering LF/CR rejection in both the model
  and the low-level formatting function

Severity: Medium (client-side injection; server unaffected)
CWE: CWE-116 — Improper Encoding for Output in a Different Plaintext Context
OWASP: A03:2021 — Injection
Prior art: Hono GHSA-p6xx-57qc-3wxr (same class, same fix pattern)
@subhashdasyam subhashdasyam force-pushed the fix/sse-newline-injection branch from 6b62388 to 8bfab52 Compare March 21, 2026 15:03
@subhashdasyam
Copy link
Author

Hi maintainers 👋

The Labels / check-labels CI check is failing because this PR is missing one of the required labels.

This is a security fix (SSE protocol injection via unvalidated event and id fields — CWE-116), so the appropriate label would be security.

Could a maintainer please apply the security label so the check passes? Thank you!

@subhashdasyam
Copy link
Author

Fix tracked in issue #15188.

@SadManFahIm
Copy link

Thanks for the thorough fix and the regression tests! A few observations while reviewing:

1. Bare \r test coverage
The validator at line 46 correctly rejects \r alone ("\n" in v or "\r" in v).
Could you confirm whether tests/test_sse.py includes an explicit test case for
a bare carriage return without a newline?

# e.g.
with pytest.raises(ValueError):
    ServerSentEvent(event="chat\radmin_command")

2. Inconsistency in format_sse_event() for null bytes on id
The _check_id_no_null_or_newline validator raises ValueError on \0, but
format_sse_event() at line 220 silently strips \0 via .replace("\0", "").
These two code paths have different behaviors for the same input — is that intentional?
If the validator rejects null bytes at the model level, should format_sse_event()
also raise (or at least not strip silently)?

3. event has no null-byte check
id has had null-byte protection since the original _check_id_no_null.
The new _check_event_no_newline doesn't include a similar \0 check for event.
Is that an intentional asymmetry, or worth adding for consistency?

Overall the fix looks solid — the two-layer approach (validator + defense-in-depth
sanitization in format_sse_event) is a good pattern. Just flagging these for
completeness.

…'id'

`_check_event_no_newline` lacked a `\0` check that `_check_id_no_null_or_newline`
has had since the original implementation. Similarly, `format_sse_event()` stripped
`\0` from `id` but not from `event`.

Changes:
- Extend `_check_event_no_newline` to also reject `\0` characters
- Strip `\0` in `format_sse_event()` for `event` (mirrors existing `id` behavior)
- Add `test_server_sent_event_null_event_rejected` regression test

`event` and `id` are now fully symmetric in both the model validator and the
low-level wire formatter.
@subhashdasyam
Copy link
Author

Thanks for the thorough fix and the regression tests! A few observations while reviewing:

1. Bare \r test coverage The validator at line 46 correctly rejects \r alone ("\n" in v or "\r" in v). Could you confirm whether tests/test_sse.py includes an explicit test case for a bare carriage return without a newline?

# e.g.
with pytest.raises(ValueError):
    ServerSentEvent(event="chat\radmin_command")

2. Inconsistency in format_sse_event() for null bytes on id The _check_id_no_null_or_newline validator raises ValueError on \0, but format_sse_event() at line 220 silently strips \0 via .replace("\0", ""). These two code paths have different behaviors for the same input — is that intentional? If the validator rejects null bytes at the model level, should format_sse_event() also raise (or at least not strip silently)?

3. event has no null-byte check id has had null-byte protection since the original _check_id_no_null. The new _check_event_no_newline doesn't include a similar \0 check for event. Is that an intentional asymmetry, or worth adding for consistency?

Overall the fix looks solid — the two-layer approach (validator + defense-in-depth sanitization in format_sse_event) is a good pattern. Just flagging these for completeness.

Thanks for looking closely at this.

On point 2 -- \0 inconsistency between validator and format_sse_event()

The difference is intentional. These two layers have different jobs:

  • ServerSentEvent (Pydantic model) is a strict gate. It rejects bad input so the caller has to fix their data upstream. Raising on \0 here is the right behavior.
  • format_sse_event() is a best-effort sanitizer. It is a low-level function callable directly without Pydantic (from routing internals, custom integrations, etc.). Its job is to always produce safe wire output regardless of how it was called. Raising there would defeat that purpose.

This pattern (raise at model level, sanitize at wire level) is the same approach used for \n/\r and was documented in the commit message as "defense-in-depth". The behavioral difference between the two layers is by design, not an oversight.

On point 3 -- event has no null-byte check

You are right, this was a real gap. id has had null-byte protection since the original _check_id_no_null, but _check_event_no_newline did not have it, and format_sse_event() was not stripping \0 from event either.

Fixed in the follow-up commit:

  • _check_event_no_newline now rejects \0 with ValueError("SSE 'event' must not contain null characters")
  • format_sse_event() now strips \0 from event, matching the existing id behavior
  • Added test_server_sent_event_null_event_rejected as a regression test

event and id are now fully symmetric in both the validator and the low-level formatter.

@YuriiMotov
Copy link
Member

YuriiMotov commented Mar 22, 2026

format_sse_event() is a best-effort sanitizer. It is a low-level function callable directly without Pydantic (from routing internals, custom integrations, etc.). Its job is to always produce safe wire output regardless of how it was called. Raising there would defeat that purpose.

I think this assumption is incorrect. In case of data field, newline characters is absolutely normal and we must handle it. But for event field it's not normal and we should reject such value instead of just removing those characters

@subhashdasyam
Copy link
Author

format_sse_event() is a best-effort sanitizer. It is a low-level function callable directly without Pydantic (from routing internals, custom integrations, etc.). Its job is to always produce safe wire output regardless of how it was called. Raising there would defeat that purpose.

I think this assumption is incorrect. In case of data field, newline characters is absolutely normal and we must handle it. But for event field it's not normal and we should reject such value instead of just removing those characters

That distinction makes sense. data is multi-line by design in the SSE spec, so splitlines() is the right move there. event and id are single-value fields and a newline in either one isn't something you'd ever want silently swallowed.

id falls in the same category as event here. Should I update format_sse_event to raise ValueError for \n, \r, and \0 in both event and id, and leave the splitlines() handling for data and comment as is?

@YuriiMotov
Copy link
Member

Should I update format_sse_event to raise ValueError ... ?

I think yes, but it's you PR and you should decide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants