JSONL content type matcher accepts unrelated strings #276

@fallintoplace

Summary

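OpenAI::Internal::Util::JSONL_CONTENT currently matches strings that are not JSONL media types. The regex uses (:? groups where non-capturing (?:...) groups were likely intended; (:? instead opens a capturing group that starts with an optional literal colon. In addition, the jsonl alternative is not anchored to application/, so it can match jsonl anywhere in the string.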

Because decode_content switches to line-delimited JSON parsing based on this matcher, unrelated response content types can be treated as JSONL streams.
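
To make the two regex pitfalls concrete, here is a minimal sketch using small stand-in patterns. These patterns are hypothetical and written only for illustration; they are not copied from the library's JSONL_CONTENT definition.

```ruby
# Hypothetical stand-in patterns, for illustration only.

# Pitfall 1: "(:?" is a capturing group that begins with an optional ":",
# whereas "(?:" is a non-capturing group.
TYPO_GROUP     = /a(:?b)c/
INTENDED_GROUP = /a(?:b)c/

p TYPO_GROUP.match?("a:bc")              # => true  (the stray colon is accepted)
p INTENDED_GROUP.match?("a:bc")          # => false
p TYPO_GROUP.match("abc").captures       # => ["b"] (the group captures)
p INTENDED_GROUP.match("abc").captures   # => []

# Pitfall 2: alternation has the lowest precedence, so an anchor on the
# left branch does not apply to the right branch.
UNANCHORED = %r{^application/x-ndjson|jsonl}
ANCHORED   = %r{^application/(?:x-ndjson|jsonl)}

p UNANCHORED.match?("foojsonlbar")       # => true  (right branch matches anywhere)
p ANCHORED.match?("foojsonlbar")         # => false
```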

Reproduction

require "openai"

jsonl = OpenAI::Internal::Util::JSONL_CONTENT

p jsonl.match?("text/plain; name=jsonl")
p jsonl.match?("foojsonlbar")
p jsonl.match?("application/notjsonlbutjsonl")

Expected

All three values should be false.

Actual

All three values are true.

Expected matches

The matcher should still accept valid JSONL media types, including parameterized content types:

application/x-ndjson
application/x-ldjson
application/jsonl
application/x-jsonl
application/jsonl; charset=utf-8
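
One possible stricter matcher is sketched below. The constant name STRICT_JSONL, the case-insensitive flag, and the way optional parameters are handled are assumptions made for this sketch, not the project's actual fix.

```ruby
# Hypothetical stricter pattern (an assumption, not the library's definition):
# - \A and \z anchor the whole string
# - both alternatives sit inside one non-capturing group after "application/"
# - an optional ";"-introduced parameter section is allowed
STRICT_JSONL = %r{\Aapplication/(?:x-(?:n|l)djson|(?:x-)?jsonl)(?:\s*;.*)?\z}i

ACCEPTED = [
  "application/x-ndjson",
  "application/x-ldjson",
  "application/jsonl",
  "application/x-jsonl",
  "application/jsonl; charset=utf-8"
].freeze

REJECTED = [
  "text/plain; name=jsonl",
  "foojsonlbar",
  "application/notjsonlbutjsonl"
].freeze

ACCEPTED.each { |type| puts "#{type.inspect} accepted: #{STRICT_JSONL.match?(type)}" }
REJECTED.each { |type| puts "#{type.inspect} accepted: #{STRICT_JSONL.match?(type)}" }
```

With this sketch, every entry in ACCEPTED matches and every entry in REJECTED does not, which lines up with the expected behavior described in this issue.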

Impact

A non-JSONL response whose content type contains jsonl elsewhere can be parsed line-by-line as JSON, which can produce incorrect decoding behavior or parser errors for unrelated response bodies.
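
The failure mode can be illustrated with a generic line-delimited decoder. The helper decode_jsonl_lines below is a hypothetical stand-in, not the library's decode_content, and only shows what happens when a plain-text body is parsed line by line as JSON.

```ruby
require "json"

# Hypothetical stand-in for a line-delimited JSON decoder.
def decode_jsonl_lines(body)
  body.each_line.map { |line| JSON.parse(line) }
end

# A genuine JSONL body decodes into one object per line.
p decode_jsonl_lines(%({"a":1}\n{"b":2}\n))

# A plain-text body that was misclassified as JSONL raises a parser error.
begin
  decode_jsonl_lines("hello world\nsecond line\n")
rescue JSON::ParserError => e
  puts "parser error: #{e.class}"
end
```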
