Skip to content

Concatenating two ISO-2022-JP outputs from a conforming encoder doesn't result in conforming input #115

Open
@hsivonen

Description

@hsivonen

Encodings other than ISO-2022-JP have the property that if you concatenate two outputs from a conforming encoder and decode them together, you get the same result as when decoding them separately and then concatenating.

ISO-2022-JP lacks this property, because despite the encoder making an effort into this direction by emitting a transition to the ASCII state at the end, if the next segment being concatenated starts with a transition to a non-ASCII state, the concatenation results in zero ASCII bytes between two escapes.

Is there a strong reason for treating a transition to the ASCII state immediately followed by another escape as non-conforming? Or put the other way, what purpose does the transition to the ASCII state at the end of encode serve if not achieving the above-mentioned concatenation property that other encodings have?

This is relevant to RFC 2047 header decoding.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions