Reading a string prefixed with a length in code points, not bytes #1020

@rgov

Suppose I have the string "π", which UTF-8 encodes as the byte sequence [0xCF, 0x80]. It is serialized as a length-prefixed string, but the length counts characters (code points) rather than bytes. So the prefix holds 1 (one code point) instead of 2 (two bytes).
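
To make the format concrete, here is a minimal sketch in plain Python (deliberately not using construct, since how to express it there is the question). It assumes the length prefix is a single unsigned byte, which is only a guess for illustration; the helper names `utf8_sequence_length`, `read_codepoint_prefixed` and `write_codepoint_prefixed` are hypothetical. The reader cannot know the byte length up front, so it inspects each UTF-8 lead byte to find out how many continuation bytes follow.

```python
import io


def utf8_sequence_length(lead: int) -> int:
    """Return the total byte length of a UTF-8 sequence given its lead byte."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#04x}")


def read_codepoint_prefixed(stream) -> str:
    """Read a string whose prefix counts code points, not bytes."""
    # Assumption: the prefix is one unsigned byte.
    prefix = stream.read(1)
    if len(prefix) != 1:
        raise EOFError("missing length prefix")
    count = prefix[0]

    raw = bytearray()
    for _ in range(count):
        lead = stream.read(1)
        if len(lead) != 1:
            raise EOFError("stream ended before all code points were read")
        extra = utf8_sequence_length(lead[0]) - 1
        rest = stream.read(extra)
        if len(rest) != extra:
            raise EOFError("stream ended inside a UTF-8 sequence")
        raw += lead + rest

    # Strict decoding also rejects malformed continuation bytes.
    return raw.decode("utf-8")


def write_codepoint_prefixed(text: str) -> bytes:
    """Serialize text with a one-byte prefix holding the code point count."""
    # len() on a Python str counts code points.
    if len(text) > 255:
        raise ValueError("too many code points for a one-byte prefix")
    return bytes([len(text)]) + text.encode("utf-8")


if __name__ == "__main__":
    data = write_codepoint_prefixed("π")
    print(data.hex())  # prefix byte followed by the two UTF-8 bytes
    print(read_codepoint_prefixed(io.BytesIO(data)))
```

The point of the sketch is that the byte length of the payload is only known after decoding, so a reader driven by a byte count cannot consume this format directly.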

How is this expressed in construct?
