[C++] Support converting UTF-16 encoded strings to UTF-8 strings #1546

@chaokunyang

Is your feature request related to a problem? Please describe.

Currently, Fury xlang serialization uses UTF-8 for string encoding, which is not efficient in many languages.

We introduced UTF-16 in https://fury.apache.org/docs/specification/fury_xlang_serialization_spec#string . But C++ doesn't support UTF-16, and most users assume that a std::string is UTF-8 encoded when it is used as a string rather than as a buffer. We should support transcoding UTF-16 encoded strings to UTF-8 strings in Fury C++ deserialization.

Describe the solution you'd like

Implement UTF-16 to UTF-8 conversion in Fury C++. The implementation should use SIMD for better performance.
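
For reference, here is a minimal scalar sketch of the conversion, which could serve as a correctness baseline and as the fallback path for a SIMD version. The function name `utf16_to_utf8_sketch` is hypothetical and not part of the Fury API. It assumes the input is already in native byte order and replaces unpaired surrogates with U+FFFD; the real implementation may choose different error handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Fury API): scalar UTF-16 -> UTF-8 transcoding.
// Assumptions: input is in native byte order; unpaired surrogates are
// replaced with U+FFFD instead of raising an error.
std::string utf16_to_utf8_sketch(const std::u16string &in) {
  std::string out;
  out.reserve(in.size() * 3);  // worst case: 3 bytes per UTF-16 code unit
  for (std::size_t i = 0; i < in.size(); ++i) {
    std::uint32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: must be followed by a low surrogate.
      if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;  // consume the low surrogate as well
      } else {
        cp = 0xFFFD;
      }
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // low surrogate without a preceding high surrogate
    }
    if (cp < 0x80) {
      // ASCII: one byte. This is the case a SIMD path would handle in bulk.
      out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
      out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
  }
  return out;
}
```

A SIMD implementation would typically check a block of code units at once and take a fast path when all of them are ASCII, falling back to a loop like this one for the rest. That part is not shown here.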

Additional context

#1413


Labels: enhancement
