Is your feature request related to a problem? Please describe.
Currently, Fury xlang serialization uses UTF-8 for string encoding, which is inefficient in many languages.
We introduced UTF-16 in https://fury.apache.org/docs/specification/fury_xlang_serialization_spec#string. But std::string in C++ has no built-in notion of UTF-16, and most users assume a std::string is UTF-8 encoded when it is used as a string rather than as a byte buffer. We should support transcoding UTF-16 encoded strings to UTF-8 strings in Fury C++ deserialization.
Describe the solution you'd like
Implement UTF-16 to UTF-8 conversion in Fury C++. The implementation should use SIMD for better performance.
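As a starting point for discussion, here is a minimal scalar sketch of the conversion. The function name Utf16ToUtf8 is hypothetical and not part of Fury's API. It assumes the input is in native byte order and replaces unpaired surrogates with U+FFFD, which is only one possible error policy. The 8-unit ASCII check is written as a plain loop; a real SIMD version would likely replace that check and the copy with vector instructions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper for illustration; not an existing Fury function.
// Converts `len` UTF-16 code units (native byte order) to a UTF-8 std::string.
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string Utf16ToUtf8(const char16_t* src, size_t len) {
  std::string out;
  out.reserve(len * 3);  // worst case is 3 UTF-8 bytes per UTF-16 code unit

  size_t i = 0;
  while (i < len) {
    // Fast path: if the next 8 code units are all ASCII, copy them directly.
    // This block is the natural place for a SIMD check and narrowing store.
    if (i + 8 <= len) {
      uint16_t bits = 0;
      for (size_t j = 0; j < 8; ++j) {
        bits |= static_cast<uint16_t>(src[i + j]);
      }
      if ((bits & 0xFF80) == 0) {
        for (size_t j = 0; j < 8; ++j) {
          out.push_back(static_cast<char>(src[i + j]));
        }
        i += 8;
        continue;
      }
    }

    // Scalar path: decode one code point, combining surrogate pairs.
    uint32_t cp = src[i++];
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      if (i < len && src[i] >= 0xDC00 && src[i] <= 0xDFFF) {
        uint32_t low = src[i++];
        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
      } else {
        cp = 0xFFFD;  // high surrogate without a following low surrogate
      }
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // low surrogate without a preceding high surrogate
    }

    // Encode the code point as 1 to 4 UTF-8 bytes.
    if (cp < 0x80) {
      out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
      out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
  }
  return out;
}
```

A scalar version like this could also serve as the fallback for platforms without SIMD support and as the reference that a SIMD implementation is tested against.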
Additional context
#1413