Support UTF-8 strings for `read` and `lookup` outputs while using `ProtocolBuffer` encoding #170

jaeyeol-moloco · 2023-12-26T07:41:58Z

Is your feature request related to a problem? Please describe.
When I use ProtocolBuffer encoding, string values are printed as unreadable escaped bytes, which is frustrating.
For example, a Korean string "경동나비엔" is printed as "\352\262\275\353\217\231\353\202\230\353\271\204\354\227\224".
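The octal sequence above is simply the UTF-8 bytes of the string, each written as a three-digit octal escape. A minimal Go sketch of that mapping follows. `escapeOctal` and `unescapeOctal` are hypothetical helpers written for illustration; they are not part of cbt or of any protobuf library, and they simplify the real text-format escaping rules (for example, quotes and backslashes are also emitted as octal here).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// escapeOctal is a hypothetical helper: printable ASCII bytes pass
// through unchanged, and every other byte is written as \ooo (octal).
func escapeOctal(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c >= 0x20 && c < 0x7f && c != '"' && c != '\\' {
			b.WriteByte(c)
			continue
		}
		fmt.Fprintf(&b, "\\%03o", c)
	}
	return b.String()
}

// unescapeOctal is the hypothetical inverse: it turns each \ooo escape
// back into a single byte, recovering the original UTF-8 string.
func unescapeOctal(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' {
			b.WriteByte(s[i])
			continue
		}
		if i+3 >= len(s) {
			return "", fmt.Errorf("truncated escape at offset %d", i)
		}
		n, err := strconv.ParseUint(s[i+1:i+4], 8, 8)
		if err != nil {
			return "", fmt.Errorf("bad escape at offset %d: %w", i, err)
		}
		b.WriteByte(byte(n))
		i += 3 // skip the three octal digits just consumed
	}
	return b.String(), nil
}

func main() {
	escaped := escapeOctal("경동나비엔")
	fmt.Println(escaped)

	decoded, err := unescapeOctal(escaped)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded)
}
```

Each Korean syllable occupies three bytes in UTF-8, which is why the five-character string expands to fifteen escapes.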

Describe the solution you'd like
I would like cbt to print UTF-8 strings readably in `read` and `lookup` output.

Describe alternatives you've considered
I found that the unreadable byte sequence comes from `message.MarshalTextIndent()`. If cbt used `message.MarshalJSONIndent()` instead, a UTF-8 string would be printed correctly, as "경동나비엔". So it would also be good if cbt let users choose prototext or protojson as the output format: prototext would still output bytes in octal, but I could choose protojson to see UTF-8 strings.

Additional context

jaeyeol-moloco · 2023-12-29T12:47:25Z

I think octal output for UTF-8 characters is intended, according to https://protobuf.dev/reference/protobuf/textformat-spec/. So changing the output of the ProtocolBuffer format isn't an option for this issue. In #171, I added one more format, ProtocolBufferJSON, which marshals a protocol buffer value as JSON and prints UTF-8 strings normally.

jaeyeol-moloco · 2024-01-02T02:52:39Z

I realized that the text format spec itself supports UTF-8, so I closed #171 and opened a new PR, #172, which changes the formatter package from https://github.com/jhump/protoreflect to https://pkg.go.dev/google.golang.org/protobuf/encoding/prototext.

jaeyeol-moloco added the `priority: p3` (Desirable enhancement or fix. May not be included in next release.) and `type: feature request` ('Nice-to-have' improvement, new feature or different behavior or design.) labels Dec 26, 2023

jaeyeol-moloco changed the title ~~Support UTF-8 strings while using ProtocolBuffer encoding~~ Support UTF-8 strings for read and lookup outputs while using ProtocolBuffer encoding Dec 27, 2023

jaeyeol-moloco mentioned this issue Dec 29, 2023

feat: add ProtocolBufferJSON format #171

Closed

jaeyeol-moloco mentioned this issue Jan 2, 2024

fix: outputs UTF-8 characters in ProtocolBuffer format #172

Closed
