When pulsar-client-go encodes a native Go struct with an Avro schema, it first encodes any []byte field into a base64 string during json.Marshal. That string is then converted to binary by the Avro codec. The final binary data is significantly larger than it would have been if the raw bytes had been transmitted directly.
Note that this base64 encoding behavior is also inconsistent with the Python client when using the same Avro schema. The Python consumer expects the bytes field of the Avro schema to hold the raw bytes of the data, but it actually receives the bytes of a base64-encoded string, so the data is decoded incorrectly.
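To illustrate the mismatch from the consumer's side, here is a minimal sketch in Go using only the standard library. The helper names `receivedField` and `recoverRaw` are hypothetical and are not part of either client. The sketch assumes the field arrives exactly as described in this issue, that is, as the ASCII bytes of the base64 string.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// receivedField is a hypothetical helper that mimics what a consumer finds
// in the bytes field: the ASCII bytes of the base64 text, not the raw bytes.
func receivedField(raw []byte) []byte {
	return []byte(base64.StdEncoding.EncodeToString(raw))
}

// recoverRaw is a hypothetical helper showing that an extra base64 decode
// is needed to get back to the bytes the producer started with.
func recoverRaw(field []byte) ([]byte, error) {
	return base64.StdEncoding.DecodeString(string(field))
}

func main() {
	raw := []byte{172, 12, 53, 97, 9, 70, 89, 247, 94, 3, 56, 242, 127, 146, 9, 209}

	field := receivedField(raw)
	fmt.Println(field) // base64 text as bytes, 24 of them instead of 16

	recovered, err := recoverRaw(field)
	if err != nil {
		panic(err)
	}
	fmt.Println(recovered) // the original 16 bytes
}
```

A consumer that treats the field as raw bytes, as the Python client does, never performs the second step and therefore sees the base64 text instead of the data.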
Expected behavior
Situation:
The []byte property is expected to be encoded as-is into the binary payload by the Avro codec.
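For reference, the Avro specification encodes a bytes value as a long length (zig-zag varint) followed by the raw bytes. The sketch below hand-rolls that encoding with the Go standard library to show the expected size. `encodeAvroBytes` is a hypothetical helper written for this illustration, not code from pulsar-client-go or from any Avro library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeAvroBytes is a hypothetical helper that writes an Avro "bytes" value:
// the length as a zig-zag varint (binary.PutVarint uses the same coding),
// followed by the bytes themselves, unchanged.
func encodeAvroBytes(raw []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutVarint(buf, int64(len(raw)))
	return append(buf[:n], raw...)
}

func main() {
	raw := []byte{172, 12, 53, 97, 9, 70, 89, 247, 94, 3, 56, 242, 127, 146, 9, 209}
	out := encodeAvroBytes(raw)
	fmt.Println(len(out), out) // 17 bytes: one length byte plus the 16 raw bytes
}
```

Under this encoding the 16-byte value costs 17 bytes on the wire, compared with 24 bytes of base64 text before any length prefix is added.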
Actual behavior
In this situation, the native Go struct is encoded into a Pulsar payload using the function at pulsar-client-go/pulsar/schema.go, line 253 in d9b18d0, and the []byte field ends up base64-encoded instead of being written as raw bytes.
Here is an example:
Original golang byte array: [172 12 53 97 9 70 89 247 94 3 56 242 127 146 9 209]
Base64 encoded text from byte array: rAw1YQlGWfdeAzjyf5IJ0Q==
The byte array of the final encoded binary payload (which is just the byte array representation of the base64 encoded string): [114 65 119 49 89 81 108 71 87 102 100 101 65 122 106 121 102 53 73 74 48 81 61 61]
In this example, the encoded payload that pulsar-client-go transmits to Pulsar is 50% larger in bytes (24 instead of 16). This can cause a significant loss of performance when throughput is the bottleneck.
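The base64 step in the example above can be reproduced with the standard library alone, since json.Marshal encodes []byte values as base64 strings. This is a minimal sketch: the `record` struct and the `marshalRecord` helper are hypothetical names chosen for illustration, and the Avro codec step is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// record is a hypothetical struct with a single []byte field.
type record struct {
	Data []byte `json:"data"`
}

// marshalRecord returns the JSON text json.Marshal produces for a record
// holding raw. The []byte field comes out as a base64 string.
func marshalRecord(raw []byte) (string, error) {
	out, err := json.Marshal(record{Data: raw})
	return string(out), err
}

func main() {
	raw := []byte{172, 12, 53, 97, 9, 70, 89, 247, 94, 3, 56, 242, 127, 146, 9, 209}

	js, err := marshalRecord(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(js) // {"data":"rAw1YQlGWfdeAzjyf5IJ0Q=="}

	// 16 raw bytes become 24 base64 characters.
	fmt.Println(len(raw), len("rAw1YQlGWfdeAzjyf5IJ0Q=="))
}
```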
Steps to reproduce
Encode a []byte object using the Avro schema encode function at pulsar-client-go/pulsar/schema.go, line 253 in d9b18d0.
System configuration
commit 504e589