[python client] message properties are not round-trippable #44

@zbentley

Describe the bug
Message properties can be set to (and published with) keys or values that cannot be deserialized on the consuming side.

To Reproduce

  1. Using the Python client, publish a message on any topic with properties={'foo': b'\x01-\x00\x97'}
  2. Using a Python consumer, consume that message and attempt to access message.properties().
  3. Observe that a UnicodeDecodeError is raised.
  4. Repeat steps 1-3 with properties={b'\x01-\x00\x97': 'foo'}
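
The failure in step 3 can be illustrated without a broker. The sketch below assumes (this is not confirmed from the client source) that the consumer converts property bytes to str using UTF-8; decode_property is a hypothetical stand-in for that conversion, not a function in the Pulsar client.

```python
# The property value from step 1. The byte 0x97 is a UTF-8 continuation
# byte with no lead byte in front of it, so the sequence is not valid UTF-8.
raw_value = b"\x01-\x00\x97"


def decode_property(raw: bytes) -> str:
    """Hypothetical stand-in for the consumer-side bytes-to-str conversion.

    Assumption: the conversion uses strict UTF-8 decoding.
    """
    return raw.decode("utf-8")


try:
    decode_property(raw_value)
    decode_failed = False
except UnicodeDecodeError as exc:
    # Same exception type as the one observed in step 3.
    decode_failed = True
    print(f"decode failed: {exc.reason}")
```

Any bytes value that is not valid UTF-8 would fail the same way under this assumption, which is why both the key case and the value case in the steps above are affected.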

Expected behavior
Properties should be round-trippable: they should be deserialized with the same types and values with which they were set, and should not raise exceptions on deserialization.

There are three possible solutions here:

  1. Require that all property keys and values be bytes objects in Python. This is easy to implement inside the client, but breaks backwards compatibility.
  2. Encode type information along with property keys and values, and deserialize the appropriate types in the consumer. This is harder to implement inside the client (it doesn't seem to be using google.protobuf.Value messages on the wire at the moment, but I may be misreading the code).
  3. Less preferable: require that all keys and values be str objects in Python. This is more restrictive than the protocol allows, but is probably simpler to implement.
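
As a rough illustration of option 2, the sketch below tags each key and value with its Python type before it goes on the wire and restores the type on the way back. It is only one possible encoding, written under the assumption that the wire format carries str-to-str pairs; the tag prefixes, the use of base64, and all function names are hypothetical and are not part of the Pulsar client.

```python
import base64
from typing import Dict, Union

PropertyValue = Union[str, bytes]


def encode_value(value: PropertyValue) -> str:
    """Turn a str or bytes value into a tagged str that is always valid text."""
    if isinstance(value, bytes):
        # base64 output is pure ASCII, so arbitrary bytes survive the trip.
        return "b:" + base64.b64encode(value).decode("ascii")
    if isinstance(value, str):
        return "s:" + value
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def decode_value(wire: str) -> PropertyValue:
    """Restore the original type from a tagged str."""
    # Split on the first colon only, so str payloads may contain colons.
    tag, _, payload = wire.partition(":")
    if tag == "b":
        return base64.b64decode(payload)
    if tag == "s":
        return payload
    raise ValueError(f"unknown type tag in property: {wire!r}")


def encode_properties(props: Dict[PropertyValue, PropertyValue]) -> Dict[str, str]:
    return {encode_value(k): encode_value(v) for k, v in props.items()}


def decode_properties(wire: Dict[str, str]) -> Dict[PropertyValue, PropertyValue]:
    return {decode_value(k): decode_value(v) for k, v in wire.items()}


# Both cases from the reproduction steps, in one dictionary.
original = {"foo": b"\x01-\x00\x97", b"\x01-\x00\x97": "foo"}
round_tripped = decode_properties(encode_properties(original))
```

A scheme like this changes what non-Python consumers and older Python clients see in the properties map, so it carries its own compatibility cost compared with options 1 and 3.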

Environment:
macOS 12 (x86), Pulsar standalone 2.10, Pulsar Python client 2.10, Python 3.7.13.
