AvroProducer doesn't serialize the key if it's an empty string #342

Closed
4 of 7 tasks
alfiya400 opened this issue Mar 26, 2018 · 2 comments

alfiya400 commented Mar 26, 2018

Description

Based on the code for AvroProducer, the key is serialized only if bool(key) == True, so an empty string passed as a key is sent without being serialized.

This breaks reads with AvroConsumer, because it tries to deserialize the key whenever message.key() is not None, and the serializer raises the following error at the length check:

Traceback (most recent call last):
  File "/Users/alfiya/miniconda3/envs/py3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-20-7566bd90b3a5>", line 1, in <module>
    m=consumer.poll()
  File "/Users/alfiya/miniconda3/envs/py3/lib/python3.6/site-packages/confluent_kafka/avro/__init__.py", line 118, in poll
    decoded_key = self._serializer.decode_message(message.key())
  File "/Users/alfiya/miniconda3/envs/py3/lib/python3.6/site-packages/confluent_kafka/avro/serializer/message_serializer.py", line 209, in decode_message
    raise SerializerError("message is too small to decode")
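To illustrate why an unserialized empty key trips the length check, here is a minimal stand-alone sketch. It assumes the Confluent wire format of a one-byte magic marker followed by a four-byte big-endian schema ID; frame, decode_message, HEADER_SIZE and the SerializerError class below are hypothetical stand-ins written for this example, not the library's actual code.

```python
import struct

MAGIC_BYTE = 0   # assumed wire-format marker
HEADER_SIZE = 5  # assumed: 1 magic byte + 4-byte big-endian schema ID


class SerializerError(Exception):
    """Stand-in for the library's error type."""


def frame(schema_id, payload):
    # Hypothetical helper: prefix an Avro-encoded payload with the header.
    return struct.pack('>bI', MAGIC_BYTE, schema_id) + payload


def decode_message(message):
    # Hypothetical stand-in for the consumer-side check seen in the traceback.
    if message is None:
        return None
    if len(message) <= HEADER_SIZE:
        raise SerializerError("message is too small to decode")
    _magic, schema_id = struct.unpack('>bI', message[:HEADER_SIZE])
    return schema_id, message[HEADER_SIZE:]


# An Avro-encoded empty string is the single byte 0x00 (a zero length),
# so a properly serialized empty key is 6 bytes long and passes the check.
serialized_empty_key = frame(1, b'\x00')
print(len(serialized_empty_key), decode_message(serialized_empty_key))

# A raw, unserialized empty key is zero bytes long and is rejected.
try:
    decode_message(b'')
except SerializerError as exc:
    print("rejected:", exc)
```

Under these assumptions, the failure comes from the producer sending the raw empty key rather than from the consumer check itself.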

How to reproduce

import avro.schema
import confluent_kafka.avro

producer = confluent_kafka.avro.AvroProducer(
    {
        'api.version.request': True,
        'compression.codec': 'gzip',
        'default.topic.config': {'request.required.acks': 1},
        'bootstrap.servers': 'localhost:9093',
        'schema.registry.url': 'http://localhost:8081',
    }, 
    default_key_schema=avro.schema.Parse('{"type": "string"}'), 
    default_value_schema=avro.schema.Parse('{"type": "string"}')
)
producer.produce(topic='bug', value='not empty', key='')

consumer = confluent_kafka.avro.AvroConsumer({
    'group.id': 'mygroup', 
    'api.version.request': True,
    'default.topic.config': {'auto.offset.reset': 'smallest'},
    'bootstrap.servers': 'localhost:9093',
    'schema.registry.url': 'http://localhost:8081'
    
})
consumer.subscribe(['bug'])
consumer.poll()

The call to consumer.poll() raises the error shown above.

Checklist

Please provide the following information:

  • confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): confluent_kafka.version(): ('0.11.0', 720896), confluent_kafka.libversion(): ('0.11.3', 721919)
  • Apache Kafka broker version: 0.10.2.1
  • Client configuration: {...} provided in 'How to reproduce'
  • Operating system: Mac OS X 10.13.3
  • Provide client logs (with 'debug': '..' as necessary)
  • Provide broker log excerpts
  • Critical issue
@rnpridgeon
Contributor

For completeness:

The AvroProducer will actually skip serialization of both the key and the value if either is an empty string. Per the Python documentation, the same is true for zero values of all the numeric types and for empty sequences.

The difference is that a numeric 0 will not be written at all, because it fails the type checking done by the producer. Had it been serialized, it would have been converted to an array of bytes.

Instead, the AvroProducer should use the object identity operator is not (comparing against None) to evaluate the presence of a key and/or value.

https://docs.python.org/2/library/stdtypes.html#truth-value-testing
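The difference between the two checks can be sketched in plain Python. The function names here are hypothetical and only model the presence test; they are not the AvroProducer code, and the actual fix in #374 may differ.

```python
def present_by_truthiness(obj):
    # Models a check like "if key:", which treats '', 0 and [] as absent.
    return bool(obj)


def present_by_identity(obj):
    # Models a check like "if key is not None:", which only treats None as absent.
    return obj is not None


for candidate in ('', 0, [], None, 'key'):
    print(repr(candidate),
          present_by_truthiness(candidate),
          present_by_identity(candidate))
```

With the identity check, an empty string counts as present and would be passed on to the serializer, while None is still skipped.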

@rnpridgeon
Contributor

Fixed on master (#374)
