Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RdKafka] Enable serializers to serialize message keys #254

Merged

Conversation

tPl0ch
Copy link
Contributor

@tPl0ch tPl0ch commented Nov 5, 2017

Currently the RdKafkaProducer is serializing just the message, which I guess us fine for JSON serialization.

But when you want to use an AvroSerializer that integrates with the Confluent Platform, it is possible to use complex types as keys as well, which in turn means that keys need to be serialized, too.

This change leverages the mutability of the Message by shifting the getKey method call after the call to the Serializer.

@makasim
Copy link
Member

makasim commented Nov 5, 2017

Wouldn't it better to add the key dedicated methods to the serializer. Or to introduce key serializer interfaces?

Currently the `RdKafkaProducer` is serializing just the message, which
I guess us fine for JSON serialization. But when you want to use an
AvroSerializer that integrates with the Confluent Platform, it is
possible to use complex types as keys as well, which in turn means that
keys need to be serialized, too.

This change leverages the mutability of the Message by shifting the
`getKey` method call after the call to the Serializer.
@tPl0ch tPl0ch force-pushed the feature-enable-key-serialization branch from 90163db to f45a646 Compare November 5, 2017 11:26
@tPl0ch
Copy link
Contributor Author

tPl0ch commented Nov 5, 2017

Wouldn't it better to add the key dedicated methods to the serializer?

I would say this is out of question since you'd have to change the stable interface of the SerializerInterface which would break BC.

Or to introduce key serializer interfaces?

That would be a better solution but also way more intrusive (more changes required). I guess I could add the KeySerializerInterface, a default NoopKeySerializer implementation (just returning the unmodified key) and an additional KeySerializerAwareTrait.

Suggestion: Merge this simple & easy to test change now, and implement feature of KeySerializerInterface later.

@makasim What do you think?

@makasim makasim merged commit 6fb190c into php-enqueue:master Nov 5, 2017
@makasim
Copy link
Member

makasim commented Nov 5, 2017

Do you have plans to work on a better solution ?

@makasim
Copy link
Member

makasim commented Nov 5, 2017

@tPl0ch Could you please share some links on the matter?

@makasim
Copy link
Member

makasim commented Nov 5, 2017

What is the key, What role it plays and so on?

@tPl0ch
Copy link
Contributor Author

tPl0ch commented Nov 5, 2017

@makasim

What is the key, What role it plays and so on?

The key of a Kafka message plays in important role deciding which partition of a topic a message is assigned to (if multiple are available). See this document for more information on partitioning on Kafka topics.

In summary, Kafka only gives ordering guarantees per topic partition. So in order to guarantee order of messages you need to mark related messages (i.e. AggregateId for Domain Events) with a given key.

In our case keys could also be Avro serialized complex types (imagine using serialized ValueObjects as keys), this is why it should be possible to serialize keys separately from the message. You can read about the Confluent platofrm Avro Serialization & Formatters in detail here.

With php-rdkafka partition assignment this is controlled by the constants RD_KAFKA_MSG_PARTITIONER_RANDOM and RD_KAFKA_MSG_PARTITIONER_CONSISTENT, as well by the custom partitioner callback that you can register to override the default Kafka hash alogrithm (which is CRC32 currently).

Do you have plans to work on a better solution?

Yes, I would provide a PR this evening prototyping the solution with an additional KeySerializer which would not rely on message mutability.

Also for future major versions of this library the decision to have mutable message implementations should IMO be revisited.
In return of pthreads for PHP 7.2 and the more & more important role of parallel and async programming having immutable data structures would drastically help with sharing contexts between threads.

@tPl0ch tPl0ch mentioned this pull request Nov 6, 2017
7 tasks
ASKozienko pushed a commit that referenced this pull request Nov 2, 2018
[RdKafka] Enable serializers to serialize message keys
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants