New (key|value).multi.type option for Avro serialization #680
In some situations, an application needs to store events of several different types in the same Kafka topic. In particular, when developing a data model in an Event Sourcing style, you might have several kinds of event that affect the state of an entity. For example, for a customer entity there may be
The Avro schema registry currently assumes a 1:1 mapping between Kafka topics and Avro schemas, making it difficult to support scenarios like the one above. Users who want several event types in the same topic must either put them in one big Avro union (which works, but quickly becomes unwieldy) or turn off the registry's schema compatibility checking (which would be unfortunate, since that check is very valuable).
This patch introduces two new boolean config settings,
This has the effect that a Kafka producer will happily accept any mixture of Avro record types and publish them to the same topic. Since the schema registry's ID for a schema is globally unique, the binary message encoding does not need to change, and consumers also handle the mixture of record types without change. When a schema is changed, the registry checks compatibility with previous schemas of the same fully-qualified type name; different record types can be evolved independently without any interference.
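To make the subject naming concrete, here is a minimal Java sketch of the idea described above. It is not the code in this patch. It assumes that in the default mode the registry subject is `<topic>-key` or `<topic>-value`, and that in multi-type mode the subject is the record's fully-qualified type name. The class `SubjectSketch`, its methods, the topic `events`, and the record names `com.example.EventA` and `com.example.EventB` are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SubjectSketch {
    // Default mode: one subject per topic and key/value position.
    static String topicSubject(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    // Multi-type mode (assumed): the subject is the record's fully-qualified
    // type name, so the topic plays no part in compatibility checks.
    static String recordSubject(String namespace, String name) {
        return namespace.isEmpty() ? name : namespace + "." + name;
    }

    public static void main(String[] args) {
        // Two hypothetical record types published to the same topic end up
        // under two different subjects and can therefore evolve independently.
        Set<String> subjects = new LinkedHashSet<>();
        subjects.add(recordSubject("com.example", "EventA"));
        subjects.add(recordSubject("com.example", "EventB"));
        System.out.println("default subject: " + topicSubject("events", false));
        System.out.println("multi-type subjects: " + subjects);
    }
}
```

Because each record type gets its own subject in this sketch, a change to one type would only be checked against earlier versions of that same type.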
It looks like @ept hasn't signed our Contributor License Agreement yet.
You can read and sign our full Contributor License Agreement here.
Once you've signed reply with
Appreciation of efforts,
@ept thanks for your patch. For the most part it looks good; I'm just thinking out loud here about whether it would help to generalize this. We have different scenarios where users would want to share the same schema across topics. Your solution can be used to fix that scenario as well. So maybe we could call the configs something like key.subject.name.strategy and value.subject.name.strategy. The default strategy could always use topic-key and topic-value. Let me know your thoughts.
@mageshn Thanks for the suggestion — I think that's a good idea, so I've implemented the key.subject.name.strategy and value.subject.name.strategy configs. They currently have three valid settings:
@rhauch @mageshn Happy new year! I have updated the patch as you suggested, using different classes to implement the different subject-name choosing strategies. The configuration is now a fully-qualified Java classname, so that people can easily plug in their own strategies if desired. Could you let me know if it looks good now?
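For readers following along, a rough Java sketch of what "a fully-qualified Java classname" as a config value could look like in practice. This is an illustration under assumptions, not the implementation in the patch: the interface `SubjectNameStrategy`, its method signature, the classes `TopicBasedStrategy` and `RecordBasedStrategy`, and the `load` helper are hypothetical names. Only the config keys `key.subject.name.strategy` and `value.subject.name.strategy` come from the discussion above.

```java
import java.util.HashMap;
import java.util.Map;

public class StrategyLoaderSketch {
    /** Hypothetical interface; the real one in the patch may look different. */
    public interface SubjectNameStrategy {
        String subjectName(String topic, boolean isKey, String recordFullName);
    }

    /** Default behaviour: "<topic>-key" or "<topic>-value". */
    public static class TopicBasedStrategy implements SubjectNameStrategy {
        @Override
        public String subjectName(String topic, boolean isKey, String recordFullName) {
            return topic + (isKey ? "-key" : "-value");
        }
    }

    /** Illustrative alternative: the subject is the record's fully-qualified name. */
    public static class RecordBasedStrategy implements SubjectNameStrategy {
        @Override
        public String subjectName(String topic, boolean isKey, String recordFullName) {
            return recordFullName;
        }
    }

    /** Instantiates the class named in the config, falling back to the topic-based default. */
    static SubjectNameStrategy load(Map<String, String> config, String configKey) {
        String className = config.getOrDefault(configKey, TopicBasedStrategy.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(SubjectNameStrategy.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Covers unknown classes, missing no-arg constructors, and wrong types.
            throw new IllegalArgumentException("Invalid strategy class: " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        // Only the value strategy is overridden; the key strategy uses the default.
        config.put("value.subject.name.strategy", RecordBasedStrategy.class.getName());
        SubjectNameStrategy keyStrategy = load(config, "key.subject.name.strategy");
        SubjectNameStrategy valueStrategy = load(config, "value.subject.name.strategy");
        System.out.println(keyStrategy.subjectName("events", true, "com.example.EventA"));
        System.out.println(valueStrategy.subjectName("events", false, "com.example.EventA"));
    }
}
```

Loading by class name is what would let users plug in their own strategy: any class on the classpath that implements the interface and has a no-argument constructor could be named in the config.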
@ept, happy new year to you! Thanks for the changes. I have one really minor question below -- otherwise this looks great!
Approving as is in case it's difficult to find succinct and clear text to add.
BTW, not sure if these pass locally, but the build is failing with the NPEs in the following tests:
Jan 12, 2018
I don't think Avro supports this out of the box.
Now that 4.1 is out, can we use the multi-schema feature? I didn't see it mentioned in the release notes.
We implemented our own version of schema-registry based on etcd (we had etcd at hand) because we needed this feature before it was released. It doesn't support schema compatibility enforcement, though, and rather than adding that on top of our own implementation we'd like to migrate whenever possible.