Support KeyValue Schema with GenericRecord/AUTO_CONSUME #9844

@eolivelli

Description

We have the KeyValue schema that supports a generic key-value model, and both the key and the value have a schema.

When you are dealing with structured data types, you currently tend to use Sink<GenericRecord> with the AUTO_CONSUME schema. This way you can automatically handle any supported form of data structure.

But if you use AUTO_CONSUME you cannot consume KeyValue records.

Describe the solution you'd like
I would like AUTO_CONSUME, in the case of a KeyValue schema, to pass a special GenericRecord instance with two fields:

  • key
  • value

GenericRecord already supports nested data structures, so it is possible to set the schema for the key field and for the value field.

Advanced processors that can handle nested structures will benefit from this new feature: they will automatically be able to deal with KeyValue without changes, and in a consistent way, working only with GenericRecord, which is the generic key-value dictionary we have in Pulsar.
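To illustrate the idea, here is a minimal sketch in plain Java. It does not use the Pulsar API: `SimpleRecord`, `MapRecord`, `KeyValueRecord` and `RecordFlattener` are hypothetical stand-ins invented for this example, with `SimpleRecord` playing the role of GenericRecord. It only shows that a processor written against a generic nested record can handle a key/value pair unchanged once the pair is exposed as a record with a `key` field and a `value` field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a generic record: named fields whose values
// may be plain objects or nested records. Not the Pulsar GenericRecord API.
interface SimpleRecord {
    List<String> fieldNames();
    Object field(String name);
}

// A record backed by an insertion-ordered map (assumption for this sketch).
final class MapRecord implements SimpleRecord {
    private final Map<String, Object> fields;

    MapRecord(Map<String, Object> fields) {
        this.fields = new LinkedHashMap<>(fields);
    }

    public List<String> fieldNames() {
        return new ArrayList<>(fields.keySet());
    }

    public Object field(String name) {
        return fields.get(name);
    }
}

// The proposed shape: a key/value pair exposed as a record with exactly
// two fields, "key" and "value", each of which may be a nested record.
final class KeyValueRecord implements SimpleRecord {
    private final Object key;
    private final Object value;

    KeyValueRecord(Object key, Object value) {
        this.key = key;
        this.value = value;
    }

    public List<String> fieldNames() {
        return Arrays.asList("key", "value");
    }

    public Object field(String name) {
        if ("key".equals(name)) {
            return key;
        }
        if ("value".equals(name)) {
            return value;
        }
        return null; // unknown field, same behavior as MapRecord
    }
}

// A "processor" that knows only about SimpleRecord. It flattens nested
// records into dotted paths and needs no special case for key/value pairs.
final class RecordFlattener {
    static Map<String, Object> flatten(SimpleRecord record) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", record, out);
        return out;
    }

    private static void flattenInto(String prefix, SimpleRecord record, Map<String, Object> out) {
        for (String name : record.fieldNames()) {
            Object v = record.field(name);
            String path = prefix.isEmpty() ? name : prefix + "." + name;
            if (v instanceof SimpleRecord) {
                flattenInto(path, (SimpleRecord) v, out); // recurse into nested record
            } else {
                out.put(path, v);
            }
        }
    }
}
```

With a key record `{id: 42}` and a value record `{name: "alice", address: {city: "Rome"}}`, `RecordFlattener.flatten(new KeyValueRecord(key, value))` yields the entries `key.id`, `value.name` and `value.address.city`. The flattener itself never references the key/value wrapper, which is the property the proposal is after.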

Describe alternatives you've considered
Modifying all of the connectors to deal with both KeyValue and GenericRecord. This would be a big effort, and currently (2.7.x) you cannot have a Sink that deals with two separate data types (the user must explicitly set a "classname").

Additional context
I have implementations of Sinks that deal with generic data structures and allow the user to transform/map the data before writing to the external system.

Labels: lifecycle/stale, type/enhancement