Design: Protobuf support #17

Closed
torbsto opened this issue May 4, 2022 · 1 comment
Labels
type/design Design documents for enhancements

Comments

torbsto commented May 4, 2022

Design: Protobuf support

Last updated: 04.05.2022

Milestone: Protobuf support
Development: 0.7


This issue describes our approach for the support of Protobuf in Quick.

Protobuf is a data format for (de-)serializing data that has gained a lot of support in the Kafka ecosystem recently. It is comparable to Avro, which so far is the only schema format supported by Quick.

We track all related issues in the Protobuf support milestone. As per the roadmap, the development of this feature is planned for Quick 0.7.

Goals

With the implementation of this enhancement, Quick supports:

  1. Topic Creation: users can create topics that are backed by Protobuf schemas
  2. Format Information: components should be able to tell which schema format a topic uses
  3. Data Ingest: users can ingest data into topics backed by Protobuf schemas
  4. GraphQL Query: gateways can query topics backed by Protobuf schemas
  5. GraphQL Subscription: gateways can subscribe to topics backed by Protobuf schemas
  6. Mirror: mirrors read and expose data from topics backed by Protobuf schemas

Implementation

1. Topic Creation

Goal: users can create topics that are backed by Protobuf schemas

First, let's look into what happens when the user creates a new topic. Quick:

  1. checks if it already exists (topic registry, Kafka, Schema Registry)
  2. creates the Kafka topic
  3. converts the key and value GraphQL schemas to Avro and registers them with the Schema Registry
  4. deploys a mirror

The steps affected by the proposed change are 1 and 3:

  • In step 1, Quick has to ensure the schema doesn't exist yet. So far, this works by checking whether the subject already exists. We have to check whether this works the same way for Protobuf.
  • For step 3, Quick now requires a converter from GraphQL to Protobuf. We also have to evaluate whether the registration works the same way as for Avro.
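
To illustrate what a GraphQL-to-Protobuf converter for step 3 has to do, here is a minimal sketch. It is not Quick's converter: the function name `graphql_to_proto`, the scalar mapping, and the choice to number fields in declaration order are all assumptions made for illustration, and it only handles scalars, non-null markers, and flat lists.

```python
from typing import Mapping

# Assumed mapping from GraphQL scalars to proto3 scalar types.
# A real converter may choose differently (e.g. int64 for Int).
SCALARS = {
    "Int": "int32",
    "Float": "double",
    "String": "string",
    "Boolean": "bool",
    "ID": "string",
}


def graphql_to_proto(type_name: str, fields: Mapping[str, str]) -> str:
    """Render a proto3 message for a flat GraphQL object type.

    `fields` maps field names to GraphQL type strings such as "Int",
    "String!" or "[String!]!". Field numbers follow declaration order.
    """
    lines = ['syntax = "proto3";', "", f"message {type_name} {{"]
    for number, (name, gql_type) in enumerate(fields.items(), start=1):
        base = gql_type.rstrip("!")  # proto3 has no non-null marker
        repeated = base.startswith("[") and base.endswith("]")
        if repeated:
            base = base[1:-1].rstrip("!")
        # Unknown names are assumed to refer to other message types.
        proto_type = SCALARS.get(base, base)
        prefix = "repeated " if repeated else ""
        lines.append(f"  {prefix}{proto_type} {name} = {number};")
    lines.append("}")
    return "\n".join(lines)


print(graphql_to_proto("Purchase", {"id": "ID!", "amount": "Int", "tags": "[String!]!"}))
```

Note that GraphQL's non-null information is lost in this sketch, which is one of the points a real converter would have to decide on.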

Quick additionally requires a way to let users decide between Avro and Protobuf. There are (at least) the following two options to implement this:

  1. Extend the API to include an additional parameter for setting the schema format
  2. Add a configuration variable for setting the schema format

The advantage of option 1 is its flexibility: users can decide for each topic which schema format they want.
However, this can become repetitive, since most users work with a single format. It also complicates the overall implementation, because we would then need a way to propagate the format information per topic.
We therefore start with option 2. If users require option 1, we can still add it later.
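
A minimal sketch of option 2 could look like the following. The variable name `QUICK_SCHEMA_FORMAT` and the default of Avro are assumptions for illustration only; this issue does not fix the actual name or default.

```python
import os
from enum import Enum


class SchemaFormat(Enum):
    AVRO = "avro"
    PROTOBUF = "protobuf"


def schema_format_from_env(env=os.environ, variable="QUICK_SCHEMA_FORMAT"):
    """Read the global schema format from the environment.

    The variable name is hypothetical. Falling back to Avro is an
    assumption that keeps existing deployments unchanged.
    """
    raw = env.get(variable, SchemaFormat.AVRO.value)
    try:
        return SchemaFormat(raw.strip().lower())
    except ValueError:
        raise ValueError(f"unsupported schema format: {raw!r}") from None
```

Failing fast on an unknown value avoids silently creating topics in the wrong format.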

2. Format Information

Goal: components should be able to tell which schema format a topic uses.

All the following goals require a mechanism that tells the corresponding components whether a topic uses Avro or Protobuf. Since we start with a global environment variable (option 2 of goal 1), every component can read that same setting.

Other options that allow more granular configurations are:

  • Store this information in the topic registry
  • Infer from Schema Registry
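
For the second option, the Confluent Schema Registry REST API includes a `schemaType` field when it returns a registered schema, and omits the field for Avro. A sketch of the inference, assuming the response body of a subject-version lookup is already available as a string (the function name and the sample bodies are illustrative, not taken from Quick):

```python
import json


def infer_schema_format(response_body: str) -> str:
    """Infer the schema format from a Schema Registry schema response.

    The registry omits "schemaType" for Avro schemas, so a missing
    field is treated as AVRO.
    """
    payload = json.loads(response_body)
    return payload.get("schemaType", "AVRO")


# Illustrative response bodies, trimmed to the relevant fields.
avro_body = '{"subject": "purchases-value", "version": 1, "schema": "..."}'
proto_body = '{"subject": "purchases-value", "version": 1, "schemaType": "PROTOBUF", "schema": "..."}'
print(infer_schema_format(avro_body), infer_schema_format(proto_body))
```

This option costs one extra registry lookup per topic but needs no additional storage in the topic registry.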

3. Data Ingest

Goal: users can ingest data into topics backed by Protobuf schemas

The ingest-service uses the TypeResolver to transform JSON to Avro. We therefore require an additional implementation of TypeResolver for Protobuf. The configuration of the TypeResolver happens in QuickTopicTypeService. Here Quick has to differentiate between Avro and Protobuf and set the resolver accordingly.
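
The selection logic can be sketched as a lookup from schema format to resolver. The snippet below is written in Python for brevity and does not mirror Quick's actual `TypeResolver` or `QuickTopicTypeService` signatures; the method name `from_json` and both resolver classes are stand-ins that only tag the parsed JSON instead of building real Avro or Protobuf objects.

```python
import json
from typing import Any, Callable, Dict, Protocol


class TypeResolver(Protocol):
    def from_json(self, text: str) -> Any:
        ...


class AvroResolver:
    """Stand-in: a real resolver would build an Avro record."""

    def from_json(self, text: str) -> Any:
        return {"format": "avro", "value": json.loads(text)}


class ProtobufResolver:
    """Stand-in: a real resolver would build a Protobuf message."""

    def from_json(self, text: str) -> Any:
        return {"format": "protobuf", "value": json.loads(text)}


RESOLVERS: Dict[str, Callable[[], TypeResolver]] = {
    "avro": AvroResolver,
    "protobuf": ProtobufResolver,
}


def resolver_for(schema_format: str) -> TypeResolver:
    """Pick the resolver for the configured schema format."""
    try:
        factory = RESOLVERS[schema_format.lower()]
    except KeyError:
        raise ValueError(f"no resolver for format {schema_format!r}") from None
    return factory()
```

Keeping the format decision in one place means the ingest code itself stays unaware of the schema format.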

4. GraphQL Query

Goal: gateways can query topics backed by Protobuf schemas

This is dependent on goal 6 (mirror). During a GraphQL query, the gateway forwards requests to corresponding mirror applications. The communication between gateway and mirror uses REST + JSON. Therefore, the underlying schema format is transparent from the gateway's point of view.

5. GraphQL Subscription

Goal: gateways can subscribe to topics backed by Protobuf schemas

Similar to the data ingest, the GraphQL subscription uses the SerDe provided by the QuickTopicTypeService. Since Quick can't know the concrete message type at compile time, it has to use Protobuf's DynamicMessage. This is similar to the way Quick currently uses Avro's GenericRecord.

6. Mirror

Goal: mirrors read and expose data from topics backed by Protobuf schemas

As in the data ingest, mirrors use the QuickTopicTypeService to get the TypeResolver for (de-)serializing data. The resolvers are used to

  1. read data from the topic
  2. store it in the state store
  3. transform the data to JSON in the REST API

Thus, the mirror can handle Protobuf with the updated TypeResolver.
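
The three steps can be sketched as follows. This is a simplified model, not Quick's mirror: the class and method names are hypothetical, an in-memory dict stands in for the state store, and `JsonBytesResolver` is a placeholder for a format-specific resolver.

```python
import json


class JsonBytesResolver:
    """Placeholder resolver; a real one would decode Avro or Protobuf bytes."""

    def deserialize(self, payload: bytes):
        return json.loads(payload.decode("utf-8"))

    def to_json(self, value) -> str:
        return json.dumps(value, sort_keys=True)


class Mirror:
    def __init__(self, resolver):
        self._resolver = resolver
        self._store = {}  # stands in for the state store

    def consume(self, key: str, payload: bytes) -> None:
        # Steps 1 and 2: deserialize the record and keep the latest value per key.
        self._store[key] = self._resolver.deserialize(payload)

    def get_json(self, key: str):
        # Step 3: expose the stored value as JSON, or None if the key is unknown.
        value = self._store.get(key)
        return None if value is None else self._resolver.to_json(value)


mirror = Mirror(JsonBytesResolver())
mirror.consume("p1", b'{"id": "p1", "amount": 3}')
print(mirror.get_json("p1"))
```

Because the mirror only talks to the resolver, swapping in a Protobuf resolver leaves the rest of this flow unchanged.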

@torbsto torbsto added the type/design Design documents for enhancements label May 4, 2022
@torbsto torbsto added this to the Protobuf support milestone May 4, 2022

torbsto commented Jul 12, 2022

Protobuf support is implemented.

@torbsto torbsto closed this as completed Jul 12, 2022