Wanted: topic management, declarative #101
Comments
We are thinking about using Terraform for this: https://github.com/packetloop/terraform-provider-kafka
Maybe this: https://github.com/nbogojevic/kafka-operator
My current best bet is to rely on […]. I found that during stream processing it's quite a legitimate use case to want to add new topics dynamically. For example, in log processing you could split on Kubernetes namespace, and new namespaces may pop up while your Streams app is running.
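To make that concrete, here's a minimal Kafka Streams sketch (assuming Kafka 2.0+, a made-up `logs` input topic, records keyed by namespace, and the `bootstrap.kafka:9092` address - all illustrative, not our actual setup) of routing records to per-namespace topics. The target topics are not known at build or deploy time, which is exactly where auto creation comes back into the picture:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class NamespaceRouter {
  public static void main(String[] args) {
    Properties config = new Properties();
    config.put(StreamsConfig.APPLICATION_ID_CONFIG, "namespace-router");
    config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "bootstrap.kafka:9092");
    config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> logs = builder.stream("logs");

    // Route each record to a topic derived from its key (here: the namespace).
    // These topics appear while the app is running, so they must either
    // pre-exist or be auto-created.
    logs.to((key, value, recordContext) -> "logs-" + key);

    new KafkaStreams(builder.build(), config).start();
  }
}
```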
I've tried to summarize my confusion by closing various open investigations. What surprised me the most was the insight(?) that Schema Registry doesn't help.

At Yolean we store all our persistent data in Kafka. Some of it has the integrity requirements you'd normally entrust to a relational database. If validation is supported at the API level when producing, we'll make design choices based on the type of data (rather than time constraints / laziness). Now I suspect we won't get that far without in-house libs - a maintenance effort we can't sustain, given the rapid evolution of Kafka APIs.

Validation can still be simple enough, if schemas are available as files. We already use json-schema in some core producers using Node.js. Files have the added benefit that they get the same versioning and release cycle as the services they share a repository with. We do need a little bit of tooling to copy schema files, be they Avro or JSON or Protobuf, to the source folders of the services that depend on them. Due to Docker build contexts we can't pull them from a parent folder.

We still don't enforce anything, but luckily there's a compelling alternative: monitoring (explored in #49 and #93). Consumer services can export counters on the number of messages they had to drop due to invalidity, or even if they don't we can look at generic failure rates. We can catch errors that schemas can't: the other day we had misbehaving client-side code that increased the messages/second to a topic by a factor of 100.

We can also write dedicated validation services, whose sole purpose is to expose metrics. This is where naming conventions might matter. Let's say that topic names ending with a dash followed by digits indicate a version number. It'd be quite easy to spot, based on a topic listing and the bytes or message rates reported by broker JMX, that some service is producing to an old version of a topic.
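A rough sketch of the monitoring idea, assuming the Prometheus Java client and the everit-org json-schema validator as stand-ins (topic name, group id, schema path and bootstrap address are all made up): a consumer validates each record against a schema file shipped with the service and exports a counter for the records it has to drop.

```java
import java.io.FileInputStream;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import io.prometheus.client.Counter;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.everit.json.schema.Schema;
import org.everit.json.schema.ValidationException;
import org.everit.json.schema.loader.SchemaLoader;
import org.json.JSONException;
import org.json.JSONObject;
import org.json.JSONTokener;

public class ValidatingConsumer {

  // Exposed via the usual Prometheus HTTP endpoint elsewhere in the service.
  static final Counter dropped = Counter.build()
      .name("records_dropped_invalid_total")
      .help("Records dropped because they failed json-schema validation")
      .labelNames("topic")
      .register();

  public static void main(String[] args) throws Exception {
    // Schema file lives in the repo, so it gets the same versioning as the code.
    Schema schema = SchemaLoader.load(
        new JSONObject(new JSONTokener(new FileInputStream("schemas/order.json"))));

    Properties props = new Properties();
    props.put("bootstrap.servers", "bootstrap.kafka:9092");
    props.put("group.id", "order-validator");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("orders"));
      while (true) {
        for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
          try {
            schema.validate(new JSONObject(record.value()));
            // ... normal processing ...
          } catch (ValidationException | JSONException e) {
            // Count, don't crash: the metric is the enforcement.
            dropped.labels(record.topic()).inc();
          }
        }
      }
    }
  }
}
```

The same pattern works for a dedicated validation service whose only job is to consume and expose these metrics.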
We're now phasing in […]. However, Kafka Streams in essence requires auto creation. We just have to live with the gotchas.

Regarding topics + schema management, with the arguments I tried to summarize above, I ended up writing a Gradle build that generates Java POJO typed Serde impls from json-schema files. We can also use these schema files directly from libs like ajv in Node. To do this we had to establish a list of topics, or actually topic name regexes, where we configure the model (i.e. schema) that is used for values. With this list we can unit test the whole thing: "deserialize this byte stream for topic x" etc., which is what I argued for earlier.

Regarding topic management, with or without auto create, we could compare that list to existing topics in Kafka, which together with bytes in/out rates could indicate whether there are bogus topics lying around or we're writing to unexpected topics.
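A rough sketch of what such a list could look like in code (class and method names are illustrative, not the actual generated Serdes):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import org.apache.kafka.common.serialization.Serde;

/**
 * Central list of topic name regexes and the value serde (i.e. schema/model)
 * each maps to. Because the list is plain code it can be unit tested
 * ("deserialize this byte stream for topic x") without a running cluster.
 */
public class TopicSerdes {

  private final Map<Pattern, Serde<?>> serdesByTopicPattern = new LinkedHashMap<>();

  public TopicSerdes register(String topicRegex, Serde<?> valueSerde) {
    serdesByTopicPattern.put(Pattern.compile(topicRegex), valueSerde);
    return this;
  }

  /** First matching pattern wins; an unknown topic is a configuration error. */
  public Serde<?> valueSerdeFor(String topic) {
    return serdesByTopicPattern.entrySet().stream()
        .filter(e -> e.getKey().matcher(topic).matches())
        .map(Map.Entry::getValue)
        .findFirst()
        .orElseThrow(() -> new IllegalArgumentException("No serde registered for topic " + topic));
  }

  /** The same list can be diffed against the broker's actual topics to spot strays. */
  public Iterable<Pattern> knownTopicPatterns() {
    return serdesByTopicPattern.keySet();
  }
}
```

A test can then assert that `valueSerdeFor("orders-2").deserializer()` decodes a recorded byte stream, and a small job could compare `knownTopicPatterns()` with the broker's topic listing.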
Good points here: https://www.confluent.io/blog/put-several-event-types-kafka-topic/. I agree with a lot of what Kleppmann says, and conclude that my design above with one schema per topic is too restrictive. Schema evolution support basically only caters for new versions adding new fields. A single service version may need to produce using multiple schemas to the same topic, due to ordering requirements. I also happened to read http://www.dwmkerr.com/the-death-of-microservice-madness-in-2018/ today, and it nicely captures some versioning aspects of using topics as API contracts between services. As that post puts it: […]
I wanted to have as much of this complexity as possible known at build time, rather than at provisioning/orchestration time (Kube manifests) or runtime (Schema Registry etc.). Am I mistaken here? As a consumer, with Kleppmann's patch to Confluent's Avro serdes, you'll basically […]. Kleppmann writes: […]
That you can deserialize JSON without a schema doesn't mean you always should. For domain entities or "facts" - not operational data - I'd like all our records to have schemas. If Schema Registry is evolving, it's a pity confluentinc/schema-registry#220 gets so little attention.
Regarding naming conventions for topics, I just now spotted the config property […]
Randomly reading through this issue, I noticed you mentioned […]
@joewood Thanks a lot for correcting my misunderstanding here. We (Yolean) must have drawn some premature conclusions when evaluating Streams. Actually my disappointment summarized in #101 (comment) remains unchanged, and with your insight there's really no argument left for […]. You don't happen to have experience with Create Topic Policy?
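For reference, Create Topic Policy (KIP-108, Kafka 0.10.2+) is a broker-side hook configured with `create.topic.policy.class.name`. A minimal sketch - the regex is just an illustration of the naming convention idea above, not a settled convention - could look like:

```java
import java.util.Map;
import org.apache.kafka.common.errors.PolicyViolationException;
import org.apache.kafka.server.policy.CreateTopicPolicy;

/**
 * Loaded by the broker via create.topic.policy.class.name=...TopicNamePolicy
 * (the jar must be on the broker classpath). Rejected topics fail creation,
 * whether they come from kafka-topics.sh, AdminClient or auto creation.
 */
public class TopicNamePolicy implements CreateTopicPolicy {

  @Override
  public void configure(Map<String, ?> configs) {}

  @Override
  public void validate(RequestMetadata requestMetadata) throws PolicyViolationException {
    // Example convention: lowercase name, optionally ending in dash + version digits.
    if (!requestMetadata.topic().matches("[a-z0-9.]+(-[0-9]+)?")) {
      throw new PolicyViolationException(
          "Topic name " + requestMetadata.topic() + " does not follow the naming convention");
    }
  }

  @Override
  public void close() {}
}
```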
Reading about the new Knative - apparently awesome enough to need another landing page :) - it could be that CloudEvents matches what I've been looking for. It seems to go further than mapping a schema to topic messages, as it discusses feeds and event types and their relation to an "action". Types are managed in Kubernetes using CRDs. They say that "A producer can generate events before a consumer is listening, and a consumer can express an interest in an event or class of events that is not yet being produced." That's in contrast to auto.create.topics.

Knative supports a "user-provided" Kafka event "Bus". There's an event source for Kubernetes events, interesting as an alternative to #186 #145. I haven't tested any of the above yet. I'll be particularly interested in how it relates to Kleppmann's […] that I referred to earlier. My first goal would be to see if I could implement event sources for our git hosting (Gogs, for which we already publish events to Kafka using webhooks + #183) and Registry, which didn't work with Pixy out of the box.
So we want to undo #107. It was partially based on a false assumption, as pointed out in #101 (comment): topics are created not only at produce but also by, for example, kafkacat -C. Typos cost us more time than it would take to automate topic creation; when we haven't automated, we run ./bin/kafka-topics.sh in a temporary pod.
We're a small team with a small set of kafka topics and clients, but already we need a self-documenting way to create and update topics.
Use cases:
- Set per-topic configuration like retention.ms.
- Using kafka-topics.sh or the Java API is OK for testing, but for production topics we want to create them with 1 replica during dev and >=3 in a real cluster (see the sketch below).
An option is to use Helm for topic creation manifests. I've searched for best practices, like tooling other organizations use, but with no luck. I imagine most teams will want to document their Kafka topics beyond a naming convention.
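Not established tooling either, but a minimal sketch of the Java API route, assuming Kafka's AdminClient (0.11+); topic names, partition counts and the bootstrap address are made-up examples. The replication factor is taken from the environment so dev can run 1 and a real cluster >=3, and the same jar could be run in a temporary pod as mentioned above:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.errors.TopicExistsException;

public class DeclaredTopics {

  public static void main(String[] args) throws Exception {
    short replicas =
        Short.parseShort(System.getenv().getOrDefault("TOPIC_REPLICATION_FACTOR", "1"));

    // The self-documenting list: name, partitions, per-topic config.
    List<NewTopic> topics = Arrays.asList(
        new NewTopic("orders", 12, replicas)
            .configs(Map.of("cleanup.policy", "compact")),
        new NewTopic("web-events", 12, replicas)
            .configs(Map.of("retention.ms", String.valueOf(7 * 24 * 3600 * 1000L))));

    Properties props = new Properties();
    props.put("bootstrap.servers", "bootstrap.kafka:9092");

    try (AdminClient admin = AdminClient.create(props)) {
      for (NewTopic topic : topics) {
        try {
          admin.createTopics(Arrays.asList(topic)).all().get();
          System.out.println("Created " + topic.name());
        } catch (ExecutionException e) {
          if (e.getCause() instanceof TopicExistsException) {
            // Idempotent: already-created topics are left alone.
            System.out.println(topic.name() + " already exists, skipping");
          } else {
            throw e;
          }
        }
      }
    }
  }
}
```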