Update Kafka (to 0.10) #1405

codefromthecrypt · 2016-11-16T23:40:28Z

Last year, we attempted to update Kafka to 0.9, but had to revert it due to the ordering problem implied.

Unfortunately no one responded to the email asking about kafka update plans: https://groups.google.com/forum/#!topic/zipkin-user/rQpecbvc6Y0

We shouldn't be pinned to 0.8 forever, and we are now getting requests for 0.10 support. I'm wondering how we should address Kafka updates, particularly as the server includes an all-jar. Unless packaging is different, it might imply weird tricks to be able support multiple versions in the same binary, or a separate build.

Can anyone share their current status wrt kafka and/or have ideas on how to support multiple versions simply?

cc @prat0318 @dgarson @SimenB @sveisvei @eirslett @kristofa @NegatioN @shakuzen

codefromthecrypt · 2016-11-17T01:42:07Z

ps according to confluent, it is still fine to use kafka 0.8 consumers and producers http://docs.confluent.io/3.0.0/upgrade.html

Since zipkin-server is standalone, it might not need to change. However, zipkin-reporter-java might want a kafka010 module since it shares the classpath with production apps (who might want to upgrade)

StephenWithPH · 2017-05-04T00:23:34Z

Thoughts on revisiting this? Using the 0.8 consumer brings with it a Zookeeper dependency; newer versions of the consumer library use Kafka itself for offset tracking.

#904 sticks with 0.8.x clients until we either make a standalone 0.9 kafka collector or re-evaluate when 0.9 adoption is higher... it's hard to gauge the pace of Kafka broker upgrades in the world at large. Anecdotally, half of the room at the last Kafka meetup was still on 0.8.

The standalone zipkin-kafka-connector approach sounds viable; it's similar to https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kafka.html.

We've just started digging deep into this part of Zipkin to brainstorm how to carve out the pinned 0.8 consumer. If the above sounds viable, we'll come back with a more detailed sketch.

codefromthecrypt · 2017-05-04T00:49:54Z

How does this impact you? Most of the time zipkin is a standalone server, so the library version conflict is of no concern (black box) Are you running a custom server, or is there some feature in 0.10 which you need zipkin to use?

codefromthecrypt · 2017-05-04T00:54:36Z

@StephenWithPH Sorry, you said "newer versions of the consumer library use Kafka itself for offset tracking" (still curious about custom build or not as that impacts the direction here)

Can you elaborate on this? Is this an automatic feature or would we need to expose configuration for it. Code example would be best.

Also, are you saying that now brokers self-discover? so you pass a broker url and that's it?

I'm also interested in progressing this, but need to pick your brain a bit. I think we'll need to do both 0.8 and 0.10, so this is the tricky part.

codefromthecrypt · 2017-05-04T00:55:04Z

pps are you using docker? If so, this gives us some more options on how to address this..

StephenWithPH · 2017-05-04T04:23:37Z

I'll answer the questions before diving into the details:

we are using Docker in prod
we use the factory openzipkin Docker image ❤️
newer versions of the Kafka consumer library write to a consumer offsets topic in Kafka itself. That's automatic... I don't believe there's any way not to do that, and no user code is needed. This Confluent blog post gives lots of details; note that the post is somewhat dated... this behavior is no longer in beta.
newer consumers do indeed discover all the brokers; see bootstrap.servers here.

From a high level, we strongly prefer to protect our Zookeeper from all of our various Kafka consumers. That's why we want to avoid indirectly using the old style consumer via Zipkin. Quoting some docs: ZooKeeper does not scale extremely well (especially for writes) when there are a large number of offsets (i.e., consumer-count * partition-count).

I agree that maintaining support for 0.8.x is necessary so we don't break existing users whose brokers are 0.8.x. We'd like to add support for the 0.10.x consumer (since our brokers are 0.10.x). The consumer-to-broker version limitation is still a thing, but sounds marginally less painful with 0.10.x.

It's likely a matter of time until 0.9.x support is a feature request from other Zipkin users. This is what lead us to link to the Apache Flink manner of handling "pluggable" Kafka consumer (or producer) client versions.

codefromthecrypt · 2017-05-04T04:40:33Z

incidentally, 0.10 has been more requested than 0.9. I'd prefer not to think about 0.9 if we can. Fine by you? :) Thanks for the details! I think we can sort something out for sure.

codefromthecrypt · 2017-05-04T04:44:23Z

First step would be to make a kafka10 collector impl, similar to what we did in reporter https://github.com/openzipkin/zipkin-reporter-java/tree/master/kafka10 This would need to also have a corresponding auto-configuration module. The last yard would be how to stitch it in. we can resolve that one way or another. For example, in docker we can see if KAFKA_BROKERLIST (or whatever name is used) is set while KAFKA_ZOOKEEPER is not. flipping on that, we can adjust the server classpath accordingly. This would allow old versions to continue w/ 0.8 and opt-in to 0.10 Do you have cycles to carve a kafka10 collector impl?

StephenWithPH · 2017-05-04T17:06:48Z

We do have cycles. We'll be back on this early next week.

dgrabows · 2017-05-08T04:52:43Z

I've been working on a Kafka 0.10 collector to see what would be involved in implementing one, primarily because the cluster I need Zipkin to consume span data from requires TLS. This isn't supported by the 0.8 consumer.

I have followed the pattern laid out out by the Kafka 0.8 collector. There is a kafka10 collector module and corresponding auto-configuration module. The zipkin-server module has been modified to run the collector when KAFKA10_BOOTSTRAP_SERVERS is set. It is functional at this point, although there is some cleanup to do in the collector and I'm suspect including the kafka10 collector in zipkin-server breaks the 0.8 collector. This is because the 0.10.2.0 version of kafka-clients takes precedence over the 0.8.x.x kafka-clients when resolving dependencies for zipkin-server. I'm not sure what the best way to address that classpath conflict is.

The work I've done so far is on the kafka10-collector branch of my fork. There is some documentation on how to configure the collector on https://github.com/dgrabows/zipkin/tree/kafka10-collector/zipkin-server#kafka-010-collector.

I'm happy to do what I can to contribute this back or help use this as the basis for other efforts.

codefromthecrypt · 2017-05-08T06:03:12Z

first feedback is thanks! Second is I wouldn't prefix these properties with kafka10. Not only will these not be in the same classpath together, but also the detection property KAFKA_BOOTSTRAP_SERVERS doesn't exist yet (so you can't clash with it). This property is used in spark-streaming, and wouldn't be something we would change for kafka11

dgrabows · 2017-05-08T14:22:33Z

I will change the environment variables to KAFKA_* instead of KAFKA10_*.

What about the underlying properties in https://github.com/dgrabows/zipkin/blob/kafka10-collector/zipkin-server/src/main/resources/zipkin-server-shared.yml? Right now, there are separate sets of properties for each Kafka collector, under the zipkin.collector.kafka and zipkin.collector.kafka10 namespaces. I'm leaning towards re-using the zipkin.collector.kafka properties for the kafka10 collector. Most of the properties are applicable to both collectors, and kafka10.consumer-threads could be dropped in favor of kafka.streams. My only lingering concern is that some properties, e.g. max-message-size, will only apply to one version of the Kafka collector and that may lead to confusion. I think that can be addressed sufficiently through documentation.

Thoughts?

NithinMadhavanpillai · 2017-05-08T18:05:22Z

Hi @adriancole ,
@StephenWithPH and I were looking at building from scratch a new Kafka 10 collector implementation based on your inputs. Today found that @dgrabows has covered a lot more on this and we would like to help you two in moving this forward. Could you please let us know if we can help you folks getting this to the master.

dgrabows · 2017-05-08T20:57:35Z

@NithinMadhavanpillai A couple of things that I would find helpful:

Feedback on the implementation of https://github.com/dgrabows/zipkin/tree/kafka10-collector/zipkin-collector/kafka10. I have a couple of todo comments in there that I intend to circle back to. Is there anything else about how this has been implemented that concerns you, isn't a good fit for your expected usage, or you think could be improved?
Thoughts on how to integrate this collector into zipkin-server without breaking the current 0.8 Kafka collector. The approach I've taken mostly likely breaks the 0.8 collector because of dependency resolution conflicts on different versions of kafka-clients. In the context of a docker deployment, Adrian mentioned checking for the Kafka collector environment variables and adjusting the classpath based on that. It sounds like that might work for you, because you're running Zipkin using docker. I will not be deploying it using docker. I could do the same thing with a wrapper shell script and I'd be fine with that solution. I'm not sure how well that generalizes to a solution that can be contributed back to the zipkin project since shells scripts imply implementation and testing across multiple platforms (linux, windows, etc.).

dgrabows · 2017-05-08T21:08:46Z

Another option for packaging the Kafka 0.10 collector would be to not integrate it into zipkin-server, and package it as a standalone collector instead. For example, use the same approach as the AWS SQS collector. That would be fine for my purposes, as I plan to have the collector running on a separate host from the query api for other reasons. I don't know if that's a desirable approach when considering ease-of-use for people using Zipkin and Kafka 0.10+ in general.

codefromthecrypt · 2017-05-08T23:56:26Z

Hi, Dan. Let's move this to a pull request? it is a lot easier to deal with different threads of discussion this way. wrt how we address integration, I think the best way is to act like sqs, where we package a module jar (and this includes a zipkin-server-kafka file like https://github.com/openzipkin/zipkin-aws/blob/master/autoconfigure/collector-sqs/src/main/resources/zipkin-server-sqs.yml ) The pom would include a module file as well, like this, except different excludes: https://github.com/openzipkin/zipkin-aws/blob/master/autoconfigure/collector-sqs/pom.xml Running will be a little messier than this because of classpath https://github.com/openzipkin/zipkin-aws/tree/master/autoconfigure/collector-sqs#running For example, we'll probably have to customize the docker image based on presence of the KAFKA_BOOTSTRAP_SERVERS variable, and move old jars out of the way.. we may end up lucky enough to not need to do that. Basically, once there's a module dir the next step is seeing how lucky we are.. in best case classpath order works and we don't need to hack the all jar to take out the kafka08 stuff. In the future, we could make a thin launcher https://github.com/dsyer/spring-boot-thin-launcher as the default image

codefromthecrypt · 2017-05-08T23:58:11Z

PS rationale is that unlike SQS, we have existing deployments with Kafka. Upgrading these sites should be easy a possible (vs creating a new image), so let's try to get the docker image working.. better to try and fail than not try and have people upset at now needing to think about multiple images when they formerly didn't.

dgrabows · 2017-05-09T01:33:28Z

That all makes sense. I created #1586 based on what I've got so far. I tried to capture the outstanding work I'm aware of on the PR.

codefromthecrypt referenced this issue in openzipkin/zipkin-finagle Nov 16, 2016

[maven-release-plugin] prepare for next development iteration

1c0ee7e

dgrabows mentioned this issue May 9, 2017

Add a Kafka 0.10+ compatible collector and attempt to integrate into zipkin-server #1586

Merged

8 tasks

codefromthecrypt closed this as completed in #1586 May 17, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update Kafka (to 0.10) #1405

Update Kafka (to 0.10) #1405

codefromthecrypt commented Nov 16, 2016

codefromthecrypt commented Nov 17, 2016

StephenWithPH commented May 4, 2017

codefromthecrypt commented May 4, 2017 via email

codefromthecrypt commented May 4, 2017

codefromthecrypt commented May 4, 2017

StephenWithPH commented May 4, 2017

codefromthecrypt commented May 4, 2017 via email

codefromthecrypt commented May 4, 2017 via email

StephenWithPH commented May 4, 2017

dgrabows commented May 8, 2017

codefromthecrypt commented May 8, 2017 via email

dgrabows commented May 8, 2017

NithinMadhavanpillai commented May 8, 2017 •

edited

Loading

dgrabows commented May 8, 2017

dgrabows commented May 8, 2017

codefromthecrypt commented May 8, 2017 via email

codefromthecrypt commented May 8, 2017 via email

dgrabows commented May 9, 2017

Update Kafka (to 0.10) #1405

Update Kafka (to 0.10) #1405

Comments

codefromthecrypt commented Nov 16, 2016

codefromthecrypt commented Nov 17, 2016

StephenWithPH commented May 4, 2017

codefromthecrypt commented May 4, 2017 via email

codefromthecrypt commented May 4, 2017

codefromthecrypt commented May 4, 2017

StephenWithPH commented May 4, 2017

codefromthecrypt commented May 4, 2017 via email

codefromthecrypt commented May 4, 2017 via email

StephenWithPH commented May 4, 2017

dgrabows commented May 8, 2017

codefromthecrypt commented May 8, 2017 via email

dgrabows commented May 8, 2017

NithinMadhavanpillai commented May 8, 2017 • edited Loading

dgrabows commented May 8, 2017

dgrabows commented May 8, 2017

codefromthecrypt commented May 8, 2017 via email

codefromthecrypt commented May 8, 2017 via email

dgrabows commented May 9, 2017

NithinMadhavanpillai commented May 8, 2017 •

edited

Loading