@zackdever
Collaborator

Ports the java metrics functionality, but only has one example instrumentation to keep size down.

The only thing that isn't a straight port is the DictReporter, which I made in place of the JMX reporter since there is no obvious equivalent in python. I don't have a good feel yet for how that will be used, so I would consider it a rough draft towards the standard python reporter.

One thing to keep an eye on is performance hits by the use of locks.

        return self._config

    def _set_config(self, config):
        with self._lock:
Owner

simple variable assignment is generally considered atomic in python. can we drop the lock here?

Collaborator Author

will remove the lock entirely from KafkaMetric and leave a note that the java implementation uses one.
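For illustration, here is a minimal sketch of what a lock-free config setter could look like. `MetricSketch` and `hammer` are hypothetical names invented for this example and are not the PR's `KafkaMetric`; the sketch assumes CPython, where rebinding a single attribute is atomic, so readers see either the old config or the new one.

```python
import threading


class MetricSketch(object):
    """Hypothetical stand-in for a metric holding a swappable config."""

    def __init__(self, config):
        self._config = config

    @property
    def config(self):
        return self._config

    @config.setter
    def config(self, config):
        # A single attribute rebind: under CPython, concurrent readers
        # observe either the previous object or the new one.
        self._config = config


def hammer(metric, configs, rounds):
    """Repeatedly swap the config from a worker thread."""
    for i in range(rounds):
        metric.config = configs[i % len(configs)]


configs = [{"samples": 2}, {"samples": 4}]
metric = MetricSketch(configs[0])

writers = [
    threading.Thread(target=hammer, args=(metric, configs, 10000))
    for _ in range(4)
]
for t in writers:
    t.start()
for t in writers:
    t.join()

# Whatever the interleaving, the result is one of the assigned objects.
final_config = metric.config
```

This only covers a plain assignment. A read-modify-write such as incrementing a counter or updating a dict in several steps is not atomic and would still need a lock.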

@zackdever force-pushed the metrics branch 5 times, most recently from b749402 to e2b340c, April 14, 2016 00:25
There is no straight translation for the JMX reporter into python,
so I'll do something else in a separate commit.
This adds the parent metrics instance to kafka consumer, which will
eventually be used to instrument everything under consumer. To start
I ported the java consumer coordinator metrics.
@zackdever
Collaborator Author

Assuming tests pass, I think this is ready to go. It provides a complete port of the java client's metric system, minus the JMX reporter. I haven't done much of the instrumentation, but have instrumented fetch requests and the consumer coordinator as a first pass example. My plan is to ship this with the next release, and then soon thereafter work on matching the java client's instrumentation.

Important: This doesn't publicly expose the metrics we're collecting just yet. Because the java client relies on JMX as the default built-in reporter, we need to make a design decision about how to handle that in kafka-python. I think a simple dictionary-based approach would make sense, but I would like to experiment with it some more before committing to a public interface. Towards this end I've made a DictReporter that can export a snapshot of all stats and their current values. That said, users can easily provide their own reporters through configuration to expose the metrics, just like in the java implementation. I've updated the documentation on KafkaConsumer to reflect this.
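To make the dictionary-based idea concrete, here is a rough sketch of a reporter that tracks registered metrics and exports a nested dict on demand. Everything in it is an assumption for illustration: `DictReporterSketch`, its method names, and the group and metric names are hypothetical and are not the interface this PR ships.

```python
import threading


class DictReporterSketch(object):
    """Hypothetical dictionary-based reporter (illustrative names only)."""

    def __init__(self):
        self._lock = threading.Lock()
        # Maps (group, name) to a zero-argument callable that returns
        # the metric's current value.
        self._metrics = {}

    def metric_change(self, group, name, value_fn):
        """Register a new metric or replace an existing one."""
        with self._lock:
            self._metrics[(group, name)] = value_fn

    def metric_removal(self, group, name):
        """Forget a metric; unknown metrics are ignored."""
        with self._lock:
            self._metrics.pop((group, name), None)

    def snapshot(self):
        """Return {group: {name: value}} evaluated at call time."""
        with self._lock:
            items = list(self._metrics.items())
        result = {}
        for (group, name), value_fn in items:
            result.setdefault(group, {})[name] = value_fn()
        return result


reporter = DictReporterSketch()
reporter.metric_change("coordinator", "commit-rate", lambda: 0.5)
reporter.metric_change("fetch", "fetch-rate", lambda: 2.0)
reporter.metric_change("fetch", "temporary", lambda: 1)
reporter.metric_removal("fetch", "temporary")

snapshot = reporter.snapshot()
# snapshot == {"coordinator": {"commit-rate": 0.5}, "fetch": {"fetch-rate": 2.0}}
```

The lock here only guards the registry dict; values are computed outside it so that a slow metric does not block registration. Whether that trade-off is right depends on how the real metrics are measured.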

#424 and #505 both add some metric instrumentation and provide a couple of ways of retrieving those metrics: one exposes a dictionary of metrics, and the other has a handler system. This PR provides the same reporter interface the java client provides, which I think is similar to the handler system in #505. My intent for the DictReporter is to make the default instance of it available through a public API similar to #424. I don't know if the java implementation has the same instrumentation as the two mentioned PRs. I plan to mirror the java instrumentation fairly closely, but there's no reason we couldn't add our own instrumentation if it made sense.

@zackdever zackdever merged commit cf679ae into dpkp:master Apr 14, 2016
@zackdever
Collaborator Author

Will be interesting to watch https://issues.apache.org/jira/browse/KAFKA-3377 to see if a REST metrics interface gets added to Kafka.
