
How to capture record-lag-max metric for consumer? #670

Closed
1 task
kasturichavan opened this issue Sep 23, 2019 · 9 comments
Comments

@kasturichavan

kasturichavan commented Sep 23, 2019

We want to scale our Kafka consumer with an HPA in Kubernetes based on the custom Kafka metric record-lag. Is there a method in confluent-kafka-python that exposes these metrics? How can we get this data?
Below is the fetch metrics list:
https://docs.confluent.io/current/kafka/monitoring.html#fetch-metrics

Checklist

Please provide the following information:

  • confluent-kafka-python version (confluent_kafka.version()): 0.11.5
@kasturichavan kasturichavan changed the title How to capture record_lag metric ? How to capture record-lag-max metric for consumer? Sep 23, 2019
@edenhill
Contributor

@kasturichavan
Author

kasturichavan commented Sep 23, 2019

I set "statistics.interval.ms": 10 for testing, so consumer.poll() should now trigger the stats_cb. But it isn't returning any data? Will the stats data be returned by the poll() function?

@edenhill
Contributor

You need to define a stats_cb callback too, and call poll() at regular intervals to trigger the stats.
(A stats interval of 10 ms might be a tad low.)

@kasturichavan
Author

kasturichavan commented Sep 23, 2019

Where do we pass the stats_cb? Sorry if my question is dumb, but based on the code specified above, stats_cb already seems to be defined. Can it be passed as a field to the Consumer init() call? The docs don't list it as a field the Consumer init accepts.

@edenhill
Contributor

def stats_cb(stats_json_str):
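For reference, a minimal sketch of wiring that callback up (the broker address, group id, and interval below are placeholders, not values from this thread): the callback is passed in the same configuration dict that goes to Consumer(), alongside statistics.interval.ms, and poll() must be called regularly for it to fire.

```python
import json

def stats_cb(stats_json_str):
    # librdkafka delivers the statistics as a JSON string
    stats = json.loads(stats_json_str)
    print("stats emitted for client:", stats.get("name"))

# The callback goes into the same configuration dict as the other
# consumer settings; statistics.interval.ms enables stats emission.
conf = {
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "group.id": "lag-monitor",              # placeholder group id
    "statistics.interval.ms": 5000,         # every 5 s, not 10 ms
    "stats_cb": stats_cb,
}

# from confluent_kafka import Consumer
# consumer = Consumer(conf)
# while True:
#     msg = consumer.poll(1.0)  # poll() regularly so stats_cb fires
```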

@kasturichavan
Author

kasturichavan commented Sep 26, 2019

@edenhill After implementing the above, I do get stats back.
I noticed the following issues:

  1. Sometimes the topics and toppars fields are empty, and sometimes they hold values from which I can extract the consumer_lag. https://docs.confluent.io/current/clients/librdkafka/md_STATISTICS.html

  2. The stats that get gathered are consumer-specific, right? So they only include the partitions assigned to this consumer. Or do they include info (and lag) for all partitions in the topic across all consumers?

@edenhill
Contributor

Stats will only be collected for the currently assigned/consumed partitions.
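Given that scoping, computing a record-lag-max equivalent means taking the largest consumer_lag across the partitions present in the stats JSON. A sketch (the helper name extract_max_lag and the sample shape are illustrative; the field layout follows the librdkafka STATISTICS.md document linked above, where consumer_lag is -1 when unknown):

```python
import json

def extract_max_lag(stats_json_str):
    """Parse a librdkafka stats JSON string and return the largest
    consumer_lag across the currently assigned partitions, or -1 if
    no partition reports a known lag."""
    stats = json.loads(stats_json_str)
    max_lag = -1
    for topic in stats.get("topics", {}).values():
        for partition in topic.get("partitions", {}).values():
            lag = partition.get("consumer_lag", -1)
            if lag > max_lag:
                max_lag = lag
    return max_lag
```

This value could then be exported (e.g. to a Prometheus gauge) for the HPA mentioned at the top of the thread.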

@kasturichavan
Author

kasturichavan commented Sep 30, 2019

@edenhill But why do the stats hold empty values in the topics field for a single consumer in a consumer group assigned to a topic with 2 partitions? These stats are collected via consumer poll(), so they should always reflect the currently assigned partitions.

@edenhill
Contributor

edenhill commented Oct 7, 2019

@kasturichavan Can you paste your stats json object?
