[FLINK-7945][Metrics&connector] Fix per-partition lag metric lost in Kafka connector #4935
Conversation
```java
Map<MetricName, ? extends Metric> metrics = consumer.metrics();
if (metrics == null) {
    // MapR's Kafka implementation returns null here.
    log.info("Consumer implementation does not support metrics");
```
The log will be overloaded with these messages if the MapR implementation is used and metrics are turned on.
I changed the level to debug.
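The change under discussion is presumably along these lines (a sketch only; the surrounding Flink consumer class is not part of this hunk, and `MetricsLogging` is a made-up name):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch: the MapR null-metrics case is logged at debug level so that
// repeated polls do not flood the log at info level.
public class MetricsLogging {

    private static final Logger log = LoggerFactory.getLogger(MetricsLogging.class);

    void onMetricsUnavailable() {
        // MapR's Kafka implementation returns null from consumer.metrics().
        log.debug("Consumer implementation does not support metrics");
    }
}
```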
log.info("Consumer implementation does not support metrics"); | ||
} else { | ||
// we have Kafka metrics, register them | ||
for (Map.Entry<MetricName, ? extends Metric> metric: metrics.entrySet()) { |
I'm really not sure about this.
This includes a loop through every consumer metric on every record poll.
AFAIK, the Kafka consumer ships with at least 6~8 built-in metrics. That could be harmful to the performance of the consumer.
Is there any way we can avoid that?
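One way the re-registration work could be avoided (a hypothetical sketch, not the PR's actual code; the class and helper names here are made up) is to remember which metric names have already been registered, so that only newly appearing metrics are touched:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

// Hypothetical sketch: remember which metric names were already registered so
// that the per-poll loop only acts on metrics that appeared since the last poll.
public class MetricRegistrationTracker {

    private final Set<MetricName> registeredMetrics = new HashSet<>();

    public void registerNewMetrics(Map<MetricName, ? extends Metric> metrics) {
        for (Map.Entry<MetricName, ? extends Metric> metric : metrics.entrySet()) {
            // Set#add returns false when the name is already present,
            // so each metric gets registered exactly once.
            if (registeredMetrics.add(metric.getKey())) {
                registerKafkaMetric(metric.getKey(), metric.getValue());
            }
        }
    }

    private void registerKafkaMetric(MetricName name, Metric metric) {
        // Placeholder for the actual registration against Flink's metric group.
        System.out.println("registering " + name.name() + " " + name.tags());
    }
}
```

This still iterates the metrics map on each poll, but it skips the more expensive registration work for metrics that are already known.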
Yes, I agree with you that this is not the best way to solve it. What do you think about trying to register the Kafka metrics only during the first several polls of the job, with the count configurable via properties? Once we go beyond that count, we no longer run the loop~
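A minimal sketch of that counter-based idea (the `max.register.attempts` property name and the class are made up for illustration, not taken from the actual PR):

```java
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

// Hypothetical sketch: only attempt registration during the first N polls,
// where N comes from an assumed "max.register.attempts" property.
public class BoundedMetricRegistration {

    private final int maxRegistrationAttempts;
    private int attempts;

    public BoundedMetricRegistration(Properties props) {
        // "max.register.attempts" is an assumed property name for illustration.
        this.maxRegistrationAttempts =
            Integer.parseInt(props.getProperty("max.register.attempts", "3"));
    }

    public void maybeRegister(Map<MetricName, ? extends Metric> metrics) {
        if (attempts >= maxRegistrationAttempts) {
            return; // beyond the configured count, skip the loop entirely
        }
        attempts++;
        for (Map.Entry<MetricName, ? extends Metric> metric : metrics.entrySet()) {
            // register metric.getKey() / metric.getValue() with the metric group
            System.out.println("registering " + metric.getKey().name());
        }
    }
}
```

The trade-off raised later in this review still applies: any metric that only appears after the configured number of polls would be missed.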
I have implemented registering during the first several polls and then skipping the registration, and it now successfully registers the related metrics. Please let me know if you have any suggestions, thanks~ The commits have been squashed :)
One other side note:
Force-pushed from d40d0f7 to 9b70e49
Update the code according to the comments. ping @tzulitai
Hi @tzulitai, could you take a look at this again :-) ?
ping @tzulitai ~
Hi @Aitozi, sorry for the long delay in getting back to this PR. I'm still not convinced that this is a sane solution. For example, what is a "good" setting for the configured registration count? I wonder if we can try the following two approaches:
What do you think?
Hi @tzulitai, please let me know what you think, thanks.
What is the purpose of the change
When using the Kafka connector, we can't get the per-partition lag metric, even though Kafka has exposed it since 0.10.2 (https://issues.apache.org/jira/browse/KAFKA-4381). After reading the Kafka code, I found that the per-partition lag is registered only after the KafkaConsumer#poll method is invoked, so I changed the time of the metric registration in Flink. After this, with kafka-connector10 and kafka-connector11 we can see the correct lag metric.
Brief change log
Verifying this change
This change is already covered by the existing test cases.
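As an additional manual check, here is a standalone sketch against the plain Kafka consumer API (the bootstrap address, group id, and topic name are placeholders) showing that the per-partition lag metrics only appear after the first poll(), which is the behavior this PR works around:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class LagMetricCheck {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "lag-metric-check");        // placeholder
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test-topic")); // placeholder

            // Before the first poll: no per-partition records-lag metrics yet.
            printLagMetrics("before poll", consumer.metrics());

            consumer.poll(1000L); // poll(long) in the 0.10.x / 0.11.x clients

            // After poll(): the per-partition lag metrics have been registered.
            printLagMetrics("after poll", consumer.metrics());
        }
    }

    private static void printLagMetrics(String when, Map<MetricName, ? extends Metric> metrics) {
        System.out.println("--- " + when + " ---");
        for (MetricName name : metrics.keySet()) {
            if (name.name().contains("records-lag")) {
                System.out.println(name.name() + " " + name.tags());
            }
        }
    }
}
```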
Does this pull request potentially affect one of the following parts:
The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
Documentation