KAFKA-9219: prevent NullPointerException when polling metrics from Kafka Connect #7652

Merged
merged 1 commit into from Nov 27, 2019

ning2008wisc
Contributor

@ning2008wisc ning2008wisc commented Nov 6, 2019

assignmentSnapshot is not always initialized. If it is still null, registering the assigned-connectors and assigned-tasks metrics upfront causes a
NullPointerException when the two metrics are later polled via JmxReporter:

[2019-11-05 23:56:57,909] WARN Error getting JMX attribute 'assigned-tasks' (org.apache.kafka.common.metrics.JmxReporter:202)
java.lang.NullPointerException
	at org.apache.kafka.connect.runtime.distributed.WorkerCoordinator$WorkerCoordinatorMetrics$2.measure(WorkerCoordinator.java:316)
	at org.apache.kafka.common.metrics.KafkaMetric.metricValue(KafkaMetric.java:66)
	at org.apache.kafka.common.metrics.JmxReporter$KafkaMbean.getAttribute(JmxReporter.java:190)
	at org.apache.kafka.common.metrics.JmxReporter$KafkaMbean.getAttributes(JmxReporter.java:200)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttributes(DefaultMBeanServerInterceptor.java:709)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttributes(JmxMBeanServer.java:705)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1449)
	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
	at javax.management.remote.rmi.RMIConnectionImpl.getAttributes(RMIConnectionImpl.java:675)
	at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
	at sun.rmi.transport.Transport$1.run(Transport.java:200)
	at sun.rmi.transport.Transport$1.run(Transport.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:835)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2019-11-05 23:57:02,821] INFO [Worker clientId=connect-1, groupId=backup-mm2] Herder stopped (org.apache.kafka.connect.runtime.distributed.DistributedHerder:629)
[2019-11-05 23:57:02,821] INFO [Worker clientId=connect-2, groupId=cv-mm2] Herder stopping (org.apache.kafka.connect.runtime.distributed.DistributedHerder:609)
[2019-11-05 23:57:07,822] INFO [Worker clientId=connect-2, groupId=cv-mm2] Herder stopped (org.apache.kafka.connect.runtime.distributed.DistributedHerder:629)
[2019-11-05 23:57:07,822] INFO Kafka MirrorMaker stopped. (org.apache.kafka.connect.mirror.MirrorMaker:191)

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ning2008wisc ning2008wisc changed the title MINOR: prevent NullPointerException of polling metrics in Kafka Connect MINOR: prevent NullPointerException of polling metrics from Kafka Connect Nov 6, 2019
@ning2008wisc ning2008wisc changed the title MINOR: prevent NullPointerException of polling metrics from Kafka Connect MINOR: prevent NullPointerException when polling metrics from Kafka Connect Nov 6, 2019
@mimaison mimaison changed the title MINOR: prevent NullPointerException when polling metrics from Kafka Connect KAFKA-9219: prevent NullPointerException when polling metrics from Kafka Connect Nov 21, 2019
@mimaison
Member

Thanks for the PR.
To allow users to easily find this issue, I've created a JIRA: https://issues.apache.org/jira/browse/KAFKA-9219 and renamed your PR.

metrics.addMetric(metrics.metricName("assigned-tasks",
    this.metricGrpName,
    "The number of tasks currently assigned to this consumer"), numTasks);
if (assignmentSnapshot != null) {
Member

With this change, if assignmentSnapshot is null when we enter this constructor, the metrics will never get registered, even if assignmentSnapshot is initialized shortly afterwards.
Should we still register the metrics, and return Double.NaN from both measure() methods if assignmentSnapshot is null?
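
To illustrate the difference between the two options, here is a minimal, self-contained sketch. It does not use the Kafka classes: `LazyMetricSketch`, `Snapshot`, `registry`, `registerGuarded` and `registerAlways` are hypothetical names invented for this example, and a plain map of `DoubleSupplier` stands in for the metrics registry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.DoubleSupplier;

public class LazyMetricSketch {
    // Hypothetical stand-in for the coordinator's assignment snapshot.
    static final class Snapshot {
        final int taskCount;
        Snapshot(int taskCount) { this.taskCount = taskCount; }
    }

    // Hypothetical stand-in for a metrics registry: metric name -> value supplier.
    final Map<String, DoubleSupplier> registry = new LinkedHashMap<>();

    // Null until the first assignment arrives.
    volatile Snapshot snapshot;

    // Option A: guard at registration time. If snapshot is null at this point,
    // the metric is never registered, even if snapshot is set later.
    void registerGuarded() {
        if (snapshot != null) {
            registry.put("assigned-tasks", () -> snapshot.taskCount);
        }
    }

    // Option B: always register, and guard at read time instead.
    void registerAlways() {
        registry.put("assigned-tasks", () -> {
            Snapshot s = snapshot; // read the field once so the null check and the use agree
            return s == null ? Double.NaN : s.taskCount;
        });
    }

    public static void main(String[] args) {
        LazyMetricSketch guarded = new LazyMetricSketch();
        guarded.registerGuarded();
        guarded.snapshot = new Snapshot(3);
        System.out.println("guarded registered: " + guarded.registry.containsKey("assigned-tasks"));

        LazyMetricSketch always = new LazyMetricSketch();
        always.registerAlways();
        System.out.println("before assignment: " + always.registry.get("assigned-tasks").getAsDouble());
        always.snapshot = new Snapshot(3);
        System.out.println("after assignment: " + always.registry.get("assigned-tasks").getAsDouble());
    }
}
```

With the read-time guard, the metric exists from the start and begins reporting a real value as soon as the snapshot is set, which is the behaviour the question above is asking for.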

Contributor Author

@mimaison that sounds like a good option, and one I had also considered. Let me make this change :)

@ning2008wisc ning2008wisc force-pushed the assignmentSnapshot_fix branch 2 times, most recently from 09fb4a1 to bb4ec47 Compare November 22, 2019 22:33
@ning2008wisc
Contributor Author

@mimaison I updated the PR, please take another look.

@edoardocomar
Contributor

@ning2008wisc I would consider returning 0 rather than NaN

…onnect

`assignmentSnapshot` is not always initialized, especially when
Kafka Connect is started from scratch. If `assignmentSnapshot` is not initialized,
blindly registering the `assigned-connectors` and `assigned-tasks` metrics will cause a
NullPointerException when the two metrics are polled.

The proposed fix is to return 0 from both measure() methods if `assignmentSnapshot` is null.
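
A rough sketch of what a null-safe measure that defaults to 0 could look like is shown below. It is not the merged code: the `Gauge` interface is a simplified, hypothetical stand-in for a metric callback, and `ZeroDefaultGaugeSketch`, `Assignment` and `onAssigned` are names made up for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ZeroDefaultGaugeSketch {
    // Simplified, hypothetical stand-in for a metric callback interface.
    interface Gauge {
        double measure();
    }

    // Hypothetical assignment holder; the field names are illustrative only.
    static final class Assignment {
        final List<String> connectors;
        final List<String> tasks;
        Assignment(List<String> connectors, List<String> tasks) {
            this.connectors = connectors;
            this.tasks = tasks;
        }
    }

    // Null until the worker receives its first assignment.
    private volatile Assignment assignmentSnapshot;

    // Both gauges copy the field to a local first, so a concurrent update
    // cannot slip in between the null check and the size() call.
    final Gauge numConnectors = () -> {
        Assignment local = assignmentSnapshot;
        return local == null ? 0.0 : local.connectors.size();
    };

    final Gauge numTasks = () -> {
        Assignment local = assignmentSnapshot;
        return local == null ? 0.0 : local.tasks.size();
    };

    void onAssigned(Assignment assignment) {
        this.assignmentSnapshot = assignment;
    }

    public static void main(String[] args) {
        ZeroDefaultGaugeSketch coordinator = new ZeroDefaultGaugeSketch();
        // Polling before any assignment no longer throws; it reports zero.
        System.out.println(coordinator.numConnectors.measure() + " " + coordinator.numTasks.measure());

        coordinator.onAssigned(new Assignment(
                Collections.singletonList("connector-a"),
                Arrays.asList("connector-a-0", "connector-a-1")));
        System.out.println(coordinator.numConnectors.measure() + " " + coordinator.numTasks.measure());
    }
}
```

Reporting 0 before the first assignment keeps the value numeric for dashboards and alerting, at the cost of not distinguishing "no assignment yet" from "assigned nothing".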
@ning2008wisc
Contributor Author

@edoardocomar I updated the PR to return 0.0 rather than NaN.

@mimaison mimaison merged commit 88448f6 into apache:trunk Nov 27, 2019