Skip to content

PrometheusEmitter NullPointerException fix#13286

Merged
FrankChen021 merged 4 commits intoapache:masterfrom
599166320:feature-prometheus-emitter-bugfix
Nov 3, 2022
Merged

PrometheusEmitter NullPointerException fix#13286
FrankChen021 merged 4 commits intoapache:masterfrom
599166320:feature-prometheus-emitter-bugfix

Conversation

@599166320
Copy link
Contributor

Fixes #13285.

Description

PrometheusEmitter NullPointerException fix


Key changed/added classes in this PR
  • PrometheusEmitter

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

private HTTPServer server;
private PushGateway pushGateway;
private String identifier;
private volatile String identifier;
Copy link
Contributor

@cryptoe cryptoe Oct 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious to understand the race here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existence of multiple threads calling emitMetric will modify the identifier. Even if the identifier is modified countless times, there will always be only one result, which is an idempotent design.
Currently only the thread of PrometheusPushGatewayEmitter will read the value of identifier. In order to guarantee the visibility of the reading thread, I use volatile to decorate the identifier.

In addition, there is an inefficient way to use LinkedBlockingQueue to achieve safe publishing. This is how InfluxdbEmitter is implemented.

@FrankChen021
Copy link
Member

Thanks for the fix.

From the assignment statement, I don't understand how could identifier be null?

identifier = (userDims.get("task") == null ? metricEvent.getHost() : (String) userDims.get("task"));

@599166320
Copy link
Contributor Author

I realized that there are two conditions that may lead to NPE in the access identifier

  1. The identifier may be modified by the emitMetric thread. Since identifier is a common object attribute, it may not be visible to the PrometheusPushGatewayEmitter thread

  2. emitMetric is not called in advance, but the identifier is accessed.

Copy link
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @FrankChen021 Please let me know if it's okay from your side as well.

@FrankChen021
Copy link
Member

I realized that there are two conditions that may lead to NPE in the access identifier

  1. The identifier may be modified by the emitMetric thread. Since identifier is a common object attribute, it may not be visible to the PrometheusPushGatewayEmitter thread
  2. emitMetric is not called in advance, but the identifier is accessed.

Yeah, the 2nd case is the reason.
The check logic in pushMetric and flush is a little bit chaos. Could you improve it by putting all checks together like:

  private void pushMetric()
  {
    if ( pushGateway == null || identifier == null) {
       return ;
   }
  ...

Another point(not relevant to this PR) is, checking if namespace is null is meaningless since it's guaranteed it won't be null.

  @JsonCreator
  public PrometheusEmitterConfig(
      ...
  )
  {
    ...
    this.namespace = namespace != null ? namespace : "druid";

@599166320
Copy link
Contributor Author

@FrankChen021 I debugged. If druid.emitter.prometheus.namespace=xxx is not configured, the value of namespace will be at least druid. In order to make the code more concise and clear, I have deleted meaningless judgments

@FrankChen021
Copy link
Member

Thanks for the cleanup.
We also need to remove the Nullable annotation from the namespace since it won't be null at all.

Copy link
Member

@FrankChen021 FrankChen021 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's wait for the CI to complete.

@FrankChen021 FrankChen021 merged commit c5fcc03 into apache:master Nov 3, 2022
@kfaraz kfaraz added this to the 25.0 milestone Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PrometheusEmitter has null pointer exception

4 participants