Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spring Boot] Collector registered multiple times #279

Closed
lupo112 opened this issue Jul 19, 2017 · 28 comments
Closed

[Spring Boot] Collector registered multiple times #279

lupo112 opened this issue Jul 19, 2017 · 28 comments

Comments

@lupo112
Copy link

lupo112 commented Jul 19, 2017

Hello

In Spring Boot implementation there are some issues with registering of Collectors:

  • in tests I am using different @SpringBootTest configurations, some tests failed on multiple registering of the same collector
  • dev tools after recompiling does not started, again failed on multiple registration

After some experiments, I have found the solution. The problem is that it is used CollectorRegistry.defaultRegistry. Using created bean, all works. Can you add support for injecting of CollectorRegistry as Bean? For example:

	@Bean
	public CollectorRegistry collectorRegistry() {
		return new CollectorRegistry(true);
	}
@brian-brazil
Copy link
Contributor

A given metric is only meant to be registered once, typically via a class static initialiser so this doesn't usually come up in standard usage. Which metrics are conflicting exactly?

@lupo112
Copy link
Author

lupo112 commented Jul 19, 2017

Here is the example:

https://github.com/lupo112/prometheus_client_java_279

If you try to run all test, it will fail on:

java.lang.IllegalArgumentException: Collector already registered that provides name: service_computation_latency_seconds_count
	at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:54)
	at io.prometheus.client.SimpleCollector$Builder.register(SimpleCollector.java:245)
	at io.prometheus.client.SimpleCollector$Builder.register(SimpleCollector.java:237)
	at com.github.lupo112.service.IssueService.<init>(IssueService.java:17)

Or dev server does not work. Start server and then try to change any file and recompile. It should reload given class. It will failed on:

java.lang.IllegalArgumentException: Collector already registered that provides name: controller_latency_seconds_count
	at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:54) ~[simpleclient-0.0.24.jar:na]
	at io.prometheus.client.SimpleCollector$Builder.register(SimpleCollector.java:245) ~[simpleclient-0.0.24.jar:na]
	at io.prometheus.client.SimpleCollector$Builder.register(SimpleCollector.java:237) ~[simpleclient-0.0.24.jar:na]
	at com.github.lupo112.controller.IssueController.<init>(IssueController.java:19) ~[classes/:na]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_91]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_91]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_91]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_91]
	at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:142) ~[spring-beans-4.3.9.RELEASE.jar:4.3.9.RELEASE]

@checketts
Copy link
Contributor

@lupo112
Copy link
Author

lupo112 commented Jul 19, 2017

Thanks, but that will resolve only the first problem.

Second problem with dev server still remains:

java.lang.IllegalArgumentException: Collector already registered that provides name: controller_latency_seconds_count
	at io.prometheus.client.CollectorRegistry.register(CollectorRegistry.java:54) ~[simpleclient-0.0.24.jar:na]
	at io.prometheus.client.SimpleCollector$Builder.register(SimpleCollector.java:245) ~[simpleclient-0.0.24.jar:na]
	at io.prometheus.client.SimpleCollector$Builder.register(SimpleCollector.java:237) ~[simpleclient-0.0.24.jar:na]
	at com.github.lupo112.controller.IssueController.<clinit>(IssueController.java:19) ~[classes/:na]
	... 29 common frames omitted

@checketts
Copy link
Contributor

@lupo112 Great point since the dev server reloads classes for quicker dev experience

@lupo112
Copy link
Author

lupo112 commented Jul 19, 2017

@checketts exactly!

@brian-brazil
Copy link
Contributor

With the way metrics work, you need a way to use the same metric object across reloads.

@ollio
Copy link

ollio commented Aug 23, 2017

Hello,

I have the same issue with a spring-cloud app.

In the CollectorRegistry line 192 function filter() I found the following:
if (includedNames.isEmpty()) { return next; } else { ...
I think the problem will be solved, if the name of the sample will be added into the includedNames Set.

Thanks Olli

@brian-brazil
Copy link
Contributor

That's unrelated. This issue is due to using the simpleclient incorrectly.

@Serranya
Copy link

I have the same problem. When using @PrometheusTimeMethod.

Some of my @SpringBootTest tests are annotated with @DirtiesContext. When spring creates a new context the metrics are registered again:

Collector already registered that provides name: endpoint_availability_count

@golonzovsky
Copy link

golonzovsky commented Oct 11, 2017

as a possible workaround in tests:

@After
public void tearDown() {
	CollectorRegistry.defaultRegistry.clear();
}

@thomasdarimont
Copy link

thomasdarimont commented Oct 16, 2017

The following hack avoids the dev-tools reload problem:

package demo;

import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.spring.boot.EnablePrometheusEndpoint;
import io.prometheus.client.spring.boot.EnableSpringBootMetricsCollector;
import org.springframework.context.annotation.Configuration;

@Configuration
// Registers /prometheus endpoint
@EnablePrometheusEndpoint
// Exposes spring boot metrics via the prometheus endpoint
@EnableSpringBootMetricsCollector
class PrometheusConfig {

  static {
    //HACK Avoids duplicate metrics registration in case of Spring Boot dev-tools restarts 
    CollectorRegistry.defaultRegistry.clear();
  }
}

@codefromthecrypt
Copy link

MetricsFilter implicitly registers a new collector (which you cannot access as it is private) on init. It should deregister it on destroy. What would be the harm of using lifecycle such as this?

@brian-brazil
Copy link
Contributor

It should deregister it on destroy. What would be the harm of using lifecycle such as this?

Once a metric is registered, it should stay registered until the process terminates. Time series should not appear and disappear over the lifetime of the process, that's difficult to deal with semantically.

@geoand
Copy link

geoand commented Nov 22, 2017

My understanding of the problem is the following:

  • While in production an application should never de-register a metric because that creates semantic problems for timeseries data that can't / shouldn't be handled by Prometheus.
  • In non-production environments such as unit tests or during development where an application might be restarted within the same OS process, the developer doesn't really care about the semantic correctness of the metrics, but instead justs wants to stand up the application with the metrics infrastructure enabled. In such cases we would like to be able to de-register / register metrics via some mechanism that would not mislead consumers of the library to think that it's a generally applicable.

Am I missing something else here?

I just wanted to get as clear a picture as possible of what's going on here because I am sure that there are (or will be in the future) many people that will stumble upon this issue and scratch there heads.

@codefromthecrypt
Copy link

The following servlet filter code makes it impractical to comply with @brian-brazil's desire

https://github.com/prometheus/client_java/blob/master/simpleclient_servlet/src/main/java/io/prometheus/client/filter/MetricsFilter.java#L148

As you'll notice the metric itself is instantiated and then cached as a private field which we cannot access. Somehow, we are supposed to reuse it when the application reloads. I suppose this means we are supposed to now cache the filter object itself. Basically it is extremely unpractical advise.

If there's a desire to ensure only one metric is made in a JVM singleton, there are other approaches to do that. For example, you can idempotently create the metric (by caching its input and if equals don't try again, but do that in the layer that's enforcing everything)

Right now, feedback feels quite draconian, limited and arbitrary

@brian-brazil
Copy link
Contributor

The following servlet filter code makes it impractical to comply with @brian-brazil's desire

That's a new issue. At a first glance, it is indeed the filter that would need to be reused, or use your own metrics.

For example, you can idempotently create the metric (by caching its input and if equals don't try again, but do that in the layer that's enforcing everything)

We purposefully avoid that on performance grounds. See https://www.robustperception.io/label-lookups-and-the-child/

That sounds like something you might be able to do within Spring Boot though.

Right now, feedback feels quite draconian, limited and arbitrary

The problem as I see it is that you are trying to do something quite complicated (dynamic class reloads) with known drawbacks, and you ran into the drawbacks.
All the errors and issues you are running into are protections against doing things incorrectly in the more standard cases, and they're also stopping you from doing something incorrect here.

I can only look at these things from the Prometheus standpoint, and help you avoid the pitfalls there. I'm afraid I can't hand you a solution as I'm not a Spring Boot expert.

@brian-brazil
Copy link
Contributor

@geoand Yes on the first point. I'd count development as production for the 2nd point ("development" is very broad), but yes on unittests.

@geoand
Copy link

geoand commented Nov 22, 2017

@brian-brazil Thanks for clearing that up

@brian-brazil
Copy link
Contributor

To provide some general background on why this is an error:

A common user mistake is to not make their metrics static, and thus there'd be duplicate metrics which mess everything up. Historically Prometheus would reject such bad input and fail the scrape, however discovering and debugging that was tricky for users. So in order to fail faster and make it easier for users to fix, the client library now detects this and throws an exception.
A related mistake is having the same metric defined in two different files, which now fails at startup.

@geoand
Copy link

geoand commented Nov 22, 2017

A very crude and also Spring Boot specific solution for sidestepping the problem of metrics getting registered multiple time in tests that create more than one ApplicationContext (for example by tests that use @DirtiesContext) is the following:

Create a class named ResetMetricsTestExecutionListener like so:

public class ResetMetricsTestExecutionListener extends AbstractTestExecutionListener {

    @Override
    public void beforeTestMethod(TestContext testContext) {
        CollectorRegistry.defaultRegistry.clear();
    }
}

Now on each test class (or perhaps only on an Abstract Base class which all Spring tests extend) would look something like the following:

@RunWith(SpringRunner.class)
@SpringBootTest //use whatever extra configuration you need on this annotation
@TestExecutionListeners(listeners = ResetMetricsTestExecutionListener.class, mergeMode =TestExecutionListeners.MergeMode.MERGE_WITH_DEFAULTS)
public class WhateverSpringTest {

}

Take note that the test above is a JUnit 4 test, but could easily be written in JUnit 5 with the necessary changes.

@codefromthecrypt
Copy link

codefromthecrypt commented Nov 24, 2017

In zipkin, we are no longer using a servlet filter for http duration. Here's the code we used for a very short while due to this issue. Hopefully will help give a different perspective even if not integrated here. Hope it helps someone.

/**
   * The normal prometheus metrics filter implicitly registers a histogram which is hidden in a
   * field and not deregistered on destroy. A registration of any second instance of that filter
   * fails trying to re-register the same collector by design (by brian-brazil). The rationale is
   * that you are not supposed to recreate the same histogram. However, this design makes it
   * impossible to re-init a servlet (ex in spring boot tests).
   *
   * <p>This filter replaces the normal prometheus filter, correcting the design flaw by allowing us
   * to re-use the JVM singleton. It also corrects a major flaw in the upstream filter which results
   * in double-counting of requests when they are performed asynchronously.
   */
  static final class PrometheusDurationFilter implements Filter {
    final Histogram httpRequestLatency;

    PrometheusDurationFilter(Histogram httpRequestLatency) { // << pass singleton here
      this.httpRequestLatency = httpRequestLatency;
    }

    @Override public void init(FilterConfig filterConfig) {
       // If you need a no-args ctor, this implies the field httpRequestLatency is null.
       // you could look for a context parameter for the histogram (ex assigned by a listener)
       // failing that, you can lazy-init/register a new one. This impl just accepts a pre-ordained
       // staticly initialized histogram.
    }

    @Override public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
      FilterChain filterChain) throws IOException, ServletException {

      HttpServletRequest request = (HttpServletRequest) servletRequest;

      // async servlets will enter the filter twice
      if (request.getAttribute("PrometheusDurationFilter") != null) {
        filterChain.doFilter(request, servletResponse);
        return;
      }

      request.setAttribute("PrometheusDurationFilter", "true");

      Histogram.Timer timer = httpRequestLatency
        .labels(request.getRequestURI(), request.getMethod())
        .startTimer();

      try {
        filterChain.doFilter(servletRequest, servletResponse);
      } finally {
        if (request.isAsyncStarted()) { // we don't have the actual response, handle later
          request.getAsyncContext().addListener(new CompleteTimer(timer));
        } else { // we have a synchronous response, so we can finish the recording
          timer.observeDuration();
        }
      }
    }

    @Override public void destroy() {
    }
  }

  /** Inspired by WingtipsRequestSpanCompletionAsyncListener */
  static final class CompleteTimer implements AsyncListener {
    final Histogram.Timer timer;
    volatile boolean completed = false;

    CompleteTimer(Histogram.Timer timer) {
      this.timer = timer;
    }

    @Override public void onComplete(AsyncEvent e) {
      tryComplete();
    }

    @Override public void onTimeout(AsyncEvent e) {
      tryComplete();
    }

    @Override public void onError(AsyncEvent e) {
      tryComplete();
    }

    /** Only observes the first completion event */
    void tryComplete() {
      if (completed) return;
      timer.observeDuration();
      completed = true;
    }

    /** If another async is created (ex via asyncContext.dispatch), this needs to be re-attached */
    @Override public void onStartAsync(AsyncEvent event) {
      AsyncContext eventAsyncContext = event.getAsyncContext();
      if (eventAsyncContext != null) eventAsyncContext.addListener(this);
    }
  }

@brian-brazil
Copy link
Contributor

Thank you for sharing your code, would someone like to send in a PR to fix the async issue?

@yarix
Copy link

yarix commented Jan 1, 2018

the workaround for spring-dev-tools by clearing the registry... just works!

static {
    //HACK Avoids duplicate metrics registration in case of Spring Boot dev-tools restarts 
    CollectorRegistry.defaultRegistry.clear();
  }

are there any 'hidden' side affects for that?

@brian-brazil
Copy link
Contributor

The only issue I can see for tests is if you had a test depending on it not being cleared. I haven't heard of anyone running into that one.

@brian-brazil
Copy link
Contributor

This has gone a bit stale, and I don't think there's anything for us to do here especially as there's Spring 2 now which we don't support.

@ashwinibest
Copy link

even after doing this

    fun teardown() {
        CollectorRegistry.defaultRegistry.clear()
    }

my tests are flaky, sometimes they pass, but other time they fail
@brian-brazil can you help ?

@fstab
Copy link
Member

fstab commented Feb 18, 2023

This is a very old closed issue. Please create a new one, and provide more context, like which Spring Boot version you are using and how you are using client_java with it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests