Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TagsCache returns wrong tags #185

Closed
AlexMisha opened this issue Jul 8, 2023 · 2 comments · Fixed by #189
Closed

TagsCache returns wrong tags #185

AlexMisha opened this issue Jul 8, 2023 · 2 comments · Fixed by #189
Assignees
Labels
bug Something isn't working
Milestone

Comments

@AlexMisha
Copy link

Version

4.3.8, 4.4.0

Context

I use Prometheus for metrics collection.
Worker pool in unordered mode is widely used in my project.
I encountered an exception after upgrade my project to vert.x 4.3.8:

java.lang.IllegalArgumentException: Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter containing tag keys [address]. The meter you are attempting to register has keys [pool_name].
        at io.micrometer.prometheus.PrometheusMeterRegistry.lambda$collectorByName$9(PrometheusMeterRegistry.java:360)
        at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1940)
        at io.micrometer.prometheus.PrometheusMeterRegistry.collectorByName(PrometheusMeterRegistry.java:347)
        at io.micrometer.prometheus.PrometheusMeterRegistry.newGauge(PrometheusMeterRegistry.java:229)
        at io.micrometer.core.instrument.MeterRegistry.lambda$gauge$1(MeterRegistry.java:304)
        at io.micrometer.core.instrument.MeterRegistry.lambda$registerMeterIfNecessary$5(MeterRegistry.java:563)
        at io.micrometer.core.instrument.MeterRegistry.getOrCreateMeter(MeterRegistry.java:618)
        at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:570)
        at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:563)
        at io.micrometer.core.instrument.MeterRegistry.gauge(MeterRegistry.java:304)
        at io.micrometer.core.instrument.Gauge$Builder.register(Gauge.java:195)
        at io.vertx.micrometer.impl.meters.Gauges.get(Gauges.java:78)
        at io.vertx.micrometer.impl.meters.Gauges.get(Gauges.java:66)
        at io.vertx.micrometer.impl.VertxEventBusMetrics.messageReceived(VertxEventBusMetrics.java:122)
        at io.vertx.core.eventbus.impl.EventBusImpl.deliverMessageLocally(EventBusImpl.java:361)
        at io.vertx.core.eventbus.impl.EventBusImpl.sendLocally(EventBusImpl.java:341)
        at io.vertx.core.eventbus.impl.EventBusImpl.sendOrPub(EventBusImpl.java:329)
        at io.vertx.core.eventbus.impl.clustered.ClusteredEventBus.sendOrPub(ClusteredEventBus.java:199)
        at io.vertx.core.eventbus.impl.OutboundDeliveryContext.execute(OutboundDeliveryContext.java:109)
        at io.vertx.core.eventbus.impl.DeliveryContextBase.next(DeliveryContextBase.java:72)
        at io.vertx.core.eventbus.impl.OutboundDeliveryContext.next(OutboundDeliveryContext.java:28)
        at io.vertx.core.eventbus.impl.EventBusImpl.sendOrPubInternal(EventBusImpl.java:422)
        at io.vertx.core.eventbus.impl.EventBusImpl.sendOrPubInternal(EventBusImpl.java:428)
        at io.vertx.core.eventbus.impl.EventBusImpl.send(EventBusImpl.java:119)

It seems like VertxEventBusMetrics sometimes tries to pass tags from pool meter instead of [address].
After some debugging I found incorrect tags come from TagsCache.

I've made a simple concurrency test for TagsCache. It submits lots of tasks to worker pool via Vertx#executeBlocking in unordered mode. Every task consists of 3 steps:

  1. Select random io.vertx.micrometer.Label and transform it to Tags via Labels#toTags. It is 'expected' tags.
  2. Retrieve tags from TagsCache using label from previous step. It is 'actual' tags.
  3. Compare string representations of expected and actual tags. If they are not equal then report inconsistency.

Test catches tens of such inconsistencies, i.e. cache can return tags which differ from requested. I think this is the root cause of original exception.

Do you have a reproducer?

https://github.com/AlexMisha/vertx-metrics-reproducer

Steps to reproduce

  1. git clone https://github.com/AlexMisha/vertx-metrics-reproducer
  2. mvn install && java -jar target/vertx-metrics-reproducer.jar

It will print found inconsistencies in the following form:

Expected: [tag(class=6)], actual: [tag(failure=9)]
Expected: [tag(pool_name=11)], actual: [tag(client_namespace=12)]
Expected: [tag(class=6)], actual: [tag(remote=1)]

Extra

JDK 17
micrometer-registry-prometheus 1.1.0

@AlexMisha AlexMisha added the bug Something isn't working label Jul 8, 2023
@tsegismont tsegismont self-assigned this Jul 17, 2023
@tsegismont tsegismont added this to the 4.4.5 milestone Jul 17, 2023
@tsegismont
Copy link
Contributor

Thanks for reporting @AlexMisha , I'll look into this

tsegismont added a commit to tsegismont/vertx-micrometer-metrics that referenced this issue Jul 19, 2023
See vert-x3#185

Use flyweight and cache only when running on standard context in event loop.
Otherwise, the cache might return wrong tags (e.g. when invoked inside executeBlocking on standard context).

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
tsegismont added a commit to tsegismont/vertx-micrometer-metrics that referenced this issue Jul 19, 2023
Fixes vert-x3#185

Use flyweight and cache only when running on standard context in event loop.
Otherwise, the cache might return wrong tags (e.g. when invoked inside executeBlocking on standard context).

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
@tsegismont tsegismont linked a pull request Jul 19, 2023 that will close this issue
tsegismont added a commit that referenced this issue Jul 19, 2023
See #185

Use flyweight and cache only when running on standard context in event loop.
Otherwise, the cache might return wrong tags (e.g. when invoked inside executeBlocking on standard context).

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
tsegismont added a commit that referenced this issue Jul 19, 2023
Fixes #185

Use flyweight and cache only when running on standard context in event loop.
Otherwise, the cache might return wrong tags (e.g. when invoked inside executeBlocking on standard context).

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
@tsegismont
Copy link
Contributor

Closed in #189

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

Successfully merging a pull request may close this issue.

2 participants