
Multi-gauge: create multiple metrics from a single function #807

Closed
ochezeau opened this issue Aug 26, 2018 · 23 comments

@ochezeau

Hi,
I tried to create a list of gauges from the result of a query and I didn't find any way to do that.

I have the following query, which returns the size of elements in the database (Postgres):

SELECT N.nspname || '.' || C.relname AS "relation",
CASE WHEN reltype = 0
THEN pg_total_relation_size(C.oid)
ELSE pg_total_relation_size(C.oid)
END AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
LEFT JOIN pg_tables T ON (T.tablename = C.relname)
LEFT JOIN pg_indexes I ON (I.indexname = C.relname)
WHERE nspname NOT IN ('pg_catalog','pg_toast','information_schema');

My idea was to create a list of gauges with id 'db.relation.size', tag 'relation', and value 'size'.

That's why I wanted to create a single function that performs the SQL call and allows me to create the list of gauges.

Maybe I'm wrong and that's not the right way of doing it, or maybe this is contrary to the philosophy of Micrometer.

Thanks in advance for your help.

@jkschneider jkschneider changed the title Create multiple metrics from a single function Multi-gauge: create multiple metrics from a single function Sep 6, 2018
@jkschneider jkschneider added this to the 1.1.0-rc.1 milestone Sep 6, 2018
@jkschneider
Contributor

This has been frequently requested, and depends on us first being able to expire metrics that go unused. It probably also requires a strong-reference gauge.

@kristwaa

kristwaa commented Oct 9, 2018

This has been frequently requested, and depends on us first being able to expire metrics that go unused.

I've tried a very simple version of this, where the observed value "expires" after being read (I made a DecayingInteger class extending AtomicInteger). If no new value was provided before the next read, it would return a default value (typically zero in my case). This was also influenced by not being able to remove a meter from the registry. The expiry/decay logic could be more complex if required.

My use-case was counts of something in the form of a SELECT ... GROUP BY something query, where you wouldn't get a count of zero if there were no instances of a given value of something.
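
Roughly, the idea looks like this (a minimal sketch; names are hypothetical, and where my actual class extended AtomicInteger this variant uses composition):

import java.util.concurrent.atomic.AtomicLong;

// Holds a value that "decays" back to a default once a reader has observed it,
// so a gauge scraped before the next update reports the default (e.g. zero).
public class DecayingValue {
    private final long defaultValue;
    private final AtomicLong current;

    public DecayingValue(long defaultValue) {
        this.defaultValue = defaultValue;
        this.current = new AtomicLong(defaultValue);
    }

    // Called by the query/update code whenever a fresh value is observed.
    public void set(long value) {
        current.set(value);
    }

    // Called by the gauge's value function; resets to the default after reading.
    public double readAndDecay() {
        return current.getAndSet(defaultValue);
    }
}

The gauge would then use something like Gauge.builder("jobs.open", holder, DecayingValue::readAndDecay).register(registry), keeping a strong reference to the holder so it isn't garbage collected.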

@jkschneider
Contributor

@kristwaa I think you discovered the implicit dependency of this issue, which is meter removal (which in turn leads to expiration). I'm putting the finishing touches on removal right now. Once done, the MultiGauge will, I believe, be a very similar interface to TimeGauge, i.e. one that doesn't need a specific implementation for each registry.

@ochezeau
Author

Thank you for the new feature and for the fantastic job done on Micrometer.

@jkschneider
Contributor

@checketts Any input on this implementation? Does it work well enough for you?

@checketts
Contributor

checketts commented Oct 11, 2018

The solution I needed in the past (previous employer) was either:

  1. A single gauge that could report multiple tag dimensions,

or

  2. A way to dynamically register multiple gauges that share a common data source.

So if I had some database query: SELECT count(*) from job group by status that returns:

10    | OPEN
3     | DELAYED
80    | RUNNING

Then I would want to run my expensive query once, and report the multiple dimensions (option 1) or have some hook that Micrometer could publish (option 2) so I would know a given scrape is occurring and I should go update all my multigauges with the results of a single query.

The implementation could help with option 2, but doesn't completely solve it.

@jkschneider
Contributor

jkschneider commented Oct 11, 2018

@checketts So with one more overload, the experience would be something like this:

// SELECT count(*) from job group by status WHERE job = 'dirty'
MultiGauge statuses = MultiGauge.builder("statuses")
        .tag("job", "dirty")
        .description("The number of widgets in various statuses")
        .baseUnit("widgets")
        .register(registry);

...

// run this periodically whenever you re-run your query
statuses.register(resultSet.stream().map(result -> Row.of(Tags.of("status", result.getAsString("status")), result.getAsInt("count"))));

The resultSet interaction is of course fictional.

This yields gauges like:

statuses{job: dirty, status: OPEN} = 10
statuses{job: dirty, status: DELAYED} = 3
statuses{job: dirty, status: RUNNING} = 80

MultiGauge isn't a meter itself, it is just a bag of gauges. It manages merging two sets of tags+values together, removing any gauges that aren't needed anymore and adding ones that are newly formed.
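
For completeness, a minimal sketch of the "run this periodically" part with a scheduled executor (queryStatusCounts() is a hypothetical stand-in for the real query, and this assumes the register(rows, overwrite) overload):

ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

// Re-run the query every 30 seconds and replace the contents of the bag.
scheduler.scheduleAtFixedRate(() -> {
    Map<String, Integer> counts = queryStatusCounts(); // hypothetical: status -> count
    statuses.register(
            counts.entrySet().stream()
                    .map(e -> Row.of(Tags.of("status", e.getKey()), e.getValue()))
                    .collect(Collectors.toList()),
            true); // overwrite = true so rows with the same tags pick up new values
}, 0, 30, TimeUnit.SECONDS);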

@checketts
Contributor

checketts commented Oct 11, 2018

Sounds good! So then we would have the user just use their own mechanism like a scheduled thread or something to run the query? Or is there some room for us to give the user some sort of hook?

@jkschneider
Contributor

jkschneider commented Oct 11, 2018

I toyed with the idea of counting down a latch each time a gauge in the bag was called and providing a callback mechanism in MultiGauge, but I started worrying about interleaving poll calls (e.g. from multiple Prometheus scrapes) and decided to forgo it for now. It could maybe work by maintaining a latch per thread, but yikes.

@ochezeau
Author

Adding the possibility to register a "pre-scrape function" for the Prometheus registry could help with that, no?

@jkschneider
Contributor

@ochezeau Unfortunately not, because multiple Prometheus instances may be scraping the host at once.

@checketts
Contributor

checketts commented Oct 12, 2018

The pre-scrape function could work if we based it on setting a thread local so the distinct scrapes appear as distinct thread locals.

Even an imperfect hook would, I think, be better than nothing, since sleeping a thread yourself and refreshing your gauges every x seconds would suffer from the same corner case.

@ochezeau
Author

@jkschneider Indeed, I did not think about the fact that we could have several Prometheus scrapes in parallel.

In this case, having a scheduler that calculates the metric at regular intervals seems to be the right solution.

The problem of parallel scrapes also raises the question of performance. It is especially important, when using function tracking, to have a very inexpensive answer.

So, wouldn't it be possible to set up a kind of parametric cache that always returns the same answer for a small period of time?

This would perhaps solve some performance and multi-threading issues, and it would also allow implementing pre-scrape functions.

This perhaps conflicts with the philosophy of Prometheus, which is to collect a real value at a given moment: it would require a cache duration lower than the scrape interval and thus bind Micrometer to the Prometheus configuration.

Do you think this could be a good idea?
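
For illustration, a minimal sketch of the kind of cache I have in mind, as a plain value supplier (hypothetical class, no change to Micrometer required):

import java.time.Duration;
import java.util.function.Supplier;

// Caches an expensive value for a fixed TTL, so parallel scrapes within that
// window all get the same, cheaply returned answer.
public class TtlCachingSupplier implements Supplier<Number> {
    private final Supplier<Number> delegate;
    private final long ttlNanos;
    private Number cached;
    private long lastRefresh;
    private boolean hasValue;

    public TtlCachingSupplier(Supplier<Number> delegate, Duration ttl) {
        this.delegate = delegate;
        this.ttlNanos = ttl.toNanos();
    }

    @Override
    public synchronized Number get() {
        long now = System.nanoTime();
        if (!hasValue || now - lastRefresh > ttlNanos) {
            cached = delegate.get();
            lastRefresh = now;
            hasValue = true;
        }
        return cached;
    }
}

It could then back a plain Gauge whose value function calls get(), with the TTL kept below the scrape interval.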

@jkschneider
Contributor

So, wouldn't it be possible to set up a kind of parametric cache that always returns the same answer for a small period of time?

For MultiGauge anyway, that is exactly what you are doing by providing a fixed answer for each gauge in the bag:

// run this periodically whenever you re-run your query
statuses.register(resultSet.stream().map(result -> Row.of(Tags.of("status", result.getAsString("status")), result.getAsInt("count"))));

All other OOTB instrumentation for Micrometer, even those that function track, should already be cheap.

Also, we may be prematurely optimizing for CPU cost by adding a more complicated caching mechanism when it is likely you are only receiving scrapes from a handful of Prometheus instances, even in an HA setup. If your service is regularly capable of taking in 10k RPS, what is the difference between 1 and 10 Prometheus scrapes in an interval? And remember, any caching mechanism simply trades CPU cost for memory.

@ochezeau
Author

@jkschneider OK, I understand. Thank you for the answer and for the time spent on this question.

@Docjones

Docjones commented Jul 4, 2019

Hello!

I found this great feature, but I cannot find any examples. I am (currently still) weak in terms of stream usage. I have a List<T> and want to register its elements to the MultiGauge, mapping a few fields of T.

Can anyone enlighten me a bit?

Thanks
/M

@checketts
Contributor

@Docjones Let's move this to a Stack Overflow question. Post a code sample there of what you have so far.

@Docjones

Docjones commented Jul 5, 2019

I am sorry, but I don't have much:

A List<T> with 100 elements read from a database table using Spring Boot/Hibernate.
T contains the fields

  • unique identifier
  • longitude
  • latitude
  • value

I want to turn this into multiple gauges like

object{id=<unique identifier>,longitude=<longitude>,latitude=<latitude>} <value>
object{id=<unique identifier>,longitude=<longitude>,latitude=<latitude>} <value>
object{id=<unique identifier>,longitude=<longitude>,latitude=<latitude>} <value>
...

The metrics shall be scraped by Prometheus and then visualized using Grafana/Worldmap.

I am building my multi-gauge using
MultiGauge l100 = MultiGauge.builder("metric-last100").register(registry);

and failing terribly at converting that List<T> into a stream to feed l100.register(...) as shown above.

Please forgive my dumbness.
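
For reference, a minimal sketch of the mapping I am after (getter names and the repository call are placeholders, and getValue() is assumed to return a Number):

List<T> last100 = repository.findLast100(); // hypothetical data access call

l100.register(
        last100.stream()
                .map(m -> MultiGauge.Row.of(
                        Tags.of("id", String.valueOf(m.getId()),
                                "longitude", String.valueOf(m.getLongitude()),
                                "latitude", String.valueOf(m.getLatitude())),
                        m.getValue()))
                .collect(Collectors.toList()),
        true); // overwrite = true so repeated runs refresh the values

register(...) takes an Iterable of rows, so the stream has to be collected into a list first; re-running this block on a schedule keeps the gauges in sync with the table.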

@checketts
Contributor

Let's answer this on https://stackoverflow.com/ so it doesn't cause noise in the issue tracker (especially on closed tickets).

@Docjones

Docjones commented Jul 6, 2019

ok

@Docjones

Meh... Not a single answer in 11 days. Great feature (at least I think so), but either I am too dumb or there is a lack of documentation...

@debashish-github

I have a similar issue. No examples or documentation... it's a pity.

@massamany

Hi,

Thanks for the work. But couldn't it have been possible to use a Supplier for the Rows, instead of registering the List directly? It would have been a lot more flexible.
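
A Supplier could be layered on top of the existing API with a thin wrapper; a minimal sketch (hypothetical class, not part of Micrometer):

import io.micrometer.core.instrument.MultiGauge;
import java.util.function.Supplier;

// Wraps a MultiGauge together with a Supplier of rows; call refresh() from a
// scheduler (or any other hook) to re-evaluate the supplier and replace the rows.
public class SuppliedMultiGauge {
    private final MultiGauge delegate;
    private final Supplier<Iterable<MultiGauge.Row<?>>> rows;

    public SuppliedMultiGauge(MultiGauge delegate, Supplier<Iterable<MultiGauge.Row<?>>> rows) {
        this.delegate = delegate;
        this.rows = rows;
    }

    public void refresh() {
        delegate.register(rows.get(), true); // overwrite previously registered rows
    }
}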
