Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacement relabel to multiple labels #894

Closed
jimmidyson opened this Issue Jul 15, 2015 · 10 comments

Comments

Projects
None yet
3 participants
@jimmidyson
Copy link
Member

jimmidyson commented Jul 15, 2015

My particular use case is splitting a Kubernetes Docker container name into multiple labels. This works but requires 3 relabels with the same matching regexp but different capture groups. Would be good to do this with a single relabel to multiple labels using a capture group for each label.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 17, 2015

I've had this situation before. It's certainly a bit verbose. Relabeling as a concept is not trivial to begin with so I would like to avoid adding more complexity to it. Such a set definitions might be verbose but usually occurs only once or twice per setup.

The goal of relabeling is to provide a very generic approach that fits many different use cases. Naturally, it thus cannot provide minimal explicit solutions for all of them.

Would you have a proposal with a change to the configuration allowing this that is compatible to the current configuration format?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 17, 2015

We generally assume that our users have configuration management that can do templating and the like, so we've always chosen repetition in configs if it keeps them simpler to understand and use.

This case has a bit more to it than config though, as metric ingestion is one of the places you can relabel and it's performance sensitive. Do we know how expensive the extra regex matches are?

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 17, 2015

Is this a config you have in relabel_configs or metric_relabel_configs, @jimmidyson?

@brian-brazil we have not done any profiling for metric_relabel_configs – so I can only give a blind guess: the regexes are compiled on config loading. How much time relabeling per sample takes is thus highly dependant on the used regex and the number of rules. The former makes a general statement really hard.
We have some wiggle room regarding memory allocations in that phase.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 17, 2015

metric_relabel_configs - we only have the info we want in the metric labels so need to relabel after scraping, before ingestion.

Can we assume that applying the same regex is O(n)? If so using multiple capture groups & allowing multiple target labels should increase performance?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 17, 2015

Can we assume that applying the same regex is O(n)? If so using multiple capture groups & allowing multiple target labels should increase performance?

That'd be my assumption. The question is is the time to run the regex significant compared to everything else going on?

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 17, 2015

True - but it's a pretty trivial optimisation with no downsides that I can see, not much complexity added, etc - so no danger in implementing I wouldn't say.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 17, 2015

It'll make the configuration format more complicated, as I indicated above if repetition of config does the same job then we'll go with the more verbose config.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jul 17, 2015

@jimmidyson so what are we talking about now? Doing an internal optimisation or simplifying the configuration?

For the latter, the point from previously still stands – relabeling attempts to be generic. Addition and changes only with a good backwards-compatible suggestion and reasoning that the special case is large enough compared to other special cases that it deserves special treatment.

For the former, first of all I don't think it is trivial implementation-wise. Also, small complications accumulate and after some time the code becomes hard to maintain.
Are there actual performance issues with the verbose rules you have to declare at the moment. If so we can absolutely think about where and how we can optimize.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 17, 2015

We're not running at a scale to really tell whether it causes any issues, my guess is it won't but we'll find out :) I was more interested in reducing duplication in my config but understand your arguments against this. If this does cause any performance issues I'll reopen. Thanks!

@jimmidyson jimmidyson closed this Jul 17, 2015

@lock

This comment has been minimized.

Copy link

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.