Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read custom histogram latency buckets from env #1134

Merged
merged 3 commits into from
Aug 24, 2021
Merged

read custom histogram latency buckets from env #1134

merged 3 commits into from
Aug 24, 2021

Conversation

elite-koder
Copy link
Contributor

the default latency buckets (.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF) prometheus client provides but it is not ideal for all scenarios.
prometheus python client has feature to provide custom latency buckets when creating Histogram objects. but flower does not have any ft to provide custom latency buckets from outside. so I have added one.

@mher
Copy link
Owner

mher commented Aug 6, 2021

cc @Tomasz-Kluczkowski

@Tomasz-Kluczkowski
Copy link
Contributor

Hi,

I should have time to review this tomorrow.

I should think we want also a CLI option?

Copy link
Contributor

@Tomasz-Kluczkowski Tomasz-Kluczkowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

Left some comments about the current implementation.
This is good feature to have.

@mher If I missed anything or you disagree with my review - please add your comments :).

flower/events.py Outdated
# float('inf')]"
buckets = Histogram.DEFAULT_BUCKETS
if os.getenv("PROMETHEUS_LATENCY_BUCKETS"):
buckets = eval(os.environ["PROMETHEUS_LATENCY_BUCKETS"])
Copy link
Contributor

@Tomasz-Kluczkowski Tomasz-Kluczkowski Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid using eval for this (and in general), in fact since you actually do import os here a perfect storm is created for a malicious actor to find a way to supply the environment variable like this:

export PROMETHEUS_LATENCY_BUCKETS="os.system('rm -rf /')"

Your root and everything in it gets force-deleted :).

Therefore let's use what we already have - the options

In fact they are prepared for what you want to do!

See:
flower/command.py -> apply_env_options for how the environment variable will be handled (lists are supported) Here we support supplying comma separated list of values of a given type specified in the option automatically.

and
flower/command.py -> apply_options for how command line arguments will be handled (multiple are supported), so via command line you have to do:
--option=value1 --option=value2 and this will be converted to option=[value1, value2] by the library processing the command line arguments for Flower (called click).

Also see:
flower/options.py

define("basic_auth", type=str, default=None, multiple=True,
       help="enable http basic authentication")

As an example of an option which has multiple values. In your case, you want to set type=float and keep multiple=True.

This should be enough to get this going auto-magically.

So I would add a new option in options.py

define("task_runtime_metric_buckets", type=float, default=Histogram.DEFAULT_BUCKETS, multiple=True,
       help="please write some help text here ....")

This then gives you a sensible default that you want to use anyway...

flower/events.py Outdated
# buckets export PROMETHEUS_LATENCY_BUCKETS="[.05, .1, .2, .5, 1.0, 2.0, 5.0, 10.0, 60.0, 120.0, 300.0, 600.0,
# float('inf')]"
buckets = Histogram.DEFAULT_BUCKETS
if os.getenv("PROMETHEUS_LATENCY_BUCKETS"):
Copy link
Contributor

@Tomasz-Kluczkowski Tomasz-Kluczkowski Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is specifically for run time metric I would call it TASK_RUNTIME_METRIC_BUCKETS .
But since we should be using options for this it will be automatically named for you:
FLOWER_<option name you use in upper case> (see: docs/config.rst for more info)

Anyway, if you use the options you just have to supply it to buckets at line 39 as in the comment below - this code can be removed, all is done for you.

flower/events.py Outdated
'flower_task_runtime_seconds',
"Task runtime",
['worker', 'task'],
buckets=buckets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you just want to access the global options module and say:

buckets=options.task_runtime_metric_buckets

The option will be filled in with a default if no env var or command line args are passed in...

@Tomasz-Kluczkowski
Copy link
Contributor

Tomasz-Kluczkowski commented Aug 21, 2021

Additional things missing:

  • tests - I would like to see a simple test showing buckets supplied are actually replacing the defaults used by flower_task_runtime_seconds metric and a test showing that the defaults from Histogram are used if no env var/cli args are supplied.
  • documentation - if we add this, it should be mentioned in man.rst, config.rst at least, (just see where options are mentioned please)

@elite-koder
Copy link
Contributor Author

@Tomasz-Kluczkowski Thanks for guidance, I have addressed all comments

@mher mher merged commit 24534be into mher:master Aug 24, 2021
@elite-koder elite-koder deleted the user-defined-prometheus-latency-buckets branch August 24, 2021 11:31
'flower_task_runtime_seconds',
"Task runtime",
['worker', 'task'],
buckets=options.task_runtime_metric_buckets
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tomasz-Kluczkowski env var FLOWER_TASK_RUNTIME_METRIC_BUCKETS not populating in options. something strange happening. not able to figure out. help me
FYI I'm using development branch. the updated code is there in machine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tomasz-Kluczkowski can you pls look into this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

I should have time later next week.
A bit busy with work/life currently but yes it's on my list, sorry for not showing a sign of life sooner...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

I just had an initial look at this and know why we have this problem.

Unfortunately my broadband provider has an incident going on so I am just typing this on a phone.

We are using PrometheusMetrics class in EventState. The class has only class variables, meaning they are set in stone on initial import, waaay before the options object is modified by in command.py when the command line args are parsed.

We will have to refactor how this class is built so that we can pass config into it or delay setting instruments configurations.

Easiest first point of attack is to give it an init method and set all the instruments in that method instead of making them class variables. Then they should be initialized correctly from that options. We could also just pass this runtime buckets setting into the init of PrometheusMetrics and decouple it from options...

@mher / @Alienr ,let me know if you have other ideas. When I get some internet back I will try to push some proposed changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants