-
Notifications
You must be signed in to change notification settings - Fork 16.3k
StatsD metrics have tags + generate reference docs #12158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| # TODO: Remove for Airflow 2.0 | ||
| filename = file_stat.file.split('/')[-1].replace('.py', '') | ||
| Stats.timing(f'dag.loading-duration.{filename}', file_stat.duration) | ||
| Stats.timing('dag.loading-duration.{filename}', file_stat.duration, labels={"filename": filename}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add (on top of existing ones) new metrics that have just "short" name and the variables are only passed as labels?
For example in this case:
Stats.timing('dag.loading-duration', file_stat.duration, labels={"filename": filename})
I think those will be much easier to aggregate in some tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe we should leave the old stats as they are and only add labels in the new ones ? WDYT? We could even add some prefix/indication that those stats are the "labeled" ones and distinguish them in the stats documentation ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only problem is that not every client supports tags, and some clients support tags in a different way than the official Datadog. For this reason, I preferred not to add more keys, but leave these decisions to the statsd client.
If the client supports the tags, it will keep metrics in the best form for itself.
We currently have two clients:
- Classic statsd client (without tags support), which in this case will get a metric -
dag.loading-duration.my-awesome-file.pywithout tags - Datadog statsd (with tags support), which in this in this case will get a metric -
dag.loading-duration.filenamewith tagfilename:my-awesome-file.py
So you can aggregate this data if you need it.
This key name is not ideal when the client supports tags, but we don't need to generate the metrics key twice. On the other hand, writing the available parameters in the name of the metrics is not a stupid idea. This can help us find the tags more easily.
It is also worth adding that Stackdriver stores labels differently. Instead of tags, it uses a dictionary that has a normal key and value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added documentation that describes this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi guys.
Usually, best practice is to have generic metric names and use tags for any variable parameters, that it will make it possible to do all possible breakdowns, slice and dice etc. in the monitoring tool (if it supports it of course).
With that said, even if it is a weird name I actually do not see problem to have metric name like dag.loading-duration.filename with tags=['filename={filename}'], if this is going to simplify implementation/support/usage.
However, there is one caveat. What if some time later we are going to extend metric with additional tags, e.g. for metric operator_failures_{task_type} we would like to add one more tag like dag_id. Maybe not a very good example, but I am pretty sure for some metrics we could put more information that will be helpful to build breakdown by.
Then we actually will have to change name of the metric, that will be a breaking change, cause all existing setup alerts/dashboards will stop to work (or we will have to support two metrics from now operator_failures_{task_type} + operator_failures_{task_type}_{dag_id}, which is not also ideal).
So, as for me, actually omitting placeholders in names like operator_failures/dag.loading-duration/... and using tags should be the best for clients which support tags and want to leverage them. Unless we are sure that we will not add more tags to existing metrics, which can be not the true.
Then it seems like having maybe two type of metrics, one without tags, other with tags without placeholders in the names might be a good solution.
But maybe this is not an issue. What do you think guys about the fact that metrics can be extended with additional tags in the future.
… metrics have tags
|
@Acehaidrey do you have any thoughts on this change? |
|
some confflicts - but it's good for me. |
| @_format_safe_stats_logger_args | ||
| def incr(self, stat, count=1, rate=1, labels=None): | ||
| """Increment stat""" | ||
| del labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This prevents an error about an unused variable and at the same time we do not have a poorly descriptive variable _.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually needed? Because of the _format_safe_stats_logger_args decorator, this is never called with any value:
def func_wrap(_self, stat, *args, labels=None, **kwargs):
if labels:
# Remove {} from stat key e.g. {class_name} => class_name
stat = stat.format(**labels)
return validate_stat(func)(_self, stat, *args, **kwargs)
labels is an explicit parameter, so won't be in the kwargs passed to this function (func in that snippet)
| @functools.wraps(func) | ||
| def func_wrap(_self, stat, *args, labels=None, **kwargs): | ||
| if labels: | ||
| # Remove {} from stat key e.g. {class_name} => class_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| # Remove {} from stat key e.g. {class_name} => class_name | |
| # Remove {} from stat key e.g. {class_name} => SchedulerJob |
| description: str | ||
|
|
||
|
|
||
| METRICS_LIST: List[Metric] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the purpose of this List, nor the Metric class -- it appears this is never used for emitting metrics at runtime, so I'm not sure why we need this?
What was your thinking here please?
|
I honestly think this is great. I'm not sure how to capture this as we use opentsdb and have an internal client for it, so I had to make changes to all of these stat calls to match the same signature and use tags instead of the |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
| Stats.incr(f'operator_successes_{self.task.task_type}', 1, 1) | ||
| task_type = self.task.task_type | ||
| Stats.incr('operator_successes_{task_type}', 1, 1, labels={"task_type": task_type}) | ||
| Stats.incr('ti_successes') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add the dag/task tags to ti_successes and ti_failures as well? I think this would useful for aggregation so we can ignore or watch specific dags for failures
|
Are there plans to still merge this? |
While I was working on generating automatic reference documentation, I entered the labels parameter to be able to generate text lazily. However, when I took a closer look at it, I realized that it can also be used to generate valid tags for statsd metrics.
Close: #11463
The name of the metric is not perfect as it contains placeholders, but it is good enough to be found in the documentation. We don't have two different sets of keys.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.