[docker] Add host tag for docker swarm node role. #3735
Great proposal.
I think the tags at the host level might be too redundant with the container meta @mfpierre is working on. Being able to tag the metrics with the node role is the goal here.
As you implemented the method to extract those, I think we can get the best of both worlds:
Use the container meta for the tag at the host level (i.e. infra list), and use the tag extractor in the check to add the tags to the docker metrics.
We would want people to be able to use this new
I'm still not clear on why it might make more sense as host metadata vs. being a host tag? If a tag is included on all metrics anyway, isn't that the same thing as being a host tag?
The core difference between host tags and host metadata is that host tags are part of the context of metrics coming from said host, whereas host metadata is not. So a change in host tags invalidates all metrics coming from the host and requires creating new time series for these metrics. This is why we avoid using host tags for things that can change frequently and/or all at once (or in a short time frame) across many hosts (like the Docker version, for example). We also allow filtering by host tags in queries, but not by host metadata. In this case it's fine to use tags, as hosts don't flap between worker/manager.
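To illustrate the point about contexts, here is a toy sketch, not the actual backend logic. It assumes a time series is identified by the metric name plus the full set of tags, host tags included. The function name `metric_context` and the tag keys are hypothetical and used only for illustration.

```python
from typing import FrozenSet, Iterable, Tuple

# Toy model (assumption): a context is the metric name plus every tag
# attached to it, whether the tag comes from the host or from the check.
Context = Tuple[str, FrozenSet[str]]


def metric_context(metric: str,
                   host_tags: Iterable[str],
                   metric_tags: Iterable[str]) -> Context:
    """Return a hashable key identifying one time series in this toy model."""
    return (metric, frozenset(host_tags) | frozenset(metric_tags))


# Hypothetical tag names, for illustration only.
before = metric_context("docker.cpu.user", ["role:worker"], ["image:nginx"])
after = metric_context("docker.cpu.user", ["role:manager"], ["image:nginx"])

# Changing a host tag changes the key, so a new time series starts.
print(before != after)
```

In this model host metadata never enters the key, so changing it leaves existing time series untouched.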
great first contrib 🎉
LGTM
I did some testing to determine what would happen with the tags if a worker node was promoted to a manager and/or a manager was demoted to a worker. Since the host metadata (and host tags) are collected every 4 hours by default, it can take a long time for the tags to get updated for metrics (unless the agent is restarted). If it makes sense to have the
Please note that tags from the Agent5 orchestrator package / Agent6 Tagger are only applied to docker / kubernetes metrics and AutoDiscovery checks (but are constant once the check template is resolved).
This is a good point. However, host tags allow us to factor this info into a single payload instead of attaching it to each metric (it's common for hosts to have tens of tags × thousands of metrics). And promoting a worker to manager shouldn't happen often in prod. Immutability dictates that one should rather create a new node built and configured for that purpose. I don't think it's too bad to impose the 4h delay in this context. edit: + what Xavier said
@xvello Could you elaborate on what you mean by
I am seeing metrics for
@devonboyer indeed, a host tag will be better suited than a container tag for this use case. Let's go with that code.
Note: Please remember to review the Datadog Contribution Guidelines if you have not yet done so.
What does this PR do?
This adds a host tag to docker swarm nodes containing the node role (manager or worker).
Motivation
Testing Guidelines
Additional Notes
To determine whether the node is a manager or a worker, this checks the `ControlAvailable` flag in the response from `docker info`, the same way the docker cli does.
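A minimal sketch of that check, not the code from this PR. It assumes the `docker info` response has already been parsed into a dict, with the swarm fields under a `Swarm` key as in the Docker Engine API. The function name `swarm_node_role_tags`, the tag key `docker_swarm_node_role`, and the `LocalNodeState` guard are illustrative assumptions.

```python
from typing import Any, Dict, List


def swarm_node_role_tags(docker_info: Dict[str, Any]) -> List[str]:
    """Build a host tag for the swarm node role from a parsed `docker info` response."""
    swarm = docker_info.get("Swarm") or {}

    # Assumption: only tag nodes that are active members of a swarm.
    if swarm.get("LocalNodeState") != "active":
        return []

    # ControlAvailable is true on manager nodes and false on workers.
    role = "manager" if swarm.get("ControlAvailable") else "worker"

    # Hypothetical tag key, chosen for illustration.
    return ["docker_swarm_node_role:%s" % role]
```

For example, `swarm_node_role_tags({"Swarm": {"LocalNodeState": "active", "ControlAvailable": True}})` yields a single manager tag, while a host that is not part of a swarm gets no tag.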