Report host metadata for Kubernetes logs #12790

Merged: exekias merged 3 commits into elastic:master on Jul 23, 2019


@exekias exekias commented Jul 4, 2019

Filebeat was not reporting host metadata in the default Kubernetes manifest. This change gives Filebeat access to the host network (hostNetwork) so it can retrieve the local host's metadata, and adds the add_host_metadata processor to gather it.

After this change, host metadata should be present in logs, which will make it easier to correlate them with host metrics.
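
A minimal sketch of what such a manifest change could look like, assuming the usual layout of a Filebeat DaemonSet with its configuration in a ConfigMap. The two fragments below are illustrative and are not copied from this PR's diff; in particular, `dnsPolicy: ClusterFirstWithHostNet` is an assumption, included because pods on the host network otherwise stop using the cluster DNS.

```yaml
# filebeat.yml fragment (held in the ConfigMap):
# enrich every event with host.* fields (name, OS, architecture, ...)
processors:
  - add_host_metadata:
---
# DaemonSet pod template fragment:
# run in the node's network namespace so the processor sees the node's
# hostname and interfaces instead of the pod's
spec:
  template:
    spec:
      hostNetwork: true
      # assumption: keeps cluster-internal DNS names resolvable
      # for a pod that uses the host network
      dnsPolicy: ClusterFirstWithHostNet
```

With `hostNetwork: true` the pod shares the node's network namespace, which is the trade-off raised in the discussion: richer host data in exchange for broader access.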

@exekias exekias added enhancement review Filebeat Filebeat containers Related to containers use case Team:Integrations Label for the Integrations team labels Jul 4, 2019
@exekias exekias self-assigned this Jul 4, 2019
@exekias exekias added the discuss Issue needs further discussion. label Jul 4, 2019
@exekias
Contributor Author

exekias commented Jul 4, 2019

Wondering whether we should go this way or just report host.name from Kubernetes metadata (#10926). The added benefit of doing it there is that it should always work, regardless of the deployment model. On the other hand, it would report less info (hostname only, no other network/OS/arch data).

@odacremolbap
Contributor

Kubernetes node info is partly managed by the cloud controller manager.
There is no guarantee that you get the same name when you retrieve the hostname from the OS API as when you use the Kubernetes API.

IIRC, on AWS Kubernetes would return the FQDN; it is up to cloud providers to fill in the node information (including IPs) with whatever information they find relevant.

That said, I think the kubernetes API contains all data needed to correlate with other non kubernetes events if needed. Also, I haven't looked at how beats retrieve node info but Node Status contains some of the info you mention above.

Without having a strong opinion, as a user I would like to have the kubernetes API info to avoid multi-sources, but I wouldn't put the info into host.name since beats seem to allow add_host_metadata alongside any other metric that might be trying to re-inform host related fields.

Is this kind of overlapping something we have faced before?

@exekias
Contributor Author

exekias commented Jul 8, 2019

> Kubernetes node info is partly managed by the cloud controller manager.
> There is no guarantee that you get the same name when you retrieve the hostname from the OS API as when you use the Kubernetes API.
>
> IIRC, on AWS Kubernetes would return the FQDN; it is up to cloud providers to fill in the node information (including IPs) with whatever information they find relevant.

That's a fair point, if we use different codepaths between Metricbeat and Filebeat we could end up with different values -> problems.

> That said, I think the kubernetes API contains all data needed to correlate with other non kubernetes events if needed. Also, I haven't looked at how beats retrieve node info but Node Status contains some of the info you mention above.
>
> Without having a strong opinion, as a user I would like to have the kubernetes API info to avoid multi-sources, but I wouldn't put the info into host.name since beats seem to allow add_host_metadata alongside any other metric that might be trying to re-inform host related fields.

I see some benefits from extracting all host metadata from Kubernetes, that link looks promising. For instance, Filebeat should not need to have access to the host network just to retrieve the hostname.

> Is this kind of overlapping something we have faced before?

We have had situations like this. For instance, add_cloud_metadata won't overwrite existing info when a module already reports cloud metadata for the service it is monitoring (e.g. AWS).

I'm currently leaning towards the approach in this PR, for a few reasons:

  • If we have a single agent in the future, it will require the permissions anyway
  • This change makes sure we use the same code paths both for Metricbeat and Filebeat, so data must be the same
  • add_host_metadata should add uniform data across k8s & non k8s scenarios, and UI uses it heavily

This should not stop any other research we can do in the future to revisit how metadata is added.

WDYT?

@odacremolbap
Contributor

fully agree

@exekias exekias removed the discuss Issue needs further discussion. label Jul 8, 2019
@odacremolbap odacremolbap self-requested a review July 22, 2019 09:59
Contributor

@odacremolbap odacremolbap left a comment


SGTM if that's so to CI

@exekias exekias merged commit fe18c0c into elastic:master Jul 23, 2019
@exekias exekias added the v7.3.0 label Jul 23, 2019
exekias pushed a commit to exekias/beats that referenced this pull request Jul 23, 2019
* Report host metadata for Kubernetes logs

Filebeat was not reporting host metadata in the default Kubernetes manifest,
this change gives Filebeat access to the hostNetwork to retrieve
localhost metadata. `add_host_metadata` is added to gather it.

(cherry picked from commit fe18c0c)
exekias added a commit that referenced this pull request Jul 23, 2019
…13027)

* Report host metadata for Kubernetes logs (#12790)

* Report host metadata for Kubernetes logs

Filebeat was not reporting host metadata in the default Kubernetes manifest,
this change gives Filebeat access to the hostNetwork to retrieve
localhost metadata. `add_host_metadata` is added to gather it.

(cherry picked from commit fe18c0c)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
… logs (elastic#13027)

* Report host metadata for Kubernetes logs (elastic#12790)

* Report host metadata for Kubernetes logs

Filebeat was not reporting host metadata in the default Kubernetes manifest,
this change gives Filebeat access to the hostNetwork to retrieve
localhost metadata. `add_host_metadata` is added to gather it.

(cherry picked from commit dbbe30b)
Labels
containers Related to containers use case enhancement Filebeat Filebeat review Team:Integrations Label for the Integrations team v7.3.0
3 participants