Add host metadata processor #5968

Merged
merged 2 commits into from Mar 16, 2018

Conversation

@ruflin
Collaborator

commented Dec 29, 2017

This adds a processor that attaches information about the host machine to each event, very similar to what we have for the cloud metadata processor. The idea is to allow better filtering based on additional information about the host if needed. This information partially overlaps with what we have in beat.hostname but adds further fields. Other fields, such as the IP address(es), could also be added here.

Member

left a comment

+1 to this. Would it make sense to always send this info by default instead of adding a processor?


func (p addHostMetadata) Run(event *beat.Event) (*beat.Event, error) {

info, err := host.Info()

@exekias

exekias Jan 2, 2018

Member

Do you have any benchmarks for this? It would make sense to move it to the constructor and cache the result; we just need to document that a hostname change won't be reflected here.

@ruflin

ruflin Feb 26, 2018

Author Collaborator

I changed it to be instantiated in the constructor. I'm pretty sure a request for expiring the info will come up later :-)
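
The change discussed in this thread (collect the host data once in the constructor instead of on every event) could look roughly like the sketch below. It is only an illustration: `hostInfo`, `cachedProcessor`, and `newCachedProcessor` are hypothetical names, events are modelled as plain maps, and the data comes from the standard library rather than from the library used in this PR.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// hostInfo is a hypothetical stand-in for the data the processor collects.
type hostInfo struct {
	Hostname     string
	Architecture string
}

// cachedProcessor is a hypothetical processor that collects host data once.
type cachedProcessor struct {
	info hostInfo
}

// newCachedProcessor gathers the host data a single time, at construction.
// A later hostname change is therefore not reflected in the events.
func newCachedProcessor() (*cachedProcessor, error) {
	name, err := os.Hostname()
	if err != nil {
		return nil, err
	}
	return &cachedProcessor{
		info: hostInfo{Hostname: name, Architecture: runtime.GOARCH},
	}, nil
}

// Run enriches an event from the cached data without querying the system.
func (p *cachedProcessor) Run(event map[string]interface{}) map[string]interface{} {
	event["host"] = map[string]interface{}{
		"hostname":     p.info.Hostname,
		"architecture": p.info.Architecture,
	}
	return event
}

func main() {
	p, err := newCachedProcessor()
	if err != nil {
		fmt.Println("could not read host info:", err)
		return
	}
	fmt.Println(p.Run(map[string]interface{}{"message": "hello"}))
}
```

The trade-off is the one raised in the review: the per-event cost drops to a map assignment, but the data goes stale until the processor is recreated.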

@andrewkroh

Member

commented Jan 2, 2018

I was kind of working on something related over the break. It can (or will be able to) fetch some of the information that you want to add with this processor. I am working on this because I wanted to log (#5946) some of this information and also make this data available for xpack monitoring.

https://github.com/andrewkroh/go-sysinfo

@tsg

Collaborator

commented Jan 2, 2018

Big +1 to this. I think a processor makes sense to mirror add_cloud_metadata and co, but we can put it into the default configuration file.

@ruflin

Collaborator Author

commented Jan 3, 2018

@andrewkroh Very nice. Did not look into your code yet. Did you use a third party lib or implement it yourself?

@monicasarbu

Contributor

commented Jan 5, 2018

+1 on adding add_host_metadata, that will also add cloud metadata in the future.

@ruflin

Collaborator Author

commented Feb 26, 2018

This will close #3480

@ruflin ruflin force-pushed the ruflin:add_host_metadata branch from 42d3931 to 485145d Feb 26, 2018
"hostname": p.info.Hostname,
"id": p.info.UniqueID,
"os": common.MapStr{
"architecture": p.info.Architecture,

@ruflin

ruflin Feb 27, 2018

Author Collaborator

@andrewkroh Seeing that you have Architecture outside of info, I'm having second thoughts about putting it under os, as I agree it belongs more to the host than to the os itself.

Also thinking about putting os info on the top level.

package add_host_metadata

import (
sysinfo "github.com/elastic/go-sysinfo"

@ruflin

ruflin Feb 27, 2018

Author Collaborator

@andrewkroh Did you have to add a - to the lib name? :-D

@andrewkroh

andrewkroh Mar 5, 2018

Member

With Go you don't need to add a package alias, just import plain old "github.com/elastic/go-sysinfo" and use sysinfo.XYZ in the code.

@ruflin

ruflin Mar 6, 2018

Author Collaborator

It was mainly my IDE that complained :-)

@ruflin ruflin changed the title [Discuss] Add host metadata processor Add host metadata processor Feb 27, 2018
@ruflin ruflin force-pushed the ruflin:add_host_metadata branch from 82df182 to 9da8b6a Feb 27, 2018
@@ -0,0 +1,26 @@
package add_host_metadata

@houndci-bot

houndci-bot Feb 27, 2018

don't use an underscore in package name

@andrewkroh

Member

commented Mar 5, 2018

I'm glad you are trying out go-sysinfo. I'll take a closer look at this next week. I've been sitting on a bunch of updates to go-sysinfo that I haven't had time to clean up. I plan on getting the lib into a better place so that this can make 6.3.

@ruflin

Collaborator Author

commented Mar 6, 2018

@andrewkroh Good timing. I plan to put some more work into this next week. I also plan to start with a very basic version, which we can then expand on. The current fields are probably already enough for a first version.

@elastic elastic deleted a comment from houndci-bot Mar 7, 2018
@elastic elastic deleted a comment from houndci-bot Mar 7, 2018
Member

left a comment

This looks like it's just about ready. 👍

}

func (p addHostMetadata) String() string {
return "add_host_metadata="

@andrewkroh

andrewkroh Mar 7, 2018

Member

How about add_host_metadata=[] to indicate an empty config?

@ruflin

ruflin Mar 13, 2018

Author Collaborator

done

return nil, err
}
return &addHostMetadata{
info: h.Info(),

@andrewkroh

andrewkroh Mar 7, 2018

Member

I think we will want to refresh this data every couple of minutes (imagine that the user installs an OS update). And given that we are caching this data, we might as well cache the common.MapStr containing it.

I'd probably add a time.Time value to track when the value was collected. Then refresh the value from Run after it becomes stale.

@ruflin

ruflin Mar 13, 2018

Author Collaborator

I cached the mapstr and set the cache expiry to 5 minutes. We can still make it a config later if needed.
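
The suggestion in this thread (track when the data was collected and refresh it from Run once it is stale) could be sketched as follows. This is an assumption-laden illustration, not the code that was merged: `expiringCache`, `newExpiringCache`, and `get` are hypothetical names, the cached value is a plain map instead of common.MapStr, and the mutex is an addition that the thread does not discuss.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// expiringCache is a hypothetical helper: it stores one value and reloads it
// through load() once the value is older than expiry.
type expiringCache struct {
	mu        sync.Mutex
	expiry    time.Duration
	collected time.Time // when data was last loaded
	data      map[string]interface{}
	load      func() map[string]interface{}
	now       func() time.Time // injectable clock, handy in tests
}

func newExpiringCache(expiry time.Duration, load func() map[string]interface{}) *expiringCache {
	return &expiringCache{expiry: expiry, load: load, now: time.Now}
}

// get returns the cached map, refreshing it first if it is missing or stale.
// Callers should treat the returned map as read-only, since it is shared.
func (c *expiringCache) get() map[string]interface{} {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.data == nil || c.now().Sub(c.collected) >= c.expiry {
		c.data = c.load()
		c.collected = c.now()
	}
	return c.data
}

func main() {
	loads := 0
	cache := newExpiringCache(5*time.Minute, func() map[string]interface{} {
		loads++
		return map[string]interface{}{"hostname": "example-host"}
	})
	cache.get()
	cache.get() // still fresh, so this call is served from the cache
	fmt.Println("loads:", loads)
}
```

Keeping the expiry as a constructor argument leaves room for the config option mentioned above without changing the refresh logic.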

"os": common.MapStr{
"platform": p.info.OS.Platform,
"version": p.info.OS.Version,
"family": p.info.OS.Family,

@andrewkroh

andrewkroh Mar 7, 2018

Member

Did you want codename and build?

@ruflin

ruflin Mar 13, 2018

Author Collaborator

I added it.

Timestamp: time.Now(),
}
p, err := newHostMetadataProcessor(nil)
assert.NoError(t, err)

@andrewkroh

andrewkroh Mar 7, 2018

Member

This should expect to receive sysinfo/types.ErrNotImplemented for any OS other than windows, linux, and darwin.

@ruflin

ruflin Mar 13, 2018

Author Collaborator

done
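
The expectation from the review comment above (success on windows, linux, and darwin, a "not implemented" error elsewhere) could be expressed along these lines. Everything here is hypothetical: `errNotImplemented` merely stands in for the sentinel error the comment names, and `newProcessor` is a toy constructor that takes the OS name as a parameter so both branches can be exercised on any machine.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// errNotImplemented is a hypothetical stand-in for the sentinel error that
// the review comment refers to.
var errNotImplemented = errors.New("not implemented")

// newProcessor is a hypothetical constructor that only succeeds on the
// platforms named in the review comment.
func newProcessor(goos string) (string, error) {
	switch goos {
	case "windows", "linux", "darwin":
		return "add_host_metadata", nil
	default:
		return "", fmt.Errorf("host info on %s: %w", goos, errNotImplemented)
	}
}

// checkConstructor encodes the expectation: no error on supported platforms,
// errNotImplemented (possibly wrapped) on every other platform.
func checkConstructor(goos string) error {
	_, err := newProcessor(goos)
	supported := goos == "windows" || goos == "linux" || goos == "darwin"
	switch {
	case supported && err != nil:
		return fmt.Errorf("unexpected error on %s: %v", goos, err)
	case !supported && !errors.Is(err, errNotImplemented):
		return fmt.Errorf("expected errNotImplemented on %s, got %v", goos, err)
	}
	return nil
}

func main() {
	// In a real test the OS would come from runtime.GOOS, as it does here.
	fmt.Println(checkConstructor(runtime.GOOS))
}
```

Using errors.Is rather than a direct comparison keeps the check valid if the constructor wraps the sentinel with extra context.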

data := common.MapStr{
"host": common.MapStr{
"hostname": p.info.Hostname,
"id": p.info.UniqueID,

@andrewkroh

andrewkroh Mar 7, 2018

Member

info.UniqueID is optional (as signified by ,omitempty in the json tag). You should add handling for this. Older versions of Linux don't have /etc/machine-id.

@ruflin

ruflin Mar 13, 2018

Author Collaborator

done
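
Handling the optional ID, as requested above, might look like the following sketch. The `hostInfo` struct and `hostFields` function are hypothetical and use a plain map in place of common.MapStr; the only idea taken from the thread is that "id" should be left out when no unique ID is available.

```go
package main

import "fmt"

// hostInfo is a hypothetical, trimmed-down view of the collected host data.
type hostInfo struct {
	Hostname string
	UniqueID string // may be empty, e.g. when /etc/machine-id is absent
}

// hostFields builds the "host" map and includes "id" only when it is known,
// so events never carry an empty id field.
func hostFields(info hostInfo) map[string]interface{} {
	fields := map[string]interface{}{"hostname": info.Hostname}
	if info.UniqueID != "" {
		fields["id"] = info.UniqueID
	}
	return fields
}

func main() {
	fmt.Println(hostFields(hostInfo{Hostname: "web-1"}))
	fmt.Println(hostFields(hostInfo{Hostname: "web-2", UniqueID: "abc123"}))
}
```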

@ruflin ruflin force-pushed the ruflin:add_host_metadata branch from 9da8b6a to ec4e196 Mar 13, 2018
@@ -0,0 +1,80 @@
package add_host_metadata

@houndci-bot

houndci-bot Mar 13, 2018

don't use an underscore in package name

@andrewkroh

Member

commented Mar 13, 2018

I think we no longer need to keep github.com/elastic/procfs in vendor/. I'm pretty sure any changes got pushed upstream. Can you try removing the references to github.com/elastic/procfs and replacing them with github.com/prometheus/procfs, so we don't have a duplicate dep under two names?

@ruflin

Collaborator Author

commented Mar 14, 2018

github.com/elastic/procfs is still used in the system/socket and system/raid metricsets. I suggest making this change in a follow-up PR and not mixing it with this one.

@ruflin ruflin force-pushed the ruflin:add_host_metadata branch from b3a7396 to 7fbe357 Mar 14, 2018
@ruflin ruflin added review and removed discuss labels Mar 14, 2018
@ruflin

Collaborator Author

commented Mar 14, 2018

Here is a separate PR that changes the dependency. We can merge the other one first and rebase on top of it: #6548

@ruflin ruflin force-pushed the ruflin:add_host_metadata branch from 7fbe357 to 08123ea Mar 14, 2018
@ruflin ruflin force-pushed the ruflin:add_host_metadata branch from 08123ea to 107709c Mar 14, 2018
@ruflin ruflin referenced this pull request Mar 15, 2018
@ruflin

Collaborator Author

commented Mar 15, 2018

This PR currently fails because of a crosscompile issue with darwin 32 bit: elastic/go-sysinfo#3

@andrewkroh

Member

commented Mar 16, 2018

I believe elastic/go-sysinfo#4 should fix the cross-compile issue seen here. The sysinfo code was lacking cgo tags to prevent the darwin provider from being included when cross-compiling on Linux (without a C x-compiler).

@andrewkroh andrewkroh merged commit 1fbeb65 into elastic:master Mar 16, 2018
3 of 4 checks passed
continuous-integration/travis-ci/pr: The Travis CI build failed
CLA: Commit author has signed the CLA
beats-ci: Build finished.
hound: 1 violation found.
@TimWardOrigami

commented Mar 28, 2018

Is the host's FQDN included in the output (e.g. the output of "hostname -f")? If not, should/can it be?

@ruflin ruflin deleted the ruflin:add_host_metadata branch Apr 3, 2018
@ruflin

Collaborator Author

commented Apr 3, 2018

So far the code used to obtain the hostname is os.Hostname(), which is the same as under beat.hostname. Now that we have a processor for this data, we should add a config option for it to be the FQDN if it isn't already. I assume that to fetch the FQDN we will need to do some lookups. @TimWardOrigami Want to add a feature request for this?

@andrewkroh WDYT on where this belongs? Should we have this in go-sysinfo?
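
One possible shape for the lookup mentioned above is sketched here, purely as an assumption about how it could work rather than what Beats or go-sysinfo do. `fqdn` and `lookupFunc` are hypothetical names; the sketch asks the resolver for the canonical name of os.Hostname() via net.DefaultResolver.LookupCNAME and falls back to the plain hostname if the lookup fails. Whether a CNAME lookup yields the same result as "hostname -f" depends on the system's resolver configuration.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"strings"
	"time"
)

// lookupFunc resolves a host name to its canonical name.
type lookupFunc func(host string) (string, error)

// fqdn returns the canonical name for hostname, or hostname itself when the
// lookup fails or returns nothing. The lookup is injected so it can be faked.
func fqdn(hostname string, lookup lookupFunc) string {
	cname, err := lookup(hostname)
	if err != nil || cname == "" {
		return hostname
	}
	// DNS answers are usually rooted ("host.example.com."); drop the dot.
	return strings.TrimSuffix(cname, ".")
}

func main() {
	hostname, err := os.Hostname()
	if err != nil {
		fmt.Println("could not read hostname:", err)
		return
	}
	// Bound the lookup so a slow resolver cannot block for long.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	lookup := func(host string) (string, error) {
		return net.DefaultResolver.LookupCNAME(ctx, host)
	}
	fmt.Println(fqdn(hostname, lookup))
}
```

Because the lookup can be slow or fail, a result like this is a natural candidate for the same caching the processor already applies to the rest of the host data.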

@TimWardOrigami

commented Apr 3, 2018

@ruflin How do I "add a feature request" please?

@ruflin

Collaborator Author

commented Apr 3, 2018

@TimWardOrigami Just open a new GitHub issue in this repo and briefly explain what it is about and what it is needed for. I will then label it accordingly.

@TimWardOrigami

commented Apr 3, 2018
