-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Filesystem and Process Metricset to System Module #1081
Conversation
ruflin
commented
Mar 1, 2016
•
edited
Loading
edited
- Add Filesystem Metricset with fields.yml and example doc
- Add Process Metricset with fields.yml and example doc
- Enhance template generation to support nested documents
- Fix issue with type: string
- Raise exception in template generation script if invalid type is used
"steal": cpuStat.Stolen, | ||
"user_p": cpuStat.UserPercent, | ||
"system_p": cpuStat.SystemPercent, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we somehow create the event (MapStr) in the topbeat code? Otherwise we have to remember adding the key here as well every time we add something to topbeat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should add some abstraction to Topbeat to also profit directly from things like addCPUPercentage etc: https://github.com/elastic/beats/blob/master/topbeat/beater/topbeat.go#L262
LGTM |
@ruflin If we want to add per process statistics, are you planning to add a new module to Metricbeat or re-use/rename "system"? |
@monicasarbu I would definitively want to add it. I think topbeat and metricbeat should have feature parity. I would see all topbeat features under the module system (but we could also rename it). What I'm not sure yet how to call all the metricsets. For example belongs Disk I/O under filesystem or is it its own metricset? What about the per-process-stats? Should this be a metricset |
26cfdf1
to
cc9fb92
Compare
cc9fb92
to
aefe190
Compare
system.AddFileSystemUsedPercentage(fsStat) | ||
|
||
fsEvent := common.MapStr{ | ||
fsStat.DevName: system.GetFilesystemEvent(fsStat), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we want to have as a key here? Filesystem names can also have spaces etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsg Would be good to get your thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately organizing the data like this is making it impossible to do top like widgets in Kibana. For example top processes by memory usage, top FS by disk usage, etc. So it's a question of how uniform we want to have the data model versus enabling different viz in Kibana.
This would mean that Topbeat is not strictly a subset of Metricbeat, because the data is organized differently. This could be OK, but we have to take a conscious decision about it.
Overall I find this model in which the fields names are not predictable less flexible on the data consumption part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that both implementations should be "almost" identical. I think that topbeat should send the status for processes and file system stats in one event instead of lots of sub events. This still doesn't solve the above problem how to do it in the best way. We could potentially use arrays (https://www.elastic.co/guide/en/elasticsearch/guide/current/complex-core-fields.html#object-arrays) but I have to check how this would work for visualisations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid arrays are also not a good option when it comes to visualizing in Kibana, see for e.g. elastic/kibana#998. Besides the visualization aspect, having ephemeral fields like PID is quite space inefficient, because it tends to create a lot of sparse doc values.
I was thinking the metric name will be something like filesystem.size
and the device name would be a label, just like the host, for example. IMO putting the device name or PID into the metric name sends us back to the Graphite ways where this is the only way to put metadata in.
aefe190
to
9fb6387
Compare
9fb6387
to
ca2710d
Compare
This is currently blocked by finding the right data model for process and filesystem. |
ca2710d
to
0271de8
Compare
properties[field["name"]] = { | ||
"type": field.get("type") | ||
} | ||
if field["type"] == "keyword": | ||
properties[field["name"]]["ignore_above"] = \ | ||
defaults.get("ignore_above", 1024) | ||
|
||
elif field["type"] == "dict": | ||
elif field["type"] in ["dict", "list"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsg @monicasarbu Packetbeat had a type "list". I assumed this is identical to dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd say lets use only dict
for now, for simplicity. At some point we might have to separate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
Can you briefly elaborate on how dict and list could be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking we'd use dict
only for actual sub-dictionaries, and list
for "arrays of dictionaries", like we have in DNS at the moment. The requirements are likely to be different, but at the moment dict
without dict-type
adds nothing to the template, so that works on anything :-).
@tsg I completely rewrote / updated this PR. Have a look. |
"rtt": 20982, | ||
"system-process": { | ||
"processes": [ | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I added the additional "processes" array is that this will allow us to store additional data in the metricset if needed without changing the structure.
0271de8
to
a4d7b0f
Compare
properties[field.get("name")] = {"type": "nested", "properties": {}} | ||
properties[field.get("name")]["properties"] = prop | ||
|
||
dynamic_templates.extend(dynamic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if dynamic templates work on nested documents. We don't need it now, but we should know if that's a limitation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsg I would expect that we can use path_match for this: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html#path-match-unmatch But I didn't test it.
8d790c6
to
7f6ce34
Compare
@@ -240,7 +240,7 @@ def fill_field_properties(args, field, defaults, path): | |||
path = path + "." + field["name"] | |||
else: | |||
path = field["name"] | |||
prop, dynamic = fill_section_properties(field, defaults, path) | |||
prop, dynamic = fill_section_properties(args, field, defaults, path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsg Seems like this one only affected metricbeat
7f6ce34
to
50cffb0
Compare
In a recent meeting we decided to do the following with the data structure:
@andrewkroh is currently working on making it possible for a metricset to return multiple events. This PR will be updated as soon as these changes are in master. |
5ff8db7
to
b205dce
Compare
I updated this PR to send for each process and filesytem an event. This is now possible with the new metricset interfaces. In addition I added the fsstats metricset that contains the file system stats. @andrewkroh @tsg Please have a look. |
"metricset": "filesystem", | ||
"module": "system", | ||
"rtt": 434, | ||
"system-filesystem": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to have system in the name of the object? Why not just filesystem? I am thinking once we will have conditions in generic filtering, then the name of the field available in the condition is a bit too long (e.g. system-filesystem.device_name). Also, I think a mixture of "-" and "_" is not a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@monicasarbu That is the namespacing we require. This is always $module-$metricset
for all events.
Looks like you need a |
Other than the cross-compile error, LGTM. My comments were just minor things that I can fix later if you want. |
event := common.MapStr{ | ||
"@timestamp": common.Time(time.Now()), | ||
"type": "process", | ||
"count": 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
count: 1 is not longer exported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one. I would have introduced count accidentially again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
@monicasarbu @andrewkroh Cleaned up and pushed again. |
@ruflin The system/process package also needs a |
* Add Filesystem Metricset with fields.yml and example doc * Add Fsstats Metricset with file system stats * Add Process Metricset with fields.yml and example doc * Enhance template generation to support nested documents * Fix issue with type: string * Raise exception in template generation script if invalid type is used
@andrewkroh Fixed |
`system-filesystem` contains local filesystem stats | ||
fields: | ||
- name: avail | ||
type: integer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of these are marked as integer
but are marked as long
s in Topbeat. Looks like they should long
because their data types are either int64 or uint64.