Add Filesystem and Process Metricset to System Module #1081

Merged: 1 commit into elastic:master from the metricbeat-topbeat branch, May 3, 2016

ruflin
Member

@ruflin ruflin commented Mar 1, 2016

  • Add Filesystem Metricset with fields.yml and example doc
  • Add Process Metricset with fields.yml and example doc
  • Enhance template generation to support nested documents
  • Fix issue with type: string
  • Raise exception in template generation script if invalid type is used

@ruflin ruflin added the discuss label Mar 1, 2016
"steal": cpuStat.Stolen,
"user_p": cpuStat.UserPercent,
"system_p": cpuStat.SystemPercent,
}
Contributor

Could we somehow create the event (MapStr) in the topbeat code? Otherwise we have to remember to add the key here as well every time we add something to topbeat.

Member Author

Yes, we should add some abstraction to Topbeat to also profit directly from things like addCPUPercentage etc: https://github.com/elastic/beats/blob/master/topbeat/beater/topbeat.go#L262

@monicasarbu
Contributor

LGTM

@monicasarbu
Contributor

@ruflin If we want to add per process statistics, are you planning to add a new module to Metricbeat or re-use/rename "system"?
I imagined that in the future Topbeat & Topbeat module in Metricbeat are the same thing, share the same code and export the same data.

@ruflin
Member Author

ruflin commented Mar 1, 2016

@monicasarbu I would definitely want to add it. I think topbeat and metricbeat should have feature parity. I would see all topbeat features under the module system (but we could also rename it). What I'm not sure about yet is what to call all the metricsets. For example, does Disk I/O belong under filesystem or is it its own metricset? What about the per-process stats? Should this be a metricset processes, or is it part of cpu?

@ruflin ruflin changed the title from "[POC] Add Topbeat to Metricbeat" to "Add Topbeat to Metricbeat" Mar 22, 2016
system.AddFileSystemUsedPercentage(fsStat)

fsEvent := common.MapStr{
fsStat.DevName: system.GetFilesystemEvent(fsStat),
Member Author

What do we want to have as a key here? Filesystem names can also have spaces etc.

Member Author

@tsg Would be good to get your thoughts on this.

Contributor

Unfortunately, organizing the data like this makes it impossible to do top-like widgets in Kibana, for example top processes by memory usage, top FS by disk usage, etc. So it's a question of how uniform we want the data model to be versus enabling different visualizations in Kibana.

This would mean that Topbeat is not strictly a subset of Metricbeat, because the data is organized differently. This could be OK, but we have to make a conscious decision about it.

Overall, I find this model, in which the field names are not predictable, less flexible on the data consumption side.

Member Author

I agree that both implementations should be "almost" identical. I think that topbeat should send the status for processes and file system stats in one event instead of lots of sub events. This still doesn't solve the above problem of how best to structure the data. We could potentially use arrays (https://www.elastic.co/guide/en/elasticsearch/guide/current/complex-core-fields.html#object-arrays) but I have to check how this would work for visualisations.

Contributor

I'm afraid arrays are also not a good option when it comes to visualizing in Kibana; see e.g. elastic/kibana#998. Besides the visualization aspect, having ephemeral fields like PID is quite space-inefficient, because it tends to create a lot of sparse doc values.

I was thinking the metric name will be something like filesystem.size and the device name would be a label, just like the host, for example. IMO putting the device name or PID into the metric name sends us back to the Graphite ways where this is the only way to put metadata in.
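
To make the trade-off concrete, here is a minimal sketch of the two document shapes under discussion. The field names (`device_name`, `size`, `used_p`) and the sample values are illustrative assumptions, not the schema this PR ends up with. With the device name stored as a label, every document has the same fields, so a "top N" ranking is a plain sort on one field.

```python
# Shape A: device name used as a key. Field names vary per host,
# so there is no single field to sort or aggregate on.
keyed_event = {
    "filesystem": {
        "/dev/sda1": {"size": 500, "used_p": 0.42},
        "/dev/sdb1": {"size": 2000, "used_p": 0.91},
    }
}

# Shape B: one document per filesystem, device name stored as a label.
labeled_events = [
    {"filesystem": {"device_name": "/dev/sda1", "size": 500, "used_p": 0.42}},
    {"filesystem": {"device_name": "/dev/sdb1", "size": 2000, "used_p": 0.91}},
]


def top_by_usage(events, n):
    """Return the device names of the n fullest filesystems."""
    ranked = sorted(events, key=lambda e: e["filesystem"]["used_p"], reverse=True)
    return [e["filesystem"]["device_name"] for e in ranked[:n]]


def flatten_keyed(event):
    """Convert shape A into shape B so it can be ranked the same way."""
    return [
        {"filesystem": dict(stats, device_name=name)}
        for name, stats in event["filesystem"].items()
    ]
```

Shape A needs the extra `flatten_keyed` step before it can be ranked at all, which is roughly the limitation a Kibana visualization would run into.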

@ruflin
Member Author

ruflin commented Apr 25, 2016

This is currently blocked by finding the right data model for process and filesystem.

@ruflin ruflin added the Metricbeat label and removed the discuss label Apr 25, 2016
@ruflin ruflin changed the title from "Add Topbeat to Metricbeat" to "Add Topbeat to MetricbeatAdd Filesystem and Process Metricset to System Module" Apr 26, 2016
properties[field["name"]] = {
"type": field.get("type")
}
if field["type"] == "keyword":
properties[field["name"]]["ignore_above"] = \
defaults.get("ignore_above", 1024)

-elif field["type"] == "dict":
+elif field["type"] in ["dict", "list"]:
Member Author

@tsg @monicasarbu Packetbeat had a type "list". I assumed this is identical to dict?

Contributor

Yeah, I'd say let's use only dict for now, for simplicity. At some point we might have to separate them.

Member Author

Ok.

Can you briefly elaborate on how dict and list could be different?

Contributor

I was thinking we'd use dict only for actual sub-dictionaries, and list for "arrays of dictionaries", like we have in DNS at the moment. The requirements are likely to be different, but at the moment dict without dict-type adds nothing to the template, so that works on anything :-).
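
As a rough illustration of that distinction, the sketch below maps a field definition to a mapping entry: scalar types pass through, `dict` becomes an object with sub-properties, and `list` (an array of dictionaries) becomes a `nested` type. This is not the actual template generation script. `field_to_mapping`, `SCALAR_TYPES`, and the choice to map `list` to `nested` are assumptions made for illustration.

```python
SCALAR_TYPES = {"keyword", "text", "integer", "long", "float", "date", "boolean"}


def field_to_mapping(field, ignore_above=1024):
    """Translate one fields.yml-style entry into a mapping fragment."""
    ftype = field.get("type")
    if ftype in SCALAR_TYPES:
        mapping = {"type": ftype}
        if ftype == "keyword":
            mapping["ignore_above"] = ignore_above
        return mapping
    if ftype in ("dict", "list"):
        props = {
            sub["name"]: field_to_mapping(sub, ignore_above)
            for sub in field.get("fields", [])
        }
        mapping = {"properties": props}
        if ftype == "list":
            # Assumption: arrays of dictionaries are mapped as nested objects.
            mapping["type"] = "nested"
        return mapping
    # Fail loudly on unknown types instead of emitting a broken template.
    raise ValueError("invalid type %r for field %r" % (ftype, field.get("name")))
```

Raising on an unknown type mirrors the PR's stated goal of failing in the template generation script when an invalid type is used.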

@ruflin
Member Author

ruflin commented Apr 26, 2016

@tsg I completely rewrote / updated this PR. Have a look.

"rtt": 20982,
"system-process": {
"processes": [
{
Member Author

The reason I added the additional "processes" array is that this will allow us to store additional data in the metricset if needed without changing the structure.

@ruflin ruflin changed the title from "Add Topbeat to MetricbeatAdd Filesystem and Process Metricset to System Module" to "Add Filesystem and Process Metricset to System Module" Apr 26, 2016
properties[field.get("name")] = {"type": "nested", "properties": {}}
properties[field.get("name")]["properties"] = prop

dynamic_templates.extend(dynamic)
Contributor

I wonder if dynamic templates work on nested documents. We don't need it now, but we should know if that's a limitation.

Member Author

@tsg I would expect that we can use path_match for this: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html#path-match-unmatch But I didn't test it.
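
For reference, a dynamic template using `path_match` might look like the fragment below, written as the Python structure a template script could emit. Whether it applies inside `nested` objects is exactly the untested question above. The template name `process_strings`, the path pattern, and the target mapping are assumptions.

```python
import json

# Hypothetical dynamic template: map every string field found under
# system-process.processes to a keyword.
dynamic_templates = [
    {
        "process_strings": {
            "path_match": "system-process.processes.*",
            "match_mapping_type": "string",
            "mapping": {"type": "keyword", "ignore_above": 1024},
        }
    }
]

print(json.dumps({"dynamic_templates": dynamic_templates}, indent=2))
```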

@@ -240,7 +240,7 @@ def fill_field_properties(args, field, defaults, path):
 path = path + "." + field["name"]
 else:
 path = field["name"]
-prop, dynamic = fill_section_properties(field, defaults, path)
+prop, dynamic = fill_section_properties(args, field, defaults, path)
Member Author

@tsg Seems like this one only affected metricbeat

@ruflin
Member Author

ruflin commented May 2, 2016

In a recent meeting we decided to do the following with the data structure:

  • Not use nested documents and instead send a separate document for each process / filesystem (as this is more or less how ES stores nested docs anyway). Correlate the docs which belong together (for example all processes) with an identifier
  • Have a metricset processes which sends all info for all processes and have a metricset process-info (or similar) which provides overview information about processes.
  • Same for Filesystem
  • It can be that the information of the two metricsets partially overlaps. Shared functionality should go into the module.
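
The first decision in the list above can be sketched as follows: one flat event per process, with a shared identifier tying together all events from the same collection run. The field names (`correlation_id`, `pid`, `mem_rss`) and the use of a random UUID as the identifier are assumptions; the discussion does not specify them.

```python
import uuid


def build_process_events(processes):
    """Return one flat event per process, all tagged with the same run id."""
    # Assumption: a random UUID per collection run is the correlation identifier.
    run_id = uuid.uuid4().hex
    return [
        {
            "module": "system",
            "metricset": "process",
            "correlation_id": run_id,
            "system-process": dict(proc),
        }
        for proc in processes
    ]


events = build_process_events([
    {"pid": 1, "name": "init", "mem_rss": 4096},
    {"pid": 42, "name": "sshd", "mem_rss": 8192},
])
```

Because every event has the same fields, sorting or aggregating on a single field such as `system-process.mem_rss` is enough to answer "top processes by memory" for a given run.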

@andrewkroh is currently working on making it possible for a metricset to return multiple events. This PR will be updated as soon as these changes are in master.

@ruflin ruflin force-pushed the metricbeat-topbeat branch 2 times, most recently from 5ff8db7 to b205dce May 3, 2016 08:45
@ruflin
Member Author

ruflin commented May 3, 2016

I updated this PR to send an event for each process and filesystem. This is now possible with the new metricset interfaces. In addition, I added the fsstats metricset, which contains the file system stats.

@andrewkroh @tsg Please have a look.

"metricset": "filesystem",
"module": "system",
"rtt": 434,
"system-filesystem": {
Contributor

Is there any reason to have system in the name of the object? Why not just filesystem? I am thinking that once we have conditions in generic filtering, the name of the field available in the condition will be a bit too long (e.g. system-filesystem.device_name). Also, I think a mixture of "-" and "_" is not a good idea.

Member Author

@monicasarbu That is the namespacing we require. This is always $module-$metricset for all events.
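
A minimal sketch of that namespacing rule, using the event shape from the example document above. The helper name `wrap_event` and the sample payload are hypothetical.

```python
def wrap_event(module, metricset, data, rtt):
    """Place metricset data under the "$module-$metricset" key."""
    namespace = "%s-%s" % (module, metricset)
    return {
        "module": module,
        "metricset": metricset,
        "rtt": rtt,
        namespace: data,
    }


event = wrap_event("system", "filesystem", {"device_name": "/dev/sda1"}, rtt=434)
```

With this layout a filter condition would reference the field as `system-filesystem.device_name`, which is the long name the question above is concerned about.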

@andrewkroh
Member

Looks like you need a doc.go placeholder in the fsstats and filesystem packages so that the package is not empty for the operating systems on which those metric sets are unavailable.

@andrewkroh
Member

Other than the cross-compile error, LGTM. My comments were just minor things that I can fix later if you want.

event := common.MapStr{
"@timestamp": common.Time(time.Now()),
"type": "process",
"count": 1,
Contributor

count: 1 is no longer exported.

Member Author

Good one. I would have accidentally introduced count again.

Member Author

Removed

@ruflin
Member Author

ruflin commented May 3, 2016

@monicasarbu @andrewkroh Cleaned up and pushed again.

@andrewkroh
Member

@ruflin The system/process package also needs a doc.go file.

* Add Filesystem Metricset with fields.yml and example doc
* Add Fsstats Metricset with file system stats
* Add Process Metricset with fields.yml and example doc
* Enhance template generation to support nested documents
* Fix issue with type: string
* Raise exception in template generation script if invalid type is used
@ruflin
Member Author

ruflin commented May 3, 2016

@andrewkroh Fixed

`system-filesystem` contains local filesystem stats
fields:
- name: avail
type: integer
Member

A lot of these are marked as integer but are marked as long in Topbeat. Looks like they should be long because their data types are either int64 or uint64.
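
A quick check shows why this matters for byte counts. An Elasticsearch `integer` field is a signed 32-bit value and a `long` is a signed 64-bit value, so any filesystem larger than about 2 GiB overflows `integer` when its size is reported in bytes. The helper name `smallest_fitting_type` and the 1 TiB sample are illustrative assumptions.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer field
INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit long field


def smallest_fitting_type(value):
    """Return "integer" or "long", whichever is the smallest type holding value."""
    if -2**31 <= value <= INT32_MAX:
        return "integer"
    if -2**63 <= value <= INT64_MAX:
        return "long"
    raise OverflowError("%d does not fit in a 64-bit long" % value)


one_tib = 1024**4  # a 1 TiB filesystem, size in bytes
print(smallest_fitting_type(one_tib))  # → long
```

Note that a uint64 value above 2**63 - 1 would not fit in a `long` either, which is why the helper raises in that case.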

@andrewkroh andrewkroh merged commit 8e71923 into elastic:master May 3, 2016
@ruflin ruflin deleted the metricbeat-topbeat branch May 4, 2016 06:32
5 participants