-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mesos input: Collect framework_offers and allocator metrics #5719
Conversation
@danielnelson it'd be great to get this merged! Is there something I can do to help it along? |
I started reviewing this and it looked just fine; however, once I built and ran it, it seems that it isn't backwards compatible, though I do prefer the more generic/cleaned up metrics created from this (plus the ability to properly filter/collect). It doesn't necessarily break compatibility, as inserts still work when using this branch after having used the original plugin, but it removes fields previously gathered. For example, someone may have a query with dependency on It seems like this needs to be trimmed down to simply add the ability to filter the collection of Config (running docker mesos/mesos-mini): [[inputs.mesos]]
timeout = 100
masters = ["http://localhost:5050"]
master_collections = ["allocator"] Output from master (same output when
Output from this branch (
Output from this branch (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following appears sufficient to allow filtering on allocators
and framework_offers
. Let me know what you think or if I'm way off the mark.
diff --git a/plugins/inputs/mesos/mesos.go b/plugins/inputs/mesos/mesos.go
index 8b322b84..0cf6588c 100644
--- a/plugins/inputs/mesos/mesos.go
+++ b/plugins/inputs/mesos/mesos.go
@@ -380,6 +380,10 @@ func getMetrics(role Role, group string) []string {
"master/slaves_connected",
"master/slaves_disconnected",
"master/slaves_inactive",
+ "master/slave_unreachable_canceled",
+ "master/slave_unreachable_completed",
+ "master/slave_unreachable_scheduled",
+ "master/slaves_unreachable",
}
m["frameworks"] = []string{
@@ -405,6 +409,11 @@ func getMetrics(role Role, group string) []string {
"master/tasks_running",
"master/tasks_staging",
"master/tasks_starting",
+ "master/tasks_dropped",
+ "master/tasks_gone",
+ "master/tasks_gone_by_operator",
+ "master/tasks_killing",
+ "master/tasks_unreachable",
}
m["messages"] = []string{
@@ -444,12 +453,18 @@ func getMetrics(role Role, group string) []string {
"master/task_lost/source_master/reason_slave_removed",
"master/task_lost/source_slave/reason_executor_terminated",
"master/valid_executor_to_framework_messages",
+ "master/invalid_operation_status_update_acknowledgements",
+ "master/messages_operation_status_update_acknowledgement",
+ "master/messages_reconcile_operations",
+ "master/messages_suppress_offers",
+ "master/valid_operation_status_update_acknowledgements",
}
m["evqueue"] = []string{
"master/event_queue_dispatches",
"master/event_queue_http_requests",
"master/event_queue_messages",
+ "master/operator_event_stream_subscribers",
}
m["registrar"] = []string{
@@ -463,6 +478,11 @@ func getMetrics(role Role, group string) []string {
"registrar/state_store_ms/p99",
"registrar/state_store_ms/p999",
"registrar/state_store_ms/p9999",
+ "registrar/log/ensemble_size",
+ "registrar/log/recovered",
+ "registrar/queued_operations",
+ "registrar/registry_size_bytes",
+ "registrar/state_store_ms/count",
}
} else if role == SLAVE {
m["resources"] = []string{
@@ -683,65 +703,22 @@ func (m *Mesos) gatherMainMetrics(u *url.URL, role Role, acc telegraf.Accumulato
}
}
- taggedFields := map[string][]TaggedField{}
- extraTags := map[string]fieldTags{}
-
- for metricName, val := range jf.Fields {
- if !strings.HasPrefix(metricName, "master/frameworks/") && !strings.HasPrefix(metricName, "allocator/") {
+ for metricName := range jf.Fields {
+ if !strings.HasPrefix(metricName, "master/frameworks/") && !strings.HasPrefix(metricName, "frameworks/") && !strings.HasPrefix(metricName, "allocator/") {
continue
}
// filter out framework offers/allocator metrics if necessary
- if (!includeFrameworkOffers && strings.HasPrefix(metricName, "master/frameworks/")) ||
+ if !includeFrameworkOffers &&
+ (strings.HasPrefix(metricName, "master/frameworks/") || strings.HasPrefix(metricName, "frameworks/")) ||
(!includeAllocator && strings.HasPrefix(metricName, "allocator/")) {
delete(jf.Fields, metricName)
continue
}
-
- parts := strings.Split(metricName, "/")
- if (parts[0] == "master" && len(parts) < 5) || (parts[0] == "allocator" && len(parts) <= 5) {
- // All framework offers metrics have at least 5 parts.
- // All allocator metrics with <= 5 parts can be sent as is and does not pull
- // any params out into tags.
- // (e.g. allocator/mesos/allocation_run_ms/count vs allocator/mesos/roles/<role>/shares/dominant)
- continue
- }
-
- tf := generateTaggedField(parts)
- tf.Value = val
-
- if len(tf.tags()) == 0 {
- // indicates no extra tags were added
- continue
- }
-
- tfh := tf.hash()
- if _, ok := taggedFields[tfh]; !ok {
- taggedFields[tfh] = []TaggedField{}
- }
- taggedFields[tfh] = append(taggedFields[tfh], tf)
-
- if _, ok := extraTags[tfh]; !ok {
- extraTags[tfh] = tf.tags()
- }
-
- delete(jf.Fields, metricName)
}
acc.AddFields("mesos", jf.Fields, tags)
- for tfh, tfs := range taggedFields {
- fields := map[string]interface{}{}
- for _, tf := range tfs {
- fields[tf.FieldName] = tf.Value
- }
- for k, v := range tags {
- extraTags[tfh][k] = v
- }
-
- acc.AddFields("mesos", fields, extraTags[tfh])
- }
-
return nil
}
This restores framework and allocator metrics to their original names, without extracting parts into tags.
Thanks for taking a look, @glinton. You're right, this would break existing queries. I pushed commits that restore the original metric names, per your suggestion. However, it's been very useful to parse framework names, etc. out of these metrics into tags. Is there a way to accomplish that without breaking backwards compatibility? This seems like a good fit for a processor plugin, though I don't think there's an existing one I can use for this purpose. I think I'll write a small processor tailored for this, if you don't have a better idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you on the usefulness of the metrics this originally introduced. I'm not sure whether that's going to be best in a processor or in a plugin of it's own. I'd imagine a plugin of it's own, as processors should be pretty generic.
I can imagine an option to add them to this plugin may work, but we need to be careful with that. We've learned from other plugins that making different measurements based on a config flag can cause more trouble than it solves.
Any thoughts @danielnelson?
// based on presence of "framework_offers"/"allocator" in MasterCols. | ||
// These lines are included to prevent the "unknown" info log below. | ||
m["framework_offers"] = []string{} | ||
m["allocator"] = []string{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we don't list the metrics here and remove the logic in gatherMainMetrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plugins/inputs/mesos/mesos.go
Outdated
// filter out framework offers/allocator metrics if necessary | ||
if !includeFrameworkOffers && | ||
(strings.HasPrefix(metricName, "master/frameworks/") || strings.HasPrefix(metricName, "frameworks/")) || | ||
(!includeAllocator && strings.HasPrefix(metricName, "allocator/")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic ended up a bit too complex for my taste, can you move the filtering into func (m *Mesos) filterMetrics(
? I think we could wedge in a switch, something like this:
for _, k := range metricsDiff(role, selectedMetrics) {
fmt.Println("group:", k)
switch k {
case "allocators":
// TODO remove all allocator metrics
case "framework_offers":
// TODO remove all framework_offers
default:
// Removes other categories
for _, v := range getMetrics(role, k) {
if _, ok = (*metrics)[v]; ok {
delete((*metrics), v)
}
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed 1ee1789. This required a refactor to the test module: I had to move masterMetricNames
out of generateMetrics()
and make it a global, so that TestMasterFilter()
could know which metrics to expect. I did the same with slaveMetricNames
just for consistency. LMK what you think!
658414e
to
8352b73
Compare
8352b73
to
1ee1789
Compare
Thanks for reviewing and merging this, @danielnelson @glinton. Please let me know if you have suggestions for how to add the more useful metrics that this PR originally introduced, while maintaining backwards compatibility. I'm eager to contribute that improvement. |
I'm sure we can find a path forward on it. I didn't look much at the original pr, would you be able to write a new feature request issue that lists the changes you would make now if you had full license to break things? This will give me a better idea of how we should go about the changes. |
This PR causes the mesos input plugin to collect allocator metrics, and additional metrics for frameworks. These metrics are documented here:
Required for all PRs: