tetragon: Check final size for data event #1224
9fe8180 to b96a4ef
pkg/observer/data_stats.go
Outdated
@@ -13,6 +13,13 @@ var (
Help: "Data event statistics. For internal use only.",
ConstLabels: nil,
}, []string{"event"})

DataEventSizeHist = promauto.NewHistogramVec(prometheus.HistogramOpts{
Name: consts.MetricNamePrefix + "data_event_size_histogram",
I switched all metrics to use a Namespace field instead of concatenating the prefix in #1228, can you change this one too?
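To illustrate the request above, here is a minimal standard-library sketch of how a namespace and a name combine into a full metric name. The helper fqName is hypothetical; it only imitates, as far as I understand it, the "_" joining that the Prometheus Go client applies to the Namespace, Subsystem and Name fields. The values "tetragon_" and "tetragon" are assumptions, not taken from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName is a hypothetical helper that joins the non-empty parts with "_".
// It is an illustration only, not the Prometheus client implementation.
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	const prefix = "tetragon_"   // assumed value of the old name prefix
	const namespace = "tetragon" // assumed value of the metrics namespace

	concatenated := prefix + "data_event_size_histogram"
	namespaced := fqName(namespace, "", "data_event_size_histogram")

	// Both approaches should yield the same exported metric name.
	fmt.Println(concatenated)
	fmt.Println(namespaced)
	fmt.Println(concatenated == namespaced)
}
```

Under these assumptions the exported name does not change; the namespace simply moves out of the Name string into its own field.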
29fb310 to 6f8fd75
Looks good, I left some comments/suggestions for the userspace part.
cmd/tetragon/flags.go
Outdated
@@ -77,6 +77,8 @@ const (
keyEnablePidSetFilter = "enable-pid-set-filter"

keyEnableMsgHandlingLatency = "enable-msg-handling-latency"

keyEnableDataEventsSizeMetric = "enable-data-events-size-metric"
Is this flag needed?
From what I see, LatencyStats is currently the only metric that has to be explicitly enabled with a flag. Having a way to enable/disable metrics is useful, but a separate flag for each of them doesn't seem scalable. If it's not necessary right now, I would rather develop a more generic way to enable/disable metrics in the near future.
Any idea how bad/slow the histogram observe call is? I got the impression it's better if it's disabled by default.
Maybe we could have some generic --enable-metric-hist=<hist1,hist2,..> option, or something like that.
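A generic option of that shape could be handled roughly as follows. This is only a sketch: the function parseEnabledHists and the histogram names are hypothetical and do not exist in tetragon; the point is to show one comma-separated value replacing a flag per metric.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEnabledHists turns a comma-separated flag value, for example
// "data_event_size,msg_latency", into a set of enabled histogram names.
// Surrounding whitespace and empty entries are ignored.
func parseEnabledHists(value string) map[string]bool {
	enabled := make(map[string]bool)
	for _, name := range strings.Split(value, ",") {
		name = strings.TrimSpace(name)
		if name != "" {
			enabled[name] = true
		}
	}
	return enabled
}

func main() {
	// Hypothetical value passed as --enable-metric-hist=...
	enabled := parseEnabledHists("data_event_size, msg_latency,")

	// Each histogram call site would guard its observation on the set.
	for _, name := range []string{"data_event_size", "msg_latency", "other"} {
		fmt.Printf("%s enabled: %v\n", name, enabled[name])
	}
}
```

A set lookup keeps the per-event check cheap, and adding a new optional histogram needs no new flag.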
> any idea how bad/slow is the histogram observe call?
The histogram observe call shouldn't be too bad, I think; summaries are slow, but histograms are generally OK. The cardinality also shouldn't be too bad in this case, since all possible label values are known.
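To make the cost argument concrete, here is a toy fixed-bucket histogram. It is not the Prometheus client implementation (which also has to be safe for concurrent use); the type sizeHist and the bucket bounds are assumptions chosen for illustration. An observation is a binary search over a small sorted slice plus two increments.

```go
package main

import (
	"fmt"
	"sort"
)

// sizeHist is a toy histogram with fixed bucket upper bounds.
// It is not safe for concurrent use.
type sizeHist struct {
	bounds []float64 // sorted upper bounds of the buckets
	counts []uint64  // one counter per bound, plus a final overflow bucket
	sum    float64   // running sum of all observed values
}

func newSizeHist(bounds []float64) *sizeHist {
	return &sizeHist{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// observe records v in the first bucket whose upper bound is >= v,
// or in the overflow bucket when v exceeds every bound.
func (h *sizeHist) observe(v float64) {
	i := sort.SearchFloat64s(h.bounds, v)
	h.counts[i]++
	h.sum += v
}

func main() {
	h := newSizeHist([]float64{64, 256, 1024, 4096}) // assumed bounds in bytes
	for _, size := range []float64{64, 100, 5000} {
		h.observe(size)
	}
	fmt.Println(h.counts, h.sum)
}
```

The work per observation depends only on the number of buckets, not on how many events were seen before, which is consistent with the comment that histograms are generally cheap to update.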
I'm OK with this. The main problem with some of the metrics was the cardinality of the labels, and this doesn't look bad from that side.
ok, I removed the option
pkg/observer/data_stats.go
Outdated
// Define a counter metric for data event statistics
DataEventStats = promauto.NewCounterVec(prometheus.CounterOpts{
Namespace: consts.MetricsNamespace,
Name: "data_event_stats_total",
- Name: "data_event_stats_total",
+ Name: "data_events_total",
ok
pkg/observer/data_stats.go
Outdated
DataEventStats = promauto.NewCounterVec(prometheus.CounterOpts{
Namespace: consts.MetricsNamespace,
Name: "data_event_stats_total",
Help: "Data event statistics. For internal use only.",
- Help: "Data event statistics. For internal use only.",
+ Help: "The number of data events by type. For internal use only.",
ok, thanks
Thanks!
Adding size check on receiving side of data event to make sure we won't use incomplete data. There was a bug that kept the loop in do_str ineffective, because rd_bytes was never incremented. The clang compilation seemed to skip the loop completely, so now that it's fixed, we can't actually have 10 iterations, only 2, in order to stay within the verifier complexity limit.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding stats for data events to keep track of what's happening in there. Signed-off-by: Jiri Olsa <jolsa@kernel.org>
6a81fa5 to 393ed1b
pkg/option/config.go
Outdated
@@ -75,6 +75,8 @@ type config struct {
EnablePidSetFilter bool

EnableMsgHandlingLatency bool

EnableDataEventsSizeMetric bool
This seems unused now
pkg/observer/data_stats.go
Outdated

DataEventSizeHist = promauto.NewHistogramVec(prometheus.HistogramOpts{
Namespace: consts.MetricsNamespace,
Name: "data_event_stats_size",
The last nit, promise - can it be just data_event_size? Having "stats" in the metric name might be a bit confusing about what is actually measured IMO :)
- Name: "data_event_stats_size",
+ Name: "data_event_size",
sure, np ;-)
Adding good/bad histograms to keep track of data event sizes. Signed-off-by: Jiri Olsa <jolsa@kernel.org>
looks good
Adding size check on receiving side of data event to make
sure we won't use incomplete data.
Also adding stats for data events to keep track of what's
happening in there.