Add performance metrics for initial sync and netpol #3450
750d039 to 9a757cd
/retest
  EnableConfigDuration bool `gcfg:"enable-config-duration"`
- EnableEIPScaleMetrics bool `gcfg:"enable-eip-scale-metrics"`
+ EnableScaleMetrics bool `gcfg:"enable-scale-metrics"`
Flavio added the previous EIP metrics flag:
Add flags to explicitly enable the histogram metrics, since we only see value
in having them when scale testing egress ips. The flag introduced here is:
--metrics-enable-eip-scale
Signed-off-by: Flavio Fernandes <flaviof@redhat.com>
@flavio-fernandes are you ok with changing this?
@trozet yes, we can change it without any problems.
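The effect of the rename discussed above can be sketched as a single flag gating every scale-only metric. This is a minimal illustration, not the code from this PR: `metricsToRegister` and the two `hypothetical_*` metric names are assumptions made up for the example; only the struct fields and the `sync_duration_seconds` name come from the thread.

```go
package main

import "fmt"

// metricsConfig mirrors the two fields shown in the diff. The gcfg tags are
// kept as plain struct tags; no config library is needed for this sketch.
type metricsConfig struct {
	EnableConfigDuration bool `gcfg:"enable-config-duration"`
	EnableScaleMetrics   bool `gcfg:"enable-scale-metrics"`
}

// metricsToRegister is a hypothetical helper that returns the names of the
// metrics that would be registered for a given configuration.
func metricsToRegister(cfg metricsConfig) []string {
	// The sync duration metric is described in the PR as enabled by default.
	names := []string{"sync_duration_seconds"}
	if cfg.EnableScaleMetrics {
		// Placeholder names: one general flag covers both egress IP and
		// network policy scale metrics.
		names = append(names,
			"hypothetical_egress_ip_scale_metric",
			"hypothetical_network_policy_scale_metric",
		)
	}
	return names
}

func main() {
	fmt.Println(metricsToRegister(metricsConfig{}))
	fmt.Println(metricsToRegister(metricsConfig{EnableScaleMetrics: true}))
}
```

With one general flag, adding another scale-only metric later would not require a new configuration option.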
@@ -368,35 +368,39 @@ func (oc *DefaultNetworkController) Run(ctx context.Context) error {

  // Sync external gateway routes. External gateway may be set in namespaces
  // or via pods. So execute an individual sync method at startup
- oc.cleanExGwECMPRoutes()
+ WithWatchDurationMetricNoError("external gateway router", oc.cleanExGwECMPRoutes)
Is this something we are interested in? I guess it doesn't hurt, but this isn't really a watcher; it is a sync function that just cleans up stale stuff. Also, it should be called "external gateway routes", and maybe add the word "cleanup"?
Yeah, I added this to make sure we see where the summarized time comes from. About the name: the metric is actually called sync_duration_seconds, which is probably not the best name, but at least it can be applied to both watchers and syncs like this one.
ovn-k master to start watching every resource.
Add scale metrics for network policies, rename existing enable-eip-scale-metrics flag to more general enable-scale-metrics, and use it for network policy metric too.
Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com>
9a757cd to fbc9e4e
/lgtm
@trozet we have an lgtm, do you have any more comments?
@npinaeva can you create a new PR updating the metrics doc with these new metrics?
@martinkennelly sure, I just wanted to ask whether we mention scale metrics in that doc. I see egress IP metrics are not there, so should I only add
@npinaeva yeah, I think that's OK
Add MetricMasterSyncDuration metric to track how long it takes for
ovn-k master to start watching every resource.
Add scale metrics for network policies, rename existing
enable-eip-scale-metrics flag to more general enable-scale-metrics, and
use it for network policy metric too.
Add an option to enable scale metrics in kind
For anyone who would like to test this, ovnkube_master_sync_duration_seconds is enabled by default. To check scale metrics, use the newly added kind.sh flag --scale-metrics.
To fetch master metrics:
curl <ovn-k master leader ip>:9409/metrics