This CFP proposes adding metrics to the top-level Hive structure so that hive startup time can be tracked and reported alongside the provider's (e.g. Cilium's) other metrics. The new metric should report the same value that is logged next to "Started hive":
    // hive/hive.go, line 389 at a129bcc
    log.Info("Started hive", "duration", time.Since(start))
While this is the initial motivation, the mechanism should be implemented in hive in a way that makes it reusable for future use cases.
Background
Following up on the earlier PR about making hive timeouts configurable (#60), this CFP proposes adding a metric that tracks the actual startup time. This would make it possible to set SLOs and monitor regressions.
Solutions
Proposed solution
In my opinion, the most generic and scalable solution would be to implement a singleton HiveMetrics structure (similar to what Cilium has for each metric in pkg/metrics/metrics.go) and initialize it with a concrete implementation somewhere inside the metrics provider (e.g. cilium-agent):
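    // Metrics is implemented by the metrics provider.
    type Metrics interface {
        StartupTimeout(duration time.Duration)
    }

    // NopMetrics is the default and discards all observations.
    type NopMetrics struct{}
    ...
    var _ Metrics = NopMetrics{}
    var HiveMetrics Metrics = NopMetrics{}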
For example, in cilium-agent:
    type hiveMetricsImpl struct{}

    func (hiveMetricsImpl) StartupTimeout(duration time.Duration) {
        // A gauge takes a float64, so the duration is reported in seconds.
        metrics.HiveStartupDuration.WithLabelValues(<some-label>).Set(duration.Seconds())
    }
    ...
    hive.HiveMetrics = hiveMetricsImpl{}
A similar idea would be to not make HiveMetrics a singleton, but rather to embed it inside the Hive structure. The implementation would be similar, except that HiveMetrics is initialized from RunOptions:
    func WithMetricsProvider(provider Metrics) RunOptionFunc {
        return func(h *Hive) {
            h.opts.MetricsProvider = provider
        }
    }
hive.Run() would then be called with WithMetricsProvider(), passing an implementation specific to Cilium.
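The RunOptions variant can be sketched end to end as follows. This is only an illustration: `Hive`, `Options`, and `Run` here are reduced, hypothetical versions and do not reflect the real hive API, and `recorder` is a made-up provider implementation.

```go
package main

import (
	"fmt"
	"time"
)

type Metrics interface {
	StartupTimeout(duration time.Duration)
}

// nopMetrics is used when no provider is configured.
type nopMetrics struct{}

func (nopMetrics) StartupTimeout(time.Duration) {}

// Options and Hive are reduced, hypothetical versions of the hive types.
type Options struct {
	MetricsProvider Metrics
}

type Hive struct {
	opts Options
}

type RunOptionFunc func(*Hive)

func WithMetricsProvider(provider Metrics) RunOptionFunc {
	return func(h *Hive) {
		h.opts.MetricsProvider = provider
	}
}

// Run applies the options, times the stand-in startup function and
// reports the measured duration to the configured provider.
func (h *Hive) Run(startup func(), opts ...RunOptionFunc) time.Duration {
	h.opts.MetricsProvider = nopMetrics{}
	for _, opt := range opts {
		opt(h)
	}
	start := time.Now()
	startup()
	d := time.Since(start)
	h.opts.MetricsProvider.StartupTimeout(d)
	return d
}

// recorder is a hypothetical provider-side implementation.
type recorder struct {
	last time.Duration
}

func (r *recorder) StartupTimeout(d time.Duration) { r.last = d }

func main() {
	r := &recorder{}
	h := &Hive{}
	d := h.Run(func() { time.Sleep(10 * time.Millisecond) }, WithMetricsProvider(r))
	fmt.Println("metric matches measured duration:", r.last == d)
}
```

Compared with the singleton, this keeps the provider scoped to one Hive instance, at the cost of threading it through every call site that runs a hive.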
While this option also seems reasonable, the first one looks more natural and is closer to what Cilium already has.
Both approaches would make it easy to extend hive with new high-level metrics without any structural modifications.
Alternatives
Another approach could be to use pre/post start hooks (i.e. callbacks) defined in the metrics provider. Those hooks could be stored inside Options and initialized from RunOptions:
    type hiveHook func()

    type hookWrapper struct {
        preHook  hiveHook
        postHook hiveHook
    }

    type Options struct {
        ...
        startHooks hookWrapper
    }

    func WithStartHooks(preHook, postHook hiveHook) RunOptionFunc {
        return func(h *Hive) {
            h.opts.startHooks = hookWrapper{preHook, postHook}
        }
    }
Then, for example, in Cilium:
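    // Pointer receivers, so the span started in the pre-hook is visible in the post-hook.
    func (hm *hiveMetrics) preStartHook() {
        hm.hiveDuration = spanstat.Start()
    }

    func (hm *hiveMetrics) postStartHook() {
        metrics.HiveStartupDuration.WithLabelValues(<some-label>).Set(hm.hiveDuration.End(true).Total().Seconds())
    }

    // Pass the methods themselves, not the results of calling them.
    New().WithStartHooks(hm.preStartHook, hm.postStartHook)
    ...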
I am not a big fan of this solution, as I believe its use cases are limited, though I might be missing something. On the plus side, it does not require any metric-specific or provider-specific logic to be introduced in cilium/hive.
Summary
There may be better solutions to this problem, so I would be happy to hear them, along with any general feedback on this CFP.