-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sidecar Flag: Random delay before upload blocks on sidecar #7239
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Hello ! @yeya24 PTAL |
cmd/thanos/sidecar.go
Outdated
@@ -506,7 +510,30 @@ type sidecarConfig struct { | |||
storeRateLimits store.SeriesSelectLimits | |||
} | |||
|
|||
type durationValue time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to set duration via flag directly without implementing interface.
https://github.com/thanos-io/thanos/blob/main/cmd/thanos/receive.go#L968
cmd/thanos/sidecar.go
Outdated
@@ -363,6 +364,9 @@ func runSidecar( | |||
uploadCompactedFunc, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName) | |||
|
|||
return runutil.Repeat(30*time.Second, ctx.Done(), func() error { | |||
// Generate random delay using upload jitter. | |||
jitter := time.Duration(rand.Int63n(int64(conf.shipper.uploadJitter))) | |||
time.Sleep(jitter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of sleeping, can we do something similar to this https://github.com/cortexproject/cortex/blob/master/pkg/util/time.go#L73?
I image we will add a new function called runutil.RepeatWithJitter
which takes the same parameters as runutil.Repeat
but also takes variancePerc
.
We can just hardcode variancePerc
to a sane value like 0.05 so allows [-1.5s, 1.5s] jitter.
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
…thanos into patch-sidecar_flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this contribution. I am curious what is the benefit to adding jitter to the upload.
cmd/thanos/sidecar.go
Outdated
@@ -105,6 +106,19 @@ func registerSidecar(app *extkingpin.App) { | |||
}) | |||
} | |||
|
|||
// DurationWithJitter returns random duration from "input - input*variancePerc" to "input + input*variancePerc" interval. | |||
func DurationWithJitter(input time.Duration, variancePerc float64) time.Duration { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd/thanos/sidecar.go
Outdated
@@ -518,4 +538,7 @@ func (sc *sidecarConfig) registerFlag(cmd extkingpin.FlagClause) { | |||
sc.storeRateLimits.RegisterFlags(cmd) | |||
cmd.Flag("min-time", "Start of time range limit to serve. Thanos sidecar will serve only metrics, which happened later than this value. Option can be a constant time in RFC3339 format or time duration relative to current time, such as -1d or 2h45m. Valid duration units are ms, s, m, h, d, w, y."). | |||
Default("0000-01-01T00:00:00Z").SetValue(&sc.limitMinTime) | |||
conf := &sidecarConfig{} | |||
cmd.Flag("upload-jitter", "Maximum random delay before uploading blocks.").Default("0s").DurationVar(&uploadJitter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we bind the flag directly to conf.shipper.uploadJitter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yess i will do it
pkg/runutil/runutil.go
Outdated
case <-ctx.Done(): | ||
return nil | ||
default: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding jitter prevents network blockage as it introduces random delays to the uploads |
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Hello! @fpetkovski @yeya24 resolved the reviews PTAL! |
cmd/thanos/sidecar.go
Outdated
jitter := DurationWithJitter(conf.shipper.uploadJitter, 0.05) | ||
if jitter > 0 { | ||
time.Sleep(jitter) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need to add another jitter here? Isn't RepeatWithJitter
enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should i also remove DurationWithJitter
function or keep it for future use ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove that
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
cmd/thanos/sidecar.go
Outdated
@@ -362,7 +375,7 @@ func runSidecar( | |||
s := shipper.New(logger, reg, conf.tsdb.path, bkt, m.Labels, metadata.SidecarSource, | |||
uploadCompactedFunc, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName) | |||
|
|||
return runutil.Repeat(30*time.Second, ctx.Done(), func() error { | |||
return runutil.RepeatWithJitter(30*time.Second, ctx, 0.05, func() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.05 variance seems too small as it is only 30s * 0.05 = 1.5s
?
I am thinking about 0.2 here maybe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved all !
cmd/thanos/sidecar.go
Outdated
@@ -518,4 +531,6 @@ func (sc *sidecarConfig) registerFlag(cmd extkingpin.FlagClause) { | |||
sc.storeRateLimits.RegisterFlags(cmd) | |||
cmd.Flag("min-time", "Start of time range limit to serve. Thanos sidecar will serve only metrics, which happened later than this value. Option can be a constant time in RFC3339 format or time duration relative to current time, such as -1d or 2h45m. Valid duration units are ms, s, m, h, d, w, y."). | |||
Default("0000-01-01T00:00:00Z").SetValue(&sc.limitMinTime) | |||
conf := &sidecarConfig{} | |||
cmd.Flag("upload-jitter", "Maximum random delay before uploading blocks.").Default("0s").DurationVar(&conf.shipper.uploadJitter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this if it is not used anymore
cmd/thanos/config.go
Outdated
@@ -164,6 +164,7 @@ type shipperConfig struct { | |||
allowOutOfOrderUpload bool | |||
hashFunc string | |||
metaFileName string | |||
uploadJitter time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this if not needed
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
docs/components/sidecar.md
Outdated
@@ -228,6 +228,7 @@ Flags: | |||
configuration. See format details: | |||
https://thanos.io/tip/thanos/tracing.md/#configuration | |||
--tsdb.path="./data" Data directory of TSDB. | |||
--upload-jitter=0s Maximum random delay before uploading blocks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it looks like this flag is never consumed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added nowhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yess will remove it
cmd/thanos/sidecar.go
Outdated
@@ -362,7 +362,7 @@ func runSidecar( | |||
s := shipper.New(logger, reg, conf.tsdb.path, bkt, m.Labels, metadata.SidecarSource, | |||
uploadCompactedFunc, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName) | |||
|
|||
return runutil.Repeat(30*time.Second, ctx.Done(), func() error { | |||
return runutil.RepeatWithJitter(30*time.Second, ctx, 0.2, func() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really help though if we are jittering only 6s; at least for us block uploads can take far longer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can increase the jitter. What value do you think that makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, the whole method feels kinda weird. It was about in house blob storage and blocks being uploaded at same time right?
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
pkg/runutil/runutil.go
Outdated
// RepeatWithJitter executes f with a random jitter added to the interval between each execution. | ||
// It continues until ctx is done or f returns an error. | ||
// The jitter factor should be between 0 and 1, where 0 means no jitter and 1 means the interval can vary from 0 to 2 times the original interval. | ||
func RepeatWithJitter(interval time.Duration, ctx context.Context, jitterFactor float64, f func() error) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ctx should be always first parameter. https://pkg.go.dev/context#pkg-overview
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Changes
#7184
Implemented the logic to sleep for a random time between 0 and the specified jitter before uploading blocks.Defined a flag to specify the maximum random delay before uploading blocks.
Verification