satellite/audit: much stricter audit transfer speed reqs
With MinBytesPerSecond's old value of 128B, we were allowing multiple
hours (roughly 4.5h) for transfer of a maximum-size piece (~2MiB) during
reverifications. We do want to be generous until it has been proven that
something needs to be stricter, but we have hit that point. Multiple
hours is much too long.

Instead, the new value of 150kB per second is still reasonably generous
while limiting transfer time for a full-size piece to around 15s.

In the case of downloading a single share during verification, the
MinDownloadTimeout will apply and allow for minor network hiccups. We
shorten that time here as well, from 5m to 15s. This should allow a much
higher rate of audit completion in the presence of very slow nodes.

Refs: storj/infra#5390
Change-Id: I7402ebd91b2115b379e7671d6b7455d3b0eae1fb
thepaul authored and Storj Robot committed Jan 27, 2024
1 parent 0b90a4a commit 5ec1232
Showing 2 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions satellite/audit/worker.go
@@ -21,8 +21,8 @@ var Error = errs.Class("audit")
 // Config contains configurable values for audit chore and workers.
 type Config struct {
 	MaxRetriesStatDB int `help:"max number of times to attempt updating a statdb batch" default:"3"`
-	MinBytesPerSecond memory.Size `help:"the minimum acceptable bytes that storage nodes can transfer per second to the satellite" default:"128B" testDefault:"1.00 KB"`
-	MinDownloadTimeout time.Duration `help:"the minimum duration for downloading a share from storage nodes before timing out" default:"5m0s" testDefault:"5s"`
+	MinBytesPerSecond memory.Size `help:"the minimum acceptable bytes that storage nodes can transfer per second to the satellite" default:"150kB" testDefault:"1.00 KB"`
+	MinDownloadTimeout time.Duration `help:"the minimum duration for downloading a share from storage nodes before timing out" default:"15s" testDefault:"5s"`
 	MaxReverifyCount int `help:"limit above which we consider an audit is failed" default:"3"`

 	ChoreInterval time.Duration `help:"how often to run the reservoir chore" releaseDefault:"24h" devDefault:"1m" testDefault:"$TESTINTERVAL"`
4 changes: 2 additions & 2 deletions satellite/satellite-config.yaml.lock
@@ -77,10 +77,10 @@
 # audit.max-reverify-count: 3

 # the minimum acceptable bytes that storage nodes can transfer per second to the satellite
-# audit.min-bytes-per-second: 128 B
+# audit.min-bytes-per-second: 150.00 KB

 # the minimum duration for downloading a share from storage nodes before timing out
-# audit.min-download-timeout: 5m0s
+# audit.min-download-timeout: 15s

 # how often to recheck an empty audit queue
 # audit.queue-interval: 1h0m0s