Less-exponential alternative to internal_service_policy()? #1270

@jgallagher

internal_service_policy() is used throughout omicron (~20 call sites at the time of filing this issue). It serves as a "retry this forever with exponential backoff" policy, but there are cases where either exponential backoff is unnecessary or the maximum backoff interval it allows (1 hour) is much too long. #996 describes one such case, with discussion of alternatives (some of which may be specific to disk creation).

RSS also has several uses of this policy, which seems unlikely to be correct. For example, #1251 uses it to wait for maghemite to report its awareness of other sleds in the rack, but that wait happens while an operator is sitting waiting for RSS to make progress, so it definitely shouldn't allow an hour between retries.

Should we have at least one other retry policy available with a much shorter max interval?
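To make the question concrete, here is a rough sketch of what a shorter-capped policy could look like next to a 1-hour cap. It uses only the standard library and is not omicron's actual retry code: `CappedBackoff`, `short_cap_policy()`, `one_hour_cap_policy()`, the 250 ms initial delay, the doubling factor, and the 5 s cap are all hypothetical values chosen for illustration. Only the 1-hour maximum comes from the description above.

```rust
use std::time::Duration;

/// Hypothetical schedule of retry delays: exponential growth, clamped to a cap.
/// It never ends, matching the "retry this forever" behavior described above.
#[derive(Debug, Clone)]
struct CappedBackoff {
    next_delay: Duration,
    multiplier: u32,
    max_interval: Duration,
}

impl CappedBackoff {
    fn new(initial: Duration, multiplier: u32, max_interval: Duration) -> Self {
        Self {
            // Never start above the cap.
            next_delay: initial.min(max_interval),
            multiplier,
            max_interval,
        }
    }
}

impl Iterator for CappedBackoff {
    type Item = Duration;

    fn next(&mut self) -> Option<Duration> {
        let current = self.next_delay;
        // Grow the delay, saturating on overflow, then clamp to the cap.
        self.next_delay = current
            .saturating_mul(self.multiplier)
            .min(self.max_interval);
        Some(current)
    }
}

/// Hypothetical short-cap policy for waits an operator is watching.
/// The 250 ms initial delay and 5 s cap are assumed values.
fn short_cap_policy() -> CappedBackoff {
    CappedBackoff::new(Duration::from_millis(250), 2, Duration::from_secs(5))
}

/// Same shape with a 1-hour cap, for comparison.
/// The initial delay and multiplier are assumed, not taken from omicron.
fn one_hour_cap_policy() -> CappedBackoff {
    CappedBackoff::new(Duration::from_millis(250), 2, Duration::from_secs(3600))
}
```

With these assumed numbers, the short-cap schedule levels off at 5 seconds after a handful of attempts, while the 1-hour schedule keeps doubling until the gap between retries reaches a full hour. A caller would sleep for each yielded delay between attempts and stop iterating once the operation succeeds.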
