Description
internal_service_policy() is used throughout omicron (at the time of creating this issue, ~20 times). It serves as a "retry this forever with exponential backoff", but there are cases where either exponential backoff is not necessary, or the max backoff time allowed by internal_service_policy() (1 hour) is much too long. #996 describes one such case with discussion about alternatives (some of which may be specific to disk creation). RSS also has several uses of this policy, which seems unlikely to be correct; e.g., #1251 uses this policy to wait for maghemite to report its awareness of other sleds in the rack, but ultimately that happens while an operator is sitting waiting for RSS to make progress, which definitely shouldn't allow for an hour between retries. Should we have at least one other retry policy available that has a much shorter max interval?
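To make the question concrete, here is a rough sketch of what two policies with different caps might look like. It uses only the standard library and does not reflect how omicron actually implements internal_service_policy(). The `CappedBackoff` type, the `long_cap_policy()` and `short_cap_policy()` names, the 250 ms initial interval, the multiplier of 2, and the 5 second short cap are all assumptions made up for illustration; only the 1 hour maximum comes from the description above. Jitter is omitted to keep the schedule easy to read.

```rust
use std::time::Duration;

/// Hypothetical capped exponential backoff schedule (no jitter).
/// Yields an endless sequence of delays, i.e. "retry forever".
#[derive(Clone, Copy, Debug)]
struct CappedBackoff {
    next_delay: Duration,
    multiplier: u32,
    max_interval: Duration,
}

impl CappedBackoff {
    fn new(initial: Duration, multiplier: u32, max_interval: Duration) -> Self {
        Self {
            // Never start above the cap.
            next_delay: initial.min(max_interval),
            multiplier,
            max_interval,
        }
    }
}

impl Iterator for CappedBackoff {
    type Item = Duration;

    fn next(&mut self) -> Option<Duration> {
        let current = self.next_delay;
        // Grow the delay, but clamp it at the configured maximum.
        self.next_delay = current
            .saturating_mul(self.multiplier)
            .min(self.max_interval);
        Some(current)
    }
}

/// Stand-in for a policy with a 1 hour cap (initial interval is assumed).
fn long_cap_policy() -> CappedBackoff {
    CappedBackoff::new(Duration::from_millis(250), 2, Duration::from_secs(60 * 60))
}

/// Hypothetical alternative for operations an operator is actively waiting on.
fn short_cap_policy() -> CappedBackoff {
    CappedBackoff::new(Duration::from_millis(250), 2, Duration::from_secs(5))
}

fn main() {
    // Compare the first 16 delays produced by each policy.
    for (long, short) in long_cap_policy().zip(short_cap_policy()).take(16) {
        println!("long cap: {:>10?}   short cap: {:>6?}", long, short);
    }
}
```

Under these assumed parameters the short-cap schedule levels off at 5 seconds after a handful of attempts, while the long-cap schedule keeps doubling until it reaches an hour. Something like the former seems closer to what RSS wants while an operator is waiting on it.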