psc-seq: don't repeat StickyFault every time.
We're currently in a situation where the dogfood rack has the power whip
feeding PSUs 3-5 disconnected. This means the PSUs are sitting there in
faulted state refusing to enable. Since we can't distinguish this from
actual PSU failures, currently we're frantically chatting about this
situation in the logs. This is annoying, and overwrites other data.

This change alters the behavior to note a sticky fault condition _once_
and then quietly keep trying to resolve it without the spam.
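
The note-once behavior described above can be sketched in isolation. This is a minimal illustration, not the driver's actual code: `State`, `Action`, `step`, the `fault_cleared` flag, and the `FAULT_OFF_MS` value are simplified stand-ins chosen for the example, and pushing the retry deadline out on each failed attempt is an assumption about logic that the diff does not show.

```rust
/// Simplified stand-in for the PSU state tracked by the sequencer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    On,
    Faulted {
        /// Earliest time at which we retry turning the PSU on.
        turn_on_deadline: u64,
        /// True once a retry has already failed and been noted.
        already_sticky: bool,
    },
}

/// Simplified stand-in for the action the caller should take.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    NoteStickyFault,
}

/// Hypothetical retry interval; the real value is not shown in the diff.
const FAULT_OFF_MS: u64 = 5_000;

/// One polling step. `fault_cleared` stands in for the PSU status check.
fn step(state: State, now: u64, fault_cleared: bool) -> (State, Option<Action>) {
    match state {
        State::Faulted { turn_on_deadline, already_sticky }
            if now >= turn_on_deadline =>
        {
            if fault_cleared {
                // The fault is gone: leave the fault state quietly.
                (State::On, None)
            } else {
                // Still faulted: stay put and schedule another retry
                // (assumed behavior), marking the fault as sticky.
                let next = State::Faulted {
                    turn_on_deadline: now.saturating_add(FAULT_OFF_MS),
                    already_sticky: true,
                };
                // Only report the sticky fault the first time we see it.
                let action = if already_sticky {
                    None
                } else {
                    Some(Action::NoteStickyFault)
                };
                (next, action)
            }
        }
        // Not faulted, or the retry deadline has not been reached yet.
        other => (other, None),
    }
}
```

Carrying the flag inside the `Faulted` variant means it resets automatically whenever the state is rebuilt on a fresh fault, so a new fault is always noted once.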
cbiffle committed May 22, 2024
1 parent 1d3f3d5 commit 0b1b95c
Showing 1 changed file with 25 additions and 3 deletions.
28 changes: 25 additions & 3 deletions drv/psc-seq-server/src/main.rs
@@ -259,7 +259,16 @@ enum PresentState {
     NewlyInserted { settle_deadline: u64 },
     /// The PSU has unexpectedly deasserted the OK signal, or failed to assert
     /// it within a reasonable amount of time after being turned on.
-    Faulted { turn_on_deadline: u64 },
+    Faulted {
+        // Try to turn the PSU back on when this time is reached, but only if
+        // the fault has cleared. Otherwise, we will stay in the fault state
+        // with a "sticky fault" situation.
+        turn_on_deadline: u64,
+        /// Initially false, this gets set to true once we've attempted to clear
+        /// the current fault condition, and failed, at least once. We use this
+        /// to suppress repeated logging of the condition.
+        already_sticky: bool,
+    },
 }
 
 #[export_name = "main"]
@@ -385,6 +394,7 @@ fn main() -> ! {
                 ));
                 PresentState::Faulted {
                     turn_on_deadline: start_time.saturating_add(FAULT_OFF_MS),
+                    already_sticky: false,
                 }
             })
         });
@@ -565,14 +575,19 @@ impl Psu {
                 let turn_on_deadline = now.wrapping_add(FAULT_OFF_MS);
                 self.state = PsuState::Present(PresentState::Faulted {
                     turn_on_deadline,
+                    // We're newly entering a fault state, so, not sticky yet.
+                    already_sticky: false,
                 });
                 Some(ActionRequired::DisableMe {
                     attempt_snapshot: true,
                 })
             }
 
             (
-                PsuState::Present(PresentState::Faulted { turn_on_deadline }),
+                PsuState::Present(PresentState::Faulted {
+                    turn_on_deadline,
+                    already_sticky,
+                }),
                 _,
                 ok,
             ) => {
@@ -585,8 +600,15 @@ impl Psu {
                             self.state =
                                 PsuState::Present(PresentState::Faulted {
                                     turn_on_deadline,
+                                    already_sticky: true,
                                 });
-                            Some(ActionRequired::NoteStickyFault)
+                            if already_sticky {
+                                // Don't repeat this in the logs until we clear
+                                // the fault condition.
+                                None
+                            } else {
+                                Some(ActionRequired::NoteStickyFault)
+                            }
                         }
                         Status::Good => {
                             self.state = PsuState::Present(PresentState::On);