Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving spd's worst case latency #599

Closed
cbiffle opened this issue Jun 10, 2022 · 2 comments
Closed

Improving spd's worst case latency #599

cbiffle opened this issue Jun 10, 2022 · 2 comments

Comments

@cbiffle
Copy link
Collaborator

cbiffle commented Jun 10, 2022

Currently we have a priority inversion in the gimlet images between spd (the DIMM SPD proxy task) and i2c_driver. This is because spd issues i2c transactions to read information from DIMMs, but also has strict latency requirements as a result of implementing an SMBus device, due to how the ST I2C controller works. We moved spd above the i2c_driver task because, since i2c_driver is monolithic (serves all controllers one at a time), putting it higher priority than spd meant that any i2c traffic anywhere on the system would block spd from running, making it very, very easy to miss its deadlines.

The priority inversion creates a very real availability bug:

  1. spd starts a blocking call to i2c_driver.
  2. Any task with priority between i2c_driver and spd starts running
  3. spd is now out to lunch for an arbitrarily long period of time despite being higher priority

The system does not currently enforce the uphill send rule that prevents these priority inversions, which is why you can build gimlet. But we intend to start, to avoid bugs like the one I just described.

So, let's fix spd.

I worked through a number of options on paper, and the one that seems like the best compromise between "reasonably low amount of work" and "not a total hack that will prevent me from sleeping" can be summarized as:

  1. Poll the DIMM thermal sensors from the thermal task (lower priority). Post the results to the sensors task (high priority) like all other thermal sensors.
  2. Read the SPD EEPROM contents from the gimlet_seq task (lower priority), because it knows exactly when the window between "DIMM SPD power available" (A2) and "potentially racing the host" (A0) is, since it's responsible for sequencing it.
  3. Deliver the SPD EEPROM contents into the spd task via IPC.
  4. Alter spd to serve cached EEPROM contents (pre-prepared by gimlet_seq before leaving A2), plus live temperature data taken from the sensors task. This means spd never makes a blocking call to any task except sensors, which is high priority and responds promptly.

The relative priority of the affected tasks, and their IPC relationships, are shown in this diagram:
PXL_20220610_175113553

In descending order from highest to lowest priority we have

  • sensors
  • spd
  • i2c
  • thermal / gimlet_seq - my assumption is that, for other reasons, gimlet_seq will wind up being lower priority than thermal, but for the purposes of fixing spd they are partially ordered and their relative priority doesn't matter.

This proposal would fix #562, #545, and the thing people have run into (but for which I cannot find a filed issue) where the host will lock up on boot intermittently despite known bad actors like humility dashboard not being used.

@cbiffle
Copy link
Collaborator Author

cbiffle commented Jun 15, 2022

Over in #601 I've reworked the H7 I2C target code to be more robust, and it seems like it even worked. So I'm looking into the SPD EEPROM side of this proposal now.

Currently, the spd task reads the EEPROMs once, at startup, using this routine. (The routine is called in a loop, but the loop exits once it succeeds; the comments mention that this is to handle racing the sequencer turning power on. That problem will be conveniently fixed by this proposal!)

Two 256-byte pages are read and proxied, though they're read in different ways -- the first as a sequence of 1+255 byte reads, the second as a pair of 128+128 byte reads. From my previous experience with this style of EEPROMs that difference is probably not important.

The output from the read phase consists of three pieces of information:

  1. The actual EEPROM contents.
  2. A "present" bitset recording which SPD EEPROMs were contacted.
  3. An overall count of DIMMs present. (This is only used to decide whether to retry the scan, and is not retained for the rest of the proxy's operation.)

The remainder of the code is the actual proxy implementation, which shouldn't need to change that much.

An issue I can foresee, though: the I2C target code assumes that it has complete control of the task and provides its own main loop. This makes it incompatible with tasks exposing an IPC interface. For the SPD task to receive EEPROM data from the sequencer as I've proposed, we will have to make two concessions:

  1. SPD wakes up serving an IPC interface, and later (in response to an IPC) stops serving that interface and begins running the proxy.
  2. This implies that the SPD EEPROM data cannot be updated after having been transferred by IPC.

This isn't great, in large part because any task so unfortunate as to send to spd after it has stopped listening will block forever...and the task most likely to do that is gimlet-seq. If gimlet-seq were to restart, in its current form, it would presumably attempt to contact spd and hang, to be eventually freed by a watchdog. This makes it yet more important that gimlet-seq can recover its state on a restart and not re-do work it has already done, something that I've built the plumbing for but have not implemented.

Another option is that I rework the I2C target code to remove the "inversion of control" structure. This would be ... involved, because it's using that inversion-of-control thing to good effect: it implements a multi-part state machine using code, rather than an explicit data-driven state machine, which is nice. Inverting that would require moving to an explicit data-driven state machine built out of match statements, and that would be harder to follow.

A final option is that I look into (ab)using the I2C target code's wfi hook to process incoming messages. This is less weird than it sounds, since the implementation of wfi in a Hubris task winds up being a recv. So, I think I'll take a crack at that. Fortunately Idol servers do not assume they control the main loop, so we can call idol_runtime::dispatch_n from within the I2C WFI hook. I think.

cbiffle added a commit that referenced this issue Jun 16, 2022
This will provide a channel for feeding EEPROM data into the cache,
incrementally, per the design laid out in #599.
cbiffle added a commit that referenced this issue Jun 16, 2022
The sequencer task now reads the EEPROMs once it's confident they're
available, and pushes the info into the SPD proxy before starting the
host. This eliminates the last blocking I2C transfer from the SPD proxy,
which should improve its liveness. It also eliminates the priority
inversion between SPD (high) and `i2c_driver` (lower).

Fixes #599.
@cbiffle
Copy link
Collaborator Author

cbiffle commented Apr 27, 2023

This got fixed as a side effect of #1242.

@cbiffle cbiffle closed this as completed Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant