Skip to content

Conversation

@rivos-eblot
Copy link

@rivos-eblot rivos-eblot commented Nov 24, 2025

Following #294 I've updated our internal SPI device tests, so they fully pass using Verilator backend (Darjeeling).

However, when running those tests on QEMU, some tests are badly failing. It seems that at least on regression comes from 7378578. The symptom is the following:

Upon READ_JEDEC command processing, an INTR_UPLOAD_PAYLOAD_NOT_EMPTY interrupt is raised, which does not seem valid (and is not raised on Verilator).

PAYLOAD_NOT_EMPTY interrupt should not be raised for commands that do not have a payload.

I think this new sequence is invalid:

    case CTRL_MODE_FLASH:
        // ...
        /* uploaded payload */
        if (f->pos) {
            s->spi_regs[R_INTR_STATE] |= INTR_UPLOAD_PAYLOAD_NOT_EMPTY_MASK;
            update_irq = true;
        }
        // ...    

On command completion, the following code:raises an interruption irrespective of the payload_en setting.

This means that the READ_JEDEC command which can be arbitrary long may trigger a INTR_UPLOAD_PAYLOAD_NOT_EMPTY, while it does not trigger the INTR_UPLOAD_CMDFIFO_NOT_EMPTY, i.e. a payload without a command...

I think this comes from a combination of invalid code in original implementation that remained hidden until changes/fixes have been made with the new implementation (such as addressing the payload/upload misunderstanding).

Moreover, I do not think defining f->len = SPI_SRAM_PAYLOAD_SIZE; would be accurate in this case, as SRAM buffer is not addressed; fortunately other parts in the code safely deal with f->pos >= f->len cases.

While tracking down this issue, I discovered that PAYLOAD_EN bitfield was not used at all, so it is possible that for the SPI_PASSTHROUGH implementation (that we do not use), there is more to address than this fix.

@rivos-eblot
Copy link
Author

rivos-eblot commented Nov 24, 2025

TPM issue //sw/device/tests:spi_device_tpm_tx_rx_test_sim_qemu_sival_rom_ext was pre-existing: https://github.com/lowRISC/qemu/actions/runs/19632441818

@rivos-eblot rivos-eblot marked this pull request as ready for review November 24, 2025 17:32
Comment on lines 903 to 924
if (f->pos) {
s->spi_regs[R_INTR_STATE] |= INTR_UPLOAD_PAYLOAD_NOT_EMPTY_MASK;
update_irq = true;
if (ot_spi_device_flash_command_has_payload(f)) {
if (f->pos) {
s->spi_regs[R_INTR_STATE] |= INTR_UPLOAD_PAYLOAD_NOT_EMPTY_MASK;
update_irq = true;
}
Copy link

@ziuziakowska ziuziakowska Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a cursory glance at the RTL, and it seems the Read JEDEC and Read Status submodules (and possibly some others) do not have any interrupt wires, so a command shouldn't fire these interrupts if it was handled by the submodule, not necessarily if the command info has payload_en unset.

It is possible for payload_en to be set in the command info slots corresponding to Read JEDEC and Read Status, as the command info slots lose their special meaning in Passthrough mode and it avoids having to swap out the values when switching mode, but this field is not honoured in Flash mode for these slots.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest we need to check for the current command, then for the payload_en state if the command supports it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, it should check whether the command slot has been handled by hardware (slot is Read Status 1-3 or Read JEDEC, device is in Flash mode or in Passthrough Mode with the corresponding intercept enable bit set), if so, do not raise any interrupts, otherwise only then check payload_en.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm lost :-) has_payload requires the command to be a SW command, that is none of the HW commands.
If the behavior is different for the passthrough mode that I do not know at all, feel free to amend this PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the check should be something like this: #296 (comment)

@rivos-eblot rivos-eblot force-pushed the dev/ebl/spidev_payload_fix branch from a7c9fa3 to 95ececc Compare November 27, 2025 15:23
Comment on lines 920 to 924
if (ot_spi_device_flash_command_has_payload(f)) {
if (f->pos) {
s->spi_regs[R_INTR_STATE] |= INTR_UPLOAD_PAYLOAD_NOT_EMPTY_MASK;
update_irq = true;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the check should be something like this (without any deduplication):

Suggested change
if (ot_spi_device_flash_command_has_payload(f)) {
if (f->pos) {
s->spi_regs[R_INTR_STATE] |= INTR_UPLOAD_PAYLOAD_NOT_EMPTY_MASK;
update_irq = true;
}
bool cond = false;
switch (ot_spi_device_get_mode(s) {
case CTRL_MODE_FLASH:
if (f->slot <= SLOT_HW_READ_JEDEC) {
break;
}
cond = (f->cmd_info & CMD_INFO_PAYLOAD_EN_MASK) != 0u));
break;
case CTRL_MODE_PASSTHROUGH:
switch (f->slot) {
case SLOT_HW_READ_STATUS1:
case SLOT_HW_READ_STATUS2:
case SLOT_HW_READ_STATUS3:
cond = (FIELD_EX32(intercept_val32, INTERCEPT_EN, STATUS)) ? false : (f->cmd_info & CMD_INFO_PAYLOAD_EN_MASK) != 0u));
break;
case SLOT_HW_READ_JEDEC:
cond = (FIELD_EX32(intercept_val32, INTERCEPT_EN, JEDEC)) ? false : (f->cmd_info & CMD_INFO_PAYLOAD_EN_MASK) != 0u));
break;
default:
cond = (f->cmd_info & CMD_INFO_PAYLOAD_EN_MASK) != 0u));
break;
}
break;
default:
break;
}
if (cond) {
if (f->pos) {
s->spi_regs[R_INTR_STATE] |= INTR_UPLOAD_PAYLOAD_NOT_EMPTY_MASK;
update_irq = true;
}

At least, that is how I understand these lines.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least, that is how I understand these lines.

Never trust the documentation :) This is not true for at least READ_SFDP, from the waves I've observed.

I'll push a new version with your changes for the passthrough mode.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and it seems many tests no longer pass.

I think I will split this PR in flash_mode / passthough_mode. We are blocked by the regression that has been introduced with the latest changes in SPI device on the flash mode part; maybe the emulation is not accurate if I leave the passthrough mode as is, but it won't bring any regression and the passthrough mode later.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if @ziuziakowska's changes were still considering some flash commands to have payloads that shouldn't. I obviously don't know what's failing in your tests, but maybe you could change the flash case to break on f->slot <= SLOT_HW_READ_QUAD_IO rather than f->slot <= SLOT_HW_READ_JEDEC?

@rivos-eblot rivos-eblot force-pushed the dev/ebl/spidev_payload_fix branch 2 times, most recently from 31e1518 to 016b52f Compare December 2, 2025 09:44
@rivos-eblot rivos-eblot force-pushed the dev/ebl/spidev_payload_fix branch from 016b52f to 216eae4 Compare December 2, 2025 10:43
Copy link

@AlexJones0 AlexJones0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SPI device is definitely very complicated to get right. I had a look through the RTL and have given some links. Hopefully they can help.


static bool ot_spi_device_flash_command_is_payload_en(const SpiDeviceFlash *f)
{
return (f->cmd_info & CMD_INFO_PAYLOAD_EN_MASK) != 0u;
Copy link

@AlexJones0 AlexJones0 Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is correct for the flash case. See usages of payload_en in spi_readcmd.sv in the RTL: if a non-standard lane-select value that isn't 0010/0011/1111 is selected then the SPI device just defaults to Single IO (0010). So there is no "disabling" the payload via payload_en.

However it looks like this is not the case for passthrough mode, which actually does care about the payload_en command info being non-zero - see spi_passthrough.sv.

As far as I can tell (it is very possible I am missing something, SPI device is very complex), nothing out of this readcmd and passthrough logic use this field - all other SPI modules tie it off. So its only use in QEMU emulation should probably be in the passthrough mode logic, whereas flash mode just ignores it given that we aren't emulating SPI data lanes here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK this is true, I believe payload_en is used to control Passthrough mode, and in Flash mode the "payload" is handled by cmdparse and forwarded to the hardware modules which have their own state machine, or for the software commands it is uploaded if the upload field is set, so there is no "payload" for the downstream flash so-to-speak.

upload_intr = ot_spi_device_flash_command_is_payload_en(f);
}
break;
case SLOT_HW_READ_JEDEC:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that SLOT_HW_READ_SFDP can also be intercepted, so that case should probably be handled here as well?

upload_intr = ot_spi_device_flash_command_has_payload(f);
break;
case CTRL_MODE_PASSTHROUGH:
switch (f->slot) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic seems correct to me (modulo my previous comments) - why remove it? Did something break in your tests with these changes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there were dozen of failures in the CI, so this was a test, hence the additional commit :-)

I really do not know what to do with this PR to be honest: I do not know at all the passthrough feature, I'm afraid I have no time to dig into it, and we do not use it at all - and have no unit test for it.

However we are in a dead lock situation with the flash mode being broken. Maybe I should close this PR and let you guys address it when you have some time, and only address the flash mode on our side.

Copy link

@AlexJones0 AlexJones0 Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I think it's fine to not address passthrough in this PR, you can leave it untouched if you don't have the capacity/understanding (maybe you could add a short todo comment to capture that this IRQ is not correct for passthrough?)

I think it is good to just make the fix to flash mode in this PR to unblock you - I am hoping that with my other comment addressed (just removing the payload_en check for flash mode) your tests are still passing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks.
I pushed a new version where

  1. we check whether the command is SW command or not, without testing the status of PAYLOAD_EN.
  2. I've added a @todo for the passthrough mode
  3. I think I have kept the existing behaviour for passthrough (i.e. only testing f->pos)

Now waiting for the CI.

@rivos-eblot rivos-eblot force-pushed the dev/ebl/spidev_payload_fix branch from 216eae4 to d080f55 Compare December 2, 2025 17:25
Copy link

@AlexJones0 AlexJones0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks OK to me, thanks.

@rivos-eblot
Copy link
Author

I think this looks OK to me, thanks.

Thanks. Sorry not to be able to work on the passthrough mode.

@AlexJones0 AlexJones0 self-requested a review December 3, 2025 10:39
Copy link

@AlexJones0 AlexJones0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked locally and the ownership transfer test failures we're seeing in the CI ROM_EXT regression (or the //sw/device/silicon_creator/rom_ext/e2e/attestation:print_certs_test_sim_qemu_rom_ext which is the smallest) seem to be genuine issues with the SPI Device and not just flakiness.

I've fixed it by making this change, on upload we also should not be checking cmd_info.payload_en != 0

diff --git a/hw/opentitan/ot_spi_device.c b/hw/opentitan/ot_spi_device.c
index 7d22d31643..4a45114016 100644
--- a/hw/opentitan/ot_spi_device.c
+++ b/hw/opentitan/ot_spi_device.c
@@ -1465,12 +1465,10 @@ static void ot_spi_device_flash_exec_sw_command(OtSPIDeviceState *s, uint8_t rx)
         }
         break;
     case SPI_FLASH_UP_PAYLOAD:
-        if (ot_spi_device_flash_command_is_payload_en(f)) {
-            if (ot_spi_device_flash_command_is_rx_payload(f)) {
-                f->payload[f->pos % SPI_SRAM_PAYLOAD_SIZE] = rx;
-            }
-            f->pos++;
+        if (ot_spi_device_flash_command_is_rx_payload(f)) {
+            f->payload[f->pos % SPI_SRAM_PAYLOAD_SIZE] = rx;
         }
+        f->pos++;
         break;
     case SPI_FLASH_DONE:
         FLASH_CHANGE_STATE(s, ERROR);

@AlexJones0 AlexJones0 self-requested a review December 3, 2025 11:01
@rivos-eblot
Copy link
Author

I've fixed it by making this change, on upload we also should not be checking cmd_info.payload_en != 0

Ok, that explains why the f->pos counter differs between runs on both versions. Thanks.

PAYLOAD_NOT_EMPTY interrupt should not be raised for commands that do not have a payload.

Signed-off-by: Emmanuel Blot <eblot@rivosinc.com>
@rivos-eblot rivos-eblot force-pushed the dev/ebl/spidev_payload_fix branch from d080f55 to 576e1e5 Compare December 3, 2025 11:18
@rivos-eblot rivos-eblot merged commit 514e20a into lowRISC:ot-9.2.0 Dec 3, 2025
7 of 10 checks passed
@rivos-eblot rivos-eblot deleted the dev/ebl/spidev_payload_fix branch December 3, 2025 12:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants