Skip to content

Conversation

jwnrt
Copy link

@jwnrt jwnrt commented Sep 3, 2025

This fixes a couple of bugs that I missed in the original PR:

  1. Collecting the test results for the artifact caused failures to be missed.
  2. Some flaky tests were not in the flaky test list.
  3. Sorting is locale-dependent, causing local runs to give different errors to CI.

@jwnrt jwnrt force-pushed the regression-pipefail branch 2 times, most recently from cbf69c5 to 5895be8 Compare September 3, 2025 12:40
…e is caught

The `| tee` pipe on the test script absorbs the failure status and
causes the job to always succeed. Adding `pipefail` propagates the
failure through the pipe.

Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
The GitHub Actions CI has a different locale to my own local one, making
it hard to run this script locally. Change script to sort locally so
that the files don't need to be sorted with the same locale.

Also fixes a bug where the flaky test list wasn't being used properly.

Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
When running this script a second time, cached results will include the
string `PASSED` twice which messes with this regex.

Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
I have seen these tests pass and fail on different runs.

Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
@rivos-eblot
Copy link

BTW it would be worth adding some reference & doc for this script to docs/opentitan

@jwnrt jwnrt force-pushed the regression-pipefail branch from 5895be8 to 0a676d4 Compare September 3, 2025 16:04
@jwnrt
Copy link
Author

jwnrt commented Sep 3, 2025

Good point, I've added a file for regressions.

@jwnrt jwnrt requested a review from rivos-eblot September 3, 2025 16:05
Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
@jwnrt jwnrt force-pushed the regression-pipefail branch from 0a676d4 to 67feaa2 Compare September 3, 2025 16:19
@rivos-eblot
Copy link

Thanks. LGTM

Comment on lines +3 to +10
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat1_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat2_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat3_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat5_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat6_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat7_sim_qemu_rom_with_fake_keys
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat9_sim_qemu_rom_with_fake_keys

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find these failures surprising since I've never seen them fail personally. I think it's fine to add for now though if you've seen this, and we can remove them again later if needed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it fails more often than not:

//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys TIMEOUT in 48 out of 50 in 60.2s

It must be timing related

Copy link

@AlexJones0 AlexJones0 Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think it might be timing related - but not in the way you think. Running that test myself locally:

INFO: Build completed successfully, 6 total actions
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys PASSED in 10.6s
  Stats over 5 runs: max = 10.6s, min = 10.6s, avg = 10.6s, dev = 0.0s

I think it heavily depends upon the speed of the host machine. This could explain why you keep seeing errors on the CI runner (and locally) but I don't? What is the failure mode - is it just a genuine timeout?

If so, these tests probably need tagging with longer timeouts in OpenTitan.

@jwnrt jwnrt merged commit 9cbbcb0 into lowRISC:ot-earlgrey-9.2.0 Sep 4, 2025
8 checks passed
@jwnrt jwnrt deleted the regression-pipefail branch September 4, 2025 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants