OpenTitan regression script bug fixes #166

jwnrt · 2025-09-03T10:21:22Z

This fixes a couple of bugs that I missed in the original PR:

- Collecting the test results for the artifact caused failures to be missed.
- Some flaky tests were not in the flaky test list.
- Sorting is locale-dependent, causing local runs to give different errors to CI.

.github/workflows/opentitan_regression.yaml

…e is caught The `| tee` pipe on the test script absorbs the failure status and causes the job to always succeed. Adding `pipefail` propagates the failure through the pipe. Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
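The effect described in this commit message can be reproduced with a minimal sketch. It assumes `bash` is installed and uses `false` as a stand-in for a failing test script; the variable names are illustrative and are not taken from the workflow.

```shell
# `false` stands in for a failing test script (an assumption for illustration).
# Without pipefail, the pipeline's exit status is that of `tee`, which succeeds.
bash -c 'false | tee /dev/null'
status_without=$?

# With pipefail, the pipeline reports the failing command's status instead.
bash -o pipefail -c 'false | tee /dev/null'
status_with=$?

echo "without pipefail: $status_without, with pipefail: $status_with"
```

This prints `without pipefail: 0, with pipefail: 1`, which is why a CI step that pipes into `tee` can appear green even though the script failed.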

The GitHub Actions CI uses a different locale from my local one, which makes it hard to run this script locally. Change the script to sort the files itself so that they don't need to be pre-sorted with the same locale. Also fixes a bug where the flaky test list wasn't being used properly. Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
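One way to make such a comparison independent of the host locale is sketched below. The file names, the test labels, and the use of `comm` are assumptions for illustration; the actual script may do this differently.

```shell
# Hypothetical inputs: labels that failed in this run, and a known-flaky list.
tmp=$(mktemp -d)
printf '%s\n' '//pkg:test_b' '//pkg:Test_a' '//pkg:test_a' > "$tmp/failed.txt"
printf '%s\n' '//pkg:test_b' > "$tmp/flaky.txt"

# Sort both files inside the script under a fixed locale, so the result does
# not depend on how (or on which machine) the inputs were sorted.
LC_ALL=C sort -u "$tmp/failed.txt" > "$tmp/failed.sorted"
LC_ALL=C sort -u "$tmp/flaky.txt" > "$tmp/flaky.sorted"

# comm -23 keeps lines found only in the first file: failures not known-flaky.
# comm expects its inputs in the collation order of the locale it runs under.
unexpected=$(LC_ALL=C comm -23 "$tmp/failed.sorted" "$tmp/flaky.sorted")
echo "$unexpected"

rm -rf "$tmp"
```

With `LC_ALL=C`, ordering is by byte value, so uppercase letters sort before lowercase ones on every machine.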

When running this script a second time, cached results will include the string `PASSED` twice which messes with this regex. Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>
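The kind of problem described here can be illustrated with a small sketch. The result line below is hypothetical (the exact format of a cached result is an assumption), and the patterns are not the ones used in the script.

```shell
# Hypothetical result line in which the string PASSED appears twice.
line='//pkg:test_a PASSED (cached) PASSED in 1.0s'

# A greedy `.*` runs up to the last PASSED, so the captured label carries junk.
loose=$(printf '%s\n' "$line" | sed -E 's/^(.*) PASSED.*$/\1/')

# Constraining the label to non-space characters stops at the first PASSED.
strict=$(printf '%s\n' "$line" | sed -E 's/^([^ ]+) +PASSED.*$/\1/')

echo "loose:  $loose"
echo "strict: $strict"
```

The loose pattern yields `//pkg:test_a PASSED (cached)`, while the constrained one yields just `//pkg:test_a`.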

I have seen these tests pass and fail on different runs. Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>

rivos-eblot · 2025-09-03T14:04:14Z

BTW it would be worth adding some reference & doc for this script to docs/opentitan

jwnrt · 2025-09-03T16:04:55Z

Good point, I've added a file for regressions.

docs/opentitan/regressions.md

Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>

rivos-eblot · 2025-09-03T16:22:19Z

Thanks. LGTM

AlexJones0 · 2025-09-03T16:24:07Z

scripts/opentitan/tests-flaky.txt

+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat1_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat2_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat3_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat5_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat6_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat7_sim_qemu_rom_with_fake_keys
+//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat9_sim_qemu_rom_with_fake_keys


I find these failures surprising since I've never seen them fail personally. I think it's fine to add for now though if you've seen this, and we can remove them again later if needed.

jwnrt · 2025-09-03

For me it fails more often than not:

//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys TIMEOUT in 48 out of 50 in 60.2s

It must be timing related

AlexJones0 · 2025-09-03 (edited)

Ah, I think it might be timing related - but not in the way you think. Running that test myself locally:

INFO: Build completed successfully, 6 total actions
//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys PASSED in 10.6s
  Stats over 5 runs: max = 10.6s, min = 10.6s, avg = 10.6s, dev = 0.0s

I think it heavily depends upon the speed of the host machine. This could explain why you keep seeing errors on the CI runner (and locally) but I don't? What is the failure mode - is it just a genuine timeout?

If so, these tests probably need tagging with longer timeouts in OpenTitan.
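A possible way to check the timeout hypothesis from an OpenTitan checkout is sketched below. `--runs_per_test` and `--test_timeout` are standard Bazel flags, but the values chosen here are assumptions, and the sketch only prints the command rather than running it.

```shell
# One of the tests discussed in this thread.
target='//sw/device/silicon_creator/lib/sigverify/sphincsplus/test:verify_test_kat4_sim_qemu_rom_with_fake_keys'

# Repeat the test 50 times with a 300 s limit (both values are illustrative).
# If the timeouts disappear, the tests are likely just slow on this host.
cmd="bazel test --runs_per_test=50 --test_timeout=300 $target"

# Dry run: print the command. Run it by hand from an OpenTitan checkout.
echo "$cmd"
```

If a longer limit does remove the failures, a more permanent fix would be to give the affected test targets a larger `timeout` attribute in their Bazel definitions, as suggested above.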

jwnrt requested review from AlexJones0 and rivos-eblot September 3, 2025 10:21

jwnrt force-pushed the regression-pipefail branch from d8e3ae0 to 3cb56c4 Compare September 3, 2025 10:31

rivos-eblot reviewed Sep 3, 2025

.github/workflows/opentitan_regression.yaml

jwnrt force-pushed the regression-pipefail branch 2 times, most recently from cbf69c5 to 5895be8 Compare September 3, 2025 12:40

jwnrt added 4 commits September 3, 2025 13:50

[ot] scripts/opentitan: bazel: constrain passing test regex

d1cb726

When running this script a second time, cached results will include the string `PASSED` twice which messes with this regex. Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>

[ot] scripts/opentitan: bazel: add more tests to the flaky test list

0e9e1a9

I have seen these tests pass and fail on different runs. Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>

jwnrt force-pushed the regression-pipefail branch from 5895be8 to 0a676d4 Compare September 3, 2025 16:04

jwnrt requested a review from rivos-eblot September 3, 2025 16:05

rivos-eblot approved these changes Sep 3, 2025

docs/opentitan/regressions.md

[ot] docs/opentitan: regressions: add docs

67feaa2

Signed-off-by: James Wainwright <james.wainwright@lowrisc.org>

jwnrt force-pushed the regression-pipefail branch from 0a676d4 to 67feaa2 Compare September 3, 2025 16:19

AlexJones0 approved these changes Sep 3, 2025

AlexJones0 mentioned this pull request Sep 3, 2025

ot_kmac: Implement error processing + misc. fixes #171

Merged

jwnrt merged commit 9cbbcb0 into lowRISC:ot-earlgrey-9.2.0 Sep 4, 2025
8 checks passed

jwnrt deleted the regression-pipefail branch September 4, 2025 07:28
