Skip to content

twister: Adjust status for quarantined instances #90617

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 5, 2025

Conversation

PerMac
Copy link
Member

@PerMac PerMac commented May 26, 2025

Skipped status fits quarantined items better than filtered.
Filtered tests are by default removed from reports, which
shouldn't be the case for quarantined tests.
Adjust tests unit and blackbox tests accordingly.

hakehuang
hakehuang previously approved these changes May 27, 2025
@hakehuang hakehuang self-requested a review May 27, 2025 03:47
@PerMac PerMac force-pushed the quarantineF branch 3 times, most recently from 5f04812 to b7939b7 Compare May 27, 2025 12:44
hakehuang
hakehuang previously approved these changes May 28, 2025
Copy link
Collaborator

@gchwier gchwier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want to have also information about quarantined tests on the console in Summary report?

Summary
├── Total test suites: 6
├── Processed test suites: 6
│   ├── Filtered test suites: 5
│   │   ├── Filtered test suites (static): 5
│   │   └── Filtered test suites (at runtime): 0
│   └── Selected test suites: 1
│       ├── Skipped test suites: 0
│       ├── Passed test suites: 1
│       ├── Built only test suites: 0
│       ├── Failed test suites: 0
│       └── Errors in test suites: 0
└── Total test cases: 7
    ├── Filtered test cases: 5
    └── Selected test cases: 2
        ├── Passed test cases: 2
        ├── Skipped test cases: 0
        ├── Built only test cases: 0
        ├── Blocked test cases: 0
        ├── Failed test cases: 0
        └── Errors in test cases: 0

I've tested with command:
./scripts/twister -T samples/hello_world -T samples/subsys/testsuite/pytest/shell -T samples/basic/blinky -vv --west-flash --device-testing --hardware-map hardware_map.yml --quarantine-list quarantine.yaml --no-clean

where quarantine file:

- scenarios:
    - sample.basic.helloworld
  platforms:
    - nrf54l15dk/nrf54l15/cpuapp

- scenarios:
    - .*.vt100_colors_off
    - sample.harness.shell

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know if someone is using generated files: twister_suite_report.xml and twister_report.xml :
see line 778:

            self.xunit_report(json_file, filename + ".xml", full_report=False)
            self.xunit_report(json_file, filename + "_report.xml", full_report=True)
            self.xunit_report_suites(json_file, filename + "_suite_report.xml")

but in generated files the skipped counters is set to 0 (proper value only in twister.xml)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an oversight on my side, I will fix it. thanks

@nashif
Copy link
Member

nashif commented May 29, 2025

why does this need to be a new status? Did you consider using SKIPPED instead with a reason that indicates it is because of quarantine?

@PerMac
Copy link
Member Author

PerMac commented May 29, 2025

@nashif I believe that having a dedicated (native) status for quarantined tests would signify the difference between such tests and tests that where skipped/filtered by other reasons. I think there is a psychological effect, where people would ignore filtered/skipped tests (as something that is expected) but seeing quarantined could raise their awareness (similarly to a warning in logs).
There are also some differences how such tests are handled (e.g. beside the presence in report which is addressed in this PR, it was discussed that quarantined tests should have extra verbosity, which I plan to tackle in a next PR).
Of course the same effect could be achieved by having slightly more complex clauses (if "skipped" and reason "quarantine") but I believe that dedicated status helps with organizing/understanding a logic of test management workflows and should be less error prone.

However, I am opened to suggestions/counter opinions. We can briefly ask other testers what's their take on this.

@nashif
Copy link
Member

nashif commented May 29, 2025

Moving them from FILTERED to SKIPPED would get you this visibility IMO, anything we have marked as SKIPPED right now, should have exactly the same scrutiny and attention, there is not much special about quarantine beside the fact it is managed, however, if a test is skipped because of overflows, it is a test that was not executed and needs exactly the same attention.

see here https://github.com/zephyrproject-rtos/zephyr/runs/43108372284 for example, skipped tests are already listed, nothing special to be done and it is already supported by various tooling.

@PerMac
Copy link
Member Author

PerMac commented May 29, 2025

@gchwier I didn't want to change counting in this PR. My idea was to first fix quarantine test not present in reports and achieve this by introducing a new status. In a follow up PR I plan to further modify some behaviors, like having separate counter and addressing if quarantine tests should be more verbose in twisters output.
However, if we decide to add new status, adding counting for might also be needed in the same PR

@PerMac
Copy link
Member Author

PerMac commented May 29, 2025

@nashif I understand that there are multiple ways to address quarantine tests in reports:

  • new status
  • leave them as "filtered" but modify behavior in reporting section
  • move them to "skipped", so they are present in reports but add different treatment (e.g. they are not counted as runtime skips)

IMO the most transparent solution is to have the dedicated status.
From my understanding we have "filtered" for tests that were "statically" (based on strings) descoped at the "test plan" level, and "skipped" for runtime skips requiring cmake pre-compilation. Now "quarantined" is a subset of "filtered", since it is a "static" filter at the scope selection level.
TBH if you don't think the new status is a good idea I would prefer to leave them as "filtered" but address their reporting, so we won't blur the boundaries between "static" and "dynamic" descoping.

@nashif
Copy link
Member

nashif commented May 29, 2025

skipped is not necessarily "runtime". Those things that are being skipped right now could be considered quarantied at runtime as well, we could also list those test with 'skip: True' as skipped and report them.

  • new status
  • leave them as "filtered" but modify behavior in reporting section

That does not provide any value, filtered are things I really do not care about. Those are tests that not supppoed to be run and do not need tracking

  • move them to "skipped", so they are present in reports but add different treatment (e.g. they are not counted as runtime skips)

The way I see it, this is the intended usecase for skipped, coming up with another status is an overkill, non-standard and requires lots of adaption to tooling. In the past we used skipped for many other things, but looking at what we have right now, those are actually things we can act on and need attention.

Verbosity >1 should be enough to see filtered tests. Coupling
it with debug logging level makes the output messy.

Signed-off-by: Maciej Perkowski <maciej.perkowski@nordicsemi.no>
@PerMac PerMac changed the title twister: Add QUARANTINE status twister: Adjust status for quarantined instances Jun 2, 2025
@PerMac
Copy link
Member Author

PerMac commented Jun 2, 2025

@nashif As suggested, I change quarantined statuses to skipped

@@ -114,7 +114,7 @@ def twister(options: argparse.Namespace, default_options: argparse.Namespace):
if i.status == TwisterStatus.FILTER:
if options.platform and not tplan.check_platform(i.platform, options.platform):
continue
logger.debug(
logger.info(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very verbose, even in debug mode, given that the comment field is pretty much freestyle and someone might end up writing a novel there. Maybe we should have a field for an issue number, so it is managed and can be added to the console output, right now:

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the issue is the length of messages, and not the change in logger level, right?
I am worried that unify field for a number might not work, since quarantine is not only for upstream zephyr. We use it also linking to internal tickets, and I assume some users might not even care that much about such linking in their projects.
What about some simplified generic comment: "Quarantined, check reports for details", or even just "Quarantined"?

Copy link
Member

@nashif nashif Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the issue is the length of messages, and not the change in logger level, right?

yes, putting a freestyle comment in the output here is asking for trouble. A comment might be anything.

I am worried that unify field for a number might not work, since quarantine is not only for upstream zephyr.

Then do not use it downstream, or use your own numbering scheme or whatever, not using it downstream is not a good reason for not adding features and usabiliy upstream.

We use it also linking to internal tickets, and I assume some users might not even care that much about such linking in their projects. What about some simplified generic comment: "Quarantined, check reports for details", or even just "Quarantined"?

This is not linking anything, this is a reference only, if an issue field exists, it will be shown as:

Quarantied: see #<issue number>

instead of a full link based on a comment.

issue number can be anything and you can use your own numbering/reference scheme downstream.

For now, you can just leave it as "Quarantined", that reference field thing can be done later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw now we already will need some differentiation between "quarantine" skips and "other" skips, (not showing the whole reason in logs). I like the idea of some extra formatting field for issue number, but please keep in mind that it will add even more tricky handling of skip types at reporting level. that's why I preferred to have a dedicated status for quarantined items.

@nashif
Copy link
Member

nashif commented Jun 3, 2025

ok, i can see those reported as skipped in twister.json, but shouldnt those skipped tests actually be reported on the screen as skipped as well? I am getting nothing righ now:

 ./scripts/twister -T samples/hello_world/ --quarantine-list  ~/quar.yml -v
ZEPHYR_BASE unset, using "/home/nashif/zephyrproject/zephyr"
Renaming output directory to /home/nashif/zephyrproject/zephyr/twister-out.5
INFO    - Using Ninja..
INFO    - Zephyr version: v4.1.0-4737-gc9872af0ad31
INFO    - Using 'zephyr' toolchain.
INFO    - Selecting default platforms per testsuite scenario
INFO    - Building initial testsuite list...
INFO    - Writing JSON report /home/nashif/zephyrproject/zephyr/twister-out/testplan.json
INFO    - JOBS: 22
INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue

INFO    - 1 test scenarios (46 configurations) selected, 0 configurations filtered (0 by static filter, 0 at runtime).
INFO    - 0 of 46 executed test configurations passed (0.00%), 0 built (not run), 0 failed, 0 errored, with no warnings in 1.12 seconds.
INFO    - 0 of 0 executed test cases passed (0.00%) on 0 out of total 1067 platforms (0.00%).
INFO    - 46 selected test cases not executed: 46 skipped.
INFO    - 0 test configurations executed on platforms, 46 test configurations were only built.
INFO    - Saving reports...
INFO    - Writing JSON report /home/nashif/zephyrproject/zephyr/twister-out/twister.json
INFO    - Writing xunit report /home/nashif/zephyrproject/zephyr/twister-out/twister.xml...
INFO    - Writing xunit report /home/nashif/zephyrproject/zephyr/twister-out/twister_report.xml...
INFO    - Run completed

@nashif
Copy link
Member

nashif commented Jun 3, 2025

Also

INFO - 46 selected test cases not executed: 46 skipped.

that one is correct.

INFO - 0 test configurations executed on platforms, 46 test configurations were only built.

That one is not. AFAIK those skipped tests were never built?

@PerMac
Copy link
Member Author

PerMac commented Jun 3, 2025

shouldn't those skipped tests actually be reported on the screen as skipped as well?

These outputs are "locked" with verbosity >1 to follow how "filtered" is handled. With the change to the logging level there, adding extra -v will enable skipped and filtered on the screen as well (before it also required adding -ll debug, which overloaded the output with debug logs).

Do you think skips should be visible already at verbosity>0?

@nashif
Copy link
Member

nashif commented Jun 3, 2025

Do you think skips should be visible already at verbosity>0?

yes, it is already visible right now, there should not be different types of skipped, one being shown, the other not:

image

Skipped status fits quarantined items better than filtered.
Filtered tests are by default removed from reports, which
shouldn't be the case for quarantined tests.
Adjust tests unit and blackbox tests accordingly.

Signed-off-by: Maciej Perkowski <maciej.perkowski@nordicsemi.no>
Copy link

sonarqubecloud bot commented Jun 4, 2025

@kartben kartben merged commit 49595c7 into zephyrproject-rtos:main Jun 5, 2025
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants