Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

twister: simulation_only testsuite option #52264

Closed
wants to merge 6 commits into from

Conversation

teburd
Copy link
Collaborator

@teburd teburd commented Nov 15, 2022

Adds an option to inform twister a testsuite should only be built and
run for simulation platforms greatly reducing the build and runtime cost
for libraries and non-hardware specific code that needs to be built and
tested in CI on real hardware.

This is respected even if a platform filter is given for example -p intel_adsp_cavs25.
The only way to have the simulation_only rule be disregarded is with the force platform (-K)
or --all option being passed to twister.

Marks the rtio_api and log_api test suites as a simulation_only to show
how the rules are used.

Adds posix as a simulator type and sets native_posix as a simulation platform.

Signed-off-by: Tom Burdick thomas.burdick@intel.com

hakehuang
hakehuang previously approved these changes Nov 16, 2022
Copy link
Collaborator

@hakehuang hakehuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are many cases can/need be tagged with simulation_only. but this is not an issue for this PR

@teburd
Copy link
Collaborator Author

teburd commented Nov 16, 2022

I believe there are many cases can/need be tagged with simulation_only. but this is not an issue for this PR

Completely agree, and I wanted to start off with what I felt like I knew were obvious ones to try it out with.

@aescolar
Copy link
Member

Hi @teburd ,
Could you motivate a bit more the aim of this change? what is the benefit of doing this compared to using platform_allow in the test yaml and/or a platform filter (-p) on the twister invocation?
(Note: I'm not opposing the change, I'd just like to understand the background)

@nashif
Copy link
Member

nashif commented Nov 17, 2022

agree we need to be able to identify some tests we currently have as:

simulator coverage of those tests is more than enough, no need to bother and run this on hardware, because the subsystem or library is not hardware specific, this is basically a unit test that is hardware agnostic, so do not waste your time build and flashing this on devices, focus on what really matter., etc. etc..

However, this will just hardcode the targets to simulators. We need to be able to run this code on devices as well, and -K while it might work, is a a manual step needed and might have some side effects on other filters in place. So my thought here is to provide the information re being testable using simulators as a hint rather than as a filter and use this for selecting tests, for example if I was to focus on hardware specific tests, I would exclude those tests marked as 'simulation only'.

Building on that, maybe it makes sense to make this an enum, where simulation is just one option, i.e.

Testability:

  • simulation
  • hardware
  • host

@teburd
Copy link
Collaborator Author

teburd commented Nov 17, 2022

Hi @teburd , Could you motivate a bit more the aim of this change? what is the benefit of doing this compared to using platform_allow in the test yaml and/or a platform filter (-p) on the twister invocation? (Note: I'm not opposing the change, I'd just like to understand the background)

The problem is testing on real hardware (or imagine systemc sims of real hardware!) is relatively slow and the tests being selected to run on hardware include things that don't really need to run on hardware.

twister -p someplatform -T tests

Suddenly you are running things like log_api and rtio (as noted here) which really don't need to be run and built on that platform, they are libraries of code that while maybe need to be validated on a particular architecture they don't need to be validated on a particular hardware platform that implements that architecture.

That takes time, costs real dollars and cents, and provides little to no additional value.

@teburd
Copy link
Collaborator Author

teburd commented Nov 17, 2022

agree we need to be able to identify some tests we currently have as:

simulator coverage of those tests is more than enough, no need to bother and run this on hardware, because the subsystem or library is not hardware specific, this is basically a unit test that is hardware agnostic, so do not waste your time build and flashing this on devices, focus on what really matter., etc. etc..

However, this will just hardcode the targets to simulators. We need to be able to run this code on devices as well, and -K while it might work, is a a manual step needed and might have some side effects on other filters in place. So my thought here is to provide the information re being testable using simulators as a hint rather than as a filter and use this for selecting tests, for example if I was to focus on hardware specific tests, I would exclude those tests marked as 'simulation only'.

Building on that, maybe it makes sense to make this an enum, where simulation is just one option, i.e.

Testability:

* simulation

* hardware

* host

I think that makes some sense though to start I'd limit it to simulation/hardware, to me posix is an API level simulation but maybe I'm misinterpreting.

so...

platform_hint: [simulation|hardware]

with a default of hardware

twister --ignore-platform-hint perhaps could be an option then to ignore them rather than -K/--all which as you mentioned have other possible implications.

Thoughts @nashif?

@PerMac
Copy link
Member

PerMac commented Nov 18, 2022

@aescolar Platform_allow is incorrectly used in many places. It is used to limit a CI, but in fact it is a hard filter, blocking execution on anything outside of the list for everyone. It is also hard to maintain. With adding new platform a platform_allow list would have to grow as well. The point is we don't want to fully block, but rather be able to limit whenever not needed.
Using --platform also won't help. We need a variable on a test level telling twister that a given test is well covered by (sim)emulation and there is no need to execute it every time on hw.
Real life example: we have a CI with several nrf platforms connected. We call twister with --platform pointing to those ones. What we see:

  • some tests cannot be executed at all, even though they are runnable on our platform, since they have platform_allow not containing platforms we test
  • some tests are executed even though there is no point in executing them in daily frequency, i.e. the types of tests this PR is addressing. I would like to be able to deselect such tests from a daily scope but e.g. verify them on a weekly bases (even though they shouldn't be hardware dependent, it won't hurt validating the execution on real hw)
    This feature could mitigate both issues.

@aescolar
Copy link
Member

aescolar commented Nov 22, 2022

@teburd , @nashif, @PerMac nice input :)
It would seem that the problem we are trying to solve would indeed benefit from adding something akin to hints that would then be used to decide what tests (and tests x platform combinations to run).
As @PerMac states the platform filters are abused today. Those should just be used to filter out platforms that cannot run a test.

Just as an idea that may help: What about defining the concept of having a default, and an extended set of tests (the extended set would be a superset of the default). Tests, boards on themselves, and tests-platforms combinations would be tagged as being part of the default set or extended. And we would add a command line option in twister to select if only the default set should be run, or the whole extended set.
In that way all tests that would be well enough covered in a given simulation or host platform would be tagged as being normally tested only in that one platform (as a hint), which twister would use by default. If somebody wanted to be extra sure (say on a weekly run, or before a release) they could run the extended set.
When a default hint contradicts the -p option, twister would just ignore the hint and run with the platform selected by -p.

This idea could be extended if so desired by defining also a "reduced" set that would be a subset of the default set.

@nashif
Copy link
Member

nashif commented Nov 22, 2022

Just as an idea that may help: What about defining the concept of having a default, and an extended set of tests (the extended set would be a superset of the default).

@aescolar good idea, and that is in many ways what we are discussing right now, see here for example: #51718

@teburd teburd added the DNM This PR should not be merged (Do Not Merge) label Dec 2, 2022
@teburd
Copy link
Collaborator Author

teburd commented Dec 2, 2022

Marked DNM as I'm taking into account the feedback here.

My plan is to add a platform_hint enum attribute that specifies better at what level in a taxonomy of platforms a test needs to run once to qualify as passing.

The taxonomy of a platforms soc and cpu archs is something that Zephyr has partially done already.

For SoC we have soc, soc family, soc series

For arch we have at a top level the arch name (arm, arm64, x86, x86_64, arc, xtensa, etc)
but this should be further classified.

A platform hint(s) provided by a test then says, given the set of platforms a test could be run on, run the test once per platform at a given taxonomy level.

Prioritization of a particular test platform at a particular taxonomy level could be provided by the board yaml saying as much. e.g. qemu_xtensa may specify taxonomy_priority: arch and take the top spot among all test platforms that provide an xtensa arch platform given a test needs to run once per arch.

Still mulling this over, but it seems to solve even more the specifications of at what taxonomy level a test needs to run at, and reducing test coverage duplication in that way.

Adds an option to inform twister a testsuite should only be built and
run for simulation platforms greatly reducing the build cost for libraries
and non-hardware specific code that needs to be built and tested in CI.

Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
Native posix is a simulation platform and should be labeled as one. This
lets the simulator_only test suite option work correctly on native_posix
as would be expected.

Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
RTIO is a software library that does not require direct hardware
integration and can be built and tested on emulation platforms only
to ensure the code works on all architectures supported by Zephyr.

Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
Mark the log tests as simulation only avoiding building
testing on platforms that aren't simulators.

Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
A test qualifies as a passing test given it builds/runs on each platform
with a unique key given the fields to key on.

Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Logging area: native port Host native arch port (native_posix) area: RTIO area: Twister Twister DNM This PR should not be merged (Do Not Merge)
Projects
Development

Successfully merging this pull request may close these issues.

None yet

6 participants