[SYCL][Unit Tests] Fix regression after https://github.com/intel/llvm/pull/19687 #20534
This patch fixes a regression introduced by intel#19687: `check-all` lost the entire SYCL-Unit test suite, so SYCL unit tests did not run for the `check-all` target.
sycl/test/Unit/lit.cfg.py
  # testFormat: The test format to use to interpret tests.
- config.test_format = lit.formats.GoogleTest(config.llvm_build_mode, "Tests")
+ config.test_format = lit.formats.GoogleTest(config.llvm_build_mode, test_suffix="Tests_non_preview")
Not sure which is better:
Option 1: add a workaround here, as it is now: we call the existing ctor and then modify the attribute self.test_suffixes.
Option 2: modify llvm/llvm/utils/lit/lit/formats/googletest.py. This would change common LLVM sources, so if the community changes something there, it will break on our end. The modification would add a ctor that takes a list of test suffixes. I don't think proposing the patch to the community is an option, as we need to fix this very quickly.
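Option 1 can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `add_test_suffixes` is a hypothetical helper, and the stand-in object only mimics the `test_suffixes` attribute mentioned above, assuming it is a set of filename suffixes. The `is_windows` flag and the `.exe` handling are also assumptions made for the example.

```python
from types import SimpleNamespace


def add_test_suffixes(test_format, extra_suffixes, is_windows=False):
    """Extend the suffix set of an already-constructed format object.

    Hypothetical helper: assumes `test_format.test_suffixes` is a set of
    filename suffixes, as the comment above implies.
    """
    for suffix in extra_suffixes:
        exe_suffix = suffix + ".exe" if is_windows else suffix
        test_format.test_suffixes.add(exe_suffix)
    return test_format


# Stand-in for the object returned by the existing constructor.
fmt = SimpleNamespace(test_suffixes={"Tests"})
add_test_suffixes(fmt, ["Tests_non_preview"])
print(sorted(fmt.test_suffixes))
```

The drawback named in the comment still applies: the workaround depends on an attribute that upstream is free to rename or restructure.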
I think I prefer a combination of the two: Go with Option 1 for now, then open a change to upstream LLVM with the modifications in Option 2. When it is merged upstream and it reaches intel/llvm we can drop the workaround.
Option 3 also exists: refactor our CMake files so that even for non_preview the test binary name suffix is still Tests.
As pre-commit shows, it looks better to separate non_preview and preview tests by dividing SYCL-Unit into SYCL-Unit-Preview and SYCL-Unit-Non-Preview: the preview and non-preview variations of the same test can run simultaneously, which can cause a data race.
In c8999d6: kept a single test_prefix (Tests). Test names now look ugly, like SchedulerTests_Non_Preview_Tests, but I think this can be temporary. Also disabled the tests that used the same external resources and failed due to data races as a result. These tests should be rewritten. I will create a ticket against the SYCL RT team and add the ticket number to the test sources.
Rather than completely disabling unit tests for testing preview breaking changes, you could do:
config.test_format = lit.formats.GoogleTest(config.llvm_build_mode, "Non_Preview_Tests")
in sycl/test/Unit/lit.cfg.py, so that we run only non-preview unit tests as part of check-sycl/check-all. This won't reduce testing coverage, as we'd still be running the preview version of the SYCL unit tests in intel/llvm CI as part of ninja check-sycl-unittests.
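To illustrate why a longer suffix narrows the selection, here is a small sketch that assumes test binaries are picked by a plain string-suffix match on the file name. `select_binaries` is a hypothetical helper, and `SchedulerTests_Preview_Tests` is an assumed counterpart of the `SchedulerTests_Non_Preview_Tests` name quoted earlier in this thread.

```python
def select_binaries(filenames, suffix):
    """Return the filenames that end with `suffix`."""
    return [name for name in filenames if name.endswith(suffix)]


# The non-preview name comes from this thread; the preview name is an
# assumption used only for illustration.
binaries = ["SchedulerTests_Non_Preview_Tests", "SchedulerTests_Preview_Tests"]

all_tests = select_binaries(binaries, "Tests")
non_preview_only = select_binaries(binaries, "Non_Preview_Tests")
print(all_tests)
print(non_preview_only)
```

Under that assumption, the suffix "Tests" picks up both variants, while "Non_Preview_Tests" picks up only the non-preview binary.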
@uditagarwal97 But what about our feature board? Let's imagine the situation that we need to see the results with preview version of library in it. In that case they won't be there.
> @uditagarwal97 But what about our feature board? Let's imagine the situation that we need to see the results with `preview` version of library in it. In that case they won't be there.
Why would we need such results? Either we can't break ABI meaning that our main goal is to cover regular path with tests, or we can break ABI in which case let's just break it and just have one code path instead of two
> > @uditagarwal97 But what about our feature board? Let's imagine the situation that we need to see the results with `preview` version of library in it. In that case they won't be there.
> Why would we need such results? Either we can't break ABI meaning that our main goal is to cover regular path with tests, or we can break ABI in which case let's just break it and just have one code path instead of two
OK. Let's include only non-preview tests to check-all as part of this specific PR.
Ready to review.
  #include <regex>

- TEST(ConfigTests, CheckConfigProcessing) {
+ TEST(ConfigTests, DISABLED_CheckConfigProcessing) {
Are all the disabled unit tests failing because they set some environment variable that gets unintentionally modified when we run preview and non-preview unit tests in parallel?
IIRC, preview and non-preview unit tests are in different folders, so there shouldn't be any race on the filesystem...
Btw, in CI, when we run check-sycl-unittests, we run both preview and non-preview unit tests and we don't see any failures there.
What's the difference in our setup when we run unit tests via check-sycl-unittests and check-sycl? It's weird that unit tests are failing with one but not the other.
I think the main difference between check-sycl-unittests and check-sycl is that check-sycl-unittests handles the tests at the build-system level, i.e. each test executable is a separate target, and the targets are all run sequentially:
From CI logs:
cmake --build $GITHUB_WORKSPACE/build --target check-sycl-unittests
I don't think that cmake runs more than 1 thread by default.
check-sycl on the other hand is defined as a LIT test suite:
Lines 79 to 85 in e05e82b
add_lit_testsuite(check-sycl-combined-triples "Running device-agnostic SYCL regression tests for all available triples"
  ${CMAKE_CURRENT_BINARY_DIR}
  ARGS ${RT_TEST_ARGS}
  PARAMS "SYCL_TRIPLE=${TRIPLES}"
  DEPENDS ${SYCL_TEST_DEPS}
  ${SYCL_TEST_EXCLUDE}
  )
It is therefore launched by LIT in highly parallel mode by default, where each TEST is a separate LIT test and can run in parallel with other tests.
So yes, our CI simply runs everything sequentially in one thread, hence no races. LIT runs everything in parallel, and I assume the working directory is the same for all tests, so the conf.txt file that we create is shared by more than one test (i.e. its preview and non-preview versions).
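The shared-file hazard described above can be simulated with a short sketch. It is written in Python rather than in the C++ of the actual unit tests, it replays the two writers one after the other instead of truly in parallel, and the helper names and file contents are hypothetical; only the file name conf.txt comes from the comment.

```python
import os
import tempfile


def run_variant(work_dir, variant):
    """Write a variant-specific conf.txt in work_dir and return its path."""
    conf_path = os.path.join(work_dir, "conf.txt")
    with open(conf_path, "w") as conf:
        conf.write(f"variant={variant}\n")
    return conf_path


def read_conf(conf_path):
    """Return the stripped contents of a config file."""
    with open(conf_path) as conf:
        return conf.read().strip()


with tempfile.TemporaryDirectory() as shared:
    # Shared working directory: the second writer clobbers the first file.
    first = run_variant(shared, "non_preview")
    second = run_variant(shared, "preview")
    shared_result = (first == second, read_conf(first))

with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    # One directory per variant: each file keeps its own contents.
    a = run_variant(dir_a, "non_preview")
    b = run_variant(dir_b, "preview")
    isolated_result = (read_conf(a), read_conf(b))

print(shared_result)
print(isolated_result)
```

If the assumption about a common working directory holds, giving each test variant its own directory (or a uniquely named config file) would be one way to let both variants run under LIT without disabling them.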
Makes a lot of sense. Thanks!
@intel/llvm-gatekeepers please consider merging