Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul test integration in build system #7851

Merged
merged 21 commits into from
Aug 29, 2021
Merged

Conversation

mqudsi
Copy link
Contributor

@mqudsi mqudsi commented Mar 22, 2021

This is my second attempt at overhauling our varied test suites and unifying them under the ctest umbrella. Due to the limitations of the way ctest was hacked on to CMake, test remains a separate custom target, but nevertheless utilizes ctest under the hood.

The biggest change is that each individual test is run in a clean environment, created just for that test. And "test" here no longer means one of three (low level, pexpect, little check) but rather each test run by those test drivers. (Never again will you have to account for an alias that was added but not removed in a previous test).

This change also made it possible to run the tests in parallel (well, mostly), which lets me run the entire fish test suite, including the overhead of creating a new environment for each test, in 13 seconds. Tests continue to be run sequentially under CI.

An additional improvement is the addition of top-level targets for each test, meaning if test foo failed and you would like to retry just that test, ninja -C build test_foo will retry that test (regardless of which test suite it belongs to).

That said, the ability to run tests in parallel has shown some tests that are designed with global system state in mind - they need to be changed in order to pass when executed in parallel (or else can be forced to execute serially, but I think it's a good idea to clean up the test suite to make everything less brittle) - this applies mainly to jobs.fish. The history merge test suite has issues that I believe might indicate a bug in either the creation of a new environment or what constitutes a unique fish session, the history race condition test continues to be extremely flaky and reports items dropped from history when the system is under heavy load when the test is executed - at this point, I suspect an actual issue. Edit: later commits have changed the serialization level for history tests to prevent multiple history tests being run (serially) in one process (inadvertently in the case of this fork, by design in master); this seems to have resolved their flakiness.

The actual changes are fairly free of any hacks with one notable exception: cmake/ctest doesn't have a "discover tests" stage so as part of the "configure" step, a bastardized version of fish_tests is built (without any dependencies on fish itself) and invoked to emit a list of tests - the alternative to this is for each low level test to be explicitly named in the CMake files (ugly) or each low level test to be moved to a separate source file (cringe).

@mqudsi mqudsi force-pushed the ctest branch 2 times, most recently from c5d6f7c to 1a56762 Compare March 23, 2021 06:00
@mqudsi mqudsi changed the title Improve correctness, ergonomics, and performance of fish tests Overhaul test integration in build system Mar 23, 2021
@mqudsi mqudsi force-pushed the ctest branch 4 times, most recently from c79e655 to 5d67be5 Compare March 24, 2021 01:43
@ridiculousfish
Copy link
Member

ridiculousfish commented Mar 24, 2021

Awesome! Thanks for tackling this! I'm very very happy to have real isolated tests and also the ability to run individual tests.

@mqudsi can you say a bit about why you chose CTest? I don't know much about it - is it actually good, or just convenient? I see the configure-time test discovery step, I worry about that since it means you have to reconfigure to pick up new tests.

@mqudsi mqudsi force-pushed the ctest branch 3 times, most recently from c1b9c35 to 1d21329 Compare March 24, 2021 02:13
@mqudsi
Copy link
Contributor Author

mqudsi commented Mar 25, 2021

The primary reason for going with ctest is to avoid any additional dependencies or further complications to our build system (I feel like they're all ugly, so might as well stick with what we have).

CTest has integration for Google Test if we ever want to formally adopt a framework. Looking through the code for that, I have an idea for how we can avoid the configure-time discovery for the low-level tests suite that I might try my hand at (all other tests discovered via globbing do not require a manual reconfigure to be picked up, as the addition of any files matching the glob pattern for the little check and pexpect tests will automatically trigger a CMake run).

On a more positive note, as of last night all the tests are passing. The new test driver was fine, but some of the tests had shared state via the ../test/temp folder that I patched out, and jobs.fish was further isolated to prevent asserting anything about the global state of pre-existing or simultaneously created zombies in the session. macOS runs the tests in a different order, which threw me for a loop because I thought it was an OS-specific issue that was causing them to fail when it turned out to be state (or lack thereof) peristing from a previous test.

Copy link
Member

@krobelus krobelus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running individual littlecheck and pexpect tests is a great improvement.
With fish_tests that already works fairly well, so if test discovery causes problems we could get by without that. Though a better integration is surely nice here as well.

A detail: the individual tests are named just by the basename, like abbr.fish. I'd prefer something like test/abbr.fish so they are all grouped. I noticed this in the tab-completion results of ninja that are sorted alphabetically by fish, which makes sense since ninja -t targets does not give a consistent order.

# dynamically discover tests, our options are either this or resorting to sed/awk to parse the low
# level tests source file to get the list of individual tests. Or to split each test into its own
# source file.
execute_process(
Copy link
Member

@krobelus krobelus Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get this on linux:

/usr/bin/ld: /tmp/ccPMN3np.o: warning: relocation against `_ZN10features_t15global_featuresE' in read-only section `.text._Z21mutable_fish_featuresv[_Z21mutable_fish_featuresv]'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
./fish_tests_list: error while loading shared libraries: unexpected PLT reloc type 0x00

A quick fix would be to use conditional compilation:

#ifdef FISH_LIST_LOW_LEVEL_TESTS
int main(int argc, char **argv) {
      list_tests();
 }
# else
// normal fish_tests.cpp goes here
#endif

Another idea is to keep a separate file like

TEST("utility_functions", "test_utility")
TEST("wcstring_tok, "test_wcstring_tok")

Define TEST as macro and include it from fish_tests.cpp, and the build system. One advantage of the current setup is that we get nicer stacktraces, that point to failing test cases, though that's not really a problem.
However, I don't think that's worth the split to a separate file (unless it greatly improves on test discovery).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I developed this branch on Linux, compiling with gcc but linking with lldb. What’s your toolchain?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux + gcc (same with clang) + GNU ld

tests/test_driver.sh Show resolved Hide resolved
tests/checks/jobs.fish Outdated Show resolved Hide resolved
@krobelus
Copy link
Member

I noticed is that "unsupported" littlecheck tests are listed as "passed" with ctest while littlecheck would print

image

Of course that's a really minor detail since we only have like 2 tests with REQUIRES.

@mqudsi
Copy link
Contributor Author

mqudsi commented Mar 26, 2021

I noticed is that "unsupported" littlecheck tests are listed as "passed" with ctest while littlecheck would print

image

Of course that's a really minor detail since we only have like 2 tests with REQUIRES.

If little check returns a specific error code, we can program that to be a “skipped” marker and have ctest act accordingly. I’m guessing it’s just returning 0 here.

@krobelus
Copy link
Member

If little check returns a specific error code, we can program that to be a “skipped” marker

Ok, let's use exit code 125 since git bisect uses the same? Anything other than 0 or 1 should work. We can change littlecheck's behavior here since the REQUIRE syntax is fairly new.

@faho faho marked this pull request as draft March 27, 2021 14:45
@faho
Copy link
Member

faho commented Mar 27, 2021

The "wip (do not merge)" label seems superfluous, that's what PR drafts are for - they block merging in the UI, so I've taken the liberty of converting this to one.

@mqudsi
Copy link
Contributor Author

mqudsi commented Mar 28, 2021

A detail: the individual tests are named just by the basename, like abbr.fish. I'd prefer something like test/abbr.fish so they are all grouped. I noticed this in the tab-completion results of ninja that are sorted alphabetically by fish, which makes sense since ninja -t targets does not give a consistent order.

Unfortunately, CMake target names may not contain a slash:

CMake Error at cmake/Tests.cmake:101 (add_custom_target):
  add_custom_target called with invalid target name "tests/wait.py".  Target
  names may not contain a slash.  Use ADD_CUSTOM_COMMAND to generate files.
Call Stack (most recent call first):
  cmake/Tests.cmake:155 (add_test_target)
  CMakeLists.txt:207 (include)

-Wl,-undefined,dynamic_lookup,--unresolved-symbols=ignore-all
WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
)
execute_process(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not try to make each low-level test a separate CTest test. This adds a lot of complexity at configuration time, and it is easy to run individual low-level tests with the fish_tests binary itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hack has been removed and CMake now uses text preprocessing combined with a no-op macro in fish_tests.cpp to detect low-level tests. Complexity is considerably reduced and imho now at an acceptable level. I am strongly in favor of making this new system holistic, and believe that all test atomics should be runnable via a unified harness/interface.

tests/test_driver.sh Show resolved Hide resolved
tests/test_driver.sh Outdated Show resolved Hide resolved
@krobelus
Copy link
Member

Unfortunately, CMake target names may not contain a slash:

Okay, then maybe test-wait.py as the next best thing?

@ridiculousfish
Copy link
Member

@mqudsi do you plan to pick this up again near term? If not I'd be eager to take it over, I'm very excited for this improvement.

@mqudsi
Copy link
Contributor Author

mqudsi commented Jul 30, 2021

@ridiculousfish, thanks for the ping. I needed some time away from things and when I started dipping my toes back in the water I was too worried about how long I'd abandoned this and how out-of-date it must be by now, which wasn't exactly the kind of motivation I needed. Your enthusiasm, however, is quite inspiring. Let me blow the dust off this and see what the deal is; it was already very close to ready iirc, so hopefully it won't be that bad.

@mqudsi mqudsi force-pushed the ctest branch 2 times, most recently from 4c13b36 to f8f1630 Compare July 31, 2021 01:09
It depends on `mktemp -d TAG` returning a relative path, which isn't guaranteed to be the case (and
isn't the case when run by our test driver).
The default matching logic for fish_tests was prefix based, so when we
were running `history` we were also running all history tests. This
causes the test to fail for an unknown reason.
test_driver.sh is guaranteed to take care of them.
Aside from the fact that the shared state could cause problems, tests
were randomly assuming it would be created where that wasn't the case.
In particular, `redirect.fish` and `basic.fish` were failing on only
macOS because `../test/temp` didn't exist yet - it would be created by
other tests later.
Instead of trying to assert that there are no zombies when the test
starts (which often fails) and to prevent conflating existing or
irrelevant zombies with the ones we are interested in checking for,
have `ps` also emit the parent process id and filter its output to
include only children of the current fish instance.
This is in preparation for adding skipped test support to our ctest
integration.
This prevents tests that were skipped (e.g. because of a missing
REQUIRES) from being reported as successes in the CTest overall run
results list.
`test:foo` is not allowed by CMake ("reserved name") and `test/foo`
won't work since CMake doesn't allow targets to have a directory
separator in their name.
Tests are now executed in a test-specific temporary directory, so test
output on failure should be reproducible/reusable as-is without needing
to have TMPDIR defined (as it only exists by default under macOS).
If called within a temporary directory that had an X in the path, it
would fail. This caused sporadic CI test failures.
Copy link
Member

@krobelus krobelus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay - this looks awesome, let's merge.

I was confused for a bit because ninja test_basic.fish would take 4 seconds but that's just because I had BUILD_DOCS=ON which runs sphinx

tests/checks/features-qmark1.fish Outdated Show resolved Hide resolved
tests/checks/basic.fish Outdated Show resolved Hide resolved
This was requested by a team member who would like for some tests to
remain invokable (in thier own $HOME) directly via littlecheck without
relying on the test driver to prep the environment.

A comment explaining the rationale is also added so this doesn't get
passed down as folklore "you need to include this for tests to run" even
though no one understands why.
On request of a team member, this patches `basic.fish` to no longer
depend on being invoked by the test driver and started up in a $PWD that
points to a clean temporary directory.
@mqudsi mqudsi merged commit 069d396 into fish-shell:master Aug 29, 2021
@mqudsi
Copy link
Contributor Author

mqudsi commented Aug 29, 2021

🎉

thank you, all.

@zanchey
Copy link
Member

zanchey commented Aug 31, 2021

There are two bits of fallout in the package builders from this merge, neither of which are showstoppers.

It looks like the overrides from 4c2a0de have been moved into the test driver, which means the low-level tests go back to running in the user's home directory. This is a bit of an issue on Debian pbuilders which use a non-existent HOME - see d518b01 as well.

Second, old CMake versions call skipped tests "failed". I can't see that this was changed explicitly - it's broken in 3.7 but working in 3.10. This only affects the package builds on LTS version of Debian, so I'll try and chase it down at some stage but it's not a blocker by any means.

@zanchey
Copy link
Member

zanchey commented Aug 31, 2021

it's broken in 3.7

[ 327s]
[ 327s] The following tests FAILED:
[ 327s] 130 - git.fish (Not Run)
[ 327s] 184 - tmux-bind.fish (Not Run)
[ 327s] 185 - tmux-complete.fish (Not Run)
[ 327s] 186 - tmux-prompt.fish (Not Run)
[ 327s] Errors while running CTest

but working in 3.10

The following tests did not run:
130 - git.fish (Skipped)
184 - tmux-bind.fish (Skipped)
185 - tmux-complete.fish (Skipped)
186 - tmux-prompt.fish (Skipped)

@faho
Copy link
Member

faho commented Aug 31, 2021

Second, old CMake versions call skipped tests "failed". I can't see that this was changed explicitly - it's broken in 3.7 but working in 3.10

Lovely. The alternative here would be to "pass" these tests on our end, maybe even just if cmake < 3.10. It's not like skipped tests are super visible anyway.

@mqudsi
Copy link
Contributor Author

mqudsi commented Aug 31, 2021

It looks like the overrides from 4c2a0de have been moved into the test driver, which means the low-level tests go back to running in the user's home directory.

The overrides have indeed been moved into the test driver but at the same time the low-level tests are now executed via the test driver. Perhaps I missed a target in the CMake file and they're being executed both independent of the test driver and then again under it?

Second, old CMake versions call skipped tests "failed". I can't see that this was changed explicitly - it's broken in 3.7 but working in 3.10

I thought maybe the feature to support skipping a test based on the return code was only introduced post CMake 3.7 but it seems to have been supported since at least version 3.0! (Side note: I can't believing viewing the documentation for a CMake API/feature in the latest documentation doesn't include a "introduced in CMake version xxx" or something). Maybe the API for setting the skip code changed or something?

@zanchey
Copy link
Member

zanchey commented Sep 1, 2021

The overrides have indeed been moved into the test driver but at the same time the low-level tests are now executed via the test driver

No, the low-tests are run directly - the test driver always invokes fish first.

foreach(LTEST ${LOW_LEVEL_TESTS})
  add_test(
    NAME ${LTEST}
    COMMAND ${CMAKE_BINARY_DIR}/fish_tests ${LTEST}
    WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  )
  set_tests_properties(${LTEST} PROPERTIES SKIP_RETURN_CODE ${SKIP_RETURN_CODE})
  add_test_target("${LTEST}")
endforeach(LTEST)

I thought maybe the feature to support skipping a test based on the return code was only introduced post CMake 3.7 but it seems to have been supported since at least version 3.0!

They were counted as failures prior to 3.9 (see https://gitlab.kitware.com/cmake/cmake/-/issues/16822).

Side note: I can't believing viewing the documentation for a CMake API/feature in the latest documentation doesn't include a "introduced in CMake version xxx" or something

Heh. fish's documentation isn't great in that sense either.

@mqudsi
Copy link
Contributor Author

mqudsi commented Sep 7, 2021

@zanchey can you please try the ctest_redux branch from mqudsi/fish-shell? It should fix the debian problem.
https://github.com/mqudsi/fish-shell/tree/ctest_redux

@zanchey
Copy link
Member

zanchey commented Sep 12, 2021

@mqudsi, it works great!

@zanchey
Copy link
Member

zanchey commented Sep 14, 2021

Thanks for merging that branch in 93aaa18. Unfortunately that uncovered a new issue in Ubuntu Xenial, which uses CMake 3.5 (which I didn't test until after the merge). The test target generate in 3.5 doesn't depend on fish_tests! Anyway I think I'll just hack around that in the package build.

zanchey added a commit that referenced this pull request Sep 14, 2021
CMake 3.5 (shipped in Ubuntu Xenial) doesn't generate the test target
with appropriate dependencies. Build them in dh_auto_build; it's too
hard to convince any of the other steps to do it.

See #7851.
@mqudsi
Copy link
Contributor Author

mqudsi commented Sep 14, 2021

You're most welcome. I saw your patch - that's certainly one way to work around it!

@zanchey zanchey added this to the fish 3.4.0 milestone Dec 26, 2021
@mqudsi mqudsi deleted the ctest branch September 26, 2022 18:10
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants