Improve the test setup #260

fingolfin · 2015-09-26T19:50:35Z

This was prompted by some questions on PR #248. I think our test setup can be improved. There are some things with it that bother me; and some ideas for improvements; and I hope others have more ideas.

Speed

My understanding so far was that make testinstall is meant to run fast. It is not clear what that should mean.... testinstall.g claims this: "The test requires about 750MB of memory and runs about one minute on an Intel Core 2 Duo / 2.53 GHz machine."

For me, testinstall runs in slightly more than two minutes on my 2012er MBP, which has a 2.3 GHz Intel Core i7 -- so this is a much faster machine than in the comment I just quoted, yet it takes longer than the promised one minute.

I think staying below one minute on some fixed reference machine is a good idea. The faster these tests are, the easier it is to convince devs to regularly run them. Let's make them faster then!

For starters, make testinstall (attempts to) run the test twice; the second time with "LoadAllPackages()". Can we please get rid of that? For me it always errors out anyway, and the chance for that increases if you have various in-development packages floating around.

I realize that this might be a useful test for Jenkins, but then there should be a second targets testinstall_all_packages or so.

Going beyond that, I'd actually prefer if the tests run in, say, under 10 seconds, but that might be unrealistic. Still, I wonder whether e.g. trans.tst and grppcnrm.tst (the two slowest tests on my system) really "need" to do everything they do? Maybe they could be split into a "quick" version in "testinsall", and a "long" version in "teststandard"?

Memory

We run all tests with -m 100m. Why? I never would run my GAP with so little memory, it just slows things down. Why don't we just use whatever default gap.sh runs with? Surely that default should be "good" for everybody on average, and since it is what most users will end up with by default, we should use it, no? If we are concerned about comparing timings (which we shouldn't), or about reproducibility, we could simply let the tests print the active settings at the start, no?

Names of the test suites

Perhaps rename testinstall to testquick and teststandard to testall? Or even test-quick and test-all, which I find much more readable?

The text was updated successfully, but these errors were encountered:

olexandr-konovalov · 2015-09-26T20:36:04Z

Thank you for initiating this discussions. Just some details on where the current setup comes from.

The claim about 1 minute runtime was made a while ago, and after that some more tests were added (and even more things changed after that tests were rearranged). So it's not a speed regression. This of course should be updated. 750 MB memory comes from the -o option used in the Makefile - it is less than default, but we want it to be conservative to catch for regressions in memory usage.

I agree that keeping testinstall.g short would help to convince developers to run it more often, so maybe we should not put each quick test into testinstall - otherwise we may end up with a dozen of 5-second tests and get an extra minute. But it would be good at least to ensure that different areas of GAP functionality are exercised in testinstall. The idea with quick and long versions of some tests may be good for this.

I do not think that it is worth to split testinstall into two targets, one of them testinstall_all_packages. If you have made modifications in the core system, just start the tests like gapdev -r -A tst/testinstall.g to avoid loading packages, and this will give you exactly the 1st half of testinstall (except memory options). The -r option is also used in make testinstall so it will ignore your settings. If the version with all packages still fails, you may look first into the test log without packages and then decide yourself whether the problems in the other file are relevant or not.

Formerly, in GAP 4.4.12 testinstall.g was called testall.g, so it was decided not to use testall for the much longer test in GAP 4.5 to avoid confusions. I don't mind renaming now, as 4.4.12 is not around for several years already.

ChrisJefferson · 2015-09-26T22:38:25Z

I agree having a rearrange seems like a good idea, moving tests (or parts of tests) into teststandard instead of testinstall.

One option for improving matters would be to use parallelisation. This would be easy to do with the IO package and IO_fork, and reasonably easy to do by just chopping the tests up into pieces.

I really don't like looking to see if problems are 'relevant or not'. I think any tests we want people to check when they submit a change should just be "all tests should pass". I realise that is hard with the way GAP tests work right now.

olexandr-konovalov · 2015-09-26T23:45:20Z

@ChrisJefferson regarding my suggestion to see if problems are 'relevant or not' - perhaps there is a misunderstanding here.

We do tell people that both versions of make testinstall, with/without packages should pass if they use the selection of packages from the latest official release of GAP. Since GAP 4.5 we have a strict rule NEVER to release GAP if make testinstall and make teststandard do not run cleanly in both configurations.

Of course, one can not ensure that things will run cleanly if one uses a mix of packages including development versions of some packages - then your mileage may vary, and it is the developer to decide whether the problems are serious or not. One of approaches could be to have various clones of the repository for various purposes.

olexandr-konovalov · 2018-05-05T11:28:42Z

Assigned this to myself. Primarily I want to restore the ability once we had to monitor the runtime of the tests, and start this from GAP 4.9.1 onwards. I will also revise what's left from above.

P.S. Ah, there is also #305 specifically about timings.

fingolfin · 2024-01-09T23:40:11Z

Most of this is resolved, or covered by other issues:

testinstall is fast enough right now on my M1 MBP (of course YMMV), again
make testinstall and friends are gone, no more running testinstall twice
we don't use -m 100 or other memory limits for running the test suites
renaming the tests is tracked also in Rename "testinstall" and "teststandard" to something more clear #544

fingolfin mentioned this issue Oct 28, 2015

Improve coverage of the test suite #314

Closed

fingolfin added the topic: tests issues or PRs related to tests label Jan 12, 2017

fingolfin added the kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements label May 9, 2017

fingolfin mentioned this issue Apr 22, 2018

Regression in testextra/grpauto.tst (memory usage) #2377

Closed

olexandr-konovalov self-assigned this May 5, 2018

fingolfin mentioned this issue Aug 8, 2018

Add NormalSubgroups methods for symmetric and alternating permutation groups #2684

Merged

ruthhoffmann added the gapdays2020-spring Issues and PRs that arose in relation to https://www.gapdays.de/gapdays2020-spring label Nov 6, 2019

fingolfin removed the gapdays2020-spring Issues and PRs that arose in relation to https://www.gapdays.de/gapdays2020-spring label Mar 30, 2020

fingolfin closed this as completed Jan 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve the test setup #260

Improve the test setup #260

fingolfin commented Sep 26, 2015 •

edited by wilfwilson

Loading

olexandr-konovalov commented Sep 26, 2015

ChrisJefferson commented Sep 26, 2015

olexandr-konovalov commented Sep 26, 2015

olexandr-konovalov commented May 5, 2018 •

edited

Loading

fingolfin commented Jan 9, 2024

Improve the test setup #260

Improve the test setup #260

Comments

fingolfin commented Sep 26, 2015 • edited by wilfwilson Loading

Speed

Memory

Names of the test suites

olexandr-konovalov commented Sep 26, 2015

ChrisJefferson commented Sep 26, 2015

olexandr-konovalov commented Sep 26, 2015

olexandr-konovalov commented May 5, 2018 • edited Loading

fingolfin commented Jan 9, 2024

fingolfin commented Sep 26, 2015 •

edited by wilfwilson

Loading

olexandr-konovalov commented May 5, 2018 •

edited

Loading