Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the test setup #260

Closed
fingolfin opened this issue Sep 26, 2015 · 5 comments
Closed

Improve the test setup #260

fingolfin opened this issue Sep 26, 2015 · 5 comments
Assignees
Labels
kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements topic: tests issues or PRs related to tests

Comments

@fingolfin
Copy link
Member

fingolfin commented Sep 26, 2015

This was prompted by some questions on PR #248. I think our test setup can be improved. There are some things with it that bother me; and some ideas for improvements; and I hope others have more ideas.

Speed

My understanding so far was that make testinstall is meant to run fast. It is not clear what that should mean.... testinstall.g claims this: "The test requires about 750MB of memory and runs about one minute on an Intel Core 2 Duo / 2.53 GHz machine."

For me, testinstall runs in slightly more than two minutes on my 2012er MBP, which has a 2.3 GHz Intel Core i7 -- so this is a much faster machine than in the comment I just quoted, yet it takes longer than the promised one minute.

I think staying below one minute on some fixed reference machine is a good idea. The faster these tests are, the easier it is to convince devs to regularly run them. Let's make them faster then!

For starters, make testinstall (attempts to) run the test twice; the second time with "LoadAllPackages()". Can we please get rid of that? For me it always errors out anyway, and the chance for that increases if you have various in-development packages floating around.

I realize that this might be a useful test for Jenkins, but then there should be a second targets testinstall_all_packages or so.

Going beyond that, I'd actually prefer if the tests run in, say, under 10 seconds, but that might be unrealistic. Still, I wonder whether e.g. trans.tst and grppcnrm.tst (the two slowest tests on my system) really "need" to do everything they do? Maybe they could be split into a "quick" version in "testinsall", and a "long" version in "teststandard"?

Memory

We run all tests with -m 100m. Why? I never would run my GAP with so little memory, it just slows things down. Why don't we just use whatever default gap.sh runs with? Surely that default should be "good" for everybody on average, and since it is what most users will end up with by default, we should use it, no? If we are concerned about comparing timings (which we shouldn't), or about reproducibility, we could simply let the tests print the active settings at the start, no?

Names of the test suites

Perhaps rename testinstall to testquick and teststandard to testall? Or even test-quick and test-all, which I find much more readable?

@olexandr-konovalov
Copy link
Member

Thank you for initiating this discussions. Just some details on where the current setup comes from.

The claim about 1 minute runtime was made a while ago, and after that some more tests were added (and even more things changed after that tests were rearranged). So it's not a speed regression. This of course should be updated. 750 MB memory comes from the -o option used in the Makefile - it is less than default, but we want it to be conservative to catch for regressions in memory usage.

I agree that keeping testinstall.g short would help to convince developers to run it more often, so maybe we should not put each quick test into testinstall - otherwise we may end up with a dozen of 5-second tests and get an extra minute. But it would be good at least to ensure that different areas of GAP functionality are exercised in testinstall. The idea with quick and long versions of some tests may be good for this.

I do not think that it is worth to split testinstall into two targets, one of them testinstall_all_packages. If you have made modifications in the core system, just start the tests like gapdev -r -A tst/testinstall.g to avoid loading packages, and this will give you exactly the 1st half of testinstall (except memory options). The -r option is also used in make testinstall so it will ignore your settings. If the version with all packages still fails, you may look first into the test log without packages and then decide yourself whether the problems in the other file are relevant or not.

Formerly, in GAP 4.4.12 testinstall.g was called testall.g, so it was decided not to use testall for the much longer test in GAP 4.5 to avoid confusions. I don't mind renaming now, as 4.4.12 is not around for several years already.

@ChrisJefferson
Copy link
Contributor

I agree having a rearrange seems like a good idea, moving tests (or parts of tests) into teststandard instead of testinstall.

One option for improving matters would be to use parallelisation. This would be easy to do with the IO package and IO_fork, and reasonably easy to do by just chopping the tests up into pieces.

I really don't like looking to see if problems are 'relevant or not'. I think any tests we want people to check when they submit a change should just be "all tests should pass". I realise that is hard with the way GAP tests work right now.

@olexandr-konovalov
Copy link
Member

@ChrisJefferson regarding my suggestion to see if problems are 'relevant or not' - perhaps there is a misunderstanding here.

We do tell people that both versions of make testinstall, with/without packages should pass if they use the selection of packages from the latest official release of GAP. Since GAP 4.5 we have a strict rule NEVER to release GAP if make testinstall and make teststandard do not run cleanly in both configurations.

Of course, one can not ensure that things will run cleanly if one uses a mix of packages including development versions of some packages - then your mileage may vary, and it is the developer to decide whether the problems are serious or not. One of approaches could be to have various clones of the repository for various purposes.

@fingolfin fingolfin added the topic: tests issues or PRs related to tests label Jan 12, 2017
@fingolfin fingolfin added the kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements label May 9, 2017
@olexandr-konovalov olexandr-konovalov self-assigned this May 5, 2018
@olexandr-konovalov
Copy link
Member

olexandr-konovalov commented May 5, 2018

Assigned this to myself. Primarily I want to restore the ability once we had to monitor the runtime of the tests, and start this from GAP 4.9.1 onwards. I will also revise what's left from above.

P.S. Ah, there is also #305 specifically about timings.

@ruthhoffmann ruthhoffmann added the gapdays2020-spring Issues and PRs that arose in relation to https://www.gapdays.de/gapdays2020-spring label Nov 6, 2019
@fingolfin fingolfin removed the gapdays2020-spring Issues and PRs that arose in relation to https://www.gapdays.de/gapdays2020-spring label Mar 30, 2020
@fingolfin
Copy link
Member Author

Most of this is resolved, or covered by other issues:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements topic: tests issues or PRs related to tests
Projects
None yet
Development

No branches or pull requests

4 participants