Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parallel quick tests #7143

Merged
merged 1 commit into from Sep 16, 2018
Merged

Conversation

drwells
Copy link
Member

@drwells drwells commented Sep 3, 2018

In-progress fix for #6589.

There are two separate issues here: I have fixed just one so far.

  1. CMake essentially generates the rule
test:
    ctest quick_tests

so the number of processes we start make (or ninja) with is irrelevant: there is no way AFAICT to pass along the number of available jobs to ctest. The first commit just uses the number of CPUs, which is nearly always what we want (we could divide this by 2 if we wanted to be more conservative).

  1. For reasons I do not fully understand, we cannot do make -j2 test when we link the example programs against deal_II.g (deal_II doesn't work either). Linking against other libraries seems fine. Its irrelevant whether or not we need the libraries; just adding deal_II.g to the link line makes cmake misbehave.

@drwells
Copy link
Member Author

drwells commented Sep 4, 2018

Interestingly, make -j2 test works when compiling just base, but fails the moment we add another sublibrary (lac).

@drwells
Copy link
Member Author

drwells commented Sep 13, 2018

I managed to make a much simpler MWE: https://github.com/drwells/cmake-bug-check. This is probably a bug in CMake since the generated makefiles are not quite right.

The quick tests are currently set up as a custom target which immediately
launches ctest. The difficulty with this is that its not possible (AFAICT) to
get the number of jobs available from make at that point, so there is no way to
run these in parallel without giving an explicit number of jobs to ctest.

This PR sets the number of jobs to the number of CPUs on the machine: while
hardcoding this is not ideal, this is the right usage pattern for a build script
like

    make -j20
    make test

where all processors on the machine should be available.
@drwells
Copy link
Member Author

drwells commented Sep 13, 2018

I recommend that we go ahead with the current version of this PR that parallelizes the number of tests run at once. I do not like hardcoding the parallelization scheme but I don't see any other way to get this, and it is almost always what we want to do anyway.

@tjhei
Copy link
Member

tjhei commented Sep 14, 2018

I don't quite understand, we have a simple rule that the Makefile target test will run ctest quick_tests. Why does it matter if you run the make command with -j?

Is make somehow passing the -j flag down so that make that is run from inside ctest gets them too?

Is this problem related to the fact that we are overwriting the "test" target, which is discouraged in cmake?

@tjhei
Copy link
Member

tjhei commented Sep 14, 2018

and it is almost always what we want to do anyway.

Except when we run out of memory this way. I don't mind too much that make test is slow, to be honest.

@drwells
Copy link
Member Author

drwells commented Sep 14, 2018

Why does it matter if you run the make command with -j?

I don't know: I am pretty sure that this is a bug in CMake.

Renaming the target to check doesn't seem to help (nor does renaming it to foobar), so I don't think it is related (though naming it test is bad for other reasons).

I do not believe that flag is propagated in any way to ctest.

Except when we run out of memory this way. I don't mind too much that make test is slow, to be honest.

Perhaps we could have two such targets, and one runs in parallel? I have only ever wanted to run it in parallel, personally.

@tjhei
Copy link
Member

tjhei commented Sep 14, 2018

Perhaps we could have two such targets, and one runs in parallel? I have only ever wanted to run it in parallel, personally.

I don't care too much about it, I just thought I would point out that waiting a few more minutes vs OOM might be better for an inexperienced user.

@tjhei tjhei merged commit 5360be7 into dealii:master Sep 16, 2018
@drwells
Copy link
Member Author

drwells commented Sep 16, 2018

I went ahead and checked the memory usage and I think its okay. Its difficult to profile these things, but if I compile and run every test at once on my machine (after clearing all OS caches) then I use about 5 GB of RAM: I think any machine with at least 22 virtual cores has that much memory available. If we wanted to be conservative we could divide the number of jobs in 2, but I don't think its necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants