Psi4b5 ictce #433

Closed
wants to merge 1 commit into from

wpoely86
Member

No description provided.

@wpoely86
Member Author

I can't run the test suite on the MPI version. It builds without issue, but I don't have an MPI setup to run all the tests.

@boegel
Member

boegel commented Sep 13, 2013

Can you specify how to run the test suite?
Can't you just run the test suite locally on a node, or is use of multiple nodes enforced?
You have access to the HPC-UGent infrastructure, no? ;-)

@wpoely86
Member Author

EasyBuild runs the test suite for you.

I can run it on an HPC node, but it uses ictce-5.5.0, which is not currently on the HPC, so it will take some time.

Anyway, we want the non-MPI version ;-)

I shall put things in motion on a raichu node.

@boegel
Member

boegel commented Sep 13, 2013

@wpoely86: I'm installing ictce/5.5.0 as a module as we speak on raichu, so don't bother installing it locally. :-)

@boegel
Member

boegel commented Sep 13, 2013

mt build works like a charm

mpi build seems to be hanging during tests:

vsc40003 104880  0.0  0.0   9192  1288 ?        S    11:17   0:00          \_ /bin/bash /user/home/gent/vsc400/vsc40003/easybuild_easyinstalled/bin/eb /vscmnt/gent_vulpix/_/user/home/gent/vsc400/vsc40003/PSI-4.0b5-ictce-5.5.0.eb --debug --logtostdout
vsc40003 104883  0.0  0.0 147460 23340 ?        S    11:17   0:01              \_ python /user/home/gent/vsc400/vsc40003/easybuild_easyinstalled/lib/python2.6/site-packages/easybuild_framework-1.8.0dev-py2.6.egg/easybuild/main.py /vscmnt/gent_vulpix/_/user/home/gent/vsc400/vsc40003/PSI-4.0b5-ictce-5.5.0.eb --debug --logtostdout
vsc40003  92941  0.0  0.0   4372  1044 ?        S    11:38   0:00                  \_ make tests TESTFLAGS=-u -q
vsc40003  92942  0.0  0.0   9344  1328 ?        S    11:38   0:00                      \_ /bin/sh -c (cd tests; echo Running test suite...; make) || exit 1;
vsc40003  92943  0.0  0.0   9348   888 ?        S    11:38   0:00                          \_ /bin/sh -c (cd tests; echo Running test suite...; make) || exit 1;
vsc40003  92944  0.0  0.0   4640  1344 ?        S    11:38   0:00                              \_ make
vsc40003 102345  0.0  0.0   9344  1352 ?        S    12:12   0:00                                  \_ /bin/sh -c make -C mpn-bh; true
vsc40003 102346  0.0  0.0   4372  1056 ?        S    12:12   0:00                                      \_ make -C mpn-bh
vsc40003 102355  0.0  0.0  25888  2348 ?        S    12:12   0:00                                          \_ perl ../runtest.pl /tmp/easy/PSI/4.0b5/ictce-5.5.0/psi4.0b5/tests/mpn-bh/input.dat mpn-bh.test false
vsc40003 102356  0.1  0.1 1606824 105496 ?      Sl   12:12   0:12                                              \_ ../../bin/psi4 /tmp/easy/PSI/4.0b5/ictce-5.5.0/psi4.0b5/tests/mpn-bh/input.dat output.dat

Have you ever seen this? I'll restart the build to see whether it's consistent.
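A hang like the one in the process listing above only shows up when someone checks on the job by hand. As a minimal sketch (not something EasyBuild or the PSI4 test suite provides), a test command could be wrapped in a timeout so that a hang is reported as such. The helper name `run_with_timeout` and the timeout values are assumptions for illustration, and the code targets Python 3 rather than the Python 2.6 visible in the listing.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns (status, returncode), where status is 'ok', 'failed' or
    'timeout'. Output is not captured, so it goes to the caller's
    stdout/stderr as usual.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the direct child before raising.
        return ("timeout", None)
    status = "ok" if proc.returncode == 0 else "failed"
    return (status, proc.returncode)


if __name__ == "__main__":
    # Trivial demo command; a real run would pass the test command instead.
    print(run_with_timeout([sys.executable, "-c", "pass"], 10))
```

Note that only the direct child is killed on timeout. With a chain like `make` > `perl` > `psi4`, the grandchildren may survive, so a real wrapper would probably need to start the command in its own process group and signal the whole group.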

@wpoely86
Member Author

I'm seeing the same thing. It hangs.

I will take a closer look when I can.

We also found a couple of bugs in the mt version when you create a plugin (incorrect paths and stuff). Patches are on the way ;-)

@boegel
Member

boegel commented Sep 13, 2013

Maybe it's worthwhile to pull out the mpi easyconfig in a separate PR, so the -mt version can be merged in?

@wpoely86
Member Author

Yeah, good idea. But the -mt version is not yet ready to merge either. I will send a new pull request when it is.

@wpoely86 wpoely86 closed this Sep 13, 2013
@boegel
Member

boegel commented Sep 13, 2013

Why is the -mt version not ready? Because it requires an (extra) patch file to fix the plugin issues?

@wpoely86
Member Author

When you run psi4 --new-plugin, it also generates a makefile for you, but the paths in there are incorrect because of a MakeVars file that still points to the original build directory.

Shall I create 2 separate branches and pull requests, one for the -mt and one for the mpi version?
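To illustrate the kind of fix the plugin issue needs, here is a minimal sketch that finds lines still referencing the build directory and rewrites them to point at the install directory. This is not the actual patch; the function names, the sample variable `SRC`, and the example directories are all hypothetical.

```python
def stale_lines(text, build_dir):
    """Return 1-based line numbers in text that still mention build_dir."""
    build_dir = build_dir.rstrip("/")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if build_dir in line
    ]


def relocate_paths(text, build_dir, install_dir):
    """Replace every occurrence of build_dir in text with install_dir."""
    # Strip trailing slashes so '/a/b/' and '/a/b' are treated alike.
    build_dir = build_dir.rstrip("/")
    install_dir = install_dir.rstrip("/")
    return text.replace(build_dir, install_dir)


if __name__ == "__main__":
    # Hypothetical file content, not copied from a real MakeVars.
    sample = "SRC = /tmp/build/psi4/src\nCXX = icpc\n"
    print(stale_lines(sample, "/tmp/build/psi4"))
    print(relocate_paths(sample, "/tmp/build/psi4", "/opt/psi4"), end="")
```

A plain string replacement is deliberately blunt: it would also rewrite a path that merely starts with the same prefix, so a real patch should be checked against the generated file.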

@boegel
Member

boegel commented Sep 13, 2013

Sounds good. Just open a separate PR for the mpi easyconfig that is still causing problems, and include any mpi specific patch files in there.

@wpoely86
Member Author

@boegel The mpi tests finished after 3.5h on raichu with all tests passing. Are you still getting hangs?

@boegel
Member

boegel commented Sep 13, 2013

I've resubmitted the job, will come back to you on this...

@boegel
Member

boegel commented Sep 13, 2013

It's hanging for me on a delcatty node... Weird. Will retry on raichu.

@boegel
Member

boegel commented Sep 14, 2013

Build has been running for over 20 hours on a raichu node now, hanging again...

Does it work consistently for you?

@wpoely86
Member Author

I've run it again on a delcatty node and everything completes successfully.

On my personal machine, it also hangs. Does it hang consistently on the same test for you?

@boegel
Member

boegel commented Sep 16, 2013

Yes, I remember it being the same test every time (but I don't remember which test :-/). Where is it hanging for you?

It seems like this may be triggered by differences in the environment in which the tests are run, since it is clearly not tied to a particular system.
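To avoid having to remember which test was running, the test name can be pulled out of a process listing like the one posted earlier in this thread. The sketch below assumes the `.../tests/<name>/input.dat` path layout seen in that listing; the function name `running_tests` is made up for illustration.

```python
import re

# Matches the test directory name in paths such as
# /tmp/easy/PSI/4.0b5/ictce-5.5.0/psi4.0b5/tests/mpn-bh/input.dat
TEST_INPUT_RE = re.compile(r"/tests/([^/\s]+)/input\.dat")


def running_tests(ps_output):
    """Return test names found in ps output, in order, without duplicates."""
    names = []
    for line in ps_output.splitlines():
        match = TEST_INPUT_RE.search(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names
```

Feeding it the output of something like `ps axuf` for the build user while the job appears stuck should yield the name of the test that is currently running.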

@wpoely86
Member Author

On my machine, it hangs on mpn-bh.

Do you use a different build environment on the HPC than regular users do? We shouldn't see any difference on the HPC between your builds and mine.

@boegel
Member

boegel commented Sep 16, 2013

I fiddle a lot with my environment to test stuff, so it's not standard.
But problems like this can be triggered by defining an extra environment variable (any variable), which changes the environment size. That alone may or may not be sufficient to trigger the problem...
Let me retry to see where it gets stuck for me.
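One way to compare two build environments is to dump both as dictionaries and look at their size and their differences. This is a rough sketch under assumptions: the helper names are made up, and the size is only an approximation of the bytes taken up by the `NAME=value` strings (it ignores the pointer array the kernel also stores).

```python
import os


def env_size(env):
    """Approximate bytes used by the 'NAME=value' strings of env.

    Each entry counts its name, its value, one '=' and one NUL terminator.
    """
    return sum(
        len(os.fsencode(name)) + len(os.fsencode(value)) + 2
        for name, value in env.items()
    )


def env_diff(a, b):
    """Return (only_in_a, only_in_b, changed) as sorted lists of names."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, changed


if __name__ == "__main__":
    print("current environment size:", env_size(dict(os.environ)))
```

Running this in both accounts just before the build starts, and comparing the results, would show whether the environments actually differ in size or content.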

@boegel
Member

boegel commented Sep 16, 2013

Actually, it's still hanging... It's stuck in rasci-h2o.

@wpoely86 wpoely86 mentioned this pull request Oct 6, 2013
@wpoely86 wpoely86 deleted the psi4b5-ictce branch November 11, 2013 19:59