Unable to pass parallel make tests #1
Comments
Hi @scottneuhoff, thank you for your interest in trying our async VOL connector and for providing feedback. I have updated the README with the correct path for the installed libraries as you suggested. The errors you are seeing with "./async_test_serial_event_set_error_stack.exe" are likely caused by the HDF5_VOL_CONNECTOR environment variable not being set properly, so the test program is not using the async connector. Can you run "echo $HDF5_VOL_CONNECTOR" before your run and make sure it prints "async under_vol=0;under_info={}"? It would also be good to update the HDF5 code and the async code to the latest version with "git pull", since we have been fixing bugs. For the parallel tests, can you check the content of async_vol_test.err, or just run "mpirun -np 4 ./async_test_parallel.exe" and see if there are any errors? Which Linux distribution and version are you running? You can check with "cat /etc/os-release". |
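The environment check suggested above can also be scripted. The following is a minimal sketch, not part of the vol-async repository: the function name check_vol_connector is hypothetical, and the expected string is simply the value quoted in the comment above.

```python
import os

# Expected value as quoted in the comment above; the rest is illustrative.
EXPECTED_VOL_CONNECTOR = "async under_vol=0;under_info={}"


def check_vol_connector(env=None):
    """Return a list of problems with HDF5_VOL_CONNECTOR (empty if it looks right)."""
    env = os.environ if env is None else env
    value = env.get("HDF5_VOL_CONNECTOR")
    if value is None:
        return ["HDF5_VOL_CONNECTOR is not set"]
    if value.strip() != EXPECTED_VOL_CONNECTOR:
        return [
            f"HDF5_VOL_CONNECTOR is {value!r}, expected {EXPECTED_VOL_CONNECTOR!r}"
        ]
    return []


if __name__ == "__main__":
    problems = check_vol_connector()
    for problem in problems:
        print(problem)
    if not problems:
        print("HDF5_VOL_CONNECTOR matches the expected value")
```

Running this in the same shell that launches the tests shows whether the variable was exported there, which is easy to miss when it was set in a different session or batch script.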
@houjun , thanks for your feedback, directory structures look much better. However, I am still unable to pass the multifile test. When I run
Running the multifile test individually and/or checking async_vol_test.err yields the same message: This was after double-checking my directories and environment variables and pulling the most recent updates from git. I am currently running SUSE Linux (SLES 12).
|
@scottneuhoff - This looks like the bug I fixed last week on the async_vol_register_optional branch of the HPC-IO org's HDF5 git repo. Can you please pull the latest HDF5 code from that branch, rebuild & install, then try this test again? |
@scottneuhoff I just pushed a change to the parallel testing that removes the H5Pset_vol_async() calls, which are not necessary. Can you try again? If there are still errors, maybe we can find a time for a Zoom session to go through the tests together? |
@qkoziol @houjun Thanks for your quick responses. I switched to another machine that I hoped would be less complicated to work with (Red Hat Linux) and went through the process from a fresh directory. This ensured that I had the most recent clones of the repositories, so all the changes you refer to should be included. However, I get to exactly the same place as before - I set those environment variables, go into
Where again, running
Interestingly, I also found that earlier in the install process when running
It's the same |
@scottneuhoff - Although the assert is the same, I have a feeling that these have different root causes. I'm happy to "pair debug" in a call also, I'll send you and Tang an email to set something up. |
Closing this as we solved the problem in the call. |
I am trying the Async vol today and am getting the same Not sure if the same cause as this issue, but the failure looks the same, so perhaps the solution is the same? |
Hi @gsjaardema, the previous problem was due to an environment variable setting: HDF5_PLUGIN_PATH should be set to "$VOL_DIR/src" instead of "$VOL_DIR". Can you check the HDF5_PLUGIN_PATH value in your environment? (Also, please update the HDF5 library as well as the async VOL to the latest version.) |
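To see whether HDF5_PLUGIN_PATH points at the directory that actually holds the connector library (for example "$VOL_DIR/src" rather than "$VOL_DIR"), a small check along these lines may help. This is a sketch under assumptions: the helper find_plugin is hypothetical, the library file name "libh5async.so" is assumed and should be adjusted to whatever the build produced, and multiple path entries are assumed to be separated by the platform's path separator.

```python
import os


def find_plugin(plugin_path, library_name):
    """Return the entries of plugin_path that directly contain library_name."""
    entries = [p for p in plugin_path.split(os.pathsep) if p]
    return [p for p in entries if os.path.isfile(os.path.join(p, library_name))]


if __name__ == "__main__":
    lib = "libh5async.so"  # assumed file name; adjust to match your build
    plugin_path = os.environ.get("HDF5_PLUGIN_PATH", "")
    hits = find_plugin(plugin_path, lib)
    if hits:
        print(f"{lib} found in: {hits}")
    else:
        print(f"{lib} not found in HDF5_PLUGIN_PATH={plugin_path!r}")
```

The search is deliberately not recursive, so a path that stops one directory short of the library reports a miss, which mirrors the mistake described above.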
The |
This also seems similar to a problem that I fixed in the incoming branch. Can you try again with the 'async_vol_register_optional' branch of both the HDF5 and vol-async repos from the HPC-IO org's forks: https://github.com/hpc-io/hdf5/tree/async_vol_register_optional |
I am on the I made a simple C program to just do a Not sure why it can't open the |
OK, I reran with I'm not sure why it isn't finding the library without |
Ah, very cool! Annoying about the LD_LIBRARY_PATH though - I would tend to agree with you, it should have been linked into the async VOL connector and shouldn't need to be added to the dynamic library path. |
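A loading problem of this kind can be probed without HDF5 by asking the dynamic loader to open the library directly. The sketch below uses Python's ctypes for that; the helper name try_load is hypothetical and the path is supplied on the command line. On Linux the error text typically names the dependency the loader could not resolve, which helps tell a wrong plugin path apart from a missing LD_LIBRARY_PATH entry.

```python
import ctypes
import sys


def try_load(path):
    """Try to load a shared library; return (ok, message from the loader)."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # The loader's own error text usually names the file it could not open.
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python try_load.py /path/to/library.so")
        sys.exit(2)
    ok, message = try_load(sys.argv[1])
    print(("OK: " if ok else "FAILED: ") + message)
    sys.exit(0 if ok else 1)
```

Running it once with and once without the extra LD_LIBRARY_PATH entry shows whether the connector library depends on a shared object that is not on the default search path.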
Hello vol-async team,
I'm trying to get this HDF5 Asynchronous I/O VOL Connector installed on my system and I can get it to a point where it is passing the serial tests (in vol-async/test/pytest.py) but never the parallel ones; I think there may be some inconsistencies with the directory structures / paths as written so hopefully we can clear this up together. Let me walk you through how I got here:
./configure --prefix=$H5_DIR/install --enable-parallel --enable-threadsafe --enable-unsupported CC=mpicc
using my system's HPE MPT installation for MPI.
make install
with no issues, switched to $ABT_DIR, ran
./autogen.sh && CC=cc ./configure --prefix=$ABT_DIR/build && make install
with no issues
Notice these are not as written in the repo's README: I had to add /install on the end of HDF5_DIR for it to find the correct header files. If I did not do this, it would complain that hdf5dev.h could not be found (as it should; that header file is not in $H5_DIR as Makefile.summit would have you believe).
6. After editing that Makefile, I run make and it completes smoothly. Next, I run
although, here again I find that $H5_DIR/lib doesn't exist; perhaps it should be $H5_DIR/install/lib
7. I copy Makefile.summit to Makefile and again edit it so that:
make
with no issues
make check
(my Python is version 3.7.0), I get the following:
Running async_test_multifile.exe alone gives me:
In my other attempts changing various things I was able to get it to pass all the way to here:
Running that test individually gives:
I am wondering if there is anything here that is obviously inconsistent with how I should be installing things. Let me know, thanks!