"Bus error" with heterogeneous forcing data HDF5 file on NERSC #47

Closed
zexuanxu opened this issue Jul 9, 2020 · 7 comments

@zexuanxu
Contributor

zexuanxu commented Jul 9, 2020

Some ats-0.87 (or modif4chemistry) builds hit a "bus error" when a heterogeneous forcing data HDF5 file is read and used on NERSC, possibly because of an issue with the recent TPLs. The issue has not been seen on other computational platforms (watershedfx, Lawrencium). David thinks it could be related to the hugepages module on NERSC.

These builds (on NERSC) work well:
/project/projectdirs/m2398/ideas/ats/install/cori/mpich-7.6.2-gnu-7.1.0/Release-TPLs-0.94.12/amanzi-0.87-180309/bin/ats
/project/projectdirs/m2398/ideas/ats/install/cori/mpich2-7.7.0-gnu-7.3.0/Release-amanzi-0.87-TPLs-0.94.12/modif4chemistry-181201/bin/ats

These builds don't work (they fail with the bus error message):
/project/projectdirs/m2398/ideas/ats/install/cori/mpich2-7.7.6-gnu-8.2.0/Release-amanzi-0.87-TPLs-0.94.12/modif4chemistry-190730/bin/ats
and all later builds.

The latest reproducer (with the error message copied) can be found on NERSC at: /project/projectdirs/m2398/zexuan/coppercreek/hydrology/p_et3_sm0r_kc_heter
with the heterogeneous met forcing data at:
/project/projectdirs/m2398/zexuan/coppercreek/metforcing
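
To help test the hugepages hypothesis, it may be useful to record which hugepages modules are loaded in the environment where the failing build runs. The following is only a rough sketch, not part of ATS or the NERSC tooling. It assumes the Environment Modules convention of exposing loaded modules as a colon-separated list in the `LOADEDMODULES` variable, and it assumes that relevant module names contain the string "hugepages" (for example Cray's `craype-hugepages2M`). The function name `hugepages_modules` is hypothetical.

```python
import os


def hugepages_modules(loaded_modules=None):
    """Return the loaded module names that mention 'hugepages'.

    loaded_modules: colon-separated module list. If None, the value is
    read from the LOADEDMODULES environment variable (assumption: the
    site uses Environment Modules, which sets this variable).
    """
    if loaded_modules is None:
        loaded_modules = os.environ.get("LOADEDMODULES", "")
    # Assumption: hugepages-related modules have "hugepages" in their name.
    return [m for m in loaded_modules.split(":") if "hugepages" in m.lower()]


if __name__ == "__main__":
    found = hugepages_modules()
    if found:
        print("hugepages modules loaded:", ", ".join(found))
    else:
        print("no hugepages module found in LOADEDMODULES")
```

Running this in both a working and a failing job environment, and comparing the output, would show whether the hugepages setting differs between them. It does not by itself confirm that hugepages causes the bus error.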

@ecoon
Collaborator

ecoon commented Jul 15, 2020

Do you need 0.87? Can you reproduce this on a newer release? We've definitely updated HDF5 versions since then.

@ecoon
Collaborator

ecoon commented Jul 15, 2020

Note you'll have to update your input file using the script here: https://github.com/amanzi/ats/blob/master/tools/input_converters/xml-0.87-0.88.py

There were no changes that should affect you going from 0.88 to 1.0 or master at this point (I'm fairly sure!)

@ecoon
Collaborator

ecoon commented Jul 15, 2020

Actually, your best bet is to grab the rock creek example for 1.0, change to your mesh, and drop in the forcing file (does that have the physics you need?). Let's see if we can reproduce this in the new code first.

I recognize that if this works, it won't help you if you need chemistry, but our chemistry merge is 99.9% ready to be merged and we can bug @ajkhattak to get it done if this fixes the problem in flow only.

@zexuanxu
Contributor Author

Hi Ethan,

Thanks for helping out. I tried it with 1.0 (using a modified version of your rock creek example), and I can reproduce the same issue seen in 0.87. Again, this issue is not reproduced on our local workstation.

I copied the reproducer to /project/projectdirs/m2398/zexuan/coppercreek/hydrology/transient-1.0, with the met forcing dataset I used at /project/projectdirs/m2398/zexuan/coppercreek/metforcing/pr_et3_sm0r_10yrs_heter15_64_10days.h5

@ecoon
Collaborator

ecoon commented Jul 21, 2020

Great, I should be able to take a look there. We may have a better way of doing this anyway, at least in 1.0 (definitely not in 0.87).

@ecoon
Collaborator

ecoon commented Jul 23, 2020

Just confirming that @jd-moulton is working on this?

@zexuanxu
Contributor Author

This issue has been resolved with the dynamically linked master build on NERSC.
