e3sm_diags takes long time on cori-knl #314
Comments
When I used to create the environmental YAML files, we'd sometimes have dumps with the |
@tangq were you using KNL or Haswell nodes? In my own experience, e3sm_diags runs perfectly on Haswell nodes and performs miserably on KNL nodes. I assumed it was related to some module or environment difference. I haven't tried e3sm_diags on KNL in a while though, so maybe the fix @zshaheen pointed out fixed the issue I saw.
I used cori-knl nodes, which worked fine before.
It might be related, And worth situation on |
July sounds like the same time when I noticed this issue.
I'm wondering if this has nothing to do with |
The e3sm_diags jobs created as part of the post-processing bundling tool take forever to complete on cori.
One job ran out of time (2 hours). The other ran for 4+ hours without completing, so I killed it. The same job takes less than 30 minutes on compy.
The log shows something like: OpenBLAS blas_thread_init: pthread_create failed for thread 108 of 128: Resource temporarily unavailable
The script, output, and log files are at /global/cscratch1/sd/tang30/E3SM_analysis/20200701.v1like.f2010.northamericax4v1pg2_r0125_northamericax4v1pg2.cori-knl/post/scripts
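The `pthread_create failed for thread 108 of 128` message suggests OpenBLAS is trying to start a 128-thread pool in each process. One thing that might be worth trying is capping the thread count through the environment variables that OpenBLAS and OpenMP read. The sketch below is only an illustration: the helper name `cap_blas_threads` is hypothetical, and whether this resolves the slowdown on cori-knl is an assumption that the thread does not confirm.

```python
import os

# Environment variables read by common threaded math backends.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def cap_blas_threads(n_threads=1, env=None):
    """Set the thread-count variables in env (default: os.environ).

    Hypothetical helper for illustration; returns the values it set.
    """
    env = os.environ if env is None else env
    for name in THREAD_ENV_VARS:
        env[name] = str(n_threads)
    return {name: env[name] for name in THREAD_ENV_VARS}


# Call this before numpy (or anything else linked against OpenBLAS) is first
# imported, because OpenBLAS sizes its thread pool when the library is loaded.
cap_blas_threads(1)
```

The same effect can be had by exporting `OPENBLAS_NUM_THREADS=1` in the batch script before launching e3sm_diags.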