Segfault when structure is large #217

Closed
iloncaric opened this issue Sep 4, 2023 · 4 comments

@iloncaric

Describe the bug
For structures with more than a few hundred atoms, the calculation ends in a segmentation fault (core dumped), even when hundreds of GB of RAM are available.

To Reproduce
Steps to reproduce the behaviour:

Minimal ASE input script:

from ase.io import read
from dftd4.ase import DFTD4

# Read the structure and attach the D4 calculator to the same Atoms object
a = read('test.xyz')
calc = DFTD4(method="r2scan")
a.calc = calc
a.get_potential_energy()

Output:
python -q -X faulthandler test_mem.py
Fatal Python error: Segmentation fault

Current thread 0x00007f60b515f740 (most recent call first):
File "/lib/python3.10/site-packages/dftd4/library.py", line 74 in handle_error
File "/lib/python3.10/site-packages/dftd4/interface.py", line 341 in get_dispersion
File "/lib/python3.10/site-packages/dftd4/ase.py", line 258 in Segmentation fault (core dumped)

It seems that DFTD4 requests a very large amount of memory up front for larger structures, more than the calculation should need.
Input files are attached.

test.zip

@iloncaric iloncaric changed the title Segfault for ASE when structure is large Segfault when structure is large Sep 4, 2023
@iloncaric
Author

I also tested the standalone executable with "dftd4 --func r2scan --grad dftd4.txt POSCAR", and it segfaults as well. Is there a hardcoded limit on the number of atoms in the unit cell?

@e-kwsm
Contributor

e-kwsm commented Sep 4, 2023

Segmentation fault occurs at

dftd4/src/dftd4/ncoord.f90

Lines 171 to 173 in 3844dc1

!$omp parallel do default(none) reduction(+:cn, dcndr, dcndL) &
!$omp shared(mol, trans, cutoff2, rcov, en) &
!$omp private(jat, itr, izp, jzp, r2, rij, r1, rc, countf, countd, sigma, den)

even if OMP_NUM_THREADS is set to 1, and OMP_STACKSIZE to 1G (gfortran 13.2.1, ifort 2021.10.0, and ifx 2023.2.0).
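
One possible reading of why this directive is sensitive to stack size: an OpenMP reduction gives each thread its own private copy of every reduction variable, and implementations often place those copies on the thread stack. The sketch below estimates the size of one such copy. The array shapes cn(nat), dcndr(3, nat, nat) and dcndL(3, 3, nat) and the 8-byte real kind are assumptions that are not stated in this thread, and the helper name is hypothetical.

```python
def reduction_copy_bytes(nat, real_bytes=8):
    """Estimate the bytes needed for one private copy of the reduction arrays.

    Assumed shapes (not confirmed in this thread):
      cn(nat), dcndr(3, nat, nat), dcndL(3, 3, nat)
    """
    cn = nat                # one value per atom
    dcndr = 3 * nat * nat   # Cartesian derivative for every atom pair
    dcndl = 3 * 3 * nat     # strain derivative per atom
    return real_bytes * (cn + dcndr + dcndl)


for nat in (500, 2_000, 10_000):
    size = reduction_copy_bytes(nat)
    print(f"{nat:>6} atoms: {size / 2**30:.2f} GiB per thread")
```

Under these assumptions the quadratic dcndr term dominates, and a 10 000-atom cell needs roughly 2.2 GiB per thread, which is far beyond a typical default stack limit.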

@awvwgk
Member

awvwgk commented Sep 5, 2023

Did you also set the stack size for the system, i.e. "ulimit -s unlimited"?

@iloncaric
Author

When I first noticed this problem, "ulimit -s unlimited" was set, but it was not set when I prepared the minimal test on my PC. So the minimal test may actually work if "ulimit -s unlimited" is set. However, if the system is made somewhat larger, the segfault returns. This can apparently be solved by increasing OMP_STACKSIZE.
Since OMP_STACKSIZE cannot be set to unlimited, segfaults can occur until OMP_STACKSIZE is large enough.

You might want to check whether memory can be managed differently to avoid the need for a very large OMP_STACKSIZE. For a 10 000-atom cell, OMP_STACKSIZE has to be on the order of 3G with 8 threads.
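
For anyone hitting the same segfault, here is a minimal sketch of applying both settings from this thread when launching the calculation from Python on a Unix system. It runs the job in a child process so that the stack limit and the OpenMP variables are in place before the OpenMP runtime starts. The helper names and default values are hypothetical, and the sketch only raises the soft stack limit as far as the hard limit allows.

```python
import os
import resource
import subprocess
import sys


def build_env(omp_stacksize="3G", omp_threads=8, base_env=None):
    """Return a copy of the environment with the OpenMP settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_STACKSIZE"] = omp_stacksize      # stack of the OpenMP worker threads
    env["OMP_NUM_THREADS"] = str(omp_threads)
    return env


def raise_stack_limit():
    """Raise the soft stack limit to the hard limit in the current process.

    This matches "ulimit -s unlimited" only if the hard limit is unlimited.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))


def run_with_large_stack(cmd, **env_kwargs):
    """Run cmd in a child process with the raised stack limit and OpenMP settings."""
    return subprocess.run(
        cmd,
        env=build_env(**env_kwargs),
        preexec_fn=raise_stack_limit,  # executed in the child before exec
        capture_output=True,
        text=True,
        check=False,
    )
```

A call such as run_with_large_stack([sys.executable, "test_mem.py"], omp_stacksize="3G", omp_threads=8) would then run the reproduction script with both limits raised. The values needed depend on the system size, so they may have to be increased for larger cells.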

In any case, thanks a lot for the prompt help!
