assert(psi_norm>0) error with ks_solver dav #4068

Closed
16 tasks
pxlxingliang opened this issue Apr 29, 2024 · 9 comments
Labels
Bugs (Exclude input and output) Bugs that only solvable with sufficient knowledge of DFT Useful Information Useful information for others to learn/study

Comments

@pxlxingliang
Collaborator

Describe the bug

Some of my cases fail with the error below:

Traceback (most recent call last):
  File "/home/input_lbg-471-12232015/tmp/inputs/artifacts/dflow_python_packages/opt/mamba/lib/python3.10/site-packages/dflow/python/utils.py", line 337, in try_to_execute
    output = op_obj.execute(input)
  File "/home/input_lbg-471-12232015/tmp/inputs/artifacts/dflow_python_packages/opt/mamba/lib/python3.10/site-packages/dflow/python/op.py", line 136, in wrapper_exec
    op_out = func(self, op_in)
  File "/home/input_lbg-471-12232015/tmp/inputs/artifacts/dflow_python_packages/opt/mamba/lib/python3.10/site-packages/fpop/run_fp.py", line 170, in execute
    backward_dir_name = self.run_task(backward_dir_name,log_name,backward_list,run_image_config,optional_input)
  File "/home/input_lbg-471-12232015/tmp/inputs/artifacts/dflow_python_packages/opt/mamba/lib/python3.10/site-packages/fpop/abacus.py", line 637, in run_task
    raise TransientError(
dflow.python.python_op_template.TransientError: ('abacus failed\n', 'out msg', '', '\n', 'err msg', "WARNING: Total thread number on this node mismatches with hardware availability. This may cause poor performance.\nInfo: Local MPI proc number: 16,OpenMP thread number: 1,Total thread number: 16,Local thread limit: 32\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = 
std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> 
&, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\nabacus: /abacus-develop/source/module_hsolver/diago_david.cpp:934: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.\n", '\n')

V.zip
Cu.zip

Expected behavior

No response

To Reproduce

No response

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

  • Verify the issue is not a duplicate.
  • Describe the bug.
  • Steps to reproduce.
  • Expected behavior.
  • Error message.
  • Environment details.
  • Additional context.
  • Assign a priority level (low, medium, high, urgent).
  • Assign the issue to a team member.
  • Label the issue with relevant tags.
  • Identify possible related issues.
  • Create a unit test or automated test to reproduce the bug (if applicable).
  • Fix the bug.
  • Test the fix.
  • Update documentation (if necessary).
  • Close the issue and inform the reporter (if applicable).
@pxlxingliang pxlxingliang added the Bugs (Exclude input and output) Bugs that only solvable with sufficient knowledge of DFT label Apr 29, 2024
@WHUweiqingzhou
Collaborator

@haozhihan could you have a look?

@haozhihan
Collaborator

#3643 FYI

@WHUweiqingzhou
Collaborator

So you do not think this is a bug in the old dav method?

@haozhihan
Collaborator

haozhihan commented Apr 30, 2024

I think this is a numerical problem in the Schmidt orthogonalization, not a bug in the old dav method.

Currently, Schmidt orthogonalization is used in both the cg and dav methods of abacus, which leads to this issue.
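
To make the failure mode concrete, here is a minimal sketch in plain Python of how a check like `psi_norm > 0.0` can fail during Schmidt orthogonalization. It is a toy illustration only, not the ABACUS `SchmitOrth` code; the helper names (`inner`, `schmidt_residual_norm`) and the example vectors are made up for this sketch. The idea is that once a new vector lies (numerically) in the span of the vectors already orthogonalized, nothing is left after the projections are removed and the residual norm is zero.

```python
import math

def inner(a, b):
    """Hermitian inner product <a|b> for lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def schmidt_residual_norm(basis, v):
    """Remove the components of `v` along each orthonormal basis vector
    and return the norm of what is left (the value that would be checked
    before normalizing)."""
    r = list(v)
    for b in basis:
        c = inner(b, r)                      # overlap with basis vector b
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return math.sqrt(inner(r, r).real)

# Hypothetical orthonormal basis: the first two unit vectors of C^3.
basis = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j]]

independent = [1 + 0j, 1 + 0j, 1 + 0j]   # has a component outside the span
dependent = [2 + 0j, -3 + 0j, 0j]        # lies entirely inside the span

norm_ok = schmidt_residual_norm(basis, independent)
norm_bad = schmidt_residual_norm(basis, dependent)

print(norm_ok, norm_bad)
# An `assert norm_bad > 0.0` at this point would fail, because the
# dependent vector has no component left to normalize.
```

In a real calculation the vector is rarely exactly dependent; the same thing presumably happens when round-off removes the small remaining component, which is why this looks like a numerical issue rather than a logic error.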

@pxlxingliang
Collaborator Author

More examples that hit this error:
relaxation.zip

@WHUweiqingzhou
Collaborator

@mohanchen and @dyzheng,
what is your opinion? Could we classify this issue as a feature rather than a bug?

@dyzheng
Collaborator

dyzheng commented May 7, 2024

I don't think it is a bug; it is often caused by accuracy loss in the Schmidt orthogonalization method.
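
For readers wondering what a more tolerant treatment of this accuracy loss could look like, below is a hedged sketch in plain Python. It is not what ABACUS does and not a proposed patch; the function name `orthonormalize_or_skip`, the relative tolerance `rel_tol`, and the two-pass re-orthogonalization are assumptions chosen for illustration. Instead of asserting that the residual norm is positive, the sketch compares it with the norm of the input vector and reports the vector as unusable when almost nothing is left.

```python
import math

def inner(a, b):
    """Hermitian inner product <a|b> for lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(inner(a, a).real)

def orthonormalize_or_skip(basis, v, rel_tol=1e-10, passes=2):
    """Orthogonalize `v` against an orthonormal `basis` and normalize it.

    Returns None (instead of failing) when the residual is negligible
    relative to the input, i.e. `v` is numerically dependent on `basis`.
    `rel_tol` and `passes` are illustrative defaults, not tuned values.
    """
    v_norm = norm(v)
    if v_norm == 0.0:
        return None
    r = list(v)
    for _ in range(passes):                  # repeat to reduce round-off
        for b in basis:
            c = inner(b, r)
            r = [ri - c * bi for ri, bi in zip(r, b)]
    r_norm = norm(r)
    if r_norm <= rel_tol * v_norm:
        return None                          # caller can drop or replace v
    return [ri / r_norm for ri in r]

# Hypothetical orthonormal basis: the first two unit vectors of C^3.
basis = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j]]

kept = orthonormalize_or_skip(basis, [1 + 0j, 1 + 0j, 1 + 0j])
skipped = orthonormalize_or_skip(basis, [1 + 0j, 1 + 0j, 1e-14 + 0j])

print(kept is not None, skipped is None)
```

Whether skipping, replacing, or restarting is the right reaction depends on the solver; the sketch only shows that the degenerate case can be detected before normalization.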

@dyzheng dyzheng closed this as completed May 7, 2024
@haozhihan
Collaborator

> More examples that suffer this error. relaxation.zip

@pxlxingliang Can you provide the related input files instead of just the log and OUTPUT?

@pxlxingliang
Collaborator Author

> > More examples that suffer this error. relaxation.zip
>
> @pxlxingliang Can you provide related input files instead of just log and OUTPUT?

Please check the attached input files for V:
v.zip


4 participants