Alert: After a conda update on linux, psi4 segfaults ... don't panic #1533

Open
loriab opened this issue Feb 14, 2019 · 3 comments

@loriab
Member

commented Feb 14, 2019

TL;DR: If conda psi4 is acting up, update libint too: conda update libint -c psi4/label/dev. If a locally compiled psi4 is acting up, trigger a partial recompile: cd objdir && rm -rf external/upstream/libint/ && make.

We've been planning for a while to distribute conda psi4 with libint compiled at MAX_AM_ERI 8, not 6. I have rebuilt the libint package, and v1.3rc1 will be the first psi4 package with extended AM. Unfortunately, libint builds of different AM are not hot-swappable, and the AM is not detectable at runtime. That is, a built psi4 is perfectly happy to link (in the ldd sense) to a libint.so that is inconsistent with the fixed data dimensions in psi4 libmints, and libmints can't even throw an error for enlightenment.

I could throw additional constraints on the psi4 pkg to make sure it picked the right libint build, but that would be misusing the tools a bit, would constrain things in future, and would only solve a third of the problem. I believe you can handle this transition manually with a couple commands.

which libints are out there?

conda list

#                          v notice 5 vs 4 here
#                          v
libint:     1.2.1-hb4a4fd4_5  # AM ** 8 **, compatible with conda psi4 >= 1.3rc1, new!
libint:     1.2.1-h87b9b30_4  # AM ** 6 **, compatible with conda psi4  < 1.3rc1, 8 mo old
libint:     1.2.1-am8_1       # AM ** 8 **, compatible with compile-yourself psi4 only, deprecated, 19 mo old
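As a rough aid for reading the list above, here is a minimal sketch that maps a libint build string to its AM. The function names are hypothetical, the mapping covers only the three builds named in this issue, and it assumes `conda list libint` prints the usual name, version, build, channel columns.

```shell
#!/usr/bin/env bash
# Map a libint conda build string to its max angular momentum (AM).
# Only the three builds listed in this issue are known; anything else
# prints "unknown" rather than guessing.
libint_am_for_build() {
    case "$1" in
        hb4a4fd4_5) echo 8 ;;        # new, pairs with conda psi4 >= 1.3rc1
        h87b9b30_4) echo 6 ;;        # pairs with conda psi4 < 1.3rc1
        am8_1)      echo 8 ;;        # deprecated, compile-yourself psi4 only
        *)          echo unknown ;;
    esac
}

# Read `conda list libint` output on stdin and print the AM of the first
# libint row. Assumed column order: name, version, build, channel.
libint_am_from_conda_list() {
    local build
    build=$(awk '$1 == "libint" && !seen { print $3; seen = 1 }')
    libint_am_for_build "$build"
}

# Usage (inside an activated conda env):
#   conda list libint | libint_am_from_conda_list
```

An unknown or missing build prints "unknown", so a libint that is not from these conda packages is not silently classified.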

what can go wrong with conda psi4?

  • Both latest psi4 and latest libint conda packages are compatible, so a fresh install conda update psi4 libint -c psi4/label/dev will be fine.
  • However, if you have a conda environment and you update psi4 but not libint, then you'll have a >=1.3rc1 psi4 and the _4 (see above) AM6 libint, which is trouble. Easy to solve by updating libint: conda update libint -c psi4/label/dev. After that, your conda list should say _5.
  • If you have a conda env and you update libint but not psi4 (uncommon), that's also a problem. Update psi4. If what you really want is for the psi4 to work and the libint upgrade was a mistake, downgrade the libint: conda install libint=1.2.1=h87b9b30_4 -c psi4.
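The three cases above boil down to comparing the AM that psi4 was built against with the AM of the installed libint. A small sketch of that decision follows; `libint_remedy` is a hypothetical helper, and the AM values have to be supplied by hand, since the AM is not detectable at runtime.

```shell
#!/usr/bin/env bash
# Print the remedy from this issue for a conda psi4 / libint pairing.
#   $1: AM the installed psi4 expects (8 for >= 1.3rc1, 6 for older)
#   $2: AM of the installed libint build (8 for _5, 6 for _4)
libint_remedy() {
    local psi4_am="$1"
    local libint_am="$2"
    if [ "$psi4_am" = "$libint_am" ]; then
        echo "ok: psi4 and libint agree on AM $psi4_am"
    elif [ "$psi4_am" = 8 ] && [ "$libint_am" = 6 ]; then
        # psi4 was updated but libint was not
        echo "conda update libint -c psi4/label/dev"
    elif [ "$psi4_am" = 6 ] && [ "$libint_am" = 8 ]; then
        # libint was updated but psi4 was not: update psi4, or downgrade libint
        echo "conda update psi4 -c psi4/label/dev  # or: conda install libint=1.2.1=h87b9b30_4 -c psi4"
    else
        echo "unrecognized AM pairing: psi4=$psi4_am libint=$libint_am"
        return 1
    fi
}

# Usage: libint_remedy 8 6
```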

what can go wrong with locally-built psi4?

  • A major use of the libint conda package is to be a detectable pre-built dependency for a from-source psi4 compilation. If the $CONDA_PREFIX/lib/libint.so to which a psi4 core.so is linked suddenly changes identity, psi4 will be deranged.
  • To fix this, cmake needs to detect the new libint headers and a half-dozen psi4 files need to rebuild.
> cd <objdir>
> rm -rf external/upstream/libint/
> make
# cmake says: -- Found Libint 8: /home/psilocaluser/toolchainconda/envs/p4dev37/lib/libint.so (found version 1.2.1)
  • If you need to avoid recompiling and instead revert your environment back to the AM6 libint, use the command in the last bullet of the previous section.
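To see which libint.so a locally built core.so actually resolves to, the ldd output can be filtered. This is only a sketch: `libint_path_from_ldd` is a hypothetical helper, the core.so path is a placeholder, and it assumes a dynamically linked libint.

```shell
#!/usr/bin/env bash
# Print the resolved path of libint from `ldd` output read on stdin.
# Prints nothing if libint is absent from the list or reported "not found".
libint_path_from_ldd() {
    awk '$1 ~ /^libint\.so/ && $2 == "=>" && $3 != "not" { print $3 }'
}

# Usage (path to core.so is a placeholder for your objdir or install):
#   ldd /path/to/core.so | libint_path_from_ldd
# Compare the result with "$CONDA_PREFIX/lib/libint.so".
```

If the printed path is inside the conda env and the env's libint was just updated, that is the situation where the partial rebuild above is needed.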

Notes

  • Only Linux is affected. It took a week to build AM6 on the Mac mini, so AM8 is not within reach there. Thus, it's possible for the same job to run fine with the Linux binary but throw an angular momentum error with the Mac binary.
  • simint's max is AM7, both in the past and for the near future. Only libint is changing.
  • Between higher AM and optimizing for multiple architectures, the new libint pkg is heavy, >120 MB zipped. For psi4 CI, I'm planning to pull the old AM6 to lessen the strain on Travis. This won't help downstream projects that summon a built psi4 to test their addon.
  • I haven't posted the AM8 package yet, as I want the stack to finish building, and I wanted to give you all a heads-up. I'll post to this issue when the package goes up. Will only be in -c psi4/label/dev for now, not -c psi4.
@susilehtola

Member

commented Feb 14, 2019

@loriab

Member Author

commented Feb 14, 2019

I think for a locally compiled psi4 you also have to remove the install dir, if one exists.

You're right in general that cmake can find deps in the psi4 installation, which may be unexpected behavior, and it never hurts to remove the install dir. But in this case it should be safe, because the libint being switched out lives in the conda env, so the libintConfig.cmake will be there, too. The troublesome scenario I can think of: you have a conda env with libint and libxc, but only the latter is used in the psi4 build, and your libint is a local or cmake compile at AM6. Then you upgrade the pkgs in the conda env. If, in either the objdir or the install, the rpath on the core.so isn't set so that the local libint comes before the conda env, then even though cmake got it right, the runtime psi4 will get it wrong. Of course, my imagination isn't strong enough to guess all the ways this could go wrong.
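One way to check the rpath ordering described above is to read the RPATH/RUNPATH entry of core.so (for example with readelf -d) and see which directory comes first. A minimal sketch, where `first_in_rpath` is a hypothetical helper and both directory arguments are placeholders:

```shell
#!/usr/bin/env bash
# Given a colon-separated RPATH/RUNPATH string and two directories,
# print whichever of the two directories appears first in the rpath.
# Prints nothing if neither is present.
#   $1: rpath string, $2 and $3: directories to look for
first_in_rpath() {
    printf '%s\n' "$1" | tr ':' '\n' |
        awk -v a="$2" -v b="$3" '!found && ($0 == a || $0 == b) { print; found = 1 }'
}

# Usage: take the bracketed path list from
#   readelf -d /path/to/core.so | grep -E 'RPATH|RUNPATH'
# and pass it in along with the local libint dir and "$CONDA_PREFIX/lib":
#   first_in_rpath "$rpath" /path/to/local/libint/lib "$CONDA_PREFIX/lib"
```

This compares exact directory strings only, so entries such as $ORIGIN-relative paths would need to be expanded by hand first.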

@loriab

Member Author

commented Feb 14, 2019

As of a couple of hours ago, the AM8 libint and psi4 are posted on -c psi4/label/dev, so the troubles of this issue are now possible.
