@roystgnr
Member

We can't --enable-werror yet, and I haven't yet set up tests with these compilers and MPI, but this branch at least builds without errors for me.

@moosebuild

moosebuild commented Oct 12, 2022

Job Coverage on eb82b2d wanted to post the following:

Coverage

Coverage did not change

Full coverage report

This comment will be updated on new commits.

@dschwen
Member

dschwen commented Oct 13, 2022

Neat, have you tried building MOOSE, too?

@dschwen
Member

dschwen commented Oct 13, 2022

Looks like we'll need an Nvidia HPC compiler Civet target real soon...

@dschwen
Member

dschwen commented Oct 13, 2022

@roystgnr can you share your configure options? This branch does not compile for me; I'm still getting those pesky "incomplete type is not allowed" errors:

"/usr/include/c++/11/bits/unique_ptr.h", line 83: error: incomplete type is not allowed
  	static_assert(sizeof(_Tp)>0,
  	                     ^
          detected during:
            instantiation of "void std::default_delete<_Tp>::operator()(_Tp *) const [with _Tp=libMesh::ShellMatrix<libMesh::Number>]" at line 361
            instantiation of "std::unique_ptr<_Tp, _Dp>::~unique_ptr() noexcept [with _Tp=libMesh::ShellMatrix<libMesh::Number>, _Dp=std::default_delete<libMesh::ShellMatrix<libMesh::Number>>]" at line 329 of "./include/libmesh/eigen_system.h"
            instantiation of class "libMesh::RBConstructionBase<Base> [with Base=libMesh::CondensedEigenSystem]" 

1 error detected in the compilation of "../src/reduced_basis/rb_construction_base.C".
make[1]: *** [Makefile:28185: src/reduced_basis/libmesh_opt_la-rb_construction_base.lo] Error 1

I tried to cherry-pick my commit deed905, which removed this error, but I still see the incomplete type errors with NonlinearImplicitSystem, which had me throw in the towel on my effort to get libMesh to build with nvc++.

@dschwen
Member

dschwen commented Oct 13, 2022

Updated from version 22.7 to 22.9, but the errors persist.

@roystgnr
Member Author

I'm currently working on getting libMesh warnings-clean; I haven't yet started hacking at MOOSE.

I'm using
../configure PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/2022/compilers/bin:/home/roystgnr/miniconda3/bin:/home/roystgnr/miniconda3/condabin:/home/roystgnr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin CC=/opt/nvidia/hpc_sdk/Linux_x86_64/2022/compilers/bin/nvcc CXX=/opt/nvidia/hpc_sdk/Linux_x86_64/2022/compilers/bin/nvc++ --enable-everything --disable-mpi --disable-petsc --disable-metaphysicl

at the moment; obviously there's more work to be done before we have a really useful build.

Looks like you're getting some extra failures because you're building with PETSc+SLEPc already? Anyway, they're in the same category as existing failures: with Nvidia compilers, the generation of the destructor for a Foo<Bar> : public Bar templated class seems to want to be able to handle ~Bar itself, which means it needs full definitions for every merely-forward-declared class that's held in a unique_ptr in Bar. Adding an include of shell_matrix.h should handle that particular error, but I won't be surprised if you see more of the same.

@dschwen
Member

dschwen commented Oct 13, 2022

> Adding an include of shell_matrix.h should handle that particular error, but I won't be surprised if you see more of the same.

Yeah, that's what I did in deed905, but the nonlinear implicit system stuff (which I suppose I see because I'm not disabling PETSc) is harder to solve, due to a circular include situation.

@roystgnr
Member Author

> the nonlinear implicit system stuff (which I suppose I see because I'm not disabling PETSc) is harder to solve, due to a circular include situation.

Could you be more specific? With 28c36cc I'm now getting a PETSc-enabled build to work (though I still have yet to figure out what's causing MetaPhysicL-enabled builds to fail boost configuration...). And there shouldn't be any circular includes possible here; the workaround is to add additional includes to .C files, not to other include files.

This gives me warnings at bootstrap time, from the Fortran tests, but it
at least handles the case of the Nvidia HPC compilers I'm trying out,
where the C++ compiler supports -fopenmp but the C compiler screams and
dies if given that argument.

And rightly so.  So let's restrict it to original PGI compilers only, at
least until we have evidence that we should take it out entirely.

It seems to want to create its own destructors for the base classes
here, which means it whines about incompletely defined types for
any forward-declared type in a unique_ptr in those base classes.

@roystgnr
Member Author

roystgnr commented Nov 3, 2022

I still get one failure in a unit test, but it looks like it's a compiler issue and I've pestered them about it. I get a compile failure with --enable-ifem in -O2 builds, but that definitely was a compiler regression and they've got a fix coming. This is a good enough checkpoint that I'll probably merge once CI is happy. Warning fixes will be in another PR.

@roystgnr roystgnr merged commit 3f05f40 into libMesh:devel Nov 7, 2022
@roystgnr roystgnr deleted the nvidia_fixes branch November 7, 2022 20:20