Cannot build master with either Intel18 or Intel19 (on stampede2) #15382

Closed
marcfehling opened this issue Jun 19, 2023 · 12 comments · Fixed by #15425

@marcfehling
Member

marcfehling commented Jun 19, 2023

The stampede2 supercomputer provides the classic Intel compilers 16.0.3, 17.0.4, 18.0.0, 18.0.2, 19.1.1.

As part of the release process, I try to compile the library with both 18.0.2 and 19.1.1 in a basic configuration with the corresponding IntelMPI package. This time I can't build deal.II on master with either of these. I tried to build 0516371.


Intel 18.0.2 (fixed by #15390)

The compiler reports that numbers::invalid_dof_index does not exist, although dof_handler_policy.cc includes deal.II/base/types.h. Note that the error occurs inside a lambda function, and we have had problems with Intel18 and lambda functions before.

/dealii/source/dofs/dof_handler_policy.cc(3533): error: namespace "dealii::numbers" has no member "invalid_dof_index"
                  if (*received_index != numbers::invalid_dof_index)
                                                  ^
          detected during:
            instantiation of function "lambda [](const auto &, const auto &)->auto [with <auto-1>=dealii::TriaActiveIterator<dealii::DoFCellAccessor<1, 1, false>>, <auto-2>=std::vector<dealii::types::global_dof_index={unsigned int}, std::allocator<dealii::types::global_dof_index={unsigned int}>>]" at line 2492 of "/opt/apps/gcc/6.3.0/include/c++/6.3.0/type_traits"
            instantiation of class "std::__result_of_impl<false, false, _Functor, _ArgTypes...> [with _Functor=lambda [](const auto &, const auto &)->auto &, _ArgTypes=<const dealii::TriaActiveIterator<dealii::DoFCellAccessor<1, 1, false>> &, const std::vector<dealii::types::global_dof_index={unsigned int}, std::allocator<dealii::types::global_dof_index={unsigned int}>> &>]" at line 2505 of "/opt/apps/gcc/6.3.0/include/c++/6.3.0/type_traits"
            instantiation of class "std::result_of<_Functor (_ArgTypes...)> [with _Functor=lambda [](const auto &, const auto &)->auto &, _ArgTypes=<const dealii::TriaActiveIterator<dealii::DoFCellAccessor<1, 1, false>> &, const std::vector<dealii::types::global_dof_index={unsigned int}, std::allocator<dealii::types::global_dof_index={unsigned int}>> &>]" at line 3563
            instantiation of "void dealii::internal::DoFHandlerImplementation::Policy::<unnamed>::communicate_dof_indices_on_marked_cells(const dealii::DoFHandler<dim, spacedim> &, std::vector<bool, std::allocator<bool>> &) [with dim=1, spacedim=1]" at line 3734
            instantiation of "dealii::internal::DoFHandlerImplementation::NumberCache dealii::internal::DoFHandlerImplementation::Policy::ParallelDistributed<dim, spacedim>::distribute_dofs() const [with dim=1, spacedim=1]" at line 24 of "/build-dealii-intel18-lld/source/dofs/dof_handler_policy.inst"

/dealii/source/dofs/dof_handler_policy.cc(3533): error: name followed by "::" must be a class or namespace name
                  if (*received_index != numbers::invalid_dof_index)
                                         ^
          detected during:
            instantiation of function "lambda [](auto &, auto)->auto [with <auto-1>=dealii::types::global_dof_index={unsigned int}, <auto-2>=dealii::types::global_dof_index={unsigned int} *]" at line 528 of "/dealii/include/deal.II/dofs/dof_accessor.templates.h"
            instantiation of "void dealii::internal::DoFAccessorImplementation::Implementation::process_object(const dealii::DoFHandler<dim, spacedim> &, unsigned int, unsigned int, dealii::types::fe_index={unsigned short}, const DoFMapping &, const std::integral_constant<int, structdim> &, dealii::types::global_dof_index={unsigned int} *&, const DoFProcessor &) [with dim=1, spacedim=1, structdim=0, DoFProcessor=lambda [](auto &, auto)->auto, DoFMapping=lambda [](auto)->auto]" at line 1183 of
                      "/dealii/include/deal.II/dofs/dof_accessor.templates.h"
            instantiation of "void dealii::internal::DoFAccessorImplementation::Implementation::DoFIndexProcessor<dim, spacedim>::process_vertex_dofs(dealii::DoFHandler<dim, spacedim> &, unsigned int, dealii::types::fe_index={unsigned short}, dealii::types::global_dof_index={unsigned int} *&, const DoFProcessor &) const [with dim=1, spacedim=1, DoFProcessor=lambda [](auto &, auto)->auto]" at line 1031 of "/dealii/include/deal.II/dofs/dof_accessor.templates.h"
            instantiation of "void dealii::internal::DoFAccessorImplementation::Implementation::process_dof_indices(const dealii::DoFAccessor<structdim, dim, spacedim, level_dof_access> &, const DoFIndicesType &, dealii::types::fe_index={unsigned short}, const DoFOperation &, const DoFProcessor &, bool) [with dim=1, spacedim=1, level_dof_access=false, structdim=1, DoFIndicesType=std::vector<dealii::types::global_dof_index={unsigned int}, std::allocator<dealii::types::global_dof_index={unsigned
                      int}>>, DoFOperation=dealii::internal::DoFAccessorImplementation::Implementation::DoFIndexProcessor<1, 1>, DoFProcessor=lambda [](auto &, auto)->auto]" at line 3563
            instantiation of "void dealii::internal::DoFHandlerImplementation::Policy::<unnamed>::communicate_dof_indices_on_marked_cells(const dealii::DoFHandler<dim, spacedim> &, std::vector<bool, std::allocator<bool>> &) [with dim=1, spacedim=1]" at line 3734
            instantiation of "dealii::internal::DoFHandlerImplementation::NumberCache dealii::internal::DoFHandlerImplementation::Policy::ParallelDistributed<dim, spacedim>::distribute_dofs() const [with dim=1, spacedim=1]" at line 24 of "/build-dealii-intel18-lld/source/dofs/dof_handler_policy.inst"



Intel 19.1.1 (fixed by #15425)

There is some problem with the bundled TBB. Could it be that our updated cmake configuration broke something?

/dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp(83): error: class "tbb::spin_rw_mutex_v3::scoped_lock" has no member "internal_set_mutex"
              s.my_scoped_lock.internal_set_mutex(NULL);
                               ^

/dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp(129): error: class "tbb::spin_rw_mutex_v3::scoped_lock" has no member "internal_set_mutex"
                  s.my_scoped_lock.internal_set_mutex(this);  // need mutex for release()
                                   ^

/dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp(172): error: class "tbb::spin_rw_mutex_v3::scoped_lock" has no member "internal_set_mutex"
                  s.my_scoped_lock.internal_set_mutex(this);  // need mutex for release()
                                   ^

compilation aborted for /dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp (code 2)
@bangerth
Member

For the first of these issues, #12591 touched that same piece of code.

@marcfehling
Member Author

For the second issue related to TBB, I ran different configurations on stampede2 and tried to identify the issue. You can find different logs here:

Nothing really catches my eye. The interface_mpi target has differences in the fields INTERFACE_LINK_LIBRARIES and INTERFACE_LINK_OPTIONS if you compare the Intel18 and Intel19 configurations on the deal.II master branch, but I believe that stems from a different Intel MPI implementation that SLURM provides (2018.2.199 vs 2020.4.304).

In the output of make (when actually building the library) I noticed that TBB deprecation warnings pop up with the failing configuration of Intel19 and deal.II master, which do not appear in the other, working configurations. Could this be a sign? For example:

TBB Warning: tbb/task.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
TBB Warning: tbb/atomic.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
TBB Warning: tbb/aligned_space.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
TBB Warning: tbb/task_scheduler_init.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.

Does someone have an idea?

@marcfehling
Member Author

marcfehling commented Jun 21, 2023

The error and the deprecation warnings also occur if I build deal.II master with Intel19 and nothing but the bundled packages (no MPI, no LAPACK). For comparison, the Intel18 configuration works.

@marcfehling
Member Author

marcfehling commented Jun 21, 2023

I can reproduce the problem on the expanse supercomputer with Intel 19.1.1 and bundled packages only.

@tjhei
Member

tjhei commented Jun 21, 2023

Did you try a newer TBB version yet? Our bundled version is quite old.

@marcfehling
Member Author

I ran git bisect and found the first commit that breaks our bundled TBB with Intel19: e4f621c as part of #14803.

e4f621c35057603a83afc4dc843fd625b42a1d91 is the first bad commit
commit e4f621c35057603a83afc4dc843fd625b42a1d91
Author: Daniel Arndt <arndtd@ornl.gov>
Date:   Wed Feb 15 10:12:25 2023 -0500

    Include bundled include directories as CMake SYSTEM paths

 bundled/CMakeLists.txt                         | 5 -----
 cmake/macros/macro_deal_ii_add_library.cmake   | 5 +++++
 cmake/macros/macro_insource_setup_target.cmake | 5 +++--
 contrib/python-bindings/source/CMakeLists.txt  | 2 +-
 include/deal.II/arborx/bvh.h                   | 3 ---
 include/deal.II/arborx/distributed_tree.h      | 2 --
 include/deal.II/base/memory_space.h            | 2 --
 include/deal.II/base/memory_space_data.h       | 2 --
 tests/quick_tests/kokkos.cc                    | 3 +--
 9 files changed, 10 insertions(+), 19 deletions(-)
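For readers unfamiliar with the workflow, the following is a minimal, self-contained sketch of how `git bisect run` narrows a regression down to a single commit. It does not touch deal.II: the throwaway repository, the file `flags.txt`, and the `grep`-based check are hypothetical stand-ins for "configure and build the failing target".

```shell
#!/bin/sh
# Toy demonstration of `git bisect run`. The repository, the file flags.txt and
# the grep-based check are hypothetical stand-ins for a real configure-and-build.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "bisect demo"
git config user.email "bisect-demo@example.invalid"

# Six revisions; from revision 4 on, the include flag switches to -isystem.
for i in 1 2 3 4 5 6; do
  if [ "$i" -ge 4 ]; then flag="-isystem"; else flag="-I"; fi
  echo "$flag bundled/include (revision $i)" > flags.txt
  git add flags.txt
  git commit -q -m "revision $i"
done

# Mark HEAD as bad and the oldest revision as good, then let git drive the
# search. The check exits 0 (good) while -isystem is absent, 1 (bad) otherwise.
git bisect start HEAD HEAD~5 >/dev/null
git bisect run sh -c '! grep -q -e -isystem flags.txt' >/dev/null

# After a finished bisection, refs/bisect/bad names the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad_subject"

git bisect reset >/dev/null 2>&1
```

In a real session the command passed to `git bisect run` would be a script that configures and builds the failing target and returns a non-zero exit code on failure (exit code 125 tells git to skip a commit that cannot be tested).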

@masterleinad
Member

Can you print the full failing compile line (make VERBOSE=1)? I would be curious if using isystem makes a difference.

@tamiko
Member

tamiko commented Jun 22, 2023

@masterleinad I am pretty certain that the compiler runs into some funny internal compiler error due to -isystem.
I have just regained access to a local cluster with icc 19 and I will attempt to reproduce and come up with a fix for our cmake configuration. @marcfehling Thanks for the awesome git bisect!

@marcfehling
Member Author

marcfehling commented Jun 22, 2023

bad: e4f621c detailed-bad.log

$ make VERBOSE=1 obj_tbb_debug
...
[100%] Building CXX object bundled/tbb-2018_U2/src/CMakeFiles/obj_tbb_debug.dir/tbb/x86_rtm_rw_mutex.cpp.o
cd /build-dealii/bundled/tbb-2018_U2/src && /cm/shared/apps/spack/cpu/opt/spack/linux-centos8-zen/gcc-8.3.1/intel-19.1.1.217-4d42ptjd6wsnh5bgbzcv6lp44vxpjwut/bin/icpc -DDEBUG -DDO_ITT_NOTIFY -DUSE_PTHREAD -D__TBB_BUILD=1 -I/build-dealii/bundled/tbb-2018_U2/src -I/dealii/bundled/tbb-2018_U2/src -I/dealii/bundled/tbb-2018_U2/src/rml/include -isystem /dealii/bundled/tbb-2018_U2/include -isystem /dealii/bundled/kokkos-3.7.00/algorithms/src -isystem /dealii/bundled/kokkos-3.7.00/containers/src -isystem /dealii/bundled/kokkos-3.7.00/core/src -isystem /dealii/bundled/kokkos-3.7.00/simd/src -isystem /dealii/bundled/kokkos-3.7.00/tpls/desul/include -isystem /dealii/bundled/muparser_v2_3_3/include -fpic -w2 -diag-disable=remark -diag-disable=16219 -wd21 -wd68 -wd135 -wd175 -wd177 -wd191 -wd193 -wd279 -wd327 -wd383 -wd854 -wd981 -wd1418 -wd1478 -wd1572 -wd2259 -wd2536 -wd2651 -wd3415 -wd15531 -wd15552 -wd111 -wd128 -wd185 -wd186 -wd280 -qopenmp-simd -pthread -Wno-parentheses -O0 -g -gdwarf-2 -grecord-gcc-switches -o CMakeFiles/obj_tbb_debug.dir/tbb/x86_rtm_rw_mutex.cpp.o -c /dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp
TBB Warning: tbb/task_scheduler_init.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
TBB Warning: tbb/atomic.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
/dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp(83): error: class "tbb::spin_rw_mutex_v3::scoped_lock" has no member "internal_set_mutex"
              s.my_scoped_lock.internal_set_mutex(NULL);
                               ^

/dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp(129): error: class "tbb::spin_rw_mutex_v3::scoped_lock" has no member "internal_set_mutex"
                  s.my_scoped_lock.internal_set_mutex(this);  // need mutex for release()
                                   ^

/dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp(172): error: class "tbb::spin_rw_mutex_v3::scoped_lock" has no member "internal_set_mutex"
                  s.my_scoped_lock.internal_set_mutex(this);  // need mutex for release()
                                   ^

compilation aborted for /dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp (code 2)
make[3]: *** [bundled/tbb-2018_U2/src/CMakeFiles/obj_tbb_debug.dir/build.make:498: bundled/tbb-2018_U2/src/CMakeFiles/obj_tbb_debug.dir/tbb/x86_rtm_rw_mutex.cpp.o] Error 2
make[3]: Leaving directory '/build-dealii'
make[2]: *** [CMakeFiles/Makefile2:1847: bundled/tbb-2018_U2/src/CMakeFiles/obj_tbb_debug.dir/all] Error 2
make[2]: Leaving directory '/build-dealii'
make[1]: *** [CMakeFiles/Makefile2:1854: bundled/tbb-2018_U2/src/CMakeFiles/obj_tbb_debug.dir/rule] Error 2
make[1]: Leaving directory '/build-dealii'
make: *** [Makefile:352: obj_tbb_debug] Error 2

good: 5f989f0 detailed-good.log

$ make VERBOSE=1 obj_tbb_debug
...
[100%] Building CXX object bundled/tbb-2018_U2/src/CMakeFiles/obj_tbb_debug.dir/tbb/x86_rtm_rw_mutex.cpp.o
cd /build-dealii/bundled/tbb-2018_U2/src && /cm/shared/apps/spack/cpu/opt/spack/linux-centos8-zen/gcc-8.3.1/intel-19.1.1.217-4d42ptjd6wsnh5bgbzcv6lp44vxpjwut/bin/icpc -DDEBUG -DDO_ITT_NOTIFY -DUSE_PTHREAD -D__TBB_BUILD=1 -I/build-dealii/bundled/tbb-2018_U2/src -I/dealii/bundled/tbb-2018_U2/src -I/dealii/bundled/tbb-2018_U2/include -I/dealii/bundled/kokkos-3.7.00/algorithms/src -I/dealii/bundled/kokkos-3.7.00/containers/src -I/dealii/bundled/kokkos-3.7.00/core/src -I/dealii/bundled/kokkos-3.7.00/simd/src -I/dealii/bundled/kokkos-3.7.00/tpls/desul/include -I/dealii/bundled/muparser_v2_3_3/include -I/dealii/bundled/tbb-2018_U2/src/rml/include -fpic -w2 -diag-disable=remark -diag-disable=16219 -wd21 -wd68 -wd135 -wd175 -wd177 -wd191 -wd193 -wd279 -wd327 -wd383 -wd854 -wd981 -wd1418 -wd1478 -wd1572 -wd2259 -wd2536 -wd2651 -wd3415 -wd15531 -wd15552 -wd111 -wd128 -wd185 -wd186 -wd280 -qopenmp-simd -pthread -Wno-parentheses -O0 -g -gdwarf-2 -grecord-gcc-switches -o CMakeFiles/obj_tbb_debug.dir/tbb/x86_rtm_rw_mutex.cpp.o -c /dealii/bundled/tbb-2018_U2/src/tbb/x86_rtm_rw_mutex.cpp

@marcfehling
Member Author

> Can you print the full failing compile line (make VERBOSE=1)? I would be curious if using isystem makes a difference.

It looks like -isystem only appears in the bad configuration. It is absent in the good one.
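Long compile lines are hard to compare by eye. One possible way to check this kind of difference mechanically is sketched below: split both commands into tokens and print those that occur in only one of them. The two command lines are shortened, illustrative versions of the ones in the logs above, and the helper name `tokens` is an assumption made for this sketch.

```shell
#!/bin/sh
# Print the tokens that occur in only one of two compile commands.
# Both commands are abbreviated stand-ins for the full lines in the logs.
set -eu
LC_ALL=C; export LC_ALL   # fixed collation so that sort and comm agree

bad='icpc -I/dealii/bundled/tbb-2018_U2/src -isystem /dealii/bundled/tbb-2018_U2/include -fpic -O0 -c x86_rtm_rw_mutex.cpp'
good='icpc -I/dealii/bundled/tbb-2018_U2/src -I/dealii/bundled/tbb-2018_U2/include -fpic -O0 -c x86_rtm_rw_mutex.cpp'

# One token per line, sorted and de-duplicated, as comm(1) expects.
tokens() { printf '%s\n' "$1" | tr -s ' ' '\n' | sort -u; }

tmp=$(mktemp -d)
tokens "$bad"  > "$tmp/bad"
tokens "$good" > "$tmp/good"

only_bad=$(comm -23 "$tmp/bad" "$tmp/good")    # tokens unique to the bad line
only_good=$(comm -13 "$tmp/bad" "$tmp/good")   # tokens unique to the good line
rm -r "$tmp"

echo "only in bad:";  echo "$only_bad"
echo "only in good:"; echo "$only_good"
```

Because `-isystem` and its directory are separate tokens, the bad line contributes both the flag and the bare path, while the good line contributes the fused `-I<path>` form. The sketch ignores the order of the flags, which also differs between the two logs.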

@tamiko
Member

tamiko commented Jun 22, 2023

I am betting that the issue is the -isystem in front of the tbb includes.

I am guessing you removed parts of the path for privacy reasons? Would you mind running the command (with the failing compiler) and just replacing the one instance of -isystem in front of the tbb include by -I?

Edit: reproduced
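The experiment suggested above can be scripted rather than edited by hand. The following is a minimal sketch, assuming the command line is available as a string; the command shown is a shortened, illustrative version of the failing one, and only the `-isystem` in front of the TBB include directory is rewritten.

```shell
#!/bin/sh
# Replace "-isystem <tbb include dir>" by "-I<tbb include dir>" in a compile
# command, leaving every other -isystem flag untouched.
# The command below is an abbreviated stand-in for the full failing line.
set -eu

cmd='icpc -I/dealii/bundled/tbb-2018_U2/src -isystem /dealii/bundled/tbb-2018_U2/include -isystem /dealii/bundled/kokkos-3.7.00/core/src -c x86_rtm_rw_mutex.cpp'

# The capture group keeps the directory; only the flag in front of it changes.
fixed=$(printf '%s\n' "$cmd" |
  sed 's|-isystem \(/dealii/bundled/tbb-2018_U2/include\)|-I\1|')

echo "$fixed"
```

The rewritten command can then be run from the build directory with the failing compiler to see whether the errors about `internal_set_mutex` disappear.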

@tamiko
Member

tamiko commented Jun 22, 2023

@marcfehling Triaged. Pull request incoming.
