Tpetra::Distributor: Inheriting from VerboseObject<Distributor> causes crash in Import ctor w/ various Intel compiler versions #63
mhoemmen
pushed a commit
that referenced
this issue
Dec 24, 2015
@trilinos/tpetra A user reported odd crashes with Intel 15.0.4 + OpenMPI 1.8 and Intel 16 + Intel MPI. The crashes showed up as an infinite loop of SIGSEGV in the constructor of Teuchos::VerboseObject, as invoked by Tpetra::Distributor's three-argument constructor, as invoked by Tpetra::Import's usual two-argument (two Maps) constructor. It looked like the initializeVerboseObjectBase method was calling itself recursively, though it wasn't actually doing that in the code.

I knew that the Intel compiler had some issues with name resolution in nested scopes involving templated classes, and guessed that this could be related. The reason for the guess is that VerboseObject has a template parameter which is supposed to be its child class, in an odd but apparently standard C++ design pattern. Distributor thus inherits from VerboseObject<Distributor>. If I find this confusing, surely the compiler could too, right?

Distributor really didn't need to inherit from VerboseObject. This was something I had added a few years ago, to help me debug MPI communication issues. When I removed the inheritance, that fixed the issue.

Thus, Distributor no longer inherits from Teuchos::VerboseObject. For backwards compatibility, it retains the "VerboseObject" sublist in its list of valid parameters (for setParameterList and the constructors that take a ParameterList), but it now ignores that sublist. This fixes Issue #63.
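For readers unfamiliar with the pattern mentioned in the commit message (a base class templated on its own child class, usually called the curiously recurring template pattern), here is a minimal sketch. It does not use Teuchos: VerboseObjectSketch, DistributorSketch, PlainDistributorSketch, initializeBase, and initCount are all hypothetical names made up for illustration, and the real Teuchos::VerboseObject is considerably more involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a CRTP base such as Teuchos::VerboseObject.
// The template parameter is expected to be the class that inherits from it.
template <class Derived>
class VerboseObjectSketch {
public:
  // The base constructor runs an initialization step exactly once.
  VerboseObjectSketch() { initializeBase(); }

  // Static dispatch into the derived class, without virtual functions.
  std::string describe() const {
    return "verbose:" + static_cast<const Derived*>(this)->name();
  }

  int initCount() const { return initCount_; }

private:
  void initializeBase() { ++initCount_; }
  int initCount_ = 0;
};

// Before the fix (sketch): the class passes itself as the template argument.
class DistributorSketch : public VerboseObjectSketch<DistributorSketch> {
public:
  std::string name() const { return "Distributor"; }
};

// After the fix (sketch): no inheritance, so the templated base constructor
// is never involved.
class PlainDistributorSketch {
public:
  std::string name() const { return "Distributor"; }
};
```

With a conforming compiler, constructing a DistributorSketch runs initializeBase() once. The sketch only illustrates the shape of the inheritance that was removed; it is not a reproducer for the Intel compiler crash.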
GitHub didn't pick up on my commit actually closing this issue, rather than just referencing it.
bartlettroscoe
added a commit
that referenced
this issue
May 26, 2021
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' At commit: commit 74f61da9a7a97742e941585166025e849c6dcfaf Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Wed May 26 16:56:22 2021 -0600 Summary: Add link back to function call graph (#63)
bartlettroscoe
added a commit
that referenced
this issue
Jul 15, 2022
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' Git describe: Vera4.0-RC1-start-1202-g24463542 At commit: commit e121d76729c0ebe67a6e09f9e7a6f2cb7c61b5ae Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Thu Jul 14 19:46:24 2022 -0600 Summary: Add entry for <TPLNAME>_LIB_ENABLED_DEPENDENCIES to build ref (#63, #299, #494)
bartlettroscoe
added a commit
that referenced
this issue
Aug 11, 2022
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' Git describe: Vera4.0-RC1-start-1241-gd807b172 At commit: commit b00ab335494761e6ac48d50971444daa5a502927 Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Thu Aug 11 07:35:38 2022 -0600 Summary: Fix test dependencies for subpackages and subpackage tests/examples enables (#63, #268, #299, #510)
bartlettroscoe
added a commit
to bartlettroscoe/Trilinos
that referenced
this issue
Feb 9, 2023
When ParMETIS is found by an upstream CMake project (e.g. KokkosKernels), the value of HAVE_PARMETIS_VERSION_4_0_3l does not get written into the ParMETISConfig.cmake file, so Zoltan2 was configuring with an error. But this is a check for Zoltan2, not for other Trilinos packages, so this check should be in Zoltan2, not in FindTPLParMETIS.cmake.

NOTE: If we want to support exporting various variables into the generated <tplName>Config.cmake files for TriBITS TPLs generated using tribits_tpl_find_include_dirs_and_libraries(), then we will need to do some refactoring in TriBITS to make that possible. But that should not be too hard. This was just not needed in this case.
bartlettroscoe
added a commit
to bartlettroscoe/Trilinos
that referenced
this issue
Mar 29, 2023
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' Git describe: Vera4.0-RC1-start-1488-gd9a337d8 At commit: commit 014a1538939b783602d2af6033b2175edfb51a96 Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Wed Mar 29 10:41:41 2023 -0600 Summary: Remove unused legacy RELATIVE_PATH code (trilinos#63, trilinos#560)
jmgate
pushed a commit
to tcad-charon/Trilinos
that referenced
this issue
Apr 1, 2023
…s:develop' (28a7b37).
* trilinos-develop:
  Have Kokkos TriBITS build set compiler options as target properties (trilinos#11545)
  Update logic for TPL_ENABLE_Kokkos=ON (trilinos#11545)
  TrilinosInstallTests_find_package_Trilinos: Run in own subdir
  Move check for ParMETIS version for Zoltan2 to Zoltan2 (trilinos#63)
  Have Kokkos TriBITS build properly export options to package config files (trilinos#11545)
bartlettroscoe
added a commit
that referenced
this issue
May 13, 2023
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' Git describe: vera-release-3.5-start-1789-gb669da78 At commit: commit 9e6f4b999cfc38fcc40475491dc962ae514c08f2 Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Thu May 11 13:36:23 2023 -0600 Summary: Fix some old misspellings caught by codespell tests (#63)
jwillenbring
referenced
this issue
in jwillenbring/Trilinos
Jun 12, 2023
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' Git describe: Vera4.0-RC1-start-1488-gd9a337d8 At commit: commit 014a1538939b783602d2af6033b2175edfb51a96 Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Wed Mar 29 10:41:41 2023 -0600 Summary: Remove unused legacy RELATIVE_PATH code (#63, trilinos#560)
jwillenbring
referenced
this issue
in jwillenbring/Trilinos
Jun 12, 2023
When ParMETIS is found by an upstream CMake project (e.g. KokkosKernels), the value of HAVE_PARMETIS_VERSION_4_0_3l does not get written into the ParMETISConfig.cmake file, so Zoltan2 was configuring with an error. But this is a check for Zoltan2, not for other Trilinos packages, so this check should be in Zoltan2, not in FindTPLParMETIS.cmake.

NOTE: If we want to support exporting various variables into the generated <tplName>Config.cmake files for TriBITS TPLs generated using tribits_tpl_find_include_dirs_and_libraries(), then we will need to do some refactoring in TriBITS to make that possible. But that should not be too hard. This was just not needed in this case.
bartlettroscoe
added a commit
that referenced
this issue
Jun 27, 2023
Origin repo remote tracking branch: 'github/master' Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git' Git describe: vera-release-3.5-start-1802-g6d9eaef0 At commit: commit 24e96fe74bfca6c42b8aedb96468ee4ed22a8062 Author: Roscoe A. Bartlett <rabartl@sandia.gov> Date: Tue Jun 27 08:14:20 2023 -0600 Summary: Print out <Package>_DIR and error out if not set (#63)
@trilinos/tpetra @amklinv
With Intel 15.0.4 and Intel 16, using either Intel MPI or OpenMPI 1.8, the Tpetra::Import constructor crashes. Micah chased this down and found out that the Tpetra::Distributor instance that Import creates inherits from Teuchos::VerboseObject<Tpetra::Distributor> (looks weird, I know, but this is how one is supposed to use VerboseObject), and that VerboseObject's constructor was getting into a weird multiply segfaulting infinite loop when calling initializeVerboseObjectBase().
Some versions of the Intel compiler have trouble with name resolution in certain cases. On a hunch, I made Distributor no longer inherit from VerboseObject, and that fixed the problem. Yay!
Thus, the fix is to remove Distributor's inheritance from VerboseObject. This was pretty easy to do but I want to make sure that Distributor's ParameterList still accepts a "VerboseObject" sublist (see e.g., line 430 of Tpetra_Distributor.cpp).
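The backwards-compatibility requirement (keep accepting a "VerboseObject" sublist, but ignore it) can be sketched roughly as follows. This is not the Teuchos::ParameterList API and not the code in Tpetra_Distributor.cpp: ParamListSketch, DistributorParamsSketch, and the "Debug" flag are hypothetical stand-ins chosen so the example compiles on its own.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical, simplified parameter list: top-level boolean flags plus one
// level of named sublists holding string values.
struct ParamListSketch {
  std::map<std::string, bool> flags;
  std::map<std::string, std::map<std::string, std::string>> sublists;
};

// Hypothetical class that validates its parameters the way the issue asks:
// unknown names are rejected, but a "VerboseObject" sublist stays valid.
class DistributorParamsSketch {
public:
  void setParameterList(const ParamListSketch& plist) {
    for (const auto& kv : plist.flags) {
      if (kv.first != "Debug") {  // "Debug" is an assumed example parameter
        throw std::invalid_argument("unknown parameter: " + kv.first);
      }
      debug_ = kv.second;
    }
    for (const auto& kv : plist.sublists) {
      if (kv.first != "VerboseObject") {
        throw std::invalid_argument("unknown sublist: " + kv.first);
      }
      // Accepted for backwards compatibility; its contents are ignored.
    }
  }

  bool debug() const { return debug_; }

private:
  bool debug_ = false;
};
```

Callers that still pass a "VerboseObject" sublist keep working under this scheme, while a misspelled sublist name is still caught by validation.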