Skip to content

Conversation

@dcodoni
Copy link
Contributor

@dcodoni dcodoni commented Aug 25, 2025

The Trilinos implementation is updated to use the latest packages available in the library. (address #422 )

Current situation

Currently the Trilinos implementation relies on archived, not supported or deprecated packages. In particular, it is based on the following packages:

  • Epetra: legacy linear algebra library in Trilinos for dense and sparse matrices and vectors. It has limited flexibility, making it harder to maintain and extend, for example if a custom preconditioner needs to be implemented in the trilinos environment (Planned Archival/Deprecation)
  • AztecOO: iterative linear solver package in Trilinos, it lacks some linear solvers such as FGMRES (Planned Archival/Deprecation)
  • Ifpack: legacy preconditioner package for Epetra-based matrices (Planned Archival/Deprecation)
  • ML: multilevel preconditioner package (Planned Archival/Deprecation)
    Since Ifpack and ML are deprecated, they are not developed anymore and new optimized preconditioners will not be available in our solver.

Release Notes

Packages updated:

  • Epetra -> Tpetra: actively developed, templated, and Kokkos-enabled for GPU/CPU portability.
  • AztecOO -> Belos: supported modern iterative solver framework with flexible operator support.
  • Ifpack -> Ifpack2: supported templated preconditioner package compatible with Tpetra.
  • ML -> MueLu: actively developed multigrid preconditioner framework with advanced scalability (parameters are tuned for working on a general fsi problem).
    The core package Teuchos is used for memory management, communication utilities, parameter lists, and interoperability across Trilinos packages.

All updated packages are wrapped under a Kokkos Node abstraction, enabling the same code to target different execution spaces. This provides the possibility of running for example on GPUs in the future, while still supporting efficient CPU execution when GPU usage is not required.

The workflow files are updated to use the latest docker image which contains the updated Trilinos built.
The dockerfiles are updated also for reference.

Documentation

Documentation about Trilinos linear algebra should be added.

Testing

  • existing tests using Trilinos are updated.
  • a new test is added for the fluid iliac artery to use Trilinos ML preconditioner

Code of Conduct & Contributing Guidelines

* main packages updates: Epetra->Tpetra; AztecOO->Belos; Ifpack->Ifpack2; ML->MueLu
* refactor linear algebra backend to use Teuchos and Kokkos:
- wrap Tpetra vectors and matrices with Teuchos::RCP for safer memory management and automatic reference counting.
- use Kokkos (via Tpetra::Node/KokkosDeviceWrapperNode) to abstract parallel execution (same code to run efficiently on CPU and GPU theoretically)
* removed all static objects
* change of IC/ICT preconditioners in favor of RILUK(0)/RILUK(1)
* removed ubuntu20 and ubuntu22 folders. Created ubunut/ folder that contains dockerfile based on ubuntu-latest
* modified the solver/dockerfile to use the ubuntu-latest docker container as a base
…ner:

* simvascular/libraries:latest is the image containing the latest trilinos built
* increased the maximum iteration number for linear solver
* add the fluid case iliac artery with ml preconditioner
@dcodoni dcodoni requested a review from ktbolt August 25, 2025 23:23
Copy link
Collaborator

@ktbolt ktbolt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcodoni Nice work !

Make sure to update the Trilinos build instructions in the README.

* kokkos::finalize has been moved inside the TrilinosImpl class, fixing the compilation errors whenever trilinos is not used
* trilinos linear solver residual test has been properly set, fixing first non-linear iteration problem of maxing out the linear iterations reducing excessively and unnecessarily the residual
Copy link
Collaborator

@ktbolt ktbolt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good !

dcodoni and others added 7 commits August 26, 2025 15:57
* string length was computed incorrectly, leading to silent buffer overflow.
* this only surfaced as a critical memory crash when linking with the latest Trilinos libraries.
- Added --ignore-errors mismatch to avoid geninfo line mismatch errors
- Added --rc geninfo_unexecuted_blocks=1 to silence GCC warnings
@ktbolt
Copy link
Collaborator

ktbolt commented Sep 16, 2025

@dcodoni What's the plan here ?

@dcodoni
Copy link
Contributor Author

dcodoni commented Sep 16, 2025

@ktbolt macos now is failing the ustruct tests even though they were succeeding before. The coverage problem I honestly have no idea yet!

@ktbolt
Copy link
Collaborator

ktbolt commented Sep 16, 2025

@dcodoni MacOS tests failed because of an MPI error related (maybe) to a network time out, they now complete.

Ubuntu is failing because of a make error related (I think) to computing coverage.

@dcodoni
Copy link
Contributor Author

dcodoni commented Sep 16, 2025

@ktbolt the ubuntu test fails during generation of code coverage with error: geninfo: ERROR: mismatched end line. I tried then to ignore this type of errors by including in the CmakeLists the following flags: --ignore-errors mismatch --rc geninfo_unexecuted_blocks=1. This made it go further but at the end I got another make error, which I haven't investigated still.

@mrp089
Copy link
Member

mrp089 commented Sep 16, 2025

Should we also exclude the tests/unitTests folder from coverage? That's currently included in computing overall coverage.

@dcodoni
Copy link
Contributor Author

dcodoni commented Sep 16, 2025

@mrp089 that is actually a good idea, and I think it may solve the problem I have been having

@codecov
Copy link

codecov bot commented Sep 16, 2025

Codecov Report

❌ Patch coverage is 73.29545% with 94 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.35%. Comparing base (44fc5ec) to head (3b34dd5).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
Code/Source/solver/trilinos_impl.cpp 73.80% 88 Missing ⚠️
Code/Source/solver/trilinos_impl.h 33.33% 4 Missing ⚠️
Code/Source/solver/main.cpp 60.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #424      +/-   ##
==========================================
- Coverage   67.47%   67.35%   -0.13%     
==========================================
  Files         170      169       -1     
  Lines       32622    32620       -2     
  Branches     5740     5718      -22     
==========================================
- Hits        22012    21970      -42     
- Misses      10472    10513      +41     
+ Partials      138      137       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dcodoni
Copy link
Contributor Author

dcodoni commented Sep 17, 2025

@ktbolt please review the changes in Code/Source/solver/CMakeLists.txt. In particular this flag: --ignore-errors mismatch avoided the line mismatch type of error I was having. The alternative is to keep the original CMake and exclude the unit tests from coverage as @mrp089 was saying.

@ktbolt
Copy link
Collaborator

ktbolt commented Sep 17, 2025

@dcodoni I think it is cleaner to exclude the unit tests from the coverage step, the macros used by the unit tests are probably causing problems.

* Built solver sources into a shared object library (solver_objs)
  so they can be reused by both the main executable and unit tests.
  This avoids recompiling solver sources twice when ENABLE_UNIT_TEST
  is enabled.
* Excluded unit tests from code coverage by removing coverage flags
  from the run_all_unit_tests target (-fno-profile-arcs, -fno-test-coverage).
  This prevents .gcda files from being generated for test code,
  ensuring coverage only reflects solver sources.
…e section:

* --ignore-errors mismatch --rc geninfo_unexecuted_blocks=1
* --ignore-errors unused
These flags allowed to save a coverage file even in presence of errors of mismatch type.
Fix some parametres setting in the Trilinos GMRES solver: krylov space, max iteration and max restarts.
Modified tests.yml to use the latest libraries container that has been built with latest Ubuntu and contains updated Trilinos library.
* add timer for the linear solver in Trilinos
* add the possibility to read the MueLu Options for the ML preconditioner from file mueluOptions.xml
  if file is not present built-in parameters are used
@dcodoni
Copy link
Contributor Author

dcodoni commented Oct 24, 2025

@ktbolt I found some allocation problems in the Trilinos implementation that showed up only at high number of cores and for CFD problems. I am trying to investigate it. Let's not merge it now.

@ktbolt
Copy link
Collaborator

ktbolt commented Oct 24, 2025

@dcodoni Yes I was wondering if something like that might show up.

* the localt to global map was created in sorted order, while the graph allocated in unsorted order
* the graph is now created using the sorted order like the map
@dcodoni
Copy link
Contributor Author

dcodoni commented Oct 28, 2025

@ktbolt I fixed the bug in the LHS matrix graph creation. The problem was in the allocation of the non-zero elements per row during graph creation. The local to global map of the matrix used the sorted ordering while the graph was using information in unsorted ordering. This mismatch caused the wrong allocation problem, since Tpetra does not allow for dynamic allocation unlike Epetra.

@ktbolt
Copy link
Collaborator

ktbolt commented Oct 28, 2025

@dcodoni Tricky bug to fix, good job !

@ktbolt ktbolt merged commit 27bda0f into SimVascular:main Oct 28, 2025
8 checks passed
kko27 pushed a commit to kko27/svMultiPhysics that referenced this pull request Dec 17, 2025
* Trilinos updated:
* main packages updates: Epetra->Tpetra; AztecOO->Belos; Ifpack->Ifpack2; ML->MueLu
* refactor linear algebra backend to use Teuchos and Kokkos:
- wrap Tpetra vectors and matrices with Teuchos::RCP for safer memory management and automatic reference counting.
- use Kokkos (via Tpetra::Node/KokkosDeviceWrapperNode) to abstract parallel execution (same code to run efficiently on CPU and GPU theoretically)
* removed all static objects
* change of IC/ICT preconditioners in favor of RILUK(0)/RILUK(1)

* Updating dockerfiles:
* removed ubuntu20 and ubuntu22 folders. Created ubunut/ folder that contains dockerfile based on ubuntu-latest
* modified the solver/dockerfile to use the ubuntu-latest docker container as a base

* Updated dockerfiles and workflow file to use the latest docker container:
* simvascular/libraries:latest is the image containing the latest trilinos built

* Update the input files of trilinos related test cases for fluid and fsi
* increased the maximum iteration number for linear solver
* add the fluid case iliac artery with ml preconditioner

* Modifications and improvements:
* kokkos::finalize has been moved inside the TrilinosImpl class, fixing the compilation errors whenever trilinos is not used
* trilinos linear solver residual test has been properly set, fixing first non-linear iteration problem of maxing out the linear iterations reducing excessively and unnecessarily the residual

* Updating Trilinos packages list in README.md

* Final update on dockerfile

* Fix buffer overflow in genBC_Integ_X (set_bc.cpp):
* string length was computed incorrectly, leading to silent buffer overflow.
* this only surfaced as a critical memory crash when linking with the latest Trilinos libraries.

* Changes to CMake to fix coverage and unit test mismatch

* Modifying cmake file t ofix the coverage error

* Fix coverage target for GCC (lcov/geninfo):
- Added --ignore-errors mismatch to avoid geninfo line mismatch errors
- Added --rc geninfo_unexecuted_blocks=1 to silence GCC warnings

* fix coverage: correct path expansion in 'lcov --remove ' command

* Removing the unit tests from coverage

* Refactor CMake: unify solver build, exclude unit tests from coverage
* Built solver sources into a shared object library (solver_objs)
  so they can be reused by both the main executable and unit tests.
  This avoids recompiling solver sources twice when ENABLE_UNIT_TEST
  is enabled.
* Excluded unit tests from code coverage by removing coverage flags
  from the run_all_unit_tests target (-fno-profile-arcs, -fno-test-coverage).
  This prevents .gcda files from being generated for test code,
  ensuring coverage only reflects solver sources.

* Removing the unit tests folder '*/CMakeFiles/run_all_unit_tests.dir/*'
from the coverage.

* Remove '*/CMakeFiles/run_all_unit_tests.dir/*' from the coverage list

* Adding the following flags in the Cmake file under the enable_coverage section:
* --ignore-errors mismatch --rc geninfo_unexecuted_blocks=1
* --ignore-errors unused
These flags allowed to save a coverage file even in presence of errors of mismatch type.
Fix some parametres setting in the Trilinos GMRES solver: krylov space, max iteration and max restarts.

* Updates in new trilinos implementation:
* add timer for the linear solver in Trilinos
* add the possibility to read the MueLu Options for the ML preconditioner from file mueluOptions.xml
  if file is not present built-in parameters are used

* Using fsils for the mesh equation in FSI test cases

* Fixed allocation problem in the graph creation:
* the localt to global map was created in sorted order, while the graph allocated in unsorted order
* the graph is now created using the sorted order like the map
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants