Add Parallel I/O Infrastructure #706

Open
SeanBryan51 wants to merge 11 commits into main from add-parallelio-infrastructure

@SeanBryan51 SeanBryan51 commented Mar 10, 2026

This change brings in the interface layer for working with the NetCDF Fortran and ParallelIO (PIO) libraries in CABLE. PIO allows every MPI rank to participate collectively in I/O operations. This is a first step towards adding MPI support to the serial offline driver and, eventually, replacing the legacy MPI implementation (#358).

To keep CABLE's dependencies as minimal as possible when running in serial mode (without MPI), the interface layer is designed so that PIO support is optional.

To build CABLE with PIO, PIO version 2.6.8 or greater is required.
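
A build script could guard against an older PIO before configuring. The following is only a minimal sketch under stated assumptions: `pio_version_ok` and `PIO_MIN_VERSION` are hypothetical names that do not come from this PR, the caller is assumed to supply the installed PIO version string (how it is discovered is not covered here), and the comparison relies on GNU `sort -V`.

```shell
#!/bin/sh
# Hypothetical helper, not part of CABLE's build.bash.
PIO_MIN_VERSION="2.6.8"   # minimum PIO version stated in this PR

pio_version_ok() {
    # $1: installed PIO version string, e.g. "2.6.10" (supplied by the caller).
    # "sort -V -C" succeeds only when its input is already in version order,
    # i.e. when the minimum sorts before (or equal to) the installed version.
    printf '%s\n%s\n' "$PIO_MIN_VERSION" "$1" | sort -V -C
}
```

A caller might use it as `pio_version_ok "$installed" || echo "PIO is older than $PIO_MIN_VERSION" >&2`. Version ordering matters here because a plain string comparison would rank 2.6.10 below 2.6.8.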

Type of change

Please delete options that are not relevant.

  • New feature
  • New or updated documentation

Checklist

  • The new content is accessible and located in the appropriate section
  • I have checked that links are valid and point to the intended content
  • I have checked my code/text and corrected any misspellings

Testing

  • Are the changes bitwise-compatible with the main branch? If working on an optional feature, are the results bitwise-compatible when this feature is off? If yes, copy benchcab output showing successful completion of the bitwise compatibility tests or equivalent results below this line.
2026-03-17 12:43:56,608 - INFO - benchcab.benchcab.py:380 - Running comparison tasks...
2026-03-17 12:43:56,634 - INFO - benchcab.benchcab.py:381 - tasks: 168 (models: 2, sites: 42, science configurations: 4)
2026-03-17 12:46:38,124 - INFO - benchcab.benchcab.py:391 - 0 failed, 168 passed

📚 Documentation preview 📚: https://cable--706.org.readthedocs.build/en/706/

@SeanBryan51 SeanBryan51 force-pushed the add-parallelio-infrastructure branch 9 times, most recently from ffc3905 to 0138093 on March 10, 2026 21:18
@SeanBryan51 SeanBryan51 changed the title Add parallelio infrastructure Add Parallel I/O Infrastructure Mar 11, 2026
@SeanBryan51 SeanBryan51 force-pushed the add-parallelio-infrastructure branch 4 times, most recently from 05b0997 to c10b30a on March 16, 2026 08:48
@SeanBryan51 SeanBryan51 force-pushed the add-parallelio-infrastructure branch 2 times, most recently from f9fc997 to 1d8d61f on March 17, 2026 00:42
GCC was updated from 13.x to 14.x to allow for passing a factory
procedure with a polymorphic function result to each test case
subroutine.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118372 for more
details.
@SeanBryan51 SeanBryan51 force-pushed the add-parallelio-infrastructure branch from 1d8d61f to 7f8d450 on March 17, 2026 00:48
@SeanBryan51 SeanBryan51 marked this pull request as ready for review March 17, 2026 01:13

SeanBryan51 commented Mar 17, 2026

Hi @Whyborn, now that #700 has been merged, these changes are ready to go in. Do you mind giving this your review?

Contributor

@Whyborn Whyborn left a comment

A lot of the internal functions are missing any commentary. I found a lot of the routines not particularly intuitive, so some commentary explaining in human-readable terms what the routines are doing would be a big help, I think. It doesn't have to be public-facing documentation; internal developer comments would do.

I find the tests not very "human": there are a lot of abstractions, and I didn't find it easy to reason about exactly what they're doing. This could probably be solved by some commentary as well.

Comment on lines +119 to +127
if [ $do_tests -eq 1 ]; then
    # This is required to allow for passing a factory procedure
    # with a polymorphic function result to various test cases. See
    # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118372 for more
    # details.
    module add gcc/14.1.0
else
    module add gcc/13.2.0
fi
Contributor

What's the reason for keeping gcc/13.2.0 at all? Seems a bit redundant to do the tests with one compiler, but do production compilation with another?

Collaborator Author

I think it's important to know why a compiler constraint is needed, and not to impose one unless it's necessary, as this could hurt portability in general. So I wanted to highlight that the compiler bump is needed only to build the tests and not necessarily to build the model.

Contributor

I'm personally not in favour of this; I'd much prefer to just raise the default version of gcc to the higher version. I don't think updating the gcc compiler is a difficult barrier to get over. I do like keeping the explanation of why the version was bumped. Using different compilers for tests and production opens the potential for a development to pass the unit tests but fail in production (although this is quite unlikely).

Collaborator Author

Good point, I overlooked the possibility that tests will pass for the more recent compiler, but fail for older compilers.

Happy to keep the compiler versions consistent between test and non-test builds.

I've now realised that this is a bigger issue for the Fortuno dependency itself, which requires the more recent Intel compilers when built from the ground up via FetchContent, an even bigger compiler jump.

CABLE/build.bash

Lines 108 to 114 in 7f57306

if [ $do_tests -eq 1 ]; then
    # This is required as Fortuno requires Intel Fortran
    # 2024.0.0 or higher
    module add intel-compiler-llvm/2024.0.2
else
    module add intel-compiler/2019.5.281
fi

Intel Fortran compilers should have compatible ABIs with each other, so Fortuno can be built with a more recent compiler without necessarily enforcing that compiler on the CABLE test build. I've managed to write a Spack package recipe for Fortuno, so I'll try installing Fortuno via Spack with the more recent Intel compiler, using find_package to link against it, and then try building the CABLE tests with CABLE's default compiler.

Collaborator Author

Just did some tests building the tests for various intel compilers using Fortuno built with Intel (ifx) 2025.2.0:

  • intel-compiler/2019.5.281 FAILED
  • intel-compiler/2020.3.304 FAILED
  • intel-compiler/2021.4.0 FAILED
  • intel-compiler/2021.8.0 FAILED
  • intel-compiler/2021.10.0 SUCCESS

Looks like, for the most part, ifx and ifort modules are incompatible.

Collaborator Author

@SeanBryan51 SeanBryan51 Mar 20, 2026

> Just did some tests building the tests for various intel compilers using Fortuno built with Intel (ifx) 2025.2.0:

Even some older ifx compilers are not compatible:

  • intel-compiler-llvm/2022.0.0 FAILED
  • intel-compiler-llvm/2023.0.0 FAILED
  • intel-compiler-llvm/2023.2.0 SUCCESS

So, in summary, we need at least ifort 2021.10.0 or ifx 2023.2.0 to link against Fortuno built with Intel (ifx) 2025.2.0.
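
The minimums above could be encoded as a guard in a build script. The following is a sketch only: `version_ge` and `fortuno_compatible` are hypothetical helpers that do not exist in CABLE's build.bash, the thresholds are simply the ones reported in this thread for Fortuno built with Intel (ifx) 2025.2.0, and the comparison assumes GNU `sort -V` is available.

```shell
#!/bin/sh
# Hypothetical helpers; thresholds are taken from the results in this thread.

version_ge() {
    # True when version $1 >= version $2 (relies on GNU sort -V):
    # the smaller of the two versions must be $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fortuno_compatible() {
    # $1: module string such as "intel-compiler/2021.10.0"
    family="${1%%/*}"    # text before the first "/"
    version="${1#*/}"    # text after the first "/"
    case "$family" in
        intel-compiler)      version_ge "$version" "2021.10.0" ;;  # ifort
        intel-compiler-llvm) version_ge "$version" "2023.2.0" ;;   # ifx
        *)                   return 1 ;;                           # unknown family
    esac
}
```

For example, `fortuno_compatible "intel-compiler/2021.8.0"` fails while `fortuno_compatible "intel-compiler/2021.10.0"` succeeds, matching the results listed above. A plain string comparison would get this pair wrong, which is why version ordering is used.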

integer :: i, tmp
do i = 1, size(dim_names)
  if (dim_lens(i) == CABLE_NETCDF_UNLIMITED) then
    call check_pio(pio_def_dim(this%pio_file_desc, dim_names(i), PIO_UNLIMITED, tmp))
Contributor

Can we assign CABLE_NETCDF_UNLIMITED = PIO_UNLIMITED, so that we can drop the if (dim_lens(i) == CABLE_NETCDF_UNLIMITED) then clause?

Collaborator Author

I'm going to say no on this one: CABLE_NETCDF_UNLIMITED is a constant, and in general I don't think each implementation should overwrite constants in cable_netcdf_mod.

Contributor

PIO and NetCDF use different values for UNLIMITED?

Collaborator Author

> PIO and NetCDF use different values for UNLIMITED?

They could; it depends on which value the PIO developers use and whether that is consistent with NetCDF Fortran.

Having a separate constant in cable_netcdf_mod means we don't need to worry about this.

@SeanBryan51 SeanBryan51 requested a review from abhaasgoyal March 17, 2026 06:02
@SeanBryan51 SeanBryan51 force-pushed the add-parallelio-infrastructure branch from d1767af to 87f98fd on March 17, 2026 06:18

SeanBryan51 commented Mar 18, 2026

Thanks for taking a look, Lachlan. I forgot to add docs to cable_netcdf_internal.F90. I realised cable_netcdf_internal.F90 was only really necessary for initialising the I/O handler, so I've trimmed that file down to only the I/O handler initialisation and renamed it to cable_netcdf_init.F90 in 0f30366.
