ci: switch to FESOM-org docker image + add fesom2_xios end-to-end run-test #902

Merged: JanStreffing merged 10 commits into main from ci/fesom2-ci-image-and-xios-cell on May 3, 2026

@JanStreffing (Collaborator) commented May 1, 2026

  • Switches the build-test workflow container from the externally-owned ghcr.io/suvarchal/fesom2-ci:latest to the FESOM-org-owned ghcr.io/fesom/fesom2_docker:fesom2_ci-master, built from FESOM/FESOM2_Docker#3. The toolchain is the same (Ubuntu 22.04 + GCC + OpenMPI + NetCDF/HDF5 + LAPACK + OASIS3-MCT prebuilt at /oasis); the recipe was reverse-engineered from suvarchal's image so that behavior is identical for the existing five matrix cells (default, coupled, coupled_yac, recom, ifs_interface).
  • Adds a new fesom2_xios.yml end-to-end run-test that mirrors the structure of fesom2_recom.yml / fesom2_cavities.yml. It runs inside ghcr.io/fesom/fesom2_docker:fesom2_test_refactoring-master (XIOS 2.5 installed at /xios in FESOM2_Docker#5; attached-mode finalize segfault fixed in FESOM2_Docker#6), compiles FESOM with ./configure.sh ubuntu -DFESOM_WITH_XIOS=ON, and runs mkrun pi test_pi -m docker. It then stages a minimal standalone iodef.xml (using_server=false, no OASIS) plus a 6-field file_def_fesom.xml (sst, a_ice, temp, salt, u, v, matching the field set of the standard test_pi bit-identical check), runs ./job_docker_new, and verifies that all six XIOS output NetCDF files were produced per rank.
  • Aligns fesom2_openmp.yml with the other run-test workflows by switching its container tag from fesom2_test_refactoring-nightly to -master. The nightly tag is only refreshed by a monthly cron, which let it lag behind master on the recent setuptools / pkg_resources fix from FESOM2_Docker#4.
  • Adds the missing std_dens axis declaration to docs/xios_xml/axis_def_fesom.xml. src/io_xios.F90:274 calls xios_set_axis_attr("std_dens", ...) unconditionally, to populate the dMOC density coordinate from std_dens_N / std_dens in oce_dens_MOC, so the axis must exist in the registry even when ldiag_dMOC is .false.; without the declaration, xios_close_context_definition aborts with CException: axis std_dens not found. Discovered by the new fesom2_xios.yml workflow.

Replaces the externally-owned ghcr.io/suvarchal/fesom2-ci:latest with
ghcr.io/fesom/fesom2_docker:fesom2_ci-master, the FESOM-org-owned image
built from FESOM/FESOM2_Docker that ships OASIS3-MCT prebuilt at /oasis
and XIOS 2.5 prebuilt at /xios.

Adds an 'xios' CMake preset (FESOM_WITH_XIOS=ON, otherwise standalone
defaults) and a corresponding matrix cell that copies /xios into the
workspace, exports XIOS_ROOT, and exercises the FESOM_WITH_XIOS=ON code
path which had no CI coverage before. The cell is build-only — XIOS
output isn't run in CI, so its ctest step is allowed to fail
(continue-on-error). fail-fast is set to false so a regression in one
matrix cell doesn't cancel the others.

@JanStreffing requested a review from suvarchal May 1, 2026 20:55
@JanStreffing self-assigned this May 1, 2026
…ings

The openmp workflow was the only one still pointing at the -nightly tag,
which is only refreshed by a monthly cron, so the setuptools<80 fix from
FESOM2_Docker#4 (which only rebuilt -master) hadn't reached it. Aligns
with fesom2_recom.yml, fesom2_cavities.yml etc.
…n-test

The xios cell in fesom2_build_tests was build-only (no XIOS server in CI),
duplicating coverage now provided by fesom2_xios.yml which both compiles
and runs FESOM with FESOM_WITH_XIOS=ON on the pi mesh, mirroring the
fesom2_recom / fesom2_cavities / fesom2_main pattern.

Drops the xios preset from CMakePresets.json (no other consumer) and
removes the xios matrix entry + the now-unused 'Copy XIOS directory'
step + XIOS_ROOT env from fesom2_build_tests.yml.

The new fesom2_xios.yml uses ghcr.io/fesom/fesom2_docker:fesom2_test_refactoring-master
(which now ships XIOS at /xios after FESOM2_Docker#5), copies the prebuilt
XIOS into the workspace, runs ./configure.sh ubuntu -DFESOM_WITH_XIOS=ON,
mkrun pi test_pi -m docker, stages docs/xios_xml/{context,field_def,file_def}_fesom.xml
plus a minimal standalone iodef.xml (using_server=false, no oasis), runs
./job_docker_new, and verifies XIOS produced sst.fesom*.nc.

XIOS context_fesom.xml references axis_def_fesom.xml, domain_def_fesom.xml,
and grid_def_fesom.xml in addition to field/file_def. The previous Stage
XIOS XMLs step only copied three of them, so XIOS aborted at startup with
'Can not open <./axis_def_fesom.xml> file'. Copy them all and overwrite
iodef.xml with the standalone variant.
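
A staging step like this can be guarded by checking that every file the context includes is actually present. The following is a minimal sketch, assuming the context XML pulls in its definition files through `src="..."` attributes; the `EXAMPLE_CONTEXT` string and both function names are illustrative and not taken from the repository.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def referenced_xml_files(context_xml):
    """Return the base names of all files named by src="..." attributes."""
    root = ET.fromstring(context_xml)
    return {
        PurePosixPath(elem.attrib["src"]).name
        for elem in root.iter()
        if "src" in elem.attrib
    }


def missing_includes(context_xml, staged):
    """Return the referenced files that are not in the set of staged file names."""
    return referenced_xml_files(context_xml) - set(staged)


# Illustrative stand-in for context_fesom.xml (element names are an assumption).
EXAMPLE_CONTEXT = """
<context id="fesom">
  <axis_definition src="./axis_def_fesom.xml"/>
  <domain_definition src="./domain_def_fesom.xml"/>
  <grid_definition src="./grid_def_fesom.xml"/>
  <field_definition src="./field_def_fesom.xml"/>
  <file_definition src="./file_def_fesom.xml"/>
</context>
"""
```

Called with only the three files that were copied before this fix (context, field_def, file_def), `missing_includes` reports the axis, domain, and grid definition files.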

io_xios.F90:274 unconditionally calls xios_set_axis_attr("std_dens", ...)
to populate the dMOC density coordinate, but the bundled axis_def_fesom.xml
only declared the vertical axes (nz, nz1). XIOS therefore threw CException
'axis std_dens not found' at xios_close_context_definition, aborting any
FESOM_WITH_XIOS=ON run that started from these reference XMLs.

Add std_dens as a top-level axis (not in the Z-axis group, since it carries
density not depth). Discovered by the new fesom2_xios.yml CI workflow.
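
The same class of failure could be caught before a model run by linting the axis registry. This is a rough sketch under assumptions: the `EXAMPLE_AXIS_DEF` layout, including the `z_axes` group id, is hypothetical and only mirrors the description above (nz and nz1 in a Z-axis group, std_dens at top level).

```python
import xml.etree.ElementTree as ET


def declared_axis_ids(axis_def_xml):
    """Return the ids of all <axis> elements, at top level or inside groups."""
    root = ET.fromstring(axis_def_xml)
    return {a.attrib["id"] for a in root.iter("axis") if "id" in a.attrib}


def missing_axes(axis_def_xml, required):
    """Return the required axis ids that the XML does not declare."""
    return set(required) - declared_axis_ids(axis_def_xml)


# Hypothetical stand-in for axis_def_fesom.xml after the fix.
EXAMPLE_AXIS_DEF = """
<axis_definition>
  <axis_group id="z_axes">
    <axis id="nz" name="nz"/>
    <axis id="nz1" name="nz1"/>
  </axis_group>
  <axis id="std_dens" name="std_dens"/>
</axis_definition>
"""

# Axes the Fortran side sets attributes on, per the commit message above.
REQUIRED_AXES = ("nz", "nz1", "std_dens")
```

With the std_dens line removed from the example, `missing_axes(..., REQUIRED_AXES)` returns `{"std_dens"}`, which is the condition that made xios_close_context_definition abort.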

Replaces the bundled docs/xios_xml/file_def_fesom.xml at runtime with a
6-field subset (sst, a_ice, temp, salt, u, v) matching what
mkfesom/settings/test_pi/setup.yml flags for the standard pi-mesh
bit-identical CI check. The bundled file_def lists ~50 fields, several
of which have stale grid_ref/prec/shape entries that crash
xios_send_field with an opaque CException.

Verified end-to-end on Levante (no-OASIS XIOS 2.5 + intel + openmpi,
2 ranks, pi mesh, 1 day, CORE2 forcing, woa18 climatology): all 6
xios-output netcdf files produced per rank, see SLURM job 24651037.
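
The "all 6 files per rank" condition can be expressed as a small presence check. The sketch below assumes the per-rank naming `<var>.fesom_<startyear>-<endyear>_<rank>.nc` and, as a further assumption, that ranks are numbered from 0; `missing_outputs` is a hypothetical helper, not the workflow's actual step.

```python
import re
from collections import defaultdict

# <var>.fesom_<startyear>-<endyear>_<rank>.nc
RANK_FILE = re.compile(
    r"^(?P<var>[^.]+)\.fesom_(?P<start>\d+)-(?P<end>\d+)_(?P<rank>\d+)\.nc$"
)

# The 6-field subset written by the reduced file_def_fesom.xml.
EXPECTED_FIELDS = ("sst", "a_ice", "temp", "salt", "u", "v")


def missing_outputs(filenames, n_ranks, fields=EXPECTED_FIELDS):
    """Map each field to the sorted ranks for which no output file was found."""
    seen = defaultdict(set)
    for name in filenames:
        match = RANK_FILE.match(name)
        if match:
            seen[match["var"]].add(int(match["rank"]))
    expected = set(range(n_ranks))  # assumption: ranks start at 0
    return {f: sorted(expected - seen[f]) for f in fields if expected - seen[f]}
```

An empty result means every field has one file per rank; otherwise the dictionary names the field and the ranks that are missing.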

@JanStreffing changed the title from "ci: use FESOM-org docker image and add xios build matrix cell" to "ci: switch to FESOM-org docker image + add fesom2_xios end-to-end run-test" May 3, 2026
@JanStreffing requested a review from patrickscholz May 3, 2026 07:45
@JanStreffing added the enhancement (New feature or request) label May 3, 2026

Adds an inline python check that mirrors mkfesom's fcheck but handles
the per-rank XIOS file naming (<var>.fesom_<startyear>-<endyear>_<rank>.nc)
that fcheck doesn't know about: glob all rank files per variable, masked
concat, mean, compare to a hardcoded reference at abs<1e-3.

Reference means were measured on Levante (intel + openmpi, 2-rank pi
mesh, 1 day, CORE2 forcing, woa18 climatology, FESOM2_Docker#6 patched
XIOS) so the CI's gfortran-side numbers may differ at the ULP level —
the 1e-3 tolerance absorbs intel-vs-gfortran rounding on a first pass.
Tighten to 1e-12 once the gfortran-side reference is known.
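
The comparison logic described above can be sketched without any NetCDF dependency. This is not the inline check from the workflow: here the per-rank data are plain Python lists, masked points are represented as `None` or NaN (an assumption made to keep the example self-contained), and reading the rank files is left to the caller. The reference values in any real use would be the measured ones, which are not reproduced here.

```python
import math


def masked_mean(per_rank_values):
    """Mean over the concatenation of all ranks, skipping masked points."""
    valid = [
        v
        for chunk in per_rank_values
        for v in chunk
        if v is not None and not math.isnan(v)
    ]
    if not valid:
        raise ValueError("no valid data points")
    return math.fsum(valid) / len(valid)


def check_against_reference(data, reference, tol=1e-3):
    """Return {var: (mean, ref)} for every variable with |mean - ref| >= tol.

    data maps a variable name to a list of per-rank value lists;
    reference maps a variable name to its expected global mean.
    """
    failures = {}
    for var, ref in reference.items():
        mean = masked_mean(data[var])
        if not abs(mean - ref) < tol:
            failures[var] = (mean, ref)
    return failures
```

Tightening the check later only requires passing a smaller `tol`, for example `tol=1e-12`, once a reference from the same compiler is available.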

@JanStreffing (Collaborator, Author) commented

Ready for review. ty! :)

@JanStreffing merged commit 2a83155 into main May 3, 2026
20 checks passed
@sebastianbeyer deleted the ci/fesom2-ci-image-and-xios-cell branch May 4, 2026 07:34