Gaea Intel template C5 updates #44

Merged
merged 10 commits into from
Sep 1, 2022

Conversation

ceblanton
Contributor

@ceblanton ceblanton commented Jul 8, 2022

#42 #45

Proposed updates (mostly from @nikizadehgfdl, @bensonr, and @thomas-robinson )

  • For prod, change -xsse2 to -march=core-avx2. This changes the ISA from SSE2 to AVX2 and provides generic hardware support. It is believed that AVX2 can now support run-to-run reproducibility.

  • For repro and debug, change -xsse2 to -march=core-avx-i. So far, this appears to provide complete reproducibility (across layouts, OpenMP, and threads).

  • 3 templates: intel-oneapi, intel-classic, intel (ifort/icx)

  • For intel-classic, add -qno-opt-dynamic-align to FFLAGS and CFLAGS; it is needed for run-to-run reproducibility when using vector ISAs.

  • For intel, add -qno-opt-dynamic-align to FFLAGS only (as icx does not support it)

  • For all, add -qopt-report-phase=vec and -qopt-report=2 to CFLAGS and FFLAGS when VERBOSE is used

  • For intel-oneapi and intel, remove -sox from FFLAGS as ifx does not support it

  • For intel-oneapi and intel, remove -ftrapuv from CFLAGS_DEBUG as icx (clang) does not support it

  • For intel-oneapi and intel, define the HAVE_GETTID macro, as the GNU C library appears to provide gettid

  • Remove the unneeded and problematic (interferes with parallel make) SHELL variable

  • Remove MAKEFLAGS as it will request too much parallelism for batch jobs, and can be confusing as there are other switches in the FRE workflow to set the fremake parallelism

Chris Blanton and others added 8 commits July 7, 2022 14:01
C5 will use different Intel templates, but the
C3/C4 Intel template should remain the same to avoid answer changes.
… classic, and "production" or ifort/ifx

- For all 3, change -xsse2 to -march=core-avx2. This changes the ISA from xsse2 to avx2, and provides generic hardware support. It's believed the avx2 ISA can now support run-to-run reproducibility.
- For all 3, add -qopt-report-phase=vec and -qopt-report=2 to CFLAGS and FFLAGS when VERBOSE is used
- For intel-classic and intel, add -qno-opt-dynamic-align, needed for run-to-run reproducibility when using vector ISAs.
- Also for intel-classic and intel, remove -ftrapuv from CFLAGS_DEBUG as icx (clang) does not support it
- For intel-oneapi, remove -sox from FFLAGS as ifx does not support it
ISA to FFLAGS as it is not a supported icx option
as C5's environment contains its own gettid Linux system call
available in the GNU standard library. The intel-classic
environment does not have the gettid system function defined.

More details: NOAA-GFDL/FMS#276

The error one sees is:
src/FMS/affinity/affinity.c:55:14: error: static declaration of 'gettid' follows non-static declaration
static pid_t gettid(void)
             ^
/usr/include/bits/unistd_ext.h:34:16: note: previous declaration is here
extern __pid_t gettid (void) __THROW;
               ^
1 error generated.
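
The clash above happens when a source file defines its own static `gettid` while the system header already declares one. A minimal sketch of how a `HAVE_GETTID` guard can avoid it is shown here. The helper name `current_tid` is hypothetical and used only for illustration; this is not the FMS code, and it assumes a Linux system where `SYS_gettid` is available.

```c
/* Sketch only: current_tid() is a hypothetical helper, not the FMS source. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* asks glibc to declare gettid() (if present) and syscall() */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the calling thread's kernel thread ID (Linux only). */
pid_t current_tid(void)
{
#ifdef HAVE_GETTID
    /* The C library already declares gettid(); use it directly. */
    return gettid();
#else
    /* No library wrapper assumed: fall back to the raw system call. */
    return (pid_t)syscall(SYS_gettid);
#endif
}
```

Compiling with `-DHAVE_GETTID` selects the library function; leaving the macro undefined selects the raw system call. Because the helper never redeclares `gettid` itself, neither path can conflict with the declaration in the system header.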
which interferes with parallel/recursive make
based on the number of cores available.

This setting is undesired and potentially dangerous for batch jobs,
as it assumes the job can use all cores available. Instead, for batch
jobs the parallelism should be the number of cores allocated to the job.
(Updates to FRE do this.) For interactive jobs, users can set
MAKEFLAGS themselves (e.g. in their shell init files) or as
the make -j option.
@ceblanton ceblanton marked this pull request as ready for review August 3, 2022 21:16
1. GNU Compiler Collection, not GNU
2. nvhpc instead of pgi
…and remove

unsupported "-h nosecond_underscore" option
@ceblanton ceblanton merged commit 4a99a8b into NOAA-GFDL:master Sep 1, 2022