Regression in supported interconnects #63

@ocaisa

While looking at the UCX configuration in 2020.12, I noticed what looks like a regression. According to EESSI/compatibility-layer#49 (comment), we should have a UCX configuration like this:

configure: =========================================================
configure: UCX build configuration:
configure:       Build prefix:   /home/bob/ucx/inst
configure: Preprocessor flags:   -DCPU_FLAGS="|avx" -I${abs_top_srcdir}/src -I${abs_top_builddir} -I${abs_top_builddir}/src
configure:         C compiler:   x86_64-pc-linux-gnu-gcc -O3 -g -Wall -Werror -mavx
configure:       C++ compiler:   x86_64-pc-linux-gnu-g++ -O3 -g -Wall -Werror -mavx
configure:       Multi-thread:   enabled
configure:          MPI tests:   disabled
configure:      Devel headers:   no
configure:           Bindings:   < >
configure:        UCT modules:   < ib rdmacm cma >
configure:       CUDA modules:   < >
configure:       ROCM modules:   < >
configure:         IB modules:   < >
configure:        UCM modules:   < >
configure:       Perf modules:   < >
configure: =========================================================

but in the build log for UCX (on Zen2) I see

configure: =========================================================
configure: UCX build configuration:
configure:       Build prefix:   /cvmfs/pilot.eessi-hpc.org/2020.12/software/x86_64/amd/zen2/software/UCX/1.8.0-GCCcore-9.3.0
configure: Preprocessor flags:   -DCPU_FLAGS="|avx" -I${abs_top_srcdir}/src -I${abs_top_builddir} -I${abs_top_builddir}/src
configure:         C compiler:   gcc -O3 -g -Wall -Werror -mavx
configure:       C++ compiler:   g++ -O3 -g -Wall -Werror -mavx
configure:       Multi-thread:   enabled
configure:          MPI tests:   disabled
configure:      Devel headers:   no
configure:           Bindings:   < >
configure:        UCT modules:   < ib cma >
configure:       CUDA modules:   < >
configure:       ROCM modules:   < >
configure:         IB modules:   < >
configure:        UCM modules:   < >
configure:       Perf modules:   < >
configure: =========================================================

(note the missing rdmacm)

We should probably pass the features we expect in the final build explicitly (e.g. --with-rdmacm), so that configure fails when one is unavailable rather than silently building without it. UCX in particular is critical to the stack, so it could do with additional checks.
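
As one possible form of such an additional check, here is a minimal sketch that parses the "UCT modules" line of the configure summary and reports any expected module that is absent. The function names and the required set (ib, rdmacm, cma, taken from the expected output above) are assumptions for illustration, not part of any existing EESSI or EasyBuild tooling.

```python
import re


def uct_modules(configure_output):
    """Return the set of names listed on the 'UCT modules' summary line."""
    match = re.search(r"UCT modules:\s*<([^>]*)>", configure_output)
    if match is None:
        raise ValueError("no 'UCT modules' line found in configure output")
    return set(match.group(1).split())


def missing_uct_modules(configure_output, required=("ib", "rdmacm", "cma")):
    """Return the required modules (hypothetical default set) that were not built."""
    return sorted(set(required) - uct_modules(configure_output))


# Summary lines copied from the two configure outputs quoted in this issue.
expected_log = "configure:        UCT modules:   < ib rdmacm cma >"
zen2_log = "configure:        UCT modules:   < ib cma >"

print(missing_uct_modules(expected_log))  # []
print(missing_uct_modules(zen2_log))      # ['rdmacm']
```

A build script could run this over the captured configure output and abort when the returned list is non-empty, which would complement passing --with-rdmacm explicitly.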
