Easier cross-compiling for level 4? #5

stuarteberg · 2024-05-29T15:46:24Z

Comment:

The conda-forge docs for the microarch-optimized builds have an example that uses microarch_level: 4. But the README for this feedstock contains the following caveat:

When building packages on CI, level=4 will not be guaranteed, so you can only use level<=3 to build.

Indeed, when I tried to use level 4, I saw failures (in my case, it was on osx).

Nonetheless, I'd like to produce optimized builds for machines that support AVX-512 (level 4). This was possible by explicitly adding the necessary build flag in build.sh and then explicitly listing the appropriate run dependency:

# conda_build_config.yaml
microarch_level:
  - 1
  - 3  # [unix and x86_64]
  - 4  # [unix and x86_64]

# build.sh
if [[ "${microarch_level}" == "4" ]]; then
    CXXFLAGS="${CXXFLAGS} -march=x86-64-v4"
fi

# meta.yaml
requirements:
  run:
    - _x86_64-microarch-level 4  # [unix and x86_64 and microarch_level == 4]

Using that workaround, we were able to produce optimized binaries (including -march=x86-64-v4) in the graph-tool feedstock (conda-forge/graph-tool-feedstock#140).

Would it be possible to make that easier for feedstock maintainers, perhaps by having the microarch-level-feedstock produce yet another output?

Right now this feedstock produces two packages for each arch, such as:

1. x86_64-microarch-level
   a. Introduces the -march=x86-64-v${level} flag in CFLAGS etc.
   b. Introduces a run_export to _x86_64-microarch-level
2. _x86_64-microarch-level
   a. Introduces a run dependency on the appropriate __archspec virtual package.

...but it seems like cross-compilation would be easier if we were to split the functionality of 1.a and 1.b into two separate packages, so we could easily obtain the correct CFLAGS without pulling in the __archspec dependency. Perhaps we could offer two variants of the package: one that provides both 1.a and 1.b, and another that provides only 1.a. (I'm just spitballing here...)

Alternatively, we could just drop the run_exports from the {{ family }}-microarch-level recipe. In that case, feedstock maintainers could build level-4 packages without needing to add the compiler flag explicitly, but they would be forced to explicitly list the appropriate runtime dependency in their recipe, which could be annoying:

requirements:
  build:
    - x86_64-microarch-level {{ microarch_level }}  # [unix and x86_64]
    - ppc64le-microarch-level {{ microarch_level }}  # [unix and ppc64le]
  run:
    - _x86_64-microarch-level >={{ microarch_level }} # [unix and x86_64]
    - _ppc64le-microarch-level >={{ microarch_level }} # [unix and ppc64le]


traversaro · 2024-05-29T15:48:32Z

This is probably related to the discussion in conda-forge/conda-forge.github.io#1261.

isuruf · 2024-05-29T20:30:04Z

This is a deficiency of run_exports: strong run_exports from a build dependency apply to both host and run, and we have no way of specifying build -> run only. I suggest using ignore_run_exports_from and manually adding the dependencies to run.
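
For illustration, the suggestion above might look roughly like the following in a recipe. This is a minimal sketch, not taken from an actual feedstock: it assumes the package names and selectors used earlier in this thread, and it assumes the only thing lost by ignoring the run_export is the run dependency on the underscore-prefixed package. Whether this alone is enough for level 4 builds to pass on CI is not confirmed in this thread.

```yaml
# meta.yaml (sketch under the assumptions stated above)
build:
  ignore_run_exports_from:
    # Keep the compiler flags from the build package, but drop its strong
    # run_export so it no longer propagates to host and run.
    - x86_64-microarch-level  # [unix and x86_64]

requirements:
  build:
    - x86_64-microarch-level {{ microarch_level }}  # [unix and x86_64]
  run:
    # Re-add by hand the dependency the run_export would have added to run.
    - _x86_64-microarch-level >={{ microarch_level }}  # [unix and x86_64]
```

The trade-off is the one already noted in the issue: the runtime dependency has to be listed explicitly and kept in sync with the build requirement.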

stuarteberg added the question label May 29, 2024

stuarteberg mentioned this issue May 29, 2024

Attempt microarch build conda-forge/graph-tool-feedstock#140

Merged


jjhelmus mentioned this issue Jun 4, 2024

host microarch level leaks into run requirements #6

Closed
