Update known issue about lmod hook in host-injection #183

Merged: 5 commits, Jun 25, 2024
29 changes: 28 additions & 1 deletion docs/known_issues/eessi-2023.06.md
@@ -7,7 +7,7 @@

<p>This is an error that occurs with OpenMPI after updating to OFED 23.10.</p>

-<p>Their is an upstream issue on this problem opened with EasyBuild.
+<p>There is an upstream issue on this problem opened with EasyBuild.
See: https://github.com/easybuilders/easybuild-easyconfigs/issues/20233</p>

<b>Workarounds</b>
@@ -25,4 +25,31 @@ export OMPI_MCA_btl='^uct,ofi'
export OMPI_MCA_pml='ucx'
export OMPI_MCA_mtl='^ofi'
```

You may also set these additional environment variables via site-specific Lmod hooks:
```
require("strict")
local hook = require("Hook")

-- Fix Failed to modify UD QP to INIT on mlx5_0: Operation not permitted
function fix_ud_qp_init_openmpi(t)
    local simpleName = string.match(t.modFullName, "(.-)/")
    if simpleName == 'OpenMPI' then
        setenv('OMPI_MCA_btl', '^uct,ofi')
        setenv('OMPI_MCA_pml', 'ucx')
        setenv('OMPI_MCA_mtl', '^ofi')
    end
end

local function combined_load_hook(t)
    if eessi_load_hook ~= nil then
        eessi_load_hook(t)
    end
    fix_ud_qp_init_openmpi(t)
end

hook.register("load", combined_load_hook)
```
For more information on writing and deploying site-specific Lmod hooks, see [EESSI Site Specific Configuration LMOD Hooks](site_specific_config/lmod_hooks.md).
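
To confirm that the hook (or the manual `export` lines) took effect, the three variables can be inspected after loading an OpenMPI module. The following is a minimal sketch, not part of EESSI or Lmod: the helper name `check_ompi_mca_vars` is hypothetical, and it only assumes the variable names and values listed in the workaround above.

```shell
# Hypothetical helper (not shipped with EESSI or Lmod): reports whether the
# three Open MPI MCA variables from the workaround have the expected values.
# The body runs in a subshell so the helper variables do not leak.
check_ompi_mca_vars() (
    status=0
    for entry in "OMPI_MCA_btl=^uct,ofi" "OMPI_MCA_pml=ucx" "OMPI_MCA_mtl=^ofi"; do
        name=${entry%%=*}        # text before the first '='
        expected=${entry#*=}     # text after the first '='
        eval "actual=\${$name-}" # current value, empty if unset
        if [ "$actual" = "$expected" ]; then
            echo "ok: $name=$actual"
        else
            echo "MISMATCH: $name='$actual' (expected '$expected')"
            status=1
        fi
    done
    exit "$status"
)
```

A typical use would be to run `module load OpenMPI` and then `check_ompi_mca_vars`. A non-zero exit status suggests the hook was not picked up, or that the loaded module name does not start with `OpenMPI/`.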
</div>

16 changes: 8 additions & 8 deletions mkdocs.yml
@@ -22,6 +22,12 @@ nav:
- Compatibility layer: compatibility_layer.md
- Software layer: software_layer.md
- Supported CPU targets: software_layer/cpu_targets.md
- Available software and repositories:
- Software: available_software/overview.md
- Repositories:
- Production: repositories/software.eessi.io.md
- RISC-V: repositories/riscv.eessi.io.md
- Pilot: repositories/pilot.md
- Installation:
- Is EESSI already installed?: getting_access/is_eessi_accessible.md
- Native: getting_access/native_installation.md
@@ -35,10 +41,6 @@ nav:
- Demos: using_eessi/eessi_demos.md
- Available software: available_software/overview.md
- Advanced usage:
- Repositories:
- Production: repositories/software.eessi.io.md
- RISC-V: repositories/riscv.eessi.io.md
- Pilot: repositories/pilot.md
- Setting up your Stratum: filesystem_layer/stratum1.md
- Building software with EESSI: using_eessi/building_on_eessi.md
- Test suite:
@@ -50,6 +52,8 @@ nav:
- Release notes: test-suite/release-notes.md
- Accelerators support:
- GPUs: gpu.md
- Known issues and workarounds:
- v2023.06: known_issues/eessi-2023.06.md
- Adding software to EESSI:
- Overview: adding_software/overview.md
- For contributors:
@@ -61,10 +65,6 @@ nav:
- Building software: adding_software/building_software.md
- Deploying software: adding_software/deploying_software.md
- Build nodes: software_layer/build_nodes.md
- Known issues:
- v2023.06: known_issues/eessi-2023.06.md
- v2022.02: []
- pilot: []
- Community and support:
- Getting support: support.md
- Meetings: meetings.md