2022 tutorial csc #127

Open
wants to merge 159 commits into
base: main
159 commits
b9ee40e
Initial commit, already contains the structure to work on the first t…
klust Mar 7, 2022
183ef3b
Updated the "What is EasyBuild" page of the tutorial for LUMI."
klust Mar 7, 2022
6541e8c
Updated the structure, initial updates of terminology and a new page …
klust Mar 9, 2022
b2a55cd
Additions to the Lmod section and some restructuring, and better info…
klust Mar 10, 2022
1692a80
Corrections to the README file.
klust Mar 10, 2022
82dbfae
Corrections to the overview for part I.
klust Mar 10, 2022
29d5ba5
Further work on the Lmod section.
klust Mar 11, 2022
2eae491
Finished the Lmod section.
klust Mar 14, 2022
8ca50b8
Continued merging of the new CSC tutorial in the structure of the rep…
klust Mar 14, 2022
71966b5
Tutorial page about the Cray PE, and correction of a typo.
klust Mar 14, 2022
91e7c86
Corrected a typo in mkdocs.yml
klust Mar 16, 2022
766a0f2
Adding an example of implementing a hierarchy to the Lmod section of …
klust Mar 16, 2022
70889a5
More information about hierarchy in the Cray PE, and definition of to…
klust Mar 21, 2022
e101d7b
Proposed change to mkdocs.yml to support mermaid.js graphs.
klust Mar 21, 2022
df081ae
Add a mermaid.js diagram with the toolchain hierarchy.
klust Mar 21, 2022
0fd57c5
Finished terminology session, extended the installation section with …
klust Mar 21, 2022
2aee856
Tutorial on EasyBuild configuration added, and some restructuring.
klust Mar 21, 2022
53d564d
Restructuring continued.
klust Mar 21, 2022
fea45d2
Finished reworking the basic usage section, except for the exercises.
klust Mar 30, 2022
c844b96
Additions to the LUMI software stack page of the tutorial.
klust Mar 30, 2022
cbdb67e
Integrating parts of the old tutorial into the new one, texts still n…
klust Mar 30, 2022
8d6b6f3
Troubleshooting section reworked for LUMI.
klust Mar 30, 2022
51c7e3d
Section on creating easyconfig files.
klust Apr 1, 2022
c14df7a
Adding in an additional section about external modules.
klust Apr 1, 2022
f89ee1f
Adding in a section taken from a previous tutorial and integrating so…
klust Apr 1, 2022
28741b4
Implementing EasyBlocks adapted for LUMI.
klust Apr 5, 2022
5f5898e
Correction of typos.
klust Apr 5, 2022
bbe836e
Using EasyBuild as a library corrected for LUMI.
klust Apr 5, 2022
272f2fe
Section about hooks extended with references to additional examples.
klust Apr 5, 2022
3234fd9
Slurm job submission from EasyBuild a bit reworked.
klust Apr 6, 2022
ad2a7f5
Some explanation in the overview of part 3
klust Apr 6, 2022
4f1dcab
GitHub integration section, mostly old text and not all suited for LUMI.
klust Apr 6, 2022
83d63d8
Adapt the structure and include an additional reading section.
klust Apr 6, 2022
d1aee2d
Additional reading section.
klust Apr 6, 2022
3cf8a3c
Correcting a number of links.
klust Apr 6, 2022
7b1f470
Restructuring for nicer navigation bar on the left.
klust Apr 7, 2022
724e059
Link corrections.
klust Apr 7, 2022
64775be
Correction of links.
klust Apr 7, 2022
d915944
Corrected a number of spelling mistakes.
klust Apr 8, 2022
22772b6
Removed a new line from a module file as it caused problems with the …
klust Apr 8, 2022
6ea43a8
Removed some TODOs to complete the tutorial.
klust Apr 21, 2022
d1f6c7b
Corrected and updated several links.
klust Apr 21, 2022
8e0353a
Various corrections (typos etc.) and minor additions.
klust May 9, 2022
dc0895f
Reworked the exercises of the troubleshooting section.
klust May 9, 2022
a9e0d3e
Reworked the exercises for the basic usage section.
klust May 9, 2022
d851ae1
Multiple minor corrections.
klust May 10, 2022
8f29ae4
Corrected two typos.
klust May 10, 2022
8f4dd15
Corrections to the example
klust May 11, 2022
430f584
Removed an unnecessary accent.
klust May 11, 2022
10a3be8
Corrected wrong termination of code block.
klust May 11, 2022
728e972
Initial commit, already contains the structure to work on the first t…
klust Mar 7, 2022
f20177b
Updated the "What is EasyBuild" page of the tutorial for LUMI."
klust Mar 7, 2022
df7f0f0
Updated the structure, initial updates of terminology and a new page …
klust Mar 9, 2022
d873a20
Additions to the Lmod section and some restructuring, and better info…
klust Mar 10, 2022
78d54a5
Corrections to the README file.
klust Mar 10, 2022
48d6d02
Corrections to the overview for part I.
klust Mar 10, 2022
04b642e
Further work on the Lmod section.
klust Mar 11, 2022
71b0dcb
Finished the Lmod section.
klust Mar 14, 2022
46fe516
Continued merging of the new CSC tutorial in the structure of the rep…
klust Mar 14, 2022
ec860bd
Tutorial page about the Cray PE, and correction of a typo.
klust Mar 14, 2022
8764aa3
Adding an example of implementing a hierarchy to the Lmod section of …
klust Mar 16, 2022
e24aafa
More information about hierarchy in the Cray PE, and definition of to…
klust Mar 21, 2022
4d47bb7
Add a mermaid.js diagram with the toolchain hierarchy.
klust Mar 21, 2022
dac3c94
Finished terminology session, extended the installation section with …
klust Mar 21, 2022
99a46fe
Tutorial on EasyBuild configuration added, and some restructuring.
klust Mar 21, 2022
6a59f94
Restructuring continued.
klust Mar 21, 2022
e277aa5
Finished reworking the basic usage section, except for the exercises.
klust Mar 30, 2022
a8c9b04
Additions to the LUMI software stack page of the tutorial.
klust Mar 30, 2022
7570644
Integrating parts of the old tutorial into the new one, texts still n…
klust Mar 30, 2022
abfbf61
Troubleshooting section reworked for LUMI.
klust Mar 30, 2022
5881892
Section on creating easyconfig files.
klust Apr 1, 2022
0ba449e
Adding in an additional section about external modules.
klust Apr 1, 2022
a778c98
Adding in a section taken from a previous tutorial and integrating so…
klust Apr 1, 2022
ca73ca7
Implementing EasyBlocks adapted for LUMI.
klust Apr 5, 2022
bdc3324
Correction of typos.
klust Apr 5, 2022
87138ea
Using EasyBuild as a library corrected for LUMI.
klust Apr 5, 2022
492683e
Section about hooks extended with references to additional examples.
klust Apr 5, 2022
8e734c5
Slurm job submission from EasyBuild a bit reworked.
klust Apr 6, 2022
aff9979
Some explanation in the overview of part 3
klust Apr 6, 2022
855101c
GitHub integration section, mostly old text and not all suited for LUMI.
klust Apr 6, 2022
89e01df
Adapt the structure and include an additional reading section.
klust Apr 6, 2022
ed863b9
Correcting a number of links.
klust Apr 6, 2022
557ba61
Restructuring for nicer navigation bar on the left.
klust Apr 7, 2022
df0d8e5
Some improvments to the module naming schemes section based on the IS…
klust May 19, 2022
36934df
Several minor corrections, including a new least of EasyBuild communi…
klust Jun 2, 2022
f62ed31
Minor corrections to the Lmod section.
klust Jun 3, 2022
28c00ba
Updated the "What is EasyBuild" page of the tutorial for LUMI."
klust Mar 7, 2022
2086896
Updated the structure, initial updates of terminology and a new page …
klust Mar 9, 2022
c4d368f
Additions to the Lmod section and some restructuring, and better info…
klust Mar 10, 2022
861690e
Corrections to the overview for part I.
klust Mar 10, 2022
cd69082
Further work on the Lmod section.
klust Mar 11, 2022
0f401f5
Finished the Lmod section.
klust Mar 14, 2022
290a375
Continued merging of the new CSC tutorial in the structure of the rep…
klust Mar 14, 2022
744d0a7
Tutorial page about the Cray PE, and correction of a typo.
klust Mar 14, 2022
8496f45
Corrected a typo in mkdocs.yml
klust Mar 16, 2022
d5b2647
Adding an example of implementing a hierarchy to the Lmod section of …
klust Mar 16, 2022
7eda98d
More information about hierarchy in the Cray PE, and definition of to…
klust Mar 21, 2022
e1bde80
Add a mermaid.js diagram with the toolchain hierarchy.
klust Mar 21, 2022
a0f4141
Finished terminology session, extended the installation section with …
klust Mar 21, 2022
6338bb1
Tutorial on EasyBuild configuration added, and some restructuring.
klust Mar 21, 2022
42d40a4
Restructuring continued.
klust Mar 21, 2022
863b933
Finished reworking the basic usage section, except for the exercises.
klust Mar 30, 2022
4a51424
Additions to the LUMI software stack page of the tutorial.
klust Mar 30, 2022
20f05ed
Integrating parts of the old tutorial into the new one, texts still n…
klust Mar 30, 2022
158c78d
Troubleshooting section reworked for LUMI.
klust Mar 30, 2022
202a53b
Section on creating easyconfig files.
klust Apr 1, 2022
71e04e4
Adding in an additional section about external modules.
klust Apr 1, 2022
da4ad40
Adding in a section taken from a previous tutorial and integrating so…
klust Apr 1, 2022
fa09f7d
Implementing EasyBlocks adapted for LUMI.
klust Apr 5, 2022
1c6894a
Correction of typos.
klust Apr 5, 2022
81274e3
Using EasyBuild as a library corrected for LUMI.
klust Apr 5, 2022
4e5ac8f
Section about hooks extended with references to additional examples.
klust Apr 5, 2022
db5cdec
Slurm job submission from EasyBuild a bit reworked.
klust Apr 6, 2022
cc30e75
Some explanation in the overview of part 3
klust Apr 6, 2022
7a9f3c7
GitHub integration section, mostly old text and not all suited for LUMI.
klust Apr 6, 2022
9a01408
Adapt the structure and include an additional reading section.
klust Apr 6, 2022
fe44f87
Correcting a number of links.
klust Apr 6, 2022
064afbe
Restructuring for nicer navigation bar on the left.
klust Apr 7, 2022
7145870
Correction of links.
klust Apr 7, 2022
4a7d65d
Multiple minor corrections.
klust May 10, 2022
6b0ebef
Initial commit, already contains the structure to work on the first t…
klust Mar 7, 2022
0445175
Updated the "What is EasyBuild" page of the tutorial for LUMI."
klust Mar 7, 2022
e438194
Updated the structure, initial updates of terminology and a new page …
klust Mar 9, 2022
7ec3ad4
Additions to the Lmod section and some restructuring, and better info…
klust Mar 10, 2022
feae23b
Corrections to the overview for part I.
klust Mar 10, 2022
6513998
Further work on the Lmod section.
klust Mar 11, 2022
f41da00
Finished the Lmod section.
klust Mar 14, 2022
5415f72
Tutorial page about the Cray PE, and correction of a typo.
klust Mar 14, 2022
0f19843
Adding an example of implementing a hierarchy to the Lmod section of …
klust Mar 16, 2022
70f23f1
More information about hierarchy in the Cray PE, and definition of to…
klust Mar 21, 2022
ad10f45
Add a mermaid.js diagram with the toolchain hierarchy.
klust Mar 21, 2022
36439fc
Finished terminology session, extended the installation section with …
klust Mar 21, 2022
0e2a846
Tutorial on EasyBuild configuration added, and some restructuring.
klust Mar 21, 2022
b5ea26d
Restructuring continued.
klust Mar 21, 2022
555afb7
Finished reworking the basic usage section, except for the exercises.
klust Mar 30, 2022
d24d31b
Additions to the LUMI software stack page of the tutorial.
klust Mar 30, 2022
bbec7d0
Integrating parts of the old tutorial into the new one, texts still n…
klust Mar 30, 2022
6dbfb52
Troubleshooting section reworked for LUMI.
klust Mar 30, 2022
7f4a8a7
Section on creating easyconfig files.
klust Apr 1, 2022
1d27441
Adding in an additional section about external modules.
klust Apr 1, 2022
4fd6638
Adding in a section taken from a previous tutorial and integrating so…
klust Apr 1, 2022
0b02bd1
Implementing EasyBlocks adapted for LUMI.
klust Apr 5, 2022
cca0812
Correction of typos.
klust Apr 5, 2022
ed39db4
Using EasyBuild as a library corrected for LUMI.
klust Apr 5, 2022
1a1c143
Section about hooks extended with references to additional examples.
klust Apr 5, 2022
5da4db5
Slurm job submission from EasyBuild a bit reworked.
klust Apr 6, 2022
ad1142f
Some explanation in the overview of part 3
klust Apr 6, 2022
7f742b9
GitHub integration section, mostly old text and not all suited for LUMI.
klust Apr 6, 2022
10280f8
Adapt the structure and include an additional reading section.
klust Apr 6, 2022
bf1dd5e
Additional reading section.
klust Apr 6, 2022
50a3aa1
Correcting a number of links.
klust Apr 6, 2022
5119b5e
Restructuring for nicer navigation bar on the left.
klust Apr 7, 2022
806738e
Correction of links.
klust Apr 7, 2022
a1b93c9
Corrected a number of spelling mistakes.
klust Apr 8, 2022
8492956
Corrected and updated several links.
klust Apr 21, 2022
5bb56bd
Multiple minor corrections.
klust May 10, 2022
aabc649
Updated the first page of the CSC course to integrate with the regula…
klust Nov 4, 2022
d897f35
Corrected two typos.
klust Nov 4, 2022
f5bf0a8
Correction after rebase.
klust May 3, 2023
14 changes: 9 additions & 5 deletions README.md
@@ -1,4 +1,4 @@
<p align="center"><img src="./docs/img/easybuild_logo_alpha.png" width="300px"/></p>
å<p align="center"><img src="./docs/img/easybuild_logo_alpha.png" width="300px"/></p>

Welcome to the repository that hosts the sources of the official **[EasyBuild](https://easybuild.io)
tutorial**, see https://easybuilders.github.io/easybuild-tutorial.
@@ -20,13 +20,17 @@ which makes it very easy to preview the result of the changes you make locally.

* Start the MkDocs built-in dev-server to preview the tutorial as you work on it:

make preview
```bash
make preview
```

or
or

mkdocs serve
```bash
mkdocs serve
```

Visit http://127.0.0.1:8000 to see the local live preview of the changes you make.
Visit http://127.0.0.1:8000 to see the local live preview of the changes you make.

* If you prefer building a static preview you can use ``make`` or ``mkdocs build``,
which should result in a ``site/`` subdirectory that contains the rendered documentation.
5 changes: 5 additions & 0 deletions docs/2021-LUST/index.md
@@ -0,0 +1,5 @@
# EasyBuild tutorial for LUST

Overview page of the introductory tutorial on [EasyBuild](https://easybuild.io) for the *[LUMI](https://www.lumi-supercomputer.eu) User Support team (LUST)*.

The tutorial is available on [the EasyBuilders tutorial web site](https://easybuilders.github.io/easybuild-tutorial/2021-lust/).
22 changes: 1 addition & 21 deletions docs/2021-lust/index.md
@@ -2,24 +2,4 @@

Overview page of the introductory tutorial on [EasyBuild](https://easybuild.io) for the *[LUMI](https://www.lumi-supercomputer.eu) User Support team (LUST)*.

- [Part I: **Introduction to EasyBuild**](part1_intro.md) *(Tue March 9th 2021, 9am-12 CET)*
* [What is EasyBuild?](what_is_easybuild.md)
* [Terminology](terminology.md)
* [Installation](installation.md) *(hands-on)*
* [Configuration](configuration.md) *(hands-on)*
* [Basic usage](basic_usage.md) *(hands-on)*
- [Part II: **Using EasyBuild**](part2_using.md) *(Tue March 23rd 2021, 9am-12 CET)*
* [Troubleshooting](troubleshooting.md) *(hands-on)*
* [Creating easyconfig files](creating_easyconfig_files.md) *(hands-on)*
* [Implementing easyblocks](implementing_easyblocks.md) *(hands-on)*
- [Part III: **Advanced topics**](part3_advanced.md) *(Tue March 30th 2021, 9am-12 CEST)*
* [Using EasyBuild as a library](easybuild_library.md) *(hands-on)*
* [Using hooks to customise EasyBuild](hooks.md) *(hands-on)*
* [Submitting installations as Slurm jobs](slurm_jobs.md) *(hands-on)*
* [Module naming schemes (incl. hierarchical)](module_naming_schemes.md) *(hands-on)*
* [GitHub integration to facilitate contributing to EasyBuild](github_integration.md) *(hands-on)*
- [Part IV: **EasyBuild on Cray systems**](part4_cray.md) *(Friday June 18th 2021, 09-12 CEST)*
* [Introduction to Cray Programming Environment](cray/introduction.md) *(hands-on)*
* [Cray External Modules](cray/external_modules.md) *(hands-on)*
* [Cray Custom Toolchains](cray/custom_toolchains.md) *(hands-on)*
* [EasyBuild at CSCS](cray/easybuild_at_cscs.md) *(hands-on)*
The tutorial is available on [the EasyBuilders tutorial web site](https://easybuilders.github.io/easybuild-tutorial/2021-lust/).
320 changes: 320 additions & 0 deletions docs/2022-CSC_and_LO/1_Intro/1_01_what_is_easybuild.md

Large diffs are not rendered by default.

825 changes: 825 additions & 0 deletions docs/2022-CSC_and_LO/1_Intro/1_02_Lmod.md

Large diffs are not rendered by default.

417 changes: 417 additions & 0 deletions docs/2022-CSC_and_LO/1_Intro/1_03_CPE.md

Large diffs are not rendered by default.

131 changes: 131 additions & 0 deletions docs/2022-CSC_and_LO/1_Intro/1_04_LUMI_software_stack.md
@@ -0,0 +1,131 @@
# LUMI software stacks (technical)

*[[back: The Cray Programming Environment]](1_03_CPE.md)*

---

The user-facing documentation on how to use the LUMI software stacks is
available in [the LUMI documentation](https://docs.lumi-supercomputer.eu/computing/softwarestacks/).
On this page we focus more on the technical implementation behind it.

---

## An overview of LUMI

LUMI has different node types providing compute resources:

- LUMI has 16 login nodes, though many of those are reserved for special purposes and not
available to all users. These login nodes have a zen2 CPU and a SlingShot 10
interconnect.
- There are 1536 regular CPU compute nodes in a partition denoted as LUMI-C. These
compute nodes have a zen3 CPU and run a reduced version of SUSE Linux optimised
by Cray to reduce OS jitter. These nodes will in the future be equipped with a
SlingShot 11 interconnect card.
- There are 2560 GPU compute nodes in a partition denoted as LUMI-G. These nodes have
a single zen3-based CPU with optimised I/O die linked to 4 AMD MI250X GPUs. Each node
has 4 SlingShot 11 interconnect cards, one attached to each GPU.
- The interactive data analytics and visualisation partition is really two different partitions
from the software point of view:
    - 8 nodes are CPU-only but differ considerably from the regular compute nodes,
      and not only in the amount of memory. These nodes are equipped with zen2 CPUs
      and are in that sense comparable to the login nodes. They also have local SSDs
      and are equipped with SlingShot 10 interconnect cards (possibly 2 each, still to be confirmed).
    - 8 nodes have zen2 CPUs and 8 NVIDIA A40 GPUs each, and have 2 SlingShot 10
      interconnect cards each.
- The early access platform (EAP) has 14 nodes equipped with a single 64-core
zen2 CPU and 4 AMD MI100 GPUs. Each node has a single SlingShot 10 interconnect
and also local SSDs.

SlingShot 10 and SlingShot 11 differ software-wise. SlingShot 10 uses a
Mellanox CX5 NIC that supports both OFI and UCX, and hence can also use the
UCX version of Cray MPICH. SlingShot 11 uses a NIC code-named Cassini and
supports only OFI, with an OFI provider specific to the Cassini NIC. However,
given that the nodes equipped with SlingShot 10 cards are not meant
to be used for big MPI jobs, we build our software stack solely on top of
libfabric (the OFI library) and Cray MPICH.
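
The choice of OFI can be sketched as a small lookup: OFI is the only transport that both SlingShot generations support, so a stack built on libfabric covers every node type. This is an illustration only; the function names `transports` and `common_transports` and the labels `ss10`/`ss11` are invented here and are not part of the Cray PE.

```bash
#!/bin/sh
# Illustrative sketch only: `transports`, `common_transports` and the labels
# ss10/ss11 are hypothetical names, not Cray PE commands.

# Print the transport layers supported by each SlingShot generation,
# as described in the text above.
transports() {
    case "$1" in
        ss10) echo "ofi ucx" ;;   # Mellanox CX5 NIC: OFI and UCX
        ss11) echo "ofi" ;;       # Cassini NIC: OFI only
        *)    echo "unknown interconnect: $1" >&2; return 1 ;;
    esac
}

# Print the transports that every listed interconnect supports.
common_transports() {
    common=$(transports "$1") || return 1
    shift
    for nic in "$@"; do
        supported=$(transports "$nic") || return 1
        kept=""
        for t in $common; do
            # Keep a transport only if this interconnect supports it too.
            case " $supported " in
                *" $t "*) kept="${kept:+$kept }$t" ;;
            esac
        done
        common=$kept
    done
    echo "$common"
}

common_transports ss10 ss11
```

Running the sketch for both generations leaves `ofi` as the single shared transport, which matches the decision to build on libfabric only.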


---

## CrayEnv and LUMI modules

On LUMI, two types of software stacks are currently offered:

- ``CrayEnv`` (module name) offers the Cray PE and lets users work with
it exactly as intended by HPE-Cray. The environment also offers a
limited selection of additional tools, often in updated versions compared to
what SUSE Linux, the basis of the Cray Linux environment, offers. Those tools
are installed and managed via EasyBuild. However, EasyBuild itself is not available
in that environment.

It also rectifies a problem caused by the fact that there is only one
configuration file for the Cray PE on LUMI, so that starting a login shell
will not produce an optimal set of target modules for all node types.
The ``CrayEnv`` module recognizes on which node type it is running and
(re-)loading it will trigger a reload of the recommended set of target
modules for that node.

- ``LUMI`` is an extensible software stack that is mostly managed through
[EasyBuild][easybuild]. Each version of the LUMI software stack is based on
the version of the Cray Programming Environment with the same version
number.

A deliberate choice was made to offer only a limited number of software
packages in the globally installed stack. The redundancy setup on LUMI
makes it difficult to update the stack in a way that is guaranteed not to
affect running jobs, and a large central stack is also hard to manage, especially
as we expect frequent updates to the OS and compiler infrastructure in
the first years of operation.
However, the EasyBuild setup is such that users can easily install
additional software in their home or project directory using EasyBuild build
recipes that we provide or that they develop themselves, and that software will fully
integrate with the central stack (even the corresponding modules will be made
available automatically).

Each ``LUMI`` module will also automatically activate a set of application
modules tuned to the architecture on which the module load is executed. To
that end, the ``LUMI`` module automatically loads the ``partition``
module that best fits the node. After loading a version of the
``LUMI`` module, users can always load a different version of the ``partition``
module.

Note that the ``partition`` modules are only used by the ``LUMI`` module. In the
``CrayEnv`` environment, users should override the configuration by loading their
own set of target modules after loading the ``CrayEnv`` module.
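
As a rough illustration of the two stacks, the sketch below prints the `module` commands a user might type. The version `22.08` is a placeholder chosen for this example and is not taken from this page, and the helper `show` is a hypothetical wrapper that only echoes each command, so the snippet also runs on a system without the LUMI module tree.

```bash
#!/bin/sh
# Illustrative sketch only. The stack version is a placeholder, not a
# statement about which versions are installed on LUMI.
LUMI_VERSION="${LUMI_VERSION:-22.08}"

# Hypothetical helper: print a command instead of running it.
# Remove the `show` prefix to run the commands for real on LUMI.
show() { echo "$*"; }

# CrayEnv: the plain Cray PE; (re-)loading it selects the recommended
# target modules for the node type it runs on.
show module load CrayEnv

# In CrayEnv, the targets can be overridden by hand afterwards, here with
# the LUMI-C targets listed for partition/C further down this page.
show module load craype-x86-milan craype-accel-host craype-network-ofi

# LUMI: loads the best-fitting partition module automatically ...
show module load "LUMI/$LUMI_VERSION"

# ... and a different partition module can be loaded afterwards.
show module load partition/C
```

The difference in the last two steps is the point of the design: in ``CrayEnv`` the user picks individual target modules, while in ``LUMI`` a single ``partition`` module selects the whole set.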


---

## The ``partition`` module

The ``LUMI`` module currently supports five partition modules, but that number may
be reduced in the future:

| Partition | CPU target | Accelerator |
|:------------------|-----------------------|:----------------------------|
| ``partition/L`` | ``craype-x86-rome`` | ``craype-accel-host`` |
| ``partition/C`` | ``craype-x86-milan`` | ``craype-accel-host`` |
| ``partition/G`` | ``craype-x86-trento`` | ``craype-accel-amd-gfx90a`` |
| ``partition/D`` | ``craype-x86-rome`` | ``craype-accel-nvidia80`` |
| ``partition/EAP`` | ``craype-x86-rome`` | ``craype-accel-amd-gfx908`` |

All ``partition`` modules also load ``craype-network-ofi``.

``partition/D`` may be dropped in the future, as there currently seems to be no working CUDA setup
and the GPU nodes in the LUMI-D partition can only be used for visualisation and not with CUDA.

Furthermore, if it turns out that there is no advantage in optimizing for Milan
specifically, or that there are no problems at all in running Milan binaries on Rome
generation CPUs, ``partition/L`` and ``partition/C`` might also be merged into a single
partition.
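
The partition table can also be read as a lookup from partition name to target modules. The sketch below encodes it as a shell function; `partition_targets` is a name invented for this illustration, and the real ``partition`` modules are module files rather than shell code.

```bash
#!/bin/sh
# Illustrative sketch only: `partition_targets` is a hypothetical helper that
# mirrors the table above; it is not part of the LUMI software stack.

# Print the target modules loaded by partition/<name>.
partition_targets() {
    case "$1" in
        L)   cpu=craype-x86-rome;   accel=craype-accel-host ;;
        C)   cpu=craype-x86-milan;  accel=craype-accel-host ;;
        G)   cpu=craype-x86-trento; accel=craype-accel-amd-gfx90a ;;
        D)   cpu=craype-x86-rome;   accel=craype-accel-nvidia80 ;;
        EAP) cpu=craype-x86-rome;   accel=craype-accel-amd-gfx908 ;;
        *)   echo "unknown partition: $1" >&2; return 1 ;;
    esac
    # Every partition module also loads the OFI network target.
    echo "$cpu $accel craype-network-ofi"
}

# List the targets for all five partitions.
for p in L C G D EAP; do
    printf 'partition/%-3s -> %s\n' "$p" "$(partition_targets "$p")"
done
```

In this form it is easy to see that ``partition/L`` and ``partition/C`` differ only in the CPU target, which is why merging them is conceivable if Milan-specific optimisation brings no benefit.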







---

*[[next: Terminology]](1_05_terminology.md)*
