Update to LUMI PyTorch Example 09/2023 #41

Merged
merged 2 commits into from Sep 25, 2023

@Chroxvi Chroxvi commented Sep 15, 2023

Updates the LUMI PyTorch examples to the most recent stable versions of Python, ROCm, and PyTorch as of 09/2023. Also updates the LUMI SLURM scripts, since the eap partition is no longer available on LUMI.

I have tested these examples on LUMI using `cotainr build lumi_pytorch_rocm_demo.sif --base-image docker://rocm/dev-ubuntu-22.04:5.6.1-complete --conda-env py311_rocm542_pytorch.yml`, since the `--system=lumi-g` option still provides the "rocm-terminal" image, which does not include all the ROCm pieces needed for the PyTorch wheels.
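
For readability, the build command quoted above can be wrapped in a small shell function. This is only a sketch: the function name `build_container` and the `DRY_RUN` switch are illustrative additions that are not part of this PR, while the cotainr arguments are copied verbatim from the description.

```shell
# Sketch only: build_container and DRY_RUN are hypothetical helpers.
# The cotainr arguments are copied verbatim from the PR description.
build_container() {
  set -- cotainr build lumi_pytorch_rocm_demo.sif \
    --base-image docker://rocm/dev-ubuntu-22.04:5.6.1-complete \
    --conda-env py311_rocm542_pytorch.yml

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "$*"
  else
    # Run the build; requires cotainr to be available on the PATH.
    "$@"
  fi
}
```

Calling `DRY_RUN=1 build_container` prints the command for inspection, and calling `build_container` without the switch runs the build.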

A few notes about performance of the examples:

```diff
 ## Running the PyTorch examples on LUMI using the built container

-Copy everything to LUMI and submit one of the SLURM batch scripts:
+Copy everything to LUMI, update the `--account=project_<your_project_id>` SBATCH directive in the SLURM batch scripts, and submit one of the SLURM batch scripts:
```

When we have the student sbatch wrapper we should link it here.
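
Until such a wrapper exists, the manual step of updating the `--account` directive could be scripted roughly as follows. This is a minimal sketch, not part of the PR: the helper name `set_slurm_account` is hypothetical, and it assumes the batch scripts contain the literal placeholder `--account=project_<your_project_id>` and that the project ID is purely numeric.

```shell
# Hypothetical helper (not part of this PR): replace the
# project_<your_project_id> placeholder in the --account directive
# of the given SLURM batch scripts.
# Usage: set_slurm_account <numeric_project_id> <script>...
set_slurm_account() {
  project_id="$1"
  shift
  for script in "$@"; do
    # -i.bak edits the file in place and keeps a backup next to it.
    sed -i.bak \
      "s/--account=project_<your_project_id>/--account=project_${project_id}/" \
      "$script"
  done
}
```

For example, `set_slurm_account 123456789 my_job.sh` would rewrite the directive in `my_job.sh` (both the ID and the file name here are placeholders) and leave the original in `my_job.sh.bak`.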

@eskech eskech merged commit 6b238f2 into main Sep 25, 2023
11 checks passed
@eskech eskech deleted the lumi_pytorch_example_202309 branch September 25, 2023 07:34