23 changes: 15 additions & 8 deletions CHANGELOG.md
@@ -7,11 +7,17 @@ This file is used to list changes made in each version of the AWS ParallelCluste
------

**ENHANCEMENTS**
- Remove UnkillableStepTimeout from slurm.conf and let slurm set this value.
- Add `build-image` support for kernel 6.12 of Amazon Linux 2023. The official ParallelCluster Amazon Linux 2023 AMIs use kernel 6.12.
- Add support for P6e-GB200 instances. ParallelCluster sets up the Slurm topology plugin to handle P6e-GB200 UltraServers. See the LIMITATIONS section for important additional setup requirements.
- Add `build-image` support for Amazon Linux 2023 AMIs based on kernel 6.12 (in addition to 6.1).

**LIMITATIONS**
- P6e-GB200 instances are only tested on Amazon Linux 2023, Ubuntu 22.04 and Ubuntu 24.04.
- Using IMEX on P6e-GB200 requires additional setup. Please refer to <PLACE_HOLDER for the tutorial link>.

**CHANGES**
- Ubuntu 20.04 is no longer supported.
- Install nvidia-imex for all OSs except AL2.
- Remove `berkshelf`. All cookbooks are local and do not need `berkshelf` dependency management.
- Remove `UnkillableStepTimeout` from slurm.conf and let Slurm set this value.
- Upgrade Slurm to version 24.11.6 (from 24.05.8).
- Upgrade EFA installer to 1.43.2 (from 1.41.0).
- Efa-driver: efa-2.17.2-1
@@ -20,21 +26,22 @@ This file is used to list changes made in each version of the AWS ParallelCluste
- Libfabric-aws: libfabric-aws-2.1.0-5
- Rdma-core: rdma-core-58.0-1
- Open MPI: openmpi40-aws-4.1.7-2 and openmpi50-aws-5.0.6-11
- Upgrade Cinc Client to version to 18.4.12 from 18.2.7.
- Upgrade Cinc Client to version 18.4.12 (from 18.2.7).
- Upgrade NVIDIA driver to version 570.172.08 (from 570.86.15) for all OSs except AL2.
- Upgrade CUDA Toolkit to version 12.8.1 (from 12.8.0) for all OSs except AL2.
- Upgrade DCGM to version 4.2.3 (from 3.3.6) for all OSs except AL2.
- Upgrade Python to 3.12.11 (from 3.12.8) for all OSs except AL2.
- Upgrade Python to 3.9.23 (from 3.9.20) for AL2.
- Upgrade Intel MPI Library to 2021.16.0 (from 2021.13.1).
- Addressed cluster id mismatch known issue by deleting the file `/var/spool/slurm.state/clustername` before configuring Slurm accounting.
- Upgrade DCV to version 2024.0-19030.
- Remove `berkshelf`. All cookbooks are local and do not need `berkshelf` dependency management.
- Add support for GB200 instance types.
- Install nvidia-imex for all OSs except AL2.
- Upgrade the official ParallelCluster Amazon Linux 2023 AMIs to kernel 6.12 (from 6.1).

**BUG FIXES**
- Fix a race condition in CloudWatch Agent startup that could cause node bootstrap failures.
- Fix cluster id mismatch issue by deleting the file `/var/spool/slurm.state/clustername` before configuring Slurm accounting.

**DEPRECATIONS**
- Ubuntu 20.04 is no longer supported.

3.13.2
------