AWS ParallelCluster v3.1.1

@enrico-usai enrico-usai released this 10 Feb 19:01
· 77 commits to release-3.1 since this release

We're excited to announce the release of AWS ParallelCluster 3.1.1.

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add support for multi-user cluster environments by integrating with Active Directory (AD) domains managed via AWS Directory Service.
  • Enable cluster creation in subnets with no internet access.
  • Add abbreviated flags for cluster-name (-n), region (-r), image-id (-i) and cluster-configuration / image-configuration (-c) to the CLI.
  • Add support for multiple compute resources with same instance type per queue.
  • Add support for UseEc2Hostnames in the cluster configuration file. When set to true, use EC2 default hostnames (e.g. ip-1-2-3-4) for compute nodes.
  • Add support for GPU scheduling with Slurm on ARM instances with NVIDIA cards. Install NVIDIA drivers and CUDA library for ARM.
  • Add parallelcluster:compute-resource-name tag to LaunchTemplates used by compute nodes.
  • Add support for NEW_CHANGED_DELETED as value of FSx for Lustre AutoImportPolicy option.
  • Explicitly set cloud-init datasource to be EC2. This saves boot time for Ubuntu and CentOS platforms.
  • Improve Security Groups created within the cluster to allow inbound connections from custom security groups when the SecurityGroups parameter is specified for the head node and/or queues.
  • Build Slurm with slurmrestd support.
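
As an illustration of the configuration-related enhancements above, here is a hedged sketch of a cluster configuration fragment. It combines UseEc2Hostnames, two compute resources that share one instance type in the same queue, and the NEW_CHANGED_DELETED value for AutoImportPolicy. The key placement follows our reading of the ParallelCluster 3 configuration schema and should be checked against the official reference. All names, the subnet ID, the bucket, and the counts are placeholders, and the fragment omits required sections such as Region, Image and HeadNode.

```yaml
# Partial cluster configuration sketch; placeholders throughout, not a complete file.
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    Dns:
      # Assumed location of the new option: compute nodes keep the
      # EC2 default hostnames (e.g. ip-1-2-3-4).
      UseEc2Hostnames: true
  SlurmQueues:
    - Name: queue1                      # hypothetical queue name
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0    # placeholder subnet ID
      ComputeResources:
        # Two compute resources in one queue using the same instance type.
        - Name: c5-default              # hypothetical name
          InstanceType: c5.xlarge
          MinCount: 0
          MaxCount: 4
        - Name: c5-no-smt               # hypothetical name
          InstanceType: c5.xlarge
          MinCount: 0
          MaxCount: 4
          DisableSimultaneousMultithreading: true
SharedStorage:
  - Name: fsx-data                      # hypothetical name
    StorageType: FsxLustre
    MountDir: /fsx
    FsxLustreSettings:
      StorageCapacity: 1200
      ImportPath: s3://example-bucket   # placeholder bucket
      AutoImportPolicy: NEW_CHANGED_DELETED
```

With two compute resources of the same instance type, the parallelcluster:compute-resource-name tag and the Launch Template name (now based on the compute resource name) are what distinguish the resulting instances.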

CHANGES

  • Upgrade Slurm to version 21.08.5.
  • Upgrade NICE DCV to version 2021.3-11591.
  • Upgrade NVIDIA driver to version 470.103.01.
  • Upgrade CUDA library to version 11.4.4.
  • Upgrade NVIDIA Fabric manager to version 470.103.01.
  • Upgrade Intel MPI Library to 2021.4.0.441.
  • Upgrade PMIx to version 3.2.3.
  • Disable package update at instance launch time on Amazon Linux 2.
  • Allow suppressing the SlurmQueues and ComputeResources length validators.
  • Use compute resource name rather than instance type in compute fleet Launch Template name.
  • Disable EC2 ImageBuilder enhanced image metadata when building ParallelCluster custom images.
  • Remove dumping of failed compute nodes to /home/logs/compute. Compute node log files are available in CloudWatch
    and in EC2 console logs.

BUG FIXES

  • Redirect stderr and stdout to the CLI log file to prevent unwanted text from polluting the pcluster CLI output.
  • Fix exporting of cluster logs when no prefix is specified. Previously, logs were exported to a None prefix.
  • Fix rollback not being performed in case of cluster update failure.
  • Do not configure GPUs in Slurm when NVIDIA driver is not installed.
  • Fix ecs:ListContainerInstances permission in BatchUserRole.
  • Fix RootVolume schema for the HeadNode by raising an error if unsupported KmsKeyId is specified.
  • Fix EfaSecurityGroupValidator. Previously, it could produce false failures when custom security groups were provided and EFA was enabled.
  • Fix FSx metrics not displayed in CloudWatch Dashboard.