(3.8.0‐3.9.1) SharedStorageType: Efs not working on arm instances

Ryan Anderson edited this page May 8, 2024 · 3 revisions

The issue

When using the SharedStorageType config option Efs on arm instance types:

...
HeadNode:
  SharedStorageType: Efs
  ...

ParallelCluster attempts to back up and mount the /opt/intel directory, which does not exist on those instance types. Cluster creation fails in CloudFormation, and the chef-client log at /var/log/chef-client.log shows an error copying data from /opt/intel, such as:

STDERR: rsync: change_dir "/opt/intel" failed: No such file or directory (2)

Any cluster that uses arm instances with this setting will produce the error.
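To confirm that a failed cluster hit this issue, the chef-client log can be searched for the rsync message. The following is a minimal sketch, assuming shell access to the node where the log was written; the function name has_opt_intel_error and the CHEF_LOG variable are illustrative and not part of ParallelCluster.

```shell
# Sketch: look for the /opt/intel rsync failure in a chef-client log.
# has_opt_intel_error and CHEF_LOG are hypothetical names used for illustration.
has_opt_intel_error() {
  log_file="${1:-/var/log/chef-client.log}"
  # -F matches the string literally; -q prints nothing and only sets the exit status.
  # A missing or unreadable log is treated as "error not found".
  grep -qF 'rsync: change_dir "/opt/intel" failed' "$log_file" 2>/dev/null
}

if has_opt_intel_error "${CHEF_LOG:-/var/log/chef-client.log}"; then
  echo "Found the /opt/intel rsync error in the chef-client log."
else
  echo "No /opt/intel rsync error found in the chef-client log."
fi
```

If the message is absent, the cluster creation failure most likely has a different cause and the workaround on this page may not apply.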

Affected versions (OSes, schedulers)

Affects all clusters that use arm instance types with SharedStorageType: Efs, regardless of OS or scheduler, in pcluster versions 3.8.0 - 3.9.1.
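As a quick check of the version range above, the sketch below tests whether a given pcluster version string falls within 3.8.0 - 3.9.1. The function name is_affected_version and the PCLUSTER_VERSION variable are illustrative; the version string is assumed to be supplied by you in plain x.y.z form.

```shell
# Sketch: return success if the version is in the affected range 3.8.0 - 3.9.1.
# is_affected_version and PCLUSTER_VERSION are hypothetical names used for illustration.
is_affected_version() {
  case "$1" in
    # Every 3.8.x release is inside the range; in 3.9.x only 3.9.0 and 3.9.1 are.
    3.8.*|3.9.0|3.9.1) return 0 ;;
    *) return 1 ;;
  esac
}

PCLUSTER_VERSION="${PCLUSTER_VERSION:-3.9.1}"
if is_affected_version "$PCLUSTER_VERSION"; then
  echo "pcluster $PCLUSTER_VERSION is in the affected range."
else
  echo "pcluster $PCLUSTER_VERSION is outside the affected range."
fi
```

The version check alone is not sufficient: the issue only appears when the cluster also uses arm instance types together with SharedStorageType: Efs.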

Mitigation

To mitigate the issue, use a custom cookbook that excludes /opt/intel from the set of directories to mount on arm instances. The custom cookbook is specified in the DevSettings section of the cluster config, which is mostly used for development and occasionally for workarounds such as this. DevSettings is not an officially supported mechanism and so is not publicly documented.

Here are the GitHub links to the custom cookbooks for specific ParallelCluster versions:

The steps to use the custom cookbooks are as follows:

  1. To create the cluster, add the following DevSettings section to your ParallelCluster config file based on the version of pcluster you are using.

DevSettings:
  Cookbook:
    ChefCookbook: https://github.com/aws/aws-parallelcluster-cookbook/tarball/release-3.9

  2. Create the cluster. Our public documentation for creating clusters can be found here.
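The two steps above can be sketched as a small shell helper. This is an illustration under stated assumptions, not an official procedure: the function name add_dev_settings, the file name cluster-config.yaml, and the cluster name my-arm-cluster are hypothetical, and the cookbook URL is the release-3.9 one shown above (substitute the link that matches your pcluster version).

```shell
# Sketch: append the DevSettings workaround to a cluster config, then create the cluster.
# add_dev_settings is a hypothetical helper; COOKBOOK_URL is the release-3.9 tarball from the steps above.
COOKBOOK_URL="https://github.com/aws/aws-parallelcluster-cookbook/tarball/release-3.9"

add_dev_settings() {
  config_file="$1"
  # Do not add a second top-level DevSettings key if one already exists.
  if grep -q '^DevSettings:' "$config_file"; then
    echo "DevSettings already present in $config_file; edit it by hand." >&2
    return 0
  fi
  # Make sure the file ends with a newline before appending.
  if [ -n "$(tail -c 1 "$config_file")" ]; then
    echo >> "$config_file"
  fi
  cat >> "$config_file" <<EOF
DevSettings:
  Cookbook:
    ChefCookbook: ${COOKBOOK_URL}
EOF
}

# Example usage (hypothetical file and cluster names):
#   add_dev_settings cluster-config.yaml
#   pcluster create-cluster --cluster-name my-arm-cluster \
#     --cluster-configuration cluster-config.yaml
```

Appending works here because DevSettings is a top-level section of the config; if your file already contains a DevSettings section, merge the Cookbook entry into it manually instead.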