Doc issue: Setting cookbook node attribute for Closed Source Nvidia drivers in version 3.9.1? #6230

Closed
chrisdag opened this issue Apr 30, 2024 · 2 comments

@chrisdag

For version 3.9.1 we are hitting the validation enforcement on p3 series node types with this error:

    {
      "level": "ERROR",
      "type": "InstanceTypeBaseAMICompatibleValidator",
      "message": "The instance type 'p3.2xlarge' is not supported by NVIDIA OpenRM drivers. OpenRM can only be used on any Turing or later GPU architectures. Please consider using a different instance type or building a custom AMI with closed source NVIDIA drivers."
    },

We still occasionally need the V100 GPU to support some legacy scientific computing workflows, so I'd like to be able to enable p3.2xlarge support in pcluster version 3.9.1.

The change in enforcement/validation is documented, and the release notes say this:
"Add possibility to choose between Open and Closed Source Nvidia Drivers when building an AMI, through the ['cluster']['nvidia']['kernel_open'] cookbook node attribute."

We do build custom AMIs, so this is not an issue. However, I can't find the documentation that says where and how to make the config change to switch between the Open and Closed Source Nvidia drivers.

Is there a link or more info on how to pass in the proper "cookbook node attribute" to flip this setting when building a custom AMI? The only docs I can find that reference "Cookbook node attributes" are in the Chef repo over at https://github.com/aws/aws-parallelcluster-cookbook
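
For reference, a Chef attribute path such as ['cluster']['nvidia']['kernel_open'] corresponds to a nested JSON object when attributes are supplied as JSON. The sketch below only shows that mapping. The helper name `attribute_path_to_json` is hypothetical, and the value `"false"` (a string, meant to select the closed source drivers) is an assumption, since the release notes do not say which values the attribute accepts. Where the resulting JSON goes in the image build configuration is also not confirmed in this thread; one unverified guess is an extra Chef attributes setting in the build-image config file.

```python
import json


def attribute_path_to_json(path, value):
    """Nest `value` under the keys in `path` and return it as a JSON string.

    Hypothetical helper for illustration; not part of pcluster or the cookbook.
    """
    nested = value
    # Wrap from the innermost key outwards.
    for key in reversed(path):
        nested = {key: nested}
    return json.dumps(nested)


# Attribute path taken from the 3.9.1 release notes.
# ASSUMPTION: "false" (string) selects the closed source drivers; the accepted
# value and its type are not documented in this thread.
extra_attributes = attribute_path_to_json(["cluster", "nvidia", "kernel_open"], "false")
print(extra_attributes)
```

If the build configuration expects a boolean rather than a string, pass `False` instead of `"false"`; the nesting stays the same.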

chrisdag added the 3.x label Apr 30, 2024

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.
