

# AMD ROCm™ Release Notes v4.0

## AMD Instinct™ MI100

Revision 1217

Issue Date: December 2020

© 2020-21 Advanced Micro Devices, Inc. All Rights Reserved.

AMD Instinct<sup>TM</sup>
MI100

## **Specification Agreement**

This Specification Agreement (this "Agreement") is a legal agreement between Advanced Micro Devices, Inc. ("AMD") and "You" as the recipient of the attached AMD Specification (the "Specification"). If you are accessing the Specification as part of your performance of work for another party, you acknowledge that you have authority to bind such party to the terms and conditions of this Agreement. If you accessed the Specification by any means or otherwise use or provide Feedback (defined below) on the Specification, You agree to the terms and conditions set forth in this Agreement. If You do not agree to the terms and conditions set forth in this Agreement, you are not licensed to use the Specification; do not use, access or provide Feedback about the Specification. In consideration of Your use or access of the Specification (in whole or in part), the receipt and sufficiency of which are acknowledged, You agree as follows:

- 1. You may review the Specification only (a) as a reference to assist You in planning and designing Your product, service or technology ("Product") to interface with an AMD product in compliance with the requirements as set forth in the Specification and (b) to provide Feedback about the information disclosed in the Specification to AMD.
- 2. Except as expressly set forth in Paragraph 1, all rights in and to the Specification are retained by AMD. This Agreement does not give You any rights under any AMD patents, copyrights, trademarks or other intellectual property rights. You may not (i) duplicate any part of the Specification; (ii) remove this Agreement or any notices from the Specification, or (iii) give any part of the Specification, or assign or otherwise provide Your rights under this Agreement, to anyone else.
- 3. The Specification may contain preliminary information, errors, or inaccuracies, or may not include certain necessary information. Additionally, AMD reserves the right to discontinue or make changes to the Specification and its products at any time without notice. The Specification is provided entirely "AS IS." AMD MAKES NO WARRANTY OF ANY KIND AND DISCLAIMS ALL EXPRESS, IMPLIED AND STATUTORY WARRANTIES, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, TITLE OR THOSE WARRANTIES ARISING AS A COURSE OF DEALING OR CUSTOM OF TRADE. AMD SHALL NOT BE LIABLE FOR DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL, INCIDENTAL, PUNITIVE OR EXEMPLARY DAMAGES OF ANY KIND (INCLUDING LOSS OF BUSINESS, LOSS OF INFORMATION OR DATA, LOST PROFITS, LOSS OF CAPITAL, LOSS OF GOODWILL) REGARDLESS OF THE FORM OF ACTION WHETHER IN CONTRACT, TORT (INCLUDING NEGLIGENCE) AND STRICT PRODUCT LIABILITY OR OTHERWISE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
- 4. Furthermore, AMD's products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD's product could create a situation where personal injury, death, or severe property or environmental damage may occur.
- 5. You have no obligation to give AMD any suggestions, comments or feedback ("Feedback") relating to the Specification. However, any Feedback You voluntarily provide may be used by AMD without restriction, fee or obligation of confidentiality. Accordingly, if You do give AMD Feedback on any version of the Specification, You agree AMD may freely use, reproduce, license, distribute, and otherwise commercialize Your Feedback in any product, as well as has the right to sublicense third parties to do the same. Further, You will not give AMD any Feedback that You may have reason to believe is (i) subject to any patent, copyright or other intellectual property claim or right of any third party; or (ii) subject to license terms which seek to require any product or intellectual property incorporating or derived from Feedback or any Product or other AMD intellectual property to be licensed to or otherwise provided to any third party.

- 6. You shall adhere to all applicable U.S. import/export laws and regulations, as well as the import/export control laws and regulations of other countries as applicable. You further agree to not export, re-export, or transfer, directly or indirectly, any product, technical data, software or source code received from AMD under this license, or the direct product of such technical data or software to any country for which the United States or any other applicable government requires an export license or other governmental approval without first obtaining such licenses or approvals; or in violation of any applicable laws or regulations of the United States or the country where the technical data or software was obtained. You acknowledge the technical data and software received will not, in the absence of authorization from U.S. or local law and regulations as applicable, be used by or exported, re-exported or transferred to: (i) any sanctioned or embargoed country, or to nationals or residents of such countries; (ii) any restricted end-user as identified on any applicable government end-user list; or (iii) any party where the end-use involves nuclear, chemical/biological weapons, rocket systems, or unmanned air vehicles. For the most current Country Group listings, or for additional information about the EAR or Your obligations under those regulations, please refer to the U.S. Bureau of Industry and Security's website at <a href="http://www.bis.doc.gov/">http://www.bis.doc.gov/</a>.
- 7. The Software and related documentation are "commercial items", as that term is defined at 48 C.F.R. §2.101, consisting of "commercial computer software" and "commercial computer software documentation", as such terms are used in 48 C.F.R. §12.212 and 48 C.F.R. §227.7202, respectively. Consistent with 48 C.F.R. §12.212 or 48 C.F.R. §227.7202-1 through 227.7202-4, as applicable, the commercial computer software and commercial computer software documentation are being licensed to U.S. Government end users (a) only as commercial items and (b) with only those rights as are granted to all other end users pursuant to the terms and conditions set forth in this Agreement. Unpublished rights are reserved under the copyright laws of the United States.
- 8. This Agreement is governed by the laws of the State of California without regard to its choice of law principles. Any dispute involving it must be brought in a court having jurisdiction of such dispute in Santa Clara County, California, and You waive any defenses and rights allowing the dispute to be litigated elsewhere. If any part of this agreement is unenforceable, it will be considered modified to the extent necessary to make it enforceable, and the remainder shall continue in effect. The failure of AMD to enforce any rights granted hereunder or to take action against You in the event of any breach hereunder shall not be deemed a waiver by AMD as to subsequent enforcement of rights or subsequent actions in the event of future breaches. This Agreement is the entire agreement between You and AMD concerning the Specification; it may be changed only by a written document signed by both You and an authorized representative of AMD.

## DISCLAIMER

The information contained herein is for informational purposes only, and is subject to change without notice. In addition, any stated support is planned and is also subject to change. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD's products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale.


## **Table of Contents**

- Supported Operating Systems
  - List of Supported Operating Systems
  - Fresh Installation of AMD ROCm v4.0 Recommended
  - ROCm Multi-Version Installation Update
- AMD ROCm v4.0 Documentation Updates
  - AMD ROCm Installation Guide
  - HIP Documentation Updates
  - ROCm-SMI API Updates
  - AMD ROCm General Documentation Links
- What's New in This Release
  - Introducing AMD Instinct™ MI100
    - Key Features of AMD Instinct™ MI100
    - Matrix Core Engines and GFX908 Considerations
    - References
  - RAS Enhancements
  - Using CMake with AMD ROCm
  - AMD ROCm and Mesa Multimedia
  - ROCm System Management Interface
    - Support for Printing PCIe Information on AMD Instinct™ MI100
    - New API for xGMI
  - AMD GPU Debugger Enhancements
- Known Issues
  - Upgrade to AMD ROCm v4.0 Not Supported
- Deprecations
  - WARNING: Compiler-Generated Code Object Version 2 Deprecation
  - ROCr Runtime Deprecations
    - Deprecated ROCr Runtime Functions
    - Deprecated ROCr Runtime Enumerations
    - Deprecated ROCr Runtime Structs
  - AOMP Deprecation
- Hardware and Software Support
  - Hardware Support
    - Supported Graphics Processing Units
    - Supported CPUs
  - Not supported or limited support under ROCm


## SUPPORTED OPERATING SYSTEMS

This document describes the features, fixed issues, and information about downloading and installing the AMD ROCm<sup>TM</sup> software.

It also covers fixed defects and known issues in the AMD ROCm v4.0 release.


## List of Supported Operating Systems

The AMD ROCm platform is designed to support the following operating systems:

- Ubuntu 20.04.1 (5.4 and 5.6-oem) and 18.04.5 (Kernel 5.4)
- CentOS 7.8 (3.10.0-1127) & RHEL 7.9 (3.10.0-1160.6.1.el7) (Using devtoolset-7 runtime support)
- CentOS 8.2 (4.18.0-193.el8) and RHEL 8.2 (4.18.0-193.1.1.el8) (devtoolset is not required)
- SLES 15 SP2

## FRESH INSTALLATION OF AMD ROCM V4.0 RECOMMENDED

A fresh and clean installation of AMD ROCm v4.0 is recommended. An upgrade from previous releases to AMD ROCm v4.0 is not supported. For more information, refer to the *AMD ROCm Installation Guide*.

**Note:** AMD ROCm release v3.3 or prior releases are not fully compatible with AMD ROCm v3.5 and higher versions. You must perform a fresh ROCm installation if you want to upgrade from AMD ROCm v3.3 or older to 3.5 or higher versions and vice-versa.

**Note:** *render group* is required only for Ubuntu v20.04. For all other ROCm supported operating systems, continue to use *video group*.

- For ROCm v3.5 and releases thereafter, the *clinfo* path is changed to */opt/rocm/opencl/bin/clinfo*.
- For ROCm v3.3 and older releases, the *clinfo* path remains /opt/rocm/opencl/bin/x86\_64/clinfo.
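The path change above can be captured in a small helper. This is only a sketch: `clinfo_path` is a hypothetical function name, not part of ROCm, and it assumes the version is passed as MAJOR.MINOR or MAJOR.MINOR.PATCH. Versions below 3.5 are treated as using the older path.

```shell
# Print the clinfo path that matches a ROCm version (MAJOR.MINOR[.PATCH]).
# clinfo_path is a hypothetical helper name, not a ROCm tool.
clinfo_path() (
    version=$1
    major=${version%%.*}      # text before the first dot
    rest=${version#*.}        # text after the first dot
    minor=${rest%%.*}         # second version component
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }; then
        # ROCm v3.5 and later
        echo /opt/rocm/opencl/bin/clinfo
    else
        # ROCm v3.3 and older
        echo /opt/rocm/opencl/bin/x86_64/clinfo
    fi
)
```

For example, `"$(clinfo_path 4.0.0)"` expands to the path to run on a ROCm v4.0 system.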


#### ROCM MULTI-VERSION INSTALLATION UPDATE

With the AMD ROCm v4.0 release, the following ROCm multi-version installation changes apply:

- The meta packages *rocm-dkms*<*version*> are now deprecated for multi-version ROCm installs. For example, *rocm-dkms3.7.0*, *rocm-dkms3.8.0*.
- Multi-version installation of ROCm should be performed by installing *rocm-dev*<*version*> using each of the desired ROCm versions. For example, *rocm-dev3.7.0*, *rocm-dev3.8.0*, *rocm-dev3.9.0*.
- Version files must be created for each multi-version ROCm installation of v4.0.0 or earlier.
  - **Command:** `echo <version> | sudo tee /opt/rocm-<version>/.info/version`
  - **Example:** `echo 4.0.0 | sudo tee /opt/rocm-4.0.0/.info/version`
- The rock-dkms loadable kernel modules should be installed using a single rock-dkms package.
- ROCm v3.9 and above will not set any *ldconfig* entries for ROCm libraries for multi-version installation. Users must set *LD\_LIBRARY\_PATH* to load the ROCm library version of choice.

**NOTE:** The single version installation of the ROCm stack remains the same. The *rocm-dkms* package can be used for single version installs and is not deprecated at this time.
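The version-file and library-path steps described above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names `write_rocm_version` and `rocm_library_path` are hypothetical, and the `lib` subdirectory under each versioned install is an assumption that should be checked against the actual installation.

```shell
# Create the .info/version file for one ROCm install.
# $1 = install prefix (the examples in this document use /opt), $2 = ROCm version.
# write_rocm_version is a hypothetical helper name.
write_rocm_version() (
    prefix=$1
    version=$2
    mkdir -p "$prefix/rocm-$version/.info" &&
        echo "$version" > "$prefix/rocm-$version/.info/version"
)

# Print an LD_LIBRARY_PATH value that puts one ROCm version's libraries first.
# Assumption: the libraries live under <prefix>/rocm-<version>/lib.
rocm_library_path() (
    dir="$1/rocm-$2/lib"
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        echo "$dir:$LD_LIBRARY_PATH"
    else
        echo "$dir"
    fi
)
```

Writing under /opt normally requires root privileges, which is why the command in this section uses `sudo tee`. To select a version for the current shell, one could run `export LD_LIBRARY_PATH="$(rocm_library_path /opt 3.9.0)"`.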


## AMD ROCm V4.0 DOCUMENTATION UPDATES

## AMD ROCM INSTALLATION GUIDE

The AMD ROCm Installation Guide in this release includes the following updates:

- Supported Environments
- Installation Instructions for v4.0
- HIP Installation Instructions
- AMD ROCm and Mesa Multimedia Installation
- Using CMake with AMD ROCm

## HIP DOCUMENTATION UPDATES

- HIP Programming Guide v4.0: https://github.com/RadeonOpenCompute/ROCm/blob/master/HIP\_Programming\_Guide\_v4.0.pdf
- HIP API Guide v4.0: https://github.com/RadeonOpenCompute/ROCm/blob/master/HIP-API\_Guide\_v4.0.pdf
- HIP FAQ: https://rocmdocs.amd.com/en/latest/Programming\_Guides/HIP-FAQ.html#hip-faq

## **ROCM-SMI API UPDATES**

xGMI API

For more information, refer to the ROCm SMI API Guide at

https://github.com/RadeonOpenCompute/ROCm/blob/master/ROCm\_SMI\_API\_Guide\_v4.0.pdf


## AMD ROCM GENERAL DOCUMENTATION LINKS

- For AMD ROCm documentation, see https://rocmdocs.amd.com/en/latest/
- For installation instructions on supported platforms, see https://rocmdocs.amd.com/en/latest/Installation\_Guide/Installation-Guide.html
- For AMD ROCm binary structure, see https://rocmdocs.amd.com/en/latest/Installation\_Guide/Software-Stack-for-AMD-GPU.html
- For AMD ROCm release history, see https://rocmdocs.amd.com/en/latest/Current\_Release\_Notes/ROCm-Version-History.html


## WHAT'S NEW IN THIS RELEASE

## INTRODUCING AMD INSTINCT<sup>TM</sup> MI100

The AMD Instinct<sup>TM</sup> MI100 accelerator is the world's fastest HPC GPU, and a culmination of the AMD CDNA architecture, with all-new Matrix Core Technology, and AMD ROCm<sup>TM</sup> open ecosystem to deliver new levels of performance, portability, and productivity. AMD CDNA is an all-new GPU architecture from AMD to drive accelerated computing into the era of exascale computing. The new architecture augments scalar and vector processing with new Matrix Core Engines and adds Infinity Fabric<sup>TM</sup> technology to scale up to larger systems. The open ROCm ecosystem puts customers in control and is a robust, mature platform that is easy to develop for and capable of running the most critical applications. The overall result is that the MI100 is the first GPU to break the 10 TFLOP/s FP64 barrier. It is designed as the stepping stone to the next generation of exascale systems that will deliver pioneering discoveries in machine learning and scientific computing.

## Key Features of AMD Instinct<sup>TM</sup> MI100

Important features of the AMD Instinct<sup>TM</sup> MI100 accelerator include:

- Extended Matrix Core Engine with Matrix Fused Multiply-Add (MFMA) instructions, which perform mixed-precision arithmetic and operate on KxN matrices (FP32, FP16, BF16, INT8)
- Native support for the bfloat16 data type
- Three Infinity Fabric connections per GPU, enabling a fully connected group of four GPUs in a "hive"




## Matrix Core Engines and GFX908 Considerations

The AMD CDNA architecture builds on GCN's foundation of scalars and vectors and adds matrices while simultaneously adding support for new numerical formats for machine learning and preserving backward compatibility for any software written for the GCN architecture. These Matrix Core Engines add a new family of wavefront-level instructions, the Matrix Fused Multiply-Add or MFMA. The MFMA family performs mixed-precision arithmetic and operates on KxN matrices using four different types of input data: 8-bit integers (INT8), 16-bit half-precision FP (FP16), 16-bit brain FP (BF16), and 32-bit single-precision (FP32). All MFMA instructions produce either a 32-bit integer (INT32) or FP32 output, which reduces the likelihood of overflowing during the final accumulation stages of matrix multiplication.

On nodes with gfx908, MFMA instructions are available to substantially speed up matrix operations. This hardware feature is used only in the matrix multiplication functions in rocBLAS and supports only three base types: f16\_r, bf16\_r, and f32\_r.

- For half precision (f16\_r and bf16\_r) GEMM, use the function rocblas\_gemm\_ex, and set the compute\_type parameter to f32\_r.
- For single precision (f32\_r) GEMM, use the function rocblas\_sgemm.
- For single precision complex (f32\_c) GEMM, use the function rocblas\_cgemm.

## References

- For more information about bfloat16, see https://rocblas.readthedocs.io/en/master/usermanual.html
- For more details about AMD Instinct<sup>TM</sup> MI100 accelerator key features, see
   https://www.amd.com/system/files/documents/instinct-mi100-brochure.pdf
- For more information about the AMD Instinct MI100 accelerator, refer to the following sources:
  - AMD CDNA whitepaper at https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf
  - MI100 datasheet at https://www.amd.com/system/files/documents/instinct-mi100-brochure.pdf
- AMD Instinct MI100/CDNA1 Shader Instruction Set Architecture (Dec. 2020). This document describes the
  current environment, organization, and program state of AMD CDNA "Instinct MI100" devices. It details the
  instruction set and the microcode formats native to this family of processors that are accessible to
  programmers and compilers.


#### RAS ENHANCEMENTS

RAS (Reliability, Availability, and Serviceability) features help with data center GPU management. They give users a way to track and manage data points through options implemented in the ROCm-SMI Command Line Interface (CLI) tool.

For more information about rocm-smi, see

https://github.com/RadeonOpenCompute/ROC-smi

The command options are wrappers of the system calls into the device driver interface as described here:

https://dri.freedesktop.org/docs/drm/gpu/amdgpu.html#amdgpu-ras-support

## USING CMAKE WITH AMD ROCM

Most components in AMD ROCm support CMake 3.5 or higher out-of-the-box and do not require any special Find modules. A Find module is often used downstream to find the files by guessing locations of files with platform-specific hints. Typically, the Find module is required when the upstream is not built with CMake or the package configuration files are not available.

AMD ROCm provides the respective *config-file* packages, and this enables find\_package to be used directly. AMD ROCm does not require any Find module as the *config-file* packages are shipped with the upstream projects.

For more information, see

https://rocmdocs.amd.com/en/latest/Installation\_Guide/Using-CMake-with-AMD-ROCm.html
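As an illustration of using find\_package without a Find module, the sketch below writes a minimal CMakeLists.txt for a project that links against rocBLAS. The helper name `write_rocm_cmakelists`, the project name `gemm_demo`, and the source file `main.cpp` are hypothetical. The `rocblas` package name and the `roc::rocblas` target are taken from the rocBLAS CMake package and should be confirmed against the installed version.

```shell
# Write a minimal CMakeLists.txt that locates rocBLAS through its config-file package.
# write_rocm_cmakelists, gemm_demo, and main.cpp are hypothetical names.
write_rocm_cmakelists() (
    dir=$1
    mkdir -p "$dir" &&
    cat > "$dir/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.5)
project(gemm_demo CXX)

# No Find module is needed: find_package uses the config files shipped with rocBLAS.
find_package(rocblas REQUIRED)

add_executable(gemm_demo main.cpp)
target_link_libraries(gemm_demo PRIVATE roc::rocblas)
EOF
)
```

When ROCm is installed under /opt/rocm, the configure step from a build directory would typically look like `cmake -DCMAKE_PREFIX_PATH=/opt/rocm ..`, so that find\_package searches the ROCm install tree.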

## AMD ROCM AND MESA MULTIMEDIA

AMD ROCm extends support to Mesa Multimedia. Mesa is an open-source software implementation of OpenGL, Vulkan, and other graphics API specifications. Mesa translates these specifications to vendor-specific graphics hardware drivers.

For detailed installation instructions, refer to

https://rocmdocs.amd.com/en/latest/Installation\_Guide/Mesa-Multimedia-Installation.html


#### ROCM – SYSTEM MANAGEMENT INTERFACE

The following enhancements are made to the ROCm System Management Interface (SMI).

## Support for Printing PCIe Information on AMD Instinct<sup>TM</sup> MI100

AMD ROCm extends support for printing PCIe information on AMD Instinct MI100.

To check the pp\_dpm\_pcie file, use "rocm-smi --showclocks".

/opt/rocm-4.0.0-6132/bin/rocm\_smi.py --showclocks
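The pp\_dpm\_pcie file can also be read directly. The sketch below assumes the usual amdgpu sysfs layout, in which the file lists one PCIe level per line and marks the active level with a trailing asterisk. The helper name `current_pcie_level` is hypothetical.

```shell
# Print the active entry (the line marked with a trailing '*') of a pp_dpm_pcie file.
# current_pcie_level is a hypothetical helper; $1 is the path to the file.
current_pcie_level() {
    grep '\*[[:space:]]*$' "$1"
}
```

A typical call would be `current_pcie_level /sys/class/drm/card0/device/pp_dpm_pcie`, where the `card0` index is an assumption that depends on how the GPUs are enumerated on the system.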

## New API for xGMI

Rocm\_smi\_lib now provides an API that exposes xGMI (inter-chip Global Memory Interconnect) throughput from one node to another. Refer to the rocm\_smi\_lib API documentation for more details.

https://github.com/RadeonOpenCompute/ROCm/blob/master/ROCm\_SMI\_API\_Guide\_v4.0.pdf

#### AMD GPU DEBUGGER ENHANCEMENTS

In this release, AMD GPU Debugger has the following enhancements:

- ROCm v4.0 ROCgdb is based on gdb 10.1
- Extended support for AMD Instinct<sup>TM</sup> MI100


## **KNOWN ISSUES**

The following are the known issues in this release.

## UPGRADE TO AMD ROCM V4.0 NOT SUPPORTED

An upgrade from previous releases to AMD ROCm v4.0 is not supported. A fresh and clean installation of AMD ROCm v4.0 is recommended.

## **DEPRECATIONS**

This section describes deprecations and removals in AMD ROCm.

## WARNING: COMPILER-GENERATED CODE OBJECT VERSION 2 DEPRECATION

Compiler-generated code object version 2 is no longer supported and will be removed shortly. AMD ROCm users must plan for the code object version 2 deprecation immediately.

Support for loading code object version 2 is also being deprecated with no announced removal release.

## ROCR RUNTIME DEPRECATIONS

The following ROCr Runtime enumerations, functions, and structs are deprecated in the AMD ROCm v4.0 release.

## Deprecated ROCr Runtime Functions

- hsa\_isa\_get\_info
- hsa\_isa\_compatible
- hsa\_executable\_create
- hsa\_executable\_get\_symbol
- hsa\_executable\_iterate\_symbols
- hsa\_code\_object\_serialize
- hsa\_code\_object\_deserialize
- hsa\_code\_object\_destroy
- hsa\_code\_object\_get\_info
- hsa\_executable\_load\_code\_object
- hsa\_code\_object\_get\_symbol
- hsa\_code\_object\_get\_symbol\_from\_name
- hsa\_code\_symbol\_get\_info
- hsa\_code\_object\_iterate\_symbols


## Deprecated ROCr Runtime Enumerations

- HSA\_ISA\_INFO\_CALL\_CONVENTION\_COUNT
- HSA\_ISA\_INFO\_CALL\_CONVENTION\_INFO\_WAVEFRONT\_SIZE
- HSA\_ISA\_INFO\_CALL\_CONVENTION\_INFO\_WAVEFRONTS\_PER\_COMPUTE\_UNIT
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_MODULE\_NAME\_LENGTH
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_MODULE\_NAME
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_AGENT
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_VARIABLE\_ALLOCATION
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_VARIABLE\_SEGMENT
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_VARIABLE\_ALIGNMENT
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_VARIABLE\_SIZE
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_VARIABLE\_IS\_CONST
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_KERNEL\_CALL\_CONVENTION
- HSA\_EXECUTABLE\_SYMBOL\_INFO\_INDIRECT\_FUNCTION\_CALL\_CONVENTION
- hsa\_code\_object\_type\_t
- hsa\_code\_object\_info\_t
- hsa\_code\_symbol\_info\_t

## Deprecated ROCr Runtime Structs

- hsa\_code\_object\_t
- hsa\_callback\_data\_t
- hsa\_code\_symbol\_t

## **AOMP DEPRECATION**

As of AMD ROCm v4.0, AOMP (aomp-amdgpu) is deprecated. OpenMP support has moved to the openmp-extras auxiliary package, which leverages the ROCm compiler on LLVM 12.

For more information, refer to

https://rocmdocs.amd.com/en/latest/Programming\_Guides/openmp\_support.html


## HARDWARE AND SOFTWARE SUPPORT

#### HARDWARE SUPPORT

ROCm is focused on using AMD GPUs to accelerate computational tasks such as machine learning, engineering workloads, and scientific computing. In order to focus our development efforts on these domains of interest, ROCm supports the following targeted set of hardware configurations.

## Supported Graphics Processing Units

As the AMD ROCm platform has a focus on specific computational domains, AMD offers official support for a selection of GPUs that are designed to offer good performance and price in these domains.

**NOTE:** The integrated GPUs of Ryzen are not officially supported targets for ROCm.

ROCm officially supports AMD GPUs that use the following chips:

- GFX9 GPUs
  - "Vega 10" chips, such as on the AMD Radeon RX Vega 64 and Radeon Instinct MI25
  - "Vega 7nm" chips, such as on the Radeon Instinct MI50, Radeon Instinct MI60 or AMD Radeon VII
- CDNA GPUs
  - MI100 chips such as on the AMD Instinct<sup>TM</sup> MI100

ROCm is a collection of software ranging from drivers and runtimes to libraries and developer tools. Some of this software may work with more GPUs than the "officially supported" list above, though AMD does not make any official claims of support for these devices on the ROCm software platform.

The following list of GPUs is enabled in the ROCm software. However, full support is not guaranteed:

- GFX8 GPUs
  - "Polaris 10" chips, such as on the AMD Radeon RX 580 and Radeon Instinct MI6
  - "Polaris 11" chips, such as on the AMD Radeon RX 570 and Radeon Pro WX 4100
  - "Polaris 12" chips, such as on the AMD Radeon RX 550 and Radeon RX 540
  - "Fiji" chips, such as on the AMD Radeon R9 Fury X and Radeon Instinct MI8
- GFX7 GPUs
  - "Hawaii" chips, such as the AMD Radeon R9 390X and FirePro W9100

As described in the next section, GFX8 GPUs require PCI Express 3.0 (PCIe 3.0) with support for PCIe atomics. This requires both CPU and motherboard support. GFX9 GPUs require PCIe 3.0 with support for PCIe atomics by default, but they can operate in most cases without this capability.

The integrated GPUs in AMD APUs are not yet officially supported targets for ROCm. As described below, "Carrizo", "Bristol Ridge", and "Raven Ridge" APUs are enabled in AMD upstream drivers and the ROCm OpenCL runtime. However, they are not enabled in the HIP runtime, and may not work due to motherboard or OEM hardware limitations.

### **GFX8 GPUS**

Note: The GPUs require a host CPU and platform with PCIe 3.0 with support for PCIe atomics.

| Fiji (AMD) | Polaris 10 (AMD) | Polaris 11 (AMD) | Polaris 12 (Lexa) (AMD) |
|------------|------------------|------------------|-------------------------|
| <ul> <li>Radeon R9 Fury</li> <li>Radeon R9 Nano</li> <li>Radeon R9 Fury X</li> <li>Radeon Pro Duo (Fiji)</li> <li>FirePro S9300 X2</li> <li>Radeon Instinct MI8</li> </ul> | <ul> <li>Radeon RX 470</li> <li>Radeon RX 480</li> <li>Radeon RX 570</li> <li>Radeon RX 580</li> <li>Radeon Pro Duo (Polaris)</li> <li>Radeon Pro WX 5100</li> <li>Radeon Pro WX 7100</li> <li>Radeon Instinct MI6</li> </ul> | <ul> <li>Radeon RX 460</li> <li>Radeon RX 560</li> <li>Radeon Pro WX 4100</li> </ul> | <ul> <li>Radeon RX 540</li> <li>Radeon RX 550</li> <li>Radeon Pro WX 2100</li> <li>Radeon Pro WX 3100</li> </ul> |


### **GFX9 GPUS**

ROCm offers support for two chips from AMD's most recent "gfx9" generation of GPUs.

| Vega 10 (AMD) | Vega 7nm (AMD) |
|---------------|----------------|
| <ul> <li>Radeon RX Vega 56</li> <li>Radeon RX Vega 64</li> <li>Radeon Vega Frontier Edition</li> <li>Radeon Pro WX 8200</li> <li>Radeon Pro WX 9100</li> <li>Radeon Pro V340</li> <li>Radeon Pro V340 MxGPU</li> <li>Radeon Instinct MI25</li> </ul> | <ul> <li>Radeon VII</li> <li>Radeon Instinct MI50</li> <li>Radeon Instinct MI60</li> </ul> |

Note: ROCm does not support Radeon Pro SSG.

## SUPPORTED CPUS

As described above, GFX8 GPUs require PCIe 3.0 with PCIe atomics to run ROCm. In particular, the CPU and every active PCIe point between the CPU and GPU require support for PCIe 3.0 and PCIe atomics. The CPU root must indicate PCIe AtomicOp Completion capabilities and any intermediate switch must indicate PCIe AtomicOp Routing capabilities.

The current CPUs which support PCIe Gen3 + PCIe Atomics are:

- AMD Ryzen CPUs
- CPUs in AMD Ryzen APUs
- AMD Ryzen Threadripper CPUs
- AMD EPYC CPUs
- Intel Xeon E7 v3 or newer CPUs
- Intel Xeon E5 v3 or newer CPUs
- Intel Xeon E3 v3 or newer CPUs
- Intel Core i7 v4, Core i5 v4, Core i3 v4 or newer CPUs (i.e. Haswell family or newer)
- Some Ivy Bridge-E systems



Beginning with ROCm 1.8, GFX9 GPUs (such as Vega 10) no longer require PCIe atomics. We have similarly made more PCIe lane configurations available: GFX9 GPUs can now run on CPUs without PCIe atomics and on older PCIe generations, such as PCIe 2.0. This is not supported on GPUs older than GFX9, e.g. GFX8 cards in the Fiji and Polaris families.

If you are using any PCIe switches in your system, please note that PCIe Atomics are only supported on some switches, such as Broadcom PLX. When you install your GPUs, make sure you install them in a PCIe 3.0 x16, x8, x4, or x1 slot attached either directly to the CPU's Root I/O controller or via a PCIe switch directly attached to the CPU's Root I/O controller.

In our experience, many issues stem from consumer motherboards that provide physical x16 connectors which are electrically connected as, e.g., PCIe 2.0 x4; from PCIe slots connected via the Southbridge PCIe I/O controller; or from PCIe slots connected through a PCIe switch that does not support PCIe atomics.

If you attempt to run ROCm on a system without proper PCIe atomic support, you may see an error in the kernel log (dmesg):

    kfd: skipped device 1002:7300, PCI rejects atomics
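
As an illustration, the kernel log can be searched for this message with a short script. The sketch below is not part of ROCm; the function name `find_atomics_rejections` is hypothetical, and the pattern is modeled only on the single sample line quoted above, so other kernel versions may word the message differently.

```python
import re

# Pattern modeled on the sample log line in this section (assumption:
# the wording and the vendor:device format match on your kernel).
REJECT_RE = re.compile(
    r"kfd: skipped device ([0-9a-fA-F]{4}):([0-9a-fA-F]{4}), PCI rejects atomics"
)


def find_atomics_rejections(log_text):
    """Return (vendor_id, device_id) pairs for devices the kfd driver skipped."""
    return [match.groups() for match in REJECT_RE.finditer(log_text)]


# Demonstration on the sample line from this section.
sample_log = "kfd: skipped device 1002:7300, PCI rejects atomics\n"
for vendor_id, device_id in find_atomics_rejections(sample_log):
    print(f"Device {vendor_id}:{device_id} was skipped: PCI rejects atomics")
```

In practice, the text passed to the function would be the captured output of `dmesg`, which may require elevated privileges on some systems.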

Experimental support for our Hawaii (GFX7) GPUs (Radeon R9 290, R9 390, FirePro W9100, S9150, S9170) does not require or take advantage of PCIe Atomics. However, AMD recommends that you use a CPU from the list provided above for compatibility purposes.
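
The PCIe requirements described above can be summarized as a small decision function. This is only a sketch of one reading of this section, not an AMD tool: the function name `classify_pcie_setup`, the integer encoding of GPU generations, and the result labels are all hypothetical.

```python
def classify_pcie_setup(gfx_generation, pcie_gen, has_atomics):
    """Classify a GPU/host pairing using only the rules stated in this section.

    gfx_generation: 7, 8, or 9 (hypothetical encoding of GFX7/GFX8/GFX9)
    pcie_gen:       PCIe generation of the path to the GPU, e.g. 2 or 3
    has_atomics:    True if the CPU root and every switch on the path
                    support PCIe atomics
    """
    if gfx_generation == 8:
        # GFX8 (Fiji, Polaris) requires PCIe 3.0 together with PCIe atomics.
        if pcie_gen >= 3 and has_atomics:
            return "supported"
        return "not supported"
    if gfx_generation == 9:
        # Since ROCm 1.8, GFX9 no longer requires PCIe atomics and can run
        # on older PCIe generations, but those hosts are lightly tested.
        if pcie_gen >= 3:
            return "supported"
        return "limited support"
    if gfx_generation == 7:
        # Hawaii support is experimental and does not rely on PCIe atomics.
        return "experimental"
    raise ValueError(f"GPU generation not covered here: {gfx_generation}")


print(classify_pcie_setup(8, 2, False))  # GFX8 on a PCIe 2.0 host
print(classify_pcie_setup(9, 2, False))  # GFX9 on a PCIe 2.0 host
```

Treating a GFX9 GPU on a PCIe 3.0 host without atomics as "supported" is an interpretation of the text above; confirm it against your own hardware before relying on it.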

## NOT SUPPORTED OR LIMITED SUPPORT UNDER ROCM

### **LIMITED SUPPORT**

- ROCm 4.x should support PCIe 2.0 enabled CPUs such as the AMD Opteron, Phenom, Phenom II, Athlon, Athlon X2, Athlon II, and older Intel Xeon, Intel Core Architecture, and Pentium CPUs. However, these configurations have received very limited testing because our test farm is built around the CPUs listed above, so we rely on community support here.
  - Please report any issues you encounter on these configurations.
- Thunderbolt 1, 2, and 3 enabled breakout boxes should now be able to work with ROCm. Thunderbolt 1 and 2 are PCIe 2.0 based, and thus are only supported with GPUs that do not require PCIe 3.0 atomics (e.g. Vega 10). However, we have done no testing on this configuration and would need community support due to limited access to this type of equipment.

- AMD "Carrizo" and "Bristol Ridge" APUs are enabled to run OpenCL, but do not yet support HIP or our libraries built on top of these compilers and runtimes.
  - As of ROCm 2.1, "Carrizo" and "Bristol Ridge" require the use of upstream kernel drivers.
  - In addition, various "Carrizo" and "Bristol Ridge" platforms may not work due to OEM and ODM choices about key configuration parameters, such as inclusion of the required CRAT tables and IOMMU configuration parameters in the system BIOS.
  - Before purchasing such a system for ROCm, please verify that the BIOS provides an option for enabling IOMMUv2 and that the system BIOS properly exposes the correct CRAT table. Inquire with your vendor about the latter.
- AMD "Raven Ridge" APUs are enabled to run OpenCL, but do not yet support HIP or our libraries built on top of these compilers and runtimes.
  - As of ROCm 2.1, "Raven Ridge" requires the use of upstream kernel drivers.
  - In addition, various "Raven Ridge" platforms may not work due to OEM and ODM choices about key configuration parameters, such as inclusion of the required CRAT tables and IOMMU configuration parameters in the system BIOS.
  - Before purchasing such a system for ROCm, please verify that the BIOS provides an option for enabling IOMMUv2 and that the system BIOS properly exposes the correct CRAT table. Inquire with your vendor about the latter.

### **NOT SUPPORTED**

- "Tonga", "Iceland", "Vega M", and "Vega 12" GPUs are not supported.
- AMD does not support GFX8-class GPUs (Fiji, Polaris, etc.) on CPUs that do not have PCIe 3.0 with PCIe atomics.
  - AMD Carrizo and Kaveri APUs as hosts for such GPUs are not supported.
  - Thunderbolt 1 and 2 enabled breakout boxes are not supported with GFX8 GPUs on ROCm, because Thunderbolt 1 and 2 are based on PCIe 2.0.

In the default ROCm configuration, GFX8 and GFX9 GPUs require PCI Express 3.0 with PCIe atomics. The ROCm platform leverages these advanced capabilities to allow features such as user-level submission of work from the host to the GPU. This includes PCIe atomic Fetch and Add, Compare and Swap, Unconditional Swap, and AtomicOp Completion.

Current CPUs which support PCIe 3.0 + PCIe Atomics:

| AMD | INTEL |
|-----|-------|
| Ryzen CPUs (Family 17h Model 01h-0Fh) such as: <ul> <li>Ryzen 3 1300X</li> <li>Ryzen 3 2300X</li> <li>Ryzen 5 1600X</li> <li>Ryzen 5 2600X</li> <li>Ryzen 7 1800X</li> <li>Ryzen 7 2700X</li> </ul> | Intel Core i3, i5, and i7 CPUs from Haswell and beyond. This includes: <ul> <li>Haswell CPUs such as the Core i7 4790K</li> <li>Broadwell CPUs such as the Core i7 5775C</li> <li>Skylake CPUs such as the Core i7 6700K</li> <li>Kaby Lake CPUs such as the Core i7 7740X</li> <li>Coffee Lake CPUs such as the Core i7 8700K</li> <li>Xeon CPUs from "v3" and newer</li> <li>Some models of "Ivy Bridge-E" processors</li> </ul> |
| Ryzen APUs (Family 17h Model 10h-1Fh, previously code-named Raven Ridge) such as: <ul> <li>Athlon 200GE</li> <li>Ryzen 5 2400G</li> </ul> <b>Note:</b> The integrated GPU in these devices is not guaranteed to work with ROCm. | |
| Ryzen Threadripper Workstation CPUs (Family 17h Model 01h-0Fh) such as: <ul> <li>Ryzen Threadripper 1950X</li> <li>Ryzen Threadripper 2990WX</li> </ul> | |
| EPYC Server CPUs (Family 17h Model 01h-0Fh) such as: <ul> <li>EPYC 7551P</li> <li>EPYC 7601</li> </ul> | |