AMD ROCm™ Software

AMD ROCm software is AMD's Open Source stack for GPU computation.

To learn more about ROCm, check out our Documentation, Examples, and Developer Hub.

If you have questions or need help, reach out to us on GitHub.

Popular repositories

  1. ROCm Public

    AMD ROCm™ Software - GitHub Home

    Shell · 5.1k stars · 415 forks

  2. hip Public

    HIP: C++ Heterogeneous-Compute Interface for Portability

    C++ · 3.9k stars · 551 forks

  3. MIOpen Public

    AMD's Machine Intelligence Library

    Assembly · 1.1k stars · 243 forks

  4. tensorflow-upstream Public

    Forked from tensorflow/tensorflow

    TensorFlow ROCm port

    C++ · 689 stars · 96 forks

  5. HIPIFY Public

    HIPIFY: Convert CUDA to Portable C++ Code

    C++ · 563 stars · 84 forks

  6. ROCm-docker Public

    Dockerfiles for the various software layers defined in the ROCm software platform

    Shell · 453 stars · 69 forks

Repositories

Showing 10 of 310 repositories
  • llvm-project Public Forked from llvm/llvm-project

    This is the AMD-maintained fork of the LLVM git repository. This repository accepts pull requests and issues related to AMD fork-specific topics (amd/*). For all other issues/PRs, please submit upstream at https://github.com/llvm/llvm-project.

    LLVM · 140 stars · 13,241 forks · 16 issues · 5 pull requests · Updated Mar 23, 2025
  • xla Public Forked from openxla/xla

    A machine learning compiler for GPUs, CPUs, and ML accelerators

    C++ · 4 stars · Apache-2.0 · 529 forks · 0 issues · 29 pull requests · Updated Mar 24, 2025
  • MIOpen Public

    AMD's Machine Intelligence Library

    Assembly · 1,133 stars · 243 forks · 248 issues (4 issues need help) · 112 pull requests · Updated Mar 24, 2025
  • aiter Public

    AI Tensor Engine for ROCm

    Python · 97 stars · MIT · 19 forks · 8 issues · 12 pull requests · Updated Mar 24, 2025
  • tensorflow-upstream Public Forked from tensorflow/tensorflow

    TensorFlow ROCm port

    C++ · 689 stars · Apache-2.0 · 91,077 forks · 25 issues · 74 pull requests · Updated Mar 24, 2025
  • composable_kernel Public

    Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

    C++ · 367 stars · 165 forks · 30 issues (1 issue needs help) · 67 pull requests · Updated Mar 24, 2025
  • flash-attention Public Forked from Dao-AILab/flash-attention

    Fast and memory-efficient exact attention

    Python · 162 stars · BSD-3-Clause · 1,567 forks · 14 issues · 2 pull requests · Updated Mar 24, 2025
  • hipBLASLt Public

    hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library

    Assembly · 83 stars · MIT · 111 forks · 11 issues · 92 pull requests · Updated Mar 24, 2025
  • device-metrics-exporter Public

    Device Metrics Exporter exports metrics from AMD devices (GPUs) to collectors like Prometheus.

    Go · 9 stars · Apache-2.0 · 12 forks · 3 issues · 3 pull requests · Updated Mar 24, 2025
  • C++ · 21 stars · MIT · 14 forks · 8 issues · 7 pull requests · Updated Mar 24, 2025