Port fused_conv_bias_relu to ROCm #295

Merged
pruthvistony merged 14 commits into master from conv_bias_relu_rocm
Feb 4, 2026

@amd-sriram (Collaborator) commented Feb 4, 2026

Motivation

Nvidia/apex has a fused_conv_bias_relu module that has not yet been ported to ROCm because it depends on cuDNN. This PR ports the module to ROCm using MIOpen library calls.

Technical Details

This PR uses MIOpen's fusion APIs to fuse separate kernels into a single kernel in order to reduce off-chip memory access and avoid kernel launch overhead. Using MIOpen’s fusion API, you can specify operators that you want to fuse into a single kernel, compile that kernel, and then launch it.

Reference: https://rocm.docs.amd.com/projects/MIOpen/en/docs-7.1.1/how-to/use-fusion-api.html
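
To illustrate why fusing helps, here is a minimal pure-Python sketch (1-D, scalar bias) comparing three separate passes with a single fused pass. The function names are hypothetical and this is not the MIOpen API or the code in this PR; it only shows that the fused loop yields the same values without materialising intermediate buffers.

```python
def conv1d(x, w):
    """Valid 1-D cross-correlation of signal x with filter w."""
    n_out = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n_out)]


def unfused_conv_bias_relu(x, w, b):
    # Three passes; each one writes a full intermediate list,
    # analogous to three kernels round-tripping through off-chip memory.
    y = conv1d(x, w)
    y = [v + b for v in y]
    y = [max(v, 0.0) for v in y]
    return y


def fused_conv_bias_relu(x, w, b):
    # One pass; bias and ReLU are applied while the accumulator is still local.
    out = []
    for i in range(len(x) - len(w) + 1):
        acc = b
        for k in range(len(w)):
            acc += x[i + k] * w[k]
        out.append(max(acc, 0.0))
    return out


x = [1.0, -2.0, 3.0, 0.5]
w = [1.0, -1.0]
b = 0.5
unfused = unfused_conv_bias_relu(x, w, b)
fused = fused_conv_bias_relu(x, w, b)
```

With these inputs both variables hold the same list. On a GPU the benefit comes from skipping the intermediate writes and the extra kernel launches, which this CPU sketch can only mimic structurally.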

The general workflow is:

  • Initialize an MIOpen handle object
  • Create a fusion plan
  • Create and add the convolution, bias, and activation operators
  • Compile the fusion plan
  • Set the runtime arguments for each operator
  • Run the fusion plan
  • Cleanup

Currently, the fusion API supports these operators:

  • Convolution forward
  • Activation forward
  • BatchNorm inference
  • Bias forward

[Image taken from https://developer.nvidia.com/cudnn]

More information about operator options such as data types, strides, and filters is available at https://rocm.docs.amd.com/projects/MIOpen/en/docs-7.1.1/how-to/use-fusion-api.html#supported-fusions

For the ConvBias module, the activation is set to CLAMP with alpha and beta set to the minimum and maximum float values, so that the activation is intended to act as a pass-through.
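
A small sketch of why such a clamp leaves values unchanged. It assumes that CLAMP limits values to the interval [alpha, beta] and that "minimum" means the lowest finite float32 value (-FLT_MAX), not the smallest positive one. The helper names are hypothetical and not taken from the PR.

```python
import struct

# Largest finite float32 value, decoded from its IEEE 754 bit pattern.
FLT_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]


def clamp(x, alpha, beta):
    """Limit x to the closed interval [alpha, beta] (assumed CLAMP semantics)."""
    return max(alpha, min(beta, x))


def relu(x):
    return max(x, 0.0)


values = [-3.5, 0.0, 2.25, 1.0e30]

# Bounds at the float32 extremes: every finite float32 input passes through.
passthrough = [clamp(v, -FLT_MAX, FLT_MAX) for v in values]

# Purely for contrast: a lower bound of 0 would reproduce ReLU instead.
relu_like = [clamp(v, 0.0, FLT_MAX) for v in values]
```

Here `passthrough` equals `values`, while `relu_like` matches `relu` applied elementwise. This is why a conv + bias + activation fusion can serve a module that has no real activation.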

The module uses MIOpen for the forward pass and ATen for the backward pass, since MIOpen does not yet support fused convolution in the backward direction.
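
Without a fused backward kernel, the gradient is computed as separate steps. The sketch below shows what those steps compute for a 1-D conv + scalar bias + ReLU in pure Python. It is an illustration under those simplifying assumptions, with hypothetical names; it does not show the ATen calls the PR actually makes.

```python
def conv_bias_relu_backward(x, w, y, grad_out):
    """Gradients of y = relu(conv1d(x, w) + b) given the upstream gradient."""
    # ReLU backward: gradient flows only where the forward output was positive.
    g = [go if yi > 0.0 else 0.0 for go, yi in zip(grad_out, y)]
    # Bias backward: the scalar bias was added to every output element.
    grad_b = sum(g)
    # Weight backward: correlate the input with the masked gradient.
    grad_w = [sum(g[i] * x[i + k] for i in range(len(g))) for k in range(len(w))]
    # Input backward: scatter each masked gradient back through the filter taps.
    grad_x = [0.0] * len(x)
    for i, gi in enumerate(g):
        for k, wk in enumerate(w):
            grad_x[i + k] += gi * wk
    return grad_x, grad_w, grad_b


x = [1.0, -2.0, 3.0, 0.5]
w = [1.0, -1.0]
y = [3.5, 0.0, 3.0]          # forward output for bias b = 0.5
grad_out = [1.0, 1.0, 1.0]   # upstream gradient

grad_x, grad_w, grad_b = conv_bias_relu_backward(x, w, y, grad_out)
```

The second output element was zeroed by ReLU in the forward pass, so it contributes nothing to any of the three gradients.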

Test Plan

Compile apex and run the unit test created specifically for RetinaNet.

python apex/contrib/test/conv_bias_relu/test_conv_bias_relu.py -k test_conv_bias_retinanet

Test Result

Apex compiles and the unit test passes.

Submission Checklist

@pruthvistony left a comment

UT passing, LGTM.
