[ROCm] skip two gaussian_blur tests on gfx90a by dnikolaev-amd · Pull Request #9509 · pytorch/vision

dnikolaev-amd · 2026-06-03T20:38:49Z

Summary

Skip two gaussian_blur CUDA tests on AMD gfx90a (MI200, MI250) that fail due to small numerical differences with reference values. Other GPUs and CPU paths are unchanged.

Refer to ROCM-19786 for more info.

Failures addressed

test_transforms_tensor.py::test_gaussian_blur[3-meth_kwargs4-cuda]
Failure: Batched GaussianBlur vs per-image calls disagree by 1 on a single uint8 pixel after rounding from fp32.
Cause: MIOpen conv2d returns batch and single results that differ by 1 float32 ULP at a half-integer (batched: 188.50000000, single: 188.50001526), so rounding gives 188 vs 189. Not a transform logic bug.
test_functional_tensor.py::test_gaussian_blur[gaussian_blur-sigma3-ksize2-dt3-large-cuda]
Failure: Output exceeds atol=1.0 vs stored OpenCV reference (max diff 1.125 at known pixels).
Cause: The stored fp16 OpenCV reference value looks incorrect. CPU (174.0) and gfx90a (173.875) both differ from the OpenCV value (175.0) but agree with each other to within 0.125 (1 fp16 ULP).
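
The arithmetic behind both causes can be checked without a GPU. The following is a minimal sketch using only the Python standard library; the helper `next_up` is hypothetical (not part of torchvision or the PR), and the rounding step assumes round-half-to-even, which is what Python's built-in `round` does. The plain absolute-difference check stands in for the test's `atol=1.0` comparison and ignores any relative tolerance the real test may apply.

```python
import struct


def next_up(x, fmt):
    """Return the next representable value above a positive finite x.

    fmt is a struct format character: "f" for float32, "e" for float16.
    (Hypothetical helper, for illustration only.)
    """
    int_fmt = {"f": "<I", "e": "<H"}[fmt]
    bits = struct.unpack(int_fmt, struct.pack("<" + fmt, x))[0]
    return struct.unpack("<" + fmt, struct.pack(int_fmt, bits + 1))[0]


# Failure 1: two fp32 results that are 1 ULP apart at a half-integer.
batched = 188.5
single = next_up(batched, "f")
print(f"{batched:.8f} {single:.8f}")   # 188.50000000 188.50001526
print(round(batched), round(single))   # 188 189

# Failure 2: fp16 spacing near 174 and the atol=1.0 check.
fp16_ulp = next_up(174.0, "e") - 174.0
print(fp16_ulp)                        # 0.125

atol = 1.0
opencv_ref, cpu, gfx90a = 175.0, 174.0, 173.875
print(abs(cpu - opencv_ref) <= atol)     # True: exactly on the tolerance boundary
print(abs(gfx90a - opencv_ref) <= atol)  # False: 1.125 exceeds atol
```

In both cases the device result is one ULP away from a value that would have passed, which is why these read as precision artifacts rather than logic errors.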

Changes

Add gfx90a + ROCm + PYTEST_CURRENT_TEST guards to skip the two failing tests.
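
The diff itself is not shown on this page, so the following is only a sketch of what such a guard could look like, not the PR's actual code. The function `should_skip` and the tuple `SKIPPED_ON_GFX90A` are hypothetical names. It assumes the ROCm version would come from `torch.version.hip` (which is `None` on non-ROCm builds) and the architecture from the device's `gcnArchName` property; both are passed in as plain arguments here so the logic runs without a GPU. `PYTEST_CURRENT_TEST` is the environment variable pytest sets to the running test's node ID followed by the stage, e.g. `(call)`.

```python
import os

# Test IDs taken from the PR description; the tuple name is hypothetical.
SKIPPED_ON_GFX90A = (
    "test_transforms_tensor.py::test_gaussian_blur[3-meth_kwargs4-cuda]",
    "test_functional_tensor.py::test_gaussian_blur"
    "[gaussian_blur-sigma3-ksize2-dt3-large-cuda]",
)


def should_skip(hip_version, arch_name, current_test):
    """Return True only on a ROCm build, on gfx90a, for a listed test.

    hip_version:  assumed to be torch.version.hip (None on non-ROCm builds)
    arch_name:    assumed to be gcnArchName, e.g. "gfx90a:sramecc+:xnack-"
    current_test: value of the PYTEST_CURRENT_TEST environment variable
    """
    if hip_version is None:
        return False
    # Drop feature suffixes such as ":sramecc+:xnack-" before comparing.
    if arch_name.split(":")[0] != "gfx90a":
        return False
    return any(test_id in current_test for test_id in SKIPPED_ON_GFX90A)


# Outside pytest the variable is unset, so nothing is skipped.
current = os.environ.get("PYTEST_CURRENT_TEST", "")
print(should_skip("6.2", "gfx90a:sramecc+:xnack-", current))
```

Inside a test body, a `True` result would typically be followed by a call to `pytest.skip(...)` with a reason string. Matching on the test ID keeps every other parametrization of `test_gaussian_blur` running on gfx90a.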

cc @jeffdaily @jithunnair-amd


pytorch-bot · 2026-06-03T20:38:53Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9509

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla · 2026-06-03T20:38:53Z

Hi @dnikolaev-amd!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

NicolasHug · 2026-06-04T10:33:57Z

Hi @dnikolaev-amd, are these tests you can skip on your side, in your own repo?

There are tens of thousands of tests in torchvision, so per-test skipping logic for out-of-core repos isn't going to be tractable.

dnikolaev-amd · 2026-06-04T17:24:32Z

Hi @NicolasHug,
Skipping them on our side means we would have to carry this fix through every new version. We will do that, though, if you would rather not keep architecture-specific fixes upstream.

About the tests:

test_transforms_tensor.py::test_gaussian_blur[3-meth_kwargs4-cuda] - this can also be fixed on gfx90a by changing the seed value. That is a one-off solution: it works for gfx90a but may affect other architectures. Alternatively, the test would need to account for the possibility of a rounding error near a half-integer (e.g. 188.5).
test_functional_tensor.py::test_gaussian_blur[gaussian_blur-sigma3-ksize2-dt3-large-cuda] - the kernel is only approximately normalized (sum ≈ 0.99951). It looks like the OpenCV reference assumed the blur preserves the central pixel value (~175 at that tap), but a 23×23 weighted sum over a varying input region does not: the true conv2d output is ~173.99, not 175.
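
One way to read the second option for the first test (accounting for rounding near a half-integer) is a comparison that tolerates an off-by-one only when the pre-rounding values are almost identical. The sketch below is an illustration of that idea, not an existing torchvision helper: the function name `uint8_outputs_agree` and the threshold `eps` are hypothetical, and it works on plain Python lists of pre-rounding float values rather than tensors.

```python
def uint8_outputs_agree(float_a, float_b, eps=1e-4):
    """Compare two sequences of pre-rounding float pixel values.

    Rounded values must match. The only exception: a difference of 1 is
    tolerated when the two floats are within eps of each other, which can
    only happen when they sit on opposite sides of (or on) a rounding tie.
    eps is a hypothetical threshold chosen for illustration.
    """
    for a, b in zip(float_a, float_b):
        ra, rb = round(a), round(b)
        if ra == rb:
            continue
        if abs(ra - rb) == 1 and abs(a - b) <= eps:
            continue  # rounding tie, not a real disagreement
        return False
    return True


# Values from the PR description: 1 fp32 ULP apart at 188.5.
print(uint8_outputs_agree([188.5, 10.25], [188.50001526, 10.25]))  # True
# A genuine one-level difference is still reported.
print(uint8_outputs_agree([188.0], [189.0]))                        # False
```

The trade-off is that such a check needs the float results before the cast to uint8, so it would have to hook into the test at a different point than a comparison of final outputs.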

pytorch-bot added the ciflow/rocm and module: rocm labels on Jun 3, 2026.

dnikolaev-amd wants to merge 1 commit into pytorch:main from dnikolaev-amd:skip_gaussian_blur_tests_for_gfx90a_upstream
