[ROCm] skip two gaussian_blur tests on gfx90a#9509

Open
dnikolaev-amd wants to merge 1 commit into
pytorch:mainfrom
dnikolaev-amd:skip_gaussian_blur_tests_for_gfx90a_upstream

Conversation

@dnikolaev-amd

@dnikolaev-amd dnikolaev-amd commented Jun 3, 2026

Summary

Skip two gaussian_blur CUDA tests on AMD gfx90a (MI200, MI250) that fail due to small numerical differences with reference values. Other GPUs and CPU paths are unchanged.

See ROCM-19786 for more information.

Failures addressed

  1. test_transforms_tensor.py::test_gaussian_blur[3-meth_kwargs4-cuda]
    Failure: Batched GaussianBlur vs per-image calls disagree by 1 on a single uint8 pixel after rounding from fp32.
    Cause: MIOpen conv2d returns batch and single results that differ by 1 float32 ULP at a half-integer (batched: 188.50000000, single: 188.50001526), so rounding gives 188 vs 189. Not a transform logic bug.
  2. test_functional_tensor.py::test_gaussian_blur[gaussian_blur-sigma3-ksize2-dt3-large-cuda]
    Failure: Output exceeds atol=1.0 vs stored OpenCV reference (max diff 1.125 at known pixels).
    Cause: The stored fp16 OpenCV reference value looks incorrect. CPU (174.0) and gfx90a (173.875) both differ from OpenCV (175.0) but agree with each other to within ~0.125 (1 fp16 ULP).
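
The arithmetic behind both failures can be checked without a GPU. The snippet below is a minimal sketch using only the Python standard library; the helper names `next_float32` and `next_float16` are hypothetical, and it assumes round-half-to-even when converting to integers. It only reproduces the numbers quoted above and does not call MIOpen or torchvision.

```python
import struct


def next_float32(x: float) -> float:
    """Return the next float32 above a positive finite x (hypothetical helper)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]


def next_float16(x: float) -> float:
    """Return the next float16 above a positive finite x (hypothetical helper)."""
    bits = struct.unpack("<H", struct.pack("<e", x))[0]
    return struct.unpack("<e", struct.pack("<H", bits + 1))[0]


# Failure 1: two fp32 values one ULP apart straddle a half-integer.
batched = 188.5
single = next_float32(batched)      # 188.5 + 2**-16, printed as 188.50001526
rounded_batched = round(batched)    # half-to-even rounds the exact tie down to 188
rounded_single = round(single)      # just above the tie, so it rounds up to 189

# Failure 2: fp16 spacing near 174 is 0.125, so CPU and gfx90a differ by one ULP,
# while the stored reference is more than atol=1.0 away from the gfx90a output.
fp16_ulp = next_float16(174.0) - 174.0
opencv_ref, cpu_out, gfx90a_out = 175.0, 174.0, 173.875
cpu_vs_gpu = abs(cpu_out - gfx90a_out)
max_diff = abs(opencv_ref - gfx90a_out)

print(rounded_batched, rounded_single, fp16_ulp, cpu_vs_gpu, max_diff)
```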

Changes

Add gfx90a + ROCm + PYTEST_CURRENT_TEST guards to skip the two failing tests.
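
For illustration, a guard of this kind could look roughly like the sketch below. It is not the code from this PR: the function name `should_skip_on_gfx90a` and the tuple `SKIPPED_ON_GFX90A` are hypothetical. It assumes the caller passes in the HIP version (for example `torch.version.hip`, which is `None` on non-ROCm builds) and the device architecture string (for example `torch.cuda.get_device_properties(0).gcnArchName` on ROCm builds), so the decision logic itself stays a pure function.

```python
import os

# Test IDs taken from the PR description; matched as substrings of PYTEST_CURRENT_TEST.
SKIPPED_ON_GFX90A = (
    "test_transforms_tensor.py::test_gaussian_blur[3-meth_kwargs4-cuda]",
    "test_functional_tensor.py::test_gaussian_blur[gaussian_blur-sigma3-ksize2-dt3-large-cuda]",
)


def should_skip_on_gfx90a(hip_version, gcn_arch, current_test=None):
    """Return True only for the listed tests on a ROCm build running on gfx90a.

    hip_version: None on non-ROCm builds (assumed to come from torch.version.hip).
    gcn_arch: architecture string such as "gfx90a:sramecc+:xnack-", or None.
    current_test: defaults to the PYTEST_CURRENT_TEST variable that pytest sets.
    """
    if current_test is None:
        current_test = os.environ.get("PYTEST_CURRENT_TEST", "")
    is_rocm = hip_version is not None
    # Drop feature suffixes like ":sramecc+:xnack-" before comparing.
    is_gfx90a = (gcn_arch or "").split(":")[0] == "gfx90a"
    is_listed = any(test_id in current_test for test_id in SKIPPED_ON_GFX90A)
    return is_rocm and is_gfx90a and is_listed
```

Inside a test, the result would typically feed `pytest.skip(...)`, so other GPUs and the CPU path never take the skip branch.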

cc @jeffdaily @jithunnair-amd

@pytorch-bot

pytorch-bot Bot commented Jun 3, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9509

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla

meta-cla Bot commented Jun 3, 2026

Hi @dnikolaev-amd!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@NicolasHug
Member

Hi @dnikolaev-amd, are these tests you can skip on your side, in your own repo?

There are tens of thousands of tests in torchvision, so having per-test skipping logic for out-of-core repos isn't going to be tractable.

@dnikolaev-amd
Author

Hi @NicolasHug,
we can do that, but we would then have to carry this fix forward through every new version. We will do so if you prefer not to keep architecture-specific fixes upstream.

About the tests:

  • test_transforms_tensor.py::test_gaussian_blur[3-meth_kwargs4-cuda] - can also be fixed on gfx90a by changing the seed value. That is a one-off fix: it works for gfx90a but may affect other architectures. Alternatively, the test needs to account for the possibility of a rounding error near a half-integer (e.g. 188.5).
  • test_functional_tensor.py::test_gaussian_blur[gaussian_blur-sigma3-ksize2-dt3-large-cuda] - The kernel is only approximately normalized (sum ≈ 0.99951). It looks like OpenCV assumed the blur preserves the central pixel value (~175 at that tap), but a 23×23 weighted sum over a varying input region does not: the true conv2d output is ~173.99, not 175.
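
One way the first test could take half-integer rounding into account is sketched below. This is only an illustration of the idea, not a proposed patch: the names `near_half_integer` and `rounded_outputs_agree` and the `eps` threshold are hypothetical, and it assumes the test has access to the float outputs before they are rounded to uint8.

```python
import math


def near_half_integer(value: float, eps: float = 1e-4) -> bool:
    """True if value lies within eps of some k + 0.5 (hypothetical helper)."""
    frac = value - math.floor(value)
    return abs(frac - 0.5) <= eps


def rounded_outputs_agree(a_float, b_float, eps: float = 1e-4) -> bool:
    """Compare two sequences of pre-rounding floats after rounding.

    An off-by-one mismatch is tolerated only where both inputs sit next to a
    rounding tie; any other mismatch is reported as a real disagreement.
    """
    for a, b in zip(a_float, b_float):
        diff = abs(round(a) - round(b))
        if diff == 0:
            continue
        if diff == 1 and near_half_integer(a, eps) and near_half_integer(b, eps):
            continue
        return False
    return True


# The pair from the PR description passes; a genuine one-level difference does not.
print(rounded_outputs_agree([188.5, 10.2], [188.50001525878906, 10.2]))
print(rounded_outputs_agree([188.2], [189.2]))
```

The trade-off is that such a check is looser than exact equality, so `eps` would need to stay small enough that it only excuses values sitting on a rounding tie.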
