invalid output with gfx803 and BUILD_WITH_TENSILE_HOST ON #1287

xuhuisheng · 2021-02-23T11:12:14Z

Hi,
I can finally reproduce the errors on gfx803 with BUILD_WITH_TENSILE_HOST enabled.
Here is my environment:


OS	Ubuntu-20.04.1
linux	Linux 5.4.0-64-generic
ROCm	4.0.1
GPU	RX 580 8G
pytorch	1.7.1

Test code: https://github.com/xuhuisheng/rocm-build/blob/feature/check/check/test-pytorch-rocblas.py
The test uses a single fully connected layer, Y = w0 * x0 + w1 * x1 + b.

var	value
input num	2
output num	1
weight0	1
weight1	1
bias	10
features	800
batch	32

To keep the case simple, I initialize the weights to [1, 1] and the bias to 10, so training should not change anything and the loss should always be 0.
But on the 12th step, the loss changes to 50. This is reproducible on my computer.
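The setup described above can be sketched roughly as follows. This is not the linked test script, only a minimal approximation of it: the function name `run_steps`, the SGD optimizer, the learning rate, the reading of "features 800" as 800 samples, and the target shape `(N, 1)` are all assumptions made for illustration. On a correct backend every step should report a loss of 0.

```python
import torch
from torch import nn


def run_steps(device: str = "cpu", samples: int = 800, batch: int = 32):
    """Train one Linear(2, 1) layer on constant data; return the loss of each step."""
    model = nn.Linear(2, 1).to(device)
    with torch.no_grad():
        model.weight.fill_(1.0)  # weight0 = weight1 = 1
        model.bias.fill_(10.0)   # bias = 10

    x = torch.ones(samples, 2, device=device)           # every input is [1, 1]
    y = torch.full((samples, 1), 12.0, device=device)   # 1*1 + 1*1 + 10 = 12

    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # assumed optimizer

    losses = []
    for start in range(0, samples, batch):
        output = model(x[start:start + batch])
        loss = loss_fn(output, y[start:start + batch])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


if __name__ == "__main__":
    # ROCm builds of PyTorch expose the GPU through the "cuda" device name.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    for step, value in enumerate(run_steps(device), start=1):
        print(step, value)
```

Running the same function once with `"cpu"` and once with `"cuda"` gives a reference to compare the GPU losses against.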

I printed X, Y, the output, and the loss, and found that the first 16 outputs are 12, which is correct, while the last 16 outputs are 2, which is wrong. It looks like the last 16 outputs are missing the bias.

     X tensor([[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]], device='cuda:0')
     Y tensor([1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01,
        1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01,
        1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01,
        1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01,
        1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01, 1.2000e+01,
        1.2000e+01, 1.2000e+01], device='cuda:0')
weight tensor([[1., 1.]], device='cuda:0')
  bias tensor([10.], device='cuda:0')
output tensor([[1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [1.2000e+01],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00],
        [2.0000e+00]], device='cuda:0', grad_fn=<AddmmBackward>)
     l tensor(50., device='cuda:0', grad_fn=<MseLossBackward>)
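The printed loss is consistent with the missing-bias guess. As a quick check in plain Python (the helper `mse` is only illustrative), 16 correct rows plus 16 rows computed without the bias give a mean squared error of exactly 50:

```python
def mse(outputs, targets):
    """Mean squared error over two equal-length lists."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


weights = [1.0, 1.0]
bias = 10.0
x = [1.0, 1.0]

dot = sum(w * xi for w, xi in zip(weights, x))  # 2.0
with_bias = dot + bias                          # 12.0, the expected output
without_bias = dot                              # 2.0, the value seen in the last 16 rows

outputs = [with_bias] * 16 + [without_bias] * 16
targets = [12.0] * 32

loss = mse(outputs, targets)
print(loss)  # 16 * (2 - 12)**2 / 32 = 50.0
```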

If I recompile rocBLAS with BUILD_WITH_TENSILE_HOST=OFF, the error goes away.

I know gfx803 has been dropped from official support, but I would appreciate it if someone could give me a clue. I am not at all familiar with GCN assembly. Thank you very much.

This was referenced Feb 23, 2021

invalid output with gfx803 and BUILD_WITH_TENSILE_HOST ON xuhuisheng/rocm-build#4

Open

ROCm 4.0 broke DaVinci Resolve 16 and 17 ROCm/ROCm#1345

Open

bragadeesh closed this as completed Nov 4, 2022
