
conv2d result does not match the expected value #62429

Open
lh9171338 opened this issue Mar 5, 2024 · 7 comments

@lh9171338

Describe the Bug

# paddlepaddle = 2.5.2
import paddle

dim = 1024
paddle.seed(1)
x = paddle.randn([1, dim, 3, 3])
kernel = paddle.zeros((dim, dim, 3, 3))
for i in range(dim):
    kernel[i, i, 1, 1] = 1
out = F.conv2d(x, kernel, padding=1)
diff = out - x
print(diff.abs().max())
# The expected value is 0; the same logic implemented in PyTorch yields 0
# Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=True,
#      0.00097394)

Additional Supplementary Information

No response

@YuanRisheng
Contributor

[screenshot: error from running the snippet]
Please provide working code; this snippet does not run as posted (the F module is never imported).

@lh9171338
Author

# paddlepaddle = 2.5.2
import paddle
import paddle.nn.functional as F

dim = 1024
paddle.seed(1)
x = paddle.randn([1, dim, 3, 3])
kernel = paddle.zeros((dim, dim, 3, 3))
for i in range(dim):
    kernel[i, i, 1, 1] = 1
out = F.conv2d(x, kernel, padding=1)
diff = out - x
print(diff.abs().max())
# The expected value is 0; the same logic implemented in PyTorch yields 0
# Tensor(shape=[], dtype=float32, place=Place(gpu:0), stop_gradient=True,
#      0.00097394)
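For reference, the identity-kernel convolution above should be exact even in floating point: each output element is one input element multiplied by 1 plus a sum of zeros. The following NumPy sketch (not part of the original report; `dim` is shrunk from 1024 to 8 only to keep the explicit loops fast) computes the same convolution directly in float32 and gets a difference of exactly 0:

```python
import numpy as np

# NumPy sketch of the same identity-kernel conv2d, computed in plain float32.
dim = 8
rng = np.random.default_rng(1)
x = rng.standard_normal((1, dim, 3, 3)).astype(np.float32)

kernel = np.zeros((dim, dim, 3, 3), dtype=np.float32)
for i in range(dim):
    kernel[i, i, 1, 1] = 1  # output channel i copies input channel i

# padding=1, stride=1 convolution written out explicitly
xp = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)))
out = np.zeros_like(x)
for o in range(dim):
    for h in range(3):
        for w in range(3):
            out[0, o, h, w] = np.sum(kernel[o] * xp[0, :, h:h + 3, w:w + 3])

diff = np.abs(out - x).max()
print(diff)  # 0.0 -- multiplying by 1 and adding zeros is exact in float32
```

Since the math itself is exact, a nonzero difference on the GPU points at the backend computing in a reduced precision rather than at the convolution logic.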

@LokeZhou
Contributor

LokeZhou commented Mar 6, 2024


I ran this locally on a V100 with CUDA 11.7 using the paddle develop build, and the result matches the expectation. You could try paddle 2.6, or the nightly develop build from the paddle website.
[screenshot: output from the V100 run]

@lh9171338
Author

I ran it on an A800 with CUDA 11.7 and tried both paddle 2.6 and the nightly develop build; the result is 0.00097394 in every case.

@Ligoml
Contributor

Ligoml commented Mar 11, 2024

@LokeZhou could you verify this again on an A800?

@LokeZhou
Contributor

You can try setting these flags:
export FLAGS_use_cuda_managed_memory=true
export FLAGS_allocator_strategy=auto_growth
export FLAGS_embedding_deterministic=1
export FLAGS_cudnn_deterministic=1
export NVIDIA_TF32_OVERRIDE=0
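The flag most likely doing the work here is NVIDIA_TF32_OVERRIDE=0. On Ampere GPUs such as the A800, cuDNN convolutions use TF32 by default, which keeps only 10 mantissa bits, so operands are rounded with a relative error on the order of 2^-11 to 2^-10 before the multiply. This pure-Python sketch of TF32 truncation (an illustration of the mechanism, not code from the thread) produces errors of the same ~1e-3 magnitude as the reported difference:

```python
import struct

def round_to_tf32(x: float) -> float:
    """Truncate a value to TF32 precision (sketch).

    TF32 shares float32's 8-bit exponent but keeps only 10 of the
    23 mantissa bits; this sketch zeroes the low 13 mantissa bits
    of the float32 encoding (truncation rather than round-to-nearest).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~((1 << 13) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

x = 3.14159265
err = abs(x - round_to_tf32(x))
print(err)  # roughly 1e-3: the same order as the reported 0.00097394
```

Values whose mantissa fits in 10 bits (such as 1.0, the kernel's nonzero entries) survive the rounding exactly; it is the random inputs that lose precision, which matches a small nonzero diff rather than a gross error.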

@lh9171338
Author

After setting these flags the output is 0. Thanks 🙏
