
Conversation

@swolchok (Contributor):
Partial fix for #7748.

[ghstack-poisoned]
@swolchok (Contributor, Author) commented on Jan 21, 2025:

Stack from ghstack (oldest at bottom):

@pytorch-bot (bot) commented on Jan 21, 2025:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7807

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 11f4d2d with merge base 466d98f:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Jan 21, 2025
swolchok added a commit that referenced this pull request Jan 21, 2025
Partial fix for #7748.

ghstack-source-id: f51e6f2
ghstack-comment-id: 2605752234
Pull Request resolved: #7807
@swolchok added the release notes: ops & kernels label (changes to the opset and any new / changed kernel implementations) on Jan 21, 2025
auto expected_grad_weight = tf.make({4, 3, 4, 2}, expected_grad_weight_data);
auto expected_grad_bias = tf.make({4}, expected_grad_bias_data);
if (DTYPE == ScalarType::Half || DTYPE == ScalarType::BFloat16) {
EXPECT_TENSOR_CLOSE_WITH_TOL(grad_input, expected_grad_input, 1e-2, 1e-8);
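
For context, the hunk shown here is only part of the test; presumably the non-reduced-precision dtypes fall back to the default comparison. A plausible sketch of that shape (the else branch below is an assumption, not the verbatim diff):

```cpp
// Sketch only (assumed shape of the surrounding test, not the verbatim diff):
// reduced-precision dtypes get explicitly loosened tolerances; other dtypes
// fall back to the default comparison.
if (DTYPE == ScalarType::Half || DTYPE == ScalarType::BFloat16) {
  EXPECT_TENSOR_CLOSE_WITH_TOL(grad_input, expected_grad_input, 1e-2, 1e-8);
} else {
  EXPECT_TENSOR_CLOSE(grad_input, expected_grad_input);
}
```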
A reviewer (Contributor) commented:
Why not use defaults here? EXPECT_TENSOR_CLOSE_WITH_TOL should apply the right tolerance given the type

@swolchok (Contributor, Author) replied:
because the default rtol is 1e-5; rtol and atol are different
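
To make the distinction concrete: an absolute tolerance is a fixed slack, while a relative tolerance scales with the magnitude of the expected value, and the usual elementwise check combines both. A rough illustration of that allclose-style condition (not the actual EXPECT_TENSOR_CLOSE_WITH_TOL implementation):

```cpp
#include <cmath>
#include <cstddef>

// Illustrative only: an allclose-style elementwise check combining a relative
// tolerance (rtol) and an absolute tolerance (atol). The real macro in the
// executorch test utilities may differ in details (NaN handling, reporting).
bool roughly_close(const float* actual, const float* expected, size_t n,
                   double rtol, double atol) {
  for (size_t i = 0; i < n; ++i) {
    const double diff = std::fabs(double(actual[i]) - double(expected[i]));
    const double bound = atol + rtol * std::fabs(double(expected[i]));
    if (!(diff <= bound)) {
      return false;  // element i is outside the combined tolerance
    }
  }
  return true;
}
```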

The reviewer (Contributor) replied:
Right, but in the same way that we have kDefaultHalfAtol and kDefaultBFloat16Atol, I think we should have kDefaultHalfRtol and kDefaultBFloat16Rtol and set them to proper values.
You seem to be using 1e-2 for most of these tests. Why not introduce kDefaultHalfRtol and kDefaultBFloat16Rtol with value 1e-2?
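
For illustration, the proposal would amount to something like the following sketch. The constant names and the 1e-2 value come from this thread; the helper function and its use of the ScalarType enum referenced in the tests above are assumptions, and none of this exists in the tree:

```cpp
// Hypothetical defaults mirroring kDefaultHalfAtol / kDefaultBFloat16Atol;
// 1e-2 is the value suggested in this review thread.
constexpr double kDefaultHalfRtol = 1e-2;
constexpr double kDefaultBFloat16Rtol = 1e-2;

// A call site could then pick the rtol from the dtype instead of hard-coding
// literals in each test. ScalarType here is the dtype enum used in the tests
// above; the 1e-5 fallback is the default rtol mentioned earlier in the thread.
double default_rtol_for(ScalarType dtype) {
  switch (dtype) {
    case ScalarType::Half:
      return kDefaultHalfRtol;
    case ScalarType::BFloat16:
      return kDefaultBFloat16Rtol;
    default:
      return 1e-5;
  }
}
```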

@swolchok (Contributor, Author) replied:
> Why not introduce kDefaultHalfRtol and kDefaultBFloat16Rtol with value 1e-2?

Because not all operators require the higher rtol.

@swolchok (Contributor, Author) added:
It is not particularly uncommon to need to set rtol in pytorch core: https://github.com/search?q=repo%3Apytorch%2Fpytorch+%2Frtol%3D%5B1-9%5D%2F&type=code

@swolchok merged commit dabd72f into main on Jan 23, 2025
44 of 47 checks passed
@swolchok deleted the gh/swolchok/158/head branch on January 23, 2025 at 17:40
YIWENX14 pushed a commit that referenced this pull request Jan 28, 2025
zonglinpeng pushed a commit to zonglinpeng/executorch that referenced this pull request Jan 30, 2025