Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nn_test.py on AVX512 builds #21541

Merged
merged 1 commit into from
Aug 11, 2018

Conversation

markdryan
Copy link
Contributor

This patch modifies the nn_test test case L2LossTest.testGradient
so that it passes on AVX512 builds. The test case is failing
as the error tolerance used in the test case is too strict.
The test case compares the difference of pairs of tensor reductions
to an expected result. If the comparison is out by more than 1e-11
the test case fails. The problem here is that the results of a
summation reduction of doubles of the same tensor can differ slightly
on different builds. AVX2, AVX512 and non vectorized versions of the
tensor contraction algorithm add the tensor's contents together in
different orders and this different ordering can produce slightly
different results due to rounding errors.

The accuracy of AVX512 tensor reduction is no worse than the AVX2
implementation. In fact, it's only luck that this test case passes
on AVX2 builds and fails on AVX512 builds. If the seed at the start of
the test is changed from 1 to 3, the test passes on AVX512 builds and
fails on AVX2 builds. Rather than trying to find a seed that allows
the test case to pass on all CPU architectures, it is better to relax
the test criteria a little bit.

Signed-off-by: Mark Ryan mark.d.ryan@intel.com

This patch modifies the nn_test test case L2LossTest.testGradient
so that it passes on AVX512 builds.  The test case is failing
as the error tolerance used in the test case is too strict.
The test case compares the difference of pairs of tensor reductions
to an expected result.  If the comparison is out by more than 1e-11
the test case fails.  The problem here is that the results of a
summation reduction of doubles of the same tensor can differ slightly
on different builds.  AVX2, AVX512 and non vectorized versions of the
tensor contraction algorithm add the tensor's contents together in
different orders and this different ordering can produce slightly
different results due to rounding errors.

The accuracy of AVX512 tensor reduction is no worse than the AVX2
implementation.  In fact, it's only luck that this test case passes
on AVX2 builds and fails on AVX512 builds.  If the seed at the start of
the test is changed from 1 to 3, the test passes on AVX512 builds and
fails on AVX2 builds.  Rather than trying to find a seed that allows
the test case to pass on all CPU architectures, it is better to relax
the test criteria a little bit.

Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>
@drpngx drpngx added the ready to pull PR ready for merge process label Aug 10, 2018
@drpngx drpngx self-assigned this Aug 10, 2018
@tensorflow-copybara tensorflow-copybara merged commit f135cec into tensorflow:master Aug 11, 2018
tensorflow-copybara pushed a commit that referenced this pull request Aug 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants