Skip to content

ARROW-5830: [C++] Stop using memcmp in TensorEquals for tensors with float values#5166

Closed
mrkn wants to merge 13 commits into
apache:masterfrom
mrkn:ARROW-5830
Closed

ARROW-5830: [C++] Stop using memcmp in TensorEquals for tensors with float values#5166
mrkn wants to merge 13 commits into
apache:masterfrom
mrkn:ARROW-5830

Conversation

@mrkn
Copy link
Copy Markdown
Member

@mrkn mrkn commented Aug 22, 2019

memcmp is an inappropriate way to examine the equality of float tensors because float tensors can have NaNs. In addition to fix this issue, I made the following fixes in this pull request:

  • Change the namespace of StridedIntegerTensorContentsEquals, that is renamed from StridedTensorContentsEquals, to anonymous because it isn't referred from the outside of compare.cc
  • Let the result of Tensor::Equals be false when two tensors has the different shapes.

@mrkn mrkn force-pushed the ARROW-5830 branch 3 times, most recently from 302b81e to 63db11b Compare August 24, 2019 02:58
Comment thread cpp/src/arrow/compare.cc Outdated
@mrkn mrkn changed the title WIP: ARROW-5830: [C++] Stop using memcmp in TensorEquals for tensors with float values ARROW-5830: [C++] Stop using memcmp in TensorEquals for tensors with float values Aug 24, 2019
@mrkn
Copy link
Copy Markdown
Member Author

mrkn commented Aug 24, 2019

I think this pull-request is ready for code review.

@wesm Could you please have a look?

@codecov-io
Copy link
Copy Markdown

codecov-io commented Aug 26, 2019

Codecov Report

Merging #5166 into master will increase coverage by 1.61%.
The diff coverage is 99.24%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5166      +/-   ##
==========================================
+ Coverage   87.65%   89.27%   +1.61%     
==========================================
  Files        1025      742     -283     
  Lines      146568   106041   -40527     
  Branches     1437        0    -1437     
==========================================
- Hits       128480    94668   -33812     
+ Misses      17726    11373    -6353     
+ Partials      362        0     -362
Impacted Files Coverage Δ
cpp/src/arrow/compare.h 100% <ø> (ø) ⬆️
cpp/src/arrow/tensor.cc 96.51% <100%> (+0.04%) ⬆️
cpp/src/arrow/tensor.h 91.42% <100%> (+0.25%) ⬆️
cpp/src/arrow/tensor_test.cc 100% <100%> (ø) ⬆️
cpp/src/arrow/compare.cc 81.24% <98.36%> (+1.33%) ⬆️
r/R/arrow-package.R 63.63% <0%> (-36.37%) ⬇️
cpp/src/arrow/util/iterator.h 84.84% <0%> (-15.16%) ⬇️
cpp/src/parquet/thrift.h 86.13% <0%> (-7.9%) ⬇️
cpp/src/arrow/dataset/file_base.h 87.5% <0%> (-5.36%) ⬇️
cpp/src/arrow/util/compression.cc 82.14% <0%> (-4.96%) ⬇️
... and 336 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0f9c4a9...a74ad03. Read the comment docs.

@wesm
Copy link
Copy Markdown
Member

wesm commented Aug 26, 2019

I'm traveling this week but I will try to review.

Note that I just opened https://issues.apache.org/jira/browse/ARROW-6359, your comments would be welcome

@mrkn
Copy link
Copy Markdown
Member Author

mrkn commented Aug 26, 2019

@wesm I noticed that TensorEquals should also have an optional parameter of EqualOptions.
I want to make a change to add it by this Wednesday, please wait a moment.

And I'll look at ARROW-6359. Thank you for telling me.

@mrkn
Copy link
Copy Markdown
Member Author

mrkn commented Aug 27, 2019

I've finished working for EqualOptions support in Tensor::Equals.
This is finally ready to review.

Copy link
Copy Markdown
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size-based clause needs fixing as mentioned in the comments. Otherwise, LGTM.

@pitrou pitrou closed this in cef82b4 Aug 27, 2019
@pitrou
Copy link
Copy Markdown
Member

pitrou commented Aug 27, 2019

Thank you @mrkn !

@mrkn mrkn deleted the ARROW-5830 branch August 27, 2019 12:47
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 99.24812% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 89.27%. Comparing base (0f9c4a9) to head (a74ad03).

Files with missing lines Patch % Lines
cpp/src/arrow/compare.cc 98.36% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5166      +/-   ##
==========================================
+ Coverage   87.65%   89.27%   +1.61%     
==========================================
  Files        1025      742     -283     
  Lines      146568   106041   -40527     
  Branches     1437        0    -1437     
==========================================
- Hits       128480    94668   -33812     
+ Misses      17726    11373    -6353     
+ Partials      362        0     -362     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants