Skip to content

Conversation

@zeyuanxy
Copy link
Contributor

@zeyuanxy zeyuanxy commented Sep 2, 2019

Fix for JIRA issue https://issues.apache.org/jira/browse/ARROW-5471
Basically it adds offset for each buffer in the generated function to compute the correct index.

@zeyuanxy
Copy link
Contributor Author

zeyuanxy commented Sep 2, 2019

The failed test says "Address already in use" for Flight, which seems that it is not related with my changes. Is it possible to run the test again?

@kou kou changed the title ARROW-5471: [C++][Gandiva]Array offset is ignored in Gandiva projector ARROW-5471: [C++][Gandiva] Array offset is ignored in Gandiva projector Sep 2, 2019
@kou
Copy link
Member

kou commented Sep 2, 2019

#5253 will fix the Flight test failure.

Copy link
Contributor

@pravindra pravindra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR.

  • Using a single offset instead of an array of offsets will significantly simplify the change.
  • At the end of evaluating an expr, we compute the output bitmaps by doing an intersection of the input bitmaps (and some internally generated bitmaps). We'll need to fix this to take offset into account for the input bitmaps here.
  • can you please add filter tests ?

@zeyuanxy
Copy link
Contributor Author

zeyuanxy commented Sep 5, 2019

Thanks for reviewing! I replied the discussion of offset. And I will look at the output bitmap and add the filter test.

Copy link
Contributor Author

@zeyuanxy zeyuanxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pravindra As for computing the bitmap, it seems a little bit complicated here, as with the offset, the bits may not be aligned any more. I am thinking maybe copy these source bit maps with offsets to make sure they are aligned, but this seems inefficient. Any suggestions here? Thanks!

@pravindra
Copy link
Contributor

@pravindra As for computing the bitmap, it seems a little bit complicated here, as with the offset, the bits may not be aligned any more. I am thinking maybe copy these source bit maps with offsets to make sure they are aligned, but this seems inefficient. Any suggestions here? Thanks!

agree with you. For this PR, let's copy the unaligned ones - that way, there is no additional cost for the common case.

@codecov-io
Copy link

codecov-io commented Sep 7, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@d818299). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #5255   +/-   ##
=========================================
  Coverage          ?   89.32%           
=========================================
  Files             ?      754           
  Lines             ?   109133           
  Branches          ?        0           
=========================================
  Hits              ?    97481           
  Misses            ?    11652           
  Partials          ?        0
Impacted Files Coverage Δ
cpp/src/gandiva/llvm_generator.h 100% <ø> (ø)
cpp/src/gandiva/compiled_expr.h 100% <ø> (ø)
cpp/src/gandiva/llvm_generator_test.cc 98.14% <100%> (ø)
cpp/src/gandiva/llvm_generator.cc 89.09% <100%> (ø)
cpp/src/gandiva/filter.cc 100% <100%> (ø)
cpp/src/gandiva/tests/projector_test.cc 100% <100%> (ø)
cpp/src/gandiva/bitmap_accumulator_test.cc 100% <100%> (ø)
cpp/src/gandiva/bitmap_accumulator.h 100% <100%> (ø)
cpp/src/gandiva/annotator.cc 100% <100%> (ø)
cpp/src/gandiva/eval_batch.h 100% <100%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d818299...a0ebf9d. Read the comment docs.

if (op(left_reader.IsSet(), right_reader.IsSet())) {
writer.Set();
} else {
writer.Clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems the right thing to do to. we can now implement a & b & c by doing
dst = a & b
dst = dst & c

and it doesn't required the dst buffer to be zero-filled to begin with. however, it may impact perf for the compute kernels - @xhochy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we have two versions, one requires zero-filled and another one doesn't?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i prefer your current version. but, will wait to hear from others.

Copy link
Contributor

@pravindra pravindra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants