Skip to content

ARROW-4126: [Go] offset not used when accessing boolean array#3275

Closed
jpacik wants to merge 3 commits intoapache:masterfrom
jpacik:fix/go-boolean-slice
Closed

ARROW-4126: [Go] offset not used when accessing boolean array#3275
jpacik wants to merge 3 commits intoapache:masterfrom
jpacik:fix/go-boolean-slice

Conversation

@jpacik
Copy link
Copy Markdown
Contributor

@jpacik jpacik commented Dec 27, 2018

Closes #3273 .

@jpacik
Copy link
Copy Markdown
Contributor Author

jpacik commented Dec 28, 2018

@stuartcarnie can you review this?

@sbinet
Copy link
Copy Markdown
Contributor

sbinet commented Dec 29, 2018

just a quick first pass, but seems good.
could you rebase on top of the latest master ?
(so .travis.yml is up to date as well as other C++, etc... files -- then the other language builds won't kick in and travis should be happy.)

@codecov-io
Copy link
Copy Markdown

codecov-io commented Dec 29, 2018

Codecov Report

Merging #3275 into master will decrease coverage by 20.36%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #3275       +/-   ##
===========================================
- Coverage    88.5%   68.14%   -20.37%     
===========================================
  Files         539       58      -481     
  Lines       72968     4015    -68953     
===========================================
- Hits        64583     2736    -61847     
+ Misses       8278     1176     -7102     
+ Partials      107      103        -4
Impacted Files Coverage Δ
go/arrow/array/boolean.go 48.27% <100%> (+5.96%) ⬆️
go/arrow/math/float64_sse4_amd64.go 0% <0%> (-100%) ⬇️
go/arrow/memory/memory_sse4_amd64.go 0% <0%> (-100%) ⬇️
go/arrow/math/int64_sse4_amd64.go 0% <0%> (-100%) ⬇️
go/arrow/math/uint64_sse4_amd64.go 0% <0%> (-100%) ⬇️
go/arrow/math/uint64_amd64.go 33.33% <0%> (ø) ⬆️
go/arrow/math/int64_amd64.go 33.33% <0%> (ø) ⬆️
go/arrow/math/float64_amd64.go 33.33% <0%> (ø) ⬆️
python/pyarrow/ipc.pxi
cpp/src/arrow/csv/chunker-test.cc
... and 486 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9376d85...67c5d73. Read the comment docs.

Copy link
Copy Markdown
Contributor

@sbinet sbinet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

but let's wait for @stuartcarnie and/or @alexandreyc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps store the value retuned by Value(int) and display it if tc.panic is true?
this way we make sure the compiler won't elide this piece of code?

Copy link
Copy Markdown
Contributor

@stuartcarnie stuartcarnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor nit, but otherwise looks good


func (a *Boolean) Value(i int) bool { return bitutil.BitIsSet(a.values, i) }
func (a *Boolean) Value(i int) bool {
if i < 0 || i >= a.array.data.length {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BitIsSet is going to panic if the index is out of bounds, so there is no need to perform this check here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuartcarnie I don't think this is always true. For instance an array of 10 boolean values is stored as a byte array of length 2. The first 8 boolean values are represented in the first byte and the last 2 boolean values are represented in the second byte. Here BitIsSet will only panic if the given index falls outside of second byte, which means an index of 16 or higher.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been focused on a performance-first approach with similar semantics to the C++ implementation and not performing bounds checks similar APIs, however, I think it is reasonable in this case, to avoid undefined behavior. We may wish to formalize where we make these tradeoffs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants