Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] TNDArray shouldn't store strides, fix show #9641

Merged
merged 4 commits into from Oct 27, 2020

Conversation

johnc1231
Copy link
Contributor

@johnc1231 johnc1231 commented Oct 26, 2020

Fixes #9640

CHANGELOG: Fixed bug where show output for ndarrays was not correct

TNDArray shouldn't store strides on it. It only existed because of the old world where PTypes weren't fully fleshed out. Especially since strides is a series of offsets in bytes, and at the virtual type level we have no idea what elements of the ndarray will actually be.

The way I've fixed this for now is by making all incoming ndarray literals row major and all Java Row representation style things use UnsafeIndexedSeqRowMajorView to make them indexable in arow major way. I mostly just did this since row by row is the way you'd want to display data for show. My eventual plan is reorganize the ndarray emitter so we know that all ndarrays are being emitted column major all the time, and then we won't have to do this view wrapper thing since we will know what striding to expect.

This change is also good because now that strides are off of TNDArray/literals, there should be nothing preventing us from passing ndarrays of arbitrary data types back to Python from the JVM (currently we only support collecting ndarrays of primitives).

@danking danking merged commit 8df2a9c into hail-is:main Oct 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NdArrays Incorrect results for show vs collect
3 participants