Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: strided interpretation for data with extra offsets #852

Merged
merged 19 commits into from
Mar 27, 2023

Conversation

ioanaif
Copy link
Collaborator

@ioanaif ioanaif commented Mar 7, 2023

No description provided.

@ioanaif ioanaif linked an issue Mar 7, 2023 that may be closed by this pull request
@ioanaif
Copy link
Collaborator Author

ioanaif commented Mar 20, 2023

pre-commit.ci autofix

@ioanaif
Copy link
Collaborator Author

ioanaif commented Mar 21, 2023

I compared the last issue of the non-primitive dtypes being passed on to awkward in the test added by this PR with tests/test_0034-generic-objects-in-ttrees.py::test_jagged_strided_awkward

I tracked where the byte order changes. This happens in interpretation.jagged.final_array():



print("content dtype ", cnt['fP/fX'].dtype) => content dtype  >f8
content = numpy.empty((before,), self.content.to_dtype)
before = 0
for cnt in contents:
    content[before : before + len(cnt)] = cnt
    before += len(cnt)

print(“content dtype after “, content['fP/fX'].dtype) => content dtype after  float64

For the test I added, in interpretation.numerical.final_array() this transformation doesn’t take place so the byteorder is preserved and thus awkward complains about the fact that it gets a non-native dtype. This is solved by adding output = output.newbyteorder('=') before self.hook_before_library_finalize.

Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only change (which we discussed on Zoom): hide the new (None, None) elements of members by determining _all_headers_prepended once for each AsStridedObjects in its __init__, and then remove the (None, None) from the AsStridedObjects._members that it presents to the world (through the members property).

@ioanaif ioanaif requested a review from jpivarski March 27, 2023 16:30
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is much better. Previously, this logic was happening in AsDtype, but now it's in AsStridedObjects, and that's good because it can only affect this one subclass.

Everything looks ready to merge. Go ahead and do it when you're done with the git branch!

@@ -4,8 +4,6 @@
import uproot
import skhep_testdata

ROOT = pytest.importorskip("ROOT")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, because the file already exists. This will allow the test to be tested on more systems. Thanks for catching this!

@ioanaif ioanaif merged commit f6eb0bb into main Mar 27, 2023
@ioanaif ioanaif deleted the ioanaif/fix-interpreting-xyzvector-513 branch March 27, 2023 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problem with interpreting XYZVector
2 participants