Improve CompaSO subsample loading #154
Merged
The CompaSO subsample loading has been doing a few unnecessary steps, as documented in #7. In particular, the cleaned particles were being merged into their own contiguous table, reindexed, merged again with the original particles, and then unpacked (e.g. rvint to pos, vel). The simpler version is to never construct either of those intermediate tables and just do the unpacking directly into the final arrays, which is what this PR implements.
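As a rough illustration of the "unpack directly into the final arrays" idea, here is a minimal NumPy sketch. It is not the code in this PR: the function name `load_subsamples_direct`, the toy bit-packing (position in the upper bits, velocity in the low 12 bits), and the argument names are all assumptions made for the example, and the real rvint format differs. The point is only that the output arrays are allocated once and each halo's slice is unpacked straight into them, with no intermediate concatenated table.

```python
import numpy as np

def load_subsamples_direct(packed, npstart, npout):
    """Unpack the subsample slices of the kept halos directly into final arrays.

    packed  : (N, 3) int32 array of raw packed particles (toy format, hypothetical)
    npstart : per-halo start index into `packed`
    npout   : per-halo particle count
    Returns (pos, vel, new_start), where new_start indexes into the new arrays.
    """
    total = int(np.sum(npout))
    # Allocate the final arrays once; nothing intermediate is built.
    pos = np.empty((total, 3), dtype=np.float32)
    vel = np.empty((total, 3), dtype=np.float32)
    new_start = np.zeros(len(npout), dtype=np.int64)

    offset = 0
    for i, (s, n) in enumerate(zip(npstart, npout)):
        new_start[i] = offset
        chunk = packed[s:s + n]
        # Toy unpacking (assumption): upper bits are position, low 12 bits are velocity.
        pos[offset:offset + n] = chunk >> 12
        vel[offset:offset + n] = (chunk & 0xFFF) - 2048
        offset += n
    return pos, vel, new_start
```

Because only the slices belonging to halos present in the table are visited, particles of filtered-out halos never reach the output, and the returned start indices are already relative to the compacted arrays.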
It's a few tens of percent faster, depending on the subsample fields requested, and uses a few tens of percent less memory. Even more satisfying, it's about 100 fewer lines of code. However, the overall issue of catalog loading using far more RSS than expected is still present. I'm not sure we can make much progress on that until asdf-format/asdf#1011 is implemented.
All the tests still pass; the changes to the test reference results are to reflect that the unclean catalogs now only contain the L1 subsample particles corresponding to the unfiltered halos in the table. Previously, all subsample particles went in the table. The change to the clean halo reference is a detail about `npstart` for halos without subsamples.