Skip to content

Conversation

@lgarrison
Copy link
Member

@lgarrison lgarrison commented Sep 27, 2024

The CompaSO subsample loading has been doing a few unnecessary steps, as documented in #7. In particular, the cleaned particles were being merged into their own contiguous table, reindexed, merged again with the original particles, and then unpacked (e.g. rvint to pos, vel). The simpler version is to never construct either of those intermediate tables and just do the unpacking directly into the final arrays, which is what this PR implements.

It's a few 10s of %s faster, depending on the subsample fields requested, and uses a few 10s of %s less memory. Even more satisfying, it's about 100 fewer lines of code. However, the overall issue of the catalog loading using way more RSS than expected is still present. I'm not sure we can make much progress on that until asdf-format/asdf#1011 is implemented.

All the tests still pass; the changes to the test reference results are to reflect that the unclean catalogs now only contain the L1 subsample particles corresponding to the unfiltered halos in the table. Previously, all subsample particles went in the table. The change to the clean halo reference is a detail about npstart for halos without subsamples.

@lgarrison lgarrison linked an issue Sep 27, 2024 that may be closed by this pull request
@lgarrison lgarrison merged commit 2597bbd into main Sep 27, 2024
@lgarrison lgarrison deleted the compaso-loading branch September 27, 2024 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unpack cleaned subsamples directly into subsample table

2 participants