Test memory usage for particle data; add PSC_PLOT_DASK_CHUNK_SIZE #22

Merged
JamesMcClung merged 5 commits into main from test-mem
Apr 17, 2026

@JamesMcClung
Owner

The dask index was also being set incorrectly, so this PR fixes that as well.

JamesMcClung and others added 5 commits April 16, 2026 21:02
dd.read_hdf without chunksize creates one partition per file, and
set_index(sorted=True, divisions=...) locks partition boundaries to
file boundaries. Both together mean dask can't subdivide a 16GB file.
Pass the chunksize from config and let dd.concat carry partition-level
granularity through to downstream operations.
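
As a rough illustration of the change described in this commit message, the sketch below reads a chunk size from configuration and passes it to `dd.read_hdf` for each file before concatenating. It is not the PR's code: treating `PSC_PLOT_DASK_CHUNK_SIZE` as an environment variable is an assumption (the message only says "config"), and `chunk_size_from_env`, `read_particles`, `DEFAULT_CHUNK_SIZE`, and the `key` argument are hypothetical names.

```python
import os

# Hypothetical fallback; the PR does not state the project's default.
DEFAULT_CHUNK_SIZE = 1_000_000


def chunk_size_from_env(env=None):
    """Return rows per dask partition, read from PSC_PLOT_DASK_CHUNK_SIZE."""
    # Assumption: the setting is an environment variable.
    env = os.environ if env is None else env
    raw = env.get("PSC_PLOT_DASK_CHUNK_SIZE")
    if raw is None:
        return DEFAULT_CHUNK_SIZE
    size = int(raw)
    if size <= 0:
        raise ValueError("PSC_PLOT_DASK_CHUNK_SIZE must be a positive integer")
    return size


def read_particles(paths, key, chunksize):
    """Lazily read several HDF5 files in chunksize-row partitions."""
    # Imported here so the config helper above works without dask installed.
    import dask.dataframe as dd

    # One lazy frame per file; chunksize lets a large file span many partitions.
    frames = [dd.read_hdf(str(path), key, chunksize=chunksize) for path in paths]
    # Concatenation appends the per-file partitions without merging them.
    return dd.concat(frames)
```

A call might look like `read_particles(paths, key="particles", chunksize=chunk_size_from_env())`, where the key name is a placeholder. Checking `npartitions` on the result is one way to confirm that a single large file is actually being subdivided.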
Helper and sanity test for producing synthetic PSC-format particle HDF5
files in tests. Matches the subset of fields lib.particle_util reads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify that reducing PSC_PLOT_DASK_CHUNK_SIZE reduces peak memory during
the particle binning pipeline, i.e., that partitioning actually streams.
Runs each pipeline variant in a subprocess for a clean ru_maxrss reading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
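
The subprocess measurement approach mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the PR's test: `peak_rss_of_child` and the child script are hypothetical, and a plain byte-string allocation stands in for the real binning pipeline. It relies on the Unix-only `resource` module, and `ru_maxrss` is reported in kilobytes on Linux but bytes on macOS, so values are best compared between runs on the same machine.

```python
import subprocess
import sys

# Child program: allocate about n bytes, then print this process's peak RSS.
_CHILD = """
import resource, sys
buf = b"x" * int(sys.argv[1])  # stand-in for the pipeline's working set
print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
"""


def peak_rss_of_child(n_bytes):
    """Run the child in a fresh interpreter and return its reported ru_maxrss."""
    result = subprocess.run(
        [sys.executable, "-c", _CHILD, str(n_bytes)],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Running each variant in its own interpreter matters because `ru_maxrss` is a high-water mark: measured in-process, an earlier, larger variant would mask the peak of a later, smaller one.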
@JamesMcClung added the bug, enhancement, and optimization labels on Apr 17, 2026
@JamesMcClung merged commit f430bd6 into main on Apr 17, 2026
1 check passed
@JamesMcClung deleted the test-mem branch on April 17, 2026 11:43
@JamesMcClung added the testing label on Apr 18, 2026
