Change `read_pixels` to avoid passing `HATS Catalog` object to dask graph #982

smcguire-cmu · 2025-08-06T21:32:23Z

In working with the Zubercal dataset, users were running into issues with a memory leak during the read_pixel dask task. This led to workflows failing due to running out of memory. This change updates the read_hats behavior to avoid passing the hats Catalog object to the dask task graph, which seems to solve the memory leak problem.

My best guess is that the MOC object that is part of the hats Catalog could be causing this, since we've ran into issues with how they are cloud pickled before and how the Rust interaction works in dask distributed, but further investigation is necessary to confirm this.

github-actions · 2025-08-06T21:37:21Z

Before [`e7dd698`]	After [`1504048`]	Ratio	Benchmark (Parameter)
37.6±2ms	37.0±0.6ms	0.99	benchmarks.time_polygon_search
80.9±1ms	78.6±2ms	0.97	benchmarks.time_kdtree_crossmatch
23.6±0.8ms	22.7±0.2ms	0.96	benchmarks.time_box_filter_on_partition
952±5ms	882±10ms	0.93	benchmarks.time_create_midsize_catalog
6.38±0.01s	5.82±0.02s	0.91	benchmarks.time_create_large_catalog

Click here to view all benchmarks.

codecov · 2025-08-06T21:37:35Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.86%. Comparing base (e7dd698) to head (c1ea802).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #982   +/-   ##
=======================================
  Coverage   96.86%   96.86%           
=======================================
  Files          54       54           
  Lines        2554     2554           
=======================================
  Hits         2474     2474           
  Misses         80       80

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

dougbrn

This looks nicer to just define the paths ahead of the from_map call, thanks for looking into this Sean! This all looks good to me, I did notice that this slowed down two of the benchmarks by ~50%, I'm not sure how much to worry about that for this PR. Perhaps the original implementation is faster for smaller datasets, but at the cost of this scalability issue.

delucchi-cmu · 2025-08-07T13:39:55Z

Seconding the concern about scaling - we were originally passing the paths, but moved away from that intentionally, as the string construction was painfully slow for catalogs with 100k+ partitions. The regression in the benchmarks is a real one!

dougbrn · 2025-08-07T16:16:56Z

@delucchi-cmu Hmm, okay so on the one hand, the current implementation is causing some memory leak issues for specific datasets. While on the other hand the current implementation is generally faster. Perhaps merits some deeper investigation into why zubercal specifically is not releasing memory. I'm surprised to hear that passing the paths is slow, you said that's due to string construction, like the actual definition of the strings themselves?

delucchi-cmu · 2025-08-07T16:33:04Z

Yes - the string construction! In particular, calling the string construction method for each pixel, instead of using a method that will vectorize the construction.

https://github.com/lincc-frameworks/notebooks_lf/blob/6ba95d3a1284fa8b84e92ea776cca9d08d928e6c/sprints/2024/04_04/paths_optimization.ipynb

dougbrn · 2025-08-07T16:51:14Z

Thank you, that notebook is helpful. I'm going to poke at this a bit while Sean is out, I wonder if submitting the string construction to dask as it's own task layer will be the best of both worlds

delucchi-cmu

Please update the PR description.

smcguire-cmu requested a review from dougbrn August 6, 2025 21:54

dougbrn approved these changes Aug 6, 2025

View reviewed changes

smcguire-cmu added 3 commits August 13, 2025 17:26

remove read_pixel

4b664a6

lint

e6cafdd

construct path in read_pixel

706cb3c

smcguire-cmu force-pushed the sean/fix-memory-leak branch from d691f07 to 706cb3c Compare August 13, 2025 16:39

fix npix suffix type

c1ea802

smcguire-cmu marked this pull request as ready for review August 13, 2025 16:55

smcguire-cmu requested a review from delucchi-cmu August 13, 2025 16:55

delucchi-cmu requested changes Aug 13, 2025

View reviewed changes

smcguire-cmu changed the title ~~Change read_hats to load from paths instead of pixels~~ Change read_pixels to not pass hats Catalog object to dask graph Aug 13, 2025

smcguire-cmu changed the title ~~Change read_pixels to not pass hats Catalog object to dask graph~~ Change read_pixels to not pass HATS Catalog object to dask graph Aug 13, 2025

smcguire-cmu changed the title ~~Change read_pixels to not pass HATS Catalog object to dask graph~~ Change read_pixels to avoid passing HATS Catalog object to dask graph Aug 13, 2025

smcguire-cmu requested a review from delucchi-cmu August 13, 2025 18:28

delucchi-cmu approved these changes Aug 13, 2025

View reviewed changes

smcguire-cmu merged commit d9a7b95 into main Aug 13, 2025
12 of 13 checks passed

smcguire-cmu deleted the sean/fix-memory-leak branch August 13, 2025 18:29

delucchi-cmu mentioned this pull request Aug 14, 2025

Release: LSDB 0.6.4 #990

Closed

21 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Change `read_pixels` to avoid passing `HATS Catalog` object to dask graph #982

Change `read_pixels` to avoid passing `HATS Catalog` object to dask graph #982

Uh oh!

smcguire-cmu commented Aug 6, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Aug 6, 2025 •

edited

Loading

Uh oh!

codecov bot commented Aug 6, 2025 •

edited

Loading

Uh oh!

dougbrn left a comment

Uh oh!

delucchi-cmu commented Aug 7, 2025

Uh oh!

dougbrn commented Aug 7, 2025

Uh oh!

delucchi-cmu commented Aug 7, 2025

Uh oh!

dougbrn commented Aug 7, 2025

Uh oh!

delucchi-cmu left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Change read_pixels to avoid passing HATS Catalog object to dask graph #982

Change read_pixels to avoid passing HATS Catalog object to dask graph #982

Uh oh!

Conversation

smcguire-cmu commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

dougbrn left a comment

Choose a reason for hiding this comment

Uh oh!

delucchi-cmu commented Aug 7, 2025

Uh oh!

dougbrn commented Aug 7, 2025

Uh oh!

delucchi-cmu commented Aug 7, 2025

Uh oh!

dougbrn commented Aug 7, 2025

Uh oh!

delucchi-cmu left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Change `read_pixels` to avoid passing `HATS Catalog` object to dask graph #982

Change `read_pixels` to avoid passing `HATS Catalog` object to dask graph #982

smcguire-cmu commented Aug 6, 2025 •

edited

Loading

github-actions bot commented Aug 6, 2025 •

edited

Loading

codecov bot commented Aug 6, 2025 •

edited

Loading