feat: dask chunking on frame level implemented #242
base: main

Conversation
CodSpeed Performance Report: merging #242 will not alter performance.

Codecov Report: all modified and coverable lines are covered by tests ✅

```
@@ Coverage Diff @@
##             main     #242      +/-   ##
==========================================
+ Coverage   94.97%   95.03%   +0.05%
==========================================
  Files          18       18
  Lines        2429     2458      +29
==========================================
+ Hits         2307     2336      +29
  Misses        122      122
```
Great! Tests are starting to pass now. I'll take a closer look soon. Were you able to qualitatively confirm that this indeed speeds things up if you read a single subchunk of a massive XY image?
Are you actually calling `compute()`, or just building the array? If the former (i.e. if you're computing the full plane), then I wouldn't be surprised at all that the time increases as chunk size decreases, since there are more and more operations to complete. What would be important to verify, though, is that it takes less time to read a single chunk as the chunk size decreases (i.e. to confirm that it can indeed efficiently access a subset of the data, rather than reading all the data and just cropping it down after the fact).
This is purely executing `to_dask` (building the graph, not computing it):

```python
import timeit

import matplotlib.pyplot as plt
import nd2

chunks = [
    (1024,),
    (512,) * 2,
    (256,) * 4,
    (128,) * 8,
    (64,) * 16,
    (32,) * 32,
    (16,) * 64,
]
chunkstr = [
    "(1024,)",
    "(512,)*2",
    "(256,)*4",
    "(128,)*8",
    "(64,)*16",
    "(32,)*32",
    "(16,)*64",
]

# `path` points at the ND2 file used throughout this thread.
file = nd2.ND2File(path)
file.sizes  # {'P': 26, 'Z': 1263, 'C': 3, 'Y': 1024, 'X': 1024}

# Time graph construction for each Y-chunking of the (C, Y, X) frame.
times = []
for c in chunks:
    start = timeit.default_timer()
    file.to_dask(frame_chunks=((3,), c, (1024,)))
    times.append(timeit.default_timer() - start)

fig, ax = plt.subplots(1, 1, figsize=[10, 6])
x = [len(c) for c in chunks]
ax.plot(x, times)
ax.set_ylabel("time of `to_dask` in s")
ax.set_xlabel("Chunks in y dimension")
ax.set_xticks(x, chunkstr, rotation=90)
```
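As a quick sanity check on those chunkings (nothing nd2-specific): each tuple must tile the full 1024-pixel Y axis exactly, and the number of sub-chunks, and hence the number of tasks `to_dask` has to put into the graph per frame, doubles at every step. That alone is consistent with graph-construction time growing as the chunks shrink:

```python
chunks = [
    (1024,),
    (512,) * 2,
    (256,) * 4,
    (128,) * 8,
    (64,) * 16,
    (32,) * 32,
    (16,) * 64,
]

# Every chunking must tile the 1024-pixel Y axis exactly.
assert all(sum(c) == 1024 for c in chunks)

# The task count per frame doubles at each step.
print([len(c) for c in chunks])  # [1, 2, 4, 8, 16, 32, 64]
```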
I made some more simple tests, still not going through `to_dask`; these call `read_frame` directly:
```python
%%timeit
with nd2.ND2File(path) as file:
    for i in range(500):
        new = file.read_frame(i).copy()
```

```python
%%timeit
with nd2.ND2File(path) as file:
    for i in range(500):
        new = file.read_frame(i)[:128, :128].copy()
```

```python
%%timeit
with nd2.ND2File(path) as file:
    for i in range(file.shape[0]):
        j = np.ravel_multi_index((i, 0), file._coord_shape)
        new = file.read_frame(j).copy()
```

```python
%%timeit
with nd2.ND2File(path) as file:
    for i in range(file.shape[0]):
        j = np.ravel_multi_index((i, 0), file._coord_shape)
        new = file.read_frame(j)[:128, :128].copy()
```

```python
%%timeit
with nd2.ND2File(path) as file:
    for i in range(file.shape[1]):
        j = np.ravel_multi_index((0, i), file._coord_shape)
        new = file.read_frame(j).copy()
```

```python
%%timeit
with nd2.ND2File(path) as file:
    for i in range(file.shape[1]):
        j = np.ravel_multi_index((0, i), file._coord_shape)
        new = file.read_frame(j)[:128, :128].copy()
```
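For reference, the `ravel_multi_index` calls above map an (outer, inner) coordinate to the flat frame index that `read_frame` takes. Assuming `_coord_shape` is `(26, 1263)` here (the P and Z sizes of this file, with Z varying fastest), stepping the first axis jumps the index by a whole Z-stack while stepping the second moves it by one frame:

```python
import numpy as np

# (P, Z) coordinate shape, matching file.sizes above.
coord_shape = (26, 1263)

# Stepping P advances the flat index by a whole Z-stack ...
assert np.ravel_multi_index((1, 0), coord_shape) == 1263
# ... while stepping Z advances it by one frame.
assert np.ravel_multi_index((0, 5), coord_shape) == 5
print(np.ravel_multi_index((2, 10), coord_shape))  # 2 * 1263 + 10 = 2536
```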
The cropping on the … I am also going to compare the subindexing of a region with …
Hi,

I tried to implement the sub-frame chunking in the `to_dask` and `_dask_block` functions, as mentioned in #85. I also added some new tests. It might need some more comments.

Best,
Niklas
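Conceptually, the sub-frame chunking can be pictured like this (a standalone sketch, not the PR's actual implementation; `FRAME`, `dask_block`, and the 8x8 shape are invented for the demo). Each dask task knows its own location in the output array and returns only that sub-region of the frame:

```python
import numpy as np
import dask.array as da

# Stand-in for a single (Y, X) frame, as ND2File.read_frame would return it.
FRAME = np.arange(64, dtype=np.uint16).reshape(8, 8)

def dask_block(block_info=None):
    # Each task looks up its own (start, stop) extents in the output
    # array and returns just that sub-region of the frame.
    (y0, y1), (x0, x1) = block_info[None]["array-location"]
    return FRAME[y0:y1, x0:x1]

# One frame, chunked 4 rows at a time along Y.
arr = da.map_blocks(dask_block, chunks=((4,) * 2, (8,)), dtype=np.uint16)

# Reassembling all chunks reproduces the frame.
assert np.array_equal(arr.compute(), FRAME)
```

Whether the real reader can fetch just those bytes from disk, rather than reading the whole frame and cropping, is exactly the efficiency question raised earlier in the thread.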