Fix lagging 3d view for big data in auto color mode #6411

Czaki · 2023-11-04T17:00:09Z

References and relevant issues

closes #6397

Description

This PR fixes fps performance issues in OpenGL introduced by #3308. In that PR, the texture type was changed to float32 in order to pass the label values directly to the texture. It turns out that OpenGL performance for float32 textures is much worse than for uint8 textures.

Here we change the code to use uint8 whenever the final number of colors is less than 255 in automatic coloring mode, or uint16 if the number is less than 65535.
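As a rough illustration of the rule above, here is a minimal sketch of the dtype choice. The function name `minimal_texture_dtype` is hypothetical (it is not taken from the napari code base), and the float32 fallback for larger color counts is an assumption based on the previous behavior described in this PR.

```python
import numpy as np


def minimal_texture_dtype(num_colors: int):
    """Pick the smallest texture dtype for a given number of colors.

    Hypothetical helper: thresholds follow the PR description,
    the float32 fallback is an assumption.
    """
    if num_colors < 255:
        return np.uint8
    if num_colors < 65535:
        return np.uint16
    return np.float32


print(minimal_texture_dtype(50))      # uint8 range is enough
print(minimal_texture_dtype(1000))    # needs uint16
print(minimal_texture_dtype(100000))  # falls back to float32
```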

This is achieved by transforming original data using a modulo-like operation that avoids the background label landing on 0.
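To make "modulo-like" concrete, the sketch below uses the formula quoted in the follow-up #6467 description, `texture = (values - 1) % ncolors + 1`, with the background value mapped to 0. It is a plain NumPy illustration, not the numba implementation from this PR; the name `cast_labels_modulo` and its parameters are made up for the example.

```python
import numpy as np


def cast_labels_modulo(values, ncolors, background=0, dtype=np.uint8):
    """Map labels into 1..ncolors, reserving 0 for the background.

    Illustrative only: assumes ncolors fits in the target dtype.
    """
    values = np.asarray(values)
    # Non-background labels wrap around into the range 1..ncolors.
    texture = (values - 1) % ncolors + 1
    # The background label is the only one allowed to land on 0.
    texture[values == background] = 0
    return texture.astype(dtype)


labels = np.array([0, 1, 2, 50, 51, -3])
print(cast_labels_modulo(labels, ncolors=50))
```

Note that labels 1 and 51 end up with the same texture value here, which is the kind of collision discussed later in this thread.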

This PR introduces a numba dependency, which might not be a long-term solution. We can revisit the decision if it causes issues (such as a delay in supporting newer Python versions), and perhaps move this utility to a dependency that already contains compiled code, such as scikit-image.

This PR also disables the caching strategy used to speed up painting until someone actually starts painting. It is a significant speedup in startup time and reduces memory usage.
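The idea of deferring the cache can be sketched as follows. This is a generic lazy-allocation pattern under stated assumptions, not napari's actual caching code: the class `LazyPaintCache` and the choice of "a copy of the data" as the cached array are both hypothetical.

```python
import numpy as np


class LazyPaintCache:
    """Hypothetical helper: build the cache on first paint, not at load."""

    def __init__(self, data):
        self.data = data
        self._cached = None  # nothing is allocated at load time

    @property
    def is_allocated(self):
        return self._cached is not None

    def paint(self, index, label):
        if self._cached is None:
            # First paint: pay the allocation cost only now.
            self._cached = self.data.copy()
        self.data[index] = label
        self._cached[index] = label


cache = LazyPaintCache(np.zeros((4, 4), dtype=np.uint32))
print(cache.is_allocated)   # no cache until someone paints
cache.paint((1, 2), 7)
print(cache.is_allocated)   # cache exists after the first paint
```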

codecov · 2023-11-05T18:17:43Z

Codecov Report

Merging #6411 (8e13fc7) into main (932054d) will decrease coverage by 0.06%.
The diff coverage is 95.65%.

@@            Coverage Diff             @@
##             main    #6411      +/-   ##
==========================================
- Coverage   92.19%   92.13%   -0.06%     
==========================================
  Files         599      601       +2     
  Lines       52987    53088     +101     
==========================================
+ Hits        48853    48915      +62     
- Misses       4134     4173      +39

Files	Coverage Δ
napari/_qt/_tests/test_qt_viewer.py	`94.70% <100.00%> (ø)`
napari/_qt/layer_controls/qt_labels_controls.py	`94.64% <ø> (ø)`
napari/_vispy/_tests/test_vispy_labels_layer.py	`98.18% <100.00%> (ø)`
napari/_vispy/utils/visual.py	`96.22% <100.00%> (ø)`
napari/_vispy/visuals/image.py	`88.88% <100.00%> (+1.38%)`	⬆️
napari/_vispy/visuals/util.py	`100.00% <100.00%> (ø)`
napari/_vispy/visuals/volume.py	`100.00% <100.00%> (ø)`
napari/layers/labels/_labels_utils.py	`95.83% <100.00%> (ø)`
napari/layers/labels/_tests/test_labels.py	`100.00% <100.00%> (ø)`
napari/utils/colormaps/_tests/test_colormap.py	`100.00% <100.00%> (ø)`
... and 6 more

... and 12 files with indirect coverage changes

imagesc-bot · 2023-11-06T08:44:09Z

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/request-for-testing-napari-0-4-19-release-candidate/87972/14

jni

@Czaki I've left some initial comments. Still getting my head around the flow, but I think I understand the gist. It's very cool and I love that you fixed the direct case too where possible. ❤️


jni · 2023-11-06T07:57:41Z

napari/_vispy/layers/image.py

        ):
            self._on_display_change(data)
        else:
            node.set_data(data)
-
-        node.visible = not self.layer._slice.empty and self.layer.visible


Is this change intentional? I find it surprising... Is it because _on_display_change correctly sets the node visibility anyway?

Yes. It is intentional, to avoid triggering unnecessary events.


jni · 2023-11-06T10:09:01Z

napari/utils/colormaps/colormap.py

+def _cast_labels_to_minimum_type_auto(
+    data: np.ndarray, num_colors: int, dtype
+) -> np.ndarray:
+    result_array = np.zeros_like(data, dtype=dtype)


Oof. This is a big step. However, I tried my best to optimise this in NumPy, avoiding unnecessary allocations, and got 10x worse speed...

In [14]: %timeit cm._cast_labels_to_minimum_type_auto(data, 50, np.uint8)
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
33 ms ± 2.19 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [16]: %%timeit
    ...: np.add(np.mod(data, np.uint8(50), out=out, casting='unsafe'),
    ...:        np.uint8(1),
    ...:        where=(data >= 50) | (data < 0),
    ...:        out=out)
    ...:
388 ms ± 830 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

Even if we optimise for unsigned ints, we are still close to 10x worse:

In [15]: %%timeit
    ...: np.add(np.mod(data, np.uint8(50), out=out, casting='unsafe'),
    ...:        np.uint8(1),
    ...:        where=(data >= 50),
    ...:        out=out)
    ...:
280 ms ± 458 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

So, maybe we should try to bite the bullet. Alternatively, maybe we can dispatch to the above NumPy code if numba is not available, but recommend numba (and include it in [all]).

What do you think @Czaki?

Certainly, depending on numba would open up a lot of optimisations. But it feels like we are making a big decision under time pressure. 😅

Ah, I just saw what you think in the PR description 😅. Yes I guess you are right that for 0.4.19 it is not a huge risk. But maybe since this is going into main, we make the dependency optional (as discussed above), and when backporting to 0.4.19 we tear out the compatibility code? Or vice versa, we can add non-numba option in a future PR, your call.
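For reference, the optional-dependency idea could look roughly like the sketch below: use a numba-compiled loop when numba imports, otherwise fall back to the NumPy expression timed above. The function names are hypothetical, and the mapping (values in `[0, num_colors)` kept as-is, everything else mapped to `value % num_colors + 1`) is inferred from that NumPy snippet rather than copied from the PR's code.

```python
import numpy as np

try:
    from numba import njit
    HAVE_NUMBA = True
except ImportError:  # numba is treated as optional in this sketch
    HAVE_NUMBA = False


def cast_labels_numpy(data, num_colors, dtype):
    """NumPy fallback mirroring the expression timed above."""
    out = np.empty(data.shape, dtype=dtype)
    np.mod(data, num_colors, out=out, casting='unsafe')
    mask = (data >= num_colors) | (data < 0)
    np.add(out, out.dtype.type(1), where=mask, out=out)
    return out


if HAVE_NUMBA:

    @njit
    def _fill(flat_in, flat_out, num_colors):
        # Single pass, no temporary arrays.
        for i in range(flat_in.size):
            v = flat_in[i]
            if v < 0 or v >= num_colors:
                flat_out[i] = v % num_colors + 1
            else:
                flat_out[i] = v


def cast_labels(data, num_colors, dtype):
    """Dispatch to numba when available, else to the NumPy fallback."""
    if HAVE_NUMBA:
        out = np.empty(data.shape, dtype=dtype)
        _fill(np.ascontiguousarray(data).ravel(), out.ravel(), num_colors)
        return out
    return cast_labels_numpy(data, num_colors, dtype)


data = np.array([[0, 1, 49], [50, 51, -1]], dtype=np.int64)
print(cast_labels(data, 50, np.uint8))
```

Both paths should give the same result; only speed and memory use differ, which is the trade-off being discussed here.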

I think that the best option will be to donate such code to some of our compiled dependencies (vispy, scikit?)

Here I see a double allocation:

where=(data >= 50) | (data < 0),

I would rather open an issue for this problem than add much slower, more memory-consuming code.
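As a side note on the double allocation: `(data >= 50) | (data < 0)` builds two temporary boolean arrays before combining them. One possible way to get the same mask from a single comparison, for signed integer data, is to reinterpret the data as unsigned so that negative values become very large. This is only a sketch of that trick (the helper name is made up), not something this PR does.

```python
import numpy as np


def out_of_range_mask(data, num_colors):
    """True where data < 0 or data >= num_colors, with one comparison.

    Assumes integer data; negative values wrap to large unsigned values.
    """
    unsigned = data.view(np.dtype(f'u{data.dtype.itemsize}'))  # no copy
    return unsigned >= num_colors


data = np.array([-5, 0, 49, 50, 1000], dtype=np.int64)
print(out_of_range_mask(data, 50))
print((data >= 50) | (data < 0))  # same mask, two temporaries
```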


jni · 2023-11-06T10:24:43Z

This PR disables caching used for speedup painting until someone starts painting. It is a significant speedup and reduces memory usage.

Nice catch!

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>

Carreau · 2023-11-08T10:35:51Z

restarting what looks like random failure.

Czaki · 2023-11-08T10:59:22Z

restarting what looks like random failure.

I have opened a PR to fix these random failures: #6423

brisvag · 2023-11-13T13:04:28Z

My impression from the previous talk (during the last fixing of labels rendering) is that we want to support setting different backgrounds from 0. Maybe we should just drop it. I just do not want to break this.

I remember this being added specifically cause someone asked for it and they had data with -1 as background... I think @kevinyamauchi was implementing this at the time maybe?

jni · 2023-11-13T13:48:26Z

Good thing @Czaki convinced me to keep the functionality then. 😅 Thanks for chiming in @brisvag! 🙏

jni · 2023-11-14T08:56:58Z

Gaaah, @Czaki while I was rewording the description, I thought of another issue with this PR: we should not use num_colors to do the mod operation, or labels that have the same color will always have the same color. The mod operation (a) should only occur when the label data dtype doesn't fit in uint8 in the first place, and (b) should be by a prime number that is greater than num_colors but still fits in the desired dtype, and that prime number should get changed when Shuffle is hit. 😕 And we should figure out a test that catches this issue in the future...

At any rate, though, in order to avoid complicating this PR further, I think that we should merge this, and fix it in a subsequent PR. We can raise an issue with the 0.4.19 milestone to make sure it gets handled...

Have I got the analysis right?
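A small sketch of the prime-modulus idea from the comment above, assuming a simple trial-division search. The helper names are hypothetical and nothing here is implemented in this PR; it only shows that which labels collide depends on the chosen prime, so switching primes on Shuffle would change the colliding pairs.

```python
import numpy as np


def is_prime(n):
    """Trial division; fine for the small values that fit in uint8/uint16."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime_above(n, dtype=np.uint8):
    """Smallest prime greater than n that still fits in dtype."""
    limit = np.iinfo(dtype).max
    for candidate in range(n + 1, limit + 1):
        if is_prime(candidate):
            return candidate
    raise ValueError(f'no prime above {n} fits in {np.dtype(dtype).name}')


num_colors = 50
p1 = next_prime_above(num_colors)  # first candidate modulus
p2 = next_prime_above(p1)          # e.g. what a Shuffle could switch to

# Labels 1 and 54 collide under p1 but not under p2.
print(p1, 1 % p1, 54 % p1)
print(p2, 1 % p2, 54 % p2)
```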

jni · 2023-11-14T09:12:11Z

Ok after a quick chat with @Czaki we have determined that this issue also exists on main. Therefore, I'm going to merge this and make an issue, which we can then fix. (It may even go back further, but I'm not certain about this, will need to test properly.)

jni · 2023-11-14T12:00:02Z

The issue is explained in detail in #6448.

closes #6397

This PR fixes fps performance issues in OpenGL introduced by #3308. In that PR, the texture type was changed to float32 in order to pass the label values directly to the texture. It turns out that OpenGL performance for float32 textures is much worse than for uint8 textures.

Here we change the code to use uint8 whenever the final number of colors is less than 255 in automatic coloring mode, or uint16 if the number is less than 65535.

This is achieved by transforming original data using a modulo-like operation that avoids the background label landing on 0.

This PR introduces a numba dependency, which might not be a long-term solution. We can revisit the decision if it causes issues (such as a delay in supporting newer Python versions), and perhaps move this utility to a dependency that already contains compiled code, such as scikit-image.

This PR also disables the caching used to speed up painting until someone starts painting. It is a significant speedup and reduces memory usage.

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: Matthias Bussonnier <bussonniermatthias@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

# Description

Changes from #6411 break thumbnail generation, because the slice stores the cast values rather than the original ones.

Before this PR:

![Screenshot from 2023-11-15 12-24-56](https://github.com/napari/napari/assets/3826210/1d5942a9-22dc-4f70-8be0-3c9a7c8e1b9f)

With this PR:

![Screenshot from 2023-11-15 12-23-32](https://github.com/napari/napari/assets/3826210/01c65497-6c8c-4bf0-8887-5c58db04052b)

Code:

```python
import napari
import numpy as np

data = np.asarray([[0, 1], [2, 3]])
viewer = napari.Viewer()
viewer.add_labels(data, opacity=1)
napari.run()
```

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>

…#6467)

# Description

This is a follow-up to #6411.

When labels data are int8, uint8, int16, or uint16, no copy is made: the data are sent directly as a uint8 or uint16 view to the GPU for rendering.

Higher bit depth integer data is first converted on the CPU to uint8, uint16, or float32 (depending on the number of colors in the colormap) using the formula `texture = (values - 1) % ncolors + 1`, but with the background value always mapping to 0.

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: Lorenzo Gaifas <brisvag@gmail.com>
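The "no copy" claim for 8- and 16-bit data can be illustrated with a plain NumPy view. This is a sketch with a hypothetical helper name, not the code from #6467; it only shows that reinterpreting signed data as unsigned of the same width shares memory with the original array.

```python
import numpy as np


def as_unsigned_view(data):
    """Reinterpret int8/uint8/int16/uint16 data as unsigned, without copying."""
    if data.dtype.kind not in 'iu' or data.dtype.itemsize > 2:
        raise TypeError('only 8- and 16-bit integer data is handled here')
    return data.view(np.dtype(f'u{data.dtype.itemsize}'))


labels = np.array([[-1, 0], [1, 127]], dtype=np.int8)
texture = as_unsigned_view(labels)
print(texture.dtype, np.shares_memory(labels, texture))
print(texture)  # -1 is reinterpreted as 255; other values are unchanged
```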

…rect color mode (#6439)

Closes #6518
Closes #6084

# Description

In this PR, similarly to #6411, instead of using `float32` to pass data to the GPU, we introduce heuristics for choosing smaller data types while keeping high performance.

Instead of a complex calculation of the color in the shader, a precomputed texture array is used. To avoid repeated texture calculation, the textures are cached in the `Colormap` objects.

For data of type uint8/int8/uint16/int16 we do not transform the data; we send them to the GPU as they are. This reduces computation time. Based on experiments, rendering performance is a little worse for uint16/int16 than for uint8/int8, but this may depend on the GPU. Also, uint16/int16 uses more GPU memory than the 8-bit types, though still less than current main.

For data types using at least 32 bits, we add a preprocessing step where we identify sets of labels that are mapped to the same color and map all of them to the same value. This often saves enough space to fall back to uint8/uint16, which allows a smaller additional array and less GPU memory. If there are more than `2**16` distinct colors, float32 is used, though performance will be reduced. We support only up to `2**23` distinct colors for now.

To reduce memory usage, some of the functions used for data preprocessing are compiled using numba. We provide a version of the function that does not require `numba`, but it limits the number of distinct colors to `2**16` and involves additional array creation (more memory usage).

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Lorenzo Gaifas <brisvag@gmail.com>
Co-authored-by: Andy Sweet <andrew.d.sweet@gmail.com>
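The preprocessing step for wide dtypes (labels sharing a color collapse to one value) might be sketched as below. This is a NumPy-only illustration under stated assumptions: the function name, the use of `np.unique`, and mapping unlisted labels to 0 are choices made for the example, not details confirmed by #6439.

```python
import numpy as np


def compress_labels_by_color(data, label_to_color):
    """Give every distinct color one small integer id and remap the data.

    Labels missing from label_to_color map to 0 (an assumption here).
    """
    color_ids = {}    # color tuple -> compact id, starting at 1
    label_to_id = {}  # original label -> compact id
    for label, color in label_to_color.items():
        key = tuple(color)
        label_to_id[label] = color_ids.setdefault(key, len(color_ids) + 1)

    n = len(color_ids)
    dtype = np.uint8 if n < 2**8 else np.uint16 if n < 2**16 else np.float32

    labels, inverse = np.unique(data, return_inverse=True)
    lookup = np.array([label_to_id.get(int(v), 0) for v in labels], dtype=dtype)
    return lookup[inverse.reshape(data.shape)], n


data = np.array([[1, 2], [3, 7]], dtype=np.int64)
colors = {1: (1, 0, 0, 1), 2: (0, 0, 1, 1), 3: (1, 0, 0, 1)}
compact, n_colors = compress_labels_by_color(data, colors)
print(compact, compact.dtype, n_colors)
```

Here labels 1 and 3 share a color, so int64 data with three labeled values compresses to two ids in a uint8 array.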

…#6652)

# Description

In #6411, we missed these lines:

https://github.com/napari/napari/blob/4f4c063ae5dd79d6d188e201d44b8d57eba71909/napari/_vispy/layers/image.py#L25-L32

These were added in #3990 because 'auto' has a slightly different meaning in VisPy than *really* fully auto: it chooses the default texture dtype for the input NumPy array's dtype. For float arrays, this is float. However, if your OpenGL implementation doesn't support float textures, VisPy will raise. Instead, passing `texture_format=None` when creating the ImageNode tells VisPy to transform the data to whatever texture format it sees fit.

In #6411, when getting the VisPy node for a given dtype, we only check for `texture_format != 'auto'` as the "catch-all" texture format. But, in fact, if we are on a machine that doesn't support float32 textures, by this point in the code the format has been changed to None, incorrectly triggering these lines:

https://github.com/napari/napari/blob/89f8194d3fa4eef620755804806ac69ef684df63/napari/_vispy/layers/image.py#L59-L73

and causing a ValueError.

This PR fixes that by also checking for None in that same clause.

This PR also adds a test by monkeypatching the function that checks for float32 texture support.

# References

#3988
#3990

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
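The shape of that fix can be shown with a tiny predicate. The function below is hypothetical (it is not the napari or VisPy code); it only illustrates treating both `'auto'` and `None` as the catch-all texture format instead of `'auto'` alone.

```python
def is_catch_all_texture_format(texture_format):
    """Return True for both spellings of "let VisPy decide".

    Hypothetical helper: before the fix described above, only 'auto'
    was treated as the catch-all, so None fell through to the
    dtype-specific branch.
    """
    return texture_format is None or texture_format == 'auto'


for fmt in ('auto', None, 'r8'):
    print(repr(fmt), is_catch_all_texture_format(fmt))
```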

Czaki added 5 commits November 3, 2023 12:34

smarter create empty data

1060846

extrct contours to separate function

e3bdb83

use cache only if needed

db9e7b0

save texture format for future use

232aacd

minimum working implementation

c1e273e

Czaki mentioned this pull request Nov 4, 2023

Labels layer is lagging in 3d for big data #6397

Closed

Czaki added 2 commits November 5, 2023 15:31

fix test_labels

1eb6c2f

fix test

3156dd2

github-actions bot added the tests (Something related to our tests), qt (Relates to qt), and labels labels Nov 5, 2023

Czaki added 5 commits November 5, 2023 19:19

add matplotlib contarints

70a6df8

lower numba

30dfbda

move function to proper place

7f089fc

next move files

1d50df4

remove redundant +1/-1

cfb484b

Czaki commented Nov 5, 2023

Czaki marked this pull request as ready for review November 5, 2023 22:56

jni reviewed Nov 6, 2023

Czaki added 3 commits November 6, 2023 11:51

cleanup opengl redundant code

cd201fa

fix typo

70a0204

add explanation

728d68c

Czaki mentioned this pull request Nov 8, 2023

Fix test_singlescreen_window_settings to allow it to run locally on multiscreen configuration #6422

Merged

Update napari/_vispy/layers/image.py

69252ec

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>

Czaki added 2 commits November 8, 2023 12:32

add test

fa88c47

workaround for coverage jit function

227b9fc

Czaki closed this Nov 13, 2023

Czaki reopened this Nov 13, 2023

jni approved these changes Nov 13, 2023

jni added the "ready to merge" label (Last chance for comments! Will be merged in ~24h) Nov 13, 2023

alessandrofelder mentioned this pull request Nov 13, 2023

[Feature] Use colours provided by bg-atlasapi to render annotation labels brainglobe/brainrender-napari#114

Open

jni merged commit a6e1569 into napari:main Nov 14, 2023
41 of 49 checks passed

jni deleted the bugfix/draw_performance branch November 14, 2023 09:13

jni removed the "ready to merge" label (Last chance for comments! Will be merged in ~24h) Nov 14, 2023

Czaki mentioned this pull request Nov 15, 2023

Fix thumbnail for auto color mode in labels #6459

Merged

jni mentioned this pull request Nov 15, 2023

Fix casting uint32 to vispy dtype for image layers #6456

Merged

Czaki mentioned this pull request Nov 17, 2023

Use views rather than CPU-based hashing for 8- and 16-bit Labels data #6467

Merged

psobolewskiPhD mentioned this pull request Dec 6, 2023

[Labels] IndexError after painting into 3D labels layer and toggling 3D #6518

Closed

Czaki mentioned this pull request Dec 13, 2023

Canvas goes black for very small or large numbers in image #6533

Closed

Czaki mentioned this pull request Feb 9, 2024

Allow both 'auto' and None as texture_format when choosing VisPy node #6652

Merged
