Efficient labels mapping for drawing in Labels (60 FPS even with 8000x8000 images) #5732
@alisterburt Actually, there is a small problem. It failed some tests for the Label layer. It turned out that this optimization cannot be easily applied when |
ooh, I just saw you brought it back, could you explain how this is working? |
I found out that there is no need to disable it when |
Codecov Report
@@ Coverage Diff @@
## main #5732 +/- ##
==========================================
+ Coverage 89.89% 89.91% +0.01%
==========================================
Files 614 615 +1
Lines 52283 52474 +191
==========================================
+ Hits 47002 47180 +178
- Misses 5281 5294 +13
@alisterburt Labels and milestone (are you sure for 0.4)? |
Couple of comments here,
- #5723 (Fix inefficient label mapping in direct color mode (10-20x speedup)) seems to be reasonably independent of this; could we merge that first? It will make the changes more granular and easier to review. #5723 was already ready to merge, so I kinda wanna get that in while we review this. Does that work @ksofiyuk? They look to me like they sort of touch different parts of the file. But,
- Another reason why I want to have the two reviews separately is that I think there should be a bit more of a change in the middle to make it clearer what's going on. For example, I don't like reusing variable names (`raw_modified = raw_modified[changed_mask]`), as it makes it harder to reason about the code (is this image shaped or a linear array of changed values?). I also don't like that now, in most cases, `image =` will actually not be an image, but a linear set of values. So I'd like a bit of an update to the logic and variable names to better reflect what's going on.
- Finally, while #5723 has no bearing on #3308 (Use a shader for low discrepancy label conversion), this does have a bearing on it. It looks like caching will be beneficial to #3308 also, except we are caching `.astype(np.float32)` instead of the complicated legacy logic here. So having a separate PR will make it easier to port those relevant changes to #3308.
btw, I'm super curious: I thought that this would actually track changed pixels from painting and so on, in which case the speedup would be obvious. But no, it actually checks for changes by doing a full `array != value`, which is not immediately obviously faster than doing `array.astype()`. But your benchmarks (and my own experimentation with this PR 🙏) suggest that it definitely is faster! Do I understand correctly that that's what you're doing here?
I just saw that indeed this simply has all of the original commits from #5723, so it will be very straightforward to rebase/merge main after that merges. I'm gonna go ahead and do that.
Oh! One more Q: could you please test this solution with zarr, tensorstore, or dask arrays as labels? I'm concerned about some of these operations (boolean indexing) in those contexts...
@ksofiyuk are you comfortable with cherry-picking (should be easier than merging here)? If not I can do it for you and force push.
Really nice work. Thank you @ksofiyuk !
…x8000 images) (#5732)

# Description

This PR introduces two independent optimizations for labels rendering during drawing: local updates and caching optimizations.

These optimizations virtually solve any performance issues related to drawing in the Labels layer. With them, it is possible to achieve 60 FPS when drawing even in 8000x8000 images.

### Local updates

The initial implementation updates the whole labels map and copies it to the shader on each brush update, which becomes a major bottleneck when you work with high-resolution images no matter how effective the labels mapping implementation is.

In this PR, only partial updates are sent to the VisPy labels layer. Each time `Labels.data_setitem` is called, it tracks the bounds of the modified region and, instead of calling `Labels.refresh()`, which triggers an update of the whole labels image, it calls `Labels._partial_labels_refresh`. That method emits the `Labels.events.labels_update` event, which carries the slice localizing the modified region and is handled by `VispyLabelsLayer._on_partial_labels_update`. VisPy textures can be partially updated, which is what the `_on_partial_labels_update` method relies on.

### Caching optimization

The idea is to recompute the color mapping only for the elements of a label map that have changed since the previous update. Even in the parts of the code where the color mapping is implemented quite efficiently, it can give up to a 5x speedup by avoiding slow `np.float32` recomputations for most pixels.

In practice, when you use the brush, less than 1% of pixels are updated at each iteration (on large images the percentage is even smaller). As a result, this optimization should work and give a significant boost in 99%+ of typical use case scenarios.

## Benchmark results

I added a new benchmark (benchmark_labels_layer.LabelsDrawing2DSuite) that measures the timings of brush drawing in different modes (auto/direct, contour == 0/1) with different brush sizes. It simulates the brush drawing from the position (0, 0) to (n - 1, n - 1) with 30 refresh updates along the way.

This PR:

```
====== ============ ============ ============ ============ ============
--                               color_mode / contour
------------------- ---------------------------------------------------
  n     brush_size    auto / 0     auto / 1    direct / 0   direct / 1
====== ============ ============ ============ ============ ============
 512        8        28.2±0.3ms    33.6±1ms     29.2±1ms    32.9±0.7ms
 512        64       22.8±0.1ms   28.1±0.2ms   22.9±0.1ms   29.1±0.3ms
 512       256       128±0.9ms    153±0.7ms    128±0.7ms     153±1ms
 3072       8        147±0.6ms    165±0.7ms     147±3ms     166±0.4ms
 3072       64       131±0.9ms     151±2ms     132±0.8ms     150±1ms
 3072      256        450±2ms      502±2ms     452±0.7ms     501±2ms
====== ============ ============ ============ ============ ============
```

main 7b3f7ec:

```
====== ============ ============ ============ ============ ============
--                               color_mode / contour
------------------- ---------------------------------------------------
  n     brush_size    auto / 0     auto / 1    direct / 0   direct / 1
====== ============ ============ ============ ============ ============
 512        8        86.6±0.7ms    242±2ms      100±1ms      256±1ms
 512        64        75.6±1ms     230±1ms      90.6±1ms     241±1ms
 512       256       163±0.9ms     316±2ms      180±1ms      328±1ms
 3072       8        1.20±0.01s   6.33±0.02s   1.80±0.02s   6.90±0.02s
 3072       64       1.15±0.01s    6.30±0s      1.77±0s      6.90±0s
 3072      256       1.30±0.01s   6.40±0.01s    1.92±0s     7.04±0.02s
====== ============ ============ ============ ============ ============
```

The difference is most noticeable when the heavy labels mapping code is used (contour=1); with small brush sizes it gives an almost 50x performance boost.

Please keep in mind that this benchmark only measures the time it takes to update `layer.data` plus the time it takes to convert labels to colors before sending them to VisPy. It does not take into account the time it takes to transfer data from napari to OpenGL or the time it takes to process labels in any type of OpenGL shader. The local updates also give a substantial speedup in the part that is not measured.

## Type of change

- [X] Optimization (non-breaking change which speeds up existing code)

# How has this been tested?

- [X] all tests pass with my change

## Final checklist:

- [X] My PR is the minimum possible work for the desired functionality
- [X] I have commented my code, particularly in hard-to-understand areas
- [X] I have made corresponding changes to the documentation
- [X] I have added tests that prove my fix is effective or that my feature works
- [ ] If I included new strings, I have used `trans.` to make them localizable. For more information see our [translations guide](https://napari.org/developers/translations.html).

---------

Co-authored-by: alisterburt <alisterburt@gmail.com>
Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/announcement-napari-0-4-18-released/83322/1 |
Fixes #6079

It turns out that the caching behaviour introduced in #5732 depends on the slice data being updated by painting. This works out for NumPy arrays because the slice data is a view of the original data, so updating the original (as painting does) updates the slice. However, when the data is a zarr or tensorstore array, the slice is a NumPy copy of the original data, so the caching mechanism believes that nothing has changed and the display is not updated.

This adds tests for the behaviour and fixes it by painting directly into the slice data if the data array is not a NumPy array. It's a bit of a bandaid fix but it works and is [endorsed](#6079 (comment)) by our slicing expert @andy-sweet. 😂

(I've also made a couple of drive-by updates to the code because some methods are no longer used in the code after #5732 but that was missed at the time.)

## Type of change

- [x] Bug-fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Grzegorz Bokota <bokota+github@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
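The view-versus-copy distinction behind this bug can be reproduced with plain NumPy. The sketch below is only an illustration: `CopyOnSliceArray` is a hypothetical stand-in for an array type whose slicing materializes a fresh NumPy array, and it is not how zarr or tensorstore are implemented.

```python
import numpy as np

# In-memory NumPy labels: basic slicing returns a view onto the same buffer.
labels = np.zeros((4, 4), dtype=np.uint8)
slice_view = labels[1:3, :]
labels[1, 1] = 5                           # "paint" into the original data
view_sees_paint = bool(slice_view[0, 1] == 5)
shares_memory = bool(np.shares_memory(labels, slice_view))


class CopyOnSliceArray:
    """Hypothetical array whose __getitem__ always returns a NumPy copy."""

    def __init__(self, data):
        self._data = np.array(data)

    def __getitem__(self, key):
        return np.array(self._data[key])   # materialize a new array

    def __setitem__(self, key, value):
        self._data[key] = value


remote = CopyOnSliceArray(np.zeros((4, 4), dtype=np.uint8))
slice_copy = remote[1:3, :]
remote[1, 1] = 5                           # paint into the backing store only
copy_is_stale = bool(slice_copy[0, 1] == 0)

# The workaround described above: also paint into the sliced copy.
slice_copy[0, 1] = 5
copy_fixed = bool(slice_copy[0, 1] == 5)
```

With the copy-on-slice array, a cache that compares against `slice_copy` sees no difference after painting, which matches the symptom described in the commit message.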
# References and relevant issues

closes #6579
supersedes #6583

# Description

#5732 introduced a cache of mapped data so that only changed indices were mapped to texture dtypes/values and sent on to the GPU. In this PR, an alternate strategy is introduced: rather than caching previously-transformed data and then doing a diff with the cache, we paint the data *and* the texture-mapped data directly. The partial update of the on-GPU texture also introduced in #5732 is maintained, as it can dramatically reduce the amount of data needing to be transferred from CPU to GPU memory.

This PR is built on top of #6602.

---------

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
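As a rough sketch of the strategy described above, painting can write the new label into the data and its mapped value into a texture-side array in the same step, then report the bounding slices of the touched region for a partial upload. The names `paint_both` and `to_texture` are hypothetical, and the `float32` cast is only a placeholder for whatever mapping the renderer actually needs; none of this is napari's API.

```python
import numpy as np


def to_texture(values):
    # Placeholder mapping (assumption): cast labels to float32.
    return np.asarray(values).astype(np.float32)


def paint_both(data, texture, indices, label):
    """Write `label` into `data` and its mapped value into `texture`.

    `indices` is a tuple of integer index arrays, one per axis (as returned
    by np.nonzero). Returns the bounding slices of the painted region, or
    None if nothing was painted.
    """
    if indices[0].size == 0:
        return None
    data[indices] = label
    texture[indices] = to_texture(np.asarray(label, dtype=data.dtype))
    # Smallest axis-aligned box containing every painted pixel.
    return tuple(slice(int(ix.min()), int(ix.max()) + 1) for ix in indices)


data = np.zeros((6, 6), dtype=np.uint8)
texture = to_texture(data)
brush = (np.array([1, 2, 2]), np.array([3, 3, 4]))

bounds = paint_both(data, texture, brush, 7)
patch = texture[bounds]   # only this region would need to be sent to the GPU
```

No diff against a cache is needed here, because the painted indices are known at the moment of painting.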
Description
This PR introduces two independent optimizations for labels rendering during drawing: local updates and caching optimizations.
These optimizations virtually solve any performance issues related to drawing in the Labels layer. With them, it is possible to achieve 60 FPS when drawing even in 8000x8000 images.
Local updates
The initial implementation updates the whole labels map and copies it to the shader on each brush update, which becomes a major bottleneck when you work with high-resolution images no matter how effective the labels mapping implementation is.
In this PR, only partial updates are sent to the VisPy labels layer. Each time `Labels.data_setitem` is called, it tracks the bounds of the modified region and, instead of calling `Labels.refresh()`, which triggers an update of the whole labels image, it calls `Labels._partial_labels_refresh`. That method emits the `Labels.events.labels_update` event, which carries the slice localizing the modified region and is handled by `VispyLabelsLayer._on_partial_labels_update`. VisPy textures can be partially updated, which is what the `_on_partial_labels_update` method relies on.
Caching optimization
The idea is to recompute the color mapping only for the elements of a label map that have changed since the previous update. Even in the parts of the code where the color mapping is implemented quite efficiently, it can give up to a 5x speedup by avoiding slow `np.float32` recomputations for most pixels.
In practice, when you use the brush, less than 1% of pixels are updated at each iteration (on large images the percentage is even smaller). As a result, this optimization should work and give a significant boost in 99%+ of typical use case scenarios.
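The caching idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code in this PR: `map_labels_cached` and `to_texture` are hypothetical names, and the `float32` scaling stands in for the real label-to-color mapping.

```python
import numpy as np


def to_texture(values):
    # Placeholder for the expensive mapping step (assumption).
    return values.astype(np.float32) / 255


def map_labels_cached(labels, cache_labels, cache_mapped):
    """Remap only the pixels whose label differs from the cached label.

    Both caches are updated in place. Returns the number of remapped pixels.
    """
    changed = labels != cache_labels      # full comparison, but no remapping
    n_changed = int(changed.sum())
    if n_changed:
        new_values = labels[changed]      # 1-D array of changed labels
        cache_mapped[changed] = to_texture(new_values)
        cache_labels[changed] = new_values
    return n_changed


labels = np.zeros((8, 8), dtype=np.uint8)
cache_labels = labels.copy()
cache_mapped = to_texture(labels)

labels[2:4, 2:4] = 3                      # a small brush stroke
n_remapped = map_labels_cached(labels, cache_labels, cache_mapped)
```

Only the four painted pixels go through `to_texture`; the rest of the mapped image is reused from the cache.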
Benchmark results
I added a new benchmark (benchmark_labels_layer.LabelsDrawing2DSuite) that measures the timings of brush drawing in different modes (auto/direct, contour == 0/1) with different brush sizes. It simulates the brush drawing from the position (0, 0) to (n - 1, n - 1) with 30 refresh updates along the way.
This PR:
main 7b3f7ec:
The difference is most noticeable when the heavy labels mapping code is used (contour=1); with small brush sizes it gives an almost 50x performance boost.
Please keep in mind that this benchmark only measures the time it takes to update `layer.data` plus the time it takes to convert labels to colors before sending them to VisPy. It does not take into account the time it takes to transfer data from napari to OpenGL or the time it takes to process labels in any type of OpenGL shader. The local updates also give a substantial speedup in the part that is not measured.
Type of change
How has this been tested?
Final checklist:
If I included new strings, I have used `trans.` to make them localizable. For more information see our translations guide.