Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new async ui settings, remove old async/octree usage #5711

Closed
wants to merge 11 commits into from

Conversation

kcpevey
Copy link
Contributor

@kcpevey kcpevey commented Apr 11, 2023

Fixes/Closes

This PR is a major step in the async slicing implementation #4795

Description

A new approach to async slicing was introduced #5377 . This PR began as an effort to expose the new async slicing via the settings. This brought up the question - what should we do with the old async and octree settings? A poll was opened up to the community and it was unanimously decided that we should move forward with removal of all the old settings. That was done as part of this PR in order to reuse the existing environment variable NAPARI_ASYNC to smooth the transition with existing users (also, naming is hard). For all of that to work, we needed to go ahead and remove the usage of the old methods. We have left the actual source files in the repo for now, though we have deprecated a few properties.

A followup PR will be necessary to remove the actual source files that are no longer being used (and the properties deprecated here). I originally did all of that in branch and it touched 69 files. I'm hoping that this (still huge) might be easier to pass review, then followup with removal of the actual source. That's open for discussion.

I did go ahead and remove the tests for the source files. This was done because of the way the test suite is run. I'd have to add some extra marks or skips to prevent them from running and it just seemed like the best option was to remove them.

Also regarding tests: the old async required a restart, but the new does not. For this reason, the test suite becomes greatly simplified and I was able to remove the async/sync marks and the complex skipping scheme. As such, a separate GH Action run is no longer necessary.

Async slicing can now be turned on via environment variable NAPARI_ASYNC=1 or via the Settings -> Experimental UI. It does not require a restart.

Ultimately, this PR shows a lot of simplification and cleanup that we're able to achieve with the new async slicing method.

This has been a joint effort with @andy-sweet :)

References

Type of change

  • Bug-fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How has this been tested?

  • example: all tests pass with my change

Final checklist:

  • My PR is the minimum possible work for the desired functionality
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • If I included new strings, I have used trans. to make them localizable.
    For more information see our translations guide.

@github-actions github-actions bot added preferences Issues relating to the creation of new preference fields/panels qt Relates to qt task tests Something related to our tests labels Apr 11, 2023
@codecov
Copy link

codecov bot commented Apr 11, 2023

Codecov Report

Merging #5711 (084e653) into main (1da0350) will decrease coverage by 1.82%.
The diff coverage is 96.83%.

❗ Current head 084e653 differs from pull request most recent head a966bc7. Consider uploading reports for the commit a966bc7 to get more accurate results

@@            Coverage Diff             @@
##             main    #5711      +/-   ##
==========================================
- Coverage   89.86%   88.04%   -1.82%     
==========================================
  Files         610      602       -8     
  Lines       52121    51774     -347     
==========================================
- Hits        46837    45586    -1251     
- Misses       5284     6188     +904     
Impacted Files Coverage Δ
napari/_qt/_qapp_model/qactions/_view.py 90.90% <ø> (-1.40%) ⬇️
napari/_qt/_tests/test_qt_notifications.py 96.00% <ø> (-0.03%) ⬇️
napari/_qt/dialogs/preferences_dialog.py 96.29% <ø> (+5.77%) ⬆️
.../_qt/layer_controls/qt_layer_controls_container.py 100.00% <ø> (ø)
napari/_qt/qt_event_loop.py 79.86% <ø> (-1.10%) ⬇️
napari/_tests/test_magicgui.py 96.63% <ø> (-0.02%) ⬇️
napari/_tests/test_viewer.py 98.48% <ø> (-0.01%) ⬇️
napari/_vispy/utils/visual.py 96.07% <ø> (+8.14%) ⬆️
napari/conftest.py 86.72% <ø> (-0.38%) ⬇️
napari/layers/_tests/test_dask_layers.py 100.00% <ø> (ø)
... and 15 more

... and 58 files with indirect coverage changes

@andy-sweet
Copy link
Member

Also note that we omitted simplifying the various image slice classes (e.g. ImageSlice, ImageSliceData, ImageLoader) in this PR to avoid too many changes all in one go. But are planning to remove/simplify those in a follow-up PR.

@jni
Copy link
Member

jni commented Apr 12, 2023

Thank you @kcpevey and @andy-sweet! And thanks in particular @kcpevey for the detailed description, it's 🥇. Andy walked me through the PR at the architecture WG meeting. We noted two points:

  • rather than deprecate layer.loaded, let's hook it up to the new async events so that when slicing starts it gets set to false and when slicing is done it gets set to true.
  • keeping or discarding the old code is optional. I was at some point vaguely opposed to discarding it because some of it will be useful for the next tiled rendering implementation, and in particular the TiledImageVisual is something that VisPy asked us to upstream at the time — but it will still be in the code history and recoverable by anyone who needs to recover it e.g. by checking out 0.4.17. Having said this, your argument @kcpevey about keeping the PR small and easy to review is good, so I suggest we wait till the next PR to remove it. I'd be very supportive of a PR to VisPy to get the tiled visual in. 🙏

@kcpevey
Copy link
Contributor Author

kcpevey commented Apr 12, 2023

rather than deprecate layer.loaded, let's hook it up to the new async events so that when slicing starts it gets set to false and when slicing is done it gets set to true.

@jni My initial reaction to this was "why? It's not being used." Then I thought through how it could be used (e.g. programmatic screenshots) and I think you're right. Thats a useful feature and a good idea. I'll work on hooking it up.

@andy-sweet
Copy link
Member

andy-sweet commented Apr 12, 2023

rather than deprecate layer.loaded, let's hook it up to the new async events so that when slicing starts it gets set to false and when slicing is done it gets set to true.

@jni My initial reaction to this was "why? It's not being used." Then I thought through how it could be used (e.g. programmatic screenshots) and I think you're right. Thats a useful feature and a good idea. I'll work on hooking it up.

I thought a bit more about Layer.loaded.

As one of the goals of this work is to remove slice state from the layer, Layer.loaded should eventually be deprecated because that is definitely slice state.

I think that Layer.loaded is only useful for checking if some slicing is happening on the layer, so I wonder if something like Viewer.is_slicing or Viewer.wait_for_slicing() would be a better long term home for the desired usage.

But I'm OK with implementing a workaround to support Layer.loaded and not deprecate it in this PR. The implementation may be a little clumsy since the new async is not really built to work through side-effects on layer slice state, but I think it's still doable.

@andy-sweet
Copy link
Member

andy-sweet commented Apr 12, 2023

  • keeping or discarding the old code is optional.

Also optional for me. From the architecture working group meeting, I have a mild preference for removing the code and tests in this one PR, so that it's easier to reference (e.g. I can say that the octree code was removed in #5711 instead of #5711 and <issue_number>).

  • your argument @kcpevey about keeping the PR small and easy to review is good, so I suggest we wait till the next PR to remove it.

I'm think that removing the octree code will just mean more deleted files, which doesn't really make the PR harder to review. But if that's not the case, then waiting for the next one makes sense.

@kcpevey
Copy link
Contributor Author

kcpevey commented Apr 14, 2023

As one of the goals of this work is to remove slice state from the layer, Layer.loaded should eventually be deprecated because that is definitely slice state.

I think that Layer.loaded is only useful for checking if some slicing is happening on the layer, so I wonder if something like Viewer.is_slicing or Viewer.wait_for_slicing() would be a better long term home for the desired usage.

But I'm OK with implementing a workaround to support Layer.loaded and not deprecate it in this PR. The implementation may be a little clumsy since the new async is not really built to work through side-effects on layer slice state, but I think it's still doable.

That is true, if our goal is to remove state on the layer, it doesn't make sense to leave this in (and write new code to make it work with the new async). As you said, if we left it in, the implementation will be a little clumsy. I definitely felt this as I tried to implement it. I struggled a good bit with where and how.

I also started working through implementing this at the viewer level instead. This felt much cleaner and was easier to get working. I added a busy property to the layer_slicer itself, then exposed it publicly on the viewer for easy access. Initial tests look good. @jni what do you think about this approach?

@kcpevey
Copy link
Contributor Author

kcpevey commented Apr 14, 2023

@andy-sweet @jni I pushed both options up for comparison. I also tried adding a test for Points but I'm still working on it.

@andy-sweet
Copy link
Member

For anyone interested, I'm working off this branch to remove some of the other image slice classes introduced by the old approach to async slicing: kcpevey#30

We could merge these two efforts, but I wanted to keep the PR size/complexity down initially.

@jni
Copy link
Member

jni commented Apr 15, 2023

Hey folks,

Thanks @kcpevey for implementing both approaches and honestly I think we should keep both for now!

I also started working through implementing this at the viewer level instead. This felt much cleaner and was easier to get working. I added a busy property to the layer_slicer itself, then exposed it publicly on the viewer for easy access. Initial tests look good.

I think the issue is that in the future, layers will be loaded at different times — I think this is a really critical use case: for example, you might have an image layer (loads quickly), and a "result of complicated convnet on that image in the current slice" layer (loads very slowly). Users might be interested in querying only some of those layers or doing processing on each layer as soon as it's ready. So it's not enough to know whether the layer slicer is busy, we need to know what it's busy with.

Maybe someday we move away from layer.loaded and we instead use layer_slicer.loaded[layer] or something like that, but in the meantime I think it's good to have the semantic meaning of this handled somewhere.

And the code right now is not too onerous! 😊 So I'm happy where this is tbh! modulo tests passing 😅

@kcpevey
Copy link
Contributor Author

kcpevey commented Apr 19, 2023

it's not enough to know whether the layer slicer is busy, we need to know what it's busy with.

Totally agree with you here. And something like you suggested would likely be right choice. We do keep a dictionary of {tuple(layers): task}. We could either check that dictionary (which would require deconstructing it) or we could keep a second dict. That said, if you're happy with how its currently implemented, then I'm happy too :)

Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
assert not layer.loaded

wait_until_vispy_image_data_equal(
qtbot, vispy_layer, np.flip(data[:, 1:][-3]).reshape(1, 3)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is obnoxious, but I have no idea the proper way to go from the original to the expected. Can someone help?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can access the vispy marker's 3D positions with

positions = vispy_layer.node._subvisuals[0]._data['a_position']

Since this example is 3D with no actual slicing (i.e. 3D to 2D), then you just need to flip the coordinates with positions[:, ::-1] or np.fliplr. If you were doing 2D slicing of 3D data you'd also need to ignore the leading the dimension with something like positions[:, -ndisplay::-1].

@kcpevey kcpevey mentioned this pull request Apr 21, 2023
12 tasks
@@ -1721,6 +1729,8 @@ def _set_view_slice(self):

def _make_slice_request(self, dims) -> _PointSliceRequest:
"""Make a Points slice request based on the given dims and these data."""
# indicate the layer is currently mid-load
self._loaded = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach means there can be times when loaded is True when it shouldn't be. For example:

points = Points(...)
points.loaded  # => True, as expected
points._make_slice_request()  # 1st slice
points.loaded  # => False, as expected
points._make_slice_request()  # 2nd slice
points.loaded  # => False, as expected
points._update_slice_response(...)  # 1st slice
points.loaded  # => True, but should be False because the 2nd slice is not loaded
points._update_slice_response(...)  # 2nd slice
points.loaded  # => True, as expected

In order to always return the correct value, I think you need to know if there are other pending/active slice tasks for this layer. You could do that by storing an ID per request and checking that in _update_slice_response.

But I think a better solution is to update this state from _LayerSlicer which already has a way to check that state. That also has the benefit of making the state change more explicit rather than as a side-effect of make_slice_request where the intent was to making that a constant method (i.e. no state changes). And also doesn't add more slice state to the layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely correct. I lost sight of the multiple requests part of this.

In order to always return the correct value, I think you need to know if there are other pending/active slice tasks for this layer. You could do that by storing an ID per request and checking that in _update_slice_response.

Not sure if this can be done because the layer itself has no access to the slice tasks. Well I guess you could do it if you store the tasks on the layer itself, but that just adds more state.

But I think a better solution is to update this state from _LayerSlicer

Exactly. My question is: how do we make this accessible from the layer? We already have a method LayerSlicer._find_existing_task([Layer,]) which can discover tasks for any layer, but we don't have anything that can reach out from the layer itself. I'm wondering if that's even necessary to have?


layer = vispy_layer.layer
assert not viewer.slicing_in_progress
with layer.lock:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick look, I don't think this lock is doing anything because it only seems to be held here. Did I miss something?

For images, I think you could reuse LockableData from test_layer_slicer and then hold that so that __getitem__ is blocked. Points are a bit harder since they coerce the input data to a numpy array, but you can set Points._data directly as in test_submit_with_one_3d_points

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you are right, I'll dig into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something very fishy going on with the data handling in async points. I haven't sorted it out yet, I need to understand more about dimensions (data indices vs world dims vs slice indices... )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a bug, @andy-sweet opened a PR: #5783

@kcpevey
Copy link
Contributor Author

kcpevey commented Apr 25, 2023

Note: napari/_vispy/experimental/_tests/test_vispy_tiled_image.py was recently modified on main. I imagine this means that someone is using and continuing to develop a file that we are planning to remove @andy-sweet

@andy-sweet
Copy link
Member

Note: napari/_vispy/experimental/_tests/test_vispy_tiled_image.py was recently modified on main. I imagine this means that someone is using and continuing to develop a file that we are planning to remove @andy-sweet

I think that was in #5432, which is a refactor not directly concerned with that code specifically. But as it's code that sits in the repo, it does need to be maintained as the recommended API of napari changes. More reason to remove it in my opinion ;)

@andy-sweet
Copy link
Member

Closing in favor of #5816

@andy-sweet andy-sweet closed this May 9, 2023
brisvag pushed a commit that referenced this pull request May 15, 2023
…5783)

# Description

This PR adds some async slicing integration tests for points, one of
which was failing due to unintended rounding. It also adds some debug
logs across sync and async slicing paths to make it easier to debug
issues. Finally, it changes the main log message format to include the
thread name, which is incredibly helpful when debugging concurrency
issues.

# References

This was split off from #5711 to avoid bloating that already large PR.
Credit to @kcpevey, who found the the bug and wrote most of #5711. Also
this part of the larger #4795

## Type of change
- [x] Bug-fix (non-breaking change which fixes an issue)

# How has this been tested?
- [x] added new tests to cover issues
- [x] all existing tests pass with my change

I manually tested behavior and logging with async off and on.
jni added a commit that referenced this pull request Jun 22, 2023
# Fixes/Closes

Closes #5517
Closes #2790

# Description

This replaces [the old approach to asynchronous
loading](https://napari.org/0.4.16/guides/rendering.html) with [the new
approach described in
NAP-4](https://napari.org/stable/naps/4-async-slicing.html).

See the old approach (on `main`) on some remote 3D multi-resolution
images from Janelia via [@chili-chiu's fantastic bio sample data
plugin](https://www.napari-hub.org/plugins/napari-bio-sample-data):

https://github.com/napari/napari/assets/2608297/e8673d4e-c387-49bc-9254-30d0fc9c1cd2

And the new approach on the same data that is more responsive:

https://github.com/napari/napari/assets/2608297/0b9f1f12-d5e0-4980-be05-c5c58285c1a2

This PR also removes the old approach to tiled rendering (AKA octree)
since that is tied to the old approach to asynchronous loading. Most of
the code change delta consists of deleted files that are experimental
modules and tests associated with those old approaches.

The other changes remove the old integration and adjust the new one to
effectively achieve replacement. The key changes are as follows.

1. Remove async/sync_only marks from pytest since these should only be
needed for the old approach.
2. Remove command that hides/shows whether octree chunks are visible.
I'm unsure if these commands are publicly exposed anywhere, though this
one currently appears to be broken.
3. Redefine the existing experimental setting previously used to enable
the old approach to enable the new one instead.
4. Add support for `Layer.loaded` for the new approach by adding an ID
to each slice request. This could probably be moved to a separate PR if
desired (e.g. see andy-sweet#51)
5. Absorb slicing logic spread across old classes into the slice request
call.

This PR currently includes #5783, which should be reviewed and merged
first to make this large PR as small as possible. For the difference
between this and #5783, see andy-sweet#50

# References

I think this improves the simplicity of slicing logic as described in
#2156
Also see #5711 for the first version of this PR from @kcpevey, which was
split into this and #5783
Part of #4795

## Type of change

There probably need to be some documentation updates related to this
since the old approach is well documented on napari.org. We should
probably write some developer docs based off the async slicing NAP and
mark the old docs as deprecated and/or remove them to avoid confusion.

# How has this been tested?

- [x] added an integration test for checking `Layer.loaded` is updated
correctly
- [x] all existing tests pass with my change

I also did some manual testing by running `NAPARI_ASYNC=1 napari` and
interacting with some remote and non-remote data.

---------

Co-authored-by: Kim Pevey <kcpevey@gmail.com>
Co-authored-by: Kim Pevey <kcpevey@quansight.com>
Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Bussonnier <bussonniermatthias@gmail.com>
melissawm pushed a commit to melissawm/napari that referenced this pull request Jul 3, 2023
# Fixes/Closes

Closes napari#5517
Closes napari#2790

# Description

This replaces [the old approach to asynchronous
loading](https://napari.org/0.4.16/guides/rendering.html) with [the new
approach described in
NAP-4](https://napari.org/stable/naps/4-async-slicing.html).

See the old approach (on `main`) on some remote 3D multi-resolution
images from Janelia via [@chili-chiu's fantastic bio sample data
plugin](https://www.napari-hub.org/plugins/napari-bio-sample-data):

https://github.com/napari/napari/assets/2608297/e8673d4e-c387-49bc-9254-30d0fc9c1cd2

And the new approach on the same data that is more responsive:

https://github.com/napari/napari/assets/2608297/0b9f1f12-d5e0-4980-be05-c5c58285c1a2

This PR also removes the old approach to tiled rendering (AKA octree)
since that is tied to the old approach to asynchronous loading. Most of
the code change delta consists of deleted files that are experimental
modules and tests associated with those old approaches.

The other changes remove the old integration and adjust the new one to
effectively achieve replacement. The key changes are as follows.

1. Remove async/sync_only marks from pytest since these should only be
needed for the old approach.
2. Remove command that hides/shows whether octree chunks are visible.
I'm unsure if these commands are publicly exposed anywhere, though this
one currently appears to be broken.
3. Redefine the existing experimental setting previously used to enable
the old approach to enable the new one instead.
4. Add support for `Layer.loaded` for the new approach by adding an ID
to each slice request. This could probably be moved to a separate PR if
desired (e.g. see andy-sweet#51)
5. Absorb slicing logic spread across old classes into the slice request
call.

This PR currently includes napari#5783, which should be reviewed and merged
first to make this large PR as small as possible. For the difference
between this and napari#5783, see andy-sweet#50

# References

I think this improves the simplicity of slicing logic as described in
napari#2156
Also see napari#5711 for the first version of this PR from @kcpevey, which was
split into this and napari#5783
Part of napari#4795

## Type of change

There probably need to be some documentation updates related to this
since the old approach is well documented on napari.org. We should
probably write some developer docs based off the async slicing NAP and
mark the old docs as deprecated and/or remove them to avoid confusion.

# How has this been tested?

- [x] added an integration test for checking `Layer.loaded` is updated
correctly
- [x] all existing tests pass with my change

I also did some manual testing by running `NAPARI_ASYNC=1 napari` and
interacting with some remote and non-remote data.

---------

Co-authored-by: Kim Pevey <kcpevey@gmail.com>
Co-authored-by: Kim Pevey <kcpevey@quansight.com>
Co-authored-by: Juan Nunez-Iglesias <jni@fastmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Matthias Bussonnier <bussonniermatthias@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement preferences Issues relating to the creation of new preference fields/panels qt Relates to qt task tests Something related to our tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants