Improve plotting of lazy signals, by keeping current chunk #2568
Conversation
This looks very nice.
I think yes, because I often find myself inspecting 4D signals (EBSD) with different navigators (quality/similarity/orientation maps), and thus have to plot, close, plot again and so on. I think my replotting would benefit from keeping temporary chunks after closing a plot.
Since it can really be used for things which are not plotting related.
All done! The MacOS Python 3.6 test failed due to a timeout, which I don't think is related to the code changes in this PR. Travis is failing due to several errors in IO, which do not seem to be related to this PR either.
This is disturbingly much faster. Well done. Happy to merge this unless there is anything else you'd like to change?
There are a few things I wanted to comment on in this PR. I should be able to review it in the next few days.
Codecov Report

```diff
@@              Coverage Diff               @@
##   RELEASE_next_minor    #2568      +/-  ##
==============================================
- Coverage           77.11%   77.07%   -0.04%
==============================================
  Files                 206      206
  Lines               31656    31682      +26
  Branches             6934     7145     +211
==============================================
+ Hits                24412    24420       +8
- Misses               5489     5502      +13
- Partials             1755     1760       +5
```

Continue to review the full report at Codecov.
Is it not possible to use the NumPy array dask is actually using? Or any of the dask caching? See for example: napari/napari#1173
The best option there is probably the dask caching. I'll give it a try, and see how it performs.
Working on this now, testing some implementations using dask's Opportunistic Caching.
Very short summary: the current implementation in this pull request seems to give the best performance and the "smoothest" user experience when plotting. I suggest using this methodology (keeping a temporary NumPy array), then adding Opportunistic Caching for lazy signals in a separate pull request.

I did a few tests, trying to figure out which implementation we should use. The candidates were:

1. Keeping the current chunk as a temporary NumPy array (the implementation in this pull request)
2. dask's `persist()`
3. dask's Opportunistic Caching, with graph fusing disabled via `dask.base.config.set({"optimization.fuse.active": False})`

The main aim of this pull request is to keep one or several chunks in RAM during the plotting. Note that 1 and 2 also benefit from the caching. (A sketch of enabling these dask settings is shown after this list.)
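As a reference, a minimal sketch of enabling the dask features mentioned above, using dask's public `dask.config` and `dask.cache` APIs (reference usage only, not code from this pull request):

```python
import dask
from dask.cache import Cache

# Register dask's opportunistic cache (sized here at 2 GB): recently
# computed chunks are held in RAM and reused by later computations.
cache = Cache(2e9)
cache.register()

# Disable graph fusing, so intermediate chunk results remain separate
# tasks that the cache can capture and reuse.
dask.config.set({"optimization.fuse.active": False})
```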
I think the "temporary" NumPy array implementation is the way to go, as it leaves to a more "smooth" user experience when doing the plotting. Even if it adds some more code complexity. BenchmarkingMaking the datasetimport numpy as np
from tqdm import tqdm
import h5py
import dask.array as da
from hyperspy._lazy_signals import LazySignal2D
#################################################
f = h5py.File("001_test_data_256x256.hdf5", mode='w')
data = f.create_dataset(
    'data', shape=(256, 256, 256, 256), dtype=np.uint16,
    chunks=(32, 32, 32, 32), compression='gzip')
x_pos_list = np.linspace(128-20, 128+20, 256, dtype=np.uint16)
y_pos_list = np.linspace(128-40, 128+40, 256, dtype=np.uint16)
image = np.zeros((256, 256), dtype=np.uint16)
pbar = tqdm(total=256*256)
for ix_cube in range(8):
    for iy_cube in range(8):
        data_cube = np.zeros((32, 32, 256, 256), dtype=np.uint16)
        for ix in range(32):
            for iy in range(32):
                ix_pos = ix_cube * 32 + ix
                iy_pos = iy_cube * 32 + iy
                x_pos = x_pos_list[ix_pos]
                y_pos = y_pos_list[iy_pos]
                image[x_pos, y_pos] = 10
                data_cube[ix, iy, :, :] = image
                image[x_pos, y_pos] = 0
                pbar.update(1)
        data[ix_cube*32:(ix_cube+1)*32, iy_cube*32:(iy_cube+1)*32, :, :] = data_cube
        del data_cube
f.close()
#################################################
dask_array = da.from_array(h5py.File("001_test_data_256x256.hdf5")['data'], chunks=(32, 32, 32, 32))
s = LazySignal2D(dask_array)
s.save("001_test_data_256x256.hspy", chunks=(32, 32, 32, 32), overwrite=True)
#################################################
s_nav = s.sum(axis=(-1, -2)).T
s_nav.compute()
s_nav.save("001_test_data_256x256_nav.hspy", overwrite=True)
```

Running the caching and persist

```python
from time import time
import numpy as np
import dask
import dask.array as da
import hyperspy.api as hs
from dask.cache import Cache
cache = Cache(2e9)
cache.register()
s = hs.load("001_test_data_256x256.hspy", lazy=True)
# Loading a chunk
t0 = time()
data_chunk = s.data[0:32, 0:32].persist()
print("Loading full chunk: {0}".format(time() - t0))
t0 = time()
value = np.array(data_chunk[15, 10])
print("Loading single frame: {0}".format(time() - t0))
t0 = time()
value = np.array(data_chunk[14, 10])
print("Loading another single frame: {0}".format(time() - t0))
t0 = time()
value = np.array(data_chunk[15, 10])
print("Loading same frame as earlier: {0}".format(time() - t0))
######## New chunk
t0 = time()
data_chunk = s.data[32:64, 0:32].persist()
print("Loading new full chunk: {0}".format(time() - t0))
t0 = time()
value = np.array(data_chunk[20, 30])
print("Loading new frame: {0}".format(time() - t0))
# Loading the same chunk as earlier. Which will be pretty quick, due to caching
t0 = time()
data_chunk = s.data[0:32, 0:32].persist()
print("Loading same chunk as earlier: {0}".format(time() - t0)) Output: Loading full chunk: 1.0767991542816162
Loading single frame: 0.019429445266723633
Loading another single frame: 0.018543720245361328
Loading same frame as earlier: 0.00683283805847168
Loading new full chunk: 1.0792076587677002
Loading new frame: 0.018936872482299805
Loading same chunk as earlier: 0.014795541763305664
```

Running the caching + fuse
All times in seconds:

| | NumPy array | NumPy array + cache | Persist + cache | Cache + fuse |
|---|---|---|---|---|
| Load chunk1 | 1.15 | 1.15 | 1.05 | N/A |
| Load image from chunk1 | 0.000006 | 0.000006 | 0.016 | 1.0 |
| Load second image from chunk1 | 0.000001 | 0.000001 | 0.016 | 0.016 |
| Load same image from chunk1 | 0.0000007 | 0.0000009 | 0.006 | 0.006 |
| Load chunk2 | 1.15 | 1.15 | 1.0 | N/A |
| Load image from chunk2 | 0.0003 | 0.0005 | 0.015 | 0.016 |
| Load chunk1 again | 1.15 | 0.11 | 0.011 | 0.007 |
Thus, there seems to be some added overhead when accessing a dask array, which we can avoid by temporarily storing the current plotting "chunk" as a NumPy array. While the 0.016 seconds to access an image (for example "Load image from chunk1" in the Cache + fuse column) doesn't seem like much, for me it definitely feels quite a bit slower compared to the NumPy array version. Using the opportunistic caching together with the "NumPy" method seems to work really well, as it will keep a certain number of the chunks in memory, which makes them fairly quick to access (0.01 s).
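To make the idea concrete, here is a simplified, self-contained sketch of the temporary-chunk approach; the name `ChunkCache` and its methods are hypothetical, and this is not the actual implementation in the pull request:

```python
import numpy as np
import dask.array as da

class ChunkCache:
    """Keep the chunk containing the current navigation position as a
    temporary NumPy array, and reuse it while the position stays
    inside that chunk."""

    def __init__(self, dask_array):
        self.dask_array = dask_array
        self._chunk = None        # temporary NumPy array holding one chunk
        self._chunk_slice = None  # navigation slice the cached chunk covers

    def get_frame(self, ix, iy):
        # Assumes uniform chunking along the two navigation dimensions
        cx = self.dask_array.chunks[0][0]
        cy = self.dask_array.chunks[1][0]
        sl = (slice(ix - ix % cx, ix - ix % cx + cx),
              slice(iy - iy % cy, iy - iy % cy + cy))
        if sl != self._chunk_slice:
            # The position left the cached chunk: compute the new chunk once
            self._chunk = np.asarray(self.dask_array[sl])
            self._chunk_slice = sl
        # In-chunk access is a plain NumPy index, hence microseconds
        return self._chunk[ix - sl[0].start, iy - sl[1].start]

data = da.zeros((128, 128, 256, 256), chunks=(32, 32, 32, 32), dtype=np.uint16)
cache = ChunkCache(data)
frame = cache.get_frame(40, 7)  # computes the chunk covering [32:64, 0:32]
frame = cache.get_frame(41, 7)  # same chunk: no dask computation needed
```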
I was initially not convinced about the benefit of this approach, because it doesn't work with ROIs and I don't think that it is HyperSpy's task to do the caching of dask arrays. However, since the overhead for slicing in dask is noticeable here, and until the latter is addressed, it is useful to have this workaround. It should also be useful for multifit of lazy signals.

A better approach would be to reduce the slicing overhead to an extent that it is no longer significant for our use case - which should be possible - but this should be done in dask and will be more complicated to get right.
Instead of creating and deleting attributes at various points in the code, I think that the following alternative would be clearer and more explicit (a sketch follows the list):

- define the initial value (`None`) for the cache in `__init__`, and add a comment to explain how it works
- connect the cache-clearing method to the `data_changed` event in `__init__`
- add a private method to clear the cache, i.e. reset it to its initial value (`None`)
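A minimal sketch of the structure suggested above; the attribute names (`_cache_dask_chunk`, `_cache_dask_chunk_slice`) and the `obj` argument are illustrative assumptions, not necessarily what the PR ends up using:

```python
from hyperspy.signal import BaseSignal

class LazySignal(BaseSignal):
    """Skeleton illustrating the suggested cache wiring."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Cache for the chunk currently used by the plot; None means empty.
        # It is cleared automatically whenever the signal data changes.
        self._cache_dask_chunk = None
        self._cache_dask_chunk_slice = None
        self.events.data_changed.connect(self._clear_cache_dask_data)

    def _clear_cache_dask_data(self, obj=None):
        # Reset the cache to its initial value (None); `obj` is the
        # argument the data_changed event triggers with.
        self._cache_dask_chunk = None
        self._cache_dask_chunk_slice = None
```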
Also, start writing a docstring for this function.
For plotting improvements to lazy signals.
Todo for myself: check that the following code does not produce any errors.

```python
import numpy as np
import dask.array as da
import hyperspy.api as hs
data = da.zeros((128, 128, 256, 256), chunks=(32, 32, 32, 32), dtype=np.uint16)
s = hs.signals.Signal2D(data).as_lazy()
s.save("000_dataset.hspy", chunks=(32, 32, 32, 32)) |
There are some unit tests which fail in some strange ways. For example:

```
>   if not self._clear_cache_dask_data in self.events.data_changed._connected_all:
E   AttributeError: 'Signal1D' object has no attribute '_clear_cache_dask_data'
```

This error happens at https://github.com/hyperspy/hyperspy/blob/RELEASE_next_minor/hyperspy/_signals/signal1d.py#L1659, and I'm guessing it is due to `get_current_signal` producing a non-lazy signal (see below). Secondly, when a signal is created with `full_initialisation=False`:

```
E   AttributeError: 'LazySignal1D' object has no attribute 'events'
```
Previously, when `self` was lazy, it would call `__init__` with `'_lazy': False`, leading to the non-lazy version of the signal calling `LazySignal.__init__`. However, the non-lazy signal did not have access to a function which is only available in `LazySignal`. This led to an error, meaning `get_current_signal` would fail.
Some `LazySignal`s are initialized without events, when they are created with `full_initialisation=False`. For these very few signals, the `LazySignal.__init__` function failed. By checking in the `__init__` function that `events` exists, the init no longer fails. However, this might(?) lead to memory leaks when plotting with newly created signals.
I figured out why the previous tests failed, but I'm uncertain if the fixes are the best ones.

The first one is related to `get_current_signal`. The problem was in:

```python
cs = self.__class__(
    self(),
    axes=self.axes_manager._get_signal_axes_dicts(),
    metadata=metadata.as_dictionary(),
    attributes={'_lazy': False})
```

Somehow, this led to the non-lazy version of the signal calling `LazySignal.__init__`. This was fixed with:

```python
cs = self.__class__(
    self(),
    axes=self.axes_manager._get_signal_axes_dicts(),
    metadata=metadata.as_dictionary())
if self._lazy:
    cs._lazy = False
    cs._assign_subclass()
```

The second issue was with signals created with `full_initialisation=False`. This is due to these signals not having an `events` attribute. The easy fix was to do:

```python
if hasattr(self, "events"):
    if not self._clear_cache_dask_data in self.events.data_changed._connected_all:
        self.events.data_changed.connect(self._clear_cache_dask_data)
```

However, I'm not sure if this is the best solution.
I've implemented all the feedback, and answered the questions in the code. I left the questions "Unresolved". I ran into some issues with the …, but other than that, this pull request is ready for review!
Summary of the pull request
Old way
Currently, plotting lazy signals in HyperSpy can be quite slow. One of the reasons is how data is retrieved from the dask array. For a 4-dimensional dataset with chunking (32, 32, 32, 32), the plotting works something like this when the navigator is moved to a new position: the whole chunk containing the new position is loaded and computed, but only the single requested image is kept, so the chunk has to be recomputed for every navigation step.
This leads to a whole lot of inefficiency, and very slow navigation.
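Schematically, the old access pattern corresponds to something like this (an illustrative reconstruction, not the actual HyperSpy plotting code):

```python
import numpy as np
import dask.array as da

def get_frame_naive(dask_array, ix, iy):
    # Every navigation event executes the task graph again: for a
    # file-backed array, the whole chunk containing (ix, iy) is read
    # and decompressed, then discarded after one frame is returned.
    return np.asarray(dask_array[ix, iy])

data = da.zeros((128, 128, 256, 256), chunks=(32, 32, 32, 32), dtype=np.uint16)
frame = get_frame_naive(data, 10, 11)  # recomputes the chunk on every call
```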
New way
This pull request addresses this by temporarily keeping the current chunk in memory, leading to much better responsiveness when navigating compared to the current way of plotting lazy signals. The chunk is kept in `s._temp_plot_data`, and its "slice" in `s._temp_plot_data_slice`.

This way, navigation is just as fast as for a non-lazy signal while navigating inside one chunk, which leads to a much smoother user experience. For a comparison on an artificial dataset, chunked (32, 32, 32, 32), where I held down the right arrow key to make the navigator move:
Old: [screen recording]
New: [screen recording]
The one downside is that this slightly increases the memory use when plotting, but only by as much as one chunk. I think that in most applications this is not an issue.
One question is whether the temporary plotting chunk should be kept in memory after closing the plot. On one hand, keeping it in memory will make any subsequent plotting quicker; on the other hand, it will use a little bit of memory.
Changes

- `BaseSignal.__call__` for handling lazy signals

Todo

To test this