RFC: improve show_stack() performance #346

kevinyamauchi · 2018-07-27T06:09:11Z

edit 7.27.18: updated to include prototypes for show_spots (see 5525d9b, 48ba1e7, dd941e2)

Overview

The show_stack tool is really useful for evaluating images during processing. However, it is very slow and thus difficult to use. I have prototyped a faster way to update images (see the notebook) and proposed a method for speeding up show_spots as well (see below). Thoughts?

Approach

Faster scrolling
The current version in master creates a new imshow() plot each time the slider is adjusted. Recreating the entire plot is very slow, so instead we use imshow().set_data() to update just the image data. This should make show_stack() noticeably faster. If we need to go faster still, we may need to look into a different plotting library (e.g., pyqtgraph).
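
A minimal sketch of the set_data() idea described above, not the code in this PR. The array `stack` and the helper `show_slice` are hypothetical names, and the Agg backend is selected only so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in a notebook you would use %matplotlib notebook instead
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for an image stack: 5 z-slices of 32x32 pixels.
stack = np.random.RandomState(0).rand(5, 32, 32)

fig, ax = plt.subplots()
# Create the image artist once, with a color range that covers the whole stack.
im = ax.imshow(stack[0], cmap="gray", vmin=stack.min(), vmax=stack.max())


def show_slice(z):
    """Swap in the pixel data for slice z without rebuilding the plot."""
    im.set_data(stack[z])
    fig.canvas.draw_idle()  # request a redraw of the existing figure


show_slice(3)
```

In a notebook, `show_slice` could be connected to a slider widget so that scrolling only swaps the image data.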

Faster spots
To speed up the spot display, we draw all of the spots on the image (as if they were z projected) before calling the interact object. We call set_visible(False) on all spots (i.e., make them invisible). Then, when the viewer "scrolls" between slices, we toggle the visible property for spots that are members of that frame. We precompute the masks for each slice to speed up the update function.
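
The approach above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the spot arrays (`spots_x`, `spots_y`, `spots_r`, `spots_z`) and the helper `show_spots_for_slice` are hypothetical, and the Agg backend is used only so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Circle

# Hypothetical spot table: x, y, radius, and the z-slice each spot belongs to.
spots_x = np.array([5.0, 12.0, 20.0, 8.0])
spots_y = np.array([6.0, 14.0, 3.0, 25.0])
spots_r = np.array([2.0, 2.0, 3.0, 1.5])
spots_z = np.array([0, 1, 1, 2])
n_slices = 3

fig, ax = plt.subplots()
ax.imshow(np.zeros((32, 32)), cmap="gray")

# Draw every spot once, as if z-projected, and hide it immediately.
circles = []
for x, y, r in zip(spots_x, spots_y, spots_r):
    circle = Circle((x, y), r, fill=False, edgecolor="red")
    circle.set_visible(False)
    ax.add_patch(circle)
    circles.append(circle)

# Precompute one boolean membership mask per slice.
masks = [spots_z == z for z in range(n_slices)]


def show_spots_for_slice(z):
    """Show only the spots that belong to slice z."""
    for circle, in_slice in zip(circles, masks[z]):
        circle.set_visible(bool(in_slice))
    fig.canvas.draw_idle()


show_spots_for_slice(1)
```

Because the patches are created once, each scroll step only flips visibility flags instead of creating new artists.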

Image query cursor tooltip
Additionally, note that I use the 'notebook' (as opposed to 'inline') matplotlib magic, which both allows the draw to work and as a bonus gives an image tooltip. The tooltip is super useful because it reads out the coordinates and intensity of the pixel under the cursor (exactly what we were talking about today)!

Questions

Will the proposed show_spots method generalize to other slicing (i.e., not along z)?
Is the show_spots responsive enough? We could consider blitting (i.e., only interacting with spots that change state between slices), but it may not be worth the effort.
What class will be passed to show_spots as the 'results_df'?
Does it matter what backend (e.g., QtAgg or TkAgg) the user has?
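
For the second question, the "only touch spots that change state" idea could look something like the sketch below. It assumes per-slice boolean masks like those described under "Faster spots"; `spots_z`, `masks`, and `changed_spots` are hypothetical names. It only computes which spots would need updating and does not use matplotlib's own blitting machinery.

```python
import numpy as np

# Hypothetical z-slice assignment for five spots.
spots_z = np.array([0, 1, 1, 2, 2])

# One boolean row per slice: masks[z][i] is True if spot i belongs to slice z.
masks = np.stack([spots_z == z for z in range(3)])


def changed_spots(prev_z, new_z):
    """Return indices of spots whose visibility differs between two slices."""
    return np.flatnonzero(masks[prev_z] ^ masks[new_z])


# Moving from slice 1 to slice 2 hides spots 1 and 2 and shows spots 3 and 4.
changed = changed_spots(1, 2)
```

Whether this saves meaningful time depends on how many spots each slice holds, which is the trade-off the question raises.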

ambrosejcarr · 2018-07-27T15:27:57Z

This looks awesome. I'm going to be on bad wifi today so I may not be able to really dig in and test things out until I get home tomorrow.

A random point:

PRs will fail unless you use nbencdec encode <name.ipynb> <name.py> to create a .py version of the notebook that can be run by our tests.

dganguli

This is awesome! Love the ability to pan and zoom!

kevinyamauchi · 2018-07-27T17:27:12Z

Ah sorry about the failed tests - I'll try to figure out the notebook creation later today.

I just added the show_spots functionality and I think it works pretty well (I updated the notebook to include an example)!

@ambrosejcarr, I have a question about the results_df argument that gets passed into show_spots(): under the new IntensityTable/Codebook, what class will be passed to show_spots?

ambrosejcarr · 2018-07-27T18:01:31Z

Ah sorry about the failed tests - I'll try to figure out the notebook creation later today.

Not a problem and no hurry.

pip3 install nbencdec
cd starfish
nbencdec encode notebooks/Faster_show_stack.ipynb notebooks/Faster_show_stack.py

@ambrosejcarr, I have a question about the results_df argument that gets passed into show_spots(): under the new IntensityTable/Codebook, what class will be passed to show_spots?

That's a great question. I think it would probably be an IntensityTable, but we'd have to update the code to make it work. Referencing #324, because I think there isn't any test coverage of that method right now, and adding tests will help.

ttung · 2018-07-27T18:02:39Z

Generally, you should run pip install -r REQUIREMENTS-DEV.TXT -r REQUIREMENTS-NOTEBOOK.TXT

ambrosejcarr · 2018-07-27T18:06:05Z

I was able to load up the notebook -- I love the %matplotlib notebook, the tools are super useful; they are indeed exactly what we wanted!

Unfortunately, when I use that tool and try normal matplotlib figure plots, they don't show up anymore. I haven't done any due diligence yet to see if there are any easy solutions, but we should make sure it doesn't disrupt the regular plotting.

I tried replacing it with %matplotlib inline, but the new plot doesn't survive the transition. I also tried using both inline and notebook, but it seems like the notebook only supports one at a time.

Edit: working example:

import numpy as np
import matplotlib.pyplot as plt
plt.plot(np.arange(10), np.arange(10))

Under %matplotlib notebook, this just returns the plot object (a list of lines), but nothing displays.

kevinyamauchi · 2018-07-27T20:07:33Z

Thanks for looking into the plotting, Ambrose. I took a look and we can plot using matplotlib, but when we use the %matplotlib notebook magic, we need to explicitly assign the plot object to a fig/axis (see below). On its own, plt.plot() just returns the plot object. I also added some examples to the notebook.

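import matplotlib.pyplot as plt

# Create the figure and axes explicitly, then plot on the axes.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(range(10));  # trailing semicolon hides the returned object in the notebook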

ambrosejcarr · 2018-07-27T20:22:08Z

Got it. Thanks for looking into that. I think the requirement that we have to assign to axes is an acceptable price to pay to get pan/zoom and better performance on the show stack.

Interested to see what @dganguli has to say about this. :)

kevinyamauchi · 2018-07-27T22:44:45Z

I agree. I also think that the specificity of assigning the plot to an axis is useful for when we are generating multiple plots (e.g., QC plots) and want to manipulate a particular one (either statically or dynamically).
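
As a small illustration of that point, here is a sketch with two panels where only one is changed later. The panel names `ax_img` and `ax_hist` and the random image are hypothetical, and the Agg backend is used only so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical image standing in for real data in a QC plot.
image = np.random.RandomState(0).rand(16, 16)

# Keep an explicit handle to each panel.
fig, (ax_img, ax_hist) = plt.subplots(1, 2, figsize=(8, 4))
ax_img.imshow(image, cmap="gray")
ax_hist.hist(image.ravel(), bins=20)

# Later, adjust only the histogram panel; the image panel is left untouched.
ax_hist.set_title("Pixel intensities")
ax_hist.set_xlabel("intensity")
```

With the implicit plt.* interface, the second set of calls would act on whichever axes happened to be current.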

ambrosejcarr · 2018-07-30T13:36:49Z

I like these changes, here's what I think it would take to merge them:

1. Since %matplotlib notebook causes plots not to show without explicit figure references, and the proposed code updates break show_stack in %matplotlib inline, we'll need to update existing notebooks to use the proposed strategy.
2. In updated notebooks, verify that the new show_spots has logical results.
3. [Optional] Add some kind of tests to the vis methods.
4. Remove the example notebook.

@kevinyamauchi Do you feel comfortable working through these? We can give you some pointers on how we use nbencdec to generate .py files that pair with the notebooks if you'd like. Alternatively, I'd be happy to pair on this with you (or perhaps @ttung could in person).

kevinyamauchi · 2018-07-30T20:48:57Z

Sounds good. I can work through those items. I think I will need some help/discussion for creating tests (item 3) and generating the py files for notebooks, but I can take a crack at it myself first.

With regards to the spot data input format for show_spots(), I think I will integrate the input format in the current implementation just to get it tested quickly with the current notebooks. We can work to add support for IntensityTable soon. How does that sound?

ambrosejcarr · 2018-07-30T20:55:57Z

Sounds good. I can work through those items. I think I will need some help/discussion for creating tests (item 3) and generating the py files for notebooks, but I can take a crack at it myself first

Cool, I don't have much experience with vis tests, so I'm more interested than anything else. If it seems too complicated I wouldn't be too upset if you punted on it (that's what we're currently doing, see #324)

With regards to the spot data input format for show_spots(), I think I will integrate the input format in the current implementation just to get it tested quickly with the current notebooks. We can work to add support for IntensityTable soon. How does that sound?

Sounds like a good plan.

dganguli · 2018-07-30T23:21:13Z

starfish/image/_stack.py

@@ -335,25 +335,58 @@ def show_stack(

        show_spot_function = self._show_spots


this is no longer used -- is that intended?

Sorry - deleted.

dganguli · 2018-07-30T23:21:48Z

starfish/image/_stack.py

+        ax.set_xticks([])
+        ax.set_yticks([])
+
+        if show_spots:


why show fake spots? is this for demo purposes only? e.g., not intended to be merged into master?

alternatively, you could continue to use _show_spots, but pass in a spots table that looks like this:

spots = pd.DataFrame({'y':intensities.x, 'x':intensities.y, 'r':intensities.r})
f, ax = plt.subplots(figsize=(20, 20))
ax.imshow(blobs, cmap=plt.cm.gray)
stack._show_spots(results, ax=plt.gca())

where intensities has the relevant spot attributes

We discussed this on Slack, but I'm adding it here to consolidate the record. I just had this in as a placeholder until I figured out the expected format for the spot data. I have now implemented show_spots for the pandas data frame as you suggested here.

dganguli

This is an awesome contribution! I think we should move the random spot generation out of the code and let users submit spot attributes for testing. I also think we should verify that the existing notebooks will still plot appropriately, or at least make sure we fix the existing notebooks to generate the relevant plots.

kevinyamauchi · 2018-07-31T02:14:50Z

Sounds good, Deep! I fixed show_spots to work as a drop in for the current notebooks. I tested the notebooks and they all seem to work with minor tweaks. I can go over this with you when we meet tomorrow. If things look okay to you, I'll take a shot at making the notebook .py files.

@ambrosejcarr, the one thing I can't get to work is the pixel spot detector in the MERFISH notebook. I get a "KeyError: 'barcode'". Is this something being addressed in a different PR (or perhaps already merged into master)? This doesn't seem related to show_stack(), though, so I am not sure we need to fix it in this PR.

from starfish.pipeline.features.pixels.pixel_spot_detector import PixelSpotDetector
psd = PixelSpotDetector(
    codebook='https://s3.amazonaws.com/czi.starfish.data.public/MERFISH/codebook.csv',
    distance_threshold=0.5176,
    magnitude_threshold=1,
    area_threshold=2,
    crop_size=40
)

spot_attributes, decoded = psd.find(s)

dganguli

👍 from an offline convo @kevinyamauchi and i went through this update. pros: show_stack is much much faster, even when visualizing spots. also we get the ability to pan/zoom and read intensity values via hover. only con is that %matplotlib notebook is now required. this means for inline plots, one needs to call f, ax = plt.subplots(); ax.plot(...) instead of simply plt.plot(...). in the end, i think the pros outweigh the cons.

dganguli · 2018-07-31T21:33:49Z

notebooks/Faster_show_stack.ipynb

@@ -0,0 +1,3401 @@
+{


i say we nuke this notebook.

dganguli · 2018-07-31T21:34:29Z

and one last thing -- we should verify the new notebooks work -- which i'm pretty sure they will.

ambrosejcarr · 2018-07-31T21:35:27Z

I like the extra expressiveness of:

f, ax = plt.subplots(figsize=(n, n))

This is just another "pro" for me. :)

codecov-io · 2018-08-02T20:52:32Z

Codecov Report

Merging #346 into master will decrease coverage by 0.25%.
The diff coverage is 3.57%.

@@            Coverage Diff             @@
##           master     #346      +/-   ##
==========================================
- Coverage   82.97%   82.72%   -0.26%     
==========================================
  Files          67       67              
  Lines        2585     2593       +8     
==========================================
  Hits         2145     2145              
- Misses        440      448       +8

Impacted Files	Coverage Δ
starfish/image/_stack.py	`73.33% <3.57%> (-1.6%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b142a5...b34a239. Read the comment docs.

kevinyamauchi · 2018-08-02T20:58:42Z

Okay, I finally got the notebooks updated and encoded as .py files (thanks, @ttung for the help!). The allen notebook still fails due to the experiment.json file needing updates. Since that is a separate issue, if everyone is okay with the changes, can we merge? Thoughts @dganguli and @ambrosejcarr ?

ambrosejcarr

Looks great, thanks very much!

kevinyamauchi added 5 commits July 26, 2018 17:55

replaced imshow in show_stack

3ec5fd0

removed unused fcn calls

20920ca

added show_stack demo notebook

c53fcf2

added data to notebook

6244366

fixed figure_size

bb51639

kevinyamauchi requested review from ambrosejcarr and dganguli July 27, 2018 06:09

fixed typos and whitespace

2ca1c05

kevinyamauchi requested a review from ttung July 27, 2018 06:15

added demo of fast spots

e873d17

dganguli reviewed Jul 27, 2018

kevinyamauchi added 3 commits July 27, 2018 10:14

add ability to chose n_spots

5525d9b

updated notebook for show_spots

48ba1e7

updated notebook doc

dd941e2

kevinyamauchi added 2 commits July 27, 2018 11:32

added plotting example

09bdc93

added no display example

22bc14a

dganguli reviewed Jul 30, 2018

added support for results_df

7536b5b

updated to master

1fe25d5

dganguli reviewed Jul 31, 2018

kevinyamauchi added 3 commits August 2, 2018 12:41

updated notebooks

61ba9dc

fixed whitespace

8bad718

updated notebook .py

b34a239

ambrosejcarr approved these changes Aug 2, 2018

kevinyamauchi merged commit 2785643 into master Aug 2, 2018

kevinyamauchi deleted the ky-show-stack-perf branch August 2, 2018 21:41

dganguli mentioned this pull request Aug 10, 2018

Update show_stack to work in both inline and notebook modes #388

Merged
