Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure interactive visualization can be run iteratively #541

Merged
merged 18 commits into from
May 19, 2022
Merged

Conversation

alex-l-kong
Copy link
Contributor

What is the purpose of this PR?

Closes #520. Users may want to run the interactive visualization in the same session multiple times if they determine the original meta cluster remapping needs changes. Currently, there is no way to do this.

  1. If the user doesn't change the names of the meta clusters (or has a purely numeric renaming scheme), the overlay will pull extra columns and doesn't properly execute all the components correctly
  2. If the user changes the names of the meta clusters to string values, the visualization will throw an error midway due to currently assuming all metacluster values are integers

How did you implement your changes

Flexibility has been added to account for the additional rename problem which solves both issues 1 and 2 listed above:

  1. metacluster_from_files in file_reader.py now handles the renaming of the {pixel/cell}_meta_cluster_rename column to metacluster_rename. This provides a backbone that we can work off of in metaclusterdata.py.
  2. _metacluster_displaynames_map in MetaClusterData gets prepopulated to ensure that the axes labels contain the renamed meta clusters where applicable.
  3. Ensure the clusters_from_metaclusters property in MetaClusterData also takes into account the metacluster_rename column when reindexing with _marker_order. Previously, it was assumed that there would just be the metacluster column.
  4. Ensure the clusters property in MetaClusterData also drops the metacluster_rename column if necessary to prevent clashes (for example, with computing linkage_matrix).

Remaining issues

The visualization has not been tested on a variety of datasets as I have only 1 on hand.

@alex-l-kong alex-l-kong self-assigned this May 6, 2022
@alex-l-kong alex-l-kong requested review from cliu72 and removed request for cliu72 May 10, 2022 05:16
@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented May 10, 2022

Removed the code review request to discuss one remaining issue with setting the matplotlib backends. Running %matplotlib inline multiple times will cause %matplotlib widget (and thus, the interactive visualization on re-run) to fail.

As the notebooks are written now, we set %matplotlib inline in all non-interactive cells, which won't do. I've verified 2 fixes, let me know which one is preferable:

  1. Don't change the backend to inline for the pixel and cell cluster overlays (or the weighted channel heatmaps for cells): arguably leads to an easier workflow for the user. This will leave the backend as widget for all of these. These figures can be interacted with, zoomed in. Only downside is they may appear a bit larger than with inline with a bit more padding on the sides.
  2. Have one call to %matplotlib inline but have a note to the user to run it just once: the workflow will be a bit more awkward for this one. However, it does allow the figures to display statically, like they were. It also makes the legend for weighted heatmaps easier to draw.

@ngreenwald
Copy link
Member

ngreenwald commented May 10, 2022 via email

@alex-l-kong
Copy link
Contributor Author

Cell cluster overlay with %matplotlib inline

Screen Shot 2022-05-10 at 1 47 01 PM

Weighted channel average with %matplotlib inline

Screen Shot 2022-05-10 at 1 46 22 PM

Cell cluster overlay without %matplotlib inline: padding will need to be removed

Screen Shot 2022-05-10 at 1 43 21 PM

Weighted channel average without %matplotlib inline: note that I will need to figure out how to add the legend to this one.

Screen Shot 2022-05-10 at 1 44 00 PM

@ngreenwald
Copy link
Member

I think that's fine, if people are going to use those plots for anything they'll export them as PDFs anyway, so as long as it shows up well enough in the cells to see I think it's okay

@ngreenwald
Copy link
Member

Tried it out on pixel and cell clustering, didn't run into any issues

@cliu72
Copy link
Contributor

cliu72 commented May 11, 2022

Tested it a bit and also didn't run into issues. I agree with Noah, it looks fine (so option 1, not including %matplotlib inline looks fine to me). If you want to get around drawing the legend for the heatmap, you can just change the rownames of the heatmap (instead of showing integers as the rownames on the heatmap, map those integers to the cluster names specified by the user).

@cliu72
Copy link
Contributor

cliu72 commented May 11, 2022

Also, this isn't related to this pull request but it came up when I was testing just now, so I thought it'd be easier to put it here. In this code chunk in section 3.2:

# define the path to the channel file
if img_sub_folder is None:
    chan_file = os.path.join(
        fovs[0], io_utils.list_files(os.path.join(tiff_dir, fovs[0]), substrs=['.tif', '.tiff'])[0]
    )
else:
    chan_file = os.path.join(
        fovs[0], img_sub_folder, io_utils.list_files(os.path.join(tiff_dir, img_sub_folder, fovs[0]), substrs=['.tif', '.tiff'])[0]
    )

I think fovs[0], img_sub_folder, io_utils.list_files(os.path.join(tiff_dir, img_sub_folder, fovs[0]), substrs=['.tif', '.tiff'])[0] in the else statement should be replaced with fovs[0], img_sub_folder, io_utils.list_files(os.path.join(tiff_dir, fovs[0], img_sub_folder), substrs=['.tif', '.tiff'])[0] (seems as if img_sub_folder is in the wrong place).

The way it is now was causing errors for me.

@alex-l-kong
Copy link
Contributor Author

Also, this isn't related to this pull request but it came up when I was testing just now, so I thought it'd be easier to put it here. In this code chunk in section 3.2:

# define the path to the channel file
if img_sub_folder is None:
    chan_file = os.path.join(
        fovs[0], io_utils.list_files(os.path.join(tiff_dir, fovs[0]), substrs=['.tif', '.tiff'])[0]
    )
else:
    chan_file = os.path.join(
        fovs[0], img_sub_folder, io_utils.list_files(os.path.join(tiff_dir, img_sub_folder, fovs[0]), substrs=['.tif', '.tiff'])[0]
    )

I think fovs[0], img_sub_folder, io_utils.list_files(os.path.join(tiff_dir, img_sub_folder, fovs[0]), substrs=['.tif', '.tiff'])[0] in the else statement should be replaced with fovs[0], img_sub_folder, io_utils.list_files(os.path.join(tiff_dir, fovs[0], img_sub_folder), substrs=['.tif', '.tiff'])[0] (seems as if img_sub_folder is in the wrong place).

The way it is now was causing errors for me.

Oh crud, seems like I uploaded a notebook containing an older version of this cell with incorrect path specification. I'll be sure to fix that.

@ngreenwald
Copy link
Member

ngreenwald commented May 12, 2022 via email

@alex-l-kong
Copy link
Contributor Author

Tested it a bit and also didn't run into issues. I agree with Noah, it looks fine (so option 1, not including %matplotlib inline looks fine to me). If you want to get around drawing the legend for the heatmap, you can just change the rownames of the heatmap (instead of showing integers as the rownames on the heatmap, map those integers to the cluster names specified by the user).

Oh yeah that works too. I'll make this change and push the changes.

@alex-l-kong
Copy link
Contributor Author

Tested it a bit and also didn't run into issues. I agree with Noah, it looks fine (so option 1, not including %matplotlib inline looks fine to me). If you want to get around drawing the legend for the heatmap, you can just change the rownames of the heatmap (instead of showing integers as the rownames on the heatmap, map those integers to the cluster names specified by the user).

So while this solution will work with the meta cluster heatmap, it's better to have this heatmap for the SOM clusters.

Unfortunately, sns.clustermap doesn't play too nicely with our current formatting, so I'll have to spend some more time getting the legend to display properly.

@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented May 12, 2022

Screen Shot 2022-05-12 at 3 23 38 PM

Was able to play around with some of the GridSpec features of sns.clustermap and got it to play kinda nicely with %matplotlib widgets (the bottom completely shows, just got cut off for this screenshot). The following changes were made:

  1. Ensure the cluster labels are positioned horizontally
  2. Ensure the legend doesn't get drawn over the figure with the %matplotlib widgets backend

The main problem with clustermap is that the author used axes that don't support tight_layout, making the actual figure very inflexible to change. To ensure both of the above are formatted correctly, I had to directly modify the GridSpec the clustermap is drawn on to accommodate room on the right (in draw_heatmap). The only minor issue is the size of the colormap, but if you're fine with this, I can push the changes and request a review.

@cliu72
Copy link
Contributor

cliu72 commented May 13, 2022

Screen Shot 2022-05-12 at 3 23 38 PM

Was able to play around with some of the GridSpec features of sns.clustermap and got it to play kinda nicely with %matplotlib widgets (the bottom completely shows, just got cut off for this screenshot). The following changes were made:

  1. Ensure the cluster labels are positioned horizontally
  2. Ensure the legend doesn't get drawn over the figure with the %matplotlib widgets backend

The main problem with clustermap is that the author used axes that don't support tight_layout, making the actual figure very inflexible to change. To ensure both of the above are formatted correctly, I had to directly modify the GridSpec the clustermap is drawn on to accommodate room on the right (in draw_heatmap). The only minor issue is the size of the colormap, but if you're fine with this, I can push the changes and request a review.

I think this looks fine! I'm okay with it.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@alex-l-kong alex-l-kong requested a review from cliu72 May 13, 2022 19:08
Copy link
Contributor

@cliu72 cliu72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested it and didn't run into issues, but what is the reason for "NOTE: after running or re-running the remapping widget, only run the following cell below ONCE. Do NOT run it more than once after a remapping session or it will break the interactivity of the widget on future runs." in the cell clustering notebook? Why can't that heatmap be generated more than once - I thought you had outlined solutions for this above? What would be the solution for someone who does make that heatmap and then decide they want to go back and change clusters - re-run the whole notebook?

@cliu72
Copy link
Contributor

cliu72 commented May 17, 2022

Also, is there a way to remove the white grid lines in the cell overlays? They don't seem to be there for the pixel cluster overlays. It would also be nice to remove the axis label numbers (for both cell and pixel cluster overlays). Those numbers aren't really necessary.
image

@alex-l-kong
Copy link
Contributor Author

Tested it and didn't run into issues, but what is the reason for "NOTE: after running or re-running the remapping widget, only run the following cell below ONCE. Do NOT run it more than once after a remapping session or it will break the interactivity of the widget on future runs." in the cell clustering notebook? Why can't that heatmap be generated more than once - I thought you had outlined solutions for this above? What would be the solution for someone who does make that heatmap and then decide they want to go back and change clusters - re-run the whole notebook?

Forgot to remove this comment, it was a leftover from my previous version including %matplotlib inline.

@alex-l-kong
Copy link
Contributor Author

Also, is there a way to remove the white grid lines in the cell overlays? They don't seem to be there for the pixel cluster overlays. It would also be nice to remove the axis label numbers (for both cell and pixel cluster overlays). Those numbers aren't really necessary. image

Should be easy to do with plt.axis('off') and plt.grid(b=None), I'll add those in.

@alex-l-kong alex-l-kong requested a review from cliu72 May 17, 2022 18:02
Copy link
Contributor

@cliu72 cliu72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just one more tiny thing. Would it be easy to remove the "cell_meta_cluster_rename' under the colorbar (for both heatmaps)? Seems unnecessary.

image

@alex-l-kong
Copy link
Contributor Author

OK this should do the trick.

@alex-l-kong alex-l-kong requested a review from cliu72 May 18, 2022 20:47
@alex-l-kong alex-l-kong merged commit ad9fd61 into master May 19, 2022
@alex-l-kong alex-l-kong deleted the iter_viz branch May 19, 2022 18:10
@srivarra srivarra added the bug Something isn't working label Jun 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metacluster GUI crashes when running on previously updated data
4 participants