Spatial-LDA viz and preprocessing notebook #468

bcollica · 2021-09-24T16:46:47Z

What is the purpose of this PR?

This PR creates the notebook and associated functions/visualizations for preprocessing data for spatial-LDA.

How did you implement your changes?

The LDA_Preprocessing notebook follows the pattern of the other example notebooks: it walks the user through loading, formatting, and visualizing cell data for use in spatial-LDA analysis. processing.py is the main module with the formatting and processing functions. Some visualization functions were added to analysis/visualize.py, and a few helper functions were added to utils/spatial_lda_utils.py.

The notebook uses an example dataset with two FOVs to show how the functions work. This dataset is included in example_dataset/spatial_lda_input_data. There are also directories with examples of the different visualizations, as well as the processed output data that will be passed to the training/inference functions in the next step.
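To illustrate how processed output could be handed to a later step, here is a minimal save/load round trip. It is a sketch only: `save_component` and `load_component` are hypothetical names, and pickle files in a single directory are an assumed layout, not the format this PR uses.

```python
import pickle
from pathlib import Path


def save_component(data, out_dir, name):
    """Pickle one data component to <out_dir>/<name>.pkl (hypothetical layout)."""
    path = Path(out_dir) / f"{name}.pkl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(data, fh)
    return path


def load_component(out_dir, name):
    """Read back a component written by save_component."""
    path = Path(out_dir) / f"{name}.pkl"
    if not path.exists():
        raise FileNotFoundError(f"No saved component at {path}")
    with path.open("rb") as fh:
        return pickle.load(fh)
```

A training/inference notebook could then call `load_component(out_dir, "featurized_cells")`, where the component name is again only an illustration.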

Remaining issues

A few points here:

The example dataset is a bit wonky because the centroids are pulled from a different cell table, and with only two FOVs it doesn't produce anything very interesting anyway. This will be resolved once an official example dataset has been decided on.
The bootstrapping in topic_eda() can be slow, especially for larger sets of FOVs and larger numbers of topics. One option is to add support for parallel processing, either across topics or across bootstrap iterations.
All visualizations use default parameters for color and style, but these can be adjusted depending on preferences.
There is currently a function for writing/saving the different data components, but no function for reading/loading them. A loader will be useful in the training/inference notebook and for other exploratory analysis.
The notebook contains a “table of contents” with internal links to the different sections, but these might not function properly depending on how the user is viewing the notebook (e.g. in a browser or in an IDE).
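As a rough illustration of the parallel-processing idea for the bootstrapping, the sketch below flattens every (topic number, bootstrap iteration) pair into an independent task and maps the tasks over an optional executor. It is not the topic_eda() implementation: `bootstrap_by_topic`, `one_bootstrap`, and the 1-D stand-in metric `toy_within_cluster_ss` are hypothetical names introduced only for this example.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean


def toy_within_cluster_ss(values, k):
    """Stand-in metric: sort 1-D values, cut them into k contiguous groups,
    and sum the squared deviations from each group mean."""
    ordered = sorted(values)
    n = len(ordered)
    total = 0.0
    for i in range(k):
        group = ordered[i * n // k:(i + 1) * n // k]
        if group:
            center = fmean(group)
            total += sum((v - center) ** 2 for v in group)
    return total


def one_bootstrap(task):
    """Run one bootstrap iteration for one topic number."""
    values, k, seed = task
    rng = random.Random(seed)  # per-task seed keeps results reproducible
    sample = rng.choices(values, k=len(values))  # resample with replacement
    return k, toy_within_cluster_ss(sample, k)


def bootstrap_by_topic(values, topic_counts, n_boot=20, seed=0, executor=None):
    """Average the metric over n_boot resamples for each topic number.

    Tasks are independent, so they can be mapped over any
    concurrent.futures executor; without one they run serially.
    """
    tasks = [(values, k, f"{seed}-{k}-{b}")
             for k in topic_counts for b in range(n_boot)]
    mapper = executor.map if executor is not None else map
    scores = {k: [] for k in topic_counts}
    for k, score in mapper(one_bootstrap, tasks):
        scores[k].append(score)
    return {k: fmean(v) for k, v in scores.items()}


if __name__ == "__main__":
    data = [float(i % 7) + 0.1 * i for i in range(60)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(bootstrap_by_topic(data, [2, 3, 4], n_boot=5, executor=pool))
```

Because each task carries its own seed, the serial and parallel runs give the same result. For CPU-bound work, a `ProcessPoolExecutor` would be the more likely choice than threads; the thread pool here only keeps the example simple.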

review-notebook-app · 2021-09-24T16:46:51Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

ngreenwald

Looking good. You added a ton of files in this PR. I think it makes sense to have the input data, but I don't think we need to check in the outputs, especially the plots that are generated.

ark/analysis/visualize.py

ark/utils/spatial_lda_utils.py

templates/LDA_Preprocessing.ipynb

alex-l-kong

Just a few minor things to add.

ark/analysis/visualize.py

requirements.txt

setup.py

ngreenwald

Looks good!

bcollica added 24 commits September 17, 2021 14:39

Add utils function for pooled within cluster sum of squares.

df257a6

Add test for pooled within cluster sum of squares.

275a71e

Integrate within_cluster_sums() into gap_stat() and compute_topic_eda().

5ef7d23

Update test for gap stat

0821013

Update test for within_cluster_sums

388107f

Basic plotting function/test for topic EDA metrics.

3391756

Add total cell count to fov_density and update test.

0a4ceef

Basic plotting/tests for distributions of FOV metrics.

a30f76b

Add helper function for adjacency graph plot

b73ffc7

Add function/test for plotting FOV adjacency network graphs

535ed96

Fix testing issues

ec3c3b7

Add cell count stat to compute_topic_eda and update settings

33d617b

Add cell count heatmap to topic eda visualization and tests.

872c39b

Add function/tests for saving spatial-LDA data files

dda5964

Add spatial-LDA preprocessing notebook

3dcad2e

Fix hardcoded bug in spatial_lda_utils

9305096

update example dirs

d3303b4

update notebook with example cell table

6b6e0cb

update requirements

b95806f

pycodestyle

fd2ed49

Adding example cell table, processed data, and visualization images for spatial-LDA preprocessing notebook.

410864f

pycodestyle :(

b219184

pycodestyle >:(

98c96fc

visualize.py issues

c99cf66

bcollica added 5 commits September 24, 2021 12:38

Test coverage and attempt to fix read the docs build failure

fe23301

Add visualization of adjacency graph to notebook.

d76e436

Add spatial_lda to setup.py

cd9139b

Attempt to fix readthedocs failure

d1226f9

import plot_adjacency_graph from spatial_lda visualization module

4a55492

bcollica added 2 commits September 24, 2021 13:42

revert rtd-requirements

0444543

Add spatial_lda and palettable to conf.py autodoc_mock_imports

7e22688

bcollica requested review from ngreenwald, alex-l-kong and ackagel September 24, 2021 22:23

ngreenwald requested changes Sep 26, 2021

ark/analysis/visualize.py (resolved)

ark/analysis/visualize.py (resolved)

ark/utils/spatial_lda_utils.py (outdated, resolved)

ark/utils/spatial_lda_utils.py (resolved)

ngreenwald reviewed Sep 26, 2021

bcollica added 10 commits September 27, 2021 08:26

Remove output files, check metric inputs, fix docstring.

f529ed0

Update docstrings and provide links to documentation for featurization and topic EDA.

c2899ac

Math implementation for RTD

46797de

Math implementation for RTD, round 2

f924274

Add missing colons after return type in spatial_lda_utils.py, and update mock imports in conf.py.

28c76b6

Update processing functions to track featurization method

2ffc7fb

Update topic eda keys in settings

5c49d1c

Update topic eda visualization

45009b8

Update notebook to reflect recent changes

5ab6dce

Update notebook to use markers instead of cluster for example dataset.

0459739

alex-l-kong reviewed Sep 28, 2021

ark/analysis/visualize.py (resolved)

requirements.txt (outdated, resolved)

setup.py (outdated, resolved)

bcollica and others added 4 commits September 28, 2021 12:47

Update requirements.txt

828986a

Co-authored-by: alex-l-kong <31424707+alex-l-kong@users.noreply.github.com>

Specify range for package version

3c48564

Specify range for package version

f2555bd

Merge remote-tracking branch 'origin/spatial_LDA_viz' into spatial_LDA_viz

395df74

bcollica requested a review from ngreenwald September 28, 2021 21:01

ngreenwald approved these changes Sep 28, 2021

bcollica merged commit 97acdad into spatial_LDA Sep 29, 2021

bcollica deleted the spatial_LDA_viz branch September 29, 2021 21:46
