Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update distance matrix to xarray type #200

Merged
merged 17 commits into from
Aug 29, 2020
Merged

Update distance matrix to xarray type #200

merged 17 commits into from
Aug 29, 2020

Conversation

alex-l-kong
Copy link
Contributor

What is the purpose of this PR?

Addresses and closes #191. The problem with our current distance matrix implementation is that it's not flexible. In other words, we have to assume that indices correspond to their respective cell labels. This is a limitation of numpy arrays for distance matrices. Using xarrays, we can assign more flexible indices.

How did you implement your changes

The bulk of the changes involve updating calc_dist_matrix and the indexing into distance matrices elsewhere in the spatial analysis code. We change the return type of calc_dist_matrix to xarray and add an indices argument to allow the passage of a set of indices for each fov which correspond to the cell labels. In this way, we do not have to worry about misordering of a distance matrix nor do we have to spend time to reindex the numpy array.

Indexing into a dist_matrix in xarray format will now use the .loc format.

Remaining issues

Specifically, the biggest issue is making sure our indexing is correct. There are many places where we use numpy functions on distance matrices and these might need to be updated for xarray usage/reindexing.

@alex-l-kong alex-l-kong self-assigned this Aug 25, 2020
@alex-l-kong
Copy link
Contributor Author

Not liking having to specify a dict of coordinates for calc_dist_matrix to work properly, that'll be the biggest point to look at for the upcoming code review.

Copy link
Contributor

@ackagel ackagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good first pass. Mostly suggesting some optimizations.

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/synthetic_spatial_datagen.py Outdated Show resolved Hide resolved
Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work. I think we can completely remove the dict of coords passed in; let me know if I missed something.

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils_test.py Show resolved Hide resolved
@alex-l-kong
Copy link
Contributor Author

alex-l-kong commented Aug 28, 2020

Remaining coverage issues due to lack of plot_utils and segmentation_utils testing. Also, @ngreenwald just saw your comment about changing the underlying FOVs of data to be strings, that'll be addressed on my next push.

Copy link
Contributor

@ackagel ackagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! Just one doc-string fix

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff, minor tweaks

ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils.py Outdated Show resolved Hide resolved
ark/utils/spatial_analysis_utils_test.py Outdated Show resolved Hide resolved
saeedseyyedi
saeedseyyedi previously approved these changes Aug 28, 2020
Copy link

@saeedseyyedi saeedseyyedi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks very good to me, good job!

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good! Is there any reason we need to keep the old version of the expression matrix and segmentation labels in the master branch?

@alex-l-kong
Copy link
Contributor Author

Everything looks good! Is there any reason we need to keep the old version of the expression matrix and segmentation labels in the master branch?

I'll remove those and re-request.

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

@ngreenwald ngreenwald merged commit 3c41410 into master Aug 29, 2020
@ngreenwald ngreenwald deleted the dist_mat_index branch August 29, 2020 02:03
alex-l-kong added a commit that referenced this pull request Jan 14, 2021
* Fix randomization error in synthetic spatial gen and begin updating type of distance matrix to xarray

* Update notebooks and spatial_analysis_utils functions to handle xarray data type for distance matrices

* Forgot to change spatial analysis script outside of Docker

* Further integration

* Fix closenum tests

* Address minor changes (aka other than dict passing)

* Fix centroid labelling with prop.label, which makes life hella easier

* Change Erin's data and adjust accordingly

* Pycodestyle and xarray import crap

* Last fixes

* Update expression matrix, calc_dist_matrix, and notebook to reflect string-only fov indexing

* Forgot to add new segmentation labels

* Change list to .values for indexing purposes

* Break up logic and update comments

* Fix summing error in closenum random

* Remove old data files
y2kbugger pushed a commit that referenced this pull request Jul 29, 2021
* Fix randomization error in synthetic spatial gen and begin updating type of distance matrix to xarray

* Update notebooks and spatial_analysis_utils functions to handle xarray data type for distance matrices

* Forgot to change spatial analysis script outside of Docker

* Further integration

* Fix closenum tests

* Address minor changes (aka other than dict passing)

* Fix centroid labelling with prop.label, which makes life hella easier

* Change Erin's data and adjust accordingly

* Pycodestyle and xarray import crap

* Last fixes

* Update expression matrix, calc_dist_matrix, and notebook to reflect string-only fov indexing

* Forgot to add new segmentation labels

* Change list to .values for indexing purposes

* Break up logic and update comments

* Fix summing error in closenum random

* Remove old data files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Distance_matrix assumes sequentially-labeled label map.
4 participants