Distance_matrix assumes sequentially-labeled label map. #191

Closed
ngreenwald opened this issue Aug 18, 2020 · 2 comments · Fixed by #200

Comments

@ngreenwald
Member

Describe the bug
The current calc_dist_mat constructs a distance matrix between all pairs of cells. The 1st cell is placed in the 0th row, and so on, up to the nth cell in row n-1. This produces an n x n dist_mat.

The functions that index into the dist_mat to get distances between cells, for example compute_closenum, use the value of the label to index into the appropriate row. However, this assumes that the cells are sequentially labeled with no missing values. This is not always the case.
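A minimal sketch of the failure mode, assuming (hypothetically) that the lookup maps label k to row k-1. The helper `lookup_by_label` and the toy centroids are illustrative only and are not taken from the project's code:

```python
import numpy as np

# Hypothetical example: three cells whose labels skip the value 2.
labels = np.array([1, 3, 4])
centroids = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Pairwise Euclidean distances. Row i belongs to the i-th cell, not to label i.
diff = centroids[:, None, :] - centroids[None, :, :]
dist_mat = np.sqrt((diff ** 2).sum(axis=-1))

def lookup_by_label(dist_mat, label_a, label_b):
    """Assumed indexing scheme: label k lives in row/column k - 1."""
    return dist_mat[label_a - 1, label_b - 1]

# The true distance between labels 1 and 3 sits at positions 0 and 1.
true_dist = dist_mat[0, 1]

# Indexing by label value silently returns the distance to label 4 instead.
wrong_dist = lookup_by_label(dist_mat, 1, 3)

# A label larger than n falls outside the matrix entirely.
try:
    lookup_by_label(dist_mat, 1, 4)
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
```

The silent case is the more dangerous one: no error is raised, and the returned distance belongs to a different cell.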

Expected behavior
The benefit of doing it this way is that the distance matrix only has to be constructed once. Then, if someone is doing analyses on a subset of the cells, for example only immune cells, the same total distance matrix can be used, since the indexing still works.

One option would be to sequentially relabel all cells prior to analysis to avoid this error. The other option would be to require people to regenerate a new distance matrix if they want to analyze a subset of their cells.
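For reference, the first option could look roughly like the sketch below. The function name `relabel_sequential_sketch` is hypothetical, and it assumes 0 marks background. scikit-image also ships `skimage.segmentation.relabel_sequential`, which may be a better fit in practice.

```python
import numpy as np

def relabel_sequential_sketch(label_map):
    """Map the nonzero labels onto 1..n, keeping 0 as background.

    Returns the relabeled map and the original labels, so that
    old_labels[new_label - 1] recovers the original label.
    """
    old_labels = np.unique(label_map[label_map > 0])
    # Lookup table from old label value to new sequential label.
    lookup = np.zeros(label_map.max() + 1, dtype=label_map.dtype)
    lookup[old_labels] = np.arange(1, old_labels.size + 1)
    return lookup[label_map], old_labels

# Toy label map with non-sequential labels 2, 5, and 9.
label_map = np.array([[0, 2, 2],
                      [5, 0, 9],
                      [5, 9, 9]])

relabeled, old_labels = relabel_sequential_sketch(label_map)
```

Returning `old_labels` keeps a way back to the user's original labels, which matters if they need to be preserved downstream.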

I know @alex-l-kong ran into some annoyances with constructing a fake dist_matrix because of how it expected cells to be labeled. Is this related, or would switching this not have impacted anything? Tagging @vacuousplanet just to stand out above the sea of other issues I just added.

@ngreenwald ngreenwald added the bug Something isn't working label Aug 18, 2020
@alex-l-kong
Contributor

Yeah this issue is similar to the one I had before. What ends up happening is that the call to regionprops in generate_dist_matrix ends up ordering the centroids in increasing order by x-coords (in the case of ties, ascending y-coord). Unfortunately, this means we end up losing the desired order of cell labels.

As you mention, relabeling the cells is an option, and IMO the easiest, although we could run into problems if the user insists on cells having specific labels. I think regenerating a new distance matrix every time would be a bit cumbersome.

@ngreenwald
Member Author

Okay, we're going to modify this so that the xarray label of the coordinate is the cell_label. Then we can index using xarray.loc. This means that even as the distance matrix gets subset for different functions, the labels will remain attached to the correct row.
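A rough sketch of that idea, not the actual change that was merged. The dimension names `cell_a` and `cell_b` and the toy data are assumptions made for illustration:

```python
import numpy as np
import xarray as xr

# Hypothetical cells with non-sequential labels.
labels = np.array([2, 5, 9])
centroids = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

diff = centroids[:, None, :] - centroids[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Attach the cell labels as coordinates on both axes.
dist_xr = xr.DataArray(dist, coords=[labels, labels], dims=["cell_a", "cell_b"])

# Label-based lookup: no assumption about row positions.
d_2_9 = float(dist_xr.loc[2, 9])

# Subsetting keeps the labels attached to the surviving rows and columns.
subset = dist_xr.loc[[2, 9], [2, 9]]
d_2_9_subset = float(subset.loc[2, 9])
```

Because `.loc` selects by coordinate value, the same lookup works on the full matrix and on any subset of it.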
