strange spatial_LDA -> spatial_cluster results #28

Closed
yerahko opened this issue Jun 11, 2022 · 5 comments
@yerahko

yerahko commented Jun 11, 2022

Hello again!

When clustering (spatial_cluster) on spatial_LDA results, I am getting strange results, as below.

I always get reasonable spatial_cluster results when training on a single ROI. But with as few as 2 ROIs, I start to get this artifactual-seeming result, visible as clusters forming vertical stripes in one or more of the ROIs.

I have tried both 'knn' and 'radius' as spatial_LDA methods with varying values of motifs, knn, and radius.
The clustering method was always kmeans. (leiden and phenograph always gave me 99 clusters, even with resolution set to 0.1, so I am not sure whether it is spatial_LDA or the clustering itself that is contributing to this.)

Conditions which promote the appearance of this "artifact":

  • more than one ROI trained together
  • radius larger than 30
  • smaller/more numerous cells

Example of a "sensible" spatial clustering result:

image

When one additional ROI is trained together with it, using all the same spatial_LDA and spatial_cluster parameters, that ROI becomes:

image

Some real structure is retained in the lower left corner, while the right side no longer makes sense...
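
One rough way to put a number on the "vertical stripes" impression is to ask, per ROI, how much of the variance in the X coordinate is explained by the cluster label alone. The sketch below is only an illustration under assumptions: the function name `x_variance_explained` is hypothetical, the column names (`Unique_ID`, `X`, `spatial_kmeans`) are taken from the adata.obs shown in this thread, and the toy data is made up. Real tissue structure can also line up with X, so the score is best compared against the same ROI trained alone, or against the same score computed for Y.

```python
import numpy as np
import pandas as pd


def x_variance_explained(obs, cluster_col="spatial_kmeans",
                         image_col="Unique_ID", x_col="X"):
    """Per image, fraction of the variance in X explained by cluster label.

    Values near 1 mean the clusters partition the X axis (vertical stripes);
    values near 0 mean the cluster labels carry no information about X.
    """
    scores = {}
    for image_id, group in obs.groupby(image_col, observed=True):
        x = group[x_col].astype(float)
        total = ((x - x.mean()) ** 2).sum()
        if total == 0:
            scores[image_id] = np.nan  # all cells share one X value
            continue
        # Replace each cell's X by the mean X of the cluster it belongs to
        cluster_mean = x.groupby(group[cluster_col], observed=True).transform("mean")
        between = ((cluster_mean - x.mean()) ** 2).sum()
        scores[image_id] = between / total
    return pd.Series(scores, name="x_variance_explained")


# Toy data (made up): one image whose clusters are bands along X,
# and one whose clusters are interleaved along X.
x = np.arange(100)
toy = pd.concat([
    pd.DataFrame({"Unique_ID": "striped", "X": x, "spatial_kmeans": x // 25}),
    pd.DataFrame({"Unique_ID": "mixed", "X": x, "spatial_kmeans": x % 4}),
], ignore_index=True)

scores = x_variance_explained(toy)
print(scores)  # "striped" scores close to 1, "mixed" close to 0
```

On real data this would be called as `x_variance_explained(adata.obs)` after `spatial_cluster` has run.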

Any idea what could be causing this, or which parameters I could try to mitigate it?

Thank you again!!

@ajitjohnson
Collaborator

ajitjohnson commented Jun 11, 2022

@yerahko can you post the commands you ran and also share a snippet of what your adata.obs looks like? Thank you.

@yerahko
Author

yerahko commented Jun 13, 2022

Hi @ajitjohnson, here is an example of code and what adata.obs looks like.

Of the parameters below, I am mostly varying radius, knn, and lda_method.

Thank you!

import matplotlib.pyplot as plt
import scimap as sm

# adata is an AnnData object loaded beforehand (not shown)

# Parameters varied between runs
num_motifs = 10
radius = 20
knn = 10
lda_method = 'radius'  # 'knn' or 'radius'
cluster_method = 'kmeans'

# Fit spatial LDA on all ROIs together; each Unique_ID value is one image
adata = sm.tl.spatial_lda(adata, num_motifs=num_motifs, radius=radius, knn=knn, method=lda_method,
                          x_coordinate='X', y_coordinate='Y', phenotype='cellsimple', imageid='Unique_ID',
                          random_state=0)

# Cluster cells on the spatial LDA output
adata = sm.tl.spatial_cluster(adata, random_state=0, df_name='spatial_lda',
                              method=cluster_method, k=10)

def voronoi_plots(unique_ID, color_by='spatial_kmeans'):
    # Subset manually: passing imageid/subset to sm.pl.voronoi was breaking
    selected = adata[adata.obs['Unique_ID'].isin([unique_ID]), :].copy()
    sm.pl.voronoi(adata=selected,
                  color_by=color_by, x_coordinate='X', y_coordinate='Y')
    plt.show()

# Plot phenotypes and spatial clusters for every ROI
for i in adata.obs['Unique_ID'].unique():
    print(i)
    voronoi_plots(color_by='cellsimple', unique_ID=i)
    voronoi_plots(color_by='spatial_%s' % cluster_method, unique_ID=i)

adata.obs:

X         Y           Area      cellsimple  Patient_ID  ROI_ID  Unique_ID  leiden  spatial_kmeans
1.400000  294.500000  0.047619  B           PT5         ROI_9   PT5.ROI_9  1       2
1.428571  329.380952  0.057143  B           PT5         ROI_9   PT5.ROI_9  1       2
1.764706  345.529412  0.180952  Unknown     PT5         ROI_9   PT5.ROI_9  22      2
2.300000  634.600000  0.047619  B           PT5         ROI_9   PT5.ROI_9  1       4
2.096154  219.807692  0.352381  Unknown     PT5         ROI_9   PT5.ROI_9  1       2

79308 rows × 9 columns

@ajitjohnson
Collaborator

Hi @yerahko, it all looks good to me. I have not run into this issue before. I just want to confirm that your Unique_ID is unique to each image/ROI (it looks like it is, but I want to be sure). For background: in a dataset with multiple images, each unique category within Unique_ID is processed independently. So if you have multiple ROIs within a single image, each ROI should be treated as an individual image for this purpose.

If that is what you did, I am not sure what else is going on, and I might need some example data from you to debug it.
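
A quick way to check the point above is to confirm that no Unique_ID value covers more than one patient/ROI combination. This is a minimal pandas sketch, not part of scimap: the helper name `ambiguous_image_ids` is hypothetical, the column names come from the adata.obs posted earlier in this thread, and the toy table is made up for illustration.

```python
import pandas as pd


def ambiguous_image_ids(obs, image_col="Unique_ID",
                        part_cols=("Patient_ID", "ROI_ID")):
    """Return image IDs that cover more than one patient/ROI combination."""
    # One row per distinct (image ID, patient, ROI) triple
    combos = obs[[image_col, *part_cols]].drop_duplicates()
    # Count how many distinct patient/ROI combinations each image ID maps to
    n_combos = combos.groupby(image_col, observed=True).size()
    return n_combos[n_combos > 1].index.tolist()


# Toy table (made up): "PT6" is used for two different ROIs,
# so cells from both ROIs would be treated as a single image.
toy = pd.DataFrame({
    "Unique_ID":  ["PT5.ROI_9", "PT5.ROI_9", "PT6", "PT6"],
    "Patient_ID": ["PT5", "PT5", "PT6", "PT6"],
    "ROI_ID":     ["ROI_9", "ROI_9", "ROI_1", "ROI_2"],
})

bad_ids = ambiguous_image_ids(toy)
print(bad_ids)  # ['PT6']
```

On real data, `ambiguous_image_ids(adata.obs)` returning an empty list would indicate that every Unique_ID corresponds to exactly one patient/ROI pair.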

@yerahko
Author

yerahko commented Jun 14, 2022

Hi @ajitjohnson yup, that is how I'm using Unique_ID. Each value corresponds to a single image.

Instead of spatial_LDA, I ran spatial_count -> spatial_cluster, and that workflow ran successfully without similar artifacts, so for the time being we will work with the spatial_count results.

Thank you for your work on creating and maintaining this package—it's been a great tool for us!

@ajitjohnson
Collaborator

Weird. If you would like me to debug it, feel free to send me a subset of the data later on. Glad you are enjoying it :)
