Compatibility with VisiumHD #358
Solving modelling challenges for VisiumHD, NovaST, OpenST and other similar high-resolution technologies is a research project rather than a matter of tweaking models to work on larger data. Subcellular resolution of measurements requires rethinking what it means to analyse this data. I see two major ways forward:
If anyone is interested in collaborating to make this happen, please reach out to me, Omer and Oliver by email.
A practical way to analyse VisiumHD right now is to aggregate it at 8 um, 20 um or 50 um resolution (depending on data quality) and use cell2location with the tips discussed in #356.
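The aggregation step above can be sketched with plain NumPy/pandas: integer-divide each bin's grid coordinates by a scale factor and sum the counts of all bins that land in the same coarser block. The array names (`counts`, `row`, `col`) and the random toy data are assumptions for illustration, not cell2location API.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: a bins-x-genes count matrix at 2 um resolution,
# plus the integer grid coordinates of each bin (toy random data here).
rng = np.random.default_rng(0)
counts = rng.poisson(0.1, size=(1000, 5))
row = rng.integers(0, 100, size=1000)
col = rng.integers(0, 100, size=1000)

factor = 4  # 2 um -> 8 um: merge 4x4 blocks of 2 um bins
block = pd.MultiIndex.from_arrays([row // factor, col // factor])

# Sum the counts of every 2 um bin falling into the same 8 um block
agg = pd.DataFrame(counts).groupby(block).sum()
```

The same pattern works for 20 um or 50 um targets by changing `factor`; total counts are preserved, only the spatial resolution changes.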
Hey, thank you so much for your valuable input! Indeed, this was something I was considering: performing clustering to obtain 25-55 micron artificial "spots" and running deconvolution on those. Another thing I tested was CellTypist, which is commonly used for scRNA annotation; since the spots are so small now, it is no longer really a matter of spot deconvolution. By using the 8 micron resolution and assuming that each bin is a cell, CellTypist seemed to do okay in that mapping, but I'm still figuring out the biological meaning of that mapping (if there is one to begin with). Another option would be to use CellTypist to annotate the 2 micron bins, use those predictions to label the 8 micron bins by majority voting, and then use the 8 micron bins to label the 16 micron bins by majority voting once again. Then again, the goal of this new HD technology is to avoid deconvolution altogether, as that was one of the main problems it set out to solve, so I'm also interested in finding a solution to this for my research.
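The majority-voting propagation described above (2 um labels -> 8 um bins -> 16 um bins) can be sketched in a few lines. The grouping of child bins under parent bins and the label names are hypothetical; only the voting rule itself is shown.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label among child-bin predictions."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical: per-2um-bin CellTypist predictions, grouped by parent 8 um bin
children_of_8um = {
    "bin8_0": ["T cell", "T cell", "B cell"],
    "bin8_1": ["B cell"],
}
labels_8um = {k: majority_vote(v) for k, v in children_of_8um.items()}
# labels_8um -> {"bin8_0": "T cell", "bin8_1": "B cell"}
```

The same function is then applied again with 8 um bins as children of 16 um bins. Note that ties are resolved arbitrarily here; a real pipeline might want an explicit tie-breaking rule (e.g. by prediction confidence).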
Another issue with these probabilistic methods is that they depend heavily on the intersection between the genes present in the scRNA reference and those in the ST data. So even when I use the 16 micron resolution, which has over 150k bins for one sample I'm using, I'm essentially only using 1500-2000 genes to make the "predictions" on those 150k bins.
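A quick way to check how limiting this is for a given pair of datasets is to compute the gene overlap up front. The gene lists below are made up for illustration; in practice they would come from the `var_names` of the reference and ST AnnData objects.

```python
# Hypothetical gene sets from a scRNA reference and a VisiumHD sample
ref_genes = {"CD3E", "MS4A1", "EPCAM", "PTPRC", "COL1A1"}
st_genes = {"CD3E", "EPCAM", "COL1A1", "ACTA2"}

shared = sorted(ref_genes & st_genes)
frac_usable = len(shared) / len(st_genes)  # fraction of ST genes usable
```

If `frac_usable` is low, the deconvolution or label transfer is effectively operating on a small slice of the measured transcriptome, regardless of how many bins there are.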
Another possible improvement might come on PyTorch's end. I'm not sure how this would impact scvi-tools, but CUDA currently has a way to treat system memory as GPU memory. To my knowledge PyTorch does not implement this yet (pytorch/pytorch#104417 (comment)); see also https://developer.nvidia.com/blog/simplifying-gpu-application-development-with-heterogeneous-memory-management/.
Visium HD data can be segmented into cells using https://github.com/Teichlab/bin2cell, and the counts aggregated into segmented cells can then be analysed with cell2location to examine cell purity and to further decompose areas with spatially interlaced cell types.
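The core aggregation behind this workflow (independent of bin2cell's own API) is a scatter-add of bin counts into segmentation labels. This is a minimal sketch with made-up arrays; `cell_of_bin` stands in for whatever per-bin cell assignment the segmentation produces, with 0 meaning "no cell".

```python
import numpy as np

# Hypothetical: bins-x-genes counts and a segmentation label per bin (0 = background)
counts = np.array([[1, 0], [2, 1], [0, 3], [1, 1]])
cell_of_bin = np.array([1, 1, 2, 0])

n_labels = cell_of_bin.max() + 1
cell_counts = np.zeros((n_labels, counts.shape[1]), dtype=counts.dtype)
np.add.at(cell_counts, cell_of_bin, counts)  # sum the bins assigned to each label
cell_counts = cell_counts[1:]                # drop background (label 0)
```

The resulting cells-x-genes matrix is what would then be passed to cell2location in place of spot-level counts.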
I've been using bin2cell, but even then it produces quite a large number of cells (one of the datasets I've been testing still yields 350-400k cells out of the 9 million+ 2 micron bins after bin2cell processing), so I'm afraid cell2location would still throw memory errors. Unless we run cell2location only on subsets of the image, but that wouldn't scale well or be very reproducible. Right now I'm using bin2cell + CellTypist, which seems to be the most scalable approach for cell type inference with a scRNA reference.
@vitkl @Rafael-Silva-Oliveira Hello, I am interested in using bin2cell for cell segmentation followed by analysis with cell2location on HD data. I have followed the demo notebook for bin2cell available at https://nbviewer.org/github/Teichlab/bin2cell/blob/main/notebooks/demo.ipynb, but I am unsure of the next steps after completing the pipeline. Could you please direct me to any tutorials or reproducible code that demonstrate this workflow? I would prefer not to use CellTypist, as I already possess a fully annotated single-cell dataset. Thank you!
Following up with this: #356
Are there any improvements coming soon in cell2location to take in the full data for very large datasets? Splitting into batches discards spatial information that could be used for cell deconvolution, so taking in the full dataset would be important.
With the new Visium HD becoming more prominent, the number of spots or bins can range anywhere from 150k to 650k+, and when I test with cell2location I always get memory errors.
Processing in batches also doesn't work very well (the full dataset would have to be passed along).