subsetting / subclustering, use raw #826

bobermayer · 2019-09-10T12:30:40Z

When I select a subset of cells using ad_sub=ad[ad.obs['louvain']=='subcluster_of_interest',:] and then re-apply the preprocessing routines, only the genes in ad.X are used, i.e. those that are variable over the entire dataset. Genes that are variable only within the subcluster, and that might be informative for its substructure, are missed if their variance doesn't pass the cutoff when evaluated over the entire dataset. Basically, the set of variable genes can only shrink by subsetting.
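To make the concern concrete, here is a toy sketch in plain Python. It ranks genes by raw variance, which is only a stand-in for what sc.pp.highly_variable_genes actually computes; the gene names, values, and the `top_variable` helper are all hypothetical and exist only for illustration.

```python
from statistics import pvariance

# Toy expression table: six cells, three genes (all values made up).
labels = ["a", "a", "a", "b", "b", "b"]
expr = {
    "geneA": [0.0, 0.0, 0.0, 5.0, 5.0, 5.0],  # separates a from b, flat within b
    "geneB": [1.0, 1.0, 1.0, 1.0, 2.0, 0.0],  # varies only within b
    "geneC": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # constant everywhere
}


def top_variable(expr, cells, genes, n_top):
    """Return the n_top genes with the largest variance over the given cells."""
    scored = {g: pvariance([expr[g][i] for i in cells]) for g in genes}
    return sorted(scored, key=scored.get, reverse=True)[:n_top]


all_cells = list(range(len(labels)))
sub_cells = [i for i, lab in enumerate(labels) if lab == "b"]

# Selection over the whole dataset (what ends up in ad.X after filtering).
global_hvg = top_variable(expr, all_cells, list(expr), n_top=1)

# Re-selecting within the subcluster, limited to the genes kept above.
restricted = top_variable(expr, sub_cells, global_hvg, n_top=1)

# Re-selecting within the subcluster from the full gene set.
from_all_genes = top_variable(expr, sub_cells, list(expr), n_top=1)

print(global_hvg, restricted, from_all_genes)
```

In this toy case the gene that varies within cluster b is only found when the selection starts from the full gene set, which is the situation described above.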

I'd propose to either use

tmp=ad[ad.obs['louvain']=='subcluster_of_interest',:]
ad_sub=sc.AnnData(tmp.raw.X,obs=tmp.obs,var=tmp.raw.var)

to "reset" the .X matrix (maybe there's a better way?), or to make sc.pp.highly_variable_genes work on ad.raw.X.

scanpy==1.4.4 anndata==0.6.22.post1 umap==0.3.10 numpy==1.16.4 scipy==1.2.1 pandas==0.25.1 scikit-learn==0.20.3 statsmodels==0.10.1 python-igraph==0.7.1 louvain==0.6.1


chansigit · 2020-03-06T06:28:11Z

I have the same question

ajynair · 2021-02-25T23:42:11Z

+1

li-xuyang28 · 2021-06-06T02:35:01Z

+1

ivirshup added the Enhancement ✨ label Sep 11, 2019

ivirshup mentioned this issue Sep 11, 2019

Allow selection of layer/ raw for methods that use X #828

Open
