Error running dyn.tl.moments() #72

Closed
ccruizm opened this issue Aug 22, 2020 · 4 comments

@ccruizm

ccruizm commented Aug 22, 2020

Good day!

Thanks for developing this great tool! I am trying to analyze some of my data but ran into a problem when running dyn.tl.moments(). I exported my Seurat object as a loom file, read it with scvelo, and merged the resulting AnnData object with the splicing information.

AnnData object with n_obs × n_vars = 7116 × 25639
    obs: 'AC3', 'ClusterID', 'ClusterName', 'OC2', 'OPC_shared4', 'OPC_variable5', 'RNA_snn_res_0_3', 'RNA_snn_res_0_7', 'cellcycle1', 'nCount_RNA', 'nFeature_RNA', 'orig_ident', 'seurat_clusters', 'sample_batch', 'Clusters', '_X', '_Y', 'initial_size_spliced', 'initial_size_unspliced', 'initial_size'
    var: 'Selected', 'vst_mean', 'vst_variable', 'vst_variance', 'vst_variance_expected', 'vst_variance_standardized', 'Accession', 'Chromosome', 'End', 'Start', 'Strand'
    obsm: 'mnn_cell_embeddings', 'pca_cell_embeddings', 'umap_cell_embeddings'
    varm: 'pca_feature_loadings'
    layers: 'norm_data', 'scale_data', 'ambiguous', 'matrix', 'spliced', 'unspliced'

Then I applied dyn.pp.recipe_monocle(adata):

AnnData object with n_obs × n_vars = 7116 × 25639
    obs: 'AC3', 'ClusterID', 'ClusterName', 'OC2', 'OPC_shared4', 'OPC_variable5', 'RNA_snn_res_0_7', 'cellcycle1', 'nCount_RNA', 'nFeature_RNA', 'orig_ident', 'seurat_clusters', 'sample_batch', 'Clusters', '_X', '_Y', 'initial_size_spliced', 'initial_size_unspliced', 'initial_size', 'nGenes', 'nCounts', 'pMito', 'use_for_pca', 'scale_data_Size_Factor', 'initial_scale_data_cell_size', 'spliced_Size_Factor', 'initial_spliced_cell_size', 'norm_data_Size_Factor', 'initial_norm_data_cell_size', 'unspliced_Size_Factor', 'initial_unspliced_cell_size', 'Size_Factor', 'initial_cell_size', 'ntr', 'cell_cycle_phase'
    var: 'Selected', 'vst_mean', 'vst_variable', 'vst_variance', 'vst_variance_expected', 'vst_variance_standardized', 'Accession', 'Chromosome', 'End', 'Start', 'Strand', 'pass_basic_filter', 'log_cv', 'score', 'log_m', 'use_for_pca', 'ntr'
    uns: 'velocyto_SVR', 'pp_norm_method', 'PCs', 'explained_variance_ratio_', 'pca_fit', 'feature_selection'
    obsm: 'mnn_cell_embeddings', 'pca_cell_embeddings', 'umap_cell_embeddings', 'X_pca', 'X', 'cell_cycle_scores'
    varm: 'pca_feature_loadings'
    layers: 'norm_data', 'scale_data', 'ambiguous', 'matrix', 'spliced', 'unspliced', 'X_scale_data', 'X_spliced', 'X_norm_data', 'X_unspliced'

But when I run either dyn.tl.moments(adata) first or dyn.tl.dynamics(adata) directly, I get the error below:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-47e88b9d5c17> in <module>
----> 1 dyn.tl.moments(adata)

/hpc/pmc_stunnenberg/cruiz/miniconda3/envs/python3/lib/python3.8/site-packages/dynamo/tools/moments.py in moments(adata, genes, group, use_gaussian_kernel, normalize, use_mnn, layers, n_pca_components, n_neighbors)
    148             layer_y = adata.layers[layer2].copy()
    149 
--> 150             layer_y_group = np.where([layer2 in x for x in
    151                                       [only_splicing, only_labeling, splicing_and_labeling]])[0][0]
    152             # don't calculate 2 moments among uu, ul, su, sl -

IndexError: index 0 is out of bounds for axis 0 with size 0

Do you know what the problem might be? Thanks in advance for your help!

@Xiaojieqiu
Collaborator

Xiaojieqiu commented Aug 22, 2020

Hey @ccruizm, thanks for your interest in dynamo! My immediate thought regarding this error is that you have a few non-standard layers (especially the norm_data and scale_data layers). To test this possibility, could you please first delete the 'norm_data', 'scale_data', 'ambiguous', and 'matrix' layers? You can do that with:

del adata.layers['norm_data'], adata.layers['scale_data'], adata.layers['ambiguous'], adata.layers['matrix']
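
A slightly more defensive variant of the deletion above, as a sketch only: it skips names that are not present instead of raising a KeyError. The helper name drop_layers is hypothetical, and the plain dict below merely stands in for adata.layers, which is assumed to support `in` and `del` like a dictionary.

```python
def drop_layers(layers, names):
    """Delete each named layer if present; return the names actually removed."""
    removed = []
    for name in names:
        if name in layers:
            del layers[name]
            removed.append(name)
    return removed


# Plain dict standing in for adata.layers (values would be matrices in practice).
layers = {
    name: None
    for name in ["norm_data", "scale_data", "ambiguous", "matrix", "spliced", "unspliced"]
}
removed = drop_layers(layers, ["norm_data", "scale_data", "ambiguous", "matrix"])
print(removed)
print(sorted(layers))
```

With a real object the call would presumably be drop_layers(adata.layers, [...]); calling it a second time is harmless because missing names are ignored.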

and then run

dyn.pp.recipe_monocle(adata)
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata)
dyn.vf.VectorField(adata, basis='umap')
dyn.vf.topography(adata, basis='umap')
......

We still need to figure out the exact issue, and we would like to be able to use the information in your other normalized or scaled layers. To reproduce your error, it would help if you could share a small sample of your dataset with me (around 100 cells) so that I can investigate more carefully and find an ideal solution. (I can probably also fake an AnnData object like yours.)

Btw, for keys in .obsm, the anndata convention is to start each key with X_. So you may need to rename those keys to 'X_mnn_cell_embeddings', 'X_pca_cell_embeddings', 'X_umap_cell_embeddings' instead of 'mnn_cell_embeddings', 'pca_cell_embeddings', 'umap_cell_embeddings'.
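
The renaming suggested above could be sketched as follows. The helper name add_x_prefix is hypothetical, and the plain dict stands in for adata.obsm, which is assumed to allow item assignment and deletion like a dictionary. Only the listed keys are touched, so an existing entry such as 'X_pca' is left alone.

```python
def add_x_prefix(obsm, keys):
    """Rename each listed key to 'X_<key>' in place; return the old-to-new name mapping."""
    renamed = {}
    for key in keys:
        if key in obsm and not key.startswith("X_"):
            new_key = "X_" + key
            obsm[new_key] = obsm[key]
            del obsm[key]
            renamed[key] = new_key
    return renamed


# Plain dict standing in for adata.obsm (values would be embedding arrays in practice).
obsm = {
    "mnn_cell_embeddings": None,
    "pca_cell_embeddings": None,
    "umap_cell_embeddings": None,
    "X_pca": None,
}
renamed = add_x_prefix(
    obsm, ["mnn_cell_embeddings", "pca_cell_embeddings", "umap_cell_embeddings"]
)
print(sorted(obsm))
```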

@Xiaojieqiu Xiaojieqiu self-assigned this Aug 22, 2020
@Xiaojieqiu Xiaojieqiu added the bug Something isn't working label Aug 22, 2020
@Xiaojieqiu
Collaborator

Xiaojieqiu commented Aug 22, 2020

I just tested and found that the issue is indeed caused by the 'norm_data' and 'scale_data' layers. dyn.tl.moments assumes either conventional scRNA-seq data (only_splicing) with unspliced and spliced layers, or labeling data with either new and total layers (only_labeling) or uu, ul, su, sl layers (splicing_and_labeling; these correspond to unspliced unlabeled, unspliced labeled, spliced unlabeled and spliced labeled). It cannot find the extra layers in any of those groups, which leads to the indexing error above:

--> 150             layer_y_group = np.where([layer2 in x for x in
    151                                       [only_splicing, only_labeling, splicing_and_labeling]])[0][0]
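
The failure mode can be reproduced in isolation. This is a minimal sketch, not dynamo's actual code: the three lists are stand-ins built from the layer names mentioned in this thread, and layer_group_index is a hypothetical helper that mirrors the np.where(...)[0][0] pattern from the traceback.

```python
import numpy as np

# Stand-in layer groups (assumption: the real lists in dynamo may differ).
only_splicing = ["spliced", "unspliced"]
only_labeling = ["new", "total"]
splicing_and_labeling = ["uu", "ul", "su", "sl"]
groups = [only_splicing, only_labeling, splicing_and_labeling]


def layer_group_index(layer):
    """Index of the first group containing `layer`, using the pattern from the traceback."""
    # np.where on an all-False list yields an empty index array,
    # so the trailing [0] raises IndexError for an unknown layer.
    return np.where([layer in group for group in groups])[0][0]


print(layer_group_index("spliced"))  # a known layer resolves to its group

try:
    layer_group_index("norm_data")  # belongs to no group
except IndexError as err:
    print("IndexError:", err)
```

A layer that belongs to none of the groups produces an empty match array, and indexing its first element is what raises the IndexError.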

One quick solution (better than the one above) is to set `layers = ['X_spliced', 'X_unspliced']` when you run moments. Something like the following:

dyn.pp.recipe_monocle(adata)
dyn.tl.moments(adata, layers=['X_unspliced', 'X_spliced'])
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata)
dyn.vf.VectorField(adata, basis='umap')
dyn.vf.topography(adata, basis='umap')
dyn.pl.streamline_plot(adata, basis='umap')
......

@Xiaojieqiu
Collaborator

Xiaojieqiu commented Aug 22, 2020

I think my recent commit should fix this issue. Dynamo now automatically checks your layer information and avoids normalizing layers that are not conventional, so only X_unspliced and X_spliced will be added to layers after recipe_monocle. This also fixes the issue in tl.moments. Feel free to pull the latest changes and install the newest version to try it out. Please let me know how it goes.

P.S. You can now run something like the following without worrying about those unconventional layers:

dyn.pp.recipe_monocle(adata)
dyn.tl.moments(adata)
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata)
dyn.vf.VectorField(adata, basis='umap')
dyn.vf.topography(adata, basis='umap')
dyn.pl.streamline_plot(adata, basis='umap')
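
The layer check described in this comment can be pictured roughly as below. This is a sketch under assumptions, not the code from the commit: the CONVENTIONAL_LAYERS set is assembled from the layer names mentioned earlier in this thread, and conventional_layers is a hypothetical helper name.

```python
# Assumption: names taken from this thread, not from dynamo's source.
CONVENTIONAL_LAYERS = {"spliced", "unspliced", "new", "total", "uu", "ul", "su", "sl"}


def conventional_layers(layer_names):
    """Keep only layer names that belong to a conventional splicing or labeling layout."""
    return [name for name in layer_names if name in CONVENTIONAL_LAYERS]


# Layer names from the AnnData object reported at the top of the issue.
reported = ["norm_data", "scale_data", "ambiguous", "matrix", "spliced", "unspliced"]
kept = conventional_layers(reported)
print(kept)
```

Only the layers that survive such a filter would then be normalized into X_-prefixed layers, which matches the behavior reported above.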

@ccruizm
Author

ccruizm commented Aug 22, 2020

Thanks so much for the quick reply and for fixing the issue! Now it works smoothly!

@ccruizm ccruizm closed this as completed Aug 22, 2020
Xiaojieqiu pushed a commit that referenced this issue Nov 6, 2020