Requested feature: handle inputs from Seurat #262

Closed
TomKellyGenetics opened this issue Nov 26, 2019 · 27 comments
Labels
bug Something isn't working

@TomKellyGenetics

Proposal: implement a function to extract a cell_data_set from a Seurat object. I can submit a PR if needed.

TomKellyGenetics added the bug label Nov 26, 2019
@diegoalexespi

Hi @TomKellyGenetics, I've been able to extract a monocle3 CDS from a Seurat v3 object. I'd be happy to share the function here or as part of a pull request; I'm not sure which is best. Granted, it was written for moving an SCT- and Harmony-integrated dataset from Seurat to monocle3, so it may not work with every input Seurat object.

@TomKellyGenetics
Author

@d93espinoza That would be great. I've managed to extract the relevant fields of a Seurat object but haven't made it into a function (thinking about the best way to do this). If you've already done it, that would be great, I hope to see it as a PR.

@summerrfair

@d93espinoza Hello! I am hoping to perform pseudotime analysis on an integrated Seurat (v3) object in Monocle. It sounds like you have performed similar analyses - I would greatly appreciate your help and any code you may be willing to share. Thank you so much!

@TomKellyGenetics
Author

TomKellyGenetics commented Jan 15, 2020

@summerrfair Based on the documentation here, this is a working example on a Seurat object. Note that monocle takes the raw counts (merged or integrated objects are okay) and requires the gene names and barcodes to match the metadata.

library("org.Hs.eg.db")
gene_symbol <- as.list(org.Hs.egSYMBOL)
library("Seurat")
library("monocle3")
corrected_data <- GetAssayData(sample.integrated, assay = "integrated", slot = "data")
raw_count_data <- GetAssayData(sample.integrated, assay = "RNA", slot = "counts")
class(raw_count_data)

cells_info <- sample.integrated@meta.data

gene_name <- gene_symbol[rownames(raw_count_data)]  
gene_name <- sapply(gene_name, function(x) x[[1]][1])

#preparing cds
gene_name <- ifelse(is.na(gene_name), names(gene_name), gene_name)
gene_short_name <- gene_name
gene_id <- rownames(raw_count_data)


genes_info <- cbind(gene_id, gene_short_name)
genes_info <- as.data.frame(genes_info)
rownames(genes_info) <- rownames(raw_count_data)

cds <- new_cell_data_set(expression_data = raw_count_data,
                         cell_metadata = cells_info,
                         gene_metadata = genes_info)
cds
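
As noted above, the gene names and barcodes have to match the metadata: the row names of the cell metadata should equal the column names of the count matrix, and the row names of the gene metadata should equal its row names. A minimal base-R sketch of a check to run before new_cell_data_set follows. The helper name check_cds_inputs and the toy objects are assumptions made for illustration; they are not part of monocle3 or Seurat.

```r
# Hypothetical helper (not part of monocle3 or Seurat): report mismatches
# between the count matrix and the two metadata tables.
check_cds_inputs <- function(expression_data, cell_metadata, gene_metadata) {
  problems <- character(0)
  if (!identical(colnames(expression_data), rownames(cell_metadata))) {
    problems <- c(problems, "cell_metadata row names do not match matrix column names")
  }
  if (!identical(rownames(expression_data), rownames(gene_metadata))) {
    problems <- c(problems, "gene_metadata row names do not match matrix row names")
  }
  if (!"gene_short_name" %in% colnames(gene_metadata)) {
    problems <- c(problems, "gene_metadata has no gene_short_name column")
  }
  problems
}

# Toy data standing in for raw_count_data, cells_info and genes_info.
toy_counts <- matrix(1:6, nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"), c("cellA", "cellB")))
toy_cells <- data.frame(batch = c("x", "y"), row.names = c("cellA", "cellB"))
toy_genes <- data.frame(gene_short_name = c("g1", "g2", "g3"),
                        row.names = c("g1", "g2", "g3"))

# An empty character vector means the inputs line up.
check_cds_inputs(toy_counts, toy_cells, toy_genes)
```

With the real objects, the call would be check_cds_inputs(raw_count_data, cells_info, genes_info); any message returned points at the table that needs reordering or renaming.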

You can also ensure that the clusters and dimension reductions match those computed in Seurat like this:

#replace monocle clusters with seurat
cds@clusters$UMAP$partitions <- sample.integrated@meta.data$seurat_clusters
names(cds@clusters$UMAP$partitions) <- rownames(sample.integrated@meta.data)
cds@clusters$UMAP$clusters <- sample.integrated@meta.data$seurat_clusters
names(cds@clusters$UMAP$clusters) <- rownames(sample.integrated@meta.data)

#replace monocle dimensions with seurat
cds@reducedDims$TSNE <-  sample.integrated@reductions$tsne@cell.embeddings
cds@reducedDims$UMAP <-  sample.integrated@reductions$umap@cell.embeddings

It wouldn't be too difficult to wrap either of these into a function, but I'm not sure of the best way to do this.
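
One possible starting point for such a function is a small helper that builds the named cluster factor and reorders it to the cell order of the CDS, so the labels cannot end up attached to the wrong barcodes. This is only a sketch in base R; the name named_clusters and the toy metadata are hypothetical.

```r
# Hypothetical helper: build a factor of cluster labels named by cell barcode,
# reordered to match a given vector of cell names.
named_clusters <- function(meta_data, cluster_column, cell_names) {
  stopifnot(cluster_column %in% colnames(meta_data),
            all(cell_names %in% rownames(meta_data)))
  clusters <- as.factor(meta_data[cell_names, cluster_column])
  names(clusters) <- cell_names
  clusters
}

# Toy metadata standing in for sample.integrated@meta.data.
toy_meta <- data.frame(seurat_clusters = factor(c("0", "1", "0")),
                       row.names = c("cellA", "cellB", "cellC"))
toy_clusters <- named_clusters(toy_meta, "seurat_clusters",
                               c("cellC", "cellA", "cellB"))

# With the real objects (names assumed from the example above):
# cds@clusters$UMAP$clusters <- named_clusters(sample.integrated@meta.data,
#                                              "seurat_clusters", colnames(cds))
```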

@summerrfair

@TomKellyGenetics

Many thanks for sharing this. Very helpful! I will try to implement and will let you know how it goes.

@diegoalexespi

diegoalexespi commented Feb 24, 2020

Here's my try (so far) at the function, sorry for the delay and thanks @TomKellyGenetics for sharing your code!

ToMonocle3 <- function(seurat_object,
                       scale_all = FALSE,
                       assay = "SCT",
                       reduction_for_projection = "PCA",
                       UMAP_cluster_slot = NULL){
  
  if(scale_all){
    message("Getting residuals for all Seurat genes in chosen assay slot and placing in scale.data")
    seurat_genes <- rownames(seurat_object[[assay]])
    remaining_genes <- setdiff(seurat_genes, rownames(seurat_object[[assay]]@scale.data))
    if(assay == "SCT"){
      seurat_object <- Seurat::GetResidual(seurat_object, features = remaining_genes, assay = assay, umi.assay = "RNA")
    } else {
      seurat_object <- Seurat::ScaleData(seurat_object, features = rownames(seurat_object[[assay]]))
    }
  }
  
  #We prep the seurat object by creating gene loadings for ALL genes in the Seurat scale.data slot. This is done to allow downstream monocle3 functions on gene_modules to work appropriately.
  message("Projecting gene loadings for all Seurat genes in scale.data slot")
  seurat_object <- Seurat::ProjectDim(seurat_object, reduction = reduction_for_projection, assay = assay)
  
  ##################
  
  message("Initializing CDS object")
  
  #Extract the raw counts from the chosen Seurat assay (slot = "counts", not log-transformed values)
  expression_matrix <- Seurat::GetAssayData(seurat_object, assay = assay, slot = "counts")
  #Extract Seurat meta_data
  meta_data <- seurat_object@meta.data
  #Extract gene names from Seurat object SCT slot to make CDS
  seurat_genes <- data.frame(gene_short_name = rownames(seurat_object[[assay]]),
                             row.names = rownames(seurat_object[[assay]]))
  new_cds <- monocle3::new_cell_data_set(expression_data = expression_matrix, cell_metadata = meta_data, gene_metadata = seurat_genes)
  
  ##################
  
  message("Making an SCE object from the Seurat object to facilitate transfer of information from SCE to CDS")
  sce <- as.SingleCellExperiment(seurat_object, assay = assay)
  message("Loading in all Seurat reductions (PCA, HARMONY, UMAP, etc.) into CDS")
  SingleCellExperiment::reducedDims(new_cds) <- SingleCellExperiment::reducedDims(sce)
  message("Loading in specified Seurat assay into CDS")
  SummarizedExperiment::assays(new_cds) <- SummarizedExperiment::assays(sce)
  message("Loading in Seurat gene names into CDS")
  SummarizedExperiment::rowData(new_cds) <- SummarizedExperiment::rowData(sce)
  SummarizedExperiment::rowData(new_cds)$gene_short_name <-  row.names(new_cds)
  message("Loading in Seurat gene loadings into CDS")
  new_cds@preprocess_aux$gene_loadings <- seurat_object@reductions[[reduction_for_projection]]@feature.loadings.projected
  
  ##################
  
  message("Get user specified selected clusters (or active idents) from Seurat and load into CDS")
  if(is.null(UMAP_cluster_slot)){
    list_cluster <- Idents(seurat_object)
  } else {
    Idents(seurat_object) <- UMAP_cluster_slot
    list_cluster <- Idents(seurat_object)
  }
  new_cds@clusters[["UMAP"]]$clusters <- list_cluster
  #The next two commands are run in order to allow "order_cells" to be run in monocle3
  rownames(new_cds@principal_graph_aux[['UMAP']]$dp_mst) <- NULL
  colnames(SingleCellExperiment::reducedDims(new_cds)[["UMAP"]]) <- NULL
  
  ##################
  
  message("Setting all cells as belonging to one partition (multiple partitions not supported yet)")
  recreate_partition <- c(rep(1, length(new_cds@colData@rownames)))
  names(recreate_partition) <- new_cds@colData@rownames
  recreate_partition <- as.factor(recreate_partition)
  new_cds@clusters[["UMAP"]]$partitions <- recreate_partition
  
  ##################
  message("Done")
  new_cds
}

@Vidya-Acadgild

@TomKellyGenetics
Thank you for sharing the code for using Monocle3 with a Seurat object. I am trying to use your script on my Seurat integrated object, but I get the error message: could not find function "new_cell_data_set". It would be helpful to know how to resolve this. Thanks!

@TomKellyGenetics
Author

@Vidya-Acadgild Have you run library("monocle3") to import this function?

@Vidya-Acadgild

@TomKellyGenetics
Thanks for the message. I re-ran the installation steps and was able to import my Seurat object and create a cds using the function new_cell_data_set(). For further analysis, I followed the steps at https://cole-trapnell-lab.github.io/monocle3/docs/clustering/
But I am stuck on this error message when making the dot plot with the function plot_genes_by_group(): "Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' must be atomic". I am not sure how to resolve this and would greatly appreciate any help. Thank you!

@TomKellyGenetics
Author

@Vidya-Acadgild Sorry, it's unclear what issue you are having here. I suggest that you follow the steps in the vignette in order. Please note that this code is a workaround and is not supported by the Seurat or monocle3 developers. If you are still having trouble, I suggest you open a new issue with more details on what you've tried already and give a minimal reproducible example. You can do this in R with dput and reprex.
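
As a rough illustration of the dput suggestion: dput prints R code that recreates an object, so a small piece of the data that triggers the error can be pasted straight into an issue. The data frame below is made-up toy data, not taken from this thread.

```r
# A tiny stand-in for the object that triggers the error (hypothetical data).
small_example <- data.frame(cell = c("cellA", "cellB"),
                            cluster = c(0L, 1L))

# dput() prints R code that recreates the object; paste that text into the issue.
dput(small_example)

# Anyone reading the issue can evaluate the pasted text to get the object back.
recreated <- eval(parse(text = capture.output(dput(small_example))))
```

For a real report, subsetting to a handful of cells and genes first keeps the dput output short enough to read.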

@Vidya-Acadgild

@TomKellyGenetics
Yes, I understand. Your code was very helpful for importing the Seurat object into Monocle3, as there was no straightforward solution available on the developers' website. Sure, I will open a new issue to discuss the follow-up errors after generating the cds with your code. Thank you!

@Vidya-Acadgild

@TomKellyGenetics
A quick clarification: I imported the Seurat object (which had already undergone clustering and cluster annotation) into monocle3, following your code. Do I need to redo the preprocessing and dimensionality reduction as per the Monocle3 vignette? I am not sure I understand this correctly. My apologies if I am missing something here. Thanks a lot for your time.

@Wincky111

Wincky111 commented Jun 5, 2020

You can also ensure that the clusters and dimension reductions match those computed in Seurat like this:
#replace monocle clusters with seurat
cds@clusters$UMAP$partitions <- sample.integrated@meta.data$seurat_clusters
names(cds@clusters$UMAP$partitions) <- rownames(sample.integrated@meta.data)
cds@clusters$UMAP$clusters <- sample.integrated@meta.data$seurat_clusters
names(cds@clusters$UMAP$clusters) <- rownames(sample.integrated@meta.data)

It wouldn't be too difficult to wrap either of these into a function but not sure on the best way to do this.

@TomKellyGenetics Hi Tom, thank you for your code, it is helpful.
But this only replaces the Monocle clusters with the Seurat clusters. Do you know how to replace the reduce_dimension data in Monocle as well? I want the UMAP in Monocle to be consistent with the one from Seurat.
I tried
cds@reducedDims@ListDAta[["UMAP']] <- Data.integrated@reductions[["umap"]]@cell.embeddings
cds@reducedDims[["UMAP"]] = as.matrix(Data.integrated@reductions[["umap"]]@cell.embeddings)
reducedDims(cds) = as.matrix(molar[["pca"]]@cell.embeddings)

None of these worked, which really puzzles me. Thanks.

@TomKellyGenetics
Author

TomKellyGenetics commented Jun 8, 2020

Yes you can do that as follows.

cds@reducedDims$UMAP <-  seurat_object@reductions$umap@cell.embeddings

Note: you need to have run UMAP in monocle3 first (reduce_dimension(cds)) so that the UMAP entry exists before you overwrite it.

@Wincky111

Thank you for your help.
Still the same question, after using your code to create the cds object, I ran:
cds <- preprocess_cds(cds, num_dim = 20)
cds <- reduce_dimension(cds)
plot_cells(cds)
cds@reducedDims$UMAP <- seurat@reductions$umap@cell.embeddings
I still get this error:
no slot of name "reducedDims" for this object of class "cell_data_set"

Is it because of the version of SingleCellExperiment? Mine is 1.8.0.
If I run this instead:
reducedDims(cds) = as.matrix(seurat[["pca"]]@cell.embeddings)
I get another error:
Error in vapply(value, nrow, FUN.VALUE = 0L) : values must be length 1, but FUN(X[[1]]) result is length 0

God help me==

@TomKellyGenetics
Author

TomKellyGenetics commented Jun 8, 2020

Here's a minimal reproducible example.

library("Seurat")
library("monocle3")
seurat_object <- identity(pbmc_small)

library("org.Hs.eg.db")
gene_symbol <- as.list(org.Hs.egSYMBOL)
raw_count_data <- GetAssayData(seurat_object, assay = "RNA", slot = "counts")
class(raw_count_data)

cells_info <- seurat_object@meta.data

gene_name <- gene_symbol[rownames(raw_count_data)]  
gene_name <- sapply(gene_name, function(x) x[[1]][1])

#preparing cds
gene_name <- ifelse(is.na(gene_name), names(gene_name), gene_name)
gene_short_name <- gene_name
gene_id <- rownames(raw_count_data)


genes_info <- cbind(gene_id, gene_short_name)
genes_info <- as.data.frame(genes_info)
rownames(genes_info) <- rownames(raw_count_data)

cds <- new_cell_data_set(expression_data = raw_count_data,
                         cell_metadata = cells_info,
                         gene_metadata = genes_info)
cds <- preprocess_cds(cds)
cds <- reduce_dimension(cds)
cds <- cluster_cells(cds)

#replace monocle dimensions with seurat
seurat_object <- RunTSNE(seurat_object, dims = 1:2)
cds@reducedDims$TSNE <-  seurat_object@reductions$tsne@cell.embeddings
seurat_object <- RunUMAP(seurat_object, dims = 1:2)
cds@reducedDims$UMAP <-  seurat_object@reductions$umap@cell.embeddings
#replace monocle clusters with seurat
seurat_object <- FindClusters(seurat_object, dims = 1:2)
cds@clusters$UMAP$partitions <- seurat_object@meta.data$seurat_clusters
names(cds@clusters$UMAP$partitions) <- rownames(seurat_object@meta.data)
cds@clusters$UMAP$clusters <- seurat_object@meta.data$seurat_clusters
names(cds@clusters$UMAP$clusters) <- rownames(seurat_object@meta.data)


plot_cells(cds)

R version 3.6.2, Seurat_3.0.0, monocle3_0.2.0

hpliner modified the milestones: v0.2.2, v0.2.3 on Jun 11, 2020
@rohanshad

I can confirm what @Wincky111 is seeing, there's no "reducedDims" slot in the cds object after preprocessing / reducing dimensions / and clustering.

This is what I see:
Error in cds@reducedDims$UMAP <- infarct@reductions$umap@cell.embeddings : no slot of name "reducedDims" for this object of class "cell_data_set"

@TomKellyGenetics
Author

TomKellyGenetics commented Jun 16, 2020

@rohanshad @Wincky111 You need to run monocle3::reduce_dimension first which will create the slot, then replace it with the Seurat results.

cds <- reduce_dimension(cds)
cds@reducedDims$UMAP <- infarct@reductions$umap@cell.embeddings

This example works with monocle3 version 0.2.0.

@williamhsy

I ran into the same error: no slot of name "reducedDims".
If you use monocle3 version 0.2.1, you could try reducedDims(cds)$UMAP instead.

It works for me!

@rohanshad

Yep, the above answer works!
reducedDims(cds)$UMAP <- infarct@reductions$umap@cell.embeddings replaced the monocle3 embedding with the Seurat one.
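
A note on the accessor form: reducedDims(cds) holds a list of matrices, which is most likely why the earlier attempt to assign a single matrix to reducedDims(cds) failed, while assigning to the $UMAP element works. When swapping in a Seurat embedding, it also seems prudent to reorder its rows to the cell order of the CDS first. The sketch below does that in base R; align_embedding is a hypothetical helper and the toy matrix stands in for a real embedding.

```r
# Hypothetical helper: reorder the rows of an embedding so they follow a given
# vector of cell names, and fail loudly if any cell is missing.
align_embedding <- function(embedding, cell_names) {
  embedding <- as.matrix(embedding)
  missing_cells <- setdiff(cell_names, rownames(embedding))
  if (length(missing_cells) > 0) {
    stop("embedding is missing ", length(missing_cells), " cell(s) present in the CDS")
  }
  embedding[cell_names, , drop = FALSE]
}

# Toy embedding standing in for the Seurat UMAP coordinates.
toy_umap <- matrix(c(0.1, 0.2, 0.3, 1.1, 1.2, 1.3), ncol = 2,
                   dimnames = list(c("cellB", "cellA", "cellC"),
                                   c("UMAP_1", "UMAP_2")))
aligned <- align_embedding(toy_umap, c("cellA", "cellB", "cellC"))

# With real objects (assumed names `cds` and `seurat_object`; accessor form as
# reported above for monocle3 0.2.1):
# umap <- Seurat::Embeddings(seurat_object, reduction = "umap")
# SingleCellExperiment::reducedDims(cds)$UMAP <- align_embedding(umap, colnames(cds))
```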

@brgew
Collaborator

brgew commented Jun 16, 2020

Hi,

Paul Hoffman at the Satija Lab is working on tools for inter-converting Seurat and Monocle3 objects. This is a work-in-progress so it may not be feature-complete or stable but Paul encourages people to test it.

The URL is https://github.com/satijalab/seurat-wrappers/tree/feat/monocle3.

@rohanshad

rohanshad commented Jun 16, 2020

@williamhsy @TomKellyGenetics any luck getting the learn_graph command to work on new embeddings processed this way? use_partition = TRUE / FALSE doesn't help and 'order_cells' errors out. Hmmm...

Edit:
Was able to get everything going by following #130. Everything works as expected now.

@TomKellyGenetics
Author

As announced by Rahul Satija this is now supported by SeuratWrappers. See here for details on the code and usage.

devtools::install_github('satijalab/seurat-wrappers')
library(monocle3)
library(Seurat)
library(SeuratWrappers)

Apply it to a Seurat object as follows:

cds <- as.cell_data_set(integrated_seurat_obj)

@laijen000

laijen000 commented Jun 30, 2020

Hello! I have followed the SeuratWrappers tutorial for converting an integrated seurat v3 object to monocle3. However, when I try to plot single gene expression in pseudotime using plot_genes_in_pseudotime(), I get the error

"Error: When label_by_short_name = TRUE, rowData must have a column of gene names called gene_short_name."

rowData(cds)
DataFrame with 33538 rows and 0 columns

I wonder whether the as.cell_data_set() function to convert my integrated seurat object into a cds does not carry over gene names as a column.

Edit: I was able to get around this by adding the gene_short_name column to my rowData(cds) from a previous monocle dataframe I already had.

@yaolutian

Hi, I have the same problem! Can you specify how you did this? Thanks a lot!

@deliasoto

@laijen000
Can you provide an example for adding the gene_short_name column from a previous monocle dataframe? I'm also getting:

"Error: When label_by_short_name = TRUE, rowData must have a column of gene names called gene_short_name."

or can someone else provide an example to set up the gene_short_name column after converting from a Seurat object?
Thanks!!

@laijen000

This worked for me previously!

library(monocle3)
library(SeuratWrappers)
cds <- as.cell_data_set(so)
## Calculate size factors using built-in function in monocle3
cds <- estimate_size_factors(cds)
## Add gene names into CDS
cds@rowRanges@elementMetadata@listData[["gene_short_name"]] <- rownames(so[["RNA"]])
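
The slot path above reaches into the internal layout of the object. A possibly more robust variant is to go through the rowData accessor, as the ToMonocle3 function earlier in this thread does. The sketch below assumes the gene names are the row names of the CDS; add_gene_short_name is a hypothetical helper name.

```r
# Hypothetical helper: copy the row names of a CDS (or any SummarizedExperiment)
# into a gene_short_name column of its rowData.
add_gene_short_name <- function(cds) {
  SummarizedExperiment::rowData(cds)$gene_short_name <- rownames(cds)
  cds
}

# With real objects (assumed name `so` for the Seurat object, as above):
# cds <- SeuratWrappers::as.cell_data_set(so)
# cds <- add_gene_short_name(cds)
# "gene_short_name" %in% colnames(SummarizedExperiment::rowData(cds))
```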
