Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use Seurat Generated UMAP for Monocle3 pseudotime analyse? #388

Closed
sinvaya opened this issue Dec 6, 2019 · 8 comments
Closed

How to use Seurat Generated UMAP for Monocle3 pseudotime analyse? #388

sinvaya opened this issue Dec 6, 2019 · 8 comments

Comments

@sinvaya
Copy link

sinvaya commented Dec 6, 2019

Hi,

We want to use monocle3 for pseudotime analyze. But it generate a totally different UMAP than Seurat and it split into too many clusters. Monocle3 generates pseudotime based on UMAP. Can we use the UMAP of Seurat?

Thanks!

@RobertAlpin
Copy link

This issue has been brought up on the Seurat issues page as well, where a user there has posted a script that will work (although I personally had to make some modifications to get it to work for my purposes, namely changing the cluster source and carrying over the RNA assay matrix for differential analysis rather than the integrated one, as the UMAP is preserved either way).

@JoshuaCumming
Copy link

Hi Robert,

I am also having some issues with the script

Would you be able to share the modifications to it you made?

Thanks for any help!

@RobertAlpin
Copy link

So it would seem that there's a major issue with porting Seurat objects into Monocle, namely that the integration anchor data takes the place of PCA loadings, which causes issues downstream with pseudotime analysis as graphtest only has access to PCA loadings. Therefore I would suggest that you NOT use integration with monocle at this time. You might choose to try the Align function in Monocle which hypothetically serves a similar purpose, or to just cluster in seurat which, in my case gave me a cleaner result (although in my case I'm examining transcriptional change with the addition of certain chemicals-something align/integrate seemed to muddle somewhat).

Here's what I ran with some personal inputs expunged, you're going to need to find the resolution variables and whatnot yourself. I've still used the first steps as I prefer seurat's filtering features, but I CAN NOT RECOMMEND using the partition onward. Frankly, I've forgotten what most of the changes I made are and why, so take caution when using this.)

# ### Re-dimension reduction for 3D rendering
# 
# if (Dim = "3D") {
#   
#   print ("Running UMAP 3D")
#   
#   seurat <- RunUMAP(object = seurat, reduction = "pca", dims = 1:20, n.components = 3)
#   
#   print("Clustering 3D")
#   
#   seurat <- FindNeighbors(object=seurat, dims=1:20)
#   seurat <- FindClusters(object=seurat, resolution=0.5)
#   seurat[[sprintf("ClusterNames_%.1f_%dPC", 0.5, 20)]] <- Idents(object = seurat)
#   
# }
# 

### Building the necessary parts for a basic cds

# part one, gene annotations



seurat <- seuratobj


library(Seurat)
library(monocle3)
library(htmlwidgets)

seurat <- readRDS("WHEREVER/YOU/KEEP/YOURS")

gene_annotation <- as.data.frame(rownames(seurat@reductions[["pca"]]@feature.loadings), row.names = rownames(seurat@reductions[["pca"]]@feature.loadings))
colnames(gene_annotation) <- "gene_short_name"

# part two, cell information

cell_metadata <- as.data.frame(seurat@assays[["RNA"]]@counts@Dimnames[[2]], row.names = seurat@assays[["RNA"]]@counts@Dimnames[[2]])
colnames(cell_metadata) <- "barcode"

# part three, counts sparse matrix

New_matrix <- seurat@assays[["RNA"]]@counts
New_matrix <- New_matrix[rownames(seurat@reductions[["pca"]]@feature.loadings), ]
expression_matrix <- New_matrix


### Construct the basic cds object

cds_from_seurat <- new_cell_data_set(expression_matrix,
                                     cell_metadata = cell_metadata,
                                     gene_metadata = gene_annotation)


### Construct and assign the made up partition 
###### I DO NOT ADVISE THIS

recreate.partition <- c(rep(1, length(cds_from_seurat@colData@rownames)))
names(recreate.partition) <- cds_from_seurat@colData@rownames
recreate.partition <- as.factor(recreate.partition)

cds_from_seurat@clusters@listData[["UMAP"]][["partitions"]] <- recreate.partition


### Assign the cluster info

list_cluster <- seurat@meta.data[[sprintf("seurat_clusters")]]
#list_cluster <- seurat@meta.data[[sprintf("ClusterNames_%s_%sPC", 0.5, 20)]]
names(list_cluster) <- seurat@assays[["RNA"]]@data@Dimnames[[2]]

cds_from_seurat@clusters@listData[["UMAP"]][["clusters"]] <- list_cluster


### Could be a space-holder, but essentially fills out louvain parameters

cds_from_seurat@clusters@listData[["UMAP"]][["louvain_res"]] <- "NA"


### Assign UMAP coordinate

cds_from_seurat@reducedDims@listData[["UMAP"]] <-seurat@reductions[["umap"]]@cell.embeddings


### Assign feature loading for downstream module analysis

cds_from_seurat@preprocess_aux$gene_loadings <- seurat@reductions[["pca"]]@feature.loadings


### Learn graph, this step usually takes a significant period of time for larger samples

print("Learning graph, which can take a while depends on the sample")

cds_from_seurat <- learn_graph(cds_from_seurat, use_partition = T)

plot_cells(cds_from_seurat)

####Here I chose to save the gene_metadata, rds, etc.`

@JoshuaCumming
Copy link

Great, thanks for the help!

Best,

Joshua

@hpliner
Copy link
Contributor

hpliner commented Apr 15, 2020

Hello,

This issue appears to be an issue about our new package, Monocle 3. Please post any issues for Monocle 3 to the monocle3 repository at https://github.com/cole-trapnell-lab/monocle3.

Monocle 3 Development team

@hpliner hpliner closed this as completed Apr 15, 2020
@Kikibarmpa
Copy link

So it would seem that there's a major issue with porting Seurat objects into Monocle, namely that the integration anchor data takes the place of PCA loadings, which causes issues downstream with pseudotime analysis as graphtest only has access to PCA loadings. Therefore I would suggest that you NOT use integration with monocle at this time. You might choose to try the Align function in Monocle which hypothetically serves a similar purpose, or to just cluster in seurat which, in my case gave me a cleaner result (although in my case I'm examining transcriptional change with the addition of certain chemicals-something align/integrate seemed to muddle somewhat).

Here's what I ran with some personal inputs expunged, you're going to need to find the resolution variables and whatnot yourself. I've still used the first steps as I prefer seurat's filtering features, but I CAN NOT RECOMMEND using the partition onward. Frankly, I've forgotten what most of the changes I made are and why, so take caution when using this.)

# ### Re-dimension reduction for 3D rendering
# 
# if (Dim = "3D") {
#   
#   print ("Running UMAP 3D")
#   
#   seurat <- RunUMAP(object = seurat, reduction = "pca", dims = 1:20, n.components = 3)
#   
#   print("Clustering 3D")
#   
#   seurat <- FindNeighbors(object=seurat, dims=1:20)
#   seurat <- FindClusters(object=seurat, resolution=0.5)
#   seurat[[sprintf("ClusterNames_%.1f_%dPC", 0.5, 20)]] <- Idents(object = seurat)
#   
# }
# 

### Building the necessary parts for a basic cds

# part one, gene annotations



seurat <- seuratobj


library(Seurat)
library(monocle3)
library(htmlwidgets)

seurat <- readRDS("WHEREVER/YOU/KEEP/YOURS")

gene_annotation <- as.data.frame(rownames(seurat@reductions[["pca"]]@feature.loadings), row.names = rownames(seurat@reductions[["pca"]]@feature.loadings))
colnames(gene_annotation) <- "gene_short_name"

# part two, cell information

cell_metadata <- as.data.frame(seurat@assays[["RNA"]]@counts@Dimnames[[2]], row.names = seurat@assays[["RNA"]]@counts@Dimnames[[2]])
colnames(cell_metadata) <- "barcode"

# part three, counts sparse matrix

New_matrix <- seurat@assays[["RNA"]]@counts
New_matrix <- New_matrix[rownames(seurat@reductions[["pca"]]@feature.loadings), ]
expression_matrix <- New_matrix


### Construct the basic cds object

cds_from_seurat <- new_cell_data_set(expression_matrix,
                                     cell_metadata = cell_metadata,
                                     gene_metadata = gene_annotation)


### Construct and assign the made up partition 
###### I DO NOT ADVISE THIS

recreate.partition <- c(rep(1, length(cds_from_seurat@colData@rownames)))
names(recreate.partition) <- cds_from_seurat@colData@rownames
recreate.partition <- as.factor(recreate.partition)

cds_from_seurat@clusters@listData[["UMAP"]][["partitions"]] <- recreate.partition


### Assign the cluster info

list_cluster <- seurat@meta.data[[sprintf("seurat_clusters")]]
#list_cluster <- seurat@meta.data[[sprintf("ClusterNames_%s_%sPC", 0.5, 20)]]
names(list_cluster) <- seurat@assays[["RNA"]]@data@Dimnames[[2]]

cds_from_seurat@clusters@listData[["UMAP"]][["clusters"]] <- list_cluster


### Could be a space-holder, but essentially fills out louvain parameters

cds_from_seurat@clusters@listData[["UMAP"]][["louvain_res"]] <- "NA"


### Assign UMAP coordinate

cds_from_seurat@reducedDims@listData[["UMAP"]] <-seurat@reductions[["umap"]]@cell.embeddings


### Assign feature loading for downstream module analysis

cds_from_seurat@preprocess_aux$gene_loadings <- seurat@reductions[["pca"]]@feature.loadings


### Learn graph, this step usually takes a significant period of time for larger samples

print("Learning graph, which can take a while depends on the sample")

cds_from_seurat <- learn_graph(cds_from_seurat, use_partition = T)

plot_cells(cds_from_seurat)

####Here I chose to save the gene_metadata, rds, etc.`

Hello,
I followed your script and I still get the error:
Error: No dimensionality reduction for UMAP calculated. Please run reduce_dimensions with reduction_method = UMAP and cluster_cells before running learn_graph.

Any idea how to solve this? It seems like it doesn't recognize the dimensionality reduction from Seurat in order to create trajectories..or this is what I understand.

Thank you in advance!

@tlusardi
Copy link

tlusardi commented Jul 1, 2020

This has worked for me - good luck!

#### Create a Monocle CDS Object
    # Project PC dimensions to whole data set
    my.so <- ProjectDim(my.so, reduction = "pca")
  
    # Create an expression matrix
    expression_matrix <- my.so@assays$RNA@counts

    # Get cell metadata
    cell_metadata <- my.so@meta.data
    if (all.equal(colnames(expression_matrix), rownames(cell_metadata))) {
      print(sprintf("Cell identifiers match"))
    } else {
      print(sprintf("Cell identifier mismatch - %i cells in expression matrix, %i cells in metadata",
                    ncol(expression_matrix), nrow(cell_metadata)))
      print("If the counts are equal, sort differences will throw this error")
    }

    # get gene annotations
    gene_annotation <- data.frame(gene_short_name = rownames(my.so@assays$RNA), row.names = rownames(my.so@assays$RNA))
    if (all.equal(rownames(expression_matrix), rownames(gene_annotation))) {
      print(sprintf("Gene identifiers all match"))
    } else {
      print(sprintf("Gene identifier mismatch - %i genes in expression matrix, %i gene in gene annotation",
                    nrow(expression_matrix), nrow(gene_annotation)))
      print("If the counts are equal, sort differences will throw this error")
    }

    # Seurat-derived CDS
    my.cds <- new_cell_data_set(expression_matrix,
                                cell_metadata = cell_metadata,
                                gene_metadata = gene_annotation)
  
    # Transfer Seurat embeddings
    # Note that these may be calculated on the Integrated object, not the counts
    #   and thus will involve fewer genes
    reducedDim(my.cds, type = "PCA") <- my.so@reductions$pca@cell.embeddings 
    my.cds@preprocess_aux$prop_var_expl <- my.so@reductions$pca@stdev
    plot_pc_variance_explained(my.cds)
  
    # Transfer Seurat UMAP embeddings
    my.cds@int_colData@listData$reducedDims$UMAP <- my.so@reductions$umap@cell.embeddings
#    plot_cells(my.cds)
  
    # Copy cluster info from Seurat
    my.cds@clusters$UMAP_so$clusters <- my.so@meta.data$gt_tp_cell_type_integrated_.0.9

    my.cds <- cluster_cells(my.cds, reduction_method = "UMAP", resolution = 1e-3)

    # Fix from https://gitmemory.com/cole-trapnell-lab
    rownames(my.cds@principal_graph_aux$UMAP$dp_mst) <- NULL
    colnames(my.cds@int_colData@listData$reducedDims$UMAP) <- NULL
 
    DimPlot(so, reduction = "umap")
    DimPlot(my.so, reduction = "umap")
    plot_cells(my.cds, color_cells_by = "partition", group_label_size = 3.5)
    plot_cells(my.cds, color_cells_by = cellids, show_trajectory_graph = FALSE, group_label_size = 3.5)

@tlusardi
Copy link

Added a row/column name clean up so that you can run pseudo time... Apologies - it was strung across two scripts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants