# [Change title].
[reference] 
 
 PMID: [link]
 ***
 

## 0. Setting up the work environment<a id="0"></a>

In [None]:
suppressPackageStartupMessages({
    library(DropletUtils)
    library(SingleCellExperiment)
    library(scuttle)
    library(Seurat)
    library(SeuratWrappers)
    library(stringr)
    library(dplyr)
    library(data.table)
    library(Matrix)
    library(patchwork)
    library(ggplot2)
})

options(repr.plot.width = 16, repr.plot.height = 8)

## 1. Data analysis in Seurat

In [None]:
seuratObject <- readRDS(file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/post-QC/E11_5_SeuratObject.Rds")

In [None]:
VlnPlot(seuratObject, features = c("nCount_RNA", "nFeature_RNA", "subsets_Mito_percent"), 
        ncol = 3, group.by = "orig.ident", pt.size = 0)

### 1.1. Feature selection

We next calculate a subset of features that exhibit high cell-to-cell variation in the dataset (i.e., they are highly expressed in some cells and lowly expressed in others). We and others have found that focusing on these genes in downstream analysis helps to highlight biological signal in single-cell datasets.

The procedure in Seurat v3 directly models the mean-variance relationship inherent in single-cell data, and is implemented in the `FindVariableFeatures` function. By default, it returns 2,000 features (genes) per dataset. These are used in downstream analysis, such as principal component analysis (PCA).

In [None]:
seuratObject <- NormalizeData(seuratObject, verbose = FALSE)
seuratObject <- FindVariableFeatures(seuratObject, selection.method = "vst", nfeatures = 2000)

# Identify the 10 most highly variable genes
top10 <- head(VariableFeatures(seuratObject), 10)
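
The mean-variance modelling behind `selection.method = "vst"` can be illustrated outside Seurat. The following is a simplified base-R sketch on simulated counts, not Seurat's implementation: it fits a loess curve to log10(variance) against log10(mean), standardizes each gene with the fitted variance, clips extreme values, and ranks genes by the variance of the standardized values. The toy matrix, the gene names, and the ten artificially inflated genes are assumptions made for illustration.

```r
set.seed(1)

# Simulated counts (hypothetical data): 200 genes x 100 cells
n_genes <- 200
n_cells <- 100
mu <- rexp(n_genes, rate = 0.2) + 0.1
counts <- matrix(rpois(n_genes * n_cells, lambda = rep(mu, n_cells)),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Make the first 10 genes variable: 5x higher counts in half of the cells
counts[1:10, 1:50] <- counts[1:10, 1:50] * 5

gene_mean <- rowMeans(counts)
gene_var  <- apply(counts, 1, var)
keep <- gene_var > 0   # log10 of a zero variance is not finite

# Expected variance for a given mean, from a loess fit on the log10 scale
fit_data <- data.frame(log_mean = log10(gene_mean[keep]),
                       log_var  = log10(gene_var[keep]))
fit <- loess(log_var ~ log_mean, data = fit_data, span = 0.3)
expected_sd <- sqrt(10^fitted(fit))

# Standardize each gene, clip large values, compute the standardized variance
z <- (counts[keep, ] - gene_mean[keep]) / expected_sd
z[z > sqrt(n_cells)] <- sqrt(n_cells)
std_var <- rowSums(z^2) / (n_cells - 1)

# Genes ranked by standardized variance; the inflated genes should rank high
top10_sketch <- names(sort(std_var, decreasing = TRUE))[1:10]
top10_sketch
```

Genes whose variance exceeds what their mean predicts get a standardized variance well above 1, which is why they are ranked first.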

In [None]:
options(repr.plot.width=10)

# plot variable features with and without labels
plot1 <- VariableFeaturePlot(seuratObject)
plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE)
plot2

### 1.2. Scaling the data

Next, we apply a linear transformation (scaling), which is a standard pre-processing step prior to dimensional reduction techniques like PCA. The `ScaleData` function:

- Shifts the expression of each gene, so that the mean expression across cells is 0
- Scales the expression of each gene, so that the variance across cells is 1

This step gives each gene equal weight in downstream analyses, so that highly expressed genes do not dominate. This is superior to log-normalized counts, which overemphasize the influence of small count fluctuations. The results are stored in the `scale.data` slot of the assay, accessed as `object[["originalexp"]]@scale.data` when the assay is named `originalexp`.

In [None]:
all.genes <- rownames(seuratObject)
seuratObject <- ScaleData(seuratObject, features = all.genes)
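
What the scaling does to each gene can be checked on a small matrix with base R. This is a minimal sketch on simulated values (the gene names and numbers are made up), not a replacement for `ScaleData`; the clip at 10 mirrors the function's `scale.max` default.

```r
set.seed(2)

# Toy log-normalized expression (simulated): 3 genes x 8 cells
expr <- rbind(high_gene = rnorm(8, mean = 5,   sd = 2),
              mid_gene  = rnorm(8, mean = 2,   sd = 0.5),
              low_gene  = rnorm(8, mean = 0.2, sd = 0.05))

# scale() centers and scales columns, so transpose to work gene by gene
scaled <- t(scale(t(expr)))

# Clip extreme values, as ScaleData does with scale.max = 10
scaled[scaled > 10] <- 10

round(rowMeans(scaled), 10)   # per-gene mean after scaling
apply(scaled, 1, var)         # per-gene variance after scaling
```

After scaling, all three genes have mean 0 and variance 1 regardless of their original expression level.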

### 1.3. Linear dimensionality reduction: PCA

Next, we perform PCA on the scaled data. By default, only the previously determined variable features are used as input, but a different subset can be specified with the `features` argument.

In [None]:
seuratObject <- RunPCA(seuratObject, features = VariableFeatures(object = seuratObject), verbose = FALSE)

In [None]:
DimPlot(seuratObject, reduction = "pca") + xlab("PC 1") + ylab("PC 2")

### 1.4. Determine dimensionality

To overcome the extensive technical noise in any single feature for scRNA-seq data, Seurat clusters cells based on their PCA scores, with each PC essentially representing a 'metafeature' that combines information across a correlated feature set. The top principal components therefore represent a robust compression of the dataset. However, how many components should we choose to include?

[Macosko et al](http://www.cell.com/abstract/S0092-8674(15)00549-8) introduced a resampling test inspired by the JackStraw procedure. They randomly permute a subset of the data (1% by default) and rerun PCA, constructing a 'null distribution' of feature scores, and repeat this procedure. 'Significant' PCs are defined as those that have a strong enrichment of low p-value features.

**NOTE: This process can take a long time for big datasets and is therefore not run in this notebook. More approximate techniques, such as the scree plot implemented in `ElbowPlot()`, can be used to reduce computation time.**

The `JackStrawPlot` function provides a visualization tool for comparing the distribution of p-values for each PC with a uniform distribution (dashed line). 'Significant' PCs will show a strong enrichment of features with low p-values (solid curve above the dashed line).

An alternative heuristic method generates an 'elbow plot': a ranking of principal components based on the percentage of variance explained by each one (`ElbowPlot` function).

In [None]:
ElbowPlot(seuratObject, 50)
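
The quantity behind the elbow heuristic can be reproduced with base R's `prcomp`. This is a sketch on simulated data with two deliberately strong axes of variation (all values are made up); it shows how the per-PC standard deviations translate into the fraction of variance explained.

```r
set.seed(3)

# Simulated scaled data: 100 cells (rows) x 20 genes (columns).
# Note that prcomp expects observations in rows, whereas Seurat stores cells in columns.
n_cells <- 100
signal1 <- rnorm(n_cells, sd = 5)
signal2 <- rnorm(n_cells, sd = 3)
mat <- matrix(rnorm(n_cells * 20), nrow = n_cells)
mat[, 1:5]  <- mat[, 1:5]  + signal1   # genes 1-5 share one signal
mat[, 6:10] <- mat[, 6:10] + signal2   # genes 6-10 share another

pca <- prcomp(mat, center = TRUE, scale. = FALSE)

# Fraction of total variance captured by each PC
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(100 * var_explained[1:5], 1)
```

Plotting `pca$sdev` against the PC index gives the elbow shape: here the curve flattens after the second component, since only two signals were simulated.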

### 1.5. Clustering

Seurat v4 applies a graph-based clustering approach, building upon initial strategies in [Macosko et al](http://www.cell.com/abstract/S0092-8674(15)00549-8). Importantly, the distance metric that drives the clustering analysis (based on previously identified PCs) remains the same. The method embeds cells in a graph structure, for example a K-nearest neighbor (KNN) graph with edges drawn between cells with similar feature expression patterns, and then attempts to partition this graph into highly interconnected 'quasi-cliques' or 'communities'.

As in [PhenoGraph](https://doi.org/10.1016/j.cell.2015.05.047), Seurat first constructs a KNN graph based on the Euclidean distance in PCA space, and then refines the edge weights between any two cells based on the shared overlap in their local neighborhoods (Jaccard similarity). This step is performed using the `FindNeighbors` function, and takes as input the previously defined dimensionality of the dataset.

To cluster the cells, modularity optimization techniques such as the Louvain algorithm are then applied to iteratively group cells together, with the goal of optimizing the standard modularity function. The `FindClusters` function implements this procedure and contains a resolution parameter that sets the 'granularity' of the downstream clustering, with increased values leading to a greater number of clusters. Optimal resolution often increases for larger datasets.

In [None]:
seuratObject <- FindNeighbors(seuratObject, dims = 1:20)
seuratObject <- FindClusters(seuratObject, resolution = 0.3)
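
The neighbor-graph idea described above can be sketched in base R: find each cell's k nearest neighbors by Euclidean distance, then weight each pair of cells by the Jaccard overlap of their neighbor sets. This is an illustrative toy on a simulated embedding, with `k = 5` chosen arbitrarily and each cell counted as its own neighbor; Seurat's own implementation differs in details such as neighbor search and edge pruning, and the community detection step is not shown.

```r
set.seed(42)

# Simulated "PCA embedding": 30 cells x 5 PCs, two well-separated groups
emb <- rbind(matrix(rnorm(15 * 5, mean = 0), ncol = 5),
             matrix(rnorm(15 * 5, mean = 6), ncol = 5))
n <- nrow(emb)
k <- 5

# Euclidean distances between all pairs of cells
d <- as.matrix(dist(emb))

# Row i holds the indices of the k nearest neighbors of cell i
# (the cell itself is included, at distance 0)
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Jaccard similarity of neighbor sets: |A and B| / |A or B|
snn <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    shared <- length(intersect(knn[i, ], knn[j, ]))
    snn[i, j] <- shared / (2 * k - shared)
  }
}

# Cells from different groups share no neighbors, so their edge weight is 0
range(snn[1:15, 16:30])
```

A modularity-based method such as Louvain would then partition the graph defined by `snn`; in this toy case the two groups are already disconnected.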

### 1.6. Non-linear dimensionality reduction: UMAP

Seurat offers several non-linear dimensional reduction techniques, such as tSNE and UMAP, to visualize and explore these datasets. The goal of these algorithms is to learn the underlying manifold of the data in order to place similar cells together in low-dimensional space. Cells within the graph-based clusters determined above should co-localize on these dimension reduction plots. As input to UMAP and tSNE, [it is recommended to use the same PCs](https://doi.org/10.1038/s41467-019-13056-x) that were used as input to the clustering analysis.

In [None]:
seuratObject <- RunUMAP(seuratObject, dims = 1:20, n.neighbors = 12, verbose = FALSE)

In [None]:
options(repr.plot.width = 8)

DimPlot(seuratObject, reduction = "umap", label = TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

### 1.7. Differential expression analysis
#### 1.7.1. Global annotation by differential expression analysis

Seurat identifies markers that define clusters via differential expression. By default, it identifies positive and negative markers of a single cluster (specified in `ident.1`), compared to all other cells. `FindAllMarkers` automates this process for all clusters, but it is also possible to test groups of clusters against each other, or against all cells.

The `min.pct` argument requires a feature to be detected at a minimum percentage in either of the two groups of cells, and the `logfc.threshold` argument requires a feature to be differentially expressed (on average) by some amount between the two groups. You can set both of these to 0, but with a dramatic increase in run time, since this will test a large number of features that are unlikely to be highly discriminatory. As another option to speed up these computations, `max.cells.per.ident` can be set. This downsamples each identity class to have no more cells than the value given. While there is generally a loss in power, the speed increases can be significant and the most highly differentially expressed features will likely still rise to the top.

In [None]:
# Positive markers per cluster (note: min.pct = 0.25 is stricter than the FindAllMarkers default)
seuratObject.markers <- FindAllMarkers(seuratObject, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
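
How the two thresholds act as a pre-filter before testing can be shown on a toy example in base R. This is a simplified sketch, not Seurat's code: the three genes are simulated, the fold change is computed from mean expression with a pseudocount of 1 as an assumption, and a Wilcoxon rank-sum test is run only on genes that pass both filters.

```r
set.seed(7)

# 30 cells in the cluster of interest, 30 in the rest (simulated)
in_cluster <- rep(c(TRUE, FALSE), each = 30)

# Toy log-normalized expression for three hypothetical genes
expr <- rbind(
  marker = c(rnorm(30, 3, 0.5), rnorm(30, 0.5, 0.5)),  # higher in the cluster
  flat   = rnorm(60, 1, 0.1),                          # same in both groups
  rare   = c(rep(0, 29), 2, rep(0, 30))                # detected in one cell only
)
expr[expr < 0] <- 0

# Fraction of cells in each group in which the gene is detected
pct_in  <- rowMeans(expr[, in_cluster] > 0)
pct_out <- rowMeans(expr[, !in_cluster] > 0)

# log2 fold change of mean expression (pseudocount of 1; an assumption)
log2fc <- log2(rowMeans(expm1(expr[, in_cluster])) + 1) -
          log2(rowMeans(expm1(expr[, !in_cluster])) + 1)

min_pct <- 0.25
logfc_threshold <- 0.25

# only.pos = TRUE corresponds to keeping positive fold changes only
tested <- pmax(pct_in, pct_out) >= min_pct & log2fc >= logfc_threshold

# Test only the genes that pass both filters
p_values <- sapply(rownames(expr)[tested], function(g) {
  wilcox.test(expr[g, in_cluster], expr[g, !in_cluster], exact = FALSE)$p.value
})
p_values
```

Only `marker` is tested: `rare` fails the detection filter and `flat` fails the fold-change filter, which is where the time saving comes from.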

In [None]:
# select only cluster 4 and order by avg_log2FC, largest first

seuratObject.markers[seuratObject.markers$cluster == 4,] %>%
slice_max(n = 20
          , order_by = avg_log2FC)

In [None]:
write.csv(seuratObject.markers, file = "/home/jovyan/researcher_home/tom/", quote = FALSE)

In [None]:
#Ectoderm
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Fgf8","Wnt7a","En1",'Wnt6', 'Msx1','Msx2','Lama5'), min.cutoff = "q1")  

p1 + p2

In [None]:
#Mesenchyme
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Fgf10","Prrx1","Prrx2","Meis1"), min.cutoff = "q1")  

p1 + p2

In [None]:
#Muscle
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Pax3","Acta2","Myog"), min.cutoff = "q1")  

p1 + p2

In [None]:
#BEC
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Pecam1","Flt1","Emcn"), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Cd68","Ccr2","Elane","Mcpt8"), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Fgf8","Wnt7a","Krt14","Krt7"), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("","","",""), min.cutoff = "q1")  

p1 + p2

In [None]:
DotPlot(object = seuratObject, features = c('','','',''), col.min = 0)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 6)
# Template cell: fill in two gene names before running
FeaturePlot(seuratObject, features = c('', ''), blend = TRUE)

In [None]:
# Step 1: Define the variable 'name' with cell type / global
name <- "Global"

# Step 2: Create the column name dynamically
column_name <- paste0("CellType_", name)

In [None]:
seuratObject@meta.data[[column_name]] <- Idents(seuratObject)

In [None]:
seuratObject@meta.data[[column_name]] <- as.character(seuratObject@meta.data[[column_name]])

seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '0'] <- 'Mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '1'] <- 'Mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '2'] <- 'Mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '3'] <- 'Vascular endothelial cells'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '4'] <- 'Ectoderm'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '5'] <- 'Ectoderm'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '6'] <- 'Immune cells'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '7'] <- 'Muscle progenitor'

seuratObject@meta.data[[column_name]] <- as.factor(seuratObject@meta.data[[column_name]])
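
The cluster-to-label assignment above can also be written as a single named-vector lookup, which keeps the mapping in one place and makes unmapped clusters easy to detect. The sketch below uses the same mapping as the cell above, with a short made-up vector of cluster identities standing in for `Idents(seuratObject)`.

```r
# Mapping from cluster ID to global cell type (same labels as above)
cluster_to_celltype <- c(
  "0" = "Mesenchyme",
  "1" = "Mesenchyme",
  "2" = "Mesenchyme",
  "3" = "Vascular endothelial cells",
  "4" = "Ectoderm",
  "5" = "Ectoderm",
  "6" = "Immune cells",
  "7" = "Muscle progenitor"
)

# Hypothetical cluster assignments standing in for Idents(seuratObject)
clusters <- factor(c("0", "3", "4", "7", "6", "2", "5", "1"))

# Fail early if any cluster has no label
unmapped <- setdiff(levels(clusters), names(cluster_to_celltype))
stopifnot(length(unmapped) == 0)

# Look up the label of every cell in one step
cell_type <- factor(unname(cluster_to_celltype[as.character(clusters)]))
table(cell_type)
```

In the notebook, the same lookup would be applied to `as.character(Idents(seuratObject))` and the result stored in `seuratObject@meta.data[[column_name]]`.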

In [None]:
options(repr.plot.width = 12)

DimPlot(seuratObject, reduction = "umap", label = TRUE, group.by = column_name) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
saveRDS(seuratObject, file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_global.rds")

## 2. Refinement of global annotations

### 2.1. Mesenchymal

In [None]:
seuratObject <- readRDS("/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_global.rds")

In [None]:
Idents(seuratObject) <- seuratObject@meta.data$CellType_Global

In [None]:
seuratObject <- subset(seuratObject, idents = c("Mesenchyme"))

In [None]:
seuratObject <- ScaleData(seuratObject)

In [None]:
seuratObject <- RunPCA(seuratObject, features = VariableFeatures(object = seuratObject), verbose = FALSE)

In [None]:
ElbowPlot(seuratObject, 50)

In [None]:
seuratObject <- FindNeighbors(seuratObject, dims = 1:20, verbose = FALSE)
seuratObject <- FindClusters(seuratObject, resolution = 0.8, verbose = FALSE)

In [None]:
seuratObject <- RunUMAP(seuratObject, dims = 1:20, n.neighbors = 12, verbose = FALSE)

In [None]:
options(repr.plot.width = 10, repr.plot.height = 8)
DimPlot(seuratObject, reduction = "umap", label=TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
# Positive markers per cluster (note: min.pct = 0.25 is stricter than the FindAllMarkers default)
seuratObject.markers <- FindAllMarkers(seuratObject, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)

In [None]:
#select only cluster 0 and order avg_log2FC from big to small

seuratObject.markers[seuratObject.markers$cluster == 0,] %>%
slice_max(n = 20
          , order_by = avg_log2FC)

In [None]:
write.csv(seuratObject.markers, file = "/home/jovyan/researcher_home/tom/", quote = FALSE)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Hoxa13","Hoxa11","Hoxa9"), min.cutoff = "q1", 
                  keep.scale ='all',
                  order = TRUE
                 )  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Hoxd13","Hoxd11","Hoxd9",'Meis1'), min.cutoff = "q1", keep.scale ='all', order = TRUE)  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, 
                  c("Prrx1","Sox9","Col2a1",'Ihh'), 
                  min.cutoff = "q1", 
                  keep.scale = 'all'
                 )  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 6)
FeaturePlot(seuratObject, features = c('Sox9', 'Prrx1'), blend = TRUE, order = TRUE)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, 
                  c("Tnmd","Prrx1","Scx"), 
                  min.cutoff = "q1", 
                  keep.scale = 'all'
                 )  

p1 + p2

In [None]:
options(repr.plot.width = 10, repr.plot.height = 8)
DimPlot(seuratObject, reduction = "umap", label=TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
# Step 1: Define the variable 'name' with the cell type
name <- "Mesenchyme"

# Step 2: Create the column name dynamically
column_name <- paste0("CellType_", name)

In [None]:
seuratObject@meta.data[[column_name]] <- Idents(seuratObject)

In [None]:
seuratObject@meta.data[[column_name]] <- as.character(seuratObject@meta.data[[column_name]])

seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '0'] <- 'Distal limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '1'] <- 'Tenocyte precursor'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '2'] <- 'Proximal limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '3'] <- 'Proximal limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '4'] <- 'Intermediate limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '5'] <- 'Intermediate limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '6'] <- 'Chondrocyte precursor'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '7'] <- 'Proximal limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '8'] <- 'Intermediate limb bud mesenchyme'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '9'] <- 'Distal limb bud mesenchyme'

seuratObject@meta.data[[column_name]] <- as.factor(seuratObject@meta.data[[column_name]])

In [None]:
options(repr.plot.width = 14)

DimPlot(seuratObject, reduction = "umap", label = TRUE, group.by = column_name) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
saveRDS(seuratObject, file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_mesenchyme.rds")

### 2.2. Ectoderm

In [None]:
seuratObject <- readRDS("/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_global.rds")

In [None]:
Idents(seuratObject) <- seuratObject@meta.data$CellType_Global

In [None]:
seuratObject <- subset(seuratObject, idents = c("Ectoderm"))

In [None]:
seuratObject <- ScaleData(seuratObject)

In [None]:
seuratObject <- RunPCA(seuratObject, features = VariableFeatures(object = seuratObject), verbose = FALSE)

In [None]:
ElbowPlot(seuratObject, 50)

In [None]:
seuratObject <- FindNeighbors(seuratObject, dims = 1:20, verbose = FALSE)
seuratObject <- FindClusters(seuratObject, resolution = 0.4, verbose = FALSE)

In [None]:
seuratObject <- RunUMAP(seuratObject, dims = 1:20, n.neighbors = 12, verbose = FALSE)

In [None]:
options(repr.plot.width = 10, repr.plot.height = 8)
DimPlot(seuratObject, reduction = "umap", label=TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
# Positive markers per cluster (note: min.pct = 0.25 is stricter than the FindAllMarkers default)
seuratObject.markers <- FindAllMarkers(seuratObject, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)

In [None]:
#select only cluster 0 and order avg_log2FC from big to small

seuratObject.markers[seuratObject.markers$cluster == 0,] %>%
slice_max(n = 20
          , order_by = avg_log2FC)

In [None]:
# select only cluster 1 and order by avg_log2FC, largest first

seuratObject.markers[seuratObject.markers$cluster == 1,] %>%
slice_max(n = 20
          , order_by = avg_log2FC)

In [None]:
# select only cluster 2 and order by avg_log2FC, largest first

seuratObject.markers[seuratObject.markers$cluster == 2,] %>%
slice_max(n = 20
          , order_by = avg_log2FC)

In [None]:
write.csv(seuratObject.markers, file = "/home/jovyan/researcher_home/tom/", quote = FALSE)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Cxcr4","Edar","Nfkb1",'Wnt10b'), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 6)
FeaturePlot(seuratObject, features = c('Wnt7a', 'En1'), blend = TRUE)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("Fgf8"), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 10, repr.plot.height = 8)
DimPlot(seuratObject, reduction = "umap", label=TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
# Step 1: Define the variable 'name' with the cell type
name <- "Ectoderm"

# Step 2: Create the column name dynamically
column_name <- paste0("CellType_", name)

In [None]:
seuratObject@meta.data[[column_name]] <- Idents(seuratObject)

In [None]:
seuratObject@meta.data[[column_name]] <- as.character(seuratObject@meta.data[[column_name]])

seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '0'] <- 'Ectoderm'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '1'] <- 'Keratinocytes'
seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == '2'] <- 'AER'

seuratObject@meta.data[[column_name]] <- as.factor(seuratObject@meta.data[[column_name]])

In [None]:
options(repr.plot.width = 8)

DimPlot(seuratObject, reduction = "umap", label = TRUE, group.by = column_name) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
saveRDS(seuratObject, file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_ectoderm.rds")

## 3. Updating annotations

In [None]:
global <- readRDS("/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_global.rds")

In [None]:
subset1 <- readRDS(file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_mesenchyme.rds")

In [None]:
subset2 <- readRDS(file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_ectoderm.rds")

In [None]:
meta1 <- global@meta.data
meta2 <- subset1@meta.data

In [None]:
df <- split(meta1, meta1$CellType_Global)

In [None]:
subset1a <- rownames(df$Mesenchyme)

In [None]:
subset1a_meta <- meta2[subset1a, c("CellType_Mesenchyme", "orig.ident")]
subset1a_meta$orig.ident <- NULL

head(subset1a_meta)

In [None]:
df$Mesenchyme <- cbind(df$Mesenchyme, subset1a_meta)
df$Mesenchyme$CellType_Global <- df$Mesenchyme$CellType_Mesenchyme
df$Mesenchyme$CellType_Mesenchyme <- NULL

In [None]:
meta2 <- subset2@meta.data

In [None]:
subset2a <- rownames(df$Ectoderm)

In [None]:
subset2a_meta <- meta2[subset2a, c("CellType_Ectoderm", "orig.ident")]
subset2a_meta$orig.ident <- NULL

head(subset2a_meta)

In [None]:
df$Ectoderm <- cbind(df$Ectoderm, subset2a_meta)
df$Ectoderm$CellType_Global <- df$Ectoderm$CellType_Ectoderm
df$Ectoderm$CellType_Ectoderm <- NULL

In [None]:
df$'Immune cells'$Barcode <- rownames(df$'Immune cells')
df$'Vascular endothelial cells'$Barcode <- rownames(df$'Vascular endothelial cells')
#df$Ectoderm$Barcode <- rownames(df$Ectoderm)
df$'Muscle progenitor'$Barcode <- rownames(df$'Muscle progenitor')

In [None]:
df$Mesenchyme$Barcode = rownames(df$Mesenchyme)
df$Ectoderm$Barcode = rownames(df$Ectoderm)
df$'Muscle progenitor'$Barcode = rownames(df$'Muscle progenitor')
df$'Immune cells'$Barcode = rownames(df$'Immune cells')
df$'Vascular endothelial cells'$Barcode = rownames(df$'Vascular endothelial cells')

**Make sure all dataframes in the list have the same number of columns, in the same order.**

In [None]:
str(df)

In [None]:
names(df)
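
The requirement stated above can be checked programmatically rather than by reading the `str()` output. The following base-R sketch uses a small made-up list in place of `df`; `same_columns` is a hypothetical helper, not part of any package.

```r
# Toy stand-in for the split metadata list (hypothetical values)
df_toy <- list(
  Mesenchyme = data.frame(nCount_RNA = c(10, 12),
                          CellType_Global = c("a", "b"),
                          Barcode = c("c1", "c2")),
  Ectoderm   = data.frame(nCount_RNA = 8,
                          CellType_Global = "c",
                          Barcode = "c3")
)

# TRUE if every data frame has the same column names in the same order
same_columns <- function(x) {
  cols <- lapply(x, colnames)
  all(vapply(cols, identical, logical(1), cols[[1]]))
}

same_columns(df_toy)

# A frame with reordered columns is detected ...
df_bad <- df_toy
df_bad$Ectoderm <- df_bad$Ectoderm[, c("Barcode", "nCount_RNA", "CellType_Global")]
same_columns(df_bad)

# ... and can be fixed by reordering every frame to the first frame's columns
df_fixed <- lapply(df_bad, function(d) d[, colnames(df_bad[[1]]), drop = FALSE])
same_columns(df_fixed)
```

Running such a check on the real list before `do.call(rbind, df)` gives an explicit TRUE or FALSE instead of relying on visual inspection.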

In [None]:
meta.data <- do.call(rbind, df)
rownames(meta.data) = meta.data$Barcode
meta.data$Barcode <- NULL

In [None]:
head(meta.data)

In [None]:
dim(global@meta.data)
dim(meta.data)

In [None]:
target <- rownames(global@meta.data)
meta.data <- meta.data[match(target, rownames(meta.data)),]
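
The reordering step relies on `match()` returning, for each target barcode, its row position in the combined table. A minimal sketch with made-up barcodes shows the behaviour and adds a guard against barcodes that are missing from the table, which would otherwise produce silent `NA` rows.

```r
# Hypothetical barcodes in the order of the global object
target <- c("c3", "c1", "c2")

# Combined metadata in a different row order (toy values)
meta_toy <- data.frame(CellType_Global = c("A", "B", "C"),
                       row.names = c("c1", "c2", "c3"))

# Position of each target barcode in the metadata table
idx <- match(target, rownames(meta_toy))
stopifnot(!anyNA(idx))   # every cell must be present

# Reorder; drop = FALSE keeps a one-column data frame as a data frame
meta_toy <- meta_toy[idx, , drop = FALSE]
identical(rownames(meta_toy), target)
```

Only after this check passes is it safe to overwrite the object's metadata, because the rows must line up with the cells one to one.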

In [None]:
global@meta.data = meta.data

In [None]:
global$CellType <- global$CellType_Global
global$CellType_Global <- NULL

In [None]:
seuratObject <- global
rm(global)

### 3.1. Finalized embedding and annotation<a id="37"></a>

In [None]:
options(repr.plot.width=14)
DimPlot(seuratObject, group.by = "CellType", label = FALSE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
saveRDS(seuratObject, file = "/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_final.rds")

## Some code

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("","","",""), min.cutoff = "q1")  

p1 + p2

In [None]:
DotPlot(object = seuratObject, features = c('','','',''), col.min = 0)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 6)
# Template cell: fill in two gene names before running
FeaturePlot(seuratObject, features = c('', ''), blend = TRUE)

### 2.X. [change cell type]

In [None]:
seuratObject <- readRDS("/home/jovyan/researcher_home/tom/Atlas/data/SCA_core/Guilak/V2_E11_5_global.rds")

In [None]:
Idents(seuratObject) <- seuratObject@meta.data$CellType_Global

In [None]:
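# Template: the identity must match a label in CellType_Global (e.g. "Muscle progenitor")
seuratObject <- subset(seuratObject, idents = c("Muscle progenitor"))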

In [None]:
seuratObject <- ScaleData(seuratObject)

In [None]:
seuratObject <- RunPCA(seuratObject, features = VariableFeatures(object = seuratObject), verbose = FALSE)

In [None]:
ElbowPlot(seuratObject, 50)

In [None]:
seuratObject <- FindNeighbors(seuratObject, dims = 1:20, verbose = FALSE)
seuratObject <- FindClusters(seuratObject, resolution = 0.8, verbose = FALSE)

In [None]:
seuratObject <- RunUMAP(seuratObject, dims = 1:20, n.neighbors = 12, verbose = FALSE)

In [None]:
options(repr.plot.width = 10, repr.plot.height = 8)
DimPlot(seuratObject, reduction = "umap", label=TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
# Positive markers per cluster (note: min.pct = 0.25 is stricter than the FindAllMarkers default)
seuratObject.markers <- FindAllMarkers(seuratObject, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)

In [None]:
#select only cluster 0 and order avg_log2FC from big to small

seuratObject.markers[seuratObject.markers$cluster == 0,] %>%
slice_max(n = 20
          , order_by = avg_log2FC)

In [None]:
write.csv(seuratObject.markers, file = "/home/jovyan/researcher_home/tom/", quote = FALSE)

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("","","",""), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 18, repr.plot.height = 12)

p1 <- DimPlot(seuratObject, label = TRUE) + NoLegend() + xlab("UMAP 1") + ylab("UMAP 2")
p2 <- FeaturePlot(seuratObject, c("","","",""), min.cutoff = "q1")  

p1 + p2

In [None]:
options(repr.plot.width = 10, repr.plot.height = 8)
DimPlot(seuratObject, reduction = "umap", label=TRUE) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
# Step 1: Define the variable 'name' with the cell type
name <- ""

# Step 2: Create the column name dynamically
column_name <- paste0("CellType_", name)

In [None]:
seuratObject@meta.data[[column_name]] <- Idents(seuratObject)

In [None]:
seuratObject@meta.data[[column_name]] <- as.character(seuratObject@meta.data[[column_name]])

seuratObject@meta.data[[column_name]][seuratObject@meta.data[[column_name]] == ''] <- ''

seuratObject@meta.data[[column_name]] <- as.factor(seuratObject@meta.data[[column_name]])

In [None]:
options(repr.plot.width = 8)

DimPlot(seuratObject, reduction = "umap", label = TRUE, group.by = column_name) + xlab("UMAP 1") + ylab("UMAP 2")

In [None]:
saveRDS(seuratObject, file = "/home/jovyan/researcher_home/tom/")