# Single-Cell RNA Analysis for Noobs
**Analyzing scRNA-seq data step-by-step using Seurat in R**

 *Dataset: PBMC dataset from GEO - GSE120219*

Let's dive into some single-cell RNA sequencing analysis using the PBMC dataset from GEO!

In [None]:
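# Step 1: Install & Load Required Packages
# Run the installs once; after that you only need the library() calls
install.packages(c("Seurat", "dplyr", "patchwork"))
# GEOquery lives on Bioconductor, not CRAN, so it is installed through BiocManager
install.packages("BiocManager")
BiocManager::install("GEOquery")
library(Seurat)
library(dplyr)
library(patchwork)
library(GEOquery)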

## Step 2: Load the PBMC Dataset
This dataset contains PBMCs from healthy individuals.

We’ll load the data directly from GEO.

In [None]:
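# Step 2: Load the PBMC Dataset
# getGEO() returns a list of ExpressionSet objects, one per platform in the series
gse <- getGEO("GSE120219", GSEMatrix = TRUE)
pbmc_data <- exprs(gse[[1]])      # expression matrix: features x samples
pbmc_metadata <- pData(gse[[1]])  # per-sample annotations
# Check the matrix is not empty: scRNA-seq series often keep their counts in supplementary files
dim(pbmc_data)
head(pbmc_data)
head(pbmc_metadata)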
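
Before building a Seurat object, it helps to confirm that the matrix actually has data and that the metadata rows line up with the matrix columns (Seurat matches `meta.data` row names to cell names). Here is a minimal sketch of such a check. `check_inputs()` and the toy objects are made up for illustration and are not part of Seurat or GEOquery.

```r
# Hypothetical helper: stop early if the matrix is empty, then keep only
# the cells that appear in both the matrix and the metadata, in the same order
check_inputs <- function(counts, metadata) {
  if (nrow(counts) == 0 || ncol(counts) == 0) {
    stop("Expression matrix is empty; the counts may be in supplementary files")
  }
  shared <- intersect(colnames(counts), rownames(metadata))
  if (length(shared) == 0) {
    stop("No cell names are shared between the matrix and the metadata")
  }
  list(
    counts = counts[, shared, drop = FALSE],
    metadata = metadata[shared, , drop = FALSE]
  )
}

# Toy data (invented): 2 genes x 3 cells, metadata shuffled with one extra row
toy_counts <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("c1", "c2", "c3"))
)
toy_meta <- data.frame(
  donor = c("d2", "d1", "d1", "d3"),
  row.names = c("c3", "c1", "c2", "c9")
)

checked <- check_inputs(toy_counts, toy_meta)
print(colnames(checked$counts))
print(checked$metadata)
```

On the real data this would be `checked <- check_inputs(pbmc_data, pbmc_metadata)`. If the check stops on an empty matrix, GEOquery's `getGEOSuppFiles("GSE120219")` downloads the series' supplementary files, but what those files contain for this dataset is something you would have to inspect yourself.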

## Step 3: Preprocessing the Data
We filter out low-quality cells, normalize the data, pick out the most variable genes, and scale everything so it is ready for PCA.

In [None]:
# Convert the data into a Seurat object
pbmc <- CreateSeuratObject(counts = pbmc_data, meta.data = pbmc_metadata)

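# Quality control: measure the share of mitochondrial counts (human genes prefixed "MT-"),
# then keep cells with 200-2500 detected genes and under 5% mitochondrial counts
pbmc[['percent.mt']] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)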

# Normalize data
pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize", scale.factor = 10000)

# Identify highly variable genes
pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)

# Scale data
pbmc <- ScaleData(pbmc)
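
If the QC metrics and `LogNormalize` feel like black boxes, here is a rough base-R sketch of the same arithmetic on a tiny made-up matrix. The gene names and counts are toy values, and this is only meant to show the idea, not to reproduce Seurat's internals.

```r
# Toy count matrix (invented values): 4 genes (rows) x 3 cells (columns)
counts <- matrix(
  c(5, 0, 2,
    0, 3, 1,
    1, 1, 0,
    4, 0, 7),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("MT-CO1", "CD3D", "LYZ", "MS4A1"),
                  c("cell1", "cell2", "cell3"))
)

# QC metrics, analogous to nFeature_RNA, nCount_RNA and percent.mt
n_feature <- colSums(counts > 0)   # genes detected per cell
n_count <- colSums(counts)         # total counts per cell
mt_genes <- grepl("^MT-", rownames(counts))
percent_mt <- colSums(counts[mt_genes, , drop = FALSE]) / n_count * 100

# Log-normalization: divide each cell by its total, multiply by the
# scale factor, then take log(1 + x)
scale_factor <- 10000
log_norm <- log1p(sweep(counts, 2, n_count, "/") * scale_factor)

print(data.frame(n_feature, n_count, percent_mt))
print(round(log_norm, 2))
```

Zero counts stay zero after normalization, and cells with different sequencing depth end up on a comparable scale, which is the whole point of this step.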

## Step 4: Principal Component Analysis (PCA)
Let’s reduce the dimensionality of the data.

In [None]:
pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))
DimPlot(pbmc, reduction = "pca")
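
The next step uses the first 10 principal components, so it is worth asking how much variance each PC carries. In Seurat, `ElbowPlot(pbmc)` draws this for you. The sketch below shows the underlying idea with base R's `prcomp()` on random toy data (not the PBMC data), so the numbers mean nothing beyond the illustration.

```r
# Toy data (random): 200 "cells" x 20 "genes", with three genes given
# extra variance so that a few PCs dominate
set.seed(42)
toy <- matrix(rnorm(200 * 20), nrow = 200, ncol = 20)
toy[, 1:3] <- toy[, 1:3] * 5

pca <- prcomp(toy, center = TRUE, scale. = FALSE)

# Share of total variance per PC, and the running total
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

print(round(head(var_explained, 5), 3))
print(round(head(cum_var, 5), 3))
```

A common rule of thumb is to keep the PCs before the curve flattens out (the "elbow"). Treat `1:10` as a starting point and adjust it for your own data.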

## Step 5: Clustering & UMAP Visualization
Now, let’s cluster the cells and visualize them in a UMAP plot.

In [None]:
# Find neighbors and clusters
pbmc <- FindNeighbors(pbmc, dims = 1:10)
pbmc <- FindClusters(pbmc, resolution = 0.5)

# Run UMAP for visualization
pbmc <- RunUMAP(pbmc, dims = 1:10)
DimPlot(pbmc, reduction = "umap", label = TRUE)
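
Under the hood, `FindNeighbors()` builds a k-nearest-neighbor graph in PCA space (plus a shared-nearest-neighbor graph derived from it), and `FindClusters()` groups cells on that graph. Here is a small base-R sketch of just the nearest-neighbor idea, using two made-up blobs of points in place of real PCA coordinates. It is an illustration only and skips everything else Seurat does.

```r
# Toy 2-D "embedding" (random): two well-separated groups of 10 cells each
set.seed(1)
group_a <- matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2)
group_b <- matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
emb <- rbind(group_a, group_b)
rownames(emb) <- paste0("cell", 1:20)

# Pairwise Euclidean distances between all cells
d <- as.matrix(dist(emb))

# For each cell, take the k closest other cells
# (position 1 in the ordering is the cell itself, at distance 0)
k <- 3
knn <- t(apply(d, 1, function(row) order(row)[2:(k + 1)]))

print(knn)
```

Each row lists the indices of a cell's nearest neighbors. Here every cell's neighbors sit in its own blob, which is the kind of structure graph-based clustering picks up on.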

## Step 6: Find Marker Genes
Identify differentially expressed genes for each cluster.

In [None]:
cluster.markers <- FindAllMarkers(pbmc, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
head(cluster.markers)
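
`head()` only shows the first few rows, which usually all belong to one cluster. A more useful view is the top few markers per cluster. Here is a minimal base-R sketch. `top_markers()` and the toy table are made up for illustration, and the column name `avg_log2FC` is an assumption that matches recent Seurat versions (older ones call it `avg_logFC`).

```r
# Hypothetical helper: top n rows per cluster, ranked by fold change
top_markers <- function(markers, n = 5, fc_col = "avg_log2FC") {
  stopifnot(all(c("cluster", "gene", fc_col) %in% colnames(markers)))
  by_cluster <- split(markers, markers$cluster, drop = TRUE)
  top <- lapply(by_cluster, function(df) {
    df <- df[order(df[[fc_col]], decreasing = TRUE), , drop = FALSE]
    head(df, n)
  })
  do.call(rbind, top)
}

# Toy marker table (invented values) with the columns the helper needs
toy_markers <- data.frame(
  cluster = factor(c(0, 0, 0, 1, 1)),
  gene = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"),
  avg_log2FC = c(0.5, 2.1, 1.3, 0.9, 3.0)
)

top2 <- top_markers(toy_markers, n = 2)
print(top2)
```

On the real results this would be `top_markers(cluster.markers, n = 5)`. Pass `fc_col = "avg_logFC"` if your Seurat version uses the older column name.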

## Yo, finally
Congratulations, bestie! You’ve just completed your first scRNA-seq analysis using Seurat!

- Want to explore more? Check out the full Seurat tutorial: [Seurat Website](https://satijalab.org/seurat/)
- If something breaks, **Google it first**, then open an issue here!

Now go forth and analyze like a pro! 