SCTransform () and apply Findmarkers #5029

YeonuiKwak · 2021-09-01T14:31:43Z

Hello,
Although I know there are many issues relevant to my question, none of them seems to clearly resolve it.
In my Seurat object:

Total cell number: 100k
Total human samples: 60 (condition: "Normal", "Disease")
- Normal human samples: n = 30; Disease human samples: n = 30.
Many confounding factors other than condition affect gene expression in our samples: sex, region, weight, etc.

I defined ~20 clusters by integrating the data using CCA.
Now I want to identify DEGs between "Normal" and "Disease" in a specific cluster (let's say cluster 1).

What confused me about running FindMarkers is this:
it is recommended to use raw counts (by setting the default assay to "RNA"), and I believe FindMarkers will then take the "data" slot, which is normalized by nUMI.
But in my data, various confounding factors can make it challenging to detect DEGs between "Normal" and "Disease" cells in cluster 1. I strongly feel that I need to regress out the effects of those confounding factors, such as sex and batch, when identifying DEGs in my case.

So I am thinking it would be better to do the following. Please correct me if I am wrong, and suggest what I should do.

1. Subset the Seurat object to keep only the cluster 1 data.
2. For the subset (sub.seurat), run SCTransform with return.only.var.genes = FALSE so that as many genes as possible are retained in scale.data:
   sub.seurat <- SCTransform(sub.seurat, vars.to.regress = c("mitoRatio", "Phase", "batch"), return.only.var.genes = FALSE)
3. Run FindMarkers on the "scale.data" slot of the "SCT" assay:

FindMarkers(sub.seurat,
            ident.1 = "case_D1",
            ident.2 = "control_D1",
            group.by = "condition",
            assay = "SCT",
            slot = "scale.data",
            only.pos = FALSE,
            logfc.threshold = 0)

Hope to hear from you soon!
Thank you!

saketkc · 2021-09-03T16:10:34Z

One option would be to:

1. Run SCTransform(object_x, vars.to.regress = c("sex", "region", "weight")) on each individual object_x (split by batch).
2. Integrate.
3. Subset cluster 1.
4. Run FindMarkers on cluster 1 using the SCT assay, data slot, with ident.1 = "Disease" and ident.2 = "Normal" (do not run SCTransform again).
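
The steps above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the Seurat v4 API in use at the time of this thread, and it assumes the metadata columns are named "batch", "sex", "region", "weight", and "condition" (with values "Disease" and "Normal"). The function name run_cluster_de and the clustering settings (3000 features, 30 PCs) are illustrative choices, not something stated in the thread. Newer Seurat releases add PrepSCTFindMarkers() for objects holding several SCT models, so check the documentation for your version.

```r
# Sketch only: assumes Seurat v4 is installed and `obj` is a Seurat object
# whose metadata has columns batch, sex, region, weight, condition (assumed names).
run_cluster_de <- function(obj, cluster_id = "1",
                           covariates = c("sex", "region", "weight")) {
  # 1. Split by batch and run SCTransform on each piece, regressing out covariates
  obj_list <- Seurat::SplitObject(obj, split.by = "batch")
  obj_list <- lapply(obj_list, function(x) {
    Seurat::SCTransform(x, vars.to.regress = covariates, verbose = FALSE)
  })

  # 2. Integrate the SCT-normalized objects
  features <- Seurat::SelectIntegrationFeatures(object.list = obj_list,
                                                nfeatures = 3000)
  obj_list <- Seurat::PrepSCTIntegration(object.list = obj_list,
                                         anchor.features = features)
  anchors <- Seurat::FindIntegrationAnchors(object.list = obj_list,
                                            normalization.method = "SCT",
                                            anchor.features = features)
  integrated <- Seurat::IntegrateData(anchorset = anchors,
                                      normalization.method = "SCT")

  # Cluster on the integrated assay (illustrative settings)
  integrated <- Seurat::RunPCA(integrated, verbose = FALSE)
  integrated <- Seurat::FindNeighbors(integrated, dims = 1:30)
  integrated <- Seurat::FindClusters(integrated)

  # 3. Subset the cluster of interest
  cluster_obj <- subset(integrated, idents = cluster_id)

  # 4. Disease vs. Normal on the SCT assay, data slot (no second SCTransform)
  Seurat::FindMarkers(cluster_obj,
                      ident.1 = "Disease",
                      ident.2 = "Normal",
                      group.by = "condition",
                      assay = "SCT",
                      slot = "data")
}
```

A call such as run_cluster_de(my_object, cluster_id = "1") would return the FindMarkers result table for that cluster, where my_object is a placeholder for your own Seurat object.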

There is an important caveat, however: if the library sizes across batches are very different, this might result in a lot of false positives (see explanation here).
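
One rough way to see whether this caveat applies is to compare the median UMI count per cell across batches. The sketch below uses a small made-up data frame in place of real metadata; in a Seurat object the equivalent columns would typically be nCount_RNA and whatever your batch column is called ("batch" is an assumption here). The helper name library_size_spread is hypothetical, and the thread gives no threshold for what counts as "very different", so the ratio is only reported, not judged.

```r
# Hypothetical helper: summarize how much median library size differs by batch.
# `meta` is a data frame with one row per cell and the columns named below.
library_size_spread <- function(meta, batch_col = "batch", umi_col = "nCount_RNA") {
  # Median UMI count per cell within each batch
  median_umi <- tapply(meta[[umi_col]], meta[[batch_col]], median)
  list(
    median_umi = median_umi,
    # Ratio of the largest to the smallest per-batch median
    max_min_ratio = max(median_umi) / min(median_umi)
  )
}

# Toy metadata (made up for illustration, not from the thread)
toy_meta <- data.frame(
  batch = c("b1", "b1", "b2", "b2", "b3", "b3"),
  nCount_RNA = c(1000, 3000, 8000, 12000, 2000, 2000)
)
spread <- library_size_spread(toy_meta)
print(spread$median_umi)
print(spread$max_min_ratio)
```

With real data, the same function could be called on the object's per-cell metadata data frame before deciding whether the DE results need extra scrutiny.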

YeonuiKwak · 2021-09-07T22:56:46Z

Thank you very much!!

saketkc closed this as completed Sep 3, 2021

Tommy0398 mentioned this issue Nov 6, 2023

Workflow for SCTransform V2, Integration, and DE while regressing out variables queries #7976

Closed
