Hello,
Although I know there are many issues related to my question, none of them seems to clearly resolve it.
In my Seurat object:
total cell number: 100k
total human samples: 60 (condition: "Normal", "Disease")
- Normal human samples: n = 30; Disease human samples: n = 30
Many confounding factors affect gene expression in our samples, such as sex, region, and weight.
I defined ~20 clusters by integrating the data using CCA.
Now I want to identify DEGs between "Normal" and "Disease" within a specific cluster (let's say cluster 1).
What confuses me about running FindMarkers is this:
it is recommended to use raw counts (by setting the default assay to "RNA"), and I believe the FindMarkers function takes the "data" slot, which is normalized by nUMI...
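A minimal sketch of the default behaviour described above, using the pbmc_small toy object that ships with SeuratObject. Its "groups" column (g1/g2) is only a stand-in for a real "condition" column, and cluster "0" stands in for cluster 1; neither comes from my data. No slot is passed, so FindMarkers falls back to its default, the log-normalized "data" slot of the chosen assay.

```r
library(Seurat)

# Toy object bundled with SeuratObject (80 cells); used here only for illustration.
data("pbmc_small", package = "SeuratObject")
obj <- pbmc_small

# Work on the un-integrated RNA assay and (re)compute log-normalized values,
# which NormalizeData stores in the "data" slot.
DefaultAssay(obj) <- "RNA"
obj <- NormalizeData(obj, verbose = FALSE)

# Compare the two groups within a single cluster.
# "groups" stands in for "condition"; "0" stands in for the cluster of interest.
markers <- FindMarkers(
  obj,
  ident.1 = "g1",
  ident.2 = "g2",
  group.by = "groups",
  subset.ident = "0",
  assay = "RNA",
  logfc.threshold = 0,
  min.pct = 0
)
head(markers)
```

Note that nothing in this call accounts for sex, batch, or other covariates.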
But the point is that in my data, various confounding factors can make it challenging to detect DEGs between "Normal" cells and "Disease" cells in cluster 1. I strongly feel that I need to regress out the effects of those confounding factors, such as sex and batch, when identifying DEGs in my case.
So I am thinking it is better to do it the following way. Please correct me if I am wrong, and suggest what I should do.
Subset the Seurat object so that it contains only the cluster 1 cells.
For the subset (sub.seurat), run SCTransform with return.only.var.genes = FALSE so that as many genes as possible are retained in scale.data:
sub.seurat <- SCTransform(sub.seurat, vars.to.regress = c("mitoRatio", "Phase", "batch"), return.only.var.genes = FALSE)
Run SCTransform(object_x, vars.to.regress = c("sex", "region", "weight")) on each individual object_x (split by batch)
Integrate
Subset Cluster 1
Run FindMarkers on Cluster 1 using the SCT assay, data slot, with ident.1 = "Disease" and ident.2 = "Normal" (do not run SCTransform again)
There is an important caveat, however: if the library sizes across batches are very different, this might result in a lot of false positives (see explanation here).
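The steps above can be sketched roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: it assumes Seurat >= 4.1 (for PrepSCTFindMarkers and the recorrect_umi argument), metadata columns named "batch", "sex", "region", "weight" and "condition", and that the integrated object has already been clustered so that the active identities are the cluster labels. The helper names integrate_sct_by_batch and de_in_cluster are hypothetical. PrepSCTFindMarkers is included because the integrated object carries one SCT model per batch; it is meant to bring the corrected counts to a common sequencing depth, which relates to the library-size caveat above.

```r
library(Seurat)

# Hypothetical helper: SCTransform each batch separately, regressing out the
# covariates, then integrate the batches with the SCT-based anchor workflow.
# Assumes `obj` has metadata columns "batch", "sex", "region" and "weight".
integrate_sct_by_batch <- function(obj, covariates = c("sex", "region", "weight")) {
  obj.list <- SplitObject(obj, split.by = "batch")
  obj.list <- lapply(obj.list, function(x) {
    SCTransform(x, vars.to.regress = covariates, verbose = FALSE)
  })
  features <- SelectIntegrationFeatures(object.list = obj.list, nfeatures = 3000)
  obj.list <- PrepSCTIntegration(object.list = obj.list, anchor.features = features)
  anchors <- FindIntegrationAnchors(
    object.list = obj.list,
    normalization.method = "SCT",
    anchor.features = features
  )
  IntegrateData(anchorset = anchors, normalization.method = "SCT")
}

# Hypothetical helper: Disease vs Normal within one cluster on the SCT assay.
# Assumes the active identities of `integrated` are cluster labels and that a
# "condition" column holds "Disease" / "Normal". SCTransform is not re-run.
de_in_cluster <- function(integrated, cluster = "1") {
  # Re-correct counts across the per-batch SCT models (Seurat >= 4.1).
  integrated <- PrepSCTFindMarkers(integrated, assay = "SCT")
  FindMarkers(
    integrated,
    ident.1 = "Disease",
    ident.2 = "Normal",
    group.by = "condition",
    subset.ident = cluster,   # restrict the test to the chosen cluster
    assay = "SCT",            # default slot is "data"
    recorrect_umi = FALSE     # the test runs on a subset after PrepSCTFindMarkers
  )
}
```

Usage would look like integrated <- integrate_sct_by_batch(obj), followed by the usual PCA and clustering on the integrated assay, then de_in_cluster(integrated, cluster = "1").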
FindMarkers(sub.seurat,
            ident.1 = "case_D1",
            ident.2 = "control_D1",
            group.by = "condition",
            assay = "SCT",
            slot = "scale.data",
            only.pos = FALSE,
            logfc.threshold = 0)
Hope to hear from you soon!
Thank you!