Seurat3.0 negative value in data slot from IntegrateData() #1057
Comments
Hi Zinger, Does the presence of negative values cause any problem in downstream applications? For expression data, the Tim
Hi Tim, Thank you for the prompt response. We are interested in performing differential expression analysis between clusters, and we are uncertain how to handle the negative values. The need for integration stems from a significant batch effect in the dataset we were provided with. Best regards,
In general we don't recommend using the integrated matrix for differential expression. Instead, you might like to try the logistic regression test with batch as a latent variable:

    DefaultAssay(object) <- "RNA"
    markers <- FindAllMarkers(object, test.use = "LR", latent.vars = "orig.ident")

This uses the uncorrected, log-normalized expression data and compares a logistic regression model predicting group membership from the expression of the gene and latent variables with a null model containing only latent variables. The test is based on this paper from Lior Pachter's group: http://dx.doi.org/10.1101/258566
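To make that model comparison concrete, here is a minimal base-R sketch of the likelihood-ratio idea for a single gene. It is not Seurat's internal code: the data are simulated, and the names `expr`, `batch`, and `group` are assumptions chosen for illustration.

```r
# Illustrative only: simulated data for one gene, not Seurat's implementation.
set.seed(1)
n <- 400
batch <- factor(rep(c("A", "B"), each = n / 2))  # latent variable (e.g. orig.ident)
group <- rep(c(0, 1), times = n / 2)             # 1 = cell belongs to the cluster of interest

# Simulated expression: shifted by batch and, more strongly, by group.
expr <- rnorm(n, mean = 1 + 0.5 * (batch == "B") + 1.0 * group, sd = 1)
expr <- pmax(expr, 0)  # keep values non-negative, like log-normalized data

# Full model: group membership predicted from gene expression plus the latent variable.
full <- glm(group ~ expr + batch, family = binomial)
# Null model: latent variable only.
null <- glm(group ~ batch, family = binomial)

# Likelihood-ratio test: does adding the gene improve the fit beyond batch alone?
lr_stat <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(null)))
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
```

Because batch appears in both models, a gene that differs only between batches adds little to the full model and gets a large p-value, while a gene that separates the groups within each batch gets a small one.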
Hi Tim, Thanks for the clarification and recommendation. We will look into it. Best,
Hi Tim, Best regards,
Hi,
Thank you for providing the community with a great single-cell analysis tool. I am attempting to integrate two datasets with batch effects in Seurat v3 for evaluation purposes, using the default configuration of each function. However, I am obtaining negative values in the integration result. My understanding from the online FAQ (https://satijalab.org/seurat/faq):
The data slot (object@data) stores normalized and log-transformed single cell expression. This maintains the relative abundance levels of all genes, and contains only zeros or positive values
In a second attempt with a different dataset, I also obtained negative values in the data slot. I have confirmed that in all cases the input matrices contain no values < 0 and that the outputs show the expected subpopulations and gene patterns. I am wondering if you can help pinpoint what may have been configured incorrectly?
Best regards,
Zinger
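A check like the one described above can be sketched by testing each matrix for values below zero. The example uses small made-up matrices so that it runs without Seurat, and the helper `has_negative` is hypothetical. With a real Seurat v3 object, the matrices would presumably be extracted with `GetAssayData()`, as noted in the comments.

```r
# Hypothetical helper: TRUE if a numeric matrix holds any value below zero.
has_negative <- function(mat) {
  min(mat) < 0
}

# Toy stand-ins for the real matrices. With a Seurat v3 object they could be
# pulled with, for example:
#   GetAssayData(object, assay = "RNA", slot = "data")
#   GetAssayData(object, assay = "integrated", slot = "data")
input_mat <- matrix(c(0, 1.2, 0.7, 0, 2.3, 0), nrow = 2)
integrated_mat <- matrix(c(-0.1, 1.0, 0.6, 0.2, 2.1, -0.3), nrow = 2)

has_negative(input_mat)       # FALSE
has_negative(integrated_mat)  # TRUE
```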