Number of cells discrepancy #20

Closed
sgmccalla opened this issue Aug 4, 2020 · 1 comment

@sgmccalla

I have a question about the data available on NCBI and the cell cluster labels. I downloaded the GEO data, but noticed that there are fewer cells in the ground truth label files on GitHub than in the GEO single cell datasets. Is there a cell cluster file available that covers more cells?

For example, there are 4001 cells for sc_10x GSM3022245, but only 902 cells in the ground truth label file; there are 384 cells for sc_CEL-seq2 GSM3336845, but 274 cells in the ground truth label file; there are 4001 cells for sc_Drop-seq GSM3336849, but 225 cells in the ground truth label file.

@LuyiTian
Owner

LuyiTian commented Aug 4, 2020

For the 10x data, I think you might be referring to the wrong reference. We had two batches of 10x data: the first has around 900 cells (3 cell lines) and the second has around 4000 cells (5 cell lines). For the sc_CEL-seq2 5 cell line mixtures (3 x 384-well plates), there are more doublets than we would expect, so we exclude this data when we compare some methods, such as clustering. The Drop-seq data contains around 200 cells.
The gene count matrices were generated by scPipe, which keeps the top X cells ranked by reads and does not perform QC in the preprocessing step. So the number of cells in a gene count matrix does not reflect the real number of cells; it is just the result of that parameter.
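
In practice this means the GEO count matrix should be subset to the cells that appear in the label file before comparing methods. A minimal sketch of that step, assuming both files identify cells by the same barcode or cell name (the function name and the toy barcodes below are made up for illustration and are not part of scPipe or this repository):

```python
def split_by_annotation(count_barcodes, labelled_barcodes):
    """Partition count-matrix cells by whether they have a ground truth label.

    Returns (kept, dropped, missing):
      kept    - cells in the count matrix that have a label (matrix order preserved)
      dropped - cells in the count matrix without a label
      missing - labelled cells that are absent from the count matrix
    """
    labelled = set(labelled_barcodes)
    kept = [b for b in count_barcodes if b in labelled]
    dropped = [b for b in count_barcodes if b not in labelled]
    missing = sorted(labelled - set(count_barcodes))
    return kept, dropped, missing


# Toy barcodes, invented for illustration only
count_barcodes = ["AAAC", "AAAG", "AACT", "AAGT", "ACCA"]
labelled_barcodes = ["AAAG", "AAGT"]

kept, dropped, missing = split_by_annotation(count_barcodes, labelled_barcodes)
print(len(count_barcodes), len(kept), len(dropped), len(missing))
```

A non-empty `missing` list would suggest that the label file belongs to a different batch than the count matrix, which may be what happened with the 10x comparison above.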