
Different length and counts depending on the number of loaded samples #54

Closed
FlorianRocher opened this issue Jun 29, 2023 · 1 comment

Comments

@FlorianRocher

Hi,

I noticed something odd when using tximport to summarize at the gene level.
Depending on the number of samples I load, I get different lengths and counts for the same genes in the same samples.

Quant_Matrix_tximport <- tximport(AlltheFiles, txIn = TRUE, txOut = FALSE, type = "salmon", countsFromAbundance = "lengthScaledTPM", tx2gene = tx2gene)
Quant_Matrix_tximport1 <- tximport(20FirstFiles, txIn = TRUE, txOut = FALSE, type = "salmon", countsFromAbundance = "lengthScaledTPM", tx2gene = tx2gene)

Example:
When all samples are loaded, I have Sample1:Gene1 --> Counts=386.3943113
When 20 samples are loaded, I have Sample1:Gene1 --> Counts=383.4657441

I used the "no" option for countsFromAbundance instead to check. With it I did not see any differences in the counts, but I did see differences in the gene lengths.

Is this expected?

Florian Rocher

@mikelove
Collaborator

Yes, this is expected.

When a gene has no expression for any isoform in some samples, the average length reported for those samples depends on the other samples, and therefore on which samples are loaded.
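
To make the dependence concrete, here is a minimal toy sketch in base R. It is not tximport's code: the transcript lengths, abundances, sample names, and the `gene_length` helper are all hypothetical, and the fill rule (geometric mean of the lengths from the samples that do express the gene) is an assumption made for illustration.

```r
# Toy illustration only; not tximport's actual implementation.
# One gene with two isoforms and four samples; s1 expresses neither isoform.
tx_len <- c(txA = 1000, txB = 3000)   # hypothetical transcript lengths
abund <- cbind(s1 = c(0, 0),          # no expression for any isoform
               s2 = c(10, 0),
               s3 = c(5, 5),
               s4 = c(0, 10))
rownames(abund) <- names(tx_len)

# Hypothetical helper: abundance-weighted average transcript length per sample.
gene_length <- function(abund, tx_len) {
  # Samples with zero total abundance give 0 / 0, which is NaN in R.
  len <- colSums(abund * tx_len) / colSums(abund)
  missing <- is.nan(len)
  # Assumed fill rule: geometric mean of the lengths seen in the other samples.
  len[missing] <- exp(mean(log(len[!missing])))
  len
}

len_all    <- gene_length(abund, tx_len)                   # all four samples
len_subset <- gene_length(abund[, c("s1", "s2")], tx_len)  # only two samples

# The length assigned to s1 differs between the two runs,
# while the length for s2 is the same in both.
print(len_all["s1"])
print(len_subset["s1"])
```

In this sketch only the sample with no expression (s1) changes length between runs, which fits the observation that the gene lengths differ with the "no" option while the counts do not. With countsFromAbundance = "lengthScaledTPM", the tximport documentation describes the counts as scaled using the average transcript length over samples, so the counts would also be expected to shift when the set of loaded samples changes.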
