Question about no_feature count in transcript_model_counts.tsv #141

Closed
FabianJetzinger opened this issue Jan 11, 2024 · 3 comments
Labels
bug Something isn't working weird results Something looks odd in the resulting files

Comments

@FabianJetzinger

Hello!

I'm trying to understand IsoQuant's output in more detail, specifically OUT.transcript_counts.tsv and OUT.transcript_model_counts.tsv, using the toy data (MAPT.Mouse) from this repository.

Running:
isoquant.py --reference MAPT.Mouse.reference.fasta --genedb MAPT.Mouse.genedb.gtf --fastq MAPT.Mouse.ONT.simulated.fastq --data_type nanopore -o toy_data_out

While there are 12 transcripts in OUT.transcript_counts.tsv, there are only 10 in OUT.transcript_model_counts.tsv. I take this to mean that for the two transcripts "ENSMUST00000100347.10" and "ENSMUST00000146353.1", IsoQuant does not see enough evidence in the reads to include their transcript models, even though they are known reference transcripts. This seems to be supported by the fact that OUT.read_assignments.tsv shows no FSMs (full splice matches) for these two transcripts, only ISMs (incomplete splice matches).

What confuses me, however, is that OUT.transcript_counts.tsv shows "no_feature 15", while OUT.transcript_model_counts.tsv shows "no_feature 0". I would have expected either "no_feature 15" (the same 15 completely unassigned reads) or "no_feature X" (where X is the 15 unassigned reads plus the number of reads assigned to the transcripts that were not included in the transcript models).

See isoquant_OUT.zip

What am I missing or misunderstanding here? Is this number just always 0 for the transcript_model? Sadly I wasn't able to come up with an answer from reading older issues, so any help would be greatly appreciated.
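The comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of IsoQuant: the helper names `read_counts` and `compare_counts` are hypothetical, and the code assumes each count file is a two-column, tab-separated table (feature ID, count) whose header line, if any, starts with `#`, and that the summary row is named `no_feature` as quoted above.

```python
def read_counts(lines):
    """Parse a two-column, tab-separated count table into {feature_id: count}.

    Assumption: header/comment lines start with '#'; lines with fewer than
    two fields are skipped.
    """
    counts = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        counts[fields[0]] = float(fields[1])
    return counts


def compare_counts(reference, models, summary_row="no_feature"):
    """Report IDs present in only one table and the summary row of each."""
    ref_ids = set(reference) - {summary_row}
    model_ids = set(models) - {summary_row}
    return {
        "only_in_reference": sorted(ref_ids - model_ids),
        "only_in_models": sorted(model_ids - ref_ids),
        # None means the summary row is absent from that table.
        "no_feature": (reference.get(summary_row), models.get(summary_row)),
    }
```

With real output, `read_counts` could be given an open file handle for each of the two TSV files, and `compare_counts` would then list the transcripts missing from one table alongside the two `no_feature` values.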

@andrewprzh andrewprzh added bug Something isn't working weird results Something looks odd in the resulting files labels Jan 24, 2024
@andrewprzh
Collaborator

Dear @FabianJetzinger

Sorry for the delayed response.

Yes, you are right: inconsistencies between OUT.transcript_counts.tsv and OUT.transcript_model_counts.tsv are normal, since the two files differ in nature and are produced by different underlying algorithms. However, there are known minor flaws, and I'm currently working on making the files more consistent with each other.

With respect to the no_feature attribute, you are also right: it seems that this field is simply ignored in OUT.transcript_model_counts.tsv.
I will fix that. Thanks for the report!

Best
Andrey

@FabianJetzinger
Author

Thanks for your reply, that makes it much clearer!

@andrewprzh
Collaborator

This should now be fixed in IsoQuant 3.4.

Also, the correlation between transcript_model_counts and transcript_counts is better now.
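One way a reader might check that agreement on their own data is a Pearson correlation over the transcripts present in both tables. The sketch below is an assumption about how to measure it, not the maintainer's method: the helper names `shared_pairs` and `pearson` are hypothetical, and the inputs are plain `{transcript_id: count}` dictionaries with the `no_feature` row excluded.

```python
import math


def shared_pairs(counts_a, counts_b, skip=("no_feature",)):
    """Return paired count lists for IDs present in both tables."""
    ids = sorted((set(counts_a) & set(counts_b)) - set(skip))
    return [counts_a[i] for i in ids], [counts_b[i] for i in ids]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Transcripts that appear in only one table are dropped by `shared_pairs`, so the result describes agreement on shared transcripts only.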
