Too many WARNINGs, for example: WARNING - Malformed transcript mRNA_23979 has no exons #143

Closed
biochristmas opened this issue Jan 21, 2024 · 4 comments
Labels
third-party problem Problem is related to other tools, libraries, system etc weird results Something looks odd in the resulting files

Comments

@biochristmas

Dear @andrewprzh, when running isoquant.py I got many warnings. I wanted to ask whether this could affect the final output GTF file, and whether you know the underlying cause.
[screenshot: warning]
These are my input and output files:
Files.zip
Best wishes.

@biochristmas
Author

Dear @andrewprzh,
The reference annotation contains 100,919 genes and 107,233 mRNAs. The two input files contain 63,077 and 4,922 sequences, respectively. The command used was 'isoquant.py --reference reference.fa --genedb reference.gtf --fastq mixture.polish.fasta genome.polish.fasta --data_type pacbio_ccs -o output'. The resulting file is OUT.extended_annotation.gtf, which contains 29,734 genes and 35,349 transcripts. As I understand it, this IsoQuant annotation file should include the complete reference annotation plus any novel transcripts found. Why are there significantly fewer genes and transcripts than in the reference annotation GTF file? During the run there were many 'no exons' warnings. Could these warnings be the cause of the missing genes and transcripts?
The input and output files are included in the Files_1 and Files_2 compressed packages. I look forward to your response. Thank you!
Files_2.zip
Files_1.zip
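
For comparing the gene and transcript counts mentioned above, a quick tally of the feature types in column 3 of each GTF can help. This is a minimal sketch, not part of IsoQuant: `count_feature_types` is a hypothetical helper name, and it assumes a tab-separated GTF with 9 columns.

```python
from collections import Counter


def count_feature_types(lines):
    """Count how often each feature type (GTF column 3) occurs."""
    counts = Counter()
    for line in lines:
        # Skip comment/header lines.
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only count well-formed records with all 9 GTF columns.
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts


# Example use (file name taken from the command above):
#   with open("reference.gtf") as handle:
#       print(count_feature_types(handle))
```

Running this on both the reference GTF and OUT.extended_annotation.gtf shows which feature types (for example 'mRNA' versus 'transcript') each file actually uses.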

@andrewprzh
Collaborator

andrewprzh commented Jan 24, 2024

Dear @biochristmas

Regarding the first question: it seems that gffutils does something weird when converting your GTF to .db. Could you try changing all 'mRNA' entries to 'transcript' (i.e. in column 3 of your GTF)? I'll also try to get more insights.
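
The column-3 substitution suggested above can be sketched in Python as follows. This is only an illustration under stated assumptions: `rename_mrna_features` is a hypothetical helper name, and the input is assumed to be a tab-separated GTF whose feature type sits in column 3. Only that column is touched, so IDs such as mRNA_23979 in the attributes column stay unchanged.

```python
def rename_mrna_features(lines):
    """Yield GTF lines with 'mRNA' in column 3 replaced by 'transcript'."""
    for line in lines:
        # Pass comment and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) >= 9 and fields[2] == "mRNA":
            fields[2] = "transcript"
        yield "\t".join(fields)


# Example use (the output file name is a placeholder):
#   with open("reference.gtf") as src, open("reference.fixed.gtf", "w") as dst:
#       dst.writelines(rename_mrna_features(src))
```

Writing to a new file rather than editing in place keeps the original annotation available for comparison.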

With respect to the second question: yes, the problem may be directly connected to the first issue. But as I mentioned in another thread, there is a minor flaw in IsoQuant that will be fixed in the next release.

Best
Andrey

@andrewprzh andrewprzh added weird results Something looks odd in the resulting files third-party problem Problem is related to other tools, libraries, system etc labels Jan 24, 2024
@biochristmas
Author

Dear @andrewprzh,
Thank you for your prompt reply. I replaced 'mRNA' with 'transcript' and successfully resolved the warning. I am looking forward to your new version to solve my urgent problem.

@andrewprzh
Collaborator

IsoQuant 3.4 is finally out, so you can give it a try. It now implements additional GTF consistency checks.
Please re-open if you encounter similar issues.
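
For anyone who wants to inspect a GTF before running, here is a rough pre-flight check that lists transcript records with no exon lines, which is the situation the 'has no exons' warning describes. It is a sketch under assumptions, not IsoQuant's actual consistency check: `transcripts_without_exons` is a hypothetical helper, and it assumes GTF-style attributes of the form transcript_id "...".

```python
import re

# Matches the transcript_id attribute in GTF column 9.
TRANSCRIPT_ID = re.compile(r'\btranscript_id "([^"]+)"')


def transcripts_without_exons(lines, transcript_types=("transcript", "mRNA")):
    """Return IDs of transcript-level records that have no exon records."""
    declared = []       # transcript IDs in order of appearance
    with_exons = set()  # transcript IDs seen on at least one exon line
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        if fields[2] in transcript_types:
            declared.append(match.group(1))
        elif fields[2] == "exon":
            with_exons.add(match.group(1))
    return [tid for tid in declared if tid not in with_exons]
```

If this reports nothing for a file that still triggers the warning, the exons are present in the GTF and the problem more likely lies in how the feature types are interpreted during conversion.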
