reference gene lost #140

zpliu1126 · 2024-01-07T08:01:52Z

Hi~ Andrey,

As the README states:

SAMPLE_ID.extended_annotation.gtf - GTF file with the entire reference annotation plus all discovered novel transcripts;

After successfully running IsoQuant, I checked SAMPLE_ID.extended_annotation.gtf and found that it is missing some genes compared to the reference GFF3 annotation (38,958 vs. 40,281). Are the 1,323 missing reference genes absent because the long reads did not detect them?

# count reference genes (gene records that are not novel genes)
awk '$3=="gene"' A2.extended_annotation.gtf | grep -v novel_gene | wc -l
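
The count above shows how many reference genes are missing, but not which ones. A minimal Python sketch for listing them follows. It assumes that gene records in the reference GFF3 carry an `ID=` attribute and that the extended GTF reports the same identifier as `gene_id`; if the two files use different naming (for example a `gene:` prefix), the IDs would need to be normalized first. The function names are hypothetical helpers, not part of IsoQuant.

```python
import re


def gff3_gene_ids(lines):
    """Collect the ID attribute of every 'gene' record in GFF3 lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = re.search(r"(?:^|;)ID=([^;]+)", fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def gtf_gene_ids(lines):
    """Collect the gene_id attribute of every 'gene' record in GTF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def missing_reference_genes(gff3_path, gtf_path):
    """Return reference gene IDs that do not appear in the extended GTF.

    Assumption: both files use the same gene identifiers.
    """
    with open(gff3_path) as gff3, open(gtf_path) as gtf:
        return sorted(gff3_gene_ids(gff3) - gtf_gene_ids(gtf))
```

Calling `missing_reference_genes("reference.gff3", "A2.extended_annotation.gtf")` would return the sorted list of missing IDs; `reference.gff3` is a placeholder for the actual reference annotation path.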

Best
zpliu

andrewprzh · 2024-01-11T10:22:39Z

Dear @zpliu1126

This looks odd, but I feel like I might have a clue where the bug is...
I'm out of the office for a while, so I'll try to fix it as soon as I'm back.

Best
Andrey

biochristmas · 2024-01-23T10:06:43Z

Hi,
I have encountered a similar problem. The reference annotation contains 100,919 genes and 107,233 mRNAs. The two input files contain 63,077 and 4,922 sequences, respectively. The command I used is:

isoquant.py --reference reference.fa --genedb reference.gtf --fastq mixture.polish.fasta genome.polish.fasta --data_type pacbio_ccs -o output

The resulting file, OUT.extended_annotation.gtf, contains 29,734 genes and 35,349 transcripts. As I understand it, this file should include the complete reference annotation plus any novel transcripts found. Why does it contain far fewer genes and transcripts than the reference annotation GTF? There were also many 'no exons' warnings during the run. Could these warnings be the cause of the missing genes and transcripts?
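
One way to check whether the reference GTF itself contains transcripts without exon lines is sketched below in Python. Whether such records are what triggers IsoQuant's 'no exons' warnings is an assumption; the thread does not confirm it. The function name is a hypothetical helper, and the sketch expects GTF-style `transcript_id "..."` attributes.

```python
import re


def transcripts_without_exons(lines):
    """Return transcript IDs that have a transcript/mRNA record but no exon record."""
    declared = set()    # IDs seen on transcript or mRNA records
    with_exons = set()  # IDs seen on at least one exon record
    for line in lines:
        if line.startswith("#"):
            continue  # skip header and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if not match:
            continue
        if fields[2] in ("transcript", "mRNA"):
            declared.add(match.group(1))
        elif fields[2] == "exon":
            with_exons.add(match.group(1))
    return sorted(declared - with_exons)
```

For example, passing an open handle on `reference.gtf` (`with open("reference.gtf") as fh: transcripts_without_exons(fh)`) would list the transcripts that only have, say, CDS lines.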

andrewprzh · 2024-01-24T10:23:29Z

Dear @biochristmas @zpliu1126

Yes, there is a flaw in the construction function. This problem will be fixed in the next release.

Best
Andrey

andrewprzh · 2024-05-09T07:52:09Z

Version 3.4 has finally been released, and it fixes this issue.

andrewprzh added the "bug" and "weird results" labels Jan 11, 2024

dongdongdong0203 mentioned this issue Mar 29, 2024

Question about Canonical #171

Closed

andrewprzh added the "fixed in release" label May 9, 2024

andrewprzh closed this as completed May 9, 2024
