Standardize gene representation #381

garrettjstevens · 2024-04-10T04:23:51Z

Up until now, we've preserved the imported GFF3 almost exactly as is, with only minor formatting changes for internal storage. However, this has led to places in our code that behave differently depending on how the GFF3 is formatted. A big example is the pair of glyphs CanonicalGeneGlyph and ImplicitExonGeneGlyph. We've also tried uploading GFF3s where neither of these glyphs works. I noticed the same behavior in the Transcript Details Widget: certain things only worked if the GFF3 was formatted in a certain way.

I think the way we need to handle this going forward is to standardize the GFF3 data on import, specifically for genes, so that Apollo can always expect a single format. This means potentially losing data. For example, if a five_prime_UTR in a GFF3 has an ID, but we decide to drop UTRs when standardizing (since UTR locations can be calculated from the locations of other features), we'd lose the UTR's ID. I think this is unavoidable, though, and it can be somewhat mitigated by a robust GFF3 export system.

Here are some GFF3 excerpts I found that illustrate the different ways genes are formatted:

Sequence Ontology GFF3 Spec

See GFF3

##gff-version 3.1.26
##sequence-region ctg123 1 1497228
ctg123	.	gene	1000	9000	.	+	.	ID=gene00001;Name=EDEN
ctg123	.	TF_binding_site	1000	1012	.	+	.	ID=tfbs00001;Parent=gene00001
ctg123	.	mRNA	1050	9000	.	+	.	ID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123	.	mRNA	1050	9000	.	+	.	ID=mRNA00002;Parent=gene00001;Name=EDEN.2
ctg123	.	mRNA	1300	9000	.	+	.	ID=mRNA00003;Parent=gene00001;Name=EDEN.3
ctg123	.	exon	1300	1500	.	+	.	ID=exon00001;Parent=mRNA00003
ctg123	.	exon	1050	1500	.	+	.	ID=exon00002;Parent=mRNA00001,mRNA00002
ctg123	.	exon	3000	3902	.	+	.	ID=exon00003;Parent=mRNA00001,mRNA00003
ctg123	.	exon	5000	5500	.	+	.	ID=exon00004;Parent=mRNA00001,mRNA00002,mRNA00003
ctg123	.	exon	7000	9000	.	+	.	ID=exon00005;Parent=mRNA00001,mRNA00002,mRNA00003
ctg123	.	CDS	1201	1500	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	3000	3902	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	5000	5500	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	7000	7600	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	1201	1500	.	+	0	ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
ctg123	.	CDS	5000	5500	.	+	0	ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
ctg123	.	CDS	7000	7600	.	+	0	ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
ctg123	.	CDS	3301	3902	.	+	0	ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
ctg123	.	CDS	5000	5500	.	+	1	ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
ctg123	.	CDS	7000	7600	.	+	1	ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
ctg123	.	CDS	3391	3902	.	+	0	ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
ctg123	.	CDS	5000	5500	.	+	1	ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
ctg123	.	CDS	7000	7600	.	+	1	ID=cds00004;Parent=mRNA00003;Name=edenprotein.4

Gene contains mRNA, exon, and CDS. CDSs have multiple locations under the same ID.
Source: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md#the-canonical-gene
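
To make "multiple locations under the same ID" concrete, here is a minimal Python sketch that collects all rows sharing a CDS ID into one list of locations. The function names `parse_attributes` and `group_locations_by_id` are made up for illustration and are not part of Apollo; the sketch also skips URL-unescaping and other details a real parser would need.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict (no URL-unescaping here)."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value
    return attributes


def group_locations_by_id(gff3_text, feature_type="CDS"):
    """Map each feature ID to the (start, end, phase) rows that share it."""
    grouped = defaultdict(list)
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.split("\t")
        if len(columns) != 9 or columns[2] != feature_type:
            continue
        feature_id = parse_attributes(columns[8]).get("ID")
        if feature_id is None:
            continue  # rows without an ID cannot be merged this way
        grouped[feature_id].append((int(columns[3]), int(columns[4]), columns[7]))
    return dict(grouped)


# The three cds00002 rows from the excerpt above, with explicit tabs.
example = "\n".join([
    "ctg123\t.\tCDS\t1201\t1500\t.\t+\t0\tID=cds00002;Parent=mRNA00002;Name=edenprotein.2",
    "ctg123\t.\tCDS\t5000\t5500\t.\t+\t0\tID=cds00002;Parent=mRNA00002;Name=edenprotein.2",
    "ctg123\t.\tCDS\t7000\t7600\t.\t+\t0\tID=cds00002;Parent=mRNA00002;Name=edenprotein.2",
])

cds_by_id = group_locations_by_id(example)
```

Here `cds_by_id` holds a single `cds00002` entry with three locations, which is the "one feature, several rows" shape the spec example relies on.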

Ensembl GRCh38

See GFF3

##gff-version 3
##sequence-region   19 1 58617616
#!genome-build Genome Reference Consortium GRCh38.p14
#!genome-version GRCh38
#!genome-date 2013-12
#!genome-build-accession GCA_000001405.29
#!genebuild-last-updated 2023-07
19	ensembl_havana	gene	44905791	44909393	.	+	.	ID=gene:ENSG00000130203;Name=APOE;biotype=protein_coding;description=apolipoprotein E [Source:HGNC Symbol%3BAcc:HGNC:613];gene_id=ENSG00000130203;logic_name=ensembl_havana_gene_homo_sapiens;version=10
19	havana	mRNA	44905791	44908944	.	+	.	ID=transcript:ENST00000446996;Parent=gene:ENSG00000130203;Name=APOE-204;biotype=protein_coding;transcript_id=ENST00000446996;transcript_support_level=2;version=5
19	havana	exon	44905791	44905841	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00001768924;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001768924;rank=1;version=1
19	havana	five_prime_UTR	44905791	44905841	.	+	.	Parent=transcript:ENST00000446996
19	havana	five_prime_UTR	44906587	44906624	.	+	.	Parent=transcript:ENST00000446996
19	havana	exon	44906587	44906667	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00001667751;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00001667751;rank=2;version=1
19	havana	CDS	44906625	44906667	.	+	0	ID=CDS:ENSP00000413135;Parent=transcript:ENST00000446996;protein_id=ENSP00000413135
19	havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=3;version=1
19	havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000413135;Parent=transcript:ENST00000446996;protein_id=ENSP00000413135
19	havana	exon	44908533	44908944	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00001664168;constitutive=0;ensembl_end_phase=0;ensembl_phase=2;exon_id=ENSE00001664168;rank=4;version=1
19	havana	CDS	44908533	44908944	.	+	1	ID=CDS:ENSP00000413135;Parent=transcript:ENST00000446996;protein_id=ENSP00000413135
19	havana	lnc_RNA	44905796	44907326	.	+	.	ID=transcript:ENST00000485628;Parent=gene:ENSG00000130203;Name=APOE-205;biotype=retained_intron;transcript_id=ENST00000485628;transcript_support_level=1;version=2
19	havana	exon	44905796	44905841	.	+	.	Parent=transcript:ENST00000485628;Name=ENSE00001048576;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001048576;rank=1;version=3
19	havana	exon	44906602	44907326	.	+	.	Parent=transcript:ENST00000485628;Name=ENSE00001943579;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001943579;rank=2;version=2
19	ensembl_havana	mRNA	44905796	44909393	.	+	.	ID=transcript:ENST00000252486;Parent=gene:ENSG00000130203;Name=APOE-201;biotype=protein_coding;ccdsid=CCDS12647.1;tag=basic,Ensembl_canonical,MANE_Select;transcript_id=ENST00000252486;transcript_support_level=1 (assigned to previous version 8);version=9
19	ensembl_havana	exon	44905796	44905841	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00001048576;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001048576;rank=1;version=3
19	ensembl_havana	five_prime_UTR	44905796	44905841	.	+	.	Parent=transcript:ENST00000252486
19	ensembl_havana	five_prime_UTR	44906602	44906624	.	+	.	Parent=transcript:ENST00000252486
19	ensembl_havana	exon	44906602	44906667	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00003577086;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00003577086;rank=2;version=1
19	ensembl_havana	CDS	44906625	44906667	.	+	0	ID=CDS:ENSP00000252486;Parent=transcript:ENST00000252486;protein_id=ENSP00000252486
19	ensembl_havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=3;version=1
19	ensembl_havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000252486;Parent=transcript:ENST00000252486;protein_id=ENSP00000252486
19	ensembl_havana	CDS	44908533	44909250	.	+	1	ID=CDS:ENSP00000252486;Parent=transcript:ENST00000252486;protein_id=ENSP00000252486
19	ensembl_havana	exon	44908533	44909393	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00000893954;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=ENSE00000893954;rank=4;version=3
19	ensembl_havana	three_prime_UTR	44909251	44909393	.	+	.	Parent=transcript:ENST00000252486
19	havana	mRNA	44905812	44909025	.	+	.	ID=transcript:ENST00000434152;Parent=gene:ENSG00000130203;Name=APOE-203;biotype=protein_coding;transcript_id=ENST00000434152;transcript_support_level=2;version=5
19	havana	five_prime_UTR	44905812	44905868	.	+	.	Parent=transcript:ENST00000434152
19	havana	exon	44905812	44905923	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00001601606;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00001601606;rank=1;version=1
19	havana	CDS	44905869	44905923	.	+	0	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	exon	44906602	44906667	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00003463686;constitutive=0;ensembl_end_phase=1;ensembl_phase=1;exon_id=ENSE00003463686;rank=2;version=1
19	havana	CDS	44906602	44906667	.	+	2	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=3;version=1
19	havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	exon	44908533	44909025	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00001700162;constitutive=0;ensembl_end_phase=0;ensembl_phase=2;exon_id=ENSE00001700162;rank=4;version=1
19	havana	CDS	44908533	44909025	.	+	1	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	mRNA	44906360	44908954	.	+	.	ID=transcript:ENST00000425718;Parent=gene:ENSG00000130203;Name=APOE-202;biotype=protein_coding;transcript_id=ENST00000425718;transcript_support_level=1;version=1
19	havana	five_prime_UTR	44906360	44906624	.	+	.	Parent=transcript:ENST00000425718
19	havana	exon	44906360	44906667	.	+	.	Parent=transcript:ENST00000425718;Name=ENSE00001620702;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00001620702;rank=1;version=1
19	havana	CDS	44906625	44906667	.	+	0	ID=CDS:ENSP00000410423;Parent=transcript:ENST00000425718;protein_id=ENSP00000410423
19	havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000425718;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=2;version=1
19	havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000410423;Parent=transcript:ENST00000425718;protein_id=ENSP00000410423
19	havana	exon	44908533	44908954	.	+	.	Parent=transcript:ENST00000425718;Name=ENSE00001599675;constitutive=0;ensembl_end_phase=1;ensembl_phase=2;exon_id=ENSE00001599675;rank=3;version=1
19	havana	CDS	44908533	44908954	.	+	1	ID=CDS:ENSP00000410423;Parent=transcript:ENST00000425718;protein_id=ENSP00000410423

Gene contains mRNA, exon, CDS, five_prime_UTR, and three_prime_UTR. CDSs have multiple locations under the same ID.
Source: https://ftp.ensembl.org/pub/release-111/gff3/homo_sapiens/Homo_sapiens.GRCh38.111.gff3.gz
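
The UTR rows in this excerpt are the kind of data that could be recomputed rather than stored. As a rough check, here is a minimal Python sketch that derives UTR intervals from a transcript's exon and CDS rows. `derive_utrs` is a hypothetical helper, not existing Apollo code, and it assumes 1-based inclusive coordinates, a CDS that lies entirely within the exons, and that everything exonic outside the CDS span counts as UTR.

```python
def derive_utrs(exons, cds, strand="+"):
    """Return (five_prime, three_prime) lists of (start, end) intervals.

    Exonic sequence before the first coding base or after the last coding
    base is treated as UTR. Coordinates are 1-based and inclusive.
    """
    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)
    upstream, downstream = [], []
    for start, end in sorted(exons):
        if start < cds_start:
            upstream.append((start, min(end, cds_start - 1)))
        if end > cds_end:
            downstream.append((max(start, cds_end + 1), end))
    if strand == "-":
        # On the reverse strand the higher coordinates are the 5' end.
        return downstream, upstream
    return upstream, downstream


# Exon and CDS rows of transcript:ENST00000252486 from the excerpt above.
exons = [
    (44905796, 44905841),
    (44906602, 44906667),
    (44907760, 44907952),
    (44908533, 44909393),
]
cds = [
    (44906625, 44906667),
    (44907760, 44907952),
    (44908533, 44909250),
]

five_prime, three_prime = derive_utrs(exons, cds, strand="+")
```

For this transcript the derived intervals are the same as the five_prime_UTR and three_prime_UTR rows Ensembl lists, so only attributes such as an ID would be lost by dropping them.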

RefSeq GRCh38

See GFF3

##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build GRCh38.p14
#!genome-build-accession NCBI_Assembly:GCF_000001405.40
#!annotation-date 10/02/2023
#!annotation-source NCBI RefSeq GCF_000001405.40-RS_2023_10
##sequence-region NC_000019.10 1 58617616
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606
NC_000019.10	BestRefSeq	gene	44905796	44909393	.	+	.	ID=gene-APOE;Dbxref=GeneID:348,HGNC:HGNC:613,MIM:107741;Name=APOE;description=apolipoprotein E;gbkey=Gene;gene=APOE;gene_biotype=protein_coding;gene_synonym=AD2,APO-E,ApoE4,LDLCQ5,LPG
NC_000019.10	BestRefSeq	mRNA	44905796	44909393	.	+	.	ID=rna-NM_001302688.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302688.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44905796	44905923	.	+	.	ID=exon-NM_001302688.2-1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_001302688.2-2;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302688.2-3;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302688.2-4;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	CDS	44905869	44905923	.	+	0	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	CDS	44906602	44906667	.	+	2	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	mRNA	44905796	44909393	.	+	.	ID=rna-NM_001302691.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302691.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44905796	44905841	.	+	.	ID=exon-NM_001302691.2-1;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44906587	44906667	.	+	.	ID=exon-NM_001302691.2-2;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302691.2-3;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302691.2-4;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_001289620.1;Parent=rna-NM_001302691.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289620.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289620.1;Note=isoform b precursor is encoded by transcript variant 5;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289620.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289620.1;Parent=rna-NM_001302691.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289620.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289620.1;Note=isoform b precursor is encoded by transcript variant 5;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289620.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289620.1;Parent=rna-NM_001302691.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289620.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289620.1;Note=isoform b precursor is encoded by transcript variant 5;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289620.1
NC_000019.10	BestRefSeq	mRNA	44905796	44909393	.	+	.	ID=rna-NM_000041.4;Parent=gene-APOE;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;Name=NM_000041.4;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44905796	44905841	.	+	.	ID=exon-NM_000041.4-1;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_000041.4-2;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_000041.4-3;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_000041.4-4;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_000032.1;Parent=rna-NM_000041.4;Dbxref=CCDS:CCDS12647.1,Ensembl:ENSP00000252486.3,GeneID:348,GenBank:NP_000032.1,HGNC:HGNC:613,MIM:107741;Name=NP_000032.1;Note=isoform b precursor is encoded by transcript variant 2;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_000032.1;tag=MANE Select
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_000032.1;Parent=rna-NM_000041.4;Dbxref=CCDS:CCDS12647.1,Ensembl:ENSP00000252486.3,GeneID:348,GenBank:NP_000032.1,HGNC:HGNC:613,MIM:107741;Name=NP_000032.1;Note=isoform b precursor is encoded by transcript variant 2;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_000032.1;tag=MANE Select
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_000032.1;Parent=rna-NM_000041.4;Dbxref=CCDS:CCDS12647.1,Ensembl:ENSP00000252486.3,GeneID:348,GenBank:NP_000032.1,HGNC:HGNC:613,MIM:107741;Name=NP_000032.1;Note=isoform b precursor is encoded by transcript variant 2;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_000032.1;tag=MANE Select
NC_000019.10	BestRefSeq	mRNA	44906021	44909393	.	+	.	ID=rna-NM_001302689.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302689.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44906021	44906044	.	+	.	ID=exon-NM_001302689.2-1;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_001302689.2-2;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302689.2-3;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302689.2-4;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_001289618.1;Parent=rna-NM_001302689.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289618.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289618.1;Note=isoform b precursor is encoded by transcript variant 3;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289618.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289618.1;Parent=rna-NM_001302689.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289618.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289618.1;Note=isoform b precursor is encoded by transcript variant 3;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289618.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289618.1;Parent=rna-NM_001302689.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289618.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289618.1;Note=isoform b precursor is encoded by transcript variant 3;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289618.1
NC_000019.10	BestRefSeq	mRNA	44906401	44909393	.	+	.	ID=rna-NM_001302690.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302690.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44906401	44906524	.	+	.	ID=exon-NM_001302690.2-1;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_001302690.2-2;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302690.2-3;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302690.2-4;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_001289619.1;Parent=rna-NM_001302690.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289619.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289619.1;Note=isoform b precursor is encoded by transcript variant 4;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289619.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289619.1;Parent=rna-NM_001302690.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289619.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289619.1;Note=isoform b precursor is encoded by transcript variant 4;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289619.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289619.1;Parent=rna-NM_001302690.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289619.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289619.1;Note=isoform b precursor is encoded by transcript variant 4;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289619.1

Gene contains mRNA, exon, and CDS. CDSs have multiple locations under the same ID. (Matches Sequence Ontology spec)
Source: https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.gff.gz
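
One difference from the spec example that a standardizer would probably still need to handle: the spec example attaches a single exon row to several mRNAs through a comma-separated Parent, while here every transcript has its own exon rows. A minimal Python sketch of reading both styles into the same per-transcript structure follows. `exons_per_transcript` is a hypothetical name, and the sketch ignores attribute escaping.

```python
from collections import defaultdict


def exons_per_transcript(gff3_text):
    """Map each parent ID to a sorted list of (start, end) exon intervals.

    An exon row with several comma-separated parents is counted once
    under each of them.
    """
    by_parent = defaultdict(list)
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.split("\t")
        if len(columns) != 9 or columns[2] != "exon":
            continue
        for attribute in columns[8].split(";"):
            if attribute.startswith("Parent="):
                for parent_id in attribute[len("Parent="):].split(","):
                    by_parent[parent_id].append((int(columns[3]), int(columns[4])))
    return {parent: sorted(locations) for parent, locations in by_parent.items()}


# The exon rows from the Sequence Ontology spec example, with explicit tabs.
spec_exons = "\n".join([
    "ctg123\t.\texon\t1300\t1500\t.\t+\t.\tID=exon00001;Parent=mRNA00003",
    "ctg123\t.\texon\t1050\t1500\t.\t+\t.\tID=exon00002;Parent=mRNA00001,mRNA00002",
    "ctg123\t.\texon\t3000\t3902\t.\t+\t.\tID=exon00003;Parent=mRNA00001,mRNA00003",
    "ctg123\t.\texon\t5000\t5500\t.\t+\t.\tID=exon00004;Parent=mRNA00001,mRNA00002,mRNA00003",
    "ctg123\t.\texon\t7000\t9000\t.\t+\t.\tID=exon00005;Parent=mRNA00001,mRNA00002,mRNA00003",
])

exons_by_mrna = exons_per_transcript(spec_exons)
```

Run on RefSeq-style rows, the same function returns the same shape, because each of those exon rows simply has a single parent.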

WormBase C. elegans

See GFF3

##gff-version 3
##sequence-region III 1 13783801
III	WormBase	gene	3573581	3578771	.	+	.	ID=Gene:WBGene00003605;Name=WBGene00003605;locus=nhr-6;sequence_name=C48D5.1;biotype=protein_coding;so_term_name=protein_coding_gene;curie=WB:WBGene00003605;Alias=nhr-6,C48D5.1
III	WormBase	mRNA	3573581	3578771	.	+	.	ID=Transcript:C48D5.1a.1;Parent=Gene:WBGene00003605;Name=C48D5.1a.1;wormpep=CE24859;locus=nhr-6;uniprot_id=P41829
III	WormBase	exon	3573581	3573740	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	five_prime_UTR	3573581	3573698	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	gene	3573678	3573736	.	-	.	ID=Gene:WBGene00200413;Name=WBGene00200413;interpolated_map_position=-5.51505;sequence_name=C48D5.8;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00200413;Alias=C48D5.8
III	WormBase	ncRNA	3573678	3573736	.	-	.	ID=Transcript:C48D5.8;Parent=Gene:WBGene00200413;Name=C48D5.8
III	WormBase	exon	3573678	3573736	.	-	.	Parent=Transcript:C48D5.8
III	WormBase	CDS	3573699	3573740	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3573858	3573923	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3574148	3574250	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3574294	3574376	.	+	2	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3576192	3576269	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3576317	3576468	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3576609	3576803	.	+	1	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3576870	3577159	.	+	1	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3577205	3577362	.	+	2	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3577410	3577770	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3577848	3577969	.	+	2	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3578181	3578390	.	+	0	ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829
III	WormBase	intron	3573741	3573857	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK584632 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B
III	WormBase	mRNA	3573851	3578390	.	+	.	ID=Transcript:C48D5.1b.1;Parent=Gene:WBGene00003605;Name=C48D5.1b.1;wormpep=CE42591;locus=nhr-6;uniprot_id=P41829
III	WormBase	exon	3573851	3573923	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	five_prime_UTR	3573851	3573923	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3573858	3573923	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	intron	3573924	3574147	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B
III	WormBase	intron	3573924	3574147	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B
III	WormBase	exon	3574148	3574250	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3574148	3574250	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	five_prime_UTR	3574148	3574250	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	intron	3574251	3574293	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3574251	3574293	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	exon	3574294	3574376	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3574294	3574376	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	five_prime_UTR	3574294	3574376	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	intron	3574377	3576191	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B
III	WormBase	intron	3574377	3576191	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B
III	WormBase	mRNA	3576060	3578754	.	+	.	ID=Transcript:C48D5.1b.2;Parent=Gene:WBGene00003605;Name=C48D5.1b.2;wormpep=CE42591;locus=nhr-6;uniprot_id=P41829
III	WormBase	exon	3576060	3576269	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	five_prime_UTR	3576060	3576269	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	exon	3576192	3576269	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3576192	3576269	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	five_prime_UTR	3576192	3576269	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	intron	3576270	3576316	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3576270	3576316	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3576270	3576316	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	five_prime_UTR	3576317	3576352	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	five_prime_UTR	3576317	3576352	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	exon	3576317	3576468	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3576317	3576468	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3576317	3576468	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	CDS	3576353	3576468	.	+	0	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3576609	3576803	.	+	1	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3576870	3577159	.	+	1	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3577205	3577362	.	+	2	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3577410	3577770	.	+	0	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3577848	3577969	.	+	2	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	CDS	3578181	3578390	.	+	0	ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829
III	WormBase	intron	3576469	3576608	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3576469	3576608	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3576469	3576608	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	exon	3576609	3576803	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3576609	3576803	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3576609	3576803	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	intron	3576804	3576869	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3576804	3576869	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3576804	3576869	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	exon	3576870	3577159	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3576870	3577159	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3576870	3577159	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	intron	3577160	3577204	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK582942 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3577160	3577204	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK582942 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3577160	3577204	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK582942 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	exon	3577205	3577362	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3577205	3577362	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3577205	3577362	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	intron	3577363	3577409	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK581015 %3B Confirmed_cDNA U13076 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3577363	3577409	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK581015 %3B Confirmed_cDNA U13076 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3577363	3577409	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK581015 %3B Confirmed_cDNA U13076 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	exon	3577410	3577770	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3577410	3577770	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3577410	3577770	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	intron	3577771	3577847	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B
III	WormBase	intron	3577771	3577847	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B
III	WormBase	intron	3577771	3577847	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B
III	WormBase	exon	3577848	3577969	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	exon	3577848	3577969	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3577848	3577969	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	intron	3577970	3578180	.	+	.	Parent=Transcript:C48D5.1a.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3577970	3578180	.	+	.	Parent=Transcript:C48D5.1b.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	intron	3577970	3578180	.	+	.	Parent=Transcript:C48D5.1b.2;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B
III	WormBase	exon	3578181	3578390	.	+	.	Parent=Transcript:C48D5.1b.1
III	WormBase	exon	3578181	3578754	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	exon	3578181	3578771	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	three_prime_UTR	3578391	3578754	.	+	.	Parent=Transcript:C48D5.1b.2
III	WormBase	three_prime_UTR	3578391	3578771	.	+	.	Parent=Transcript:C48D5.1a.1
III	WormBase	gene	3578695	3581736	.	-	.	ID=Gene:WBGene00008180;Name=WBGene00008180;interpolated_map_position=-5.46529;sequence_name=C48D5.3;biotype=protein_coding;so_term_name=protein_coding_gene;curie=WB:WBGene00008180;Alias=C48D5.3
III	WormBase	mRNA	3578695	3581736	.	-	.	ID=Transcript:C48D5.3.1;Parent=Gene:WBGene00008180;Name=C48D5.3.1;wormpep=CE44237;uniprot_id=Q7YX49
III	WormBase	three_prime_UTR	3578695	3578881	.	-	.	Parent=Transcript:C48D5.3.1
III	WormBase	exon	3578695	3579024	.	-	.	Parent=Transcript:C48D5.3.1
III	WormBase	CDS	3578882	3579024	.	-	2	ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49
III	WormBase	CDS	3580321	3580458	.	-	2	ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49
III	WormBase	CDS	3581261	3581341	.	-	2	ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49
III	WormBase	CDS	3581558	3581693	.	-	0	ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49
III	WormBase	intron	3579025	3580320	.	-	.	Parent=Transcript:C48D5.3.1;Note=Confirmed_EST FM247012 %3B Confirmed_EST elegans_PE_SS_GG2424%7Cc0_g1_i1 %3B Confirmed_EST L2_Nanopore_Roach_76766 %3B Confirmed_EST L2_Nanopore_Roach_76766 %3B
III	WormBase	exon	3580321	3580458	.	-	.	Parent=Transcript:C48D5.3.1
III	WormBase	intron	3580459	3581260	.	-	.	Parent=Transcript:C48D5.3.1;Note=Confirmed_EST FM247012 %3B Confirmed_EST elegans_PE_SS_GG2424%7Cc0_g1_i1 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B
III	WormBase	exon	3581261	3581341	.	-	.	Parent=Transcript:C48D5.3.1
III	WormBase	intron	3581342	3581557	.	-	.	Parent=Transcript:C48D5.3.1;Note=Confirmed_EST FM247012 %3B Confirmed_EST elegans_PE_SS_GG2424%7Cc0_g1_i1 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B
III	WormBase	exon	3581558	3581736	.	-	.	Parent=Transcript:C48D5.3.1
III	WormBase	five_prime_UTR	3581694	3581736	.	-	.	Parent=Transcript:C48D5.3.1
III	WormBase	gene	3586143	3586247	.	-	.	ID=Gene:WBGene00199419;Name=WBGene00199419;interpolated_map_position=-5.42563;sequence_name=C48D5.7;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00199419;Alias=C48D5.7
III	WormBase	ncRNA	3586143	3586247	.	-	.	ID=Transcript:C48D5.7;Parent=Gene:WBGene00199419;Name=C48D5.7
III	WormBase	exon	3586143	3586247	.	-	.	Parent=Transcript:C48D5.7
III	WormBase_transposon	transposable_element	3586983	3587234	.	+	.	ID=Transposon:Predicted_PALTTTAAA2_10197;Name=Predicted_PALTTTAAA2_10197;family=PALTTTAAA2
III	WormBase	gene	3587601	3587747	.	-	.	ID=Gene:WBGene00197446;Name=WBGene00197446;interpolated_map_position=-5.41583;sequence_name=C48D5.6;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00197446;Alias=C48D5.6
III	WormBase	ncRNA	3587601	3587747	.	-	.	ID=Transcript:C48D5.6;Parent=Gene:WBGene00197446;Name=C48D5.6
III	WormBase	exon	3587601	3587747	.	-	.	Parent=Transcript:C48D5.6
III	WormBase_transposon	transposable_element	3588011	3588163	.	+	.	ID=Transposon:Predicted_HAT1_CE_10198;Name=Predicted_HAT1_CE_10198;family=HAT1_CE
III	WormBase	gene	3588607	3588756	.	-	.	ID=Gene:WBGene00196486;Name=WBGene00196486;interpolated_map_position=-5.40914;sequence_name=C48D5.5;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00196486;Alias=C48D5.5
III	WormBase	ncRNA	3588607	3588756	.	-	.	ID=Transcript:C48D5.5;Parent=Gene:WBGene00196486;Name=C48D5.5
III	WormBase	exon	3588607	3588756	.	-	.	Parent=Transcript:C48D5.5
III	WormBase	gene	3589207	3589355	.	-	.	ID=Gene:WBGene00196329;Name=WBGene00196329;interpolated_map_position=-5.40517;sequence_name=C48D5.4;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00196329;Alias=C48D5.4
III	WormBase	ncRNA	3589207	3589355	.	-	.	ID=Transcript:C48D5.4;Parent=Gene:WBGene00196329;Name=C48D5.4
III	WormBase	exon	3589207	3589355	.	-	.	Parent=Transcript:C48D5.4
III	WormBase	gene	3590722	3611971	.	+	.	ID=Gene:WBGene00004213;Name=WBGene00004213;locus=ptp-1;sequence_name=C48D5.2;biotype=protein_coding;so_term_name=protein_coding_gene;curie=WB:WBGene00004213;Alias=ptp-1,C48D5.2
III	WormBase	mRNA	3590722	3611971	.	+	.	ID=Transcript:C48D5.2a.1;Parent=Gene:WBGene00004213;Name=C48D5.2a.1;wormpep=CE17578;locus=ptp-1;uniprot_id=P28191
III	WormBase	exon	3590722	3590907	.	+	.	Parent=Transcript:C48D5.2a.1
III	WormBase	five_prime_UTR	3590722	3590769	.	+	.	Parent=Transcript:C48D5.2a.1
III	WormBase	CDS	3590770	3590907	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3591624	3591740	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3591785	3592001	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3592397	3592523	.	+	2	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3592604	3592850	.	+	1	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3593165	3593391	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3594210	3594323	.	+	1	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3594440	3594592	.	+	1	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3604815	3604917	.	+	1	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3607049	3607232	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3607558	3607745	.	+	2	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3608124	3608424	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3608994	3609222	.	+	2	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3609685	3609772	.	+	1	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3609824	3610099	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3610561	3610845	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	CDS	3611478	3611564	.	+	0	ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191
III	WormBase	intron	3590908	3591623	.	+	.	Parent=Transcript:C48D5.2a.1;Note=Confirmed_EST yk417b1.5 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc2_g2_i2 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B
III	WormBase	exon	3591624	3591740	.	+	.	Parent=Transcript:C48D5.2a.1
III	WormBase	intron	3591741	3591784	.	+	.	Parent=Transcript:C48D5.2a.1;Note=Confirmed_EST FM247485 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc2_g2_i2 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B

Gene contains mRNA, exon, CDS, five_prime_UTR, three_prime_UTR, and intron. CDSs have multiple locations under the same ID.
Source: https://downloads.wormbase.org/releases/WS292/species/c_elegans/PRJNA13758/c_elegans.PRJNA13758.WS292.annotations.gff3.gz
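
Since WormBase repeats one CDS ID across several rows, an importer would have to merge those rows into a single multi-location feature. Here is a minimal sketch of that grouping step, not Apollo's actual import code. The function names are made up for illustration, and the rows are copied from the C48D5.3 gene above with the attributes trimmed to ID and Parent.

```python
from collections import defaultdict

# CDS rows for Transcript:C48D5.3.1 from the excerpt above (attributes trimmed),
# plus one exon row to show that non-CDS rows are skipped.
ROWS = [
    "III\tWormBase\texon\t3578695\t3579024\t.\t-\t.\tParent=Transcript:C48D5.3.1",
    "III\tWormBase\tCDS\t3578882\t3579024\t.\t-\t2\tID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1",
    "III\tWormBase\tCDS\t3580321\t3580458\t.\t-\t2\tID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1",
    "III\tWormBase\tCDS\t3581261\t3581341\t.\t-\t2\tID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1",
    "III\tWormBase\tCDS\t3581558\t3581693\t.\t-\t0\tID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1",
]


def parse_attributes(column):
    """Split GFF3 column 9 into a {key: value} dict (values left unparsed)."""
    attrs = {}
    for pair in column.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing ';'
        key, _, value = pair.partition("=")
        attrs[key] = value
    return attrs


def group_cds_by_id(lines):
    """Map each CDS ID to a sorted list of (start, end, phase) tuples."""
    grouped = defaultdict(list)
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        cds_id = parse_attributes(cols[8]).get("ID")
        if cds_id is None:
            continue  # rows without an ID need a different grouping key
        grouped[cds_id].append((int(cols[3]), int(cols[4]), cols[7]))
    return {cds_id: sorted(locs) for cds_id, locs in grouped.items()}


grouped = group_cds_by_id(ROWS)
print(len(grouped["CDS:C48D5.3"]))  # number of locations sharing the one ID
```

A file that gives each CDS location its own ID would come out of this function as one single-location entry per row, so the two styles would still need to be reconciled by grouping on Parent instead.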

PlasmoDB P. falciparum

See GFF3

##gff-version 3
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=36329
##sequence-region Pf3D7_01_v3 1 640851
Pf3D7_01_v3	VEuPathDB	protein_coding_gene	74295	75622	.	+	.	ID=PF3D7_0101300;Name=MC-2TM;description=Pfmc-2TM Maurer's cleft two transmembrane protein;ebi_biotype=protein_coding
Pf3D7_01_v3	VEuPathDB	mRNA	74295	75622	.	+	.	ID=PF3D7_0101300.1;Parent=PF3D7_0101300;description=Pfmc-2TM Maurer's cleft two transmembrane protein;gene_ebi_biotype=protein_coding
Pf3D7_01_v3	VEuPathDB	exon	74295	74631	.	+	.	ID=exon_PF3D7_0101300.1-E1;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300
Pf3D7_01_v3	VEuPathDB	exon	74728	75622	.	+	.	ID=exon_PF3D7_0101300.1-E2;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300
Pf3D7_01_v3	VEuPathDB	CDS	74563	74631	.	+	0	ID=PF3D7_0101300.1-p1-CDS1;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300;protein_source_id=PF3D7_0101300.1-p1
Pf3D7_01_v3	VEuPathDB	CDS	74728	75366	.	+	0	ID=PF3D7_0101300.1-p1-CDS2;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300;protein_source_id=PF3D7_0101300.1-p1
Pf3D7_01_v3	VEuPathDB	five_prime_UTR	74295	74562	.	+	.	ID=utr_PF3D7_0101300.1_1;Parent=PF3D7_0101300.1
Pf3D7_01_v3	VEuPathDB	three_prime_UTR	75367	75622	.	+	.	ID=utr_PF3D7_0101300.1_2;Parent=PF3D7_0101300.1

Gene contains mRNA, exon, CDS, five_prime_UTR, and three_prime_UTR. Each CDS location has a unique ID.
Source: https://plasmodb.org/common/downloads/Current_Release/Pfalciparum3D7/gff/data/PlasmoDB-67_Pfalciparum3D7.gff
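
The PlasmoDB record is a handy check on the idea that UTR locations can be calculated from other features. The sketch below is an illustration under assumptions, not a proposed implementation: it subtracts the CDS intervals from the exons and labels what is left by strand. The helper names are hypothetical, and coordinates are 1-based and inclusive as in GFF3.

```python
def subtract(interval, cuts):
    """Return the parts of `interval` not covered by any interval in `cuts`."""
    start, end = interval
    pieces = []
    for cut_start, cut_end in sorted(cuts):
        if cut_end < start or cut_start > end:
            continue  # this cut does not touch the remaining interval
        if cut_start > start:
            pieces.append((start, cut_start - 1))
        start = max(start, cut_end + 1)
        if start > end:
            break
    if start <= end:
        pieces.append((start, end))
    return pieces


def derive_utrs(exons, cds, strand):
    """Derive UTR intervals as exon sequence lying outside the CDS span."""
    cds_start = min(s for s, _ in cds)
    cds_end = max(e for _, e in cds)
    low_side, high_side = [], []
    for exon in sorted(exons):
        for piece in subtract(exon, cds):
            if piece[1] < cds_start:
                low_side.append(piece)
            elif piece[0] > cds_end:
                high_side.append(piece)
    if strand == "-":
        # On the minus strand the 5' end is at the higher coordinates.
        return {"five_prime_UTR": high_side, "three_prime_UTR": low_side}
    return {"five_prime_UTR": low_side, "three_prime_UTR": high_side}


# Exon and CDS coordinates of PF3D7_0101300.1 from the excerpt above.
exons = [(74295, 74631), (74728, 75622)]
cds = [(74563, 74631), (74728, 75366)]
print(derive_utrs(exons, cds, "+"))
# {'five_prime_UTR': [(74295, 74562)], 'three_prime_UTR': [(75367, 75622)]}
```

The result matches the five_prime_UTR and three_prime_UTR rows in the file. What cannot be recovered this way are the UTR IDs (utr_PF3D7_0101300.1_1 and utr_PF3D7_0101300.1_2), which is the data loss discussed earlier.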

We need to figure out what our standard internal representation will be so that we can start figuring out how to standardize the data.

dariober · 2024-04-10T11:04:04Z

Here are some other examples:

The Arabidopsis Information Resource

Source: https://www.arabidopsis.org/download_files/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes_transposons.gff

Features with no children have no ID (e.g. CDS)
Parent attribute can have multiple values
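
Both points affect how column 9 has to be parsed. As a rough sketch, assuming tab-separated input and the comma-separated, percent-encoded attribute values described in the GFF3 spec, the following splits Parent into a list and builds a stand-in key for rows that have no ID. The key scheme is purely hypothetical and is not something Apollo does today.

```python
from urllib.parse import unquote


def parse_attributes(column):
    """Parse GFF3 column 9 into {key: [values]}, tolerating a trailing ';'."""
    attrs = {}
    for pair in column.strip().split(";"):
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        # Split on commas first, then decode escapes such as %3B or %2C.
        attrs[key.strip()] = [unquote(v) for v in value.split(",")]
    return attrs


def feature_key(cols, attrs):
    """Return the ID if present, otherwise a synthetic key (hypothetical scheme)."""
    if "ID" in attrs:
        return attrs["ID"][0]
    parents = ",".join(attrs.get("Parent", []))
    return f"{parents}:{cols[2]}:{cols[3]}-{cols[4]}"


# A TAIR10 CDS row: no ID, two parents, trailing semicolon.
line = "Chr1\tTAIR10\tCDS\t8571\t8666\t.\t-\t0\tParent=AT1G01020.1,AT1G01020.1-Protein;"
cols = line.split("\t")
attrs = parse_attributes(cols[8])
print(attrs["Parent"])
print(feature_key(cols, attrs))
```

Including the parents in the synthetic key keeps the same CDS coordinates under AT1G01020.1 and AT1G01020.2 from colliding.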

See GFF3

Chr1    TAIR10  gene    5928    8737    .       -       .       ID=AT1G01020;Note=protein_coding_gene;Name=AT1G01020
Chr1    TAIR10  mRNA    5928    8737    .       -       .       ID=AT1G01020.1;Parent=AT1G01020;Name=AT1G01020.1;Index=1
Chr1    TAIR10  protein 6915    8666    .       -       .       ID=AT1G01020.1-Protein;Name=AT1G01020.1;Derives_from=AT1G01020.1
Chr1    TAIR10  five_prime_UTR  8667    8737    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     8571    8666    .       -       0       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    8571    8737    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     8417    8464    .       -       0       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    8417    8464    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     8236    8325    .       -       0       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    8236    8325    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     7942    7987    .       -       0       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    7942    7987    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     7762    7835    .       -       2       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    7762    7835    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     7564    7649    .       -       0       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    7564    7649    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     7384    7450    .       -       1       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    7384    7450    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     7157    7232    .       -       0       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  exon    7157    7232    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  CDS     6915    7069    .       -       2       Parent=AT1G01020.1,AT1G01020.1-Protein;
Chr1    TAIR10  three_prime_UTR 6437    6914    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  exon    6437    7069    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  three_prime_UTR 5928    6263    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  exon    5928    6263    .       -       .       Parent=AT1G01020.1
Chr1    TAIR10  mRNA    6790    8737    .       -       .       ID=AT1G01020.2;Parent=AT1G01020;Name=AT1G01020.2;Index=1
Chr1    TAIR10  protein 7315    8666    .       -       .       ID=AT1G01020.2-Protein;Name=AT1G01020.2;Derives_from=AT1G01020.2
Chr1    TAIR10  five_prime_UTR  8667    8737    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     8571    8666    .       -       0       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  exon    8571    8737    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     8417    8464    .       -       0       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  exon    8417    8464    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     8236    8325    .       -       0       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  exon    8236    8325    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     7942    7987    .       -       0       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  exon    7942    7987    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     7762    7835    .       -       2       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  exon    7762    7835    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     7564    7649    .       -       0       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  exon    7564    7649    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  CDS     7315    7450    .       -       1       Parent=AT1G01020.2,AT1G01020.2-Protein;
Chr1    TAIR10  three_prime_UTR 7157    7314    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  exon    7157    7450    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  three_prime_UTR 6790    7069    .       -       .       Parent=AT1G01020.2
Chr1    TAIR10  exon    6790    7069    .       -       .       Parent=AT1G01020.2

Braker

Braker is a popular genome annotation program.

The output depends on the settings. For one of our GFF3 files from Braker 2, we get these feature types:

cut -f 3 output/ME49/braker/augustus.hints.gff3 | sort | uniq -c
  47527 CDS
  47527 exon
   6724 gene
  40227 intron
   7302 mRNA
   7300 start_codon
   7302 stop_codon

Note that it includes intron, start_codon, and stop_codon features.

See GFF3

CM033580.1      AUGUSTUS        gene    15529   16566   0.92    -       .       ID=g1;
CM033580.1      AUGUSTUS        mRNA    15529   16566   0.92    -       .       ID=g1.t1;Parent=g1;
CM033580.1      AUGUSTUS        stop_codon      15529   15531   .       -       0       ID=g1.t1.stop1;Parent=g1.t1;
CM033580.1      AUGUSTUS        CDS     15529   15659   0.92    -       2       ID=g1.t1.CDS1;Parent=g1.t1;
CM033580.1      AUGUSTUS        exon    15529   15659   .       -       .       ID=g1.t1.exon1;Parent=g1.t1;
CM033580.1      AUGUSTUS        intron  15660   16112   0.96    -       .       ID=g1.t1.intron1;Parent=g1.t1;
CM033580.1      AUGUSTUS        CDS     16113   16314   0.96    -       0       ID=g1.t1.CDS2;Parent=g1.t1;
CM033580.1      AUGUSTUS        exon    16113   16314   .       -       .       ID=g1.t1.exon2;Parent=g1.t1;
CM033580.1      AUGUSTUS        intron  16315   16536   0.96    -       .       ID=g1.t1.intron2;Parent=g1.t1;
CM033580.1      AUGUSTUS        CDS     16537   16566   0.99    -       0       ID=g1.t1.CDS3;Parent=g1.t1;
CM033580.1      AUGUSTUS        exon    16537   16566   .       -       .       ID=g1.t1.exon3;Parent=g1.t1;
CM033580.1      AUGUSTUS        start_codon     16564   16566   .       -       0       ID=g1.t1.start1;Parent=g1.t1;
CM033580.1      AUGUSTUS        gene    19185   21532   0.28    -       .       ID=g2;
CM033580.1      AUGUSTUS        mRNA    19185   21532   0.28    -       .       ID=g2.t1;Parent=g2;
CM033580.1      AUGUSTUS        stop_codon      19185   19187   .       -       0       ID=g2.t1.stop1;Parent=g2.t1;
CM033580.1      AUGUSTUS        CDS     19185   19234   0.5     -       2       ID=g2.t1.CDS1;Parent=g2.t1;
CM033580.1      AUGUSTUS        exon    19185   19234   .       -       .       ID=g2.t1.exon1;Parent=g2.t1;
CM033580.1      AUGUSTUS        intron  19235   19334   0.5     -       .       ID=g2.t1.intron1;Parent=g2.t1;
CM033580.1      AUGUSTUS        CDS     19335   19412   0.79    -       2       ID=g2.t1.CDS2;Parent=g2.t1;
CM033580.1      AUGUSTUS        exon    19335   19412   .       -       .       ID=g2.t1.exon2;Parent=g2.t1;
CM033580.1      AUGUSTUS        intron  19413   20696   0.74    -       .       ID=g2.t1.intron2;Parent=g2.t1;
CM033580.1      AUGUSTUS        CDS     20697   20802   0.71    -       0       ID=g2.t1.CDS3;Parent=g2.t1;
CM033580.1      AUGUSTUS        exon    20697   20802   .       -       .       ID=g2.t1.exon3;Parent=g2.t1;
CM033580.1      AUGUSTUS        intron  20803   21349   0.83    -       .       ID=g2.t1.intron3;Parent=g2.t1;
CM033580.1      AUGUSTUS        CDS     21350   21532   0.72    -       0       ID=g2.t1.CDS4;Parent=g2.t1;
CM033580.1      AUGUSTUS        exon    21350   21532   .       -       .       ID=g2.t1.exon4;Parent=g2.t1;
CM033580.1      AUGUSTUS        start_codon     21530   21532   .       -       0       ID=g2.t1.start1;Parent=g2.t1;
CM033580.1      AUGUSTUS        gene    21699   25646   0.16    -       .       ID=g3;
CM033580.1      AUGUSTUS        mRNA    21699   25646   0.16    -       .       ID=g3.t1;Parent=g3;
CM033580.1      AUGUSTUS        stop_codon      21699   21701   .       -       0       ID=g3.t1.stop1;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     21699   21735   0.51    -       1       ID=g3.t1.CDS1;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    21699   21735   .       -       .       ID=g3.t1.exon1;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  21736   22067   0.5     -       .       ID=g3.t1.intron1;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     22068   22152   0.56    -       2       ID=g3.t1.CDS2;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    22068   22152   .       -       .       ID=g3.t1.exon2;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  22153   22645   1       -       .       ID=g3.t1.intron2;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     22646   22687   1       -       2       ID=g3.t1.CDS3;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    22646   22687   .       -       .       ID=g3.t1.exon3;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  22688   23081   0.93    -       .       ID=g3.t1.intron3;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     23082   23107   0.93    -       1       ID=g3.t1.CDS4;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    23082   23107   .       -       .       ID=g3.t1.exon4;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  23108   23332   0.89    -       .       ID=g3.t1.intron4;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     23333   23374   0.89    -       1       ID=g3.t1.CDS5;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    23333   23374   .       -       .       ID=g3.t1.exon5;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  23375   23746   0.88    -       .       ID=g3.t1.intron5;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     23747   23793   0.76    -       0       ID=g3.t1.CDS6;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    23747   23793   .       -       .       ID=g3.t1.exon6;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  23794   24077   0.74    -       .       ID=g3.t1.intron6;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     24078   24250   0.83    -       2       ID=g3.t1.CDS7;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    24078   24250   .       -       .       ID=g3.t1.exon7;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  24251   24669   0.99    -       .       ID=g3.t1.intron7;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     24670   24742   0.67    -       0       ID=g3.t1.CDS8;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    24670   24742   .       -       .       ID=g3.t1.exon8;Parent=g3.t1;
CM033580.1      AUGUSTUS        intron  24743   25466   0.59    -       .       ID=g3.t1.intron8;Parent=g3.t1;
CM033580.1      AUGUSTUS        CDS     25467   25646   0.34    -       0       ID=g3.t1.CDS9;Parent=g3.t1;
CM033580.1      AUGUSTUS        exon    25467   25646   .       -       .       ID=g3.t1.exon9;Parent=g3.t1;
CM033580.1      AUGUSTUS        start_codon     25644   25646   .       -       0       ID=g3.t1.start1;Parent=g3.t1;
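
In the records above, each intron spans exactly the gap between two consecutive exons of the same transcript, so a standardizing importer could probably drop intron features and recompute them when needed. Below is a minimal Python sketch of that recomputation. The function name `introns_from_exons` and the tuple representation are illustrative assumptions, not part of Apollo's code.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end), as in GFF3


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons of one transcript."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Only emit a gap when the exons neither overlap nor touch.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Exons of g1.t1, copied from the excerpt above.
g1_t1_exons = [(15529, 15659), (16113, 16314), (16537, 16566)]
print(introns_from_exons(g1_t1_exons))
# [(15660, 16112), (16315, 16536)]
```

The result matches the two intron records that Braker wrote for g1.t1. Only the intron scores and IDs would be lost by dropping them.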

Tomato

Source: https://solgenomics.net/ftp/tomato_genome/annotation/ITAG4.0_release/ITAG4.0_gene_models.gff

There is nothing unusual here. All features have a unique identifier. The feature types are: CDS, exon, five_prime_UTR, gene, mRNA, three_prime_UTR.

See GFF3

##gff-version 3
##sequence-regionSL4.0ch00      1       9643250
##sequence-regionSL4.0ch01      1       90863682
##sequence-regionSL4.0ch02      1       53473368
##sequence-regionSL4.0ch03      1       65298490
##sequence-regionSL4.0ch04      1       64459972
##sequence-regionSL4.0ch05      1       65269487
##sequence-regionSL4.0ch06      1       47258699
##sequence-regionSL4.0ch07      1       67883646
##sequence-regionSL4.0ch08      1       63995357
##sequence-regionSL4.0ch09      1       68513564
##sequence-regionSL4.0ch10      1       64792705
##sequence-regionSL4.0ch11      1       54379777
##sequence-regionSL4.0ch12      1       66688036
SL4.0ch00       maker_ITAG      gene    93750   94430   .       +       .       ID=gene:Solyc00g500001.1;Alias=Solyc00g500001;Name=Solyc00g500001.1;length=680
SL4.0ch00       maker_ITAG      mRNA    93750   94430   .       +       .       ID=mRNA:Solyc00g500001.1.1;Parent=gene:Solyc00g500001.1;Name=Solyc00g500001.1.1;Note=Retrovirus-related Pol polyprotein from transposon TNT 1-94 (AHRD V3.3 *-* A0A2I0VJ33_9ASPA);_AED=0.01;_QI=0|-1|0|1|-1|0|1|0|227;_eAED=0.01
SL4.0ch00       maker_ITAG      exon    93750   94430   .       +       .       ID=exon:Solyc00g500001.1.1.1;Parent=mRNA:Solyc00g500001.1.1
SL4.0ch00       maker_ITAG      CDS     93750   94430   .       +       0       ID=CDS:Solyc00g500001.1.1.1;Parent=mRNA:Solyc00g500001.1.1
###
SL4.0ch00       maker_ITAG      gene    305442  306257  .       -       .       ID=gene:Solyc00g500002.1;Alias=Solyc00g500002;Name=Solyc00g500002.1;length=815
SL4.0ch00       maker_ITAG      mRNA    305442  306257  .       -       .       ID=mRNA:Solyc00g500002.1.1;Parent=gene:Solyc00g500002.1;Name=Solyc00g500002.1.1;Note=Retrovirus-related Pol polyprotein from transposon TNT 1-94 (AHRD V3.3 *-* A0A2I0VBY8_9ASPA);_AED=0.10;_QI=384|-1|0|1|-1|0|1|0|144;_eAED=0.41
SL4.0ch00       maker_ITAG      CDS     305442  305873  .       -       0       ID=CDS:Solyc00g500002.1.1.1;Parent=mRNA:Solyc00g500002.1.1
SL4.0ch00       maker_ITAG      exon    305442  306257  .       -       .       ID=exon:Solyc00g500002.1.1.1;Parent=mRNA:Solyc00g500002.1.1
SL4.0ch00       maker_ITAG      five_prime_UTR  305874  306257  .       -       .       ID=five_prime_UTR:Solyc00g500002.1.1.0;Parent=mRNA:Solyc00g500002.1.1
###
SL4.0ch00       maker_ITAG      gene    311496  382066  .       -       .       ID=gene:Solyc00g500003.1;Alias=Solyc00g500003;Name=Solyc00g500003.1;length=70570
SL4.0ch00       maker_ITAG      mRNA    311496  382066  .       -       .       ID=mRNA:Solyc00g500003.1.1;Parent=gene:Solyc00g500003.1;Name=Solyc00g500003.1.1;Note=MP domain-containing protein (AHRD V3.3 *-* A0A1Q3D0H5_CEPFO);_AED=0.30;_QI=0|0|0|0.16|0|0|6|0|554;_eAED=0.30
SL4.0ch00       maker_ITAG      exon    311496  311570  .       -       .       ID=exon:Solyc00g500003.1.1.1;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      CDS     311496  311570  .       -       0       ID=CDS:Solyc00g500003.1.1.1;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      exon    330270  330628  .       -       .       ID=exon:Solyc00g500003.1.1.2;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      CDS     330270  330628  .       -       2       ID=CDS:Solyc00g500003.1.1.2;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      exon    344080  344133  .       -       .       ID=exon:Solyc00g500003.1.1.3;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      CDS     344080  344133  .       -       2       ID=CDS:Solyc00g500003.1.1.3;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      exon    347298  347428  .       -       .       ID=exon:Solyc00g500003.1.1.4;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      CDS     347298  347428  .       -       1       ID=CDS:Solyc00g500003.1.1.4;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      exon    351799  352644  .       -       .       ID=exon:Solyc00g500003.1.1.5;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      CDS     351799  352644  .       -       1       ID=CDS:Solyc00g500003.1.1.5;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      exon    381867  382066  .       -       .       ID=exon:Solyc00g500003.1.1.6;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00       maker_ITAG      CDS     381867  382066  .       -       0       ID=CDS:Solyc00g500003.1.1.6;Parent=mRNA:Solyc00g500003.1.1
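
The UTR records in these files look derivable from the exon and CDS coordinates, which is what would make dropping them on import reasonably safe. Here is a rough Python sketch of that derivation. It assumes a single coding span per transcript and uses made-up names (`utrs_from_exons_and_cds`), so treat it as an illustration rather than a proposed implementation.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end), as in GFF3


def utrs_from_exons_and_cds(
    exons: List[Interval], cds: List[Interval], strand: str
) -> Dict[str, List[Interval]]:
    """Split the exonic sequence outside the CDS span into 5' and 3' UTRs."""
    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)
    left, right = [], []  # exon pieces before / after the CDS in genomic order
    for start, end in sorted(exons):
        if start < cds_start:
            left.append((start, min(end, cds_start - 1)))
        if end > cds_end:
            right.append((max(start, cds_end + 1), end))
    # On the minus strand the transcript runs right to left, so the
    # genomic right-hand side is the 5' end.
    if strand == "+":
        five, three = left, right
    else:
        five, three = right, left
    return {"five_prime_UTR": five, "three_prime_UTR": three}


# mRNA:Solyc00g500002.1.1 from the tomato excerpt above (minus strand).
print(utrs_from_exons_and_cds([(305442, 306257)], [(305442, 305873)], "-"))
# {'five_prime_UTR': [(305874, 306257)], 'three_prime_UTR': []}
```

This reproduces the five_prime_UTR record for that transcript. As with the example in the first comment, the UTR's own ID is not recoverable this way.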

garrettjstevens · 2024-06-12T03:00:05Z

@kyostiebi Attached here is a GFF3 that has genes in several different formats. Currently the changes that load data in the new feature model are only guaranteed to work with the first gene format in this file.

Could you update the importing code in the new feature model branch you've been working on so that it handles all the cases in the attached GFF3? All cases in this file should end up with the same gene model (just with the position offset by 10000 bases).

gene_representations.gff3.gz
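
One way to check the "same gene model, just offset" requirement might be to compare features by their position relative to the start of the gene. The Python sketch below shows the idea on made-up coordinates. It does not read the attached file, and the names `shift_invariant` and `same_model` are hypothetical.

```python
from typing import List, Tuple

Feature = Tuple[str, int, int]  # (type, start, end), 1-based inclusive


def shift_invariant(features: List[Feature]) -> List[Feature]:
    """Re-express coordinates relative to the leftmost start, then sort."""
    origin = min(start for _, start, _ in features)
    return sorted((ftype, start - origin, end - origin) for ftype, start, end in features)


def same_model(a: List[Feature], b: List[Feature]) -> bool:
    """True when two flattened gene models differ only by a constant offset."""
    return shift_invariant(a) == shift_invariant(b)


# Made-up example: the second gene is the first one shifted by 10000 bases.
gene_a = [("gene", 1000, 2000), ("mRNA", 1000, 2000),
          ("exon", 1000, 1200), ("exon", 1500, 2000),
          ("CDS", 1100, 1200), ("CDS", 1500, 1800)]
gene_b = [(ftype, start + 10000, end + 10000) for ftype, start, end in gene_a]

print(same_model(gene_a, gene_b))  # True
```

A comparison like this would only be meaningful after import, once every input format has been converted to the standardized representation.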

garrettjstevens assigned garrettjstevens and kyostiebi and unassigned garrettjstevens Jun 12, 2024

kyostiebi added this to To Do in Apollo team board via automation Jun 12, 2024
