Gtf parse #453

Merged
5 commits merged into GMOD:master on May 6, 2014

3 participants
@aswarren
Contributor

aswarren commented Mar 3, 2014

Based on the existing GFF3 parser, with changes to account for the differences in the GTF format. The parser also tries to handle both GTF variants: files that have explicit lines for transcript features, and files where (following the GTF spec) transcripts are only implied by transcript_id.
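
For readers following the thread, here is a minimal sketch of the two cases described above: explicit transcript lines versus transcripts implied by transcript_id. It is not the parser from this pull request; the function names (`parseGtfAttributes`, `parseGtfLine`, `groupByTranscript`) and the `implied` flag are hypothetical, and the attribute parsing assumes values never contain a semicolon.

```javascript
// Parse the 9th GTF column, e.g.: gene_id "g1"; transcript_id "t1";
// Assumption: attribute values do not contain ';'.
function parseGtfAttributes(column) {
  const attrs = {};
  for (const part of column.split(';')) {
    const m = part.trim().match(/^(\S+)\s+"?([^"]*)"?$/);
    if (m) attrs[m[1]] = m[2];
  }
  return attrs;
}

// Split one tab-separated GTF line into its nine columns.
function parseGtfLine(line) {
  const f = line.split('\t');
  return {
    seq_id: f[0],
    source: f[1],
    type: f[2],
    start: Number(f[3]),
    end: Number(f[4]),
    score: f[5],
    strand: f[6],
    frame: f[7],
    attributes: parseGtfAttributes(f[8] || '')
  };
}

// Group features under one transcript per transcript_id. A transcript is
// created on first sight of its id and marked as implied until an explicit
// "transcript" line for that id is seen.
function groupByTranscript(lines) {
  const transcripts = new Map();
  for (const line of lines) {
    if (!line || line.startsWith('#')) continue;
    const feat = parseGtfLine(line);
    const id = feat.attributes.transcript_id;
    if (!id) continue; // lines without transcript_id are skipped in this sketch
    let t = transcripts.get(id);
    if (!t) {
      t = {
        type: 'transcript',
        seq_id: feat.seq_id,
        strand: feat.strand,
        start: feat.start,
        end: feat.end,
        transcript_id: id,
        implied: true,
        subfeatures: []
      };
      transcripts.set(id, t);
    }
    if (feat.type === 'transcript') {
      t.implied = false; // an explicit transcript line exists
    } else {
      t.subfeatures.push(feat);
    }
    // Keep the transcript span wide enough to cover everything seen so far.
    t.start = Math.min(t.start, feat.start);
    t.end = Math.max(t.end, feat.end);
  }
  return transcripts;
}
```

In this sketch an implied transcript gets its span from the union of its child features, which is one plausible reading of the GTF convention rather than a statement of what the merged code does.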

Tested with Cufflinks output and the following:
ftp://ftp.ensembl.org/pub/current_fasta/mus_musculus/dna/Mus_musculus.GRCm38.75.dna.chromosome.1.fa.gz
ftp://ftp.ensembl.org/pub/current_gtf/mus_musculus/Mus_musculus.GRCm38.75.gtf.gz (chromosome 1 portion)

Collaborator

rbuels commented Mar 4, 2014

Thanks for this Andrew!

Do you think you could add a volvox test track that uses the GTF backend, with a little test GTF file that has a couple of features on volvox ctgA and ctgB? Just add the commands to create the track in the volvox-formatting section of setup.sh.

@@ -29,7 +29,7 @@ _defaultConfig: function() {
}
},
- subParts: 'CDS, UTR, five_prime_UTR, three_prime_UTR',
+ subParts: 'exon, CDS, UTR, five_prime_UTR, three_prime_UTR',

@rbuels

rbuels Mar 4, 2014

Collaborator

When both exons and CDSs are present, I think this change could cause problems with displaying CDSs and UTRs, since the exons might draw boxes over them.

subParts is configurable, isn't that good enough?
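
To make the concern concrete, here is a rough sketch of how a comma-separated `subParts` string could select which subfeatures get drawn. This is an illustration only, not JBrowse's glyph code; `parseSubParts` and `drawableSubfeatures` are hypothetical names.

```javascript
// Turn a comma-separated subParts string into a set of feature types.
function parseSubParts(subParts) {
  return new Set(
    subParts.split(',').map(function (s) { return s.trim(); }).filter(Boolean)
  );
}

// Keep only the subfeatures whose type is listed in subParts.
function drawableSubfeatures(subfeatures, subParts) {
  const allowed = parseSubParts(subParts);
  return subfeatures.filter(function (f) { return allowed.has(f.type); });
}

const subfeatures = [
  { type: 'exon', start: 100, end: 400 },
  { type: 'five_prime_UTR', start: 100, end: 150 },
  { type: 'CDS', start: 151, end: 400 }
];

// Old default: the exon is not drawn, so only the UTR and CDS boxes appear.
const oldParts = drawableSubfeatures(
  subfeatures, 'CDS, UTR, five_prime_UTR, three_prime_UTR');

// Proposed default: the exon is drawn too, and it spans the same region
// as the UTR and CDS boxes.
const newParts = drawableSubfeatures(
  subfeatures, 'exon, CDS, UTR, five_prime_UTR, three_prime_UTR');
```

With the proposed default, the exon covers the same coordinates as the UTR and CDS, which is the overlap the comment above is worried about. A track that only has exons could instead set `subParts` itself.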

@rbuels rbuels modified the milestones: Release 1.11.3, Release 1.11.4 Mar 4, 2014

Contributor

aswarren commented Mar 6, 2014

Sure, I can add the GTF example. This is shaping up to be a very busy month, so I'm not sure when I can get to it. I will also try to look at the question above and come up with a decent answer. Cheers.

Contributor

cmdcolin commented Apr 21, 2014

I'd say that if we aren't going to go for a backend parser, then this is pretty much good to go. I made only small modifications here: https://github.com/GMOD/jbrowse/tree/aswarren-gtf_parse

Contributor

cmdcolin commented Apr 22, 2014

I will try to get an example GTF for the volvox script and use the 'in-memory' adaptor, similar to the GFF example. Here's an example GTF that I made with the gffread utility from Cufflinks (gffread -O -T volvox.gff3 -o volvox.gtf), with manually added UTRs...

cmdcolin added a commit that referenced this pull request Apr 22, 2014

Add example GTF file to the example volvox browser. The gtf is itself converted from the volvox.gff3 with the Cufflinks gffread utility. See pull #453

Contributor

cmdcolin commented Apr 22, 2014

I added one more code change (see commit 9e1016f), which effectively removes the assumption that gene_id and transcript_id are defined on every line. The motivation was that the gffread-generated file above has lines with no gene_id; those attributes were being given undefined values, which caused problems.

Otherwise, I think this is ready to merge: https://github.com/GMOD/jbrowse/tree/aswarren-gtf_parse
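
As an illustration of the gene_id/transcript_id change described in this comment, the sketch below copies those attributes onto a feature only when they are actually present. It does not reproduce commit 9e1016f; `withOptionalIds` is a hypothetical helper and the feature shape is assumed.

```javascript
// Copy gene_id / transcript_id onto a feature only when the GTF line
// actually provided them, so no key ends up holding `undefined`.
function withOptionalIds(feature, attrs) {
  const out = Object.assign({}, feature);
  for (const name of ['gene_id', 'transcript_id']) {
    if (typeof attrs[name] === 'string' && attrs[name] !== '') {
      out[name] = attrs[name];
    }
  }
  return out;
}

// A line that has transcript_id but no gene_id, as described above for
// the gffread-generated file.
const exon = withOptionalIds(
  { type: 'exon', start: 100, end: 200 },
  { transcript_id: 't1' }
);
```

Here `exon` carries `transcript_id` but has no `gene_id` key at all, so downstream code can test for presence rather than comparing against `undefined`.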

@cmdcolin cmdcolin merged commit 00ec0f9 into GMOD:master May 6, 2014
