Missing documentation on writing GFF files? #37

Closed
rmzelle opened this issue Aug 21, 2014 · 8 comments

Comments

@rmzelle
Contributor

rmzelle commented Aug 21, 2014

I gathered from https://github.com/daler/gffutils/blob/master/gffutils/gffwriter.py that it's possible to write GFF files from a (modified) gffutils database, but I can't find any examples in the documentation at http://pythonhosted.org/gffutils/contents.html or online on how to do so.

Does this feature exist, and if so, can you provide any examples on how to invoke the function?

@daler
Owner

daler commented Aug 21, 2014

You're right, this is an under-developed part of gffutils. The gffwriter module is mostly for outputting files in a specific format (exons must immediately follow transcripts, which must immediately follow genes, that sort of thing).

In general I just print the features to file, since the string representation of a Feature object is a valid GFF line. That way you can be as complicated or as simple as you need:

# Assumes `db` is an existing gffutils.FeatureDB

# Simply output all features in the db
with open('all.gff', 'w') as fout:
    for f in db.all_features():
        fout.write(str(f) + '\n')

# This outputs just exons, and attaches attributes on-the-fly for exon length
# and "grandparent" gene id
with open('custom.gff', 'w') as fout:
    for exon in db.features_of_type('exon'):
        genes = [i.id for i in db.parents(exon, featuretype='gene')]
        exon.attributes['gene_id'] = genes
        exon.attributes['length'] = str(len(exon))
        # write() takes a single string, so concatenate the newline
        fout.write(str(exon) + '\n')

@rmzelle
Contributor Author

rmzelle commented Aug 21, 2014

Thanks, that's very useful.

In a somewhat related question, is it possible to remove attributes from features? dict.pop() doesn't seem to work.

@daler
Owner

daler commented Aug 21, 2014

Are you calling pop on the Attributes object? Doing it this way works:

import gffutils

# make an example exon (strict=False splits fields on any whitespace)
line = "chr2L   FlyBase exon    7529    8116    .   +   .   parent_type=mRNA;Name=CG11023:1;Parent=FBtr0300689,FBtr0300690,FBtr0330654"
exon = gffutils.feature.feature_from_line(line, strict=False)
repr(exon)
# <Feature exon (chr2L:7529-8116[+]) at 0x7f6063212c10>

print(exon.attributes)
# parent_type: ['mRNA']
# Name: ['CG11023:1']
# Parent: ['FBtr0300689', 'FBtr0300690', 'FBtr0330654']

# pop is called on the attributes, not on the feature itself
name = exon.attributes.pop('Name')
print(exon.attributes)
# parent_type: ['mRNA']
# Parent: ['FBtr0300689', 'FBtr0300690', 'FBtr0330654']

print(str(exon))
# chr2L FlyBase exon    7529    8116    .   +   .   parent_type=mRNA;Parent=FBtr0300689,FBtr0300690,FBtr0330654
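
Combining this with the earlier writing loop, unwanted attributes can be dropped on the way out to a file. This is only a minimal sketch, not part of gffutils: `write_without_attributes` is a hypothetical helper, and it assumes nothing more than that each feature has a dict-like `attributes` object and that `str(feature)` gives a GFF line.

```python
def write_without_attributes(features, fout, unwanted):
    """Write each feature as a GFF line after dropping unwanted attributes.

    Hypothetical helper, not part of gffutils. Assumes each feature has a
    dict-like ``attributes`` and that ``str(feature)`` is a valid GFF line.
    Returns the number of lines written.
    """
    count = 0
    for feature in features:
        for key in unwanted:
            # pop is called on feature.attributes, not on the feature itself
            if key in feature.attributes:
                feature.attributes.pop(key)
        fout.write(str(feature) + '\n')
        count += 1
    return count
```

With a real database this might be called as `write_without_attributes(db.features_of_type('exon'), fout, ['Name'])` inside a `with open(...) as fout:` block.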

@rmzelle
Contributor Author

rmzelle commented Aug 21, 2014

Ah, sorry, I only tried exon.pop('Name'). Thanks again!

@rmzelle rmzelle closed this as completed Oct 6, 2014
@rmzelle
Contributor Author

rmzelle commented Jan 16, 2015

I'm also wondering: does gffutils have any way to add and remove individual features from a database? (apart from using gffutils.FeatureDB.execute)

I guess I could make these modifications while writing the database out to a file, but that's not as flexible.

@daler
Owner

daler commented Jan 16, 2015

Currently, no. The implementation would simply call gffutils.FeatureDB.execute and get rid of the record in the features table and anything in the relations table. I added a new issue (#43) to remind myself to add this.
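
The deletion described here can be sketched with Python's standard sqlite3 module. This is an illustration under assumptions, not the gffutils implementation: the table and column names (`features.id`, `relations.parent`, `relations.child`) are a simplified stand-in for the real schema, and `delete_feature` is a hypothetical name.

```python
import sqlite3


def delete_feature(conn, feature_id):
    """Remove a feature and every relation that mentions it.

    Hypothetical helper; the schema below is a simplified assumption.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM features WHERE id = ?", (feature_id,))
        conn.execute(
            "DELETE FROM relations WHERE parent = ? OR child = ?",
            (feature_id, feature_id),
        )


# Build a tiny stand-in database: gene1 -> mRNA1 -> exon1
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id TEXT PRIMARY KEY, featuretype TEXT)")
conn.execute("CREATE TABLE relations (parent TEXT, child TEXT, level INTEGER)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?)",
    [("gene1", "gene"), ("mRNA1", "mRNA"), ("exon1", "exon")],
)
conn.executemany(
    "INSERT INTO relations VALUES (?, ?, ?)",
    [("gene1", "mRNA1", 1), ("mRNA1", "exon1", 1), ("gene1", "exon1", 2)],
)
conn.commit()

delete_feature(conn, "exon1")
remaining_features = [r[0] for r in conn.execute("SELECT id FROM features ORDER BY id")]
remaining_relations = conn.execute("SELECT parent, child FROM relations").fetchall()
```

Removing the rows from both tables matters: a relation left behind would point at a feature that no longer exists.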

@rmzelle
Contributor Author

rmzelle commented Jan 18, 2015

And for moving features between two databases?

@daler
Owner

daler commented Jan 18, 2015

Yep, there's been a FeatureDB.update method for a while.

Now, as of 87c2a37, there's a FeatureDB.delete method to go along with the existing FeatureDB.update.

For moving features between two databases, first delete what you don't want from the destination with FeatureDB.delete. Then select what you do want from the other database (or just use the whole FeatureDB object if you want everything) and add it with FeatureDB.update.
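
A rough sketch of that workflow follows. The helper `copy_features_into` is hypothetical and takes both databases as arguments. It assumes that `FeatureDB.delete` accepts an iterable of features or IDs and that `FeatureDB.update` accepts an iterable of Feature objects, so check both against the gffutils version in use.

```python
def copy_features_into(src_db, dst_db, featuretype, stale=()):
    """Copy all features of one type from src_db into dst_db.

    Hypothetical helper, not part of gffutils. ``stale`` lists features
    (or IDs) to delete from dst_db before the copy.
    """
    stale = list(stale)
    if stale:
        # Step 1: delete what you don't want from the destination
        dst_db.delete(stale)
    # Step 2: select what you do want from the source and add it
    dst_db.update(src_db.features_of_type(featuretype))
```

To copy everything rather than one feature type, the source database object itself could be passed to `update`, as described above.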

This was referenced Sep 21, 2016