correct_gff_feature_order.pl doesn't work #52

Closed
arsilan324 opened this issue Feb 8, 2018 · 5 comments

arsilan324 commented Feb 8, 2018

Hello,

When I run the script correct_gff_feature_order.pl, I get this error:

Can't locate bioUtils.pm in @INC (you may need to install the bioUtils module) (@INC contains: /Users/arslan/Documents/Juncus/EMBL/EMBLmyGFF3/../lib /Library/Perl/5.18/darwin-thread-multi-2level /Library/Perl/5.18 /Network/Library/Perl/5.18/darwin-thread-multi-2level /Network/Library/Perl/5.18 /Library/Perl/Updates/5.18.2/darwin-thread-multi-2level /Library/Perl/Updates/5.18.2 /System/Library/Perl/5.18/darwin-thread-multi-2level /System/Library/Perl/5.18 /System/Library/Perl/Extras/5.18/darwin-thread-multi-2level /System/Library/Perl/Extras/5.18 .) at correct_gff_feature_order.pl line 76.
BEGIN failed--compilation aborted at correct_gff_feature_order.pl line 76.

Can you please tell me how I can fix this?
Thanks

Owner

jorvis commented Feb 8, 2018

This is one of the few remaining Perl scripts, which also rely on the bioUtils.pm module. Did you install biocode via pip? Within that Python packaging framework I can't properly place the Perl module. You should be able to run this if you instead check out biocode from GitHub and make sure biocode/lib/ is in your PERL5LIB environment variable. Alternatively, copy bioUtils.pm from a checkout into a directory where Perl will see it.
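
As a rough illustration of the PERL5LIB route, here is a minimal Python sketch that prepends a checkout's lib directory to PERL5LIB before launching a Perl script. The helper names `perl_env_with_lib` and `run_perl_script` are hypothetical, and the script's own arguments are left to the caller because they are not covered in this thread.

```python
import os
import subprocess


def perl_env_with_lib(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to PERL5LIB."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PERL5LIB", "")
    # PERL5LIB is a list of directories joined by os.pathsep (":" on Unix).
    env["PERL5LIB"] = lib_dir + (os.pathsep + existing if existing else "")
    return env


def run_perl_script(script, script_args, lib_dir):
    """Run a Perl script with lib_dir added to @INC via PERL5LIB."""
    env = perl_env_with_lib(lib_dir)
    return subprocess.run(["perl", script, *script_args], env=env, check=True)
```

Calling `run_perl_script("correct_gff_feature_order.pl", [...], "/path/to/biocode/lib")` would then launch the script with the module directory visible to Perl, where the path is a placeholder for wherever the checkout actually lives.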

@bernt-matthias

Hi, I'm a colleague of @arsilan324. I got it running on my computer (thanks for your explanations), but ran into other problems:

panic! don't know what to do with feat type: 3'UTR at ./gff/correct_gff_feature_order.pl line 186, <$ifh> line 6.

found more than one gene at position 1 on molecule Transcript_100004

The GFF3 looks as follows:

Transcript_100004       transdecoder    gene    1       813     .       +       .       ID=Transcript_100004|g.119957;Name=ORF%20Transcript_100004%7Cg.119957%20Transcript_100004%7Cm.119957%20type%3A5prime_partial%20len%3A200%20%28%2B%29
Transcript_100004       transdecoder    mRNA    1       813     .       +       .       ID=Transcript_100004|m.119957;Parent=Transcript_100004|g.119957;Name=ORF%20Transcript_100004%7Cg.119957%20Transcript_100004%7Cm.119957%20type%3A5prime_partial%20len%3A200%20%28%2B%29
Transcript_100004       transdecoder    CDS     1       600     .       +       .       ID=cds.Transcript_100004|m.119957;Parent=Transcript_100004|m.119957
Transcript_100004       transdecoder    exon    1       813     .       +       .       ID=Transcript_100004|m.119957.exon1;Parent=Transcript_100004|m.119957
Transcript_100004       transdecoder    3'UTR   601     813     .       +       .       ID=Transcript_100004|m.119957.utr3p1;Parent=Transcript_100004|m.119957

Transcript_100004       transdecoder    gene    1       813     .       -       .       ID=Transcript_100004|g.119958;Name=ORF%20Transcript_100004%7Cg.119958%20Transcript_100004%7Cm.119958%20type%3Acomplete%20len%3A128%20%28-%29
Transcript_100004       transdecoder    mRNA    1       813     .       -       .       ID=Transcript_100004|m.119958;Parent=Transcript_100004|g.119958;Name=ORF%20Transcript_100004%7Cg.119958%20Transcript_100004%7Cm.119958%20type%3Acomplete%20len%3A128%20%28-%29
Transcript_100004       transdecoder    CDS     322     705     .       -       .       ID=cds.Transcript_100004|m.119958;Parent=Transcript_100004|m.119958
Transcript_100004       transdecoder    exon    1       813     .       -       .       ID=Transcript_100004|m.119958.exon1;Parent=Transcript_100004|m.119958
Transcript_100004       transdecoder    5'UTR   706     813     .       -       .       ID=Transcript_100004|m.119958.utr5p1;Parent=Transcript_100004|m.119958
Transcript_100004       transdecoder    3'UTR   1       321     .       -       .       ID=Transcript_100004|m.119958.utr3p1;Parent=Transcript_100004|m.119958

I guess there should be only one gene with two child transcripts. In that case, would this be a bug in the upstream software that produced the GFF file?

Owner

jorvis commented Feb 8, 2018

I don't mind modifying it, but in legal GFF3 the third column is supposed to correspond to a Sequence Ontology (SO) term. The parent term is UTR, but there are also the more specific five_prime_UTR and three_prime_UTR. You'd have to change your input file to use those feature types instead. It wouldn't hurt to open a transdecoder ticket too and tell Brian to make his GFF right. :)
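
One way to make that change in the input file is a small rewrite of column 3. The sketch below assumes tab-separated GFF3 lines with nine columns and only maps the two feature types seen in this thread; the name `normalize_utr_types` and the mapping dict are illustrative and not part of biocode.

```python
# Mapping from the non-SO labels seen in the input to SO term names.
UTR_TYPE_MAP = {
    "5'UTR": "five_prime_UTR",
    "3'UTR": "three_prime_UTR",
}


def normalize_utr_types(lines):
    """Yield GFF3 lines with column 3 rewritten to SO terms where known."""
    for line in lines:
        # Leave comments, directives, and blank lines untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] in UTR_TYPE_MAP:
            cols[2] = UTR_TYPE_MAP[cols[2]]
            newline = "\n" if line.endswith("\n") else ""
            yield "\t".join(cols) + newline
        else:
            yield line
```

To apply it to a file, open the input and output and write each yielded line, for example `out.writelines(normalize_utr_types(inp))`. Lines that are not tab-separated (such as text pasted from a rendered page) pass through unchanged.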

Owner

jorvis commented Feb 8, 2018

Support added for these two types in commit 57e92bd

@bernt-matthias

great. thanks

jorvis closed this as completed Feb 9, 2018