Problem with temporary GFF file #12

Closed
blackFirefly opened this issue Jun 14, 2017 · 12 comments

@blackFirefly

I tried flo yesterday, but it ended with an error. It seems like there is a problem with a temporary GFF file?
So the question is whether the problem is in the program or in my input GFF.

It created a file called "lifted.gff3" and one called "unlifted.gff3". Both of them are filled. But there is also a third file "Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta.gff3" which is empty.

Here are the last lines flo printed:

Processing Scaffold_3140
mkdir Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta
liftOver -gff /home/muehlich/Desktop/aethionema/data/Aarabicum.v2.5.gff run/liftover.chn Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta/lifted.gff3 Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta/unlifted.gff3
Reading liftover chains
Mapping coordinates
WARNING: -gff is not recommended.
Use 'ldHgGene -out=<file.gp>' and then 'liftOver -genePred <file.gp>'
gt gff3 -tidy -sort -addids -retainids /tmp/lifted20170614-22821-oyvvge > Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta/Aarabicum.v2.5.gff-liftover-aethionema-arabicum_v3.0.fasta.gff3
warning: line 1 in file "/tmp/lifted20170614-22821-oyvvge" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
gt gff3: error: Parent "AA1G00001" on line 2 in file "/tmp/lifted20170614-22821-oyvvge" was not defined (via "ID=")
rake aborted!
Command failed with status (1): [gt gff3 -tidy -sort -addids -retainids /tm...]
/home/muehlich/flo/Rakefile:113:in `process_gff'
/home/muehlich/flo/Rakefile:234:in `block (2 levels) in <top (required)>'
/home/muehlich/flo/Rakefile:223:in `each'
/home/muehlich/flo/Rakefile:223:in `block in <top (required)>'
Tasks: TOP => default
(See full trace by running task with --trace)

@yeban
Collaborator

yeban commented Jun 22, 2017

I can take a look if you can send me "lifted.gff3".

@blackFirefly
Author

That would be great!
Since the file is around 30 MB, I sent a Dropbox link to the email address listed in your profile.

@cmdcolin

I am seeing the same problem as well... need any more test data?

@cmdcolin

I think the specific issue of these parents not being defined happens because the parent features end up in the unlifted file.

For example I had child features with Parent=SP_0.1_T008586-R3 in lifted.gff3 but then unlifted.gff3 had the actual parent where ID=PKINGS_0.1_T008586-R3
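
A quick way to confirm this on a given lifted.gff3 is to list every Parent value that has no matching ID in the same file. The following is only a minimal sketch: the helper names `parse_attributes` and `orphaned_parents` are made up for illustration and are not part of flo, and the attribute parsing ignores URL-escaping.

```ruby
# Sketch: list Parent IDs that are referenced in a GFF3 but never defined via ID=.
# Helper names are hypothetical; this is not code from flo.
require 'set'

# Turn a GFF3 attribute column ("ID=x;Parent=y,z") into a Hash.
def parse_attributes(column)
  column.split(';').each_with_object({}) do |pair, attrs|
    key, value = pair.strip.split('=', 2)
    attrs[key] = value if key && value
  end
end

# Return the sorted list of Parent IDs that have no matching ID= line.
def orphaned_parents(lines)
  defined = Set.new
  referenced = Set.new
  lines.each do |line|
    next if line.start_with?('#') || line.strip.empty?
    fields = line.chomp.split("\t")
    next if fields.size < 9
    attrs = parse_attributes(fields[8])
    defined << attrs['ID'] if attrs['ID']
    # A feature may list several parents separated by commas.
    attrs.fetch('Parent', '').split(',').each { |parent| referenced << parent }
  end
  (referenced - defined).to_a.sort
end
```

Calling `orphaned_parents(File.foreach('lifted.gff3'))` returns the IDs to look for in unlifted.gff3.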

@yeban
Collaborator

yeban commented Jun 27, 2017 via email

@cmdcolin

Ah... I think I remember at one point writing a script to synthesize parent features for features without parents, for something like this... is that what process_gff does?
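
For illustration, synthesizing a parent could look roughly like the sketch below. It is not taken from flo and says nothing about what process_gff actually does; `synthesize_parent` is a hypothetical helper, and it assumes all children of one parent were lifted to the same sequence and strand, which may not hold after a liftover.

```ruby
# Sketch: build a placeholder parent line for children whose Parent is missing.
# Assumes all children share one seqid and strand (not checked here).
def synthesize_parent(parent_id, child_lines, type = 'gene')
  rows = child_lines.map { |line| line.chomp.split("\t") }
  seqid, source, strand = rows.first.values_at(0, 1, 6)
  # The parent spans from the leftmost child start to the rightmost child end.
  start = rows.map { |row| row[3].to_i }.min
  stop = rows.map { |row| row[4].to_i }.max
  [seqid, source, type, start, stop, '.', strand, '.', "ID=#{parent_id}"].join("\t")
end
```

The `type` argument is left to the caller because the right feature type (gene, mRNA, transcript) depends on what the orphaned children are.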

@yeban
Collaborator

yeban commented Jun 28, 2017 via email

@cmdcolin

cmdcolin commented Jun 28, 2017

Gotcha... I was considering using CrossMap instead, but it looks like it has the same issue.

It may be necessary to convert from GFF to something else, BED12 or similar.
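
As a rough idea of what the BED12 route involves, the sketch below collapses the exon lines of one transcript into a single BED12 record, so that the transcript and its exons travel as one line instead of as separate parent and child features. `exons_to_bed12` is a hypothetical helper; it ignores CDS bounds (thickStart/thickEnd are set to the whole span) and assumes all exons are on one sequence and strand.

```ruby
# Sketch: collapse one transcript's exon lines into a single BED12 record.
# GFF3 is 1-based inclusive; BED is 0-based half-open, hence the "- 1" on starts.
def exons_to_bed12(name, exon_lines)
  exons = exon_lines.map { |line| line.chomp.split("\t") }
                    .sort_by { |fields| fields[3].to_i }
  chrom = exons.first[0]
  strand = exons.first[6]
  chrom_start = exons.first[3].to_i - 1
  chrom_end = exons.map { |fields| fields[4].to_i }.max
  sizes = exons.map { |fields| fields[4].to_i - fields[3].to_i + 1 }
  # Block starts are relative to chromStart, as BED12 requires.
  starts = exons.map { |fields| fields[3].to_i - 1 - chrom_start }
  # thickStart/thickEnd cover the whole span: CDS bounds are ignored here.
  [chrom, chrom_start, chrom_end, name, 0, strand, chrom_start, chrom_end, 0,
   exons.size, sizes.join(','), starts.join(',')].join("\t")
end
```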

@yeban
Collaborator

yeban commented Jul 4, 2017

@cmdcolin:

> I am seeing the same problem as well... need any more test data?

There was a bug. I have made some changes. Can you give it a spin?

@blackFirefly - please see my email

@cmdcolin

cmdcolin commented Jul 5, 2017

@yeban I believe it is working better now. It gets to the genometools stage, but then genometools crashes.

It might be worth asking their team about it, since the error message isn't easy to interpret.

$ rake
mkdir annotations.gff-liftover-target
liftOver -gff annotations.gff run/liftover.chn annotations.gff-liftover-target/lifted.gff3 annotations.gff-liftover-target/unlifted.gff3
Reading liftover chains
Mapping coordinates
WARNING: -gff is not recommended.
Use 'ldHgGene -out=<file.gp>' and then 'liftOver -genePred <file.gp>'
/home/me/flo/gff_recover.rb annotations.gff-liftover-target/lifted.gff3 | gt gff3 -tidy -sort -addids -retainids - > annotations.gff-liftover-target/annotations.gff-liftover-target.gff3
warning: line 1 in file "-" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
Assertion failed: (elemidx >= q->front), function gt_queue_remove, file src/core/queue.c, line 135.
This is a bug, please report it at
https://github.com/genometools/genometools/issues
Please make sure you are running the latest release which can be found at
http://genometools.org/pub/
You can check your version number with `gt -version`.
Aborted (core dumped)
/home/me/flo/gff_recover.rb:60:in `write': Broken pipe @ io_write - <STDOUT> (Errno::EPIPE)
        from /home/me/flo/gff_recover.rb:60:in `puts'
        from /home/me/flo/gff_recover.rb:60:in `puts'
        from /home/me/flo/gff_recover.rb:60:in `<main>'
rake aborted!
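
Note that the `Errno::EPIPE` trace in this log is a side effect: gt aborted first, so the Ruby script was left writing into a closed pipe. A script that writes to a pipe can keep that secondary trace out of the log by rescuing the error. The sketch below shows the general pattern only; `emit_lines` is a hypothetical helper and not code from gff_recover.rb.

```ruby
# Sketch: stop writing quietly when the reader of the pipe has gone away.
# `emit_lines` is a hypothetical helper, not code from gff_recover.rb.
def emit_lines(lines, out = $stdout)
  lines.each { |line| out.puts(line) }
  true
rescue Errno::EPIPE
  # The downstream process exited first; its own error message is the one
  # worth reading, so report failure to the caller and stop writing.
  false
end
```

A caller could then run `exit(1) unless emit_lines(lines)` so the pipeline still fails, just without the extra backtrace.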

@cmdcolin

cmdcolin commented Jul 5, 2017

At least one thing that looks suspicious is that there are still lines without parents. If I save the output with

gff_recover.rb annotations.gff-liftover-target/lifted.gff3 > out.gff

then the first feature in out.gff is an mRNA that references a parent gene that is not in out.gff.

yeban added a commit that referenced this issue Jul 14, 2017
Following up on SHA: b629cb9 and trying to address: #12.

Notable changes:
1. We allow transcripts to be annotated as mRNA, transcript, or gene.
Yet, for reconstructed transcripts the script would always use 'mRNA'
type. Fix that.
2. Features that could not be processed are now put on stderr. This is
captured by the pipeline to provide lifted_but_rejected.gff.

Signed-off-by: Anurag Priyam <anurag.priyam@qmul.ac.uk>
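
The stdout/stderr split described in point 2 can be pictured with the sketch below. It is an illustration under assumptions, not the code from the commit: `split_features` and `known_ids` are hypothetical names, and a feature with several comma-separated parents is treated as having one combined parent ID.

```ruby
# Sketch: write features whose Parent is known to one stream, the rest to another.
# `split_features` and `known_ids` are hypothetical, not names from flo.
def split_features(lines, known_ids, accepted = $stdout, rejected = $stderr)
  lines.each do |line|
    # Pull the Parent value out of the attribute column, if there is one.
    parent = line[/Parent=([^;\s]+)/, 1]
    target = parent.nil? || known_ids.include?(parent) ? accepted : rejected
    target.puts(line)
  end
end
```

A caller can then redirect the two streams to separate files, for example `2> lifted_but_rejected.gff` for the rejected features.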
@yeban
Collaborator

yeban commented Jul 14, 2017

@blackFirefly's problem was partly flo and partly the gff. The former is now fixed.

@cmdcolin I can't be sure what the problem is without looking at the input / lifted gff. Please could you open a new issue with test data?

yeban closed this as completed Jul 14, 2017