`BACANNOT:INTEGRON_FINDER_2GFF (vibrio9)` terminated with an error exit status (255) #116

Closed
JavariaAshraf opened this issue Feb 7, 2024 · 10 comments · Fixed by #117

@JavariaAshraf

Describe the bug
Hi Felipe Marques de Almeida,
I am running the bacannot pipeline on a large number of bacterial genomes. For some sequences, the INTEGRON_FINDER_2GFF process fails and causes the pipeline to terminate.
The error is as follows:
"Error executing process > 'BACANNOT:INTEGRON_FINDER_2GFF (vibrio9)'

Caused by:
Process BACANNOT:INTEGRON_FINDER_2GFF (vibrio9) terminated with an error exit status (255)

Command executed:

# convert to gff if available

touch vibrio9_integrons.gff ;
for gbk in $(ls *.gbk) ; do
conda run -n perl bp_genbank2gff3 $gbk -o - | grep 'integron_id' | sed 's|ID=.*integron_id=|ID=|g' | sed 's/GenBank/Integron_Finder/g' >> vibrio9_integrons.gff
done

Command exit status:
255

Command output:
(empty)

Command error:

------------- EXCEPTION -------------
MSG: start must be set or is zero:
STACK Bio::SeqFeature::Tools::IDHandler::generate_unique_persistent_id /opt/conda/envs/perl/lib/perl5/site_perl/Bio/SeqFeature/Tools/IDHandler.pm:266
STACK main::add_generic_id /opt/conda/envs/perl/bin/bp_genbank2gff3:964
STACK toplevel /opt/conda/envs/perl/bin/bp_genbank2gff3:563
"
Please help me solve this issue.

@fmalmeida
Owner

Hi @JavariaAshraf ,
I believe it may be related to the gbk files generated by the integron finder tool. Maybe some are empty, or some have improper headers.

Can you send me a copy of these gbk files that the pipeline is trying to convert?

P.S. I mean the integron finder gbk results, not the whole-annotation ones :)
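
For a quick first look before sending the files, the two suspicions above (empty files, improper headers) can be checked with a small shell function. This is only a sketch: `check_gbk` is a hypothetical helper, not part of bacannot or integron_finder, and it assumes that a well-formed GenBank file starts with a `LOCUS` line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bacannot): report the state of every
# .gbk file in a directory so that empty or header-less files stand out.
check_gbk() {
    local dir=$1 f
    for f in "$dir"/*.gbk; do
        [ -e "$f" ] || continue                 # glob matched nothing
        if [ ! -s "$f" ]; then
            echo "EMPTY $f"                     # zero-byte file
        elif ! head -n 1 "$f" | grep -q '^LOCUS'; then
            echo "NO_LOCUS $f"                  # first line is not a LOCUS header
        else
            echo "OK $f"
        fi
    done
}

# Usage (the path is a placeholder):
#   check_gbk path/to/integron_finder_results
```

Any file reported as `EMPTY` or `NO_LOCUS` would be a likely candidate for breaking the conversion; files reported as `OK` can still fail for other reasons.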

@JavariaAshraf
Author

Thank you so much for your prompt response.
Much appreciated.
Kindly see the complete integron folder of the troubled sequence:
integron_finder.tar.gz

@fmalmeida
Owner

Thanks,

I will take a look and try to figure out what it is. I will get back to you later today with an update and, hopefully, some possible solutions.

cheers.

@fmalmeida
Owner

fmalmeida commented Feb 7, 2024

Hi @JavariaAshraf ,
The problem is that, in one of the generated gbk files, a feature sits at the very start of the contig, and the integron_finder tool writes its start coordinate as 0 instead of 1, which the file converter does not accept.

I will have to perform a quick release in order to modify this from 0 to 1 when it happens.
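
As a rough illustration of the idea (not necessarily how the patch will implement it), the start coordinate could be rewritten from 0 to 1 on GenBank feature lines before the file reaches `bp_genbank2gff3`. The function name `fix_zero_start` is hypothetical, and the sketch assumes simple locations of the form `0..N` or `complement(0..N)`; compound locations such as `join(...)` are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite a feature start of 0 to 1 on GenBank
# feature lines (5 spaces, feature key, spaces, location).
# In the sed replacement, \1 is the captured prefix and the following
# "1.." is the literal corrected start.
fix_zero_start() {
    sed -E 's/^( {5}[^ ]+ +(complement\()?)0\.\./\11../'
}

# Demo on three made-up feature lines; only the first two are changed.
printf '%s\n' \
    '     integron        0..2500' \
    '     CDS             complement(0..300)' \
    '     CDS             100..900' | fix_zero_start
```

In the conversion loop this would sit in front of the converter, for example by writing the corrected text to a temporary gbk file and converting that file instead of the original.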

The catch is that you mentioned you are running a "large" number of genomes, and once this new release is out and you run with it, Nextflow may not resume the already completed processes and may run them all again.

So I have to ask: how many genomes are there? If re-running all of them with the patch release is not an option, I would recommend removing the failing genomes from your samplesheet and running the ones that work, then running the others later.

Otherwise, you can wait and run everything with the new patch release. The caveat is that I can only work on it after my working hours (I can try to make progress slowly during breaks).

@fmalmeida fmalmeida self-assigned this Feb 7, 2024
@fmalmeida fmalmeida added the bug Something isn't working label Feb 7, 2024
@JavariaAshraf
Author

Thank you fmalmeida, for your help.
I have already removed the troubled sequences from the pipeline. So far, 5 sequences are causing trouble.
I hope it runs smoothly for the remaining ones.
Waiting for the update,
Thank you

@fmalmeida
Owner

Perfect. I will let you know when the fix is available for the other genomes.

@fmalmeida
Owner

fmalmeida commented Feb 7, 2024

Hi @JavariaAshraf ,

Before I merge the code and make a new release, I would like to confirm that the fix works and whether anything else is required.
So, could you run the pipeline on these problematic genomes using the branch that contains the fix?

You would need a command line like this:

nextflow run \
    fmalmeida/bacannot \
    -r 116-integron_finder_2gff-terminated-with-an-error \
    -latest \
    <the rest of your parameters>

The params -r 116-integron_finder_2gff-terminated-with-an-error -latest will make sure you run the branch with the new code.

Then let me know how it goes.

Cheers.

@JavariaAshraf
Author

Hi @fmalmeida
The pipeline ran smoothly this time with the troubled sequences.
Thank you.
However, three processes were not called (Resfinder, FLYE, Call_Methylation). Kindly see the attached screenshot:
Screenshot from 2024-02-08 12-44-19

@fmalmeida
Owner

Glad to hear that. During the week I will finalize the branch and make a small patch release with this important fix.

The processes that were not called are fine: each one is only called if it receives input.

For example, FLYE only runs if you give long reads to the pipeline, so that it can assemble them. CALL_METHYLATION only runs if you give nanopore FAST5 data to the pipeline, so that it can use NanoPolish to call methylation. And Resfinder only runs if you select a species from the Resfinder panels with the --resfinder_species parameter.

@JavariaAshraf
Author

Thank you Sir.
