Issue with filterGenesIn_mRNAname.pl when running BRAKER with singularity #828

Open
cvargas88 opened this issue May 18, 2024 · 3 comments
@cvargas88
cvargas88 commented May 18, 2024

Hi! Thank you for developing such a useful tool! I am running BRAKER using Singularity; however, the pipeline was interrupted while filtering train.gb for "good" mRNAs. The error at that point was the following:

# Sat May 18 23:29:42 2024: Genbank format file LEPN/braker/train.gb contains 9181 genes.
# Sat May 18 23:29:42 2024: Filtering train.gb for "good" mRNAs: /usr/bin/perl miniconda3/bin/filterGenesIn_mRNAname.pl LEPN/braker/traingenes.gtf LEPN/braker/train.gb > LEPN/braker/train.f.gb 2>LEPN/braker/errors/filterGenesIn_mRNAname.stderr
# Sat May 18 23:29:44 2024: Genbank format file LEPN/braker/train.f.gb contains 0 genes.
# Sat May 18 23:29:44 2024: ERROR: in file /opt/BRAKER/scripts/braker.pl at line 6249
# Training gene file in genbank format LEPN/braker/train.f.gb does not contain any training genes. Possible known causes:
# (a) The AUGUSTUS script filterGenesIn_mRNAname.pl is not up-to-date with this version of BRAKER. To solve this issue, either get the latest AUGUSTUS from its master branch with git clone git@github.com:Gaius-Augustus/Augustus.git or download the latest version of filterGenesIn_mRNAname.pl from https://github.com/Gaius-Augustus/Augustus/blob/master/scripts/filterGenesIn_mRNAname.pl and replace the old script in your AUGUSTUS installation folder.
# (b) No training genes with sufficient extrinsic evidence support or of sufficient length were produced by GeneMark-EX. If you think this is the cause for your problem, consider running BRAKER with different evidence or without any evidence (--esmode) for training.

I have checked, and the version of the script seems to be 20.02.2018, so I believe that is not the issue. However, the gb file contains 9181 genes and I have a large amount of RNA-seq data, so I don't believe that there are no genes with evidence support. How could I verify whether the issue is with this file?
Thanks a lot!
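
One rough way to approach the question above is sketched here in Python. It is not part of BRAKER or AUGUSTUS. GenBank flat files start each record with a LOCUS line, so counting those lines in train.gb and train.f.gb shows how many records survive the filter, and reading the stderr file named in the log shows whether the script reported anything. The helper names `count_genbank_records` and `summarize` are made up for illustration, and it is an assumption that one LOCUS record corresponds to one "gene" in BRAKER's count.

```python
from pathlib import Path


def count_genbank_records(text: str) -> int:
    """Count records by their LOCUS header lines (one per record)."""
    return sum(1 for line in text.splitlines() if line.startswith("LOCUS"))


def summarize(workdir: str) -> dict:
    """Hypothetical helper: compare record counts before and after filtering.

    File names follow the log above; the directory layout is an assumption.
    """
    base = Path(workdir)
    stderr_file = base / "errors" / "filterGenesIn_mRNAname.stderr"
    return {
        "train.gb": count_genbank_records((base / "train.gb").read_text()),
        "train.f.gb": count_genbank_records((base / "train.f.gb").read_text()),
        "stderr": stderr_file.read_text() if stderr_file.exists() else "",
    }


# Tiny self-check on made-up data (not real BRAKER output).
sample = "LOCUS       seq1   1000 bp  DNA\n//\nLOCUS       seq2   2000 bp  DNA\n//\n"
sample_count = count_genbank_records(sample)
print(sample_count)
```

With the paths from the log, `summarize("LEPN/braker")` would return the two counts and any stderr text. A non-zero count for train.gb next to zero for train.f.gb points at the filtering step rather than at the input file.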

@KatharinaHoff
Member

If you provided BAM files as input, it is possible that you ran an aligner that does not perform spliced alignment. That would lead to this. But that's a guess; there's too little information to confirm.

@cvargas88
Author

cvargas88 commented May 19, 2024

Dear Katharina,
Thank you very much for your prompt reply. I provided several paired-end libraries so the alignments were performed by the pipeline. I believe that the issue is with the filterGenesIn_mRNAname.pl script. It seems that the version I was using had the following condition:
if ( $_ =~ m/transcript_id \"(.*)\"/ ) {

The version on GitHub, by contrast, has the following:
if ( $_ =~ m/transcript_id \"([^"]*)\"/ ) {

I tried the version from GitHub, and it does indeed produce the gb file. I will try replacing the script and see whether I can relaunch the run.
Thanks a lot!
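
To illustrate why the two conditions quoted above can behave differently, here is a minimal sketch that re-creates both patterns with Python's `re` module, which treats greedy `.*` and the negated class `[^"]*` the same way Perl does in this case. The GTF line is invented, and placing `transcript_id` before `gene_id` is an assumption about the attribute order in traingenes.gtf, not something confirmed in this thread.

```python
import re

# Python equivalents of the two Perl conditions quoted above.
GREEDY = re.compile(r'transcript_id "(.*)"')       # older script
BOUNDED = re.compile(r'transcript_id "([^"]*)"')   # current GitHub script

# Hypothetical GTF line; the attribute order is an assumption.
line = 'chr1\tGeneMark.hmm\tCDS\t100\t200\t.\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";'


def captured_id(pattern, gtf_line):
    """Return the text captured as the transcript ID, or None if no match."""
    match = pattern.search(gtf_line)
    return match.group(1) if match else None


greedy_id = captured_id(GREEDY, line)
bounded_id = captured_id(BOUNDED, line)

print(repr(greedy_id))   # runs on to the last double quote in the line
print(repr(bounded_id))  # stops at the first closing quote
```

If the attribute order is as assumed, the greedy pattern captures everything up to the last double quote in the line, so the stored ID would not equal any name in train.gb and every record would be filtered out. When `transcript_id` is the last attribute, both patterns capture the same text.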

@KatharinaHoff
Member

I think you are not using a very recent image. Did you build the container yourself, or did you pull it from Docker Hub? For a while, we installed AUGUSTUS from Debian in the container (for convenience). However, that package comes with outdated scripts. I changed this a while ago so that we now clone AUGUSTUS from GitHub. If you build the image from our docker repository, this should not happen.
