Problem with Evidencemodeler #528
Comments
This seems odd from the logfile: "[01/11/21 08:23:44]: 9,557 total contigs; skipping -51,760 contigs with no genes". Do you have the predict logfile that I could look at as well?
Yes, here it is attached. Thanks for your help. Alexandre
Hmm, okay thanks. I can't quite tell, but maybe looks like the command line around the
I don't know how that would necessarily be causing problems per se with EVM, but it seems like maybe just a typo? In your initial command above there is clearly a space.
So assuming the above is not related to the error, you can try to run the EVM command from that same directory and maybe that will yield more info to stdout, i.e.:
Actually, that will probably fail based on what I have in the bash script. You can create a new bash wrapper like this that will just run the image (it is the same, it just doesn't include the call to funannotate):
Here is a generalized version of this bash script -- you could run it with any docker container: https://github.com/nextgenusfs/dw/
Hello again, Thanks for the answers. I also tried running the EVM step using the bash script through dw, but again I get exactly the same output as when running the whole pipeline. I also get (as I did before) a single file called genes.1.bed in the predict_misc/EVM folder. It feels like EVM can't get past the first scaffold; could this be possible? Thanks,
Never mind, I saw your log file and it is already 264 GB. When you call this, are all of the files you are passing to the docker container located in the same run directory? Another thing to try would be to move into the docker image interactively and then try to run the EVM workflow. And lastly, I assume the test dataset runs on your system?
One other thing to try would be to delete all of the EVM temp files and then try to add
But going back to my original thought about the EVM log file, this line seems strange: "[01/11/21 08:23:44]: 9,557 total contigs; skipping -51,760 contigs with no genes"
What is happening in the code is this:
This suggests something is wrong with the input files (something I've not seen before): it is saying that it somehow found >50k contigs that don't have genes associated with them. This suggests that something is wrong with the headers in one of these input files -- can you validate that the input files have appropriate FASTA/sequence headers? For example, do the sequence names in the custom GFF that you are passing match the genome FASTA headers? And do the headers in the BAM file match as well?
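A quick way to test the header hypothesis is to compare the sequence names in column 1 of the GFF against the names in the genome FASTA. The sketch below is only an illustration and is not part of funannotate: the function names and the toy data (including the `TRINITY_DN1_c0_g1_i1` ID) are hypothetical. As an aside, if the skipped count is computed as total contigs minus contigs carrying genes (an assumption about the code, not confirmed here), a value of -51,760 with 9,557 contigs would mean 61,317 distinct sequence names in the gene evidence, far more than the genome has.

```python
import io


def fasta_ids(handle):
    """Collect sequence IDs (first word after '>') from a FASTA stream."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                ids.add(header.split()[0])
    return ids


def gff_seqids(handle):
    """Collect seqids (column 1) from a GFF3 stream, skipping comments."""
    ids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more feature lines
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(fasta_handle, gff_handle):
    """Return GFF seqids that do not occur in the FASTA headers."""
    return sorted(gff_seqids(gff_handle) - fasta_ids(fasta_handle))


# Toy data standing in for the real files (hypothetical IDs).
genome = io.StringIO(">scaffold_1 len=8\nACGTACGT\n>scaffold_2\nGGCC\n")
gff = io.StringIO(
    "##gff-version 3\n"
    "scaffold_1\tsrc\tgene\t1\t8\t.\t+\t.\tID=g1\n"
    "TRINITY_DN1_c0_g1_i1\ttransdecoder\tgene\t1\t4\t.\t+\t.\tID=g2\n"
)
missing = unmatched_seqids(genome, gff)
print(missing)
```

With real data, replace the two `io.StringIO` objects with `open(...)` handles on the genome FASTA and the GFF. Any name in the result is a GFF sequence that does not exist in the genome, which is what one would expect from a GFF whose coordinates refer to transcripts rather than scaffolds.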
Ok, maybe the problem is there! My GFF file comes from Transdecoder, but I used the transcriptome as an input. So obviously, the transcriptome and the genome don't have the same headers. Could the problem come from there? What could I use as an alternative then? Thanks, Alexandre
So if the transcripts aren't aligned to the genome reference, then the file shouldn't be passed as GFF_other. If you have transcripts from Transdecoder that you want to align, you can pass those as FASTA format to Maybe it's not obvious -- but the pipeline might work a lot better if you let
Hi, Sorry for the long delay. Just to let you know that I ran it as you suggested and I was able to finish the whole pipeline successfully, so thank you! Best,
Hello funannotate users,
I am currently using funannotate v1.8.4, installed through docker, and funannotate check and the test run both work without issues.
I am trying to run funannotate predict on a fish genome assembly.
So, when I run:
funannotate-docker predict -i ~softmasked.genome.fasta -o ./output1 -s "Species name" --transcript_evidence Transcriptome.fasta --optimize_augustus --other_gff /home/alexandre/funannotate/Species.transdecoder.gff3 --protein_evidence uniprot.reviewed.fasta uniprot-reviewed.fasta --organism other --rna_bam ~/funannotate/alignment.bam --weights codingquarry:1 --cpus 4
Everything runs smoothly until the EvidenceModeler step. Then I get this message:
funannotate-EVM.log
EVM: partitioning input to ~ 35 genes per partition
Traceback (most recent call last):
  File "/venv/lib/python3.7/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 433, in <module>
    partitions=args.no_partitions)
  File "/venv/lib/python3.7/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 203, in create_partitions
    k, len(SeqRecords[k])))
  File "/venv/lib/python3.7/site-packages/Bio/File.py", line 248, in __getitem__
    record = self._proxy.get(self._offsets[key])
KeyError: 'scaffold_1'
[Jan 11 08:24 AM]: Evidence modeler has failed, exiting
Traceback (most recent call last):
  File "/venv/bin/funannotate", line 713, in <module>
    main()
  File "/venv/bin/funannotate", line 703, in main
    mod.main(arguments)
  File "/venv/lib/python3.7/site-packages/funannotate/predict.py", line 1730, in main
    os.remove(EVM_out)
FileNotFoundError: [Errno 2] No such file or directory: '~/output1/predict_misc/evm.round1.gff3'
The EVM logfile (attached) does not show any error, so I am a bit confused about what's going on here.
Thanks for the help,
Best,
Alexandre