Fail in EVM #646

hychen86 · 2021-10-01T14:43:08Z

Hi Jon,

When I run 'funannotate predict' on some of my assemblies, the process fails during EVM:

[Oct 01 03:34 AM]: Summary of gene models passed to EVM (weights):
  Source         Weight   Count
  Augustus       1        2653
  Augustus HiQ   2        98
  GlimmerHMM     1        3189
  snap           1        3059
  Total          -        8999
[Oct 01 03:34 AM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
Traceback (most recent call last):
  File "/home/intact/miniconda3/envs/funannotate/lib/python3.7/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 484, in <module>
    partitions=args.no_partitions)
  File "/home/intact/miniconda3/envs/funannotate/lib/python3.7/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 224, in create_partitions
    splitTup, idx = getBreakPoint(v, loc, direction='reverse', gap=interval)
TypeError: cannot unpack non-iterable bool object
[Oct 01 03:34 AM]: Evidence modeler has failed, exiting

The command I used was "funannotate predict -i DE-Pool-2-D04.mask -o DE-Pool-2-D04_funannotate --augustus_species daldinia_eschscholizii -s Daldinia_eschscholizii --cpus 12"

This works for some of my datasets but not for others.

Best

Hongyu

nextgenusfs · 2021-10-04T03:58:40Z

What version of funannotate are you running? If it is not the latest release, please check whether the issue still occurs with the latest release. This reminds me of an error that was fixed a while ago.
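
For readers who hit the same `TypeError`: Python raises this message whenever a function returns a bare `False` where the caller unpacks two values. The sketch below reproduces the message with a hypothetical `find_break_point` helper. It is not funannotate's `getBreakPoint`, whose source is not shown in this thread, and the inputs are made up for illustration.

```python
def find_break_point(positions, gap):
    """Hypothetical stand-in, NOT funannotate's getBreakPoint.

    Return ((left, right), index) for the first pair of neighbouring
    positions that are at least `gap` apart, or False if none exists.
    """
    for idx in range(len(positions) - 1):
        left, right = positions[idx], positions[idx + 1]
        if right - left >= gap:
            return (left, right), idx
    return False  # the sentinel is a bool, so tuple unpacking fails


def try_unpack(positions, gap):
    """Unpack the result into two names and report any TypeError as text."""
    try:
        pair, idx = find_break_point(positions, gap)
        return pair, idx
    except TypeError as err:
        return str(err)


ok = try_unpack([100, 400, 5000], gap=1500)   # a wide enough gap exists
bad = try_unpack([100, 400, 900], gap=1500)   # no gap of 1500 bp
print(ok)
print(bad)
```

Returning `None` or raising a dedicated exception instead of `False` would not remove the need for the caller to handle the "no break point" case, but it would make the failure easier to diagnose.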

hychen86 · 2021-10-05T14:56:46Z

Hi Jon
I updated to the latest release and the problem is fixed.
Thanks

sunnycqcn · 2021-12-10T15:31:32Z

Hi Jon,
With the older version I did not get this error. After updating to the latest version, v1.8.10,
I get an error at the same EVM step:

[Dec 07 02:54 PM]: OS: CentOS Linux 7, 80 cores, ~ 1057 GB RAM. Python: 3.8.12
[Dec 07 02:54 PM]: Running funannotate v1.8.10
[Dec 07 02:54 PM]: Found training files, will re-use these files:
  --rna_bam SH/training/funannotate_train.coordSorted.bam
  --pasa_gff SH/training/funannotate_train.pasa.gff3
  --stringtie SH/training/funannotate_train.stringtie.gtf
  --transcript_alignments SH/training/funannotate_train.transcripts.gff3
[Dec 07 02:54 PM]: Skipping CodingQuarry as --organism=other. Pass a weight larger than 0 to run CQ, ie --weights codingquarry:1
[Dec 07 02:54 PM]: Parsed training data, run ab-initio gene predictors as follows:
  Program      Training-Method
  augustus     pasa
  genemark     selftraining
  glimmerhmm   pasa
  snap         pasa
[Dec 07 02:59 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Dec 07 02:59 PM]: Genome loaded: 20 scaffolds; 1,022,923,124 bp; 11.76% repeats masked
[Dec 07 02:59 PM]: Parsed 185,175 transcript alignments from: SH/training/funannotate_train.transcripts.gff3
[Dec 07 03:00 PM]: Aligning 691,461 unique transcripts [not found in exising alignments] with minimap2
[Dec 07 03:05 PM]: Mapped 439,731 of these transcripts to the genome, adding to alignments
[Dec 07 03:05 PM]: Creating transcript EVM alignments and Augustus transcripts hintsfile
[Dec 07 03:05 PM]: Existing RNA-seq BAM hints found: SH/predict_misc/hints.BAM.gff
[Dec 07 03:06 PM]: Existing protein alignments found: SH/predict_misc/protein_alignments.gff3
[Dec 07 03:07 PM]: Running GeneMark-ES on assembly
[Dec 10 09:10 AM]: 220,606 predictions from GeneMark
[Dec 10 09:10 AM]: Filtering PASA data for suitable training set
[Dec 10 09:18 AM]: 6,381 of 44,490 models pass training parameters
[Dec 10 09:18 AM]: Existing Augustus annotations found: SH/predict_misc/augustus.gff3
[Dec 10 09:19 AM]: Pulling out high quality Augustus predictions
[Dec 10 09:19 AM]: Found 14,429 high quality predictions from Augustus (>90% exon evidence)
[Dec 10 09:19 AM]: Existing snap predictions found SH/predict_misc/snap-predictions.gff3
[Dec 10 09:19 AM]: 40 predictions from SNAP
[Dec 10 09:19 AM]: Existing GlimmerHMM predictions found: SH/predict_misc/glimmerhmm-predictions.gff3
[Dec 10 09:19 AM]: 333,448 predictions from GlimmerHMM
[Dec 10 09:19 AM]: Summary of gene models passed to EVM (weights):
[Dec 10 09:20 AM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
Traceback (most recent call last):
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 479, in <module>
    cmdinfo = create_partitions(args.fasta, args.genes, partitions,
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 131, in create_partitions
    SeqRecords = SeqIO.index_db(f_idx, fasta, 'fasta')
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/Bio/SeqIO/__init__.py", line 963, in index_db
    return _SQLiteManySeqFilesDict(
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/Bio/File.py", line 311, in __init__
    self._load_index()
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/Bio/File.py", line 411, in _load_index
    raise ValueError("Not a Biopython index database? %s" % err) from None
ValueError: Not a Biopython index database? no such table: meta_data
  Source         Weight   Count
  Augustus       1        221539
  Augustus HiQ   2        14429
  GeneMark       1        220606
  GlimmerHMM     1        333448
  pasa           6        44490
  snap           1        40
  Total          -        834552
[Dec 10 09:20 AM]: Evidence modeler has failed, exiting
Traceback (most recent call last):
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/bin/funannotate", line 10, in <module>
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod = importlib.import_module('{:}.{:}.{:}'.format(
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/predict.py", line 1790, in main
    if total < 1:
FileNotFoundError: [Errno 2] No such file or directory: '/isilon/saskatoon-rdc/users/fuf/striga/annotation/SH/predict_misc/evm.round1.gff3'

I cannot figure out what happened.
Could you help me look into this error?
Thanks,
Fuyou
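
A note on the `no such table: meta_data` part of the message: that text comes from SQLite, which reports it when the index file it opens does not contain a table called `meta_data`. The sketch below reproduces the SQLite message with the standard library only. The empty file is a hypothetical stand-in for an unusable index; this thread does not establish why the real index file was in that state, and the query is illustrative rather than copied from Biopython.

```python
import os
import sqlite3
import tempfile


def read_index_count(index_path):
    """Read the 'count' entry from a `meta_data` table, or return the error.

    The table layout here is an assumption made for this demo.
    """
    con = sqlite3.connect(index_path)
    try:
        row = con.execute(
            "SELECT value FROM meta_data WHERE key=?;", ("count",)
        ).fetchone()
        return row[0]
    except sqlite3.OperationalError as err:
        return "error: {}".format(err)
    finally:
        con.close()


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: an empty file, which SQLite treats as a database with no tables.
    empty_idx = os.path.join(tmp, "empty.idx")
    open(empty_idx, "w").close()
    empty_result = read_index_count(empty_idx)

    # Case 2: a file that does contain the table and a 'count' row.
    good_idx = os.path.join(tmp, "good.idx")
    con = sqlite3.connect(good_idx)
    con.execute("CREATE TABLE meta_data (key TEXT PRIMARY KEY, value TEXT);")
    con.execute("INSERT INTO meta_data VALUES (?, ?);", ("count", "20"))
    con.commit()
    con.close()
    good_result = read_index_count(good_idx)

print(empty_result)
print(good_result)
```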

sunnycqcn · 2021-12-13T21:15:22Z

Hi Jon,
I forgot to mention that I also ran the test command:
funannotate test -t all --cpus 40
All of the tests pass.
I think the problem is related to genome size, because my genome is about 1 Gb.
Thanks,
Fuyou

nextgenusfs · 2021-12-13T23:02:36Z

Hmm, it seems that the SQLite backend of SeqIO.index_db() has failed. I'll try to change this to SeqIO.index() which works differently, pushing to master in a minute after I run through tests.
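
As background for this change: `SeqIO.index_db()` stores record offsets in an SQLite file on disk, whereas `SeqIO.index()` keeps them in memory, so no index file is involved. The sketch below illustrates the in-memory idea using only the standard library. It is a simplified illustration, not Biopython's implementation; `build_offset_index`, `fetch_record`, and the toy FASTA file are hypothetical.

```python
import os
import tempfile


def build_offset_index(fasta_path):
    """Map each record ID to the byte offset of its '>' header line."""
    offsets = {}
    with open(fasta_path, "rb") as handle:
        while True:
            pos = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                # The ID is the first whitespace-delimited token after '>'.
                record_id = line[1:].split()[0].decode()
                offsets[record_id] = pos
    return offsets


def fetch_record(fasta_path, offsets, record_id):
    """Seek straight to one record and return its sequence as a string."""
    chunks = []
    with open(fasta_path, "rb") as handle:
        handle.seek(offsets[record_id])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            chunks.append(line.strip().decode())
    return "".join(chunks)


with tempfile.TemporaryDirectory() as tmp:
    fasta = os.path.join(tmp, "toy.fasta")
    with open(fasta, "w") as out:
        out.write(">scaffold_1 demo\nACGT\nACGT\n>scaffold_2\nTTGA\n")
    index = build_offset_index(fasta)
    seq1 = fetch_record(fasta, index, "scaffold_1")
    seq2 = fetch_record(fasta, index, "scaffold_2")

print(sorted(index), seq1, seq2)
```

The trade-off is that an in-memory index is rebuilt on every run, while an on-disk index can be reused but adds a file that must be valid when it is reopened.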

sunnycqcn · 2021-12-14T17:41:47Z

Hi Jon,
It is working now.
Thanks,
Fuyou

nextgenusfs closed this as completed Oct 6, 2021

nextgenusfs pushed a commit that referenced this issue Dec 13, 2021

use SeqIO.index instead of SeqIO.index_db in EVM partition #646

8d38e31
