Fail in EVM #646

Closed
hychen86 opened this issue Oct 1, 2021 · 6 comments

@hychen86

hychen86 commented Oct 1, 2021

Hi Jon,

When I run 'funannotate predict' on some of my assembly data, the process fails during EVM:

[Oct 01 03:34 AM]: Summary of gene models passed to EVM (weights):
  Source         Weight   Count
  Augustus       1        2653
  Augustus HiQ   2        98
  GlimmerHMM     1        3189
  snap           1        3059
  Total          -        8999
[Oct 01 03:34 AM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
Traceback (most recent call last):
  File "/home/intact/miniconda3/envs/funannotate/lib/python3.7/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 484, in <module>
    partitions=args.no_partitions)
  File "/home/intact/miniconda3/envs/funannotate/lib/python3.7/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 224, in create_partitions
    splitTup, idx = getBreakPoint(v, loc, direction='reverse', gap=interval)
TypeError: cannot unpack non-iterable bool object
[Oct 01 03:34 AM]: Evidence modeler has failed, exiting

The command I used is "funannotate predict -i DE-Pool-2-D04.mask -o DE-Pool-2-D04_funannotate --augustus_species daldinia_eschscholizii -s Daldinia_eschscholizii --cpus 12"

This works for some of my data but not for others.

Best

Hongyu

@nextgenusfs
Owner

What version of funannotate are you running? If it is not the latest release, please check with the latest release to see whether it is still an issue. This reminds me of an error that was fixed a while ago.
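
For anyone reading the traceback above: `TypeError: cannot unpack non-iterable bool object` is what Python raises when a caller unpacks two values from a function that returned a bare `False` instead of a tuple. The sketch below reproduces that failure mode in isolation. It is not funannotate's code: `find_break_point`, `safe_break_point`, and the gene coordinates are hypothetical stand-ins, and the thread does not say how the real `getBreakPoint` was fixed.

```python
def find_break_point(genes, start, gap):
    """Hypothetical stand-in: find the first intergenic gap of at least `gap` bp.

    `genes` is a list of (start, end) tuples sorted by start coordinate.
    Returns ((gap_start, gap_end), index) on success, or False when no gap
    is wide enough. The False return is what makes tuple unpacking fail.
    """
    for idx in range(start, len(genes) - 1):
        gap_start = genes[idx][1]
        gap_end = genes[idx + 1][0]
        if gap_end - gap_start >= gap:
            return (gap_start, gap_end), idx
    return False


def safe_break_point(genes, start, gap):
    """Check the sentinel before unpacking; return None if nothing was found."""
    result = find_break_point(genes, start, gap)
    if result is False:
        return None
    return result


# Made-up coordinates, for illustration only.
genes = [(100, 900), (1000, 1800), (4000, 5000)]

# A wide enough gap exists, so unpacking works.
split_tup, idx = find_break_point(genes, 0, 1500)

# No gap of 5000 bp exists, so the function returns False and unpacking fails.
try:
    split_tup, idx = find_break_point(genes, 0, 5000)
except TypeError as err:
    message = str(err)  # "cannot unpack non-iterable bool object"
```

Whether this is hit depends on the data (here, on whether a wide enough gap exists), which would be consistent with the command working for some assemblies and not others.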

@hychen86
Author

hychen86 commented Oct 5, 2021

Hi Jon
I updated to the latest release, and the problem has been fixed.
Thanks

@sunnycqcn

sunnycqcn commented Dec 10, 2021

Hi Jon,
When I used the old version, I did not hit this error. After updating to the latest version, v1.8.10,
I hit the same error.

[Dec 07 02:54 PM]: OS: CentOS Linux 7, 80 cores, ~ 1057 GB RAM. Python: 3.8.12
[Dec 07 02:54 PM]: Running funannotate v1.8.10
[Dec 07 02:54 PM]: Found training files, will re-use these files:
  --rna_bam SH/training/funannotate_train.coordSorted.bam
  --pasa_gff SH/training/funannotate_train.pasa.gff3
  --stringtie SH/training/funannotate_train.stringtie.gtf
  --transcript_alignments SH/training/funannotate_train.transcripts.gff3
[Dec 07 02:54 PM]: Skipping CodingQuarry as --organism=other. Pass a weight larger than 0 to run CQ, ie --weights codingquarry:1
[Dec 07 02:54 PM]: Parsed training data, run ab-initio gene predictors as follows:
  Program      Training-Method
  augustus     pasa
  genemark     selftraining
  glimmerhmm   pasa
  snap         pasa
[Dec 07 02:59 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Dec 07 02:59 PM]: Genome loaded: 20 scaffolds; 1,022,923,124 bp; 11.76% repeats masked
[Dec 07 02:59 PM]: Parsed 185,175 transcript alignments from: SH/training/funannotate_train.transcripts.gff3
[Dec 07 03:00 PM]: Aligning 691,461 unique transcripts [not found in exising alignments] with minimap2
[Dec 07 03:05 PM]: Mapped 439,731 of these transcripts to the genome, adding to alignments
[Dec 07 03:05 PM]: Creating transcript EVM alignments and Augustus transcripts hintsfile
[Dec 07 03:05 PM]: Existing RNA-seq BAM hints found: SH/predict_misc/hints.BAM.gff
[Dec 07 03:06 PM]: Existing protein alignments found: SH/predict_misc/protein_alignments.gff3
[Dec 07 03:07 PM]: Running GeneMark-ES on assembly
[Dec 10 09:10 AM]: 220,606 predictions from GeneMark
[Dec 10 09:10 AM]: Filtering PASA data for suitable training set
[Dec 10 09:18 AM]: 6,381 of 44,490 models pass training parameters
[Dec 10 09:18 AM]: Existing Augustus annotations found: SH/predict_misc/augustus.gff3
[Dec 10 09:19 AM]: Pulling out high quality Augustus predictions
[Dec 10 09:19 AM]: Found 14,429 high quality predictions from Augustus (>90% exon evidence)
[Dec 10 09:19 AM]: Existing snap predictions found SH/predict_misc/snap-predictions.gff3
[Dec 10 09:19 AM]: 40 predictions from SNAP
[Dec 10 09:19 AM]: Existing GlimmerHMM predictions found: SH/predict_misc/glimmerhmm-predictions.gff3
[Dec 10 09:19 AM]: 333,448 predictions from GlimmerHMM
[Dec 10 09:19 AM]: Summary of gene models passed to EVM (weights):
[Dec 10 09:20 AM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
Traceback (most recent call last):
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 479, in <module>
    cmdinfo = create_partitions(args.fasta, args.genes, partitions,
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-runEVM.py", line 131, in create_partitions
    SeqRecords = SeqIO.index_db(f_idx, fasta, 'fasta')
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/Bio/SeqIO/__init__.py", line 963, in index_db
    return _SQLiteManySeqFilesDict(
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/Bio/File.py", line 311, in __init__
    self._load_index()
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/Bio/File.py", line 411, in _load_index
    raise ValueError("Not a Biopython index database? %s" % err) from None
ValueError: Not a Biopython index database? no such table: meta_data
  Source         Weight   Count
  Augustus       1        221539
  Augustus HiQ   2        14429
  GeneMark       1        220606
  GlimmerHMM     1        333448
  pasa           6        44490
  snap           1        40
  Total          -        834552
[Dec 10 09:20 AM]: Evidence modeler has failed, exiting
Traceback (most recent call last):
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/bin/funannotate", line 10, in <module>
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/funannotate.py", line 705, in main
    mod = importlib.import_module('{:}.{:}.{:}'.format(
  File "/isilon/saskatoon-rdc/users/fuf/comDIR/miniconda/envs/FUN/lib/python3.8/site-packages/funannotate/predict.py", line 1790, in main
    if total < 1:
FileNotFoundError: [Errno 2] No such file or directory: '/isilon/saskatoon-rdc/users/fuf/striga/annotation/SH/predict_misc/evm.round1.gff3'

I cannot figure out what happened.
Could you help me check this error?
Thanks,
Fuyou

@sunnycqcn

Hi Jon,
I forgot to tell you about my run command:
funannotate test -t all --cpus 40
In fact, all of the test runs pass.
I think the problem is my genome size, because my genome is about 1 Gb.
Thanks,
Fuyou

@nextgenusfs
Owner

Hmm, it seems that the SQLite backend of SeqIO.index_db() has failed. I'll try to change this to SeqIO.index() which works differently, pushing to master in a minute after I run through tests.
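
For context, `SeqIO.index_db()` keeps its record offsets in an SQLite file on disk, while `SeqIO.index()` builds them in memory on each run. The message `no such table: meta_data` means the index file was opened as SQLite but did not contain the table Biopython expects. A minimal standard-library sketch for checking an index file by hand follows. The helper name `has_meta_data_table` is made up for illustration, and the idea that a leftover or incomplete index file caused the failure is an assumption, not something this thread confirms.

```python
import os
import sqlite3


def has_meta_data_table(index_path):
    """Return True if `index_path` is an SQLite file with a `meta_data` table.

    Only the table's presence is checked (the table named in the error
    message above); the index contents are not validated.
    """
    # sqlite3.connect() would create a missing file, so check first.
    if not os.path.isfile(index_path):
        return False
    con = sqlite3.connect(index_path)
    try:
        row = con.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name = 'meta_data'"
        ).fetchone()
        return row is not None
    except sqlite3.DatabaseError:
        # Raised when the file is not an SQLite database at all.
        return False
    finally:
        con.close()
```

If the check returns False for an existing index file, deleting that file so it is rebuilt on the next run is a reasonable first thing to try before changing versions.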

@sunnycqcn

Hi Jon,
It is working now.
Thanks,
Fuyou
