genemark prediction mistake crashes pyhmmer #28

@camiel-m

[Sep 11 02:47 AM] Measuring assembly completeness with buscolite [lineage=metazoa_odb12] for all ab initio predictions
Traceback (most recent call last):
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/bin/funannotate2", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/funannotate2/__main__.py", line 26, in main
    predict(args)
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/funannotate2/predict.py", line 576, in predict
    d, m, stats, cfg = runbusco(
                       ^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/buscolite/busco.py", line 512, in runbusco
    if isinstance(r.result(), list):
                  ^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/buscolite/search.py", line 640, in hmmer_search
    for top_hits in pyhmmer.hmmsearch([hmm], sequences):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/hmmer/_base.py", line 483, in _multi_threaded
    raise e
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/hmmer/_base.py", line 469, in _multi_threaded
    result = results[0].get()  # <-- blocks until result is available
             ^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/hmmer/_base.py", line 146, in get
    raise self.result
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/hmmer/_base.py", line 293, in run
    hits = self.process(chore.query)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/hmmer/_base.py", line 325, in process
    hits = self.query(query)
           ^^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/utils.py", line 70, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lustre1/project/stg_00002/mambaforge/vsc37429/envs/funannotate2/lib/python3.12/site-packages/pyhmmer/hmmer/_hmmsearch.py", line 44, in _
    return self.pipeline.search_hmm(query, self.targets)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyhmmer/plan7.pyx", line 5750, in pyhmmer.plan7.Pipeline.__pyx_fuse_0_1search_hmm
  File "pyhmmer/plan7.pyx", line 5813, in pyhmmer.plan7.Pipeline.search_hmm
ValueError: sequence length over comparison pipeline limit (100000)

I encountered an edge case where GeneMark predicted a gene whose translated protein is longer than 100,000 amino acids, which exceeds the comparison pipeline limit reported in the pyhmmer error above. After removing this gene from predictions.genemark.gff3 and rerunning funannotate2 predict, the pipeline finishes without issues. Perhaps a check that drops or flags overlong gene predictions before they are passed to pyhmmer could prevent this crash.
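The kind of check I have in mind could look roughly like the sketch below. It is not buscolite or funannotate2 code: `split_by_length`, `PIPELINE_LIMIT`, and the name-to-sequence dict are hypothetical, and the limit value is simply copied from the error message. Treating a sequence of exactly 100,000 residues as acceptable is also an assumption.

```python
from typing import Dict, List, Tuple

# Value copied from the ValueError message in the traceback; whether a sequence
# of exactly this length is accepted by the search is an assumption.
PIPELINE_LIMIT = 100_000


def split_by_length(
    proteins: Dict[str, str], limit: int = PIPELINE_LIMIT
) -> Tuple[Dict[str, str], List[str]]:
    """Separate proteins that fit under the limit from those that do not.

    `proteins` maps a (hypothetical) gene model ID to its amino acid sequence.
    Returns the sequences that are safe to search and the IDs that were skipped.
    """
    kept: Dict[str, str] = {}
    skipped: List[str] = []
    for name, seq in proteins.items():
        # Ignore a trailing stop-codon symbol when measuring the length.
        residues = seq.rstrip("*")
        if len(residues) > limit:
            skipped.append(name)
        else:
            kept[name] = seq
    return kept, skipped


if __name__ == "__main__":
    # Toy input: one ordinary protein and one artificially overlong one.
    example = {"gene1": "MKT" * 50, "gene2": "M" + "A" * 100_000}
    kept, skipped = split_by_length(example)
    for name in skipped:
        print(f"skipping {name}: longer than {PIPELINE_LIMIT} residues")
```

Only the `kept` sequences would then be handed to the HMM search, and the skipped IDs could be logged as a warning so the user knows which gene models to inspect.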
