Respect --tmpdir across annotate (diamond) and predict (miniprot) steps#55
Merged
nextgenusfs merged 2 commits into main on Apr 22, 2026
added 2 commits
April 21, 2026 23:05
…ent variable in temporary file handling Agent-Id: agent-46564fa7-02ab-4480-8cd2-5657e9e74535
Closes #52.
Summary
The `--tmpdir` CLI option was not honored for all intermediate files. Two distinct places hard-coded `/tmp`, causing `FileNotFoundError` on nodes where `/tmp` is not writable (HPC/Slurm, containers) even when the user supplied `--tmpdir`.

This PR threads the user-supplied tmp directory end-to-end through both the `annotate` diamond searches and the `predict` miniprot alignment, and makes `create_tmpdir` robust against accidental deletion of a user-supplied volume.

Root cause
1. `annotate` → `swissprot_blast` / `merops_blast` → `diamond_blast`

`diamond_blast` defaulted to `tmpdir="/tmp"` and its two wrappers did not accept a `tmpdir` kwarg at all. `annotate.py` never forwarded `args.tmpdir`, so diamond intermediates were always written to `/tmp/diamond_<uuid>`, which is exactly the traceback in #52.

2. `predict` → `align_proteins`

`align_proteins` defaulted to `tmpdir="/tmp"` and wrote `mini_tmp = os.path.join(tmpdir, f"{uuid.uuid1()}.miniprot.out")`. `predict.py` never passed a `tmpdir`, so miniprot output hit `/tmp` regardless of `--tmpdir`. Same class of bug, different tool.

3. `create_tmpdir` semantics

When the user passed any `--tmpdir` other than `/tmp`, `create_tmpdir` returned the directory verbatim, meaning the downstream `shutil.rmtree(tmp_dir)` in `predict.py` would wipe the user-supplied volume. It also blocked concurrent runs sharing a scratch volume.

Changes
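Root causes 1 and 2 share one pattern: a `tmpdir="/tmp"` default that no caller overrides. As a rough illustration of the `None`-default fallback, here is a minimal sketch. The helper names `resolve_tmpdir` and `intermediate_path` are hypothetical and do not appear in funannotate2; only `tempfile.gettempdir()` and the `tmpdir=None` convention come from this PR.

```python
import os
import tempfile
import uuid


def resolve_tmpdir(tmpdir=None):
    """Return the caller-supplied directory, or the system default.

    Hypothetical helper, not part of funannotate2.
    tempfile.gettempdir() consults $TMPDIR, $TEMP and $TMP before
    falling back to platform defaults such as /tmp.
    """
    return tmpdir if tmpdir is not None else tempfile.gettempdir()


def intermediate_path(prefix, tmpdir=None):
    """Build a unique intermediate-file path under the resolved directory."""
    return os.path.join(resolve_tmpdir(tmpdir), f"{prefix}_{uuid.uuid4()}")
```

With a default of `None` rather than `"/tmp"`, a caller that forgets to forward `tmpdir` still gets a location the user can steer through `$TMPDIR`.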
`funannotate2/search.py`
- `diamond_blast(..., tmpdir="/tmp")` → `diamond_blast(..., tmpdir=None)` with fallback to `tempfile.gettempdir()` (honors `$TMPDIR`).
- `swissprot_blast(..., tmpdir=None)` and `merops_blast(..., tmpdir=None)`: new kwarg, forwarded to `diamond_blast`.

`funannotate2/annotate.py`
- `swissprot_blast(Proteins, cpus=args.cpus)` → `swissprot_blast(Proteins, cpus=args.cpus, tmpdir=tmp_dir)`.
- `merops_blast(Proteins, cpus=args.cpus)` → `merops_blast(Proteins, cpus=args.cpus, tmpdir=tmp_dir)`.

`funannotate2/align.py`
- `import tempfile`.
- `align_proteins(..., tmpdir="/tmp")` → `align_proteins(..., tmpdir=None)` with fallback to `tempfile.gettempdir()`.
- `align_transcripts(..., tmpdir="/tmp")` → `align_transcripts(..., tmpdir=None)` for API parity (the kwarg was and remains unused internally, because gapmm2 writes directly to `output=evidence`; docstring updated to explain).

`funannotate2/predict.py`
- Pass `tmpdir=tmp_dir` to both `align_transcripts(...)` and `align_proteins(...)`.

`funannotate2/utilities.py`
- `create_tmpdir` now always creates a unique `<base>_<uuid>` subdirectory inside the supplied volume (previously only for `/tmp`). Falls back to `tempfile.gettempdir()` when no volume is supplied (was CWD). This prevents the `shutil.rmtree` hazard on user-supplied volumes and makes concurrent runs safe on a shared scratch directory.

Behavior change for users
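The `create_tmpdir` change can be sketched roughly as follows. This is an illustration under stated assumptions, not the function from `funannotate2/utilities.py`: the name `make_unique_tmpdir`, its signature, and the `base` default are hypothetical. It only shows why a per-run `<base>_<uuid>` subdirectory makes a later `shutil.rmtree` safe.

```python
import os
import shutil
import tempfile
import uuid


def make_unique_tmpdir(volume=None, base="funannotate2"):
    """Create and return a unique <base>_<uuid> directory inside `volume`.

    Hypothetical stand-in for create_tmpdir; the `base` default is an
    assumption. Falls back to tempfile.gettempdir() when no volume is given.
    """
    root = os.path.abspath(volume) if volume else tempfile.gettempdir()
    path = os.path.join(root, f"{base}_{uuid.uuid4()}")
    os.makedirs(path)
    return path


scratch = tempfile.mkdtemp()         # stands in for a user-supplied --tmpdir
run_a = make_unique_tmpdir(scratch)
run_b = make_unique_tmpdir(scratch)  # a second run sharing the same volume

shutil.rmtree(run_a)                 # cleanup removes only this run's subdir
```

After the cleanup, `scratch` and `run_b` still exist, which is the property the PR describes for user-supplied volumes and concurrent runs.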
- `funannotate2 annotate --tmpdir /scratch/mydir` now writes `/scratch/mydir/<base>_<uuid>/...`; previously it would write directly into `/scratch/mydir` and could later delete it.
- When `--tmpdir` is omitted, temporary files now land in `tempfile.gettempdir()` (which honors `$TMPDIR`), not a hardcoded `/tmp`.

Tests
All new tests extend existing test modules — no new test files created.
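The tmpdir-plumbing tests in this PR share one shape: replace the external tool with a mock, then assert that the output path handed to it sits under the expected directory. A minimal sketch of that shape follows. The names `align` and `observed_output_dir` are hypothetical, and the mock is injected as a parameter for brevity, whereas the PR's tests patch `subprocess.Popen` and `runSubprocess`.

```python
import os
import tempfile
import uuid
from unittest import mock


def align(query, runner, tmpdir=None):
    """Toy wrapper (hypothetical): hands the tool an output path under tmpdir."""
    base = tmpdir if tmpdir is not None else tempfile.gettempdir()
    out = os.path.join(base, f"{uuid.uuid4()}.aligner.out")
    runner(["aligner", query], out)
    return out


def observed_output_dir(tmpdir=None):
    """Run align() against a mock runner and report where output was sent."""
    fake = mock.Mock()
    align("proteins.fa", fake, tmpdir=tmpdir)
    # The second positional argument is the path passed to the tool.
    return os.path.dirname(fake.call_args.args[1])


with tempfile.TemporaryDirectory() as scratch:
    explicit_dir = observed_output_dir(scratch)  # explicit case
    explicit_ok = explicit_dir == scratch

fallback_ok = observed_output_dir() == tempfile.gettempdir()  # fallback case
```

Because the runner is mocked, no external binary is needed and nothing is written to disk; only the path computation is exercised.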
- `tests/unit/test_utilities.py::TestCreateTmpdir`: 5 tests covering unique-subdir creation, absolute paths, `$TMPDIR` fallback, and subdir isolation from the user-supplied volume.
- `tests/unit/test_search_comprehensive.py::TestDiamondTmpdirPlumbing`: 4 new tests that mock `subprocess.Popen` and assert diamond's `--out` argument is under the caller-supplied tmpdir; plus two existing assertions updated for the new `tmpdir=None` kwarg on `swissprot_blast` / `merops_blast`.
- `tests/unit/test_predict.py::TestAlignProteinsTmpdir`: 2 new tests that mock `runSubprocess` and assert the `stdout` path passed to miniprot is inside the caller-supplied tmpdir (explicit case) or inside `tempfile.gettempdir()` (fallback case).

Results
The 2 failures (`test_funannotate_cli::test_version_command`, `test_help_command`) are pre-existing, environment-specific failures: `subprocess.run("python -m funannotate2 --version")` returns 127 in the integration harness on this machine. Verified identical after `git stash` (before any change in this PR).

Non-goals / out of scope
- `abinitio.py`'s `folder="/tmp"` kwargs: these wrap work in `tempfile.TemporaryDirectory()` internally, so `$TMPDIR` is already honored at runtime. Stylistic cleanup only; left for a separate change.
- `evm.py::evm_consensus` has a `tmpdir="/tmp"` default but is not imported or called anywhere else in the package (dead code).
- The unused `tmpdir` kwarg of `align_transcripts`: gapmm2 writes directly to the caller-supplied `output` path; a staging tmp is out of scope.

Rollback
All changes are confined to `funannotate2/{search,annotate,align,predict,utilities}.py` and three existing unit-test modules. All added kwargs default to `None`, so the public API is backwards compatible.

Pull Request opened by Augment Code with guidance from the PR author