Fix --out_filtered_bam <file> creating a directory for a single BAM (2.1.1)#3

Merged
ishinder merged 1 commit into main from fix-out-filtered-bam-directory on Jun 8, 2026

ishinder commented Jun 8, 2026

Owner

Summary

A bare single-file --out_filtered_bam (e.g. --out_filtered_bam filtered.bam) created a directory filtered.bam/ and wrote the filtered BAM inside it, instead of writing the file. 2.1.0 fixed OutputWriter::generate_bam_output_paths, but two inline sites in main.cpp run first and create the directory — which then flips is_file_path (it checks fs::is_directory) into directory mode, defeating the 2.1.0 fix:

  • the auto-built bowtie2 index directory (main.cpp:340-342 → the fs::create_directories in build_bowtie2_index), and
  • the temporary spurious-junction BED directory (main.cpp:514-516).

Both used the anti-pattern "if parent_path() is empty, use the output path itself as a directory."
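
For readers who want to see the failure mode in isolation, here is a minimal sketch of how that anti-pattern can flip a file-vs-directory heuristic. It is not the code from main.cpp: `legacy_dir_for`, `looks_like_file`, and `heuristic_flips` are hypothetical names, and `looks_like_file` only assumes that the real `is_file_path` consults `fs::is_directory` as described above.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical reconstruction of the anti-pattern: when the output path has
// no parent component, the output path itself is used as the directory.
inline fs::path legacy_dir_for(const fs::path& out) {
    assert(!out.empty());
    const fs::path parent = out.parent_path();
    return parent.empty() ? out : parent;  // bare "filtered.bam" -> "filtered.bam"
}

// Hypothetical stand-in for the heuristic: a path counts as a file unless a
// directory already exists at that location.
inline bool looks_like_file(const fs::path& out) {
    return !fs::is_directory(out);
}

// Replays the sequence inside `scratch` and reports whether the heuristic
// changed its answer after the legacy directory was created.
inline bool heuristic_flips(const fs::path& scratch, const fs::path& bare_name) {
    const fs::path out = scratch / bare_name;
    const bool file_before = looks_like_file(out);
    fs::create_directories(scratch / legacy_dir_for(bare_name));
    const bool file_after = looks_like_file(out);
    return file_before && !file_after;
}
```

With a bare name such as `filtered.bam`, the sketch creates `filtered.bam/` and the heuristic then reports a directory, which matches the behavior described in the summary.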

Fix

Two pure, unit-testable helpers in path_utils that never treat a single output file as a directory:

  • index_output_dir() — bare filename → "" so the index builds in its own <reference>_bt2_idx/ directory next to the reference FASTA (the existing default), not in a dir named after the file.
  • sidecar_dir() — bare filename → "." so the temp BED has a concrete directory without creating one named after the file.
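
The contract of the two helpers can be sketched as follows. This is an illustration under stated assumptions, not the code in path_utils: the signatures are guessed, only the bare-filename results (`""` and `"."`) come from this description, and returning `parent_path()` for paths that do have a parent is an assumption. Directory-mode outputs are not modeled here.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Sketch of index_output_dir (assumed signature).
// Bare filename -> "" so the caller falls back to the default index location.
// Assumption: a path with a parent component yields that parent.
inline fs::path index_output_dir(const fs::path& out) {
    assert(!out.empty());
    return out.parent_path();
}

// Sketch of sidecar_dir (assumed signature).
// Bare filename -> "." so temp files get a concrete directory that already
// exists, and nothing named after the output file is ever created.
inline fs::path sidecar_dir(const fs::path& out) {
    assert(!out.empty());
    const fs::path parent = out.parent_path();
    return parent.empty() ? fs::path(".") : parent;
}
```

Both functions are pure: they inspect the path only and never touch the filesystem, which is what makes them straightforward to unit test.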

main.cpp now calls these. The temp BED name is also made pid-unique to avoid collisions between concurrent runs sharing an output directory.
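
One way to make a temp file name pid-unique is sketched below. The file name pattern and both function names are hypothetical, and `::getpid` assumes a POSIX platform; the actual naming used in main.cpp is not shown in this description.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <unistd.h>  // ::getpid (POSIX only)

namespace fs = std::filesystem;

// Hypothetical helper: builds a temp BED path that embeds a process id, so
// two runs writing into the same directory do not pick the same file name.
inline fs::path temp_bed_path(const fs::path& dir, long pid) {
    assert(!dir.empty());  // callers pass a concrete directory such as "."
    return dir / ("spurious_junctions." + std::to_string(pid) + ".bed");
}

// Convenience wrapper that uses the id of the current process.
inline fs::path temp_bed_path_for_this_process(const fs::path& dir) {
    return temp_bed_path(dir, static_cast<long>(::getpid()));
}
```

A pid only separates processes that are alive at the same time on the same host, which is the concurrent-run case named above.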

Net effect: nothing creates filtered.bam/; a bare single-file --out_filtered_bam writes that file (and --removed_alignments_bam writes <name>_removed_alignments.bam beside it), with no stray directory. Directory mode for BAM lists is unchanged.

Tests / verification

  • New regression unit tests for index_output_dir / sidecar_dir and the single-BAM+directory case (tests/test_path_utils.cpp).
  • Verified end-to-end against the real binary on a synthesized BAM: bare filename (with and without -i), --removed_alignments_bam, single→directory, and a 2-BAM list→directory. All produce the expected files with no stray directory.

Docs / version

  • README documents where the auto-built index lives + read-only-reference caveat.
  • CHANGELOG: honest 2.1.1 entry; corrects the stale 2.1.0 "unreleased" header and softens its --out_filtered_bam over-claim.
  • Version bumped to 2.1.1 (CMakeLists + conda/meta.yaml).

A bare single-file --out_filtered_bam (e.g. `filtered.bam`) created a
directory `filtered.bam/` and wrote the output inside it, instead of
writing the file. 2.1.0 fixed OutputWriter::generate_bam_output_paths
but missed two inline sites in main.cpp that run first and create the
directory, which then flips is_file_path (it checks fs::is_directory)
into directory mode, defeating the 2.1.0 fix:

- the auto-built bowtie2 index directory (main.cpp:340-342 -> the
  fs::create_directories in build_bowtie2_index), and
- the temporary spurious-junction BED directory (main.cpp:514-516).

Both used the anti-pattern "if parent_path() is empty, use the output
path itself as a directory".

Fix by adding two pure, unit-testable helpers in path_utils that never
treat a single output file as a directory:

- index_output_dir(): bare filename -> "" so the index builds in the
  default location (its own <reference>_bt2_idx/ directory next to the
  reference FASTA), not in a dir named after the file.
- sidecar_dir(): bare filename -> "." so the temp BED has a concrete
  directory without creating one named after the file.

main.cpp now calls these. The temp BED name is also made pid-unique to
avoid collisions between concurrent runs sharing an output directory.

Net effect: nothing creates `filtered.bam/`, so the file/dir heuristic
can't flip; a bare single-file --out_filtered_bam writes that file (and
--removed_alignments_bam writes <name>_removed_alignments.bam beside it),
with no stray directory. Directory mode for BAM lists is unchanged.

Add regression unit tests for index_output_dir/sidecar_dir and the
single-BAM+directory case. Update README (index location + read-only
reference caveat) and CHANGELOG. Bump to 2.1.1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ishinder merged commit e1a7181 into main on Jun 8, 2026
4 checks passed