Fix --out_filtered_bam <file> creating a directory for a single BAM (2.1.1)#3
Merged
Merged
Conversation
A bare single-file --out_filtered_bam (e.g. `filtered.bam`) created a directory `filtered.bam/` and wrote the output inside it, instead of writing the file. 2.1.0 fixed OutputWriter::generate_bam_output_paths but missed two inline sites in main.cpp that run first and create the directory, which then flips is_file_path (it checks fs::is_directory) into directory mode, defeating the 2.1.0 fix: - the auto-built bowtie2 index directory (main.cpp:340-342 -> the fs::create_directories in build_bowtie2_index), and - the temporary spurious-junction BED directory (main.cpp:514-516). Both used the anti-pattern "if parent_path() is empty, use the output path itself as a directory". Fix by adding two pure, unit-testable helpers in path_utils that never treat a single output file as a directory: - index_output_dir(): bare filename -> "" so the index builds in the default location (its own <reference>_bt2_idx/ directory next to the reference FASTA), not in a dir named after the file. - sidecar_dir(): bare filename -> "." so the temp BED has a concrete directory without creating one named after the file. main.cpp now calls these. The temp BED name is also made pid-unique to avoid collisions between concurrent runs sharing an output directory. Net effect: nothing creates `filtered.bam/`, so the file/dir heuristic can't flip; a bare single-file --out_filtered_bam writes that file (and --removed_alignments_bam writes <name>_removed_alignments.bam beside it), with no stray directory. Directory mode for BAM lists is unchanged. Add regression unit tests for index_output_dir/sidecar_dir and the single-BAM+directory case. Update README (index location + read-only reference caveat) and CHANGELOG. Bump to 2.1.1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A bare single-file
--out_filtered_bam(e.g.--out_filtered_bam filtered.bam) created a directoryfiltered.bam/and wrote the filtered BAM inside it, instead of writing the file. 2.1.0 fixedOutputWriter::generate_bam_output_paths, but two inline sites inmain.cpprun first and create the directory — which then flipsis_file_path(it checksfs::is_directory) into directory mode, defeating the 2.1.0 fix:main.cpp:340-342→fs::create_directoriesinbuild_bowtie2_index), andmain.cpp:514-516).Both used the anti-pattern "if
parent_path()is empty, use the output path itself as a directory."Fix
Two pure, unit-testable helpers in
path_utilsthat never treat a single output file as a directory:index_output_dir()— bare filename →""so the index builds in its own<reference>_bt2_idx/directory next to the reference FASTA (the existing default), not in a dir named after the file.sidecar_dir()— bare filename →"."so the temp BED has a concrete directory without creating one named after the file.main.cppnow calls these. The temp BED name is also made pid-unique to avoid collisions between concurrent runs sharing an output directory.Net effect: nothing creates
filtered.bam/; a bare single-file--out_filtered_bamwrites that file (and--removed_alignments_bamwrites<name>_removed_alignments.bambeside it), with no stray directory. Directory mode for BAM lists is unchanged.Tests / verification
index_output_dir/sidecar_dirand the single-BAM+directory case (tests/test_path_utils.cpp).-i),--removed_alignments_bam, single→directory, and a 2-BAM list→directory. All produce the expected files with no stray directory.Docs / version
--out_filtered_bamover-claim.