Develop by joshfactorial · Pull Request #265 · ncsa/NEAT

joshfactorial · 2026-04-08T18:07:19Z

No description provided.

…T into feature/karen_bacterial_wrapper

Markov integration

Bug fixing

…terial_wrapper

test_error_models.py (21 tests):
- TraditionalQualityModel: default construction, quality score range, error rate dict, uniform mode init and score bounds, get_quality_scores for uniform/non-uniform/exact-length/shorter-than-model cases, ndarray return, reproducibility
- SequencingErrorModel: default construction, variant prob sum, custom error rate, zero-error returns empty list, high-error produces ErrorContainers with valid locations and alt bases, padding returned
- ErrorContainer: field storage for SNV, Deletion, Insertion types

test_single_runner.py (37 tests, 3 classes):
- TestInitializeAllModels: returns 4-tuple of correct types, rng attached, mutation_rate override, fragment_mean path, default mean=read_len*2
- TestWriteBlockVcf: single SNV written, ref/alt/qual correct, empty input produces no output, multiple SNVs in sorted order, 10 VCF columns
- TestReadSimulatorSingle: return tuple structure, thread_idx/contig_name passthrough, ContigVariants return, all file_dict keys present, FASTQ file created and valid, content has FASTQ records, seed reproducibility, vcf=None when produce_vcf=False

Note: quality_score_model.py is a standalone Markov analysis script with hardcoded BAM paths; it is not a library module and is not tested here.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- test_error_models.py: 4 new indel error tests (deletion, insertion, blacklist deduplication); fix monkeypatch approach for numpy Generator
- test_single_runner.py: 2 new tests covering produce_bam=True path (lines 127-152) and bam=None when not requested
- test_runner.py: 8 new integration tests covering discard_bed, mutation_bed, ploidy=1, ploidy=4, min_mutations, and produce_bam=True; also input VCF, target_bed, and mutation_rate override
- tests/test_variants/: new test_variant_types.py (73 tests for all comparison operators, __repr__, contains(), get_alt()) and test_contig_variants.py (24 tests documenting known bugs in remove_variant and check_if_ins)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

generate_variants.py:
- N in mutation region (lines 139-157 avoidance logic)
- N-heavy subsequence skip (line 204)
- Trinucleotide with N triggers disallowed_chars path (lines 246-249)
- High-rate deletion overlap handling (lines 291-302)

error_models.py:
- MarkovQualityModel stub coverage (lines 112-118)
- Score clamped to min 1 and max 42 via extreme qual_score_probs (lines 102-104)
- Documents unreachable indel branches (lines 209-235, 252) caused by the total_indel_length > read_length//4 circular gate; includes TODO comment with proposed assertions for after the bug fix

vcf_func.py:
- Three WP-genotype tests document that the WP condition is always False (string "WP" never equals a list); each includes a TODO comment with exact updated assertions to apply once the condition is corrected to any(x.split('=')[0] == "WP" for x in ...)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
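The WP condition described above can be reproduced in isolation. This is a minimal sketch, assuming the fields being checked are a list of KEY=VALUE strings; the function names and the `fields` variable are hypothetical and are not taken from vcf_func.py.

```python
def has_wp_buggy(fields):
    # Compares a string with a list; in Python this is never equal,
    # so the condition is always False regardless of the list's contents.
    return "WP" == fields


def has_wp_fixed(fields):
    # Compares the key part of each KEY=VALUE entry instead,
    # following the corrected condition quoted in the commit message.
    return any(x.split("=")[0] == "WP" for x in fields)


fields = ["DP=10", "WP=0/1"]
print(has_wp_buggy(fields))
print(has_wp_fixed(fields))
```

Under these assumptions the buggy form stays False even when a WP entry is present, which is why the existing tests can only document the behavior until the condition is corrected.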

Duplicates removed:
- test_error_and_mut_models.py: test_sequencing_error_model_zero_error_returns_none_or_empty (covered by test_error_models.py::test_sem_zero_error_rate_returns_empty)
- test_error_and_mut_models.py: test_traditional_quality_model_reproducible_with_seed (covered by test_error_models.py::test_tqm_get_quality_scores_reproducible)
- test_seq_error.py: test_no_errors_when_avg_zero (same as above)
- test_models/test_stitch_outputs.py: test_concat_joins_files_in_order (covered by test_read_simulator/test_stitch_outputs.py)

Vacuous assertions fixed:
- test_error_models.py: rename indel dead-code tests and assert == 0 (not >= 0)
- test_output_file_writer.py: replace `assert True` with bam_handle.tell() > pos_before
- test_output_file_writer.py: strengthen test_reg2bin_same_16kb_bin to check determinism
- test_runner.py: test_filter_bed_regions_returns_list now checks content
- test_single_runner.py: test_returns_four_element_tuple now checks element types
- test_vcf_func.py: test_variant_genotype_returns_correct_ploidy_length checks values
- test_generate_variants.py: test_generate_variants_variant_types_are_valid checks attributes
- test_contig_variants.py: test_remove_variant_method_exists adds TODO comment for post-fix update

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
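The difference between `>= 0` and `== 0` on a count is easy to see in a small example. This is an illustrative sketch only; `count_indel_errors` and the string labels are hypothetical stand-ins, not NEAT's ErrorContainer API.

```python
def count_indel_errors(errors):
    # Hypothetical helper: counts entries labeled as insertions or deletions.
    return sum(1 for e in errors if e in ("Insertion", "Deletion"))


def vacuous_check(errors):
    # A count can never be negative, so this holds for every input
    # and cannot detect a regression.
    return count_indel_errors(errors) >= 0


def strict_check(errors):
    # Holds only when no indel errors were produced.
    return count_indel_errors(errors) == 0


with_indels = ["SNV", "Deletion"]
without_indels = ["SNV", "SNV"]
print(vacuous_check(with_indels), strict_check(with_indels))
print(vacuous_check(without_indels), strict_check(without_indels))
```

The vacuous form passes for both inputs, while the strict form distinguishes them, so only the strict form would fail if the dead-code path started producing indels.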

- generate_reads.py lines 234-236: paired-end discard check for read2 (test_generate_reads_paired_discard_region_removes_all)
- generate_reads.py: paired no-discard regression guard (test_generate_reads_paired_no_discard_produces_read_pairs)
- single_runner.py line 85: "Record too small" debug log path, with generate_reads/generate_variants patched to avoid infinite loop (test_record_too_small_logs_and_continues)
- bed_func.py line 209: mutation rate > 0.3 warning log (test_parse_single_bed_mutation_high_rate_logs_warning)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2. Change version to 4.3.6 in pyproject.toml
3. Fixed channel order in environment.yml, which caused a crash due to unsatisfied bcftools requirements

… repo dir

Feature/claude assisted tests

keshav-gandhi · 2026-04-25T15:52:02Z

I have tested all of the changes listed in these commits. Notably, I have not found further coverage bugs across several tests. I believe the code is ready to merge into main.

(I believe the discrepancy I found earlier could have come from a Git-related artifact.)

Feature/karen bacterial wrapper

Markov integration

Solving VCF-related issues.

…ut requested

The no-files guard in OutputFileWriter.__init__ was raising ValueError whenever file_handles was empty, but in BAM-only mode the top-level OFW intentionally has no open file handle (pysam manages the merged BAM directly). Guard now exempts the case where a BAM output path is set.

Adds unit tests for BAM-only and VCF-only OFW construction, and integration tests for all three produce_fastq=false output combinations through the full runner, which would have caught this regression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
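The guard change described above can be sketched roughly as follows. `WriterSketch`, its `bam_path` parameter, and the error message are hypothetical; this is not the actual OutputFileWriter code, only an illustration of exempting the BAM-only case.

```python
class WriterSketch:
    """Hypothetical stand-in for the guard logic, not NEAT's OutputFileWriter."""

    def __init__(self, file_handles, bam_path=None):
        # Raise only when there is nothing to write at all:
        # no open file handles and no BAM output path.
        if not file_handles and bam_path is None:
            raise ValueError("No output files requested")
        self.file_handles = file_handles
        self.bam_path = bam_path


# BAM-only construction succeeds even though no handles are open.
bam_only = WriterSketch({}, bam_path="out.bam")

# Requesting nothing still raises.
try:
    WriterSketch({})
    raised = False
except ValueError:
    raised = True
print(raised)
```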

Fix OutputFileWriter crash when produce_fastq=false and only BAM outp…

joshfactorial · 2026-04-28T14:25:39Z

Still seeing duplicated variants at the same position. The fix corrected exact duplicates, but we're still seeing same-position variants, which shouldn't happen except maybe rarely in large datasets.
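One way to tell exact duplicates apart from distinct variants that share a position is to group records by position. This is a minimal sketch, assuming variants are available as (pos, ref, alt) tuples; the function name and tuple layout are assumptions and do not reflect NEAT's variant classes.

```python
from collections import defaultdict


def same_position_groups(variants):
    """Return positions that carry more than one distinct (ref, alt) pair."""
    by_pos = defaultdict(set)
    for pos, ref, alt in variants:
        # Using a set collapses exact duplicates at the same position.
        by_pos[pos].add((ref, alt))
    return {pos: sorted(alleles) for pos, alleles in by_pos.items() if len(alleles) > 1}


variants = [
    (100, "A", "T"),
    (100, "A", "T"),   # exact duplicate, collapsed by the set
    (250, "C", "G"),
    (250, "C", "CA"),  # different variant at the same position
]
print(same_position_groups(variants))
```

In this sketch position 100 is not reported because its two records are identical, while position 250 is reported because it holds two different alleles.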

Removed excess details and streamlined the introduction to focus on key features and improvements in NEAT 4.4.

test_qual_score_models.py:
- Fix two tests that used -1 as a quality score key in position_distributions (invalid data that bypassed real validation); replace with valid scores
- Add 11 new tests covering the previously untested Markov chain path: transition chain is exercised, fallback to marginal for unknown q_prev, wrong transition_distributions length raises, out-of-range score clamping, length=0/1 edge cases, read-length interpolation via _position_index_for_length

test_error_models.py:
- Remove two stub tests that imported from error_models.MarkovQualityModel (a dead TODO class never called at runtime); they passed unconditionally and gave false confidence about the real model

test_markov_utils.py (new):
- 36 tests covering all previously untested functions in markov_utils.py: _down_bin_quality (6), compute_initial_distribution (4), compute_position_distributions (4), compute_transition_distributions (5), read_quality_lists (6), build_markov_model (3)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
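The behaviors these tests target (chain transitions, fallback to the marginal distribution for an unknown q_prev, clamping, and the length=0/1 edge cases) can be sketched with a toy sampler. Everything here is an assumption for illustration: the function name, the dict-based distribution layout, and the clamp bounds of 1 and 42 (borrowed from the error_models.py note earlier in this PR) are not taken from NEAT's Markov model.

```python
import random


def sample_scores(length, initial, position_dists, transition_dists, rng, lo=1, hi=42):
    """Sample one list of quality scores from a toy first-order Markov chain."""

    def draw(dist):
        # dist maps score -> weight; sample one score and clamp it to [lo, hi].
        scores, weights = zip(*sorted(dist.items()))
        q = rng.choices(scores, weights=weights, k=1)[0]
        return min(hi, max(lo, q))

    if length == 0:
        return []
    out = [draw(initial)]
    for i in range(1, length):
        # transition_dists[i - 1] maps q_prev -> {q_next: weight}.
        dist = transition_dists[i - 1].get(out[-1])
        if dist is None:
            # Unknown previous score: fall back to the marginal at this position.
            dist = position_dists[i]
        out.append(draw(dist))
    return out


initial = {30: 1.0}
position_dists = [{30: 1.0}, {20: 1.0}, {50: 1.0}]
transition_dists = [{30: {35: 1.0}}, {99: {10: 1.0}}]
print(sample_scores(3, initial, position_dists, transition_dists, random.Random(0)))
```

With these single-outcome distributions the second score comes from the transition table, and the third falls back to the marginal because 35 has no transition entry, then gets clamped to the upper bound.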

Updating version in README.md.

Add Markov quality model test coverage and fix invalid test data

keshav-gandhi

Tests make sense and look great!

karenhx2 and others added 30 commits October 4, 2025 00:38

Developed script for wrapping bacterial chromosomes

8aa2032

Cleaned up files

fbca683

Merge branch 'main' into feature/karen_bacterial_wrapper

1440b21

Worked on testing functions for stitching outputs together

edc67b8

Merge branch 'feature/karen_bacterial_wrapper' of github.com:ncsa/NEA…

ba60be7

…T into feature/karen_bacterial_wrapper

Removed reference and data files

541b3b1

Updated script to include paired-ended runs

85a9c51

More error model and mutation model tests.

fdb8d2c

Readability updates to README.

069272c

Full Markov integration set of scripts.

9d43fa3

Merge branch 'main' into markov-integration

b539df2

Implemented CLI for wrapper

acef491

Removed extra copy of runner code

e75e596

Merge branch 'main' into markov-integration

313e1e8

Merge pull request #224 from ncsa/markov-integration

a666171

Markov integration

Renaming to parallel_block_size and parallel_mode; fixing tests.

7d6808f

Fixed bugs with config files and output stitching

8459c8b

Installation instructions.

55541ff

Merge branch 'develop' into feature/karen_bacterial_wrapper

3fd4eb2

Finished final debugging

dc28b19

Accidentally removed poetry file and re-adding it

aaaeed4

Merge pull request #240 from ncsa/quality-of-life

2e8e30c

Bug fixing

Merge branch 'develop' of github.com:ncsa/NEAT into feature/karen_bac…

ec1c70a

…terial_wrapper

Markov quality score model changes copied in here.

5a4d347

Merge branch 'develop' of github.com:ncsa/NEAT into feature/karen_bac…

349857c

…terial_wrapper

Updated wrapper

4a33b46

Developed script for wrapping bacterial chromosomes

383a4ef

Cleaned up files

49ada66

Worked on testing functions for stitching outputs together

7c9c546

Removed reference and data files

17a5340

joshfactorial and others added 11 commits April 17, 2026 23:11

Updating versions

3dce75f

1. Update README

cc549dc

2. Change version to 4.3.6 in pyproject.toml
3. Fixed channel order in environment.yml, which caused a crash due to unsatisfied bcftools requirements

Tests final final_final

7c0c852

Added neat log isolation so it didn't keep writing test logs into the…

265d4e8

… repo dir

Merge pull request #263 from ncsa/feature/claude-assisted-tests

cb4bde9

Feature/claude assisted tests

Merge branch 'main' into develop

670bc10

keshav-gandhi and others added 8 commits April 26, 2026 15:42

Solving VCF-related issues.

ce60026

Merge branch 'develop' into feature/karen_bacterial_wrapper

9cd65cc

Merge pull request #237 from ncsa/feature/karen_bacterial_wrapper

60ea4fd

Feature/karen bacterial wrapper

Merge pull request #239 from ncsa/markov-integration

ead3e07

Markov integration

Merge pull request #267 from ncsa/vcf-fixes

342356c

Solving VCF-related issues.

Merge pull request #268 from ncsa/fix/bam-only-output

cce0e82

Fix OutputFileWriter crash when produce_fastq=false and only BAM outp…

Updating version in README.md.

5119e05

keshav-gandhi and others added 6 commits April 28, 2026 16:35

Revised README for NEAT 4.4.

1ba5320

Removed excess details and streamlined the introduction to focus on key features and improvements in NEAT 4.4.

Merge pull request #269 from ncsa/update-readme

7302162

Updating version in README.md.

found an off-by-one bug and added claude-assisted tests

68a7126

Updating code to prevent multi-allelic variants

9fecb38

Merge pull request #270 from ncsa/test/markov-coverage

f7beb1e

Add Markov quality model test coverage and fix invalid test data

joshfactorial requested a review from keshav-gandhi May 2, 2026 22:35

keshav-gandhi approved these changes May 3, 2026

joshfactorial merged commit f6baaf1 into main May 3, 2026
2 checks passed

joshfactorial merged 103 commits into main from develop

4 participants
