`ZeroDivisionError` in `write_read_str_stats` when an STR length bucket has no coverage

**Version:** GQC 1.1.0 (also reproduced against `main` at `GQC/stats.py:869`)

## Summary

`readbench --strs` crashes with `ZeroDivisionError: division by zero` in `stats.write_read_str_stats` when the included STR set contains a `runlength` for which **no reads pass the qualification filters** (alignment span / soft-clip / 5 bp flank checks).

## Traceback

```
File ".../GQC/readbench.py", line 176, in main
  stats.write_read_str_stats('trinuc', trinucstats, outputfiles, args)
File ".../GQC/stats.py", line 869, in write_read_str_stats
  accrate = correctbylength[refstrlength]/(correctbylength[refstrlength] + lengtherrorbylength[refstrlength] + complexbylength[refstrlength] + flankbylength[refstrlength])
ZeroDivisionError: division by zero
```

## Root cause

In `errors.assess_str_read_coverage_require_flank` (`errors.py:376-377`), every STR in the included BED is registered in `strdict` *before* read tallying:

```python
runlength = end - start
if runlength not in strdict:
    strdict[runlength] = {}
```

If zero reads at any STR with that `runlength` survive the filters at lines 401-434 (`reference_start`/`reference_end` checks, soft-clip requirement, `<5 bp` flank check), `strdict[runlength]` remains `{}`.

Then in `stats.write_read_str_stats` the inner loop over `stats[refstrlength].keys()` never runs, so `correctbylength`, `lengtherrorbylength`, `complexbylength`, and `flankbylength` all stay at 0, and the per-row division on line 869 fails.

Note: even when reads *are* present, this can still fire if the only category seen for a given length is `HET`, since `HET` is excluded from the denominator.

## Reproduction

Run `readbench --strs` against a BAM with sparse coverage at a particular STR locus. We hit it on a low-coverage diploid mapping where one trinuc STR (`TNR_7289_ATT` at `chr21_MATERNAL:47134596-47134638`, runlength=42) had zero qualifying reads. The same crash also occurred on `chr7_MATERNAL` (trinuc) and on a genome-wide `--rerun` (tetranuc) in the same dataset.

## Suggested fix

Guard the division and skip the row when there's no data:

```python
for refstrlength in sorted(correctbylength.keys()):
    denom = (correctbylength[refstrlength]
             + lengtherrorbylength[refstrlength]
             + complexbylength[refstrlength]
             + flankbylength[refstrlength])
    if denom == 0:
        continue
    accrate = correctbylength[refstrlength] / denom
    lfh.write(...)
```

Alternatively, write the row with `accrate=NA` so empty buckets are still represented in the output.

## Environment

- GQC 1.1.0, installed via `uvx --from gqc readbench`
- Python 3.14
- Linux x86_64

Bug patched and issue text generated with the help of claude opus 4.7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`ZeroDivisionError` in `write_read_str_stats` when an STR length bucket has no coverage #5

Summary

Traceback

Root cause

Reproduction

Suggested fix

Environment

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ZeroDivisionError in write_read_str_stats when an STR length bucket has no coverage #5

Description

Summary

Traceback

Root cause

Reproduction

Suggested fix

Environment

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`ZeroDivisionError` in `write_read_str_stats` when an STR length bucket has no coverage #5