BNGResult loader crashes (np.loadtxt) on ragged .cdat files that BNG2.pl emits for SSA #103

@wshlavacek

Summary

bionetgen.run() raises ValueError while building its BNGResult whenever BNG2.pl writes a ragged .cdat. run_network does this for SSA on networks where some species are unpopulated at t=0 and become populated during the run: the per-row species-column count grows over the trajectory while the header line stays fixed at the initial count. result.py:_load_dat builds a fixed NumPy dtype from the header and calls np.loadtxt(path, dtype={...}), which requires every row to have the same width and aborts on the first wider row.

The simulation itself succeeds and the output files are written correctly — only the post-run load-back into BNGResult fails, which makes the whole run() call fatal even though a perfectly good .gdat is on disk.

Error

File ".../bionetgen/core/tools/result.py", line 176, in _load_dat
    return np.rec.array(np.loadtxt(path, dtype={"names": names, "formats": formats}))
...
ValueError: the dtype passed requires 26 columns but 27 were found at row 2; use `usecols` to select a subset and avoid this error

Reproduction

Any SSA run on a network with species unpopulated at t=0 triggers it. For example, agentCellSystem with a capped network:

BNGL actions (in model.bngl):

generate_network({overwrite=>1,max_iter=>3,max_agg=>4})
simulate({method=>"ssa",seed=>1,t_start=>0,t_end=>1,n_steps=>20,print_CDAT=>1})

Python:

import bionetgen
bionetgen.run("model.bngl", out="out")

The produced .cdat has a fixed-width header but data rows whose width grows as species first appear, e.g.:

header (incl. # and time): 27 tokens
data row widths:           26, 27, 28, 34, 36, 38, 40, 42
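A quick way to get these numbers for any .cdat is to count whitespace-separated tokens per line. The sketch below is only a diagnostic, not part of bionetgen: `row_widths` is a hypothetical helper, the sample text is a synthetic stand-in rather than real BNG2.pl output, and it assumes the header's `#` is its own token (matching the "incl. # and time" count above).

```python
import io


def row_widths(lines):
    """Return (header_token_count, [data_row_token_counts]) for .cdat-style text."""
    header_width = None
    widths = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # ignore blank lines
        if tokens[0].startswith("#"):
            if header_width is None:
                # Counts the '#' token and 'time' along with the species names.
                header_width = len(tokens)
            continue
        widths.append(len(tokens))
    return header_width, widths


# Synthetic ragged table: the header stays fixed while the rows grow.
sample = io.StringIO(
    "# time S1 S2\n"
    "0.0 5 0\n"
    "0.5 4 1 1\n"
    "1.0 3 1 2 1\n"
)
header_width, widths = row_widths(sample)
print(header_width, widths)
```

For a real file, passing `open(path)` instead of the `io.StringIO` object should work the same way, since the helper only iterates over lines.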

The .gdat (a fixed observable set) loads fine. The same model under method=>"ode" writes a dense .cdat that loads fine. The ragged .cdat is itself reproducible byte-for-byte across runs (fixed seed) — it is well-defined output, just not rectangular.

Root cause

result.py:_load_dat assumes a rectangular table (a fixed dtype derived from the header width). run_network's SSA .cdat output is not guaranteed rectangular.
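The failure mode can be reproduced without BNG2.pl at all. The following is a minimal sketch that mirrors the `np.loadtxt(..., dtype={"names": ..., "formats": ...})` call from the traceback on a small synthetic table; the column names, the `"f8"` formats, and the `try_load` helper are illustrative assumptions, not the actual contents of result.py.

```python
import io

import numpy as np

# Fixed structured dtype, as if derived from a three-column header.
names = ["time", "S1", "S2"]
formats = ["f8"] * len(names)


def try_load(text):
    """Return the ValueError raised by np.loadtxt, or None if the load succeeds."""
    try:
        np.loadtxt(io.StringIO(text), dtype={"names": names, "formats": formats})
    except ValueError as exc:
        return exc
    return None


# The second data row has one more column than the dtype allows.
ragged_err = try_load("# time S1 S2\n0.0 5 0\n0.5 4 1 1\n")
# The same data with rectangular rows loads without error.
rect_err = try_load("# time S1 S2\n0.0 5 0\n0.5 4 1\n")

print(type(ragged_err).__name__, rect_err)
```

The exact wording of the exception depends on the NumPy version, but a structured dtype combined with a wider row is rejected in both cases.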

Possible directions

  • Tolerate ragged rows: read up to the maximum row width and pad missing trailing species with 0 (or restrict the load to a fixed column set with usecols), since species absent from a row have population 0.
  • Or make a failed .cdat load non-fatal so a valid .gdat still yields a usable BNGResult.
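The first direction could look roughly like the sketch below. It is not a proposed patch to result.py: `load_ragged` and the `col_N` placeholder names are hypothetical, and it relies on the assumption stated above that late-appearing species are appended as trailing columns, so short rows can be zero-padded on the right. Because the header only names the initial species, columns beyond it get placeholder names here.

```python
import io

import numpy as np


def load_ragged(lines):
    """Load a possibly ragged whitespace-separated table into a record array."""
    names = None
    rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith("#"):
            if names is None:
                names = stripped.lstrip("#").split()  # first comment line = header
            continue
        rows.append([float(tok) for tok in stripped.split()])
    if not rows:
        raise ValueError("no data rows found")

    names = list(names or [])
    width = max(max(len(r) for r in rows), len(names))
    # Placeholder names (an assumption) for columns the header does not cover.
    names += [f"col_{i}" for i in range(len(names), width)]

    # Zero-pad short rows: a species missing from a row has population 0.
    table = np.zeros((len(rows), width))
    for i, row in enumerate(rows):
        table[i, : len(row)] = row
    return np.rec.fromarrays(list(table.T), names=names)


sample = io.StringIO("# time S1 S2\n0.0 5 0\n0.5 4 1 1\n1.0 3 1 2 1\n")
rec = load_ragged(sample)
print(rec.dtype.names)
print(rec.col_4)
```

The second direction is independent of this: wrapping the .cdat load in a `try`/`except ValueError` and continuing with the .gdat alone would keep `run()` from failing, at the cost of a BNGResult without species trajectories.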

Environment

BNG2.pl 2.9.3; Python 3.12; current numpy.
