0.12.0 — storage CLI Parquet output
0.12.0 — typed Parquet output from the storage CLI
protokit storage scan now writes typed Parquet directly: --format parquet -o out.parquet converts proto records straight to columnar through the optional protokit[parquet] extra, with no JSON intermediate (#24, #26).
Highlights
- All-or-nothing atomic publish: the file appears at -o only after a complete, fault-free scan; a pre-existing output survives any fault and is overwritten only by a complete result.
- Misuse fails before any record is read (missing -o, --on-error skip|warn, --fields, --explicit-defaults, env-sourced PROTOKIT_FORMAT=parquet, -o colliding with an input file; all exit 2).
- Fault reports now name the first fault's location, not just a count.
- Parquet values are Arrow-native by design (bytes → binary, enums → int32, timestamps at microsecond resolution) — deliberately divergent from the JSON view.
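
An all-or-nothing publish like the one described above is commonly built by writing to a temporary file in the destination directory and renaming it over the target only on success. The following is a minimal sketch of that general pattern, not protokit's actual implementation; publish_atomically and its write callback are hypothetical names introduced for illustration.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def publish_atomically(dest: str, write: Callable[[BinaryIO], None]) -> None:
    """Write via a temp file beside dest; rename over dest only on success.

    Hypothetical helper: illustrates the pattern, not protokit's code.
    """
    directory = os.path.dirname(os.path.abspath(dest))
    # The temp file lives in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            write(tmp)              # may raise on a scan fault
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before publishing
        os.replace(tmp_path, dest)  # swap the finished file into place
    except BaseException:
        # Any fault: discard the partial file and leave dest untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the write callback raises, the exception propagates, the temporary file is removed, and whatever was previously at dest is left as it was.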
BREAKING (pre-1.0 policy): IncompleteScanError(fault_count: int) → IncompleteScanError(faults: tuple[FrameError, ...]). Code that constructs the exception directly must now pass the tuple of faults; code that only reads the exception is unaffected, because fault_count is preserved as len(faults).
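
For code that constructs the exception directly, the migration might look like the sketch below. Both classes are hypothetical stand-ins defined locally so the example runs on its own: the real FrameError fields are not listed in these notes, so location and message are assumptions, and only the constructor shape and the fault_count behaviour are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameError:
    """Stand-in for protokit's FrameError; both fields are assumed."""
    location: str
    message: str


class IncompleteScanError(Exception):
    """Stand-in mirroring the 0.12.0 constructor shape."""

    def __init__(self, faults: tuple[FrameError, ...]) -> None:
        super().__init__(f"{len(faults)} fault(s)")
        self.faults = tuple(faults)

    @property
    def fault_count(self) -> int:
        # Read-only consumers keep working: the count is derived.
        return len(self.faults)


# Before 0.12.0 (no longer valid): IncompleteScanError(1)
# From 0.12.0: pass the fault records themselves.
err = IncompleteScanError((FrameError("frame 7", "truncated record"),))
count = err.fault_count
```

Deriving fault_count from the tuple keeps a single source of truth, so the count cannot drift from the recorded faults.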
Full details in CHANGELOG.md.