Releases: marc-chiesa/protokit
0.12.0 — storage CLI Parquet output
0.12.0 — typed Parquet output from the storage CLI
protokit storage scan now writes typed Parquet directly: --format parquet -o out.parquet converts proto records straight to columnar through the optional protokit[parquet] extra, with no JSON intermediate (#24, #26).
Highlights
- All-or-nothing atomic publish: the file appears at
-oonly after a complete, fault-free scan; a pre-existing output survives any fault and is overwritten only by a complete result. - Misuse fails before any record is read (missing
-o,--on-error skip|warn,--fields,--explicit-defaults, env-sourcedPROTOKIT_FORMAT=parquet,-ocolliding with an input file — all exit 2). - Fault reports now name the first fault's location, not just a count.
- Parquet values are Arrow-native by design (bytes → binary, enums → int32, timestamps at microsecond resolution) — deliberately divergent from the JSON view.
BREAKING (pre-1.0 policy): IncompleteScanError(fault_count: int) → IncompleteScanError(faults: tuple[FrameError, ...]). Consumers constructing the exception directly must update; read-only consumers are unaffected (fault_count is preserved as len(faults)).
Full details in CHANGELOG.md.
v0.11.0
Expressive proto test-assertion layer over the protokit.message differ.
Added
- Expressive message matchers + comparison parity (
protokit.message). A framework-agnostic test-assertion layer over the differ:proto_match(actual, expected, *, partial=, as_set=, ignore=, presence=, approx=)(single-call) andexpect_proto(expected).partially().as_set("items").ignoring(pred).approximately(...).matches(actual)(fluent) raiseAssertionErrorwith the existing per-field diff on mismatch. The same policy is exposed as ahamcrest.BaseMatcherviaequals_proto(...)behind a new optionalprotokit[hamcrest]extra (used asassert_that(actual, equals_proto(...))), and as aproto_matcherpytest fixture; the bareassert msg1 == msg2rich-diff rendering gains presentation-only config. Backing the matcher, theMessageDifferencerengine gains five opt-in, additive capabilities, all targeted by one unified field selector (a dotted path or a(FieldDescriptor, path)predicate): partial / sub-shape scope (set_partial()); keyless set comparison for repeated fields (treat_as_set(selector)); predicate-based field ignore (ignore_fields(...)now also accepts a predicate); EQUAL vs EQUIVALENT presence (set_message_field_comparison(...)); and selective per-field float tolerance (set_float_comparison(..., selector=)). New public surface:proto_match,expect_proto,MatchPolicy,Approx,MatcherError,equals_proto,HamcrestExtraNotInstalledError,MessageFieldComparison. The CLI matcher surface is a separate later effort.
Behavior note: the default EQUIVALENT presence mode now treats a presence-bearing field set to its default value as equal to an unset field (previously a presence difference); a non-default value vs unset is unchanged. Opt into MessageFieldComparison.EQUAL for strict set-vs-unset presence.
Install: pip install protokit==0.11.0 · with the PyHamcrest adapter: pip install 'protokit[hamcrest]==0.11.0'
v0.10.0
0.10.0
Two additive protokit storage deliveries:
- Field selection + dense JSON (
--fields/--explicit-defaults) —
protokit storage scan/headgain--fields a,b.c(presence-faithful
nested view, snake_case) and--explicit-defaults(dense full-record JSON,
camelCase). New publicprotokit.storage.project()+FieldSelectionError. - Columnar / Parquet output — optional
protokit[parquet]extra
(Rust-backed ptars + pyarrow) adds a library-first proto→Arrow→Parquet path
(to_arrow_batches/to_parquet), skipping the proto→JSON→Parquet
double-encode. New typed exceptions:ParquetExtraNotInstalledError,
SchemaMismatchError,UnknownStreamError,HandlerBuildError,
IncompleteScanError.
See CHANGELOG.md for full details. pip install protokit==0.10.0
v0.9.0 — data-at-rest
protokit 0.9.0 — data-at-rest
BREAKING — the message differ's Difference.old_value/new_value are renamed to left_value/right_value (the two compared messages aren't a before/after pair). Read-only deprecation aliases are retained and removed at 1.0; constructing with the old kwargs raises. Pre-1.0, pin protokit~=0.9.0.
Added — data-at-rest
protokit.storage— schema-aware scan/filter engine for stored protobuf.scan(source, registry, *, predicate, on_error)routes each(stream_id, record_bytes)record to its stream's isolated descriptor pool and yields a taggedScanRecord.Sourceaccepts any iterable of(stream_id, bytes | memoryview); fail-loud by default with opt-inskip/collect/routetolerant modes.protokit storageCLI —scan/head/countover length-delimited files: the minimal--wheregrammar (path == scalar/!=/has:path),--desc/--proto(+--proto-path) schema sources with--type,--format human|json, and--on-error raise|skip|warn. Exit0/2(+count --quietgrep-like1).ProtoFileSchema(.proto→compile schema source, neverSystemExit),on_error='route'+ anerror_sinkcallback onscan(), and the typedSchemaCompileError/WhereErrorexceptions.
Full details in CHANGELOG.md ## 0.9.0.
v0.8.0 — D7: protokit compat --rule-pack rename
protokit compat's --rule-pack flag collided in name with the
protokit lint --rule-pack flag added in D3 — they shared a name
across subcommands but loaded different rule systems (compat's are
FieldPlugin-shaped via SchemaChecker.load_rule_pack; lint's are
LintRuleSpec-shaped). D7 renames compat's flag to
--compat-rule-pack and keeps the old name as a deprecation alias.
Both flags work today; the legacy name is removed in protokit 1.0.
No behavior change for users who migrate to the new name; no break
for users who don't.
Added
--compat-rule-pack MODULEon every sub-subcommand within
protokit compat(check,history,bisect,ci). Same
semantics as the legacy--rule-pack(Python module exposing a
RULES = [(rule_id, plugin_fn), ...]list; repeatable; loaded via
SchemaChecker.load_rule_pack). The new name is the canonical
name going forward and resolves the cross-CLI naming collision
withprotokit lint --rule-pack(which retains the unqualified
name).
Deprecated
--rule-packonprotokit compat {check,history,bisect,ci}is now
a deprecation alias for--compat-rule-pack. The flag still loads
packs identically and remains accepted in 0.8.x, but each
invocation emits aUserWarningto stderr:
"--rule-pack is deprecated and will be removed in protokit 1.0; use --compat-rule-pack instead."
The flag ishidden=Truein--helpoutput to nudge new code
toward the canonical name.UserWarning(notDeprecationWarning) is the deliberate class
choice.DeprecationWarningis hidden from CLI users by
Python's default warning filter, and it gets promoted to an
exception under-W error::DeprecationWarningstrict-warning CI
(which Click traps in its arg-parse pipeline).UserWarningis
visible by default and matches the in-repo precedent at
src/protokit/formatters/_registry.py.- The warning fires exactly once per invocation regardless of how
many times--rule-packis repeated on the command line (Click
invokes per-option callbacks once per option-collection cycle).
Mixing the old and new flag names in a single invocation
accumulates both sets of packs (per Clickmultiple=True
semantics, not last-wins) and emits exactly one warning.
Migration note
Old:
protokit compat check OLD NEW --type acme.User --rule-pack myorg.proto_rulesNew:
protokit compat check OLD NEW --type acme.User --compat-rule-pack myorg.proto_rulesMechanical search-and-replace of the flag name. No changes to the
rule-pack module structure (RULES = [(rule_id, plugin_fn), ...]),
no changes to plugin context APIs, no changes to other compat flags
(--formatter-module, --ignore, --dedupe-by-type, --quiet,
etc.). protokit lint --rule-pack is unchanged — only compat's
flag is renamed.
If you'd rather keep --rule-pack working until 1.0, you can — the
deprecation warning is informational, not a CI break. Strict-warning
test environments running -W error::UserWarning should wrap legacy
--rule-pack invocations in warnings.catch_warnings() until the
migration is complete.
Test coverage
Three AE-driven tests in tests/schema/test_cli.py::TestRulePack
extend the existing class with coverage for the new flag's load path
(test_compat_rule_pack_loads_pack_no_warning), the deprecation
warning's exact-token presence
(test_rule_pack_legacy_emits_user_warning — asserts --rule-pack,
deprecated, 1.0, and --compat-rule-pack are all present in the
warning message), and the "exactly one warning when both flags
supplied" semantics (test_both_flags_accumulate_warn_once). Three
smoke binding tests in
tests/schema/test_cli.py::TestCompatRulePackBinding assert the new
flag is registered on history, bisect, and ci (one test each).
All six tests use warnings.catch_warnings(record=True) + warnings.simplefilter("always") to bypass Python's per-message
warning-dedupe registry, which would otherwise suppress the warning
on second+ invocation in the same test session.
Tag: v0.8.0. PyPI: pip install protokit==0.8.0.