v0.9.3
A focused review pass (structural → domain-correctness → security) that found
and closed four genuinely-broken things — two data-correctness, one security,
one precision — and added an opt-in high-precision timestamp. No public-API
breaks; the one runtime behaviour change (chunked NULL-key) converts a silent
data loss into a clear error.
Fixed
fix(chunked)— chunked export silently dropped rows with a NULL chunk
column. Range chunking emitsWHERE col BETWEEN min AND maxand date
chunking>= start AND < end; both exclude NULL, so a row whose
chunk_columnwas NULL fell into no chunk and vanished from the export with
no signal (the keyset path already refused a nullable key; the range path did
not). Detection now runs oneCOUNT(*) - COUNT(col)and bails with
remediation (NOT NULL column /WHERE col IS NOT NULL/chunk_dense/
mode: full) rather than dropping rows. Detected on the live data, so a
nullable column with zero actual NULLs is unaffected.fix(redact)— credentials could leak to stderr via the log sink. The
redaction module names "logs" in its scope, but enforcement was wired only at
the error/summary call sites; theenv_loggersink was bare, so a
log::warn!("…{e}", e)whose error captured ascheme://user:password@host
connect string printed the password (and the default filter iswarn, so
those lines show). The sink now routes every record through the redaction
chokepoint — no call site has to remember, present and future.
Added
feat(types)— opt-in nanosecond timestamps (timestamp_ns/
timestamp_tz_nscolumn overrides). Preserves a source's sub-microsecond
fractional seconds — notably SQL Serverdatetime2(7)'s 100 ns tick that the
default microsecond mapping truncates. The default stays microsecond on
purpose: Arrow nanosecond timestamps span only 1677–2262, so a blanket ns
mapping would corrupt out-of-range dates — the opt-in keeps the full-range
default and lets in-range columns carry the extra precision. Per-target
autoload verified live: DuckDB → nativeTIMESTAMP_NS(lossless); Snowflake →
NUMBER,TO_TIMESTAMP_NTZ(col, 9)recovers losslessly; BigQuery →INT64
raw nanos (a nativeTIMESTAMPcast is lossy µs). The incremental cursor
carries the full 9-digit value, fixing thedatetime2(7)boundary re-export.
Changed
docs(incremental)— documented the strict->cursor tie hazard.
Incremental resume usesWHERE cursor > last_value; rows that tie on the
high-watermark value and arrive after it is passed are skipped. This is
inherent high-watermark semantics (keyset is unaffected — its key is unique),
now stated where keyset's uniqueness contract already lives (semantics.md
"Known non-guarantees", config.md, the query builder).refactor(config)— split the 379-lineConfig::validateinto three
cohesive, independently-testable validators (exports-list / source-connection
/ per-export). Behaviour-preserving; all 27 validation/secops tests green.refactor(pipeline)— deletedsrc/pipeline/child.rs, a 461-line orphan
fork never wired into the module tree (so never compiled, its tests never
ran); its byte-identicalrender_child_stderrtests moved onto the live
parallel_children.rs. Gave the parallel chunk drivers' mutex
poison-recovery decision a single documented home (chunked/poison.rs).test(csv)— pinned the floatNaN/±InfinityCSV contract (emit the
literal, never an empty cell that would conflate with NULL) with a
characterization test + docs; verified correct-by-design.
Dependency policy
ci(deps)— Dependabot retuned to cut churn: routine crate updates are
scoped to major bumps only, grouped into one weekly PR, and held by a
14-day cooldown so a freshly shipped.0never lands before its first
round of fixes. GitHub Actions bumps are likewise grouped + cooled down.
Security PRs still arrive out-of-band (no cooldown), and thecargo audit
gate (pre-commit + CI) remains the hard backstop for any fixable advisory.