Release v0.3.2 · kayhendriksen/foehn

What's new in 0.3.2

A reliability and security release — no new datasets, but downloads, parsing,
and the release pipeline are all hardened.

- Fixed an infinite loop in CSV parsing. The dtype-recovery fallback in
  parse_csv_bytes() retried forever when a column failed to parse even as
  Float64 (e.g. a stray non-numeric value past the schema-inference window).
  The bug was reachable from foehn.load() and the MCP server's load_data; it
  now raises a clear error instead of hanging.
- Climate normals (C6) recover from interrupted runs. The skip check now
  looks at the extracted TXT files rather than the ZIP, so a run that died
  between download and extraction is retried instead of silently skipped
  forever.
- State files can no longer brick the pipeline. _etags.json and
  _last_run.json are written atomically, and a corrupt state file is treated
  as empty (with a warning) instead of crashing every subsequent run.
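
As a rough illustration of the state-file behaviour described above, the sketch below writes JSON through a temporary file plus `os.replace` and treats an unreadable file as empty. It is not foehn's actual code: the function names `write_state` and `read_state` and the logger name are hypothetical, and only the general pattern (atomic rename, warn and fall back to empty) is taken from the notes.

```python
import contextlib
import json
import logging
import os
import tempfile
from pathlib import Path

log = logging.getLogger("state_sketch")  # hypothetical logger name


def write_state(path: Path, state: dict) -> None:
    """Write JSON to a temp file next to the target, then rename over it."""
    # The temp file lives in the same directory so the rename stays on one
    # filesystem, which is what makes os.replace atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


def read_state(path: Path) -> dict:
    """Return the stored state, or {} if the file is missing or corrupt."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except FileNotFoundError:
        return {}
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        log.warning("ignoring corrupt state file %s: %s", path, exc)
        return {}
    if not isinstance(data, dict):
        log.warning("ignoring state file %s: expected a JSON object", path)
        return {}
    return data
```

With this pattern a crash mid-write leaves either the old file or the new one on disk, never a half-written one, and a damaged file costs one warning rather than every later run.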

- All downloads are now atomic: CSVs and ZIPs join the binary assets in
  streaming to a .part file and renaming on completion (#21 + this release),
  so an interrupted transfer never leaves a truncated file behind.
- STAC listing and pagination use retrying HTTP sessions, and retries now also
  cover 429 rate limits (honouring Retry-After).
- The ETag store is pruned of stale entries on clean full runs, so it no
  longer grows without bound as forecast assets cycle.
- CSV assets with query strings (e.g. ?token=...) are detected correctly, and
  time slices are parsed from the trailing filename segment so a coincidental
  "now" elsewhere in a URL can't be misread (#21).
- The library logs through standard logging (foehn.*, silent when imported);
  the CLI attaches its own stdout handler (#21).
- CSV decoding can no longer fail: the Windows-1252 fallback replaces
  unmappable bytes instead of raising.
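
To make the last point concrete, here is a minimal sketch of a decode step that cannot raise. It assumes UTF-8 is tried first and Windows-1252 is the fallback; the notes only name the fallback, so the first step and the function name `decode_csv_bytes` are assumptions, not foehn's API.

```python
def decode_csv_bytes(data: bytes) -> str:
    """Decode as UTF-8 first; fall back to Windows-1252 with replacement."""
    try:
        return data.decode("utf-8")  # assumed primary encoding
    except UnicodeDecodeError:
        # cp1252 leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined, so a strict
        # decode can still raise; errors="replace" maps them to U+FFFD.
        return data.decode("cp1252", errors="replace")
```

Without `errors="replace"`, a single undefined byte such as 0x81 would abort the whole load even though every other byte in the file is readable.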

- foehn.download() gained a force= flag to re-download ZIP-shipped datasets
  (e.g. climate_scenarios_indoor) that would otherwise be skipped when already
  extracted:
```
import foehn

foehn.download("climate_scenarios_indoor", force=True)
```
- list_datasets() no longer advertises frequencies for datasets where the
  frequency filter isn't supported (forecast_local, climate_scenarios,
  climate_scenarios_indoor); the granularity is named in the description
  instead.
- All download functions return a DownloadResult summary (counts + new
  filenames) so callers can gate downstream work (#21).

- ZIP extraction now guards against decompression bombs (10 GiB declared-size
  cap) on top of the existing path-traversal checks. The guard also covers
  the in-memory indoor-scenarios archive.
- The Databricks ingest escapes backslashes as well as quotes when setting
  column comments via Spark SQL.
- Release pipeline hardening: mcp-publisher is pinned by version and SHA-256,
  PyPI uploads explicitly enable PEP 740 attestations, build tooling is pinned,
  CodeQL runs the security-extended suite, and all workflow checkouts use
  persist-credentials: false.
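
The ZIP guard described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the helper name `safe_extract` and its `limit` parameter are hypothetical, and only the 10 GiB declared-size cap and the presence of a path-traversal check come from the notes.

```python
import zipfile
from pathlib import Path

MAX_DECLARED_BYTES = 10 * 1024**3  # 10 GiB, the cap named in the notes


def safe_extract(
    zf: zipfile.ZipFile, dest: Path, limit: int = MAX_DECLARED_BYTES
) -> list[Path]:
    """Check declared sizes and member paths, then extract (Python 3.9+)."""
    dest = dest.resolve()
    infos = zf.infolist()

    # Decompression-bomb guard: sum the uncompressed sizes declared in the
    # archive's central directory before writing anything to disk.
    declared = sum(info.file_size for info in infos)
    if declared > limit:
        raise ValueError(f"archive declares {declared} bytes, limit is {limit}")

    # Path-traversal guard: every member must resolve inside dest.
    targets = []
    for info in infos:
        target = (dest / info.filename).resolve()
        if not target.is_relative_to(dest):
            raise ValueError(f"unsafe member path: {info.filename!r}")
        targets.append(target)

    zf.extractall(dest)
    return targets
```

Because the helper takes an open `ZipFile`, the same check applies to an in-memory archive opened with `zipfile.ZipFile(io.BytesIO(data))`. Declared sizes are supplied by the archive itself, so a stricter variant might also count the bytes actually written.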

Full changelog: v0.3.1...v0.3.2