Releases: spiraldb/raincloud
v0.1.0: initial public release
Initial public release.
Raincloud is a client-reproducible pipeline for building a curated catalog
of public datasets as analytics-ready Parquet + Vortex files. See
README.md for the user-facing overview,
AGENTS.md for the architecture, and
SKILLS.md for procedural playbooks.
This release bundles:
- The 7-stage build pipeline (fetch → extract → parse → transform → write
  → validate → convert) plus the optional opt-in hydrate stage.
- 249 dataset specs across 5 families (direct, kaggle-upstream, nyc-tlc,
  public-bi, uci).
- 24 named transform handlers covering CSV / Parquet / JSONL / XML / PBF /
  custom-format upstreams plus streaming variants for memory-constrained
  shapes.
- A read-only Textual TUI for browsing the catalog
  (python -m scripts.pipeline.browse, requires --extra tui).
- Per-dataset Vortex conversion via the convert.vortex flag.
- Apache License 2.0, with SPDX file headers on all Python sources.
- Governance: SECURITY.md, CONTRIBUTING.md, CODE_OF_CONDUCT.md
  (Contributor Covenant 2.1), DISCLAIMER.md (AS IS posture, content
  and license disclaimers, dataset-removal reporting), and
  HYDRATING.md (policy for the optional hydrate stage).
- Tooling: ruff lint (rules E, F, W, I) + GitHub Actions CI
  (.github/workflows/ci.yml) running lint, manifest validation, and
  pytest on every push and PR to develop.
- Dataset-removal issue template
  (.github/ISSUE_TEMPLATE/dataset-removal.yml): structured form for
  the channel DISCLAIMER.md points readers at.
- Pull-request template (.github/pull_request_template.md) prompting
  for summary, test-plan checkbox list against the standard pre-PR gate,
  and change-type tags.
- CITATION.cff: GitHub-native citation metadata; surfaces the "Cite
  this repository" button in the repo sidebar with BibTeX / APA / Chicago
  exports.
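
For readers new to the project, the stage order listed above can be pictured as a simple chain in which each stage hands its result to the next. The sketch below is illustrative only and is not Raincloud's code: `STAGES`, `plan_stages`, and `run_pipeline` are hypothetical names, and because these notes do not say where the optional hydrate stage runs, appending it last is purely a placeholder assumption. See AGENTS.md for the real architecture.

```python
from typing import Any, Callable, Dict, List

# Stage names in the order given in the release notes.
STAGES: List[str] = [
    "fetch", "extract", "parse", "transform", "write", "validate", "convert",
]


def plan_stages(hydrate: bool = False) -> List[str]:
    """Return the ordered stage names for one run.

    Assumption: the hydrate stage's real position is not documented here;
    placing it last is a placeholder, not the project's behavior.
    """
    plan = list(STAGES)
    if hydrate:  # opt-in only, mirroring the "optional opt-in" wording
        plan.append("hydrate")
    return plan


def run_pipeline(
    handlers: Dict[str, Callable[[Any], Any]],
    state: Any,
    hydrate: bool = False,
) -> Any:
    """Run each planned stage in order, threading `state` through them."""
    for name in plan_stages(hydrate):
        state = handlers[name](state)
    return state


if __name__ == "__main__":
    # Toy handlers that only record the order in which they were called.
    toy = {name: (lambda s, n=name: s + [n]) for name in STAGES}
    print(run_pipeline(toy, []))
```

Keeping the stage order in a plain list makes the opt-in stage an explicit, visible branch rather than a hidden default.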