History / Recipe Diff and Audit

Revisions

  • wiki: Phase C complete - final 5 cookbook recipes

    Recipe-Stats-to-Insights: stats -> moarstats -> pragmastat -> describegpt flow on NYC 311 + Allegheny. Heavy-tailed-aware Pragmastat for sale prices, SQL-RAG chat sub-mode for natural-language queries, multilingual outputs, controlled tag vocabularies.

    Recipe-Diff-and-Audit: 6-step pipeline for weekly regulatory CSVs. BLAKE3 fingerprint gate -> extdedup uniqueness check -> sortcheck/extsort -> diff (<600ms / 1M rows) -> split delta by Add/Remove/Modify -> short-hash lineage. Variations: composite keys, schema validation, larger-than-RAM prerequisites, email summary.

    Recipe-Build-a-Data-Pipeline: end-to-end on Allegheny property sales - clean -> profile -> validate -> analyze -> report -> publish -> gate. Stage-by-stage walkthrough with Polars schema, sqlp aggregations, pivotp, template-generated Markdown reports, Parquet/Data Package outputs, GitHub Actions CI integration, Make-based orchestration.

    Recipe-Fetch-and-Cache: NOAA GHCN-Daily weather + GitHub stargazer harvesting with --url-template, --disk-cache, --redis-cache, jaq filters, rate-limit handling. fetchpost for OCR/ML endpoints with MiniJinja templated JSON bodies. Cache management matrix.

    Recipe-Larger-than-RAM: 27M-row / 16 GB NYC 311 end-to-end without OOM. Index, approx stats with DataSketches, extsort/extdedup, Polars lazy sqlp/joinp, schema --polars, multithreaded snappy, split for parallel chunking, env var tuning matrix (QSV_AUTOINDEX_SIZE, QSV_TMPDIR, QSV_STATS_CHUNK_MEMORY_MB, QSV_MEMORY_CHECK).

    All 13 Phase C cookbook pages now live.

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

    jqnatividad committed May 13, 2026
  • wiki: add stubs for Phase B/C/D/E pages so sidebar links resolve

    Adds 39 placeholder pages so every sidebar entry resolves to real content rather than a 404. Each stub declares its tier, the phase it will be filled in, and a one-paragraph preview of what's coming. They link back to Home / Getting-Started / Command-Reference / Cookbook for navigation.

    Pages added:
    - Phase B (Command Reference, 13): Command-Reference, Selection-and-Inspection, Transform-and-Reshape, Aggregation-and-Statistics, Joins-and-Set-Ops, SQL-and-Polars, Validation-and-Schema, Conversion-and-IO, Geospatial, HTTP-and-Web, Scripting-Luau-Python, Indexing-Compression-Diff, AI-and-Documentation
    - Phase C (Cookbook recipes, 12): Recipe-Inspect-Unknown-CSV, Recipe-Clean-and-Normalize, Recipe-Geographic-Enrichment, Recipe-Date-Enrichment, Recipe-CKAN-Integration, Recipe-JSON-Schema-Validate, Recipe-Build-a-Data-Pipeline, Recipe-Stats-to-Insights, Recipe-Fetch-and-Cache, Recipe-Larger-than-RAM, Recipe-Diff-and-Audit, Recipe-Multi-Table-Joins
    - Phase D (Tuning + ecosystem, 8): Performance-Tuning, Environment-Variables, Stats-Cache-and-Caching, Lookup-Tables, Claude-Cowork-Plugin, MCP-Server, qsv-pro-Spotlight, Integrations
    - Phase E (Polish, 6): Troubleshooting, FAQ, Comparison, Glossary, External-Resources, Contributing-to-the-Wiki

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

    jqnatividad committed May 13, 2026