History / Joins and Set Ops

Revisions

  • wiki: adopt GitHub Alerts for callouts across the wiki

    Convert advisory blockquotes and inline callouts to semantic GitHub Alerts (NOTE/TIP/IMPORTANT/CAUTION):

    - [!NOTE] for the standard category-page "workflow layer" ledes and other top-of-page orientation / "canonical reference" notes
    - [!IMPORTANT] for behavior-affecting gotchas (auto approx-stats on OOM, group-by unsupported aggs, MiniJinja filter-errors-as-values, synthesize cross-column correlation, profile always-warn RFC4180)
    - [!TIP] for the Binary-Variants TL;DR and Why-qsv "why it matters"
    - [!CAUTION] for foreach shell-injection risk and joinp --cross blowup

    Also update the Contributing-to-the-Wiki category template to use the [!NOTE] lede so new pages follow the convention.

    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

    @jqnatividad jqnatividad committed May 30, 2026
  • wiki: expand joinp section with detailed examples

    Restructure Joins-and-Set-Ops joinp coverage into workflow subsections drawn from src/cmd/joinp.rs USAGE:

    - full join-type matrix (inner/left/right/full + anti/semi/cross)
    - non-equi joins with _left/_right suffixes
    - asof joins (strategy/tolerance/asof_by/--allow-exact-matches/--no-sort)
    - pre-join --filter-left/right + post-join --sql-filter
    - --validate, --maintain-order, --coalesce
    - key transforms (--ignore-case/--ignore-leading-zeros/--norm-unicode)
    - --cache-schema modes/output

    Corrects the prior asof example to the actual flags (--asof, --strategy, --left_by/--right_by) and notes joinp output is CSV-only and stdin is unsupported.

    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

    @jqnatividad jqnatividad committed May 30, 2026
  • wiki: flesh out Aggregation-and-Statistics + Joins-and-Set-Ops

    Aggregation-and-Statistics covers stats, moarstats, frequency, pragmastat, dedup, extdedup, extsort. Examples emphasize: stats cache for downstream speed (NYC 311), Apache DataSketches approx mode for huge cardinalities, Pragmastat robust statistics for skewed data (Allegheny property sales), extsort/extdedup for files > RAM.

    Joins-and-Set-Ops covers join, joinp, exclude, partition, split. Examples: wcp + country_continent lookup, NYC 311 + NOAA weather asof join, salary band non-equi join, partition NYC 311 by Borough, chunk 27M-row exports for parallel processing.

    Both pages: quick decision table, per-command sections with real-world anchored examples, deep-links to /docs/help/, "See also" cross-links.

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

    @jqnatividad jqnatividad committed May 13, 2026
  • wiki: add stubs for Phase B/C/D/E pages so sidebar links resolve

    Adds 39 placeholder pages so every sidebar entry resolves to real content rather than a 404. Each stub declares its tier, the phase it will be filled in, and a one-paragraph preview of what's coming. They link back to Home / Getting-Started / Command-Reference / Cookbook for navigation.

    Pages added:

    - Phase B (Command Reference, 13): Command-Reference, Selection-and-Inspection, Transform-and-Reshape, Aggregation-and-Statistics, Joins-and-Set-Ops, SQL-and-Polars, Validation-and-Schema, Conversion-and-IO, Geospatial, HTTP-and-Web, Scripting-Luau-Python, Indexing-Compression-Diff, AI-and-Documentation
    - Phase C (Cookbook recipes, 12): Recipe-Inspect-Unknown-CSV, Recipe-Clean-and-Normalize, Recipe-Geographic-Enrichment, Recipe-Date-Enrichment, Recipe-CKAN-Integration, Recipe-JSON-Schema-Validate, Recipe-Build-a-Data-Pipeline, Recipe-Stats-to-Insights, Recipe-Fetch-and-Cache, Recipe-Larger-than-RAM, Recipe-Diff-and-Audit, Recipe-Multi-Table-Joins
    - Phase D (Tuning + ecosystem, 8): Performance-Tuning, Environment-Variables, Stats-Cache-and-Caching, Lookup-Tables, Claude-Cowork-Plugin, MCP-Server, qsv-pro-Spotlight, Integrations
    - Phase E (Polish, 6): Troubleshooting, FAQ, Comparison, Glossary, External-Resources, Contributing-to-the-Wiki

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

    @jqnatividad jqnatividad committed May 13, 2026