Comparison

Comparison vs other CSV tools

A short, honest comparison of qsv against neighboring tools. The deep numbers live in docs/BENCHMARKS.md and the live dashboard at qsv.dathere.com/benchmarks. All performance claims on this page link to those sources rather than embedding numbers that will go stale.

Note

See also the original xsv 0.13.0 stats compared with qsv 1.0.0 stats wiki page for a side-by-side example of the stats-output expansion.

qsv vs xsv

xsv is the original Rust CSV tool from BurntSushi. qsv is a maintained, multithreaded fork that adds many commands and features. xsv has been on minimal-maintenance status since ~2019.

Feature	xsv 0.13	qsv 20+
Commands	~13	70+
Multithreaded	No	Many commands (🚀 / 🏎️)
Polars-powered SQL	No	`sqlp`, `joinp`, `pivotp`, `scoresql`
JSON Schema validation	No	`validate` (with custom keywords)
Geocoding	No	`geocode`
HTTP fetching	No	`fetch`, `fetchpost` with caching
AI / LLM integration	No	`describegpt`, MCP server, Cowork plugin
Embedded scripting DSL	No	Luau and Python
External-* commands	No	`extsort`, `extdedup`
Apache DataSketches	No	`--cardinality-method approx`, `--quantile-method approx`, `frequency --sketch-method frequent_items`
`stats` output columns	12 (default), 14 (`--everything`)	37 (default), 47 (`--everything`) — see legacy wiki page
Ongoing Development	Archived	Active

Migration path: install qsvlite, the xsv-compatible subset with the same flags and command set, or install full qsv for everything.

qsv vs csvkit

csvkit is a Python CSV toolkit (csvstack, csvgrep, csvjoin, csvstat, …) with a long history.

Speed: qsv outperforms csvkit by ~10× on typical workloads (compiled Rust + multithreading vs Python). See docs/BENCHMARKS.md for the methodology.

Surface area:

csvkit has tighter integration with the Python ecosystem (pip-installable, extensible in Python).
qsv has more commands (geocoding, fetch, validate with custom keywords, describegpt, Polars SQL, …).
csvkit is one project; qsv is the engine plus an ecosystem (qsv pro, MCP server, Cowork plugin, qsv-recipes, qsv-lookup-tables, DataPusher+).

When to pick which:

Inside a Python project where you already use pandas — csvkit might fit better.
For shell pipelines, CI gates, or large files — qsv wins decisively.
The two can coexist; many users use csvkit's csvstat then pipe results into a downstream qsv command.

qsv vs Miller (`mlr`)

Miller is written in Go (since version 6; earlier releases were C), fast, and shape-agnostic: it handles CSV, TSV, JSON, JSONL, DKVP, PPRINT, and NIDX. qsv is CSV-specialized with deeper stats and validation.

Speed: comparable for streaming row ops. qsv pulls ahead on aggregations, joins, and stats due to multithreading.

Where Miller shines:

DKVP and nested-JSON inputs.
Compact DSL that does row transformations and filtering in one expression.
Long-standing maturity.

Where qsv shines:

48-metric stats with guaranteed type inference.
JSON Schema validation at 780k rows/sec.
Polars-powered SQL and asof joins.
Integrated AI workflows.
MCP server / Cowork plugin / qsv pro ecosystem.

Both are excellent. Many shell power-users keep both installed and reach for whichever is faster for the task at hand.

qsv vs DuckDB CSV reader

DuckDB is an embedded analytical SQL database. Its read_csv table function is highly optimized.

Different jobs:

DuckDB excels at multi-CSV SQL analytics with a full query optimizer and OLAP execution engine.
qsv excels at the pre-DB cleaning, profiling, validation, and enrichment layer.

The recommended pattern: use qsv for cleaning + Parquet conversion, then DuckDB for analytics:

qsv stats --stats-jsonl raw.csv
qsv schema --polars raw.csv
qsv to parquet outdir/ raw.csv

duckdb -c "SELECT * FROM read_parquet('outdir/raw.parquet') WHERE ..."
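As a rough sketch, the same hand-off can be scripted from Python so that a failing step stops the run before DuckDB is invoked. The helper names `build_pipeline` and `run_pipeline`, the `count(*)` query, and the assumption that `qsv to parquet` names its output after the input file's stem (as `outdir/raw.parquet` above suggests) are illustrative and not part of qsv.

```python
import subprocess
from pathlib import Path


def build_pipeline(csv_path: str, outdir: str) -> list[list[str]]:
    """Return the qsv and DuckDB commands from the pattern above, in order."""
    # Assumption: the Parquet file is named after the CSV's stem.
    parquet = Path(outdir) / f"{Path(csv_path).stem}.parquet"
    # Hypothetical query; substitute the real analytics SQL.
    query = f"SELECT count(*) FROM read_parquet('{parquet.as_posix()}')"
    return [
        ["qsv", "stats", "--stats-jsonl", csv_path],
        ["qsv", "schema", "--polars", csv_path],
        ["qsv", "to", "parquet", outdir, csv_path],
        ["duckdb", "-c", query],
    ]


def run_pipeline(csv_path: str, outdir: str) -> None:
    """Run each step; check=True raises CalledProcessError on the first failure."""
    for cmd in build_pipeline(csv_path, outdir):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline('raw.csv', 'outdir/')` requires both `qsv` and `duckdb` on the PATH; `build_pipeline` alone can be used to inspect the commands without running anything.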

qsv also integrates with DuckDB directly:

qsv sqlp runs Polars SQL, the closest spiritual cousin to duckdb -c "..." for CSVs.
qsv scoresql --duckdb uses DuckDB's planner to score a query before running it.
qsv describegpt with QSV_DUCKDB_PATH uses DuckDB for SQL-RAG.

See Integrations → DuckDB.

qsv vs pandas

pandas is the Python data-analysis workhorse. qsv complements pandas; it doesn't replace it.

pandas is in-memory, Python-native, and excels at ad-hoc analysis with charts and ML.
qsv is streaming (mostly), shell-native, and excels at fast profiling, validation, transformations on multi-GB files, and pre-DB cleaning.

For a 2.7M-row CSV, qsv stats runs in well under a second; pd.read_csv(...).describe() typically takes 10+ seconds. For inline charts or train_test_split, pandas / scikit-learn are obviously the right tools.

Use qsv from notebooks via subprocess:

import subprocess
import pandas as pd

subprocess.run(['qsv', 'stats', '--everything', '--stats-jsonl', 'data.csv'], check=True)
# Load the stats file that qsv wrote alongside data.csv
stats_df = pd.read_csv('data.stats.csv')
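An alternative sketch, assuming only that `qsv stats` prints its results as CSV on stdout, captures that output directly instead of reading a file from disk. The helper name `qsv_stats_rows` and its `qsv_bin` parameter are hypothetical.

```python
import csv
import io
import subprocess


def qsv_stats_rows(csv_path: str, qsv_bin: str = "qsv") -> list[dict[str, str]]:
    """Run `qsv stats --everything` and return one dict per profiled column."""
    result = subprocess.run(
        [qsv_bin, "stats", "--everything", csv_path],
        check=True,            # raise if qsv exits with a non-zero status
        capture_output=True,   # collect stdout instead of printing it
        text=True,             # decode stdout as str
    )
    # Each output row describes one column of the input CSV.
    return list(csv.DictReader(io.StringIO(result.stdout)))
```

In a notebook, `pd.DataFrame(qsv_stats_rows('data.csv'))` turns the rows into a DataFrame. The values arrive as strings, so numeric columns need converting before use.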

See Integrations → Python notebooks.

qsv vs Visidata

Visidata is a terminal UI for tabular data exploration — closer to qsv lens than to the rest of qsv. Both are excellent in their niche.

Visidata is interactive (TUI). Sift, filter, pivot, sort visually.
qsv lens is interactive too, via csvlens.
The rest of qsv is non-interactive — fast batch operations from the shell.

Use Visidata for exploratory analysis; use qsv for the rest of the pipeline.

qsv vs awk / sed for CSVs

awk and sed are general-purpose text tools. They don't understand CSV quoting — embedded commas, multi-line quoted fields, and escaped quotes will trip them up.
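These failure modes are easy to reproduce. The sketch below uses Python's standard `csv` module as a stand-in for any CSV-aware parser (it says nothing about how qsv is implemented) and compares it with the split-on-comma approach that `awk -F,` effectively performs. The sample record is made up for illustration.

```python
import csv
import io

# One header row plus one record whose fields contain an embedded comma,
# an escaped (doubled) quote, and a newline inside a quoted field.
raw = 'name,note\n"Smith, Jane","said ""hi""\nthen left"\n'

# Naive approach: split into lines, then split each line on commas.
naive_rows = [line.split(",") for line in raw.splitlines()]

# CSV-aware approach: the parser honors the quoting rules.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows), naive_rows)    # 3 "rows", with fields broken apart
print(len(parsed_rows), parsed_rows)  # 2 rows: the header and one intact record
```

The naive split reports three rows and cuts `Smith, Jane` in two, while the CSV-aware parser returns the header plus a single record with both fields intact.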

Use qsv for any CSV operation. Use awk / sed for plain-text logs and configuration files.

Choosing — the cheat sheet

You want to…	Reach for
Profile / validate / clean a CSV	qsv
Multi-file SQL analytics	DuckDB + qsv-prepared Parquet
Notebook-driven exploratory ML	pandas + qsv subprocess for heavy lifting
Shape-agnostic stream processing (JSON, DKVP, …)	Miller
Interactive TUI exploration	qsv lens or Visidata
Drag-and-drop GUI exploration	qsv pro
AI-assisted analysis	qsv MCP Server + Claude Cowork Plugin
xsv-compatible drop-in	qsvlite
