Releases: Vitruves/nail-parquet
nail-parquet v1.8.0
nail 1.8.0
This release teaches nail to display nested and complex column types properly. Previously head/tail/preview printed raw Arrow type names (ListArray, StructArray, PrimitiveArray<Int8>, timestamp) for anything beyond a handful of scalar types; now they render real values, and nested List/Struct/Map columns expand into a readable tree.
Added
-L/--level— expand nested values.head,tail, andpreviewnow expandList/Struct/Mapcolumns instead of printingListArray/StructArray. The depth budget controls how far they expand:-L 0collapses each container to a compact tag ({…2 fields},[…3 items]), the default-L 1shows one level, and higher values go deeper. Cards render an indented tree laid out in the value column (single-element lists drop the-bullet for less noise);--tablekeeps the compact single-line form; and-f jsonemits valid, properly-typed nested JSON. Binary blobs always render as<N bytes>regardless of depth, so image/embedding columns are safe to inspect at any level.
Fixed
- Many column types showed their Arrow type name instead of a value.
head/tail/previewnow render real values forInt8/Int16/UInt8/UInt16/UInt32,Timestamp(including timezone-aware),Time32/Time64,Decimal128,Dictionary,LargeUtf8,Binary/LargeBinary,FixedSizeList, and other types that used to printPrimitiveArray<…>,timestamp,ListArray, etc. Leaf values now go through Arrow's formatter, so the set of correctly-displayed types matches Arrow's. nail head -f jsonproduced invalid JSON for some strings. Strings containing backslashes or control characters (e.g. LaTeX\angle, embedded newlines) are now properly escaped, and nested columns serialize to real JSON objects/arrays.--tablerows could be broken by nested values. Embedded newlines in nested string leaves are now flattened so a cell can never spill past the grid borders.
Changed
--tablerendering optimized for wide tables. Cell content is capped at 30 characters with…truncation, and columns paginate into multiple bordered blocks that each fit the terminal width. A#row-index column repeats on every block so rows stay aligned across pages, acols A–B of N (page X/Y)header annotates each subsequent block, and column colors stay consistent across blocks.nail correlations --matrixredesigned. Output is now a proper bordered matrix with headers derived from the row labels (so names containing dots/underscores like__index_level_0__are no longer mangled), per-column auto-sizing, and color-coded cells (green for positive, red for negative, intensity by magnitude, bold white for the 1.0 diagonal).
Example
$ nail head data.parquet -n 1 -L 2
┌──────────────────────────────────────── Record 1 ────────────────────────────────────────
│
│ data_source : synthetic_python
│ prompt : content : Write a Python function `get_even_numbers(n)` that returns
│ : all even numbers from 1 to `n`.
│ : role : user
│ reward_model : ground_truth : [2, 4, 6, 8, 10]
│ : style : rule
│ extra_info : index : 0
│ : split : train
│ : synthetic : true
│ images : - bytes : <10566 bytes>
│ : path : img_diagram.png
└────────────────────────────────────────────────────────────────────────────────────────────
Upgrading
cargo install nail-parquet
Pre-built binaries (linux-musl x86_64/aarch64, macOS x86_64/aarch64, windows-msvc) with SHA256SUMS.txt are attached to this release.
Full Changelog: v1.7.1...v1.8.0
nail-parquet v1.7.1
[1.7.1] - 2026-05-21
Fixed
- Linux release binaries no longer require glibc ≥ 2.38 (#6).
Releases built onubuntu-latest(24.04, glibc 2.39) failed to start on Ubuntu 22.04, Debian 11, RHEL 8, etc. (version 'GLIBC_2.38' not found).
Linux artifacts are now statically linked against musl — no glibc dependency. - Windows stack overflow in many subcommands
(STATUS_STACK_OVERFLOW,0xC00000FD). The binary now reserves an 8 MiB stack on Windows targets via.cargo/config.toml, matching Linux/macOS. - Clippy lints under Rust 1.95:
too_many_arguments,collapsible_match,nonminimal_bool,needless_range_loop. - Source formatting drift (
cargo fmt) and pinned style via committedrustfmt.toml(hard_tabs = true).
Added
nail createmath functions — column expressions now support a curated set of nail-flavored functions:- Scalar (per-row):
abs,sign,floor,ceil,round,trunc,sqrt,cbrt,exp,ln,log,log10,log2,pow,sin,cos,tan,asin,acos,atan,atan2. - Aggregate (broadcast to every row, auto-wrapped as
OVER ()):mean,sum,min,max,count,median,std/stddev,var/variance,stddev_pop,var_pop. - Example:
-c "z=(value-mean(value))/std(value)". - Quote column names that collide with a function (
mean("mean")) or that contain operator characters ("revenue-income"-cost).
- Scalar (per-row):
-cis now repeatable onnail create. Pass multiple-cflags or comma-separate specs within one flag; commas insidepow(a, b)/round(x, 2)are correctly preserved.nail clean— one-shot import cleanup. Snake_case headers, trim string whitespace, drop fully-empty rows by default;--drop-empty-colsopt-in.--keep-headers,--keep-whitespace,--keep-empty-rowsto opt out.- Stdin/stdout via
-on every command. Read withcat data.csv | nail head -, write withnail filter ... -o -. Format auto-detected from--formator by sniffing input bytes (Parquet magic, JSON brace, CSV otherwise). Excel is the only format that still requires a file path. - "Did you mean …?" suggestions on column-not-found errors. Powered by Levenshtein distance with length-scaled tolerance.
- Examples in every
--help. Each subcommand now ends its help with 2-3 concrete invocations. - Multi-platform release workflow (
.github/workflows/release.yml) producing artifacts for:x86_64-unknown-linux-musl(static)aarch64-unknown-linux-musl(static, viacross)x86_64-apple-darwin,aarch64-apple-darwinx86_64-pc-windows-msvc
Each release ships aSHA256SUMS.txt.
Full Changelog: v1.7.0...v1.7.1
nail-parquet v1.7.0
nail-parquet v1.7.0 — 2026-04-22
Major release: richer output, more expressive selectors, shell completions,
and a lighter dependency footprint.
New
--table(global): colored columnar output for any command.--batch-size <N>(global): tune DataFusion's rows-per-record-batch.completions <shell>: generate scripts for bash/zsh/fish/powershell/elvish.
Supports--auto-install(standard user location) and--location <PATH>.filter: OR via|, AND via,(AND binds tighter).select/drop:--type <numeric|integer|float|string|boolean|temporal|binary>,
combinable with--columns(pattern AND type).search: multi-value OR via|in--value.frequency:--head <N>,--tail <N>, and%column in table/file output.preview:--rows <spec>(e.g.1,3,5-10), works in interactive mode too.
Fixed
- Float formatting trims trailing zeros, keeps one decimal (
40.000→40.0). - Better border visibility on dark terminal themes.
Internals
- Streaming Parquet writes (
ArrowWriter+ SNAPPY); no full in-memory materialization. - Predicate pushdown enabled on reads; default batch sizes 32,768 / 8,192 (with
--jobs). reqwest→ureqfor update checks; DataFusion built with minimal features.- Added
rayonandfutures; new GitHub Actions CI; README trimmed.
Upgrade
cargo install nail-parquetBinaries: https://github.com/Vitruves/nail-parquet/releases
No breaking CLI changes. CommonArgs gained batch_size and table fields
(library embedders should update struct initializers).
nail-parquet v1.6.5
Release Notes - Version 1.6.5
New Features & Improvements 🎉
• New sort command: Added comprehensive sorting functionality with support for multiple date types and sorting strategies
• Kendall Tau correlation: Proper implementation replaces the previous approximation derived from Spearman correlation
• Matrix output for correlations: The --matrix command now includes formatted table output for improved readability
• Bug fixes and stability improvements: Various fixes to enhance overall performance and reliability
Looking for Your Input 🤝
As nail-parquet continues to mature with its extensive collection of subcommands and customization options, we're actively seeking fresh ideas from our community. Development has naturally slowed in recent weeks as we've addressed many of the most requested features.
We'd love to hear from you!
Whether you have suggestions for new functionality, improvements to existing features, or ideas we haven't considered, please don't hesitate to:
• Contact us directly
• Open an issue with your improvement suggestions on our repository
Your feedback helps shape the future of nail-parquet and ensures it continues to meet your data processing needs.
Happy data nailing!
v1.6.4
nail-parquet v1.6.4 - First Release
A fast, powerful command-line utility for working with Parquet files and other data formats, built with Rust and Apache DataFusion.
Features
Core Functionality:
- 32+ Commands for comprehensive data manipulation and analysis
- Multi-format Support: Parquet, CSV, JSON, Excel (read/write)
- High Performance: Built on Apache DataFusion and Arrow for columnar processing
- Interactive Mode: Terminal UI with ratatui for data preview and exploration
Key Commands:
head,tail,preview- Data inspectionstats,correlations,frequency- Statistical analysisfilter,select,drop- Data manipulationmerge,append,split- Data combinationconvert,optimize- Format conversion and optimizationoutliers,dedup- Data quality operationssample,shuffle,binning- Data sampling and analysis
Advanced Features:
- Parallel Processing: Configurable with
--jobsflag - Verbose Logging: Debug output with
--verboseflag - Multiple Output Formats: JSON, CSV, Parquet, formatted text
- Smart Column Resolution: Case-insensitive column name matching
- Statistical Operations: Correlation analysis, outlier detection, frequency analysis