nail-parquet v1.8.0

Vitruves released this 30 May 13:04

nail 1.8.0

This release teaches nail to display nested and complex column types properly. Previously head/tail/preview printed raw Arrow type names (ListArray, StructArray, PrimitiveArray<Int8>, timestamp) for anything beyond a handful of scalar types; now they render real values, and nested List/Struct/Map columns expand into a readable tree.

Added

  • -L/--level: expand nested values. head, tail, and preview now expand List/Struct/Map columns instead of printing ListArray/StructArray. The level is a depth budget that controls how far containers expand: -L 0 collapses each container to a compact tag ({…2 fields}, […3 items]), the default -L 1 shows one level, and higher values go deeper. Cards render an indented tree laid out in the value column (single-element lists drop the "-" bullet for less noise); --table keeps the compact single-line form; and -f json emits valid, properly typed nested JSON. Binary blobs always render as <N bytes> regardless of depth, so image/embedding columns are safe to inspect at any level.
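
To make the depth budget concrete, here is a minimal Python sketch of the idea. It is not nail's implementation (nail is a Rust tool working on Arrow arrays); the function name `render`, the use of plain dicts and lists as stand-ins for Struct and List values, and the single-line output format are all assumptions made for illustration. Only the collapsed tags and the <N bytes> form are taken from the notes above.

```python
def render(value, level):
    """Render a nested value, expanding containers up to `level` levels deep.

    Hypothetical helper: dicts stand in for Struct/Map values, lists for List
    values, and bytes for Binary values.
    """
    # Binary blobs are summarized regardless of the remaining depth budget.
    if isinstance(value, (bytes, bytearray)):
        return f"<{len(value)} bytes>"
    if isinstance(value, dict):
        if level <= 0:
            # Budget exhausted: collapse the container to a compact tag.
            return f"{{…{len(value)} fields}}"
        inner = ", ".join(f"{k}: {render(v, level - 1)}" for k, v in value.items())
        return "{" + inner + "}"
    if isinstance(value, (list, tuple)):
        if level <= 0:
            return f"[…{len(value)} items]"
        return "[" + ", ".join(render(v, level - 1) for v in value) + "]"
    # Scalar leaf: fall back to the default string form.
    return str(value)


record = {"index": 0, "tags": ["a", "b", "c"], "blob": b"\x00\x01"}
print(render(record, 0))
print(render(record, 1))
print(render(record, 2))
```

Each recursive call spends one unit of the budget, so a container nested deeper than the budget is replaced by its tag rather than expanded, while the bytes check runs before any depth test.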

Fixed

  • Many column types showed their Arrow type name instead of a value. head/tail/preview now render real values for Int8/Int16/UInt8/UInt16/UInt32, Timestamp (including timezone-aware), Time32/Time64, Decimal128, Dictionary, LargeUtf8, Binary/LargeBinary, FixedSizeList, and other types that used to print PrimitiveArray<…>, timestamp, ListArray, etc. Leaf values now go through Arrow's formatter, so the set of correctly-displayed types matches Arrow's.
  • nail head -f json produced invalid JSON for some strings. Strings containing backslashes or control characters (e.g. LaTeX \angle, embedded newlines) are now properly escaped, and nested columns serialize to real JSON objects/arrays.
  • --table rows could be broken by nested values. Embedded newlines in nested string leaves are now flattened so a cell can never spill past the grid borders.
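
The JSON fix above comes down to escaping strings rather than writing them out raw. The following Python sketch shows the general technique with the standard library's json module; it illustrates the escaping rules only and is not how nail (a Rust tool) produces its output. The sample record and the helper name `to_json_line` are made up for the example.

```python
import json


def to_json_line(record):
    """Serialize one record as a single line of valid JSON.

    json.dumps escapes backslashes, double quotes, and control characters,
    and writes nested dicts/lists as real JSON objects/arrays.
    """
    return json.dumps(record, ensure_ascii=False)


# Hypothetical row: a LaTeX-style backslash, an embedded newline, a nested column.
row = {
    "formula": "\\angle ABC",
    "note": "line one\nline two",
    "meta": {"tags": ["a", "b"]},
}
line = to_json_line(row)
print(line)

# Round trip: parsing the output gives back the original values.
assert json.loads(line) == row
```

Writing the same strings between quotes without escaping would leave a bare backslash and a literal line break inside a JSON string, both of which a strict parser rejects.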

Changed

  • --table rendering optimized for wide tables. Cell content is capped at 30 characters with truncation, and columns paginate into multiple bordered blocks that each fit the terminal width. A "#" row-index column repeats on every block so rows stay aligned across pages, a "cols A–B of N (page X/Y)" header annotates each subsequent block, and column colors stay consistent across blocks.
  • nail correlations --matrix redesigned. Output is now a proper bordered matrix with headers derived from the row labels (so names containing dots/underscores like __index_level_0__ are no longer mangled), per-column auto-sizing, and color-coded cells (green for positive, red for negative, intensity by magnitude, bold white for the 1.0 diagonal).
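
The --table behavior described above (flattened, capped cells and columns paginated into blocks) can be sketched as follows. This is an illustrative Python approximation, not nail's code: the helper names, the 3-character per-column border allowance, the 4-character index column, and the choice to count the ellipsis inside the 30-character cap are all assumptions.

```python
def flatten_cell(text, max_chars=30):
    """Collapse whitespace runs (including newlines) and cap the cell length."""
    flat = " ".join(str(text).split())
    if len(flat) <= max_chars:
        return flat
    # Assumed convention: the ellipsis counts toward the cap.
    return flat[: max_chars - 1] + "…"


def paginate_columns(widths, terminal_width, index_width=4):
    """Group column indices into blocks whose total width fits the terminal.

    Every block reserves `index_width` for the repeated row-index column and
    3 characters per column for borders and padding (an assumed layout).
    """
    blocks, current, used = [], [], index_width
    for i, width in enumerate(widths):
        cost = width + 3
        # Start a new block when this column would overflow the current one.
        if current and used + cost > terminal_width:
            blocks.append(current)
            current, used = [], index_width
        current.append(i)
        used += cost
    if current:
        blocks.append(current)
    return blocks


print(flatten_cell("first line\nsecond line"))
print(paginate_columns([10, 10, 10, 10], terminal_width=30))
```

Because the index column is budgeted into every block, each page can repeat it and rows stay aligned across pages; a column wider than the terminal still gets a block of its own rather than being dropped.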

Example

$ nail head data.parquet -n 1 -L 2
┌──────────────────────────────────────── Record 1 ────────────────────────────────────────
│
│ data_source          : synthetic_python
│ prompt               : content : Write a Python function `get_even_numbers(n)` that returns
│                      :           all even numbers from 1 to `n`.
│                      : role    : user
│ reward_model         : ground_truth : [2, 4, 6, 8, 10]
│                      : style        : rule
│ extra_info           : index     : 0
│                      : split     : train
│                      : synthetic : true
│ images               : - bytes : <10566 bytes>
│                      :   path  : img_diagram.png
└────────────────────────────────────────────────────────────────────────────────────────────

Upgrading

cargo install nail-parquet

Pre-built binaries (linux-musl x86_64/aarch64, macOS x86_64/aarch64, windows-msvc) with SHA256SUMS.txt are attached to this release.

Full Changelog: v1.7.1...v1.8.0