Supported File Formats: Parquet | JSON / JSON Lines (ndjson) | (More planned!)
parqv is a Python-based interactive TUI (Text User Interface) tool designed to explore, analyze, and understand various data file formats directly within your terminal. Initially supporting Parquet and JSON, parqv aims to provide a unified, visual experience for quick data inspection without leaving your console.
- Unified Interface: Launch parqv <your_data_file> to access metadata, schema, data preview, and column statistics all within a single, navigable terminal window. No more juggling different commands for different file types.
- Interactive Exploration:
- Keyboard & Mouse Driven: Navigate using familiar keys (arrows, hjkl, Tab) or even your mouse (thanks to Textual).
- Scrollable Views: Easily scroll through large schemas, data tables, or column lists.
- Clear Schema View: Understand column names, data types, and nullability at a glance. (Visualization of complex nested structures may vary by format.)
- Dynamic Stats: Select a column and instantly see its detailed statistics (counts, nulls, min/max, mean, distinct values, etc.).
- Cross-Format Consistency:
- Rich Display: Leverages rich and Textual for colorful, readable tables and text across supported formats.
- Quick Stats: Get key statistical insights consistently, regardless of the underlying file type.
- Extensible: Designed with a handler interface to easily add support for more file formats in the future (like CSV, Arrow IPC, etc.).
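The handler interface mentioned above is not documented in this README, so the following is only a minimal sketch of what such a design could look like. The names FormatHandler, JsonLinesHandler, HANDLERS, and handler_for are hypothetical and are not taken from parqv's source; the sketch uses only the standard library.

```python
import json
from abc import ABC, abstractmethod
from itertools import islice
from pathlib import Path


class FormatHandler(ABC):
    """Hypothetical base class: one subclass per supported file format."""

    # File extensions (lowercase, with leading dot) this handler accepts.
    extensions: tuple[str, ...] = ()

    def __init__(self, path: Path) -> None:
        self.path = path

    @abstractmethod
    def column_names(self) -> list[str]:
        """Return column names in first-seen order."""

    @abstractmethod
    def preview(self, limit: int = 50) -> list[dict]:
        """Return at most `limit` records for the data preview."""


class JsonLinesHandler(FormatHandler):
    """Illustrative handler for JSON Lines files (one JSON object per line)."""

    extensions = (".ndjson",)

    def _records(self):
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():  # skip blank lines
                    yield json.loads(line)

    def column_names(self) -> list[str]:
        names: dict[str, None] = {}  # dict preserves insertion order
        for record in self._records():
            for key in record:
                names.setdefault(key, None)
        return list(names)

    def preview(self, limit: int = 50) -> list[dict]:
        return list(islice(self._records(), limit))


# Hypothetical registry: supporting a new format means adding one class here.
HANDLERS: list[type[FormatHandler]] = [JsonLinesHandler]


def handler_for(path) -> FormatHandler:
    """Pick a handler by file extension, or raise ValueError if none matches."""
    path = Path(path)
    suffix = path.suffix.lower()
    for handler_cls in HANDLERS:
        if suffix in handler_cls.extensions:
            return handler_cls(path)
    raise ValueError(f"Unsupported file format: {suffix!r}")
```

With a layout like this, the UI code would only call handler_for(path) and the abstract methods, so a CSV or Arrow IPC handler could be added without touching the views.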
- Multi-Format Support: Currently supports Parquet (.parquet) and JSON/JSON Lines (.json, .ndjson). Run parqv <your_file.{parquet,json,ndjson}>.
- Metadata Panel: Displays key file information (path, format, size, total rows, column count, etc.). Fields may vary slightly depending on the file format.
- Schema Explorer:
- Interactive list view of columns.
- Clearly shows column names, data types, and nullability.
- Data Table Viewer:
- Scrollable table preview of the file's data.
- Attempts to preserve data types for better representation.
- Column Statistics Viewer:
- Select a column in the Schema tab to view detailed statistics.
- Shows counts (total, valid, null), percentages, and type-specific stats (min/max, mean, stddev, distinct counts, length stats, boolean value counts where applicable).
- Row Group Inspector (Parquet Specific):
- This panel only appears when viewing Parquet files.
- Lists row groups with stats (row count, compressed/uncompressed size).
- (Planned) Select a row group for more details.
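To make the Column Statistics Viewer description concrete, here is a rough standard-library sketch of how such figures could be computed for one column of JSON Lines records. It is not parqv's implementation (which, per the install notes, relies on libraries such as pandas and duckdb); the function name column_stats and the sample records are invented for illustration, and a missing key is treated as a null.

```python
import json
import statistics


def column_stats(records, column):
    """Compute basic statistics for one column of a list of dict records."""
    values = [record.get(column) for record in records]  # missing key -> None
    valid = [v for v in values if v is not None]
    total = len(values)
    nulls = total - len(valid)
    stats = {
        "total": total,
        "valid": len(valid),
        "null": nulls,
        "null_pct": round(100 * nulls / total, 2) if total else 0.0,
        # Serialize values so unhashable ones (lists, dicts) can be counted too.
        "distinct": len({json.dumps(v, sort_keys=True) for v in valid}),
    }
    # Numeric-only stats; bool is excluded even though it subclasses int.
    numeric = [
        v for v in valid
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(valid):
        stats["min"] = min(numeric)
        stats["max"] = max(numeric)
        stats["mean"] = statistics.fmean(numeric)
        if len(numeric) > 1:
            stats["stddev"] = statistics.stdev(numeric)  # sample std deviation
    return stats


# Made-up sample data in JSON Lines form.
lines = [
    '{"name": "a", "age": 30}',
    '{"name": "b", "age": null}',
    '{"name": "c", "age": 25}',
    '{"name": "a", "age": 30}',
]
records = [json.loads(line) for line in lines]
age_stats = column_stats(records, "age")
name_stats = column_stats(records, "name")
```

For the age column this yields 4 total values, 3 valid, and 1 null (25%), along with min, max, mean, and sample standard deviation; the name column gets counts and a distinct count only, since it is not numeric.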
1. Prerequisites:
- Python: Version 3.10 or higher.
- pip: The Python package installer.
2. Install parqv:
- Open your terminal and run:
    pip install parqv
  (This will also install dependencies like textual, pyarrow, pandas, and duckdb.)
- Updating parqv: pip install --upgrade parqv
3. Run parqv:
- Point parqv to your data file:
    # parquet
    parqv /path/to/your/data.parquet
    # json
    parqv /path/to/your/data.json
- The interactive TUI will launch. Use your keyboard (and mouse, if supported by your terminal) to navigate:
- Arrow Keys / j, k (in lists): Move selection up/down.
- Tab / Shift+Tab: Cycle focus between the main tab content and potentially other areas. (Focus handling might evolve.)
- Enter (in column list): Select a column to view statistics.
- View Switching: Use Ctrl+N (Next Tab) and Ctrl+P (Previous Tab) or click on the tabs (Metadata, Schema, Data Preview).
- Scrolling: Use PageUp / PageDown / Home / End or arrow keys/mouse wheel within scrollable areas (like Schema stats or Data Preview).
- q / Ctrl+C: Quit parqv.
- (Help Screen ? is planned)
Licensed under the Apache License, Version 2.0. See LICENSE for the full license text.