[cli] paimon cli #7975

Open
hqlalala wants to merge 1 commit into apache:master from hqlalala:feat/paimon-cli

@hqlalala

Purpose

  • Add a standalone Java CLI (paimon-cli) for Apache Paimon, providing a native command-line tool that works without Flink/Spark
  • Full SQL support via Apache Calcite planner (JOIN, GROUP BY, window functions, subqueries, UNION, etc.)
  • Supports catalog operations: create/drop/alter database and table, schema evolution, snapshot/tag management, branch management, orphan file cleanup
  • Includes data read/write, EXPLAIN plan, and an interactive SQL REPL
  • Installs as a single paimon command with fat jar packaging

Comparison with Python CLI

| Feature | Java CLI | Python CLI |
|---|---|---|
| SQL Engine | Apache Calcite (full planner) | DataFusion (Rust) |
| SQL Capabilities | JOIN, window functions, subqueries, UNION | Same |
| Filter/Projection Pushdown | Yes (via ProjectableFilterableTable) | Yes (native) |
| Deployment | Single fat jar, JVM only | Requires Python + pypaimon wheel |
| Catalog Operations | Full (create/alter/drop DB & table) | Full |
| Schema Evolution | Yes (add/drop/rename column, set options) | Limited |
| Snapshot Management | Yes (view, tag, expire, rollback) | No |
| Branch Management | Yes (create, list, rename, delete) | No |
| Orphan File Cleanup | Yes | No |
| Interactive REPL | Yes | Yes |
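
For readers unfamiliar with what "Filter/Projection Pushdown" means in the table above, here is a minimal JDK-only sketch of the idea. It is not code from this PR and does not use Calcite: the class `PushdownSketch` and its `scan` method are hypothetical stand-ins. In Calcite itself, `ProjectableFilterableTable.scan` receives the filters as a list of `RexNode` expressions plus an `int[]` of projected column indexes; the sketch replaces the `RexNode` list with a plain `Predicate` to stay self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a table that accepts pushed-down filters and projections. */
final class PushdownSketch {

    /** Rows are Object[] in column order. */
    private final List<Object[]> rows;

    PushdownSketch(List<Object[]> rows) {
        this.rows = rows;
    }

    /**
     * Scans the table, applying the filter and projection at the source so that
     * unwanted rows and columns are never handed to the query engine.
     *
     * @param filter   row predicate, or null to keep every row
     * @param projects column indexes to keep, or null to keep every column
     */
    List<Object[]> scan(Predicate<Object[]> filter, int[] projects) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] row : rows) {
            if (filter != null && !filter.test(row)) {
                continue; // row rejected during the scan
            }
            if (projects == null) {
                out.add(row);
                continue;
            }
            Object[] projected = new Object[projects.length];
            for (int i = 0; i < projects.length; i++) {
                projected[i] = row[projects[i]];
            }
            out.add(projected);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed example schema: (id, name, age).
        PushdownSketch table = new PushdownSketch(Arrays.asList(
                new Object[] {1, "alice", 30},
                new Object[] {2, "bob", 17},
                new Object[] {3, "carol", 45}));
        // Roughly: SELECT name FROM t WHERE age >= 18
        List<Object[]> result = table.scan(row -> (Integer) row[2] >= 18, new int[] {1});
        for (Object[] row : result) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The benefit for a file-based table format is that a scan which knows the filter and the needed columns up front can skip data it would otherwise read and discard.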

Tests

  • 98 unit/integration tests passing (mvn clean test -pl paimon-cli)
  • SQL tests cover: SELECT, WHERE, LIMIT, GROUP BY, HAVING, DISTINCT, ORDER BY, JOIN, subquery, CASE WHEN, UNION, window functions (ROW_NUMBER)
  • Manual verification via paimon sql against local filesystem warehouse

hqlalala force-pushed the feat/paimon-cli branch 2 times, most recently from 0e5b7cc to 9239a1a (May 26, 2026 09:49)
@JingsongLi (Contributor) left a comment

This is a big addition (9000+ lines, new module). A few high-level thoughts:

Scope & Direction Questions:

  1. Overlap with Python CLI: The PR description includes a comparison table, but the fundamental question is: do we want to maintain two standalone CLIs (Java + Python) long-term? Each new feature would need to be implemented in both. What's the intended audience segmentation?

  2. Calcite dependency: Adding Apache Calcite as a SQL engine brings significant dependency weight. The fat jar will be large. Have you measured the final jar size? Calcite also has its own version compatibility concerns with different Hadoop/Hive environments.

  3. Module structure: Adding a top-level paimon-cli module to the root pom.xml means it's part of the main release artifacts. Should this be an experimental/optional module first?

Technical Concerns:

  1. SQL engine completeness: Using Calcite's planner is powerful but means you need to handle Calcite's SQL dialect differences from Flink/Spark SQL. Users may expect the same SQL syntax to work across Flink SQL, Spark SQL, and this CLI. What's the compatibility story?

  2. Write path: The CLI includes a WriteCommand — what's the commit protocol? Does it use the same FileStoreCommit as Flink/Spark? How are conflicts handled for concurrent writers?

  3. No PIP reference: A new module of this scope typically goes through a PIP (Paimon Improvement Proposal) for community discussion. Has this been discussed on the mailing list?
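
To make the concurrent-writer question in the write-path item concrete, the following is a generic sketch of optimistic commit with retry. It is an illustration only: it is not Paimon's `FileStoreCommit` and not code from this PR, and the class `OptimisticCommitSketch` with its methods is hypothetical. An in-memory `AtomicLong` stands in for whatever durable, atomically updated snapshot pointer a real commit protocol would use.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Generic optimistic-commit illustration; NOT Paimon's FileStoreCommit. */
final class OptimisticCommitSketch {

    /** Id of the latest committed snapshot (stand-in for durable snapshot metadata). */
    private final AtomicLong latestSnapshot = new AtomicLong(0);

    long latest() {
        return latestSnapshot.get();
    }

    /**
     * Tries to publish snapshot baseSnapshot + 1. Succeeds only if no other
     * writer has committed since baseSnapshot was read.
     */
    boolean tryCommit(long baseSnapshot) {
        return latestSnapshot.compareAndSet(baseSnapshot, baseSnapshot + 1);
    }

    /** Re-reads the latest snapshot and retries until the commit lands or attempts run out. */
    long commitWithRetry(int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long base = latest();
            // A real committer would re-check its changes against any snapshots
            // newer than its original base here and abort on a true conflict.
            if (tryCommit(base)) {
                return base + 1;
            }
        }
        throw new IllegalStateException("commit failed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        OptimisticCommitSketch store = new OptimisticCommitSketch();
        long baseSeenByBothWriters = store.latest();

        boolean writerA = store.tryCommit(baseSeenByBothWriters); // wins the race
        boolean writerB = store.tryCommit(baseSeenByBothWriters); // stale base, rejected
        System.out.println("writer A committed: " + writerA);
        System.out.println("writer B committed: " + writerB);

        // Writer B retries from the new latest snapshot.
        System.out.println("writer B retry produced snapshot " + store.commitWithRetry(3));
    }
}
```

A CLI writer that bypassed this kind of check, or used a different one from the Flink/Spark writers, could silently overwrite a concurrent commit, which is why the question of sharing the existing commit path matters.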

I'd suggest:

  • Start with a PIP discussion to align on scope and long-term maintenance plan
  • Consider starting as a separate repository or paimon-extras module to avoid blocking releases
  • Focus the first version on read-only operations (SQL queries, catalog inspection) before adding write support

cc @apache/paimon-committers for broader input on whether we want a Java CLI in the main repo.
