
feat(catalog): hadoop list drop rename operations#970

Open
tanmayrauth wants to merge 7 commits intoapache:mainfrom
tanmayrauth:feat/hadoop-list-drop-rename


@tanmayrauth
Contributor

@tanmayrauth tanmayrauth commented May 1, 2026

5: ListTables + DropTable + RenameTable

Implement the remaining table catalog methods. ListTables verifies that the namespace exists, then scans the namespace directory for table subdirectories using isTableDir. DropTable verifies that the table exists, then calls os.RemoveAll on the table directory. RenameTable returns an unsupported error. Tests cover listing an empty namespace, listing a namespace with tables, listing a non-existent namespace, dropping an existing table and verifying that its directory is removed, dropping a non-existent table, and confirming that rename returns an error.

Depends on #968 and #963
Relates to #798

Implement CreateNamespace, DropNamespace, CheckNamespaceExists,
ListNamespaces, LoadNamespaceProperties, and UpdateNamespaceProperties
(unsupported, matching Java).

Relates to apache#798

Depends-on: apache#953 (scaffold)
Depended-on-by: PR 4 (table CRUD), PR 5 (list/drop/rename)

Set up Docker and Spark infrastructure for Hadoop catalog
cross-compatibility testing with Java's HadoopCatalog.

- Add hadoop_validation.py: SparkSession configured with
  spark.sql.catalog.hadoop_test (type=hadoop, warehouse=/home/iceberg/hadoop-warehouse)
- Add shared volume mount in docker-compose.yml:
  /tmp/iceberg-hadoop-warehouse (host) <-> /home/iceberg/hadoop-warehouse (Spark)
- Copy hadoop_validation.py into Spark container via Dockerfile
- Add make integration-hadoop target

No Go code — purely infrastructure so subsequent PRs can add
integration test cases that validate Go ↔ Spark interop.

Depends-on: nothing (parallel with PR 1)
Depended-on-by: PRs 4, 5, 6 (integration test cases)

Implement the remaining table catalog methods:

- ListTables: scans namespace directory for table subdirs via isTableDir,
  returns fully-qualified identifiers using iter.Seq2 pattern
- DropTable: verifies table exists then removes entire directory tree
- RenameTable: returns unsupported error (matches Java)

Fix slice aliasing in ListTables by copying the namespace prefix
before appending the table name to each yielded identifier.
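The aliasing problem mentioned here is easy to reproduce in isolation. The sketch below is an illustration, not the PR's code, and the function names are hypothetical. It assumes the namespace slice has spare capacity, which is the condition under which repeated `append` calls write into the same backing array.

```go
package main

import "fmt"

// identsAliased reproduces the bug: when ns has spare capacity, each
// append(ns, name) writes into the same backing array, so identifiers
// produced earlier are overwritten by later ones.
func identsAliased(ns, names []string) [][]string {
	out := make([][]string, 0, len(names))
	for _, name := range names {
		out = append(out, append(ns, name))
	}
	return out
}

// identsCopied applies the fix: copy the namespace prefix into a fresh
// slice before appending the table name.
func identsCopied(ns, names []string) [][]string {
	out := make([][]string, 0, len(names))
	for _, name := range names {
		ident := make([]string, 0, len(ns)+1)
		ident = append(ident, ns...)
		out = append(out, append(ident, name))
	}
	return out
}

func main() {
	// Length 1, capacity 4: there is room to append without reallocating.
	ns := make([]string, 1, 4)
	ns[0] = "db"
	names := []string{"t1", "t2"}

	fmt.Println(identsAliased(ns, names)) // [[db t2] [db t2]]
	fmt.Println(identsCopied(ns, names))  // [[db t1] [db t2]]
}
```

The bug only shows up when a consumer keeps identifiers after the iterator has advanced, which is why a test that inspects one identifier at a time could miss it.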

Relates to apache#798

Depends-on: PR 3 (namespace-ops)
Depended-on-by: none (leaf PR)

@tanmayrauth tanmayrauth requested a review from zeroshade as a code owner May 1, 2026 22:10
…pTable, RenameTable

Cover cross-catalog interop: Spark creates tables, Go lists/drops them,
and vice versa. Validates namespace round-trip, drop cleanup, and
ErrNoSuchTable/ErrNoSuchNamespace error paths.

Pre-create hadoop-warehouse directory before Docker compose to ensure
runner ownership. Use Go-created namespaces and fake tables for write
tests to avoid root-ownership issues with Spark-created directories.

@tanmayrauth tanmayrauth force-pushed the feat/hadoop-list-drop-rename branch from 32574ed to 83e00f5 Compare May 2, 2026 01:52
@tanmayrauth
Contributor Author

@laskoviymishka @zeroshade can you please review this PR?

Member

@zeroshade zeroshade left a comment


This looks great to me, can you also update the README with the support?
