@flyrain flyrain commented Oct 6, 2025

To address #2467, this PR adds docs for catalog federation, written with the help of ChatGPT Codex. cc @poojanilangekar @dennishuo

"warehouse": "s3://analytics-bucket/warehouse/",
"authenticationParameters": { "authenticationType": "IMPLICIT" }
}
}'

Member

We should be able to use the Polaris CLI to create the catalog now.

Contributor Author

Thanks @XJDKC! I can add the CLI version as well.

Contributor

Most of the docs just reference the CLI, so if the CLI works, I think we should prefer that.

Contributor Author

I have changed the REST example. The CLI doesn't support HMS yet, so I can update the HMS example after adding that support. https://github.com/polaris-catalog/polaris/blob/d449f59e9b54b414c0dd581e086d5ce436b05708/client/python/cli/command/catalogs.py#L271-L271

Comment on lines +23 to +25
---

Polaris can federate catalog operations to an existing Hive Metastore (HMS). This lets an external

Contributor

I think this document can also be used for Hadoop catalogs. Essentially, the mechanism for HMS and Hadoop is the same. We could call these non-REST catalogs.

CC @eric-maynard

Contributor Author

Agreed, let me make that change

Contributor

This page is about HMS, so it may be okay to focus on HMS here. But @poojanilangekar is absolutely correct that Polaris is meant to be able to federate to any HadoopCatalog implementation, and we should make the docs clear about that (if not on this page, then elsewhere). There are also REST-based catalogs that Polaris could federate to (e.g. Unity Catalog).

Contributor Author

I can add the Hadoop one as a follow-up.

@HonahX HonahX left a comment

Thanks for the federation doc!

flyrain commented Oct 9, 2025

Thanks for the feedback! This is ready for another review.

@HonahX HonahX left a comment

Thanks for adding the CLI examples! Some small comments but it looks great!

--storage-type S3 \
--role-arn "arn:aws:iam::123456789012:role/polaris-warehouse-access" \
--default-base-location "s3://analytics-bucket/warehouse/" \
--catalog-connection-type ICEBERG \

Contributor

Suggested change
- --catalog-connection-type ICEBERG \
+ --catalog-connection-type iceberg-rest \

I think we need to fix the Polaris CLI docs. I plan to add more tests for federation-related commands at the same time.

Contributor Author

Yeah, we need to fix the doc. We could change the enum name as well, e.g. iceberg -> Iceberg_REST:

https://github.com/polaris-catalog/polaris/blob/846d16505dd89d58572f918ace825c81e96aafd7/client/python/cli/command/catalogs.py#L263-L263


```bash
polaris catalogs create \
--name analytics_rest \

Contributor

For catalog creation, the catalog name is a positional argument, so we could move it to the end of the command.

For reference, here is an example CLI command I constructed:

polaris \
    --base-url $POLARIS_BASE_URL \
    --client-id $CLIENT_ID \
    --client-secret $CLIENT_SECRET \
    catalogs create \
    --storage-type s3 \
    --role-arn arn.... \
    --default-base-location s3://<bucket>/$USER/polaris-$FEDERATED_CATALOG \
    --type EXTERNAL \
    --catalog-connection-type iceberg-rest \
    --iceberg-remote-catalog-name jojiang_catalog \
    --catalog-uri "$POLARIS_BASE_URL/api/catalog" \
    --catalog-authentication-type OAUTH \
    --catalog-token-uri "$POLARIS_BASE_URL/api/catalog/v1/oauth/tokens" \
    --catalog-client-id $PRINCIPAL_CLIENT_ID \
    --catalog-client-secret $PRINCIPAL_CLIENT_SECRET \
    --catalog-client-scope "PRINCIPAL_ROLE:ALL" \
    $FEDERATED_CATALOG

@flyrain flyrain force-pushed the docs-hms-federation branch from 1f3dc19 to 9b191a9 Compare October 10, 2025 23:33
@HonahX HonahX left a comment

LGTM

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Oct 11, 2025
@flyrain flyrain merged commit a496a6f into apache:main Oct 11, 2025
16 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Oct 11, 2025

5 participants