-
Notifications
You must be signed in to change notification settings - Fork 314
Site: Add docs for catalog federation #2761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
"warehouse": "s3://analytics-bucket/warehouse/", | ||
"authenticationParameters": { "authenticationType": "IMPLICIT" } | ||
} | ||
}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to use polaris cli to create the catalog now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @XJDKC ! I can add the cli version as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the docs just reference the CLI, so if the CLI works I think we should prefer that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the REST example. The HMS cli support wasn't there. I can make the change after adding the support. https://github.com/polaris-catalog/polaris/blob/d449f59e9b54b414c0dd581e086d5ce436b05708/client/python/cli/command/catalogs.py#L271-L271
--- | ||
|
||
Polaris can federate catalog operations to an existing Hive Metastore (HMS). This lets an external |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this document can also be used for Hadoop catalogs. Essentially, the mechanism for HMS and Hadoop is the same. We could call these non-REST catalogs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, let me make that change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This page is about HMS so it's maybe okay to focus on HMS, but @poojanilangekar is absolutely correct that Polaris is meant to be able to federate to any HadoopCatalog implementation and we should make the docs clear about that (if not on this page then elsewhere). There are also catalogs which do use REST which Polaris could federate to (e.g. Unity Catalog).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add hadoop one as a followup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the federation doc!
site/content/in-dev/unreleased/federation/iceberg-rest-federation.md
Outdated
Show resolved
Hide resolved
site/content/in-dev/unreleased/federation/iceberg-rest-federation.md
Outdated
Show resolved
Hide resolved
Thanks for the feedback! This is ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the CLI examples! Some small comments but it looks great!
site/content/in-dev/unreleased/federation/iceberg-rest-federation.md
Outdated
Show resolved
Hide resolved
--storage-type S3 \ | ||
--role-arn "arn:aws:iam::123456789012:role/polaris-warehouse-access" \ | ||
--default-base-location "s3://analytics-bucket/warehouse/" \ | ||
--catalog-connection-type ICEBERG \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--catalog-connection-type ICEBERG \ | |
--catalog-connection-type iceberg-rest \ |
I think we need to fix the doc of Polaris CLI, I plan to add more tests for federation related commands at the same time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we need to fix the doc. We could change the enum name as well. iceberg
-> Iceberg_REST
,
|
||
```bash | ||
polaris catalogs create \ | ||
--name analytics_rest \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For catalog creation, the catalog name is a positional argument, we could move it to the last.
For reference, here is an example CLI I constructed:
polaris \
--base-url $POLARIS_BASE_URL \
--client-id $CLIENT_ID \
--client-secret $CLIENT_SECRET \
catalogs create \
--storage-type s3 \
--role-arn arn.... \
--default-base-location s3://<bucket>/$USER/polaris-$FEDERATED_CATALOG \
--type EXTERNAL \
--catalog-connection-type iceberg-rest \
--iceberg-remote-catalog-name jojiang_catalog \
--catalog-uri "$POLARIS_BASE_URL/api/catalog" \
--catalog-authentication-type OAUTH \
--catalog-token-uri "$POLARIS_BASE_URL/api/catalog/v1/oauth/tokens" \
--catalog-client-id $PRINCIPAL_CLIENT_ID \
--catalog-client-secret $PRINCIPAL_CLIENT_SECRET \
--catalog-client-scope "PRINCIPAL_ROLE:ALL" \
$FEDERATED_CATALOG
1f3dc19
to
9b191a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
To address #2467, the PR adds docs for catalog federation. With the help of chatGPT codex. cc @poojanilangekar @dennishuo