Skip to content

fix(cli/rest) Support Glue REST operations with Iceberg-Go CLI #459

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

maninc
Copy link
Contributor

@maninc maninc commented Jun 12, 2025

Motivation

To Support AWS Glue Iceberg endpoint using Iceberg-Go CLI. Forbidden error is thrown when Iceberg tables in glue are queried using CLI.

https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html

Fix:

  • Added RestCatalogConfig struct to CatalogConfig struct
  • RestCatalogConfig contains rest catalog configuration properties.
  • Above values are used before creating rest catalog
  • Use AWS environment credentials when Credentials property is not set

Testing:

  • Added unit test for new configuration
  • Used below rest_config.yaml to query Iceberg tables in AWS Glue.
catalog:
  default:
    type: rest
    uri: https://glue.us-east-1.amazonaws.com/iceberg
    region: us-east-1
    warehouse: YOUR_AWS_ACCOUNT_ID
    rest-config:
      sigv4-region: us-east-1
      sigv4-service: glue

Copy link
Contributor

@laskoviymishka laskoviymishka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm doubtful about this approach.
Can we pass this via flags?
This makes the CLI less unified. I think it's better to keep this approach consistent — by using flags.

@maninc
Copy link
Contributor Author

maninc commented Jun 16, 2025

I'm doubtful about this approach. Can we pass this via flags? This makes the CLI less unified. I think it's better to keep this approach consistent — by using flags.

Thanks @laskoviymishka for reviewing !

Yes, that was my initial approach as well to make CLI args and config files to be as close as possible. Since we may end up in passing too many arguments, i used this approach.

Do you think adding these REST config arguments as a JSON would be easier ?

@laskoviymishka
Copy link
Contributor

I think this is still a problem.
From what I see in other Glue clients, parameters are typically passed as CLI flags instead of a JSON file.
CLI flags are more self-describing (you can view them directly with --help) and much more convenient to use.

Currently, using a custom JSON schema adds an unnecessary layer of complexity - it's less clear to the user and makes maintenance more cumbersome.

Could we consider following the pattern used by other tools in the ecosystem?

See for example Iceberg Catalog Migrator

java -jar iceberg-catalog-migrator-cli.jar migrate \
  --source-catalog-type GLUE \
  --source-catalog-properties warehouse=s3a://example-bucket/gluecatalog/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
  --target-catalog-type NESSIE \
  --target-catalog-properties uri=http://...,warehouse=s3a://...,io-impl=...

Another option is simply by-pass this as AWS_* env-vars, as it done in https://py.iceberg.apache.org/cli/

@zeroshade
Copy link
Member

I agree with @laskoviymishka that we shouldn't be passing custom JSON stuff and should instead use flags or just follow whatever pattern pyiceberg is using.

@maninc
Copy link
Contributor Author

maninc commented Jul 7, 2025

Thank you @laskoviymishka @zeroshade for the suggestion. I have updated the code to follow the pattern in Iceberg Catalog Migrator.

Successfully tested cli using command below,

./iceberg list --catalog rest --uri https://glue.us-east-1.amazonaws.com/iceberg --warehouse <ACCOUNT_ID> --rest-config sigv4-region=us-east-1,sigv4-service=glue

output: json
credential: client-id:client-secret
warehouse: 123456789012
rest-config: auth-url=https://auth.example.com,sigv4-enabled=true,sigv4-region=us-east-1,sigv4-service=glue,tls-skip-verify=false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, why don't we just use the existing properties for these that we already recognize as separate properties to pass?

i.e. rest.sigv4-enabled, rest.signing-region, rest.authorization-url, etc.. (see catalog/rest.go)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants