@eric-maynard
Contributor

Description

This adds support for a new property, region, for AWS storage configurations.

Fixes #342

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I'm able to create catalogs and add the region property to the StorageConfigInfo:

LD4RTJ0HY9:polaris emaynard$ curl -X POST http://localhost:8181/api/management/v1/catalogs \
> -H "Authorization: Bearer principal:root;realm:default-realm" \
> -H "Content-Type: application/json" \
> -d '{
>   "catalog": {
>     "type": "INTERNAL",
>     "name": "example_catalog",
>     "properties": {
>       "default-base-location": "s3://your-bucket/catalog-location/"
>     },
>     "storageConfigInfo": {
>       "storageType": "S3",
>       "roleArn": "arn:aws:iam::012345678901:role/jdoe",
>       "region": "us-east-2"
>     }
>   }
> }'
LD4RTJ0HY9:polaris emaynard$ curl -X GET http://localhost:8181/api/management/v1/catalogs/example_catalog \
> -H "Authorization: Bearer principal:root;realm:default-realm" \
> -H "Content-Type: application/json" | jq
{
  "type": "INTERNAL",
  "name": "example_catalog",
  "properties": {
    "default-base-location": "s3://your-bucket/catalog-location/"
  },
  "createTimestamp": 1731818113312,
  "lastUpdateTimestamp": 1731818113312,
  "entityVersion": 1,
  "storageConfigInfo": {
    "storageType": "S3",
    "roleArn": "arn:aws:iam::012345678901:role/jdoe",
    "externalId": null,
    "userArn": null,
    "region": "us-east-2",
    "allowedLocations": [
      "s3://your-bucket/catalog-location/"
    ]
  }
}

@eric-maynard eric-maynard changed the title Issue 342 Storage AWS region in AwsStorageConfigurationInfo Nov 17, 2024
@eric-maynard eric-maynard changed the title Storage AWS region in AwsStorageConfigurationInfo Store AWS region in AwsStorageConfigurationInfo Nov 17, 2024
@eric-maynard eric-maynard marked this pull request as draft November 18, 2024 17:51
Contributor

@singhpk234 singhpk234 left a comment

[doubt] Will the region specified here be the region of the Polaris catalog? Do we plan to support federation of catalogs in the future? If so, should this be the region of the Polaris catalog or the region of the federated catalog?

For example, could this be a possible case in the near future?
Polaris -> (Federated to) -> Glue
(region A) ------ > (region B)

@eric-maynard
Contributor Author

eric-maynard commented Nov 18, 2024

Hey @singhpk234, the idea is that you can associate a region with a storage configuration so that the region can be used by any client that leverages credentials/files associated with that storage configuration.

As you pointed out, it is not clear how this will work with catalog federation (cc @dennishuo). But I think it is also unclear how storage configurations will work with federated catalogs more generally -- for example, a single role ARN may not be valid for the entire federated catalog. So this is something our design for federation must address.

At the very least, we have discussed allowing storage configurations to be defined on a level more granular than the catalog (e.g. at the table or namespace level).

Maybe @munendrasn, the filer of #342, can also help provide some additional context here. For my part I am curious if there's a particular test case we can add here to make sure the issue reported in #342 is fully addressed.

@munendrasn

@eric-maynard
It is similar to the case @singhpk234 mentioned.
We have a custom catalog tracking native Iceberg tables and federated Iceberg tables from different catalogs. One such catalog is the Polaris catalog.
Our setup is in one AWS region, but the federated table's data is stored in another region, so accessing the table fails.

On the testing, are you looking to test it via the Iceberg APIs or directly via the S3 client API? If AWS_REGION is set to one region but the table's storage is in another region, any read or list operation would fail unless the region is explicitly specified when the S3 client is created.
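
To illustrate the failure mode described above, here is a minimal sketch of how a client might choose the region for its S3 client. It assumes the region arrives under the client.region key discussed in this thread and that the value of AWS_REGION is passed in as a fallback. RegionResolver and its resolve method are hypothetical names used only for illustration; they are not part of Polaris, Iceberg, or the AWS SDK.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not a Polaris or Iceberg class): decides which region
// an S3 client should be created with.
final class RegionResolver {
  // Property key discussed in this thread.
  static final String CLIENT_REGION = "client.region";

  // Prefers the region supplied with the table's configuration and falls back
  // to the region taken from the environment (e.g. the value of AWS_REGION).
  static Optional<String> resolve(Map<String, String> tableProperties, String environmentRegion) {
    String fromTable = tableProperties.get(CLIENT_REGION);
    if (fromTable != null && !fromTable.isBlank()) {
      return Optional.of(fromTable);
    }
    return Optional.ofNullable(environmentRegion).filter(r -> !r.isBlank());
  }

  public static void main(String[] args) {
    // The table's data lives in us-east-2 regardless of where the client runs.
    Map<String, String> props = Map.of(CLIENT_REGION, "us-east-2");
    System.out.println(resolve(props, System.getenv("AWS_REGION")));
  }
}
```

With a resolver like this, the table's own region wins whenever it is present, so a mismatch between AWS_REGION and the bucket's region only matters when the catalog returns no region at all.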

@eric-maynard
Contributor Author

Hi @munendrasn I see -- do the current changes here work for your use case then? client.region should be specified in the credentials map so long as it's set for the table's storage configuration.

@eric-maynard eric-maynard marked this pull request as ready for review November 20, 2024 18:42
@munendrasn

@eric-maynard
As long as client.region is set, it would resolve the reported issue. Skimming through the code, I couldn't find client.region being returned as part of credentialMap. Would it be possible to include a test case for this?

@eric-maynard
Contributor Author

eric-maynard commented Nov 25, 2024

Hi @munendrasn, please see the newly-added test. client.region should be present in the credentials map when a region is attached to the storage config.
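
As a rough illustration of the behavior described above (this is not the code or the test added in this PR), the sketch below builds a credentials map and adds client.region only when a region is attached to the storage configuration. CredentialMapSketch and credentialsFor are hypothetical names, and the s3.* keys are assumed Iceberg-style property names rather than something confirmed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the Polaris implementation.
final class CredentialMapSketch {
  // Builds the map of properties handed to a client alongside vended credentials.
  // The "s3.*" keys are assumptions; "client.region" is the key from this thread.
  static Map<String, String> credentialsFor(
      String accessKeyId, String secretAccessKey, String sessionToken, String region) {
    Map<String, String> credentials = new LinkedHashMap<>();
    credentials.put("s3.access-key-id", accessKeyId);
    credentials.put("s3.secret-access-key", secretAccessKey);
    credentials.put("s3.session-token", sessionToken);
    // The region is optional on the storage configuration, so only emit it when set.
    if (region != null) {
      credentials.put("client.region", region);
    }
    return credentials;
  }

  public static void main(String[] args) {
    Map<String, String> credentials =
        credentialsFor("example-key-id", "example-secret", "example-token", "us-east-2");
    System.out.println(credentials.get("client.region"));
  }
}
```

A test along these lines would check both branches: the key is present with the configured value when a region is set, and absent when it is not.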

@munendrasn munendrasn left a comment

lgtm

@JsonProperty(value = "roleARN", required = true) @NotNull String roleARN) {
this(storageType, allowedLocations, roleARN, null);
@JsonProperty(value = "roleARN", required = true) @NotNull String roleARN,
@JsonProperty(value = "region", required = false) @NotNull String region) {
Contributor

Why @NotNull here while the property is nullable?

Contributor Author

Good catch, I thought it had to be NotNull for the JsonCreator but in fact that's not true. Fixed.
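
For readers following the exchange above, here is a simplified sketch of the constructor shape under discussion, with the Jackson and validation annotations left out so it compiles on its own. AwsStorageConfigSketch is a hypothetical stand-in for the real class; the point is only that roleARN is required while region may be null, and that the shorter constructor delegates with a null region.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical, annotation-free stand-in for the class under review.
final class AwsStorageConfigSketch {
  private final List<String> allowedLocations;
  private final String roleARN;
  private final String region; // optional: may be null

  // Older call sites that know nothing about regions keep working.
  AwsStorageConfigSketch(List<String> allowedLocations, String roleARN) {
    this(allowedLocations, roleARN, null);
  }

  AwsStorageConfigSketch(List<String> allowedLocations, String roleARN, String region) {
    this.allowedLocations = List.copyOf(Objects.requireNonNull(allowedLocations, "allowedLocations"));
    this.roleARN = Objects.requireNonNull(roleARN, "roleARN"); // required
    this.region = region; // deliberately not null-checked
  }

  List<String> getAllowedLocations() {
    return allowedLocations;
  }

  String getRoleARN() {
    return roleARN;
  }

  String getRegion() {
    return region;
  }

  public static void main(String[] args) {
    AwsStorageConfigSketch config =
        new AwsStorageConfigSketch(
            List.of("s3://your-bucket/catalog-location/"),
            "arn:aws:iam::012345678901:role/jdoe",
            "us-east-2");
    System.out.println(config.getRegion());
  }
}
```

Marking a parameter as non-null while also declaring the property optional sends contradictory signals to callers and to validation tooling, which is why the annotation was dropped for region.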

@eric-maynard eric-maynard requested a review from adutra December 5, 2024 18:47
@adutra
Contributor

adutra commented Dec 10, 2024

@eric-maynard let's merge this one, could you please resolve the conflicts?

@eric-maynard eric-maynard enabled auto-merge (squash) December 11, 2024 21:39
@eric-maynard
Contributor Author

Hey @adutra, the conflicts are resolved -- when you get a moment, could you re-approve? Thanks!

@eric-maynard eric-maynard merged commit f9a24d4 into apache:main Dec 12, 2024
5 checks passed

Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] support for returning client region in the loadTable response for S3

4 participants