-
Notifications
You must be signed in to change notification settings - Fork 327
adding support to use a kms key for s3 buckets data encryption (AWS only) #2802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
67fdaca
5cd4c4d
7db0a87
cf4ec3e
ffbe979
5a0f3eb
708630c
3a10f7d
543dabc
351dc07
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -33,6 +33,8 @@ | |
| import org.apache.polaris.core.storage.StorageAccessProperty; | ||
| import org.apache.polaris.core.storage.StorageUtil; | ||
| import org.apache.polaris.core.storage.aws.StsClientProvider.StsDestination; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
| import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider; | ||
| import software.amazon.awssdk.policybuilder.iam.IamConditionOperator; | ||
| import software.amazon.awssdk.policybuilder.iam.IamEffect; | ||
|
|
@@ -49,6 +51,9 @@ public class AwsCredentialsStorageIntegration | |
| private final StsClientProvider stsClientProvider; | ||
| private final Optional<AwsCredentialsProvider> credentialsProvider; | ||
|
|
||
| private static final Logger LOGGER = | ||
| LoggerFactory.getLogger(AwsCredentialsStorageIntegration.class); | ||
|
|
||
| public AwsCredentialsStorageIntegration( | ||
| AwsStorageConfigurationInfo config, StsClient fixedClient) { | ||
| this(config, (destination) -> fixedClient); | ||
|
|
@@ -75,11 +80,15 @@ public AccessConfig getSubscopedCreds( | |
| boolean allowListOperation, | ||
| @Nonnull Set<String> allowedReadLocations, | ||
| @Nonnull Set<String> allowedWriteLocations, | ||
| Optional<String> refreshCredentialsEndpoint) { | ||
| Optional<String> refreshCredentialsEndpoint, | ||
| Map props) { | ||
| LOGGER.info("Getting subscoped creds props: {}", props); | ||
| String kmsKey = props.get("s3.sse.key") != null ? props.get("s3.sse.key").toString() : null; | ||
| int storageCredentialDurationSeconds = | ||
| realmConfig.getConfig(STORAGE_CREDENTIAL_DURATION_SECONDS); | ||
| AwsStorageConfigurationInfo storageConfig = config(); | ||
| String region = storageConfig.getRegion(); | ||
| String accountId = storageConfig.getAwsAccountId(); | ||
| AccessConfig.Builder accessConfig = AccessConfig.builder(); | ||
|
|
||
| if (shouldUseSts(storageConfig)) { | ||
|
|
@@ -90,10 +99,13 @@ public AccessConfig getSubscopedCreds( | |
| .roleSessionName("PolarisAwsCredentialsStorageIntegration") | ||
| .policy( | ||
| policyString( | ||
| storageConfig.getAwsPartition(), | ||
| storageConfig, | ||
| allowListOperation, | ||
| allowedReadLocations, | ||
| allowedWriteLocations) | ||
| allowedWriteLocations, | ||
| kmsKey, | ||
| region, | ||
| accountId) | ||
| .toJson()) | ||
| .durationSeconds(storageCredentialDurationSeconds); | ||
| credentialsProvider.ifPresent( | ||
|
|
@@ -163,12 +175,14 @@ private boolean shouldUseSts(AwsStorageConfigurationInfo storageConfig) { | |
| * ListBucket privileges with no resources. This prevents us from sending an empty policy to AWS | ||
| * and just assuming the role with full privileges. | ||
| */ | ||
| // TODO - add KMS key access | ||
| private IamPolicy policyString( | ||
| String awsPartition, | ||
| AwsStorageConfigurationInfo storageConfigurationInfo, | ||
| boolean allowList, | ||
| Set<String> readLocations, | ||
| Set<String> writeLocations) { | ||
| Set<String> writeLocations, | ||
| String kmsKey, | ||
| String region, | ||
| String accountId) { | ||
| IamPolicy.Builder policyBuilder = IamPolicy.builder(); | ||
| IamStatement.Builder allowGetObjectStatementBuilder = | ||
| IamStatement.builder() | ||
|
|
@@ -178,7 +192,9 @@ private IamPolicy policyString( | |
| Map<String, IamStatement.Builder> bucketListStatementBuilder = new HashMap<>(); | ||
| Map<String, IamStatement.Builder> bucketGetLocationStatementBuilder = new HashMap<>(); | ||
|
|
||
| String arnPrefix = arnPrefixForPartition(awsPartition); | ||
| String arnPrefix = arnPrefixForPartition(storageConfigurationInfo.getAwsPartition()); | ||
| String kmsKeyArn = storageConfigurationInfo.getKmsKeyArn(); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can / will this work with MinIO KMS? Cf. https://github.com/minio/minio/blob/master/docs/kms/README.md There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will need to test that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's fine either way, but if it did not work with MinIO it would be good to mention AWS (only) in the PR description. Also, I'm pretty sure we'll have to support KMS in MinIO at some point, so if current changes were reusable it would be great (but not required). |
||
|
|
||
| Stream.concat(readLocations.stream(), writeLocations.stream()) | ||
| .distinct() | ||
| .forEach( | ||
|
|
@@ -225,6 +241,9 @@ private IamPolicy policyString( | |
| arnPrefix + StorageUtil.concatFilePrefixes(parseS3Path(uri), "*", "/"))); | ||
| }); | ||
| policyBuilder.addStatement(allowPutObjectStatementBuilder.build()); | ||
| addKmsKeyPolicy(kmsKey, kmsKeyArn, policyBuilder, true, region, accountId); | ||
| } else { | ||
| addKmsKeyPolicy(kmsKey, kmsKeyArn, policyBuilder, false, region, accountId); | ||
| } | ||
| if (!bucketListStatementBuilder.isEmpty()) { | ||
| bucketListStatementBuilder | ||
|
|
@@ -239,7 +258,45 @@ private IamPolicy policyString( | |
| bucketGetLocationStatementBuilder | ||
| .values() | ||
| .forEach(statementBuilder -> policyBuilder.addStatement(statementBuilder.build())); | ||
| return policyBuilder.addStatement(allowGetObjectStatementBuilder.build()).build(); | ||
| var r = policyBuilder.addStatement(allowGetObjectStatementBuilder.build()).build(); | ||
| LOGGER.info("Policies {}", r); | ||
| return r; | ||
| } | ||
|
|
||
| private static void addKmsKeyPolicy( | ||
| String kmsKeyArnOverride, | ||
| String kmsKeyArn, | ||
| IamPolicy.Builder policyBuilder, | ||
| boolean canEncrypt, | ||
| String region, | ||
| String accountId) { | ||
| if (kmsKeyArn == null && kmsKeyArnOverride == null) { | ||
| kmsKeyArn = arnKeyAll(region, accountId); | ||
| } | ||
| IamStatement.Builder allowKms = | ||
| IamStatement.builder() | ||
| .effect(IamEffect.ALLOW) | ||
| .addAction("kms:GenerateDataKeyWithoutPlaintext") | ||
| .addAction("kms:DescribeKey") | ||
| .addAction("kms:Decrypt") | ||
| .addAction("kms:GenerateDataKey"); | ||
|
|
||
| if (canEncrypt) { | ||
| allowKms.addAction("kms:Encrypt"); | ||
| } | ||
| if (kmsKeyArnOverride != null) { | ||
| LOGGER.info("Adding KMS key policy for key {}", kmsKeyArnOverride); | ||
| allowKms.addResource(IamResource.create(kmsKeyArnOverride)); | ||
| } | ||
| if (kmsKeyArn != null) { | ||
| LOGGER.info("Adding KMS key policy for key {}", kmsKeyArn); | ||
| allowKms.addResource(IamResource.create(kmsKeyArn)); | ||
| } | ||
| policyBuilder.addStatement(allowKms.build()); | ||
| } | ||
|
|
||
| private static String arnKeyAll(String region, String accountId) { | ||
| return String.format("arn:aws:kms:%s:%s:key/*", region, accountId); | ||
| } | ||
|
|
||
| private static String arnPrefixForPartition(String awsPartition) { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -110,7 +110,8 @@ public AccessConfig getOrGenerateSubScopeCreds( | |
| boolean allowListOperation, | ||
| @Nonnull Set<String> allowedReadLocations, | ||
| @Nonnull Set<String> allowedWriteLocations, | ||
| Optional<String> refreshCredentialsEndpoint) { | ||
| Optional<String> refreshCredentialsEndpoint, | ||
| Map props) { | ||
| if (!isTypeSupported(polarisEntity.getType())) { | ||
| diagnostics.fail( | ||
| "entity_type_not_suppported_to_scope_creds", "type={}", polarisEntity.getType()); | ||
|
|
@@ -136,7 +137,8 @@ public AccessConfig getOrGenerateSubScopeCreds( | |
| k.allowedListAction(), | ||
| k.allowedReadLocations(), | ||
| k.allowedWriteLocations(), | ||
| k.refreshCredentialsEndpoint()); | ||
| k.refreshCredentialsEndpoint(), | ||
| props); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If |
||
| if (scopedCredentialsResult.isSuccess()) { | ||
| long maxCacheDurationMs = maxCacheDurationMs(callCtx.getRealmConfig()); | ||
| return new StorageCredentialCacheEntry( | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think that "storage integration" code should work independently of table properties. In other words Polaris should own storage integration and
AccessConfig. Table properties are settable by the end users outside the context of a catalog (e.g. via registerTable). Mixing both sources of integration config is confusing IMHO.Is the intention here to support different KMS config per table?
Polaris may need to be able to get
AccessConfigin order to load table metadata (which has table properties)... Does this not create a chicken-and-egg problem?That said, I'm open to a wider discussion about this on the
devML.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is to support the usage of the s3 properties defined in iceberg, we don't really need to pass down props or have that code in there as long as we can agree on the code https://github.com/apache/polaris/pull/2802/files#diff-d305f7a426a7690c576722c114257792b3fcee726624655d15893b71499827f8R274.
If no kms key was defined in the polaris AWS storage then we allow the usage of any keys owned by that account.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we discuss
arnKeyAll(and I do not have any objections there), I think we need to resolve the conceptual issue of using metadata properties for generatingAccessConfig. Polaris may needAccessConfigin order to read metadata JSON files (which have table properties)... How can this work in principle?