Destination S3: STS Assume Role Authentication #38143

Draft
wants to merge 3 commits into
base: master
Original file line number Diff line number Diff line change
@@ -21,6 +21,7 @@ dependencies {

// Re-export dependencies for gcs-destinations.
api 'com.amazonaws:aws-java-sdk-s3:1.12.647'
api 'com.amazonaws:aws-java-sdk-sts:1.12.647'
api ('com.github.airbytehq:json-avro-converter:1.1.0') { exclude group: 'ch.qos.logback', module: 'logback-classic'}
api 'com.github.alexmojaki:s3-stream-upload:2.2.4'
api 'org.apache.avro:avro:1.11.3'
Original file line number Diff line number Diff line change
@@ -13,6 +13,7 @@ import com.fasterxml.jackson.databind.JsonNode
import io.airbyte.cdk.integrations.destination.s3.constant.S3Constants
import io.airbyte.cdk.integrations.destination.s3.credential.S3AWSDefaultProfileCredentialConfig
import io.airbyte.cdk.integrations.destination.s3.credential.S3AccessKeyCredentialConfig
import io.airbyte.cdk.integrations.destination.s3.credential.S3AssumeRoleCredentialConfig
import io.airbyte.cdk.integrations.destination.s3.credential.S3CredentialConfig
import io.airbyte.cdk.integrations.destination.s3.credential.S3CredentialType
import java.util.*
@@ -343,6 +344,8 @@ open class S3DestinationConfig {
getProperty(config, S3Constants.ACCESS_KEY_ID),
getProperty(config, S3Constants.SECRET_ACCESS_KEY)
)
} else if (config.has(S3Constants.ROLE_ARN)) {
S3AssumeRoleCredentialConfig(getProperty(config, S3Constants.ROLE_ARN)!!)
} else {
S3AWSDefaultProfileCredentialConfig()
}
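The branch order in this hunk suggests a precedence: explicit access keys first, then a role ARN, then the default profile as a fallback. Below is a minimal, self-contained sketch of that precedence. It assumes the first branch (its condition is not visible in the hunk) is taken when both access key fields are present, uses a plain `Map` in place of the `JsonNode` config, assumes the key name `access_key_id`, and uses hypothetical stand-in types (`CredentialChoice`, `chooseCredential`) rather than the CDK classes.

```kotlin
// Hypothetical stand-ins for the CDK credential config classes; not the real API.
sealed interface CredentialChoice
data class AccessKey(val accessKeyId: String, val secretAccessKey: String) : CredentialChoice
data class AssumeRole(val roleArn: String) : CredentialChoice
object DefaultProfile : CredentialChoice

// Mirrors the branch order in the hunk: access keys first (assumed condition),
// then role_arn, then the default profile as the fallback.
fun chooseCredential(config: Map<String, String>): CredentialChoice {
    val accessKeyId = config["access_key_id"] // key name is an assumption
    val secretAccessKey = config["secret_access_key"]
    val roleArn = config["role_arn"]
    return when {
        accessKeyId != null && secretAccessKey != null ->
            AccessKey(accessKeyId, secretAccessKey)
        roleArn != null -> AssumeRole(roleArn)
        else -> DefaultProfile
    }
}

fun main() {
    // Only role_arn is set, so the assume-role branch is chosen.
    println(chooseCredential(mapOf("role_arn" to "arn:aws:iam::123456789012:role/Example")))
    // Nothing is set, so the default profile is used.
    println(chooseCredential(emptyMap()) == DefaultProfile)
}
```

One consequence worth noting under these assumptions: a config that carries both access keys and a `role_arn` would silently ignore the role.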
Original file line number Diff line number Diff line change
@@ -16,6 +16,7 @@ class S3Constants {
const val SECRET_ACCESS_KEY: String = "secret_access_key"
const val S_3_BUCKET_NAME: String = "s3_bucket_name"
const val S_3_BUCKET_REGION: String = "s3_bucket_region"
const val ROLE_ARN: String = "role_arn"

// r2 requires account_id
const val ACCOUNT_ID: String = "account_id"
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
/*
* Copyright (c) 2024 Airbyte, Inc., all rights reserved.
*/
package io.airbyte.cdk.integrations.destination.s3.credential

import com.amazonaws.auth.AWSCredentialsProvider
import com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider
import com.amazonaws.regions.Regions
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient

private const val AIRBYTE_STS_SESSION_NAME = "airbyte-sts-session"
bgroff marked this conversation as resolved.

/**
* The S3AssumeRoleCredentialConfig implementation of the S3CredentialConfig returns an
* STSAssumeRoleSessionCredentialsProvider. The STSAssumeRoleSessionCredentialsProvider
* automatically refreshes assumed role credentials on a background thread. To do this, an STS
* Client is created using the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables
* that are provided by the orchestrator. The roleArn comes from the spec. The externalId, which
* protects against confused deputy problems, is also provided by the orchestrator via an
* environment variable. As of 5/2024, the externalId is set to the workspaceId.
*
* @param roleArn The Amazon Resource Name (ARN) of the role to assume.
*/
class S3AssumeRoleCredentialConfig(private val roleArn: String) : S3CredentialConfig {
// TODO: Verify this env var, I think it might actually be AWS_ASSUME_ROLE_EXTERNAL_ID or
// something like that.
private val externalId: String? = System.getenv("AWS_EXTERNAL_ID")
Contributor


Should that be in the config?

Contributor Author


We have to set this to a unique value for each customer. It cannot come from the config, because that would allow a confused deputy problem. Reading it from the environment avoids this: even if you supply an ARN that you don't own, the externalId the platform injects for you will not match the externalId required by the owning customer's role rules.

For example:

Customer A has configured a role with id: 1234 and externalId: abcd
Customer B has configured a role with id: 5678 and externalId: zxyw

If Customer A sets an ARN of 5678, the platform will still inject abcd as the externalId, so the STS Assume Role will fail because the externalId was not zxyw. The key here is that Airbyte MUST provide the externalId to the connector and the externalId MUST NOT be configurable.

It is not required that the externalId (or the role ARN) be secret.

More info here: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html#external-id-purpose
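
The Customer A / Customer B example above can be sketched as a toy model. This is only an illustration: `Role` and `canAssume` are hypothetical names, and the real check happens inside AWS STS against the role's configuration, not in connector code.

```kotlin
// Hypothetical model of a role and the externalId its owner requires.
data class Role(val id: String, val requiredExternalId: String)

// Toy stand-in for the STS AssumeRole check: succeeds only when the
// externalId supplied by the caller matches the one the role owner configured.
fun canAssume(role: Role, injectedExternalId: String): Boolean =
    role.requiredExternalId == injectedExternalId

fun main() {
    val roles = mapOf(
        "1234" to Role("1234", "abcd"), // Customer A's role
        "5678" to Role("5678", "zxyw")  // Customer B's role
    )
    // The platform injects the externalId per customer; it is never read from the config.
    val customerAExternalId = "abcd"

    // Customer A uses their own role: prints true
    println(canAssume(roles.getValue("1234"), customerAExternalId))
    // Customer A points at Customer B's role: prints false, because "abcd" != "zxyw"
    println(canAssume(roles.getValue("5678"), customerAExternalId))
}
```

If the externalId were read from the user-editable config instead, Customer A could simply enter `zxyw` and the second call would succeed, which is the confused deputy scenario described above.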


override val credentialType: S3CredentialType
get() = S3CredentialType.ASSUME_ROLE

override val s3CredentialsProvider: AWSCredentialsProvider
get() {
/**
* AWSCredentialsProvider implementation that uses the AWS Security Token Service to
* assume a Role and create temporary, short-lived sessions to use for authentication.
* This credentials provider uses a background thread to refresh credentials. This
* background thread can be shut down via the close() method when the credentials
* provider is no longer used.
*/
return STSAssumeRoleSessionCredentialsProvider.Builder(
roleArn,
AIRBYTE_STS_SESSION_NAME
)
.withExternalId(externalId)
/**
* This client is used to make the AssumeRole request. The credentials are
* automatically loaded from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
* environment variables set by the orchestrator.
*/
.withStsClient(
AWSSecurityTokenServiceClient.builder()
.withRegion(Regions.DEFAULT_REGION)
.build()
)
.build()
}
}
Original file line number Diff line number Diff line change
@@ -5,5 +5,6 @@ package io.airbyte.cdk.integrations.destination.s3.credential

enum class S3CredentialType {
ACCESS_KEY,
- DEFAULT_PROFILE
+ DEFAULT_PROFILE,
+ ASSUME_ROLE
}
Original file line number Diff line number Diff line change
@@ -401,6 +401,13 @@
"{sync_id}"
],
"order": 8
},
"role_arn": {
"type": "string",
"description": "The ARN of the AWS IAM role to assume when accessing the bucket.",
"title": "Role ARN",
"examples": ["arn:aws:iam::667471877866:role/TestExternalId"],
"order": 9
}
}
}