feat: add SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL for configurable STS session names#4326
feat: add SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL for configurable STS session names#4326obelix74 wants to merge 8 commits intoapache:mainfrom
Conversation
…e STS session names
Adds a new feature flag SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL that
mirrors the design of SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL, allowing
operators to compose structured AWS STS role session names from an ordered
list of context fields (realm, catalog, namespace, table, principal).
Fields are joined with '-', prefixed with 'p-', and truncated to the
64-character AWS limit using proportional budget allocation across fields.
When the list is empty (default), falls back to the existing
INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL behaviour for backward
compatibility.
The STS role session name is the only caller context AWS propagates to
S3 data events (GetObject, PutObject) in CloudTrail — session tags are
only visible on the AssumeRole event itself. A structured, configurable
session name (e.g. p-{realm}-{catalog}-{table}-{principal}) makes
per-table egress attribution directly readable from S3 data events
without joining to AssumeRole events via accessKeyId.
Closes apache#4324
|
@jbonofre , @adnanhemani , @singhpk234 , @dimas-b - I uptook 1.4.0 and realized that we can make the tags easier to use. Raised #3327 and this is a PR for that. |
…UBSCOPED_CREDENTIAL
| .key("SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL") | ||
| .description( | ||
| "Ordered list of fields to include in the STS role session name for AWS credential vending.\n" | ||
| + "Fields are joined with '-' and prefixed with 'p-'. The result is truncated to 64 characters\n" |
There was a problem hiding this comment.
I guess p- stands for "Polaris"? (just wondering)
optional suggestion: make it configurable by allowing users to enter elements like prefix-myprefix (defaulting to p-).
There was a problem hiding this comment.
Made the following changes.
- AwsSessionNameBuilder: added extractPrefix() that parses a prefix-X token from the config list and returns X- as the prefix (default p-), plus a new 4-arg buildSessionName overload accepting a prefix parameter
- AwsCredentialsStorageIntegration: now calls extractPrefix() and filters prefix- tokens before resolving field names, passing the extracted prefix to buildSessionName
- FeatureConfiguration + docs: updated description to document the prefix-X syntax
|
Since this PR introduces a new feature flag, it would be nice to mention it on |
…NTIAL - Add configurable session name prefix via 'prefix-X' token in the field list (e.g. 'prefix-myorg' → prefix 'myorg-'); defaults to 'p-' - Make SessionNameField enum package-private - Replace proportional budget allocation with greedy reuse: unused chars from a short field flow to subsequent fields rather than being lost - Tighten deterministic test assertions to .isEqualTo where output is fixed
|
I haven't looked in detail yet, but doesn't this need extra handling on cache key generation part as well? by default cache isn't keyed by the entire context iirc. also, should we think about deprecating INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL? I think the only remaining caveat might be the cache. if cache key depends on generated session name itself (which should not be that hard to implement, at least after #3699) then I think there will be no good reason to keep INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL around. |
When SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL includes context-varying fields (realm, catalog, namespace, table), the cache key must incorporate the full CredentialVendingContext even when SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL is disabled. Without this, two requests for different tables would share a cached credential stamped with the first table's session name, producing silent misattribution in CloudTrail S3 data events. Similarly, if "principal" appears in the session name fields, the principal is now included in the cache key so different principals cannot share a credential with the wrong session name. Adds regression tests for both cases.
Thank you for catching this. Fixed in #113533df: we now check On deprecating |
|
Thanks for the caching note @tokoko - good catch 👍 |
|
@obelix74 : CI failed 🤔 I wonder if this PR needs a rebase 🤔 |
It failed twice. The second time was after the rebase. |
The Zizmor security scan fails (exit code 13) due to mismatched version comments on github/codeql-action pins in two workflow files. Root cause: Renovate bot commit 4bb7800 in main updated the github/codeql-action SHA to e46ed2cbd01... (which is v4.35.3) but left the comment as # v4 instead of #v4.35.3. Zizmor's online audit resolves the SHA to the underlying commit 68bde559dea0 (which is tagged v4.35.3) and flags the mismatch. Three violations:
This is not caused by your PR. It was pulled in by the #bb0a0b87d merge from main. The fix is simple — change # v4 to # v4.35.3 on those three lines. |
| * @param prefix the session name prefix (including any trailing separator) | ||
| * @return a valid AWS STS role session name of at most 64 characters | ||
| */ | ||
| public static String buildSessionName( |
There was a problem hiding this comment.
This probably needs to be package-private since SessionNameField is package-private.
…visibility SessionNameField is package-private, so methods that reference it in their signature should be too.
Cf. #4326 (comment) Co-authored-by: Anand K Sankaran <lists@anands.net>
Adds a new feature flag
SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIALthat mirrors the design ofSESSION_TAGS_IN_SUBSCOPED_CREDENTIAL, allowing operators to compose structured AWS STS role session names from an ordered list of context fields (realm, catalog, namespace, table, principal).Fields are joined with
'-', prefixed with'p-', and truncated to the 64-character AWS limit using proportional budget allocation across fields. When the list is empty (default), falls back to the existingINCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIALbehaviour for backward compatibility.The STS role session name is the only caller context AWS propagates to S3 data events (GetObject, PutObject) in CloudTrail — session tags are only visible on the AssumeRole event itself. A structured, configurable session name (e.g.
p-{realm}-{catalog}-{table}-{principal}) makes per-table egress attribution directly readable from S3 data events without joining to AssumeRole events via accessKeyId.Closes #4324
Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)