Skip to content

feat: add SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL for configurable STS session names#4326

Open
obelix74 wants to merge 8 commits intoapache:mainfrom
obelix74:feat/session-name-fields-configurable
Open

feat: add SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL for configurable STS session names#4326
obelix74 wants to merge 8 commits intoapache:mainfrom
obelix74:feat/session-name-fields-configurable

Conversation

@obelix74
Copy link
Copy Markdown
Contributor

@obelix74 obelix74 commented Apr 30, 2026

Adds a new feature flag SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL that mirrors the design of SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL, allowing operators to compose structured AWS STS role session names from an ordered list of context fields (realm, catalog, namespace, table, principal).

Fields are joined with '-', prefixed with 'p-', and truncated to the 64-character AWS limit using proportional budget allocation across fields. When the list is empty (default), falls back to the existing INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL behaviour for backward compatibility.

The STS role session name is the only caller context AWS propagates to S3 data events (GetObject, PutObject) in CloudTrail — session tags are only visible on the AssumeRole event itself. A structured, configurable session name (e.g. p-{realm}-{catalog}-{table}-{principal}) makes per-table egress attribution directly readable from S3 data events without joining to AssumeRole events via accessKeyId.

Closes #4324

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

Anand Kumar Sankaran added 2 commits April 30, 2026 13:38
…e STS session names

Adds a new feature flag SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL that
mirrors the design of SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL, allowing
operators to compose structured AWS STS role session names from an ordered
list of context fields (realm, catalog, namespace, table, principal).

Fields are joined with '-', prefixed with 'p-', and truncated to the
64-character AWS limit using proportional budget allocation across fields.
When the list is empty (default), falls back to the existing
INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL behaviour for backward
compatibility.

The STS role session name is the only caller context AWS propagates to
S3 data events (GetObject, PutObject) in CloudTrail — session tags are
only visible on the AssumeRole event itself. A structured, configurable
session name (e.g. p-{realm}-{catalog}-{table}-{principal}) makes
per-table egress attribution directly readable from S3 data events
without joining to AssumeRole events via accessKeyId.

Closes apache#4324
@obelix74
Copy link
Copy Markdown
Contributor Author

@jbonofre , @adnanhemani , @singhpk234 , @dimas-b - I uptook 1.4.0 and realized that we can make the tags easier to use. Raised #3327 and this is a PR for that.

dimas-b
dimas-b previously approved these changes May 6, 2026
Copy link
Copy Markdown
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a valuable feature to me 👍 Thanks, @obelix74 !

.key("SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL")
.description(
"Ordered list of fields to include in the STS role session name for AWS credential vending.\n"
+ "Fields are joined with '-' and prefixed with 'p-'. The result is truncated to 64 characters\n"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess p- stands for "Polaris"? (just wondering)

optional suggestion: make it configurable by allowing users to enter elements like prefix-myprefix (defaulting to p-).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the following changes.

  • AwsSessionNameBuilder: added extractPrefix() that parses a prefix-X token from the config list and returns X- as the prefix (default p-), plus a new 4-arg buildSessionName overload accepting a prefix parameter
  • AwsCredentialsStorageIntegration: now calls extractPrefix() and filters prefix- tokens before resolving field names, passing the extracted prefix to buildSessionName
  • FeatureConfiguration + docs: updated description to document the prefix-X syntax

@github-project-automation github-project-automation Bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board May 6, 2026
@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented May 6, 2026

Since this PR introduces a new feature flag, it would be nice to mention it on dev... @obelix74 : could you send that email?

…NTIAL

- Add configurable session name prefix via 'prefix-X' token in the field
  list (e.g. 'prefix-myorg' → prefix 'myorg-'); defaults to 'p-'
- Make SessionNameField enum package-private
- Replace proportional budget allocation with greedy reuse: unused chars
  from a short field flow to subsequent fields rather than being lost
- Tighten deterministic test assertions to .isEqualTo where output is fixed
@tokoko
Copy link
Copy Markdown
Contributor

tokoko commented May 7, 2026

I haven't looked in detail yet, but doesn't this need extra handling on cache key generation part as well? by default cache isn't keyed by the entire context iirc.

also, should we think about deprecating INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL? I think the only remaining caveat might be the cache. if cache key depends on generated session name itself (which should not be that hard to implement, at least after #3699) then I think there will be no good reason to keep INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL around.

When SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL includes context-varying
fields (realm, catalog, namespace, table), the cache key must incorporate
the full CredentialVendingContext even when SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL
is disabled. Without this, two requests for different tables would share a
cached credential stamped with the first table's session name, producing
silent misattribution in CloudTrail S3 data events.

Similarly, if "principal" appears in the session name fields, the principal
is now included in the cache key so different principals cannot share a
credential with the wrong session name.

Adds regression tests for both cases.
@obelix74
Copy link
Copy Markdown
Contributor Author

obelix74 commented May 7, 2026

I haven't looked in detail yet, but doesn't this need extra handling on cache key generation part as well? by default cache isn't keyed by the entire context iirc.

also, should we think about deprecating INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL? I think the only remaining caveat might be the cache. if cache key depends on generated session name itself (which should not be that hard to implement, at least after #3699) then I think there will be no good reason to keep INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL around.

Thank you for catching this. Fixed in #113533df: we now check SESSION_NAME_FIELDS_IN_SUBSCOPED_CREDENTIAL directly in StorageCredentialCache — if any of realm/catalog/namespace/table appear in the configured fields the full context is included in the cache key; if principal appears, the principal is keyed too.

On deprecating INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL — agreed in principle. I'd suggest we leave that for a follow-up once #3699 lands and we can verify the cache key story end-to-end for the session-name path. The two flags can coexist safely for now (the new one takes precedence when non-empty), so there's no correctness risk in keeping it around short-term.

@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented May 7, 2026

Thanks for the caching note @tokoko - good catch 👍

@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented May 7, 2026

@obelix74 : CI failed 🤔 I wonder if this PR needs a rebase 🤔

@obelix74
Copy link
Copy Markdown
Contributor Author

obelix74 commented May 7, 2026

@obelix74 : CI failed 🤔 I wonder if this PR needs a rebase 🤔

It failed twice. The second time was after the rebase.

@obelix74
Copy link
Copy Markdown
Contributor Author

obelix74 commented May 7, 2026

@obelix74 : CI failed 🤔 I wonder if this PR needs a rebase 🤔

It failed twice. The second time was after the rebase.

The Zizmor security scan fails (exit code 13) due to mismatched version comments on github/codeql-action pins in two workflow files.

Root cause: Renovate bot commit 4bb7800 in main updated the github/codeql-action SHA to e46ed2cbd01... (which is v4.35.3) but left the comment as # v4 instead of #v4.35.3. Zizmor's online audit resolves the SHA to the underlying commit 68bde559dea0 (which is tagged v4.35.3) and flags the mismatch.

Three violations:

This is not caused by your PR. It was pulled in by the #bb0a0b87d merge from main. The fix is simple — change # v4 to # v4.35.3 on those three lines.

dimas-b
dimas-b previously approved these changes May 7, 2026
* @param prefix the session name prefix (including any trailing separator)
* @return a valid AWS STS role session name of at most 64 characters
*/
public static String buildSessionName(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs to be package-private since SessionNameField is package-private.

@dimas-b dimas-b mentioned this pull request May 7, 2026
6 tasks
@dimas-b
Copy link
Copy Markdown
Contributor

dimas-b commented May 7, 2026

CI follow-up: #4387 (thanks, @obelix74 !)

@obelix74
Copy link
Copy Markdown
Contributor Author

obelix74 commented May 7, 2026

CI follow-up: #4387 (thanks, @obelix74 !)

Approved, thanks for fixing it.

…visibility

SessionNameField is package-private, so methods that reference it in their
signature should be too.
dimas-b added a commit that referenced this pull request May 8, 2026
Cf. #4326 (comment)

Co-authored-by: Anand K Sankaran <lists@anands.net>
@obelix74 obelix74 requested a review from dimas-b May 8, 2026 03:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Make STS role session name fields configurable for credential vending

3 participants