Correct slow log user for RCS 2.0 #130140

Draft: gmjehovich wants to merge 4 commits into base: main

Contributor

@gmjehovich gmjehovich commented Jun 26, 2025

Description:
This PR fixes Elasticsearch slow logs on the fulfilling cluster during a Cross-Cluster Search (CCS) with RCS 2.0. These logs incorrectly displayed the authentication details of the cross-cluster API key's creator instead of those of the original user who initiated the remote search.

Solution Overview:

  • Corrected User Context for CCS: Modified Security.getAuthContextForSlowLog() to accurately extract the originalAuthentication (the Authentication object representing the user on the querying cluster) when processing cross-cluster access requests.
    • Includes user.effective.* fields if the original user was performing a run-as operation on the querying cluster.
    • Includes apikey.id and apikey.name if the original user authenticated via an API key on the querying cluster.
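
The selection logic described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `Subject`, `Auth`, and `authContextForSlowLog` are hypothetical stand-ins for the Elasticsearch `Subject` and `Authentication` classes and for `Security.getAuthContextForSlowLog()`, and the exact field suffixes (`user.name`, `user.effective.name`, `user.effective.realm`) are assumptions based on the field families named in this description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlowLogAuthContextSketch {

    /** Hypothetical stand-in for a subject; apiKeyId is null unless the subject is an API key. */
    public record Subject(String name, String realm, String apiKeyId, String apiKeyName) {}

    /**
     * Hypothetical stand-in for an authentication. runAs is null unless a run-as is in effect;
     * original is null unless this is a cross-cluster access authentication.
     */
    public record Auth(Subject authenticating, Subject runAs, String type, Auth original) {}

    /** Builds the slow log context, preferring the original (querying-cluster) authentication. */
    public static Map<String, String> authContextForSlowLog(Auth auth) {
        if (auth == null) {
            return Map.of();
        }
        // On the fulfilling cluster, report the user from the querying cluster,
        // not the creator of the cross-cluster API key.
        Auth source = auth.original() != null ? auth.original() : auth;
        Map<String, String> ctx = new LinkedHashMap<>();
        populate(source, ctx);
        return ctx;
    }

    /** Copies details of one authentication into the context map (field names are assumed). */
    private static void populate(Auth auth, Map<String, String> ctx) {
        Subject subject = auth.authenticating();
        if (subject.apiKeyId() != null) {
            ctx.put("apikey.id", subject.apiKeyId());
            if (subject.apiKeyName() != null) {
                ctx.put("apikey.name", subject.apiKeyName());
            }
        }
        ctx.put("user.name", subject.name());
        ctx.put("user.realm", subject.realm());
        if (auth.runAs() != null) {
            ctx.put("user.effective.name", auth.runAs().name());
            ctx.put("user.effective.realm", auth.runAs().realm());
        }
        ctx.put("auth.type", auth.type());
    }

    public static void main(String[] args) {
        Subject keyCreator = new Subject("key_creator", "native", "ccs-key-id", "ccs-key");
        Subject alice = new Subject("alice", "file", null, null);
        Auth original = new Auth(alice, null, "REALM", null);
        Auth crossCluster = new Auth(keyCreator, null, "API_KEY", original);
        System.out.println(authContextForSlowLog(crossCluster));
    }
}
```

In this sketch, the cross-cluster case reports `alice` from the wrapped authentication rather than `key_creator`, while a purely local authentication (with no wrapped original) is reported as-is.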

Testing:

  • Added comprehensive unit tests for getAuthContextForSlowLog() in SecurityTests to cover various scenarios for both local and cross-cluster access.
  • See the comments below for a discussion of integration tests.

Ticket
Original issue is ES-8568 on Jira.

* @param auth The Authentication object to extract details from.
* @param authContext The map to populate with authentication details.
*/
private void populateAuthContextMap(Authentication auth, Map<String, String> authContext) {
Contributor Author

For Cross-Cluster Search (CCS) slow logs, I've implemented this so that the user.*, auth.type, user.effective.*, and apikey.* fields are all populated from the inner authentication object.

Is this the precise intended behavior? Or should some fields explicitly signal the 'cross-cluster' nature (e.g., user.realm showing _es_cross_cluster_access), potentially sacrificing original user context?

Contributor

@n1v0lg n1v0lg Jun 27, 2025

good q! I think we want the inner user context here -- we might consider adding more fields to point to the cross-cluster nature of the query, but off the top of my head I don't think that's trivial with slow logs.

going with what you have will also match the slow logs behavior of RCS 1.

@gmjehovich gmjehovich added the Team:Security, :Security/Authentication, and >enhancement labels Jun 26, 2025
@elasticsearchmachine
Collaborator

Hi @gmjehovich, I've created a changelog YAML for you.

@gmjehovich
Contributor Author

Discussion on Integration tests:

As I understand, SecuritySlowLogIT is designed for a single-cluster setup, whereas true Cross-Cluster Search (CCS) inherently requires two clusters. I believe we would need to set up a test that spins up multiple clusters.

Are there existing multi-cluster IT frameworks or standard practices within Elasticsearch that could accommodate a true E2E CCS test for this kind of logging behavior?
Or is this generally considered too complex or out of scope, relying instead on the comprehensive unit test coverage for the getAuthContextForSlowLog() logic?

@gmjehovich gmjehovich self-assigned this Jun 26, 2025
@gmjehovich gmjehovich requested a review from n1v0lg June 26, 2025 21:08
@n1v0lg
Contributor

n1v0lg commented Jun 27, 2025

> As I understand, SecuritySlowLogIT is designed for a single-cluster setup, whereas true Cross-Cluster Search (CCS) inherently requires two clusters. I believe we would need to set up a test that spins up multiple clusters.

@gmjehovich true!

I think an integration test is a good idea. We have AbstractRemoteClusterSecurityTestCase as CCS test infrastructure. I'd model a new REST test case after, e.g., RemoteClusterSecurityApiKeyRestIT -- the test method itself will of course be different, but the two-cluster setup should be fairly similar.

Labels: >enhancement, :Security/Authentication, Team:Security, v9.2.0
3 participants