-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Indexify CCR internal actions for RCS 2.0 #95264
Conversation
.../main/java/org/elasticsearch/xpack/ccr/action/repositories/GetCcrRestoreFileChunkAction.java
Outdated
Show resolved
Hide resolved
@Override | ||
public String[] indices() { | ||
if (shardId == null) { | ||
return null; | ||
} else { | ||
return new String[] { shardId.getIndexName() }; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using just the index name from the shard works because CCR does not support aliases. Otherwise it could be a lot of trouble to keep the original indices.
final String effectiveAction = RCS_ACTION_NAME_LOOKUP.getOrDefault(action, action); | ||
if (false == effectiveAction.equals(action)) { | ||
logger.info("switching action from [{}] to [{}]", action, effectiveAction); | ||
} | ||
sendWithCrossClusterAccessHeaders(crossClusterAccessHeaders, connection, effectiveAction, request, options, handler); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewrite the request here to use the indices actions instead of internal actions to honor indices privileges. This also means, for existing remote cluster model, it will keep using the internal actions.
if (out.getTransportVersion().onOrAfter(TransportVersion.V_8_8_0)) { | ||
shardId.writeTo(out); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since RCS 2.0 is after 8.8, dropping this field for older versions is not an issue at all.
Pinging @elastic/es-security (Team:Security) |
import static org.hamcrest.Matchers.hasSize; | ||
import static org.hamcrest.Matchers.is; | ||
|
||
public class RemoteClusterSecurityCcrFcActionAuthorizationIT extends ESRestTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test class does not extends the abstract RCS base class because it only uses a single cluster. The single cluster is used as the leader and the test uses a TransportService to directly hit the remote server interface. This allows us to test requests and beheviours that should not happen with normal software execution environment, i.e. it simulates potentially malicious behaviours.
if (false == getMetadata().contains(fileName)) { | ||
throw new IllegalArgumentException("invalid file name [" + fileName + "]"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic here as well as the above ensureSessionShardIdConsistency
method are behaviour change for existing CCR as well. They feel safe enough and are likely useful enhancements. We could have a distributed team member to take a look to be sure. Alternatively we could try conditionally disable these checks for RCS 1 leader clusters. But it may not be easy since this service layer does not have knowledge about the cluster connections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards not disabling them conditionally since as you say it would be complicated to do this; besides, these seem like reasonable validation steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you suggest here it actually makes sense (and isn't too challenging) to avoid this for old-style actions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more unrelated sanity check: the filenames are for Lucene-maintained segment info files, with a fixed format of segment_N
; having poked around the code a bit (e.g. this is the relevant ES-side part) I don't think there is a way for someone to maliciously (and manually) add a file with a name that would enable directory traversal, unless perhaps they had access to the file system; in that case though they would not benefit from a directory traversal attack to begin with. Was wondering if this is roughly what you have found as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes basically that. Also, I don't think it is possible to physically add a file with a name that enables directory traversal. The file sytsem won't allow such file name to exist in the first place.
RetentionLeaseActions.Add.ACTION_NAME, | ||
RetentionLeaseActions.Remove.ACTION_NAME, | ||
RetentionLeaseActions.Renew.ACTION_NAME, | ||
"indices:monitor/stats", | ||
"indices:internal/admin/ccr/restore/session/put", | ||
"indices:internal/admin/ccr/restore/session/clear", | ||
"internal:transport/proxy/indices:internal/admin/ccr/restore/session/clear", | ||
"indices:internal/admin/ccr/restore/file_chunk/get", | ||
"internal:transport/proxy/indices:internal/admin/ccr/restore/file_chunk/get" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will have a separate PR for adding new consolidated CCR/CCS privileges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great. I made a first pass today -- only nits, and one sanity check so far. Will review the rest, including test tomorrow.
...src/main/java/org/elasticsearch/xpack/security/action/user/TransportHasPrivilegesAction.java
Outdated
Show resolved
Hide resolved
// Cross cluster access user can perform has privilege check | ||
if (authentication.isCrossClusterAccess() && HasPrivilegesAction.NAME.equals(action)) { | ||
assert request instanceof HasPrivilegesRequest; | ||
return ((Authentication) authentication.getAuthenticatingSubject() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional nit: getRemoteAuthenticationFromMetadata
or getAuthenticationFromCrossClusterAccessMetadata
or something similar could be a nice util to add to Authentication
.
((Authentication) authentication.getAuthenticatingSubject().getMetadata().get(CROSS_CLUSTER_ACCESS_AUTHENTICATION_KEY))
feels pretty long and distracting, esp. because of the cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered that. But I was a bit hesitate to add such method to Authentication
because we don't even have a getMetadata
method for it since metadata
right now is a subject level information. So for now I added getAuthenticationFromCrossClusterAccessMetadata
as a static method to the Authentication class. We can iterate in future to further rationalize how Authentication information should be distributed.
...java/org/elasticsearch/xpack/security/transport/CrossClusterAccessServerTransportFilter.java
Outdated
Show resolved
Hide resolved
...main/java/org/elasticsearch/xpack/security/transport/SecurityServerTransportInterceptor.java
Show resolved
Hide resolved
...main/java/org/elasticsearch/xpack/security/transport/SecurityServerTransportInterceptor.java
Show resolved
Hide resolved
...main/java/org/elasticsearch/xpack/security/transport/SecurityServerTransportInterceptor.java
Outdated
Show resolved
Hide resolved
this.ccrRestoreService = ccrRestoreService; | ||
} | ||
|
||
@Override | ||
protected void doExecute(Task task, ClearCcrRestoreSessionRequest request, ActionListener<ActionResponse.Empty> listener) { | ||
final ShardId shardId = request.getShardId(); | ||
// It can be null if the request is sent from an old node with the internal action | ||
if (shardId != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity check: if shardId
is null
, for example for ClearCcrRestoreSessionRequest
, then authorization is guaranteed to fail for the new-style action (indices:internal/admin/ccr/restore/session/clear
) here:
Lines 149 to 151 in 3bd0032
if (indices == null || indices.length == 0) { | |
throw new IllegalArgumentException("the action " + action + " requires explicit index names, but none were provided"); | |
} |
For the old-style action (internal:admin/ccr/restore/session/clear
) on the other hand, we won't hit the above IAE because authz won't go through the regular authz flow.
Is my understanding correct here? If so, I wonder if it makes sense to fail early for new style actions if shardId
is null
, e.g., by making doExecute
or parts of it abstract, and overriding for InternalTransportAction
and TransportAction
accordingly. After all the action is guaranteed to fail.
Apologies if I'm missing something here. I ran out of time getting to the bottom of this today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you said makes perfect sense! Thanks for the prompt. I was focusing on minimize the changes of different transport actions and forgot about that we could actual leverage the difference. In this case, you are absolutely right that we should fail for null
in the new TransportAction
and only have this leniency in the internal one.
This also prompts me to think twice about this comment. Since we have separate transport actions for different RCS models, we could lift the check up into the transport action so that it does not affect existing model. This is my current preference because conditional check for the ShardId
here would also make the change not affect existing model. It would be more consistent if both ShardId
and fileName
checks either affect the existing model or not at all. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, glad I was onto something. Yup, I'm ++ on lifting the other check to the transport action as well. It's good to minimize the impact on the old model and we can always retrofit the check if we deem it worthwhile in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone through the test code as well. All looks good. Once the adjustments around separating old-style from new-style actions are made, I will take another quick look but from my perspective this is basically ready to ship!
.../src/javaRestTest/java/org/elasticsearch/xpack/remotecluster/RemoteClusterSecurityCcrIT.java
Show resolved
Hide resolved
...est/java/org/elasticsearch/xpack/security/action/user/TransportHasPrivilegesActionTests.java
Outdated
Show resolved
Hide resolved
if (false == getMetadata().contains(fileName)) { | ||
throw new IllegalArgumentException("invalid file name [" + fileName + "]"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you suggest here it actually makes sense (and isn't too challenging) to avoid this for old-style actions.
if (false == getMetadata().contains(fileName)) { | ||
throw new IllegalArgumentException("invalid file name [" + fileName + "]"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more unrelated sanity check: the filenames are for Lucene-maintained segment info files, with a fixed format of segment_N
; having poked around the code a bit (e.g. this is the relevant ES-side part) I don't think there is a way for someone to maliciously (and manually) add a file with a name that would enable directory traversal, unless perhaps they had access to the file system; in that case though they would not benefit from a directory traversal attack to begin with. Was wondering if this is roughly what you have found as well.
…ecurity/transport/SecurityServerTransportInterceptor.java Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>
public void ensureSessionShardIdConsistency(String sessionUUID, ShardId shardId) { | ||
final RestoreSession restore = onGoingRestores.get(sessionUUID); | ||
if (restore == null) { | ||
logger.debug("could not get session [{}] because session not found", sessionUUID); | ||
throw new IllegalArgumentException("session [" + sessionUUID + "] not found"); | ||
} | ||
final ShardId sessionShardId = restore.indexShard.shardId(); | ||
if (false == sessionShardId.equals(shardId)) { | ||
throw new IllegalArgumentException( | ||
"session [" + sessionUUID + "] shardId [" + sessionShardId + "] does not match requested shardId [" + shardId + "]" | ||
); | ||
} | ||
} | ||
|
||
public void ensureFileNameIsKnownToSession(String sessionUUID, String fileName) throws IOException { | ||
final RestoreSession restore = onGoingRestores.get(sessionUUID); | ||
if (restore == null) { | ||
logger.debug("could not get session [{}] because session not found", sessionUUID); | ||
throw new IllegalArgumentException("session [" + sessionUUID + "] not found"); | ||
} | ||
// Ensure no file system traversal is possible by only allow file names known to the restore session | ||
if (false == restore.getMetadata().contains(fileName)) { | ||
throw new IllegalArgumentException("invalid file name [" + fileName + "]"); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether these methods need to be sychronized. I will seek advices from the distributed team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been advised that synchronization are not necessary for these methods since they do not mutate state.
// query cluster error cases - aliases not supported | ||
{ | ||
final Request putCcrRequest = new Request("PUT", "/follower-index-3/_ccr/follow?wait_for_active_shards=1"); | ||
putCcrRequest.setJsonEntity(""" | ||
{ | ||
"remote_cluster": "my_remote_cluster", | ||
"leader_index": "shared-alias" | ||
}"""); | ||
final ResponseException e = expectThrows(ResponseException.class, () -> performRequestWithCcrUser(putCcrRequest)); | ||
assertThat(e.getResponse().getStatusLine().getStatusCode(), equalTo(400)); | ||
assertThat(e.getMessage(), containsString("cannot follow [shared-alias], because it is a ALIAS")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a negative test case specifically for aliases since the indices action has an implicit assumption that the permission must be granted on the concrete indices.
.../java/org/elasticsearch/xpack/ccr/action/repositories/ClearCcrRestoreSessionActionTests.java
Show resolved
Hide resolved
final Engine.IndexCommitRef commitRef = indexShard.acquireSafeIndexCommit(); | ||
final Set<String> fileNames = Set.copyOf(commitRef.getIndexCommit().getFileNames()); | ||
restore = new RestoreSession(sessionUUID, indexShard, commitRef, fileNames, scheduleTimeout(sessionUUID)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing the known list of file names on RestoreSession
creation time so that later checks for known file name is lightweight.
Thanks to @Tim-Brooks for the suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
.../main/java/org/elasticsearch/xpack/ccr/action/repositories/ClearCcrRestoreSessionAction.java
Outdated
Show resolved
Hide resolved
...plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/repository/CcrRestoreSourceService.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>
@elasticmachine run elasticsearch-ci/part-2 |
@elasticmachine run elasticsearch-ci/part-2-fips |
@elasticmachine run elasticsearch-ci/part-2 |
This PR adds CCR support for RCS 2.0 by pairing its internal actions with indices actions which are safe to use across different clusters.