Skip to content

HDDS-9388. OM Ratis Write: Move ACL check and Bucket resolution to preExecute#5694

Merged
smengcl merged 18 commits intoapache:masterfrom
duongkame:HDDS-9388
Jan 4, 2024
Merged

HDDS-9388. OM Ratis Write: Move ACL check and Bucket resolution to preExecute#5694
smengcl merged 18 commits intoapache:masterfrom
duongkame:HDDS-9388

Conversation

@duongkame
Copy link
Contributor

What changes were proposed in this pull request?

ACL checks and bucket soft-link resolution should be moved to preExecute. Today, these checks are done in validateAndUpdateCache and performed in each replica. Those should be done when the leader node receives the operation. Checks should be validated upfront, not when applying the state.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9388

How was this patch tested?

CI.

@duongkame duongkame marked this pull request as ready for review November 28, 2023 23:14
@duongkame duongkame marked this pull request as draft November 29, 2023 01:58
@duongkame duongkame marked this pull request as ready for review December 1, 2023 20:25
@adoroszlai
Copy link
Contributor

Thanks @duongkame for the patch.

Failure in TestOzoneAtRestEncryption.mpuWithLink and TestOzoneFileSystemWithFSO.testFSDeleteLogWarnNoExist seem related, can be reproduced locally.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are calls to checkKeyAcls from some validateAndUpdateCache methods, so far I've found:

  • OMKeyRenameRequestWithFSO
  • S3MultipartUploadCompleteRequest
  • OMRecoverLeaseRequest

The first two also resolve bucket links.

Are these intentional?

@duongkame
Copy link
Contributor Author

There are calls to checkKeyAcls from some validateAndUpdateCache methods, so far I've found:

  • OMKeyRenameRequestWithFSO
  • S3MultipartUploadCompleteRequest
  • OMRecoverLeaseRequest

The first two also resolve bucket links.

Are these intentional?

They're not intentional. I was still cleaning up a few places. Thanks for the detailed look, @adoroszlai .

@duongkame
Copy link
Contributor Author

There's an open point from this PR. Some S3 APIs like MultipartInfoInitiate or MultipartUploadComplete return unnecessary/redundant data like original volume, bucket and key names in their response. With the refactor to move bucket resolution to preExecute, we don't have the original bucket/volume name in applyTransaction to include in the response.
Anyway, those are redundant data and are not used in S3G (only used in tests), I decided to just mark the data @deprecated for compatibility, and removed all the relevant usage in tests.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duongkame Thanks for working over this, this changes may have issue,
in pre-execute, it can not validate the relational things which may be part of previous request as,

  • link bucket validation as link bucket may not be created and pending to be excuted
  • ACL validation as acl change may be pending to be executed

Only static validation related to request can be done, Or some thing from system perspective which normally do not depend on previous possible request

@kerneltime
Copy link
Contributor

@duongkame Thanks for working over this, this changes may have issue, in pre-execute, it can not validate the relational things which may be part of previous request as,

  • link bucket validation as link bucket may not be created and pending to be excuted
  • ACL validation as acl change may be pending to be executed

Only static validation related to request can be done, Or some thing from system perspective which normally do not depend on previous possible request

Concurrent requests can be interleaved in any order. I am not sure this is an issue. Do you have a specific issue in mind?

@errose28
Copy link
Contributor

@sumitagrawl I think you are also describing what I wrote on the Jira here. IMO this is more of a minor change in behavior than an issue. The server is still failing the request based on an ordering either way.

@sumitagrawl
Copy link
Contributor

@sumitagrawl I think you are also describing what I wrote on the Jira here. IMO this is more of a minor change in behavior than an issue. The server is still failing the request based on an ordering either way.

@errose28
yes, 2 cases we are changing,

  1. ACL as already mentioned
  2. linked bucket resolution also

@duongkame
Copy link
Contributor Author

@sumitagrawl I think you are also describing what I wrote on the Jira here. IMO this is more of a minor change in behavior than an issue. The server is still failing the request based on an ordering either way.

@errose28 yes, 2 cases we are changing,

  1. ACL as already mentioned
  2. linked bucket resolution also

@sumitagrawl
I don't see problems with Link resolution in preExecute. E.g. at the time a key-create request reaches the leader node, the request should be rejected if the link is not there. This is different from before when the link creation can happen after a while before the key-creation request reaches each replica. A reverse example is when a link is being removed. Either case, making the decision at the leader makes sense to me.

Yes, there are changes in behaviors the edge cases but they're not necessarily issues.

The same story for moving ACL checks, as @errose28 mentioned.

I think moving validation/pre-processing out of applyTransaction is the right direction. ApplyTransaction should be to apply the state of a request to the state machine.

C.c. some more folks as I need your opinions too. @kerneltime @szetszwo @smengcl.

Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core change lgtm. Thanks @duongkame for this

@smengcl smengcl merged commit cca6782 into apache:master Jan 4, 2024
@smengcl
Copy link
Contributor

smengcl commented Jan 4, 2024

Thanks @duongkame for the patch, and @adoroszlai @sumitagrawl @kerneltime @errose28 for the reviews and comments.

@duongkame
Copy link
Contributor Author

Thanks @smengcl for the reviews and merging it.

generateRequiredEncryptionInfo(keyArgs, newKeyArgs, ozoneManager);
CreateFileRequest.Builder newCreateFileRequest =
createFileRequest.toBuilder().setKeyArgs(newKeyArgs)
createFileRequest.toBuilder().setKeyArgs(resolvedArgs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this causes a regression where encrypted buckets won't correctly have file encryption info in keyargs set. Because encryption info is generated in generateRequiredEncryptionInfo above and put into newKeyArgs.

Credit to @jojochuang for locating this bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duongkame duongkame deleted the HDDS-9388 branch April 12, 2025 00:10
swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025
…lution to preExecute (apache#5694)

Change-Id: Ib4b9f7ec9c0320342485f20405197a1b05e61333

Conflicts: hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneClientMultipartUploadWithFSO.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientWithRatis.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMAllocateBlockRequest.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCreateRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyDeleteRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyRenameRequest.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyRenameRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyRequest.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3InitiateMultipartUploadRequest.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3InitiateMultipartUploadRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3MultipartUploadAbortRequest.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3MultipartUploadCommitPartRequest.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3MultipartUploadCompleteRequest.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/file/TestOMDirectoryCreateRequest.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/file/TestOMDirectoryCreateRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyDeleteRequest.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyDeleteRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyRenameRequestWithFSO.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/response/TestCleanupTableInfo.java
Change-Id: I807c3d08ff63a6ec21023f4dd65ad4801e27e6e7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

Comments