Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-6063. [Multi-Tenant] Use VOLUME_LOCK in read and write requests, and some minor refactoring #3051

Merged
merged 23 commits into from
Mar 7, 2022

Conversation

smengcl
Copy link
Contributor

@smengcl smengcl commented Feb 7, 2022

What changes were proposed in this pull request?

  • We are already using VOLUME_LOCK for all write requests. It has been done in a previous jira. Just need to acquire the read lock in the appropriate read tenant db table requests. Basically just look for all access to these tables:
  // OMMetadataManager
  Table<String, OmDBAccessIdInfo> getTenantAccessIdTable();
  Table<String, OmDBKerberosPrincipalInfo> getPrincipalToAccessIdsTable();
  Table<String, OmDBTenantInfo> getTenantStateTable();
  Table<String, String> getTenantGroupTable();
  Table<String, String> getTenantRoleTable();
  Table<String, String> getTenantPolicyTable();
  • Rename all occurrences of:
tenantName -> tenantId
tenantUsername -> userPrincipal

for consistency.

  • Rename OMAssignUserToTenantRequest -> OMTenantAssignUserAccessIdRequest for consistency
  • Remove success field in tenant requests since they are redundant (the parent OMResponse already has a success flag built-in, no need to have another success flag in the response).

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6063

How was this patch tested?

  • All existing tests should pass (after HDDS-6239).

Change-Id: I358503694ac0356c70a42155dd659ae454237a46
Change-Id: Icf8c71e1b6e3279a1e02f03160808af84221a5e4
Change-Id: I6c6feee24ccedfe9b80a360c97a4bf3884e678a9
Change-Id: If8c70a243b13d7da616197aa11759f6ac1ab7138
Change-Id: I3ca7c10ec2b5c0fe261f98beaa1bdb2b8f855ac7
Change-Id: Idcb99f885f28e50fdc012f52397703a9ecced6b4
Change-Id: I6584f6a8be06778299da9a7e55d2a1a21bf955ed
Change-Id: I16c0f29c614d585e5aba6adf31a13cced3582032
@smengcl
Copy link
Contributor Author

smengcl commented Feb 13, 2022

Conflicts:
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

Change-Id: I4793869d2e6ea9a05e216eb52c7173c46bc65416
Change-Id: I6e77a25a4bccfe6878e438600a18420c98846aa7
@smengcl
Copy link
Contributor Author

smengcl commented Feb 16, 2022

I have resolved conflicts by merging current feature branch into this PR.
This PR is ready for review. Pls take a look.
@avijayanhwx
@prashantpogde
@hanishakoneru

Change-Id: Ic6041395d5d109d2fd3309fe8670b9648b35ade0
Change-Id: I32567c795f9da57b28a3301f9c91994410d09abc
…d -- exception would have been thrown to the client anyway;

(2) Rename all occurrences of:

tenantName -> tenantId
tenantUsername -> userPrincipal
principal -> userPrincipal

for consistency.

(3) Rename `OMAssignUserToTenantRequest` -> `OMTenantAssignUserAccessIdRequest` for consistency.

Change-Id: I47ceebef8352cd90d27fe2b48a876b69cae086fe
@smengcl smengcl changed the title HDDS-6063. [Multi-Tenant] Use VOLUME_LOCK in read and write requests HDDS-6063. [Multi-Tenant] Use VOLUME_LOCK in read and write requests, and some minor refactoring Feb 17, 2022
Change-Id: I564df483c89d8cb9b41939f914c19be3297ee810
Change-Id: I02942c3fdba121f88c63b7f18371a76292380ef4
Copy link
Contributor

@prashantpogde prashantpogde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest Look good to me.

Change-Id: Iafa9dea8713311962be923fefa5970c899ce8368
Change-Id: I26c8f00a732d806ea416c063c159cf49b0109945
Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes @smengcl. I think we also need to tweak OzoneManager#getS3VolumeContext after our decision to only use volume locks.
- We can't actually get a volume lock here since we don't know the volume until the end of the operations.
- We can remove the TODO and fix the code so that tenantInfo is used to get the s3 volume, instead of a redundant query to the DB.
- If tenantInfo is null, we should throw the OMException like it is, but not log an error since this could happen in a legitimate case.

@@ -374,10 +374,15 @@ public void deactivateUser(String accessID)
}

@Override
public boolean isTenantAdmin(String user, String tenantName) {
public boolean isTenantAdmin(String user, String tenantId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of problematic method stubs and unused methods in this class, and this is probably one of the worst examples. We must be sure to purge anything like this stub before the merge.

For admin checks specifically, it's a bit of a mess. At the top level, we have this method in OMMultiTenantManager, and OMTenantRequestHelper#checkTenantAdmin. There's one in the OzoneManager as well that OMTenantRequestHelper#checkTenantAdmin delegates to, but I don't think that belongs there because it does not use any OM instance fields and is only called from outside the OM.

As it is I'm not sure the difference between OMMultiTenantManager and OMTenantRequestHelper. OMTenantRequestHelper is not useful as a static utility class because it is called from requests which can already access OMMultiTenantManager through their OM instance. I think the two should be consolidated to reduce code duplication, bugs, and confusion. With these admin checks, for example, we have all three of those.

I think a follow up Jira is necessary to consolidate these two classes and clean them out so they only contain methods that are both needed and correctly implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Ideally OMMultiTenantManager and OMTenantRequestHelper should be merged. While writing the Request code earlier it was easier to extract and add my own helper methods rather than calling OMMultiTenantManager methods. It is due for a refactoring. @prashantpogde and I can probably work together on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed HDDS-6387 for this. In the mean time, can we fix this particular case now since we don't want it to get lost in the merge? Probably easiest to have the one caller use the properly implemented version in OMTenantRequestHelper and make this stub return false for now to be safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(being) fixed in #3177

Change-Id: I61fb43308482dcc7bbf7800a642d9ee83228b955
Change-Id: I23cd751800b3346b7bcf45440ac33e396ce48f06
Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @smengcl. Not sure if you saw my comment here but we should fix/update OzoneManager#getS3VolumeContext in this PR as well.

Change-Id: Idba3a5c45db129684d594e5908eb5427eddbe6f8
@smengcl
Copy link
Contributor Author

smengcl commented Mar 2, 2022

Thanks for the updates @smengcl. Not sure if you saw my comment here but we should fix/update OzoneManager#getS3VolumeContext in this PR as well.

Yup. Sorry I missed that earlier somehow.

I have pushed a commit to address this.

It seems we can only protect OzoneAclUtils.accessIdToUserPrincipal with the volume lock, as volume name is acquired really late in getS3VolumeContext.

Change-Id: I36f5035f4478c8086bf071e4a4a307e997ceefb2
Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more comments on the new getS3VolumeContext changes, the rest looks good thanks @smengcl.

Change-Id: I36e98a56301957c820843be614d1d32ac488ef41
Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for continuously updating this PR @smengcl. Current CI failure looks unrelated and the changes LGTM.

@smengcl
Copy link
Contributor Author

smengcl commented Mar 7, 2022

Thanks a lot for the review! @errose28 @prashantpogde

CI passed on my fork's branch https://github.com/smengcl/hadoop-ozone/commits/HDDS-6063

Will merge shortly.

@smengcl smengcl merged commit 5231830 into apache:HDDS-4944 Mar 7, 2022
errose28 added a commit to errose28/ozone that referenced this pull request Mar 30, 2022
* HDDS-4944: (268 commits)
  HDDS-6366. [Multi-Tenant] Disallow specifying custom accessId in OzoneManager (apache#3166)
  HDDS-6275. [Multi-Tenant] Add feature documentation and CLI quick start guide (apache#3095)
  HDDS-6063. [Multi-Tenant] Use VOLUME_LOCK in read and write requests, and some minor refactoring (apache#3051)
  HDDS-6214. [Multi-Tenant] Fix KMS Encryption/Decryption (apache#3010)
  HDDS-6322. Fix Recon getting inccorrect sequenceNumber from OM (apache#3090)
  HDDS-5913. Avoid integer overflow when setting dfs.container.ratis.lo… (apache#2785)
  HDDS-6313. Remove replicas in ContainerStateMap when a container is deleted (apache#3086)
  HDDS-6186. Selective checks: skip integration check for unit test changes (apache#3061)
  HDDS-6310. Update json-smart to 2.4.7. (apache#3080)
  HDDS-6190. Cleanup unnecessary datanode id path checks. (apache#2993)
  HDDS-6304. Add enforcer to make sure ozone.version equals project.version (apache#3075)
  HDDS-6309. Update ozone-runner version to 20220212-1 (apache#3079)
  HDDS-6293. Allow using custom ozone-runner image (apache#3072)
  HDDS-4126. Freon key generator should support >2GB files. (apache#3054)
  HDDS-6088. Implement O3FS/OFS getFileChecksum() using file checksum helpers - addendum: fix checkstyle
  HDDS-6088. Implement O3FS/OFS getFileChecksum() using file checksum helpers. (apache#2935)
  HDDS-6084. [Multi-Tenant] Handle upgrades to version supporting S3 multi-tenancy (apache#3018)
  HDDS-6257. Wrong stack trace for S3 errors (apache#3066)
  HDDS-6278 Improve memory profile for listStatus API call. (apache#3053)
  HDDS-6285. ozonesecure-mr intermittently failing with timeout downloading packages (apache#3057)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants