HDDS-8879. Cleanup SecurityConfig and related class initialization #4921
…ilabe for further code organization
…security configuration like token enablement, security enablement amongst others.
…and make everything security related rely on SecurityConfig purely.
…r DN does kerberos login.
…he service user before initializing certificate client.
Thanks @fapifta for the cleanup, LGTM.
Thank you @fapifta for working on this. LGTM +1
@@ -472,31 +470,6 @@ public static SCMSecurityProtocolClientSideTranslatorPB getScmSecurityClient(
ugi == null ? UserGroupInformation.getCurrentUser() : ugi));
}

public static SCMSecurityProtocolClientSideTranslatorPB
@fapifta, I don't see an obvious benefit in moving getScmSecurityClientWithFixedDuration from HddsServerUtil.java to HASecurityUtils; could you explain it a bit? Changing this function's location causes a lot of changes in DefaultCertificateClient and HddsDatanodeService. If there is no obvious benefit, I would strongly suggest keeping the function in this file, so that all other service modules can call it; as it stands now, only the scm module can.
Everything else looks good to me.
This method was only referenced from HASecurityUtils#getRootCASignedSCMCert, hence the move; I also made it private in HASecurityUtils.
I guess you are referring to getScmSecurityClientWithMaxRetry, which changes everywhere we create a CertificateClient, because creating the ScmSecurityClient this way has moved out of the CertificateClient. Injecting the ScmSecurityClient via the constructor means we do not have to worry about initialization order in tests, we do not need a setter just for testing, and we can easily inject a mock for the communication via the constructor. (Actually we have to, which makes the test writer think about this instead of assuming that server communication will not be called from the test flow.) Also, each service can separately specify retries, timeouts and such, which can be a benefit in the future, though I admit it is not really a reason now.
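To illustrate the constructor-injection argument above, here is a minimal sketch. All type and method names in it (ScmSecurityClientSketch, InjectedCertificateClient, RecordingScmSecurityClient, signCertificate) are hypothetical stand-ins and not the actual Ozone classes; the only point is that a test must hand in its own client and can observe every call made to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the SCM security protocol client (not the real Ozone interface).
interface ScmSecurityClientSketch {
  String signCertificate(String csr);
}

// Hypothetical, simplified certificate client: the SCM client is supplied by the caller,
// so there is no hidden creation step and no test-only setter.
class InjectedCertificateClient {
  private final ScmSecurityClientSketch scmClient;

  InjectedCertificateClient(ScmSecurityClientSketch scmClient) {
    this.scmClient = Objects.requireNonNull(scmClient, "scmClient");
  }

  String requestCertificate(String csr) {
    // Every server interaction goes through the injected client.
    return scmClient.signCertificate(csr);
  }
}

// A hand-written fake for tests: records what was sent and returns a canned answer.
class RecordingScmSecurityClient implements ScmSecurityClientSketch {
  final List<String> received = new ArrayList<>();

  @Override
  public String signCertificate(String csr) {
    received.add(csr);
    return "signed:" + csr;
  }
}
```

In a test, one would construct InjectedCertificateClient with a RecordingScmSecurityClient and then assert on the recorded requests, which makes any unexpected server communication visible.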
@fapifta, yeah, I mean getScmSecurityClientWithMaxRetry, not getScmSecurityClientWithFixedDuration. To let every service have different SCM client retries and timeouts, we can just override the getScmSecureClient() function; I just did that in SCMCertificateClient :)
I think it would be better to handle the SCM client creation inside the certificate client, instead of having to create the SCM client every time before instantiating a certificate client, so that we avoid a lot of duplicated code.
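The override-based alternative described in this comment could look roughly like the sketch below. The names (RetryingScmClient, FactoryMethodCertificateClient, ScmServiceCertificateClient, createScmClient) and the retry values are assumptions made up for illustration; they are not the real getScmSecureClient() implementation or any Ozone defaults.

```java
// Hypothetical stand-in for an SCM client that exposes its retry setting.
interface RetryingScmClient {
  int maxRetries();
}

// Hypothetical base class: the certificate client creates its own SCM client
// through an overridable factory method.
class FactoryMethodCertificateClient {
  private RetryingScmClient scmClient;

  // Default creation; the value 3 is an arbitrary example, not an Ozone default.
  protected RetryingScmClient createScmClient() {
    return () -> 3;
  }

  // Lazily create and cache the client so the overridable method
  // is never called from the constructor.
  final RetryingScmClient scmClient() {
    if (scmClient == null) {
      scmClient = createScmClient();
    }
    return scmClient;
  }
}

// A service-specific subclass overrides only the creation step.
class ScmServiceCertificateClient extends FactoryMethodCertificateClient {
  @Override
  protected RetryingScmClient createScmClient() {
    return () -> 10; // arbitrary example value
  }
}
```

The trade-off against constructor injection is that the creation logic, and whatever configuration it needs, stays inside the certificate client hierarchy.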
I have looked into how we could bring the SCMSecurityProtocolClient creation back into the CertificateClient, and then I realized the main reason why it has to stay outside, besides the other advantages I noted before.
It uses Hadoop RPC, which requires the full Configuration and not just the SecurityConfig. So we would need the full Configuration in the CertificateClient code again, and that would mean the bulk of this PR is not really meaningful anymore.
Actually, the creation code is not duplicated: we call the same utility method everywhere to create the client, and then provide it to the CertificateClient instance.
I have played around with the idea of having the SCMClientConfig in the constructor and creating the client based on it, but that is also somewhat tedious, as either we take the Configuration and convert it internally, or we do something different...
One idea I can imagine working is to take a ConfigurationSource in the DefaultCertificateClient constructor instead of SecurityConfig, then create the SecurityConfig and the proper SCMClientConfig inside DefaultCertificateClient and use these further down the line. I am unsure, but I can play around with this idea if it sounds more feasible. Should I?
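A rough sketch of that last idea, purely for illustration: a plain Map stands in for ConfigurationSource, and the class names and configuration keys (example.security.enabled, example.scm.client.max.retries) are invented for this example and are not real Ozone keys or classes.

```java
import java.util.Map;

// Hypothetical security view of the configuration.
final class SecurityConfigSketch {
  final boolean securityEnabled;

  SecurityConfigSketch(Map<String, String> source) {
    // "example.security.enabled" is a made-up key for this sketch.
    this.securityEnabled =
        Boolean.parseBoolean(source.getOrDefault("example.security.enabled", "false"));
  }
}

// Hypothetical SCM client view of the same configuration.
final class ScmClientConfigSketch {
  final int maxRetries;

  ScmClientConfigSketch(Map<String, String> source) {
    // "example.scm.client.max.retries" is a made-up key for this sketch.
    this.maxRetries =
        Integer.parseInt(source.getOrDefault("example.scm.client.max.retries", "3"));
  }
}

// The certificate client takes the generic source once, derives the two typed
// configs from it, and does not keep a reference to the full source afterwards.
class ConfigDrivenCertificateClient {
  private final SecurityConfigSketch securityConfig;
  private final ScmClientConfigSketch scmClientConfig;

  ConfigDrivenCertificateClient(Map<String, String> source) {
    this.securityConfig = new SecurityConfigSketch(source);
    this.scmClientConfig = new ScmClientConfigSketch(source);
  }

  boolean isSecurityEnabled() {
    return securityConfig.securityEnabled;
  }

  int scmClientMaxRetries() {
    return scmClientConfig.maxRetries;
  }
}
```

Note that this variant still passes the full configuration into the certificate client's constructor, which is the dependency the PR is trying to remove.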
Now I understand the reason. We can get ConfigurationSource from SecurityConfig.getConfiguration(). So the DefaultCertificateClient constructor can just use the SecurityConfig.
"We can get ConfigurationSource from SecurityConfig.getConfiguration()."
After this change we cannot; this method is being removed.
Okay, then I guess this solution is the best fit, since we want to remove the usage of ConfigurationSource.
Thanks @fapifta for all the effort. I will merge it.
* master: (96 commits)
  HDDS-8586 Recon. - API for Count of deletePending keys and amount of data mapped to such keys. (apache#4923)
  HDDS-8908. Intermittent failure in TestBlockDeletion#testBlockDeletion (apache#4958)
  HDDS-8910. Replace LockManager with striped lock in ContainerStateManager (apache#4962)
  HDDS-8917. Move protobuf conversion out of the lock in PipelineStateManagerImpl (apache#4965)
  HDDS-8825. Use apache/hadoop 3.3.5 docker image (apache#4963)
  HDDS-8906. Avoid stream when getting in-service healthy nodes (apache#4960)
  HDDS-8907. Store volume count when storage report is updated (apache#4957)
  HDDS-8905. PipelineManager metrics should not be synchronized (apache#4959)
  HDDS-8553. Improve scanner integration tests. (apache#4936)
  HDDS-8854. Avoid unnecessary DatanodeDetails creation for NodeStateManager lookup (apache#4925)
  HDDS-8315. [Snapshot] Added unit tests for SnapshotDiffManager (apache#4716)
  HDDS-7968. [Snapshot] Improve KeyDeletingService to reclaim eligible key blocks in snapshot's deletedTable (apache#4935)
  HDDS-8838. Update default datanode check empty containter on disk to false (apache#4937)
  HDDS-8763. Support RocksDB iterator with ByteBuffer. (apache#4942)
  HDDS-8543. FSO directory should reflect bucket/cluster default replication (apache#4947)
  HDDS-8898. Replication limit should not be less than reconstruction weight (apache#4954)
  HDDS-8739. Snapdiff should return complete absolute path in Diff Entry (apache#4823)
  HDDS-8908. Mark TestBlockDeletion#testBlockDeletion as flaky
  HDDS-8534. Support asynchronous service logging (apache#4663)
  HDDS-8879. Cleanup SecurityConfig and related class initialization (apache#4921)
  ...
What changes were proposed in this pull request?
This is a fairly large refactor mainly due to moving files around in different modules/packages.
Besides all these moves the following significant changes were made:
Why would I like to change all of these?
This is one of the steps to simplify the CertificateClient and related codebase and to reduce its external dependencies. The final aim is to reach a stage where we can separate four pieces of code: the code responsible for creating the certificate sign request for a service and sending it to the SCM, the code responsible for storing the certificate and key material of a particular service, the code responsible for gathering the rootCA certificate from SCM, and the code that provides the means to set up the different types of secure connections.
Also, by centralizing all the security-related configuration and its handling, we have a single pane of glass to see which configurations we use and which rules we apply to these configuration values.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-8879
How was this patch tested?
Existing JUnit and integration tests.