Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-6213. Pluggable OzoneManager request handling hooks #3104

Merged
merged 36 commits into from
Mar 17, 2022

Conversation

fapifta
Copy link
Contributor

@fapifta fapifta commented Feb 16, 2022

What changes were proposed in this pull request?

During developing different new features, the different protocols have to be kept in a stance where the old client can talk to a new server, and a new client also can talk to an old server.
The latter was solved earlier by exposing the server version in the ServiceInfo, but the prior is a hard thing to deal with.

After a few rounds of discussions we came up with an idea to implement a pluggable pre and post validation system, that can help us to either right away reject a specific query from an old client, or if the response contains something that the client can understand, give us the possibility to re-write the regular response to a form that the old client can understand, or at least give the old client a specific error message that says why the request fails.
Similarly, the system has the possibility to adjust the request coming from the old clients if it is possible, so that the request can be served.

Later on we have identified that there is a specific lifecycle phase, when the cluster is not finalized but already upgraded, certain requests in these cases might need to be right away cancelled, based on some request properties, but just until the new feature is not finalized.

With this, we have created 4 conditions in which a validation can be activated (note that one validator can be activated by multiple conditions, and one condition can trigger multiple validators). These conditions are when a newer client posts a request, when an older client posts a request, when the cluster is in the pre-finalized state, and in case of every query a validator might need to run.

Validators are assigned for request types, and request processing phases as well.
Request types are as defined for the Ozone Manager protocol definition, while processing phases for now are pre and post processing.

A post processor can not be a pre-processor at the same time, and for the different phases the validator method signature is different.
Similarly, a validator can belong to only one request type.
Binding happens at initialization time, when the Ozone Manager RPC server starts up.

The validator methods are discovered based on the newly introduced RequestFeatureValidator annotation. Note that this annotation has a processor added as well, which checks requirements against annotated validation methods at compile time.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6213

How was this patch tested?

JUnit tests added for the system, OM request handler code change is straightforward enough that is the only part for which I think tests are not necessary, let me know if I am wrong, and the current tests does not prove that the integration will work.

@@ -47,6 +47,11 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
<groupId>javax.annotation</groupId>
<artifactId>javax.annotation-api</artifactId>
</dependency>
<dependency>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this one is weird, but without this, the hdds-interface-client module was not compiling, and it was missing a class from this package once I have added our annotation-processing module. I would be glad if someone can explain what is the reason behind the necessity of this dependency definition after adding the ozone-annotation-processing module.

@fapifta fapifta changed the title Request validation Pluggable OzoneManager request handling hooks Feb 16, 2022
@fapifta fapifta changed the title Pluggable OzoneManager request handling hooks HDDS-6213. Pluggable OzoneManager request handling hooks Feb 16, 2022
@errose28
Copy link
Contributor

Thanks for creating this @fapifta. I haven't had a chance to look it over yet, but I'm just wondering how this relates to the client versioning in #3031? Is this a replacement for that PR or is it designed to work together with it?

@kerneltime
Copy link
Contributor

@errose28 this is a more generic framework for adding request pre and post validations. The fields in place for OM and Client are the same this is an alternative way to define and invoke validations.

<execution>
<id>default-compile</id>
<configuration>
<compilerArgument>-proc:none</compilerArgument>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a note as to why we need -proc:none and the sequence of events that should occur when we fresh build the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noting this!, I have extended the note into the comment before the plugin definition about the compiler argument specified here.

* License for the specific language governing permissions and limitations under
* the License.
*/
package org.apache.hadoop.ozone.om.request.validation2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this package for a sample? Not sure why it needs to be there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is here for a test case.
The bulk of the tests are using the simple validation package, but in that we have validators for all kinds of stuff.
This package is created so that when the validator registry is instantiated, we can provide this package to scan for some of the tests, and we can test for cases when there are no validators for specific conditions, or phases or request types. As the discovery discovers classes with an annotation under a classpath url, I can not test for both, hence I created this package where I have just one specific validation.

@fapifta
Copy link
Contributor Author

fapifta commented Feb 22, 2022

Thank you @kerneltime for your notes, I have added answers for you.

In the meantime, I was thinking about how to answer to @errose28 beyond the simple yes, to give some more context I have realized a few questions, for which I am trying to come up with good answers and reasoning, but still I feel we should go over them one by one, and discuss it once more. I am wondering what would be a good platform for that discussion.

The questions:

  1. The condition NEW_CLIENT_REQUESTS, is something which is a bit futuristic, as how on earth an old server will new about a new client, and I was wondering if ever this will be relevant...
  2. I am still wondering if an unconditional validation will ever be a way we want to go under, as that has consequences on how we organize request handling code...
  3. Ordering of the validators is again a good question, will there be ever an ordering required (if we use unconditionally hooked in methods, then certainly it will be required...)
  4. The current structure is also a good question, now we can define a validator in terms of the validation condition, the request type, and by the hook... Request type is tied to the protocol version, and it is really not easy now to add the same validation to two requests.

To give some short answers:

  1. this we can use in rare cases, for example when an incompatible change went through, then this could be an easy way to provide a fix for those whom are having trouble because of that change within an already upgraded cluster and an old one in case those need to communicate from the newer code side. The cost of this is negligible when we look at maintenance or request processing performance.
  2. unconditionals can be used to move out authorization and auditing for example, but if we go down this route then naming of the classes and methods needs to change, as these are not strictly validations of request features...
  3. ordering is easy to add, and inevitable I believe if we implement those unconditionals mentioned in 2.
  4. is a good enforcer of a - I think - better design, as it forces an implementor to implement logic for one request only, and helps us avoid speculative generalism, with that switch-cases and ifs inside a validator method (how much of this is whisful thinking?), also gives a good rationale to put validators into the request processing classes. (Note: read requests' processing have to be moved out to their own classes for this though)

The idea @kerneltime came up with to perhaps move some generic request processing part into validator methods, made me also think further...
If we go with unconditionals and use them for such things like authorization of the request, then we probably should rename a few things:
RequestFeatureValidator -> RequestPreProcessor and RequestPostProcessor (with that we can remove the RequestProcessingPhase enum)
ValidatorRegistry -> RequestHookRegistry (or something like that)
RequestValidations -> RequestHooks (or something like that)

The idea to use this with other protocols than just the ozone client protocol, also brings us to some further changes that are necessary to extend the whole API:

  • move the whole thing to hdds-common
  • we might move the javac annotation processor to hdds/annotations
  • we need to generalize the request type in the annotations (also it would still be nice to validate that somehow, and that is hard to solve due to unknown dependencies, unless we have a generic annotation processor and specific annotation processors defined alongside the protocol implementation we want to hook into)
  • we still need one wrapper around the registry per protocol, as integration with the protocol might varies for different protocols, and request parameters might be different, this has to be examined
  • we have to decide where this would be useful, and we need to introduce client version into the protocols where it is missing.
  • probably more, I still just experimenting with this in my head... also I have no clue if there would be a necessary incompatible change down this route...

Q1 also leads to one more thing, for which I did not do any research, so I just put it here as something we might examine later on: newer client requests condition can be something used on the client side, if there is such an easy way to hook things into the client side of the request, as in that case the client can use this condition to alter requests when talking to older servers... again... this came up just now in my mind, and I am absolutely unsure if we can easily hook on client side, and whether it makes any sense to do this there.
(Note to self: the idea feels like if I found a golden hammer and I started to see nails everywhere, so please argue with the usefuleness of this...)

Last but not least, there was a main concern from @kerneltime, how to ensure these methods will not linger somewhere in the code, and how we can be sure which methods were running during pre/post process phase for a given query. Also how we can make this easy for devs to realize that there are these validations.
One answer is code organization (validators along with request handlers), and possibly an enforcement of that, an other is logging, these concerns are still not addressed in the code.

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @fapifta. Overall I think it is very well designed. I haven't had a chance to review the test code yet, but wanted to add some questions/comments for us to discuss before our meeting this afternoon.

* If you plan to use this, please justify why the validation code should not
* be part of the actual request handling code.
*/
UNCONDITIONAL(r -> true, ctx -> true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need these last two right now. Maybe it would be better to remove them to reduce confusion? I can see the unconditional validator being misused since there isn't really a clearly defined use case. I think most newer client handling is going to happen on the client side based on the service info call, so having it here might be misleading for devs working on something with forward compatibility concerns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is kind of a dilemma for me as well, I have removed these two for now, however I started to look into the possibilities of extending the framework to these use cases on the client side, and what changes might be necessary to use a similar approach there and whether it would be beneficial or not... But in case it will happen we can re-introduce the NEWER_CLIENT_REQUESTS, once we start to think about moving out the authorization or audit for example, we can re-introduce UNCONDITIONAL, and if we change both things, we might forget about them :)

Just to do not loose the tests for these enum values, I have just commented out test code, and the enum values, so we can reintroduce them easier if we want.

}

boolean shouldApply(OMRequest req, ValidationContext ctx) {
return shouldApplyTo.test(req) && shouldApplyIn.test(ctx);
Copy link
Contributor

@errose28 errose28 Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are just ANDing the conditions together, would it be easier to have ValidationConditions take just one predicate, and let the predicate's implementation define what conditions to act on? The current implementation seems to frequently require these true placeholders. I think it would read clearer if ValidationConditions were declared like:

  CLUSTER_NEEDS_FINALIZATION(ctx -> ctx.versionManager().needsFinalization()),
  OLDER_CLIENT_REQUESTS(r -> r.getVersion() < ClientVersions.CURRENT_VERSION),
  NEWER_CLIENT_REQUESTS(r -> r.getVersion() > ClientVersions.CURRENT_VERSION),
  UNCONDITIONAL(r -> true);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As predicates are type parametrized, we can not use just one predicate that processes either a context or a request, but you are right, it would be way more better to have a more simpler approach to express the condition.

How about using an abstract method and overriding it on a per enum value basis like the one I added instead of the old way?

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @fapifta. Minor comments on the tests but this looks good overall.

fapifta and others added 19 commits March 11, 2022 22:31
…lem, remove unnecessary extra compile step from annotation processing module.
This reverts commit 585c0a700e310254c49a0d530be4ccfddd714dfc.
…on itself, with that generalize the validations side logic.
Co-authored-by: Ritesh H Shukla <kerneltime@gmail.com>
…dator's packages to be encapsulated within the tests' package.
@fapifta fapifta force-pushed the request_validation branch 2 times, most recently from 40617af to ce57565 Compare March 11, 2022 21:41
Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updated @fapifta LGTM.

@fapifta
Copy link
Contributor Author

fapifta commented Mar 17, 2022

One of the failing flaky tests is TestOzoneManagerHAMetadataOnly which I tried to run locally, and it ran green, while the other two was ignored earlier, and does not seem to work at all now.

Based on this and the +1 from @errose28 I am merging this one to master.
Thank you for the reviews!

@fapifta fapifta merged commit 74d92c8 into apache:master Mar 17, 2022
@adoroszlai
Copy link
Contributor

Sorry for being late here.

Why was the new annotation added in a completely separate project in dev-support? We already have custom annotations defined in regular submodules used in other ones.

1.dev-support usually contains stuff that is useful but not necessary for the project to work. This is not the case here.
2. It requires duplicate definition of few dependencies and checks (checkstyle, spotbugs, etc.).
3. Versioning and release process is not clear. versions:set needs to be applied separately to this module (see #3317). This may break downstream builds. I would guess the jar also needs to be uploaded to Nexus separately during release.

@fapifta
Copy link
Contributor Author

fapifta commented Apr 20, 2022

Hi @adoroszlai

no worries, you may be on time, before we go to far down this road :)
I am open to any better place for this module... Let me first shed light about why this exists as a separate thing.

Even though we have other annotations, we do not have anything that validates those annotations, and for this annotation, we wanted to ensure some invariants about the methods that are annotated with it, as there are some expectations about the methods where they are used.
This module defines the annotation processor that is used during the compilation of the project, not even the annotation itself. The annotation processor checks the annotated methods during compile time, and issues an error if any of the invariants are violated by the annotated method.

As such, this module has to be compiled before anything else, as compilation of anything else depends on the jar created from this module. If this module is a child of the main project, then we have a circular dependency as this module would depend on the parent while the parent would depend on the module... In order to solve this I chose to separate the module from the main pom, but made it as a dependency of the main pom with that hook it in onto the build. Certain tools does not hook it in seamlessly like version:set, which is a problem, I admit, I do not have any good resolution for.
Spotbugs and Checkstyle dependency is added to cover the module code within CI properly, this is also something I do not have a better resolution for.

Dev-support in the main project dir seemed to be a good idea for one reason, this annotation processor is there to help developers to ensure the methods annotated with the annotation are correct, and do not violate the invariants, and if we want this may be optional for the regular dev cycle and it can be just mandatory in the CI level, but that I think would have just complicated things more. I am open to have any better place to build this. Also I had an other idea to have even a separate project like ozone-sdk or such, where we can put similar things, but for just this one use case that seemed to be an overkill.

Let's discuss this further at anytime, even offline, or in a new JIRA where we can discuss the proper solution... I have a couple of other items that is related to this and related PRs, and I think there has to be some significant follow up work on this system, as we started to learn more about this while we started to use it in EC upgrade compat problems.

@adoroszlai
Copy link
Contributor

Thanks @fapifta for the detailed answer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants