issue190 Add a recovery test for Compensate and AfterLRA annotations #240
I cannot find a test that validates that it is possible to register for an AfterLRA notification when the LRA is already cancelling or closing. This requirement is implicit in the spec, but it is probably best to add a sentence that makes it explicit. The behaviour is required because the cancellation or closure of an LRA can take an indefinite amount of time. In contrast, it would never make sense to allow participants to register after the end phase has begun.
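The distinction drawn in this comment can be sketched as a pair of status checks. This is only an illustration: the `Status` enum and both method names are hypothetical and are not taken from the MicroProfile LRA API, and rejecting finished LRAs is an assumption the comment does not address.

```java
// Hypothetical sketch: the enum and both methods are illustrative names,
// not part of the MicroProfile LRA API.
class RegistrationRuleSketch {
    enum Status { ACTIVE, CLOSING, CANCELLING, CLOSED, CANCELLED }

    /** New participants may only enlist before the end phase has begun. */
    static boolean mayEnlistParticipant(Status status) {
        return status == Status.ACTIVE;
    }

    /**
     * AfterLRA listeners may still register while the LRA is ending.
     * Rejecting CLOSED and CANCELLED is an assumption of this sketch;
     * the comment does not say what should happen for finished LRAs.
     */
    static boolean mayRegisterAfterLraListener(Status status) {
        return status == Status.ACTIVE
                || status == Status.CLOSING
                || status == Status.CANCELLING;
    }

    public static void main(String[] args) {
        for (Status status : Status.values()) {
            System.out.println(status
                    + " enlist=" + mayEnlistParticipant(status)
                    + " afterLra=" + mayRegisterAfterLraListener(status));
        }
    }
}
```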
@mmusgrov since that scenario doesn't require a restart of the service, do you mind if I move it to a new issue?
Not at all
--
Michael Musgrove
mmusgrov@redhat.com
JBoss, by Red Hat
The mentioned scenario has been moved to #241. This PR is ready for review.
// wait for the timeout cancellation of the LRA while the service is still down
// Compensate should be attempted to be called while the participant service is down
try {
    Thread.sleep(500);
The timeout is also 500, so this is not according to the spec.
The spec's Javadoc on the timeout says: "When this period has elapsed the LRA becomes eligible for cancellation."
This doesn't mean that the Compensate method is called as soon as the period has elapsed.
So that means we need to adapt the spec: "When this period has elapsed, the LRA is cancelled and the Compensate method (if any) MUST be called immediately."
With the current specification, there is no deterministic way to write a test like this.
I think that makes sense: when the participant or the service starting the LRA specifies a timeout, it is declaring that it will keep the resources needed for compensation for that period (or slightly longer). If we allow compensation for a timeout to happen at some eventual point in the future after the time limit is reached, that information may no longer be present or relevant, and we may end up in invalid states.
Can't we say something like "when the period has elapsed the LRA should make a reasonable effort to only transition to cancelled", rather than mandate a specific action at a specific time? I am very hesitant to describe something as being done at an exact time. Even then, the LRA might check whether it has timed out, pass that check, and then move to complete the LRA, but because of a thread scheduling delay the actual complete call takes place after the timeout has occurred. For example:
Time 1: lra.setTimeout(time3)
Time 2: if (!lraHasTimedOut(lra.getTimeout))
Time 3: // server overload, delay of an undefined period of time before next instruction, this LRA is now in theory timed out
Time 4: lra.complete() // Impossible to prevent
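The timeline above is a check-then-act race. A minimal sketch that reproduces it deterministically, assuming a manually advanced clock in place of real time, is shown below. All names here (`TimeoutRaceSketch`, `hasTimedOut`, `checkThenComplete`) are hypothetical and do not come from the LRA API or from this PR.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names come from the MicroProfile LRA API.
class TimeoutRaceSketch {
    /** Manually advanced clock so that the interleaving is reproducible. */
    static final AtomicLong clock = new AtomicLong(1);

    static boolean hasTimedOut(long deadline) {
        return clock.get() >= deadline;
    }

    /**
     * Checks the deadline and then completes. stallTicks stands in for a
     * scheduling delay between the check and the complete call.
     * Returns the clock value at which complete ran, or -1 if the check
     * found that the deadline had already been reached.
     */
    static long checkThenComplete(long deadline, long stallTicks) {
        if (hasTimedOut(deadline)) {
            return -1;
        }
        clock.addAndGet(stallTicks); // simulated server overload
        return clock.get();          // complete happens here, possibly too late
    }

    public static void main(String[] args) {
        clock.set(2);                              // Time 2: the check is made
        long completedAt = checkThenComplete(3, 2); // deadline is Time 3
        System.out.println("deadline=3, completed at time " + completedAt);
    }
}
```

The check passes at time 2, yet the complete call runs at time 4, after the deadline. No amount of re-checking removes the window between the last check and the call itself.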
There should not be any suggestion that a participant can ever decide to perform its compensation procedure unilaterally and still expect an overall atomic outcome for the transaction.
A resource should guarantee to keep the resources available for compensation until the LRA directs it otherwise.
> when the period has elapsed the LRA should make reasonable effort to only transition to cancelled rather than mandate a specific action at a specific time?
Yes, this is what we currently have in the spec (or how I interpret it; we have no specific time requirements).
But this TCK test mandates specific time requirements. During undeploy, wait and deploy, it assumes the timeout has been completely processed.
We have to accept that when we say 'eventual consistency', 'at a certain point in the future', etc., we basically lose the ability to really test this.
We have already introduced some 'tricks' to trigger the recovery (to make things testable), but maybe we should go in the opposite direction and accept that parts of the spec can't be tested.
@xstefank @mmusgrov the way we are discussing things at the moment isn't working at all. People aren't replying to the correct answer, you can't indicate which question you are replying to, GitHub isn't even showing all comments anymore, etc.
This way, it is a complete waste of my time and everyone else's.
And yes, this thread on the PR isn't a good place for the above remark either, but that is the way we seem to work...
If you want to reply to a specific comment, GitHub has a "Quote Reply" option in the ... pull-down menu.
When GitHub hides comments it adds a blue clickable "Load More" where it has hidden them. Clicking on that will expose them again.
@rdebusscher Part of the problem is that we are fundamentally at odds on how this spec should work, which generates lots of discussion. Are the issues a better place to discuss things? If so, then as soon as it becomes clear on a PR that there are disagreements, we should move back to the issue for further discussion. Or do you have another suggestion?
> Can we make assumption on LraRecoveryService to be instantiable without CDI?
@xstefank I think it is okay for the TCK but I would prefer to avoid reliance on CDI in the spec itself since that constrains which frameworks/containers we can integrate with.
@rdebusscher let's take the discussion on how to better discuss issues to Gitter, please. I agree that a lot of PRs, mine at least, often diverge into different issues.
> Can we make assumption on LraRecoveryService to be instantiable without CDI?
> @xstefank I think it is okay for the TCK but I would prefer to avoid reliance on CDI in the spec itself since that constrains which frameworks/containers we can integrate with.
This is still only the TCK. The original idea was to have it instantiable through the service loader, but I changed it to be only a CDI bean. Now we need to go a step back to something similar to the service loader. I think you actually agree with the original statement (can we assume that LraRecoveryService is instantiable without CDI?).
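For context, a lookup through `java.util.ServiceLoader` that needs no CDI container might look roughly like the sketch below. The nested `RecoveryService` interface and its `triggerRecovery` method are hypothetical stand-ins and do not reproduce the TCK's actual `LraRecoveryService` contract. A real provider would also have to be listed in a `META-INF/services` file named after the interface.

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Hypothetical sketch: the interface below is a stand-in, not the TCK's SPI.
class RecoveryServiceLookupSketch {
    /** Illustrative service contract; the method name is invented. */
    interface RecoveryService {
        void triggerRecovery();
    }

    /**
     * Looks up the first registered provider without any CDI container.
     * Fails fast when the implementation under test did not register one.
     */
    static RecoveryService load() {
        Iterator<RecoveryService> providers =
                ServiceLoader.load(RecoveryService.class).iterator();
        if (!providers.hasNext()) {
            throw new IllegalStateException(
                    "No provider registered for " + RecoveryService.class.getName());
        }
        return providers.next();
    }

    public static void main(String[] args) {
        try {
            load().triggerRecovery();
        } catch (IllegalStateException e) {
            // Expected when no META-INF/services entry is on the classpath.
            System.out.println(e.getMessage());
        }
    }
}
```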
tck/src/main/java/org/eclipse/microprofile/lra/tck/TckRecoveryTests.java
tck/src/main/java/org/eclipse/microprofile/lra/tck/TckTestBase.java
tck/src/main/java/org/eclipse/microprofile/lra/tck/service/LRATestService.java
I made a few inline comments. The main one was:
Can you also include documentation that describes what each test is doing and how it does it?
If you can do that then I will review the rest of the new code.
@mmusgrov ready for review again.
tck/src/main/java/org/eclipse/microprofile/lra/tck/TckRecoveryTests.java
I have nearly finished my review, but this is what I have so far. I will finish off looking at the rest of the code either later today or over the weekend. The PR looks good so far.
tck/src/main/java/org/eclipse/microprofile/lra/tck/participant/api/RecoveryResource.java
tck/src/main/java/org/eclipse/microprofile/lra/tck/service/LRATestService.java
// wait for the timeout cancellation of the LRA while the service is still down
// Compensate should be attempted to be called while the participant service is down
try {
    Thread.sleep(500);
I guess a short delay is probably enough, although a possible implementation could be to mark the LRA as Cancelling when the timer expires and then cancel the LRA on its next scheduled recovery pass, which could be a while. Maybe the safest option is to "trigger a recovery pass".
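One alternative to a fixed `Thread.sleep(500)`, not taken from this PR, is to poll for the expected state with an upper bound. The sketch below assumes the test can express "Compensate has been attempted" as a boolean condition. The `AwaitSketch` class and its `await` method are hypothetical helpers, and polling still cannot guarantee that an implementation acts within any given limit.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: not part of the TCK or of the LRA API.
class AwaitSketch {
    /**
     * Polls the condition until it holds or the limit elapses.
     * Returns true as soon as the condition is observed, false on timeout
     * or if the waiting thread is interrupted.
     */
    static boolean await(BooleanSupplier condition, Duration limit, Duration interval) {
        long deadline = System.nanoTime() + limit.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in condition: becomes true roughly 100 ms after start.
        boolean seen = await(
                () -> System.nanoTime() - start > Duration.ofMillis(100).toNanos(),
                Duration.ofSeconds(2),
                Duration.ofMillis(10));
        System.out.println("condition observed: " + seen);
    }
}
```

A test written this way returns as soon as the state is reached and fails only after the full limit, which keeps the timing assumption in one visible place.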
tck/src/main/java/org/eclipse/microprofile/lra/tck/TckRecoveryTests.java
Signed-off-by: xstefank <xstefank122@gmail.com>
The way we discuss things isn't efficient or useful, and possibly some issues aren't properly covered.
Anyway, I'm just approving this since I can't remove myself as reviewer.
I agree but the tool isn't the best. Do we have suggestions on how to improve it?
You are allowed to abstain. Before you joined the team we asked the broader community for permission to merge PRs.
tck/src/main/java/org/eclipse/microprofile/lra/tck/service/LRATestService.java
tck/src/main/java/org/eclipse/microprofile/lra/tck/participant/api/RecoveryResource.java
import java.net.URL;

@ApplicationScoped
public class LRATestService {
Just a comment to get a link - please ignore.
Signed-off-by: xstefank <xstefank122@gmail.com>
resolves #190