New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure BackoffSupervisorSpec specs are not racy anymore, enable skipped tests. #5014
base: dev
Are you sure you want to change the base?
Conversation
var supervisor = Create(OnStopOptions()); | ||
supervisor.Tell(BackoffSupervisor.GetCurrentChild.Instance); | ||
var c1 = ExpectMsg<BackoffSupervisor.CurrentChild>().Ref; | ||
var supervisor = Create(OnStopOptions().WithManualReset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supervisor props need to be forced to use manual reset to make sure that it doesn't auto reset the internal restart count mid-spec and fails the spec.
public void BackoffSupervisor_must_stop_restarting_the_child_after_reaching_maxNrOfRetries_limit_using_BackOff_OnStop() | ||
{ | ||
var supervisor = Create(OnStopOptions(maxNrOfRetries: 2)); | ||
var supervisor = Create(OnStopOptions(maxNrOfRetries: 2).WithManualReset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supervisor props need to be forced to use manual reset to make sure that it doesn't auto reset the internal restart count mid-spec and fails the spec.
public void BackoffSupervisor_must_stop_restarting_the_child_after_reaching_maxNrOfRetries_limit_using_BackOff_OnFailure() | ||
{ | ||
EventFilter.Exception<TestException>().Expect(3, () => | ||
{ | ||
var supervisor = Create(OnFailureOptions(maxNrOfRetries: 2)); | ||
var supervisor = Create(OnFailureOptions(maxNrOfRetries: 2).WithManualReset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supervisor props need to be forced to use manual reset to make sure that it doesn't auto reset the internal restart count mid-spec and fails the spec.
var supervisor = Create(OnStopOptions().WithManualReset()); | ||
|
||
IActorRef c1 = null; | ||
AwaitCondition(() => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Child might not be immediately available, have to wait until it actually is
ExpectTerminated(c1); | ||
ExpectMsg<Terminated>().ActorRef.Should().Be(c1); | ||
|
||
AwaitAssert(() => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no guarantee that the supervisor will observe the child termination immediately. We know that it should, sometime in the very near future, so we need to await this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Still racy somehow, need to recheck the code |
@@ -372,46 +392,52 @@ IActorRef WaitForChild() | |||
var c1 = ExpectMsg<BackoffSupervisor.CurrentChild>().Ref; | |||
Watch(c1); | |||
c1.Tell(PoisonPill.Instance); | |||
ExpectTerminated(c1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can rarely fail, need to track down the real cause because its not supposed to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can that fail? If c1
processes the PosionPill
it has to die? Is this taking longer than 3 seconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the test received a Terminated message but from another actor, which is really weird.
It only happened once every 1000 test or so iteration, making it hard to debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok - so if we're death watching two actors the easiest way to solve that race condition is to create two different test probes, one to death watch each actor. No overlap that way. Although I expect in this case what makes it hard is that it's two different children from the same parent being watched, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be a solution, will have to explore it a bit more
No description provided.