Fixed Akka.Remote.ResendUnfulfillableException: Unable to fulfill resend request since negatively acknowledged payload is no longer in buffer. #3914

Merged

Conversation

Aaronontheweb
Member

closes #3905

@Aaronontheweb
Member Author

Added reproducer - letting it run on build server first so I can get a record of it.

@Aaronontheweb
Member Author

Spec failed, as predicted - going to review the logs

@Aaronontheweb
Member Author

Looks like the issue was having multiple public constructors defined.

@Aaronontheweb
Member Author

Ah, still an issue with the test design:

Cause: System.MissingMethodException: Constructor on type 'Akka.Remote.Tests.MultiNode.TransportFailSpecConfig+TestFailureDetector' not found.
   at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes, StackCrawlMark& stackMark)
   at System.Activator.CreateInstance(Type type, BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture, Object[] activationAttributes)
   at System.Activator.CreateInstance(Type type, Object[] args)
   at Akka.Remote.FailureDetectorLoader.Load(String fqcn, Config config, ActorSystem system)
   at Akka.Remote.Transport.AkkaProtocolManager.CreateTransportFailureDetector()
   at Akka.Remote.Transport.AkkaProtocolManager.CreateOutboundStateActor(Address remoteAddress, TaskCompletionSource`1 statusPromise, Nullable`1 refuseUid)
   at Akka.Remote.Transport.AkkaProtocolManager.Ready(Object message)
   at Akka.Actor.ActorCell.<>c__DisplayClass114_0.<Akka.Actor.IUntypedActorContext.Become>b__0(Object m)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Actor.ActorCell.ReceiveMessage(Object message)
   at Akka.Actor.ActorCell.Invoke(Envelope envelope)
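
The trace above shows a reflective loader failing because no constructor on the test type matched the arguments it was given. A rough, language-neutral sketch of that failure mode follows. It is written in Python rather than C#, and every name in it (`load`, `GoodDetector`, `BadDetector`, the two-argument shape) is hypothetical; the actual argument list that `FailureDetectorLoader.Load` passes is not shown in this thread.

```python
import inspect


def load(cls, *args):
    """Instantiate cls reflectively, failing early if no constructor accepts args."""
    try:
        # Check that the constructor signature can accept exactly these arguments.
        inspect.signature(cls).bind(*args)
    except TypeError as exc:
        raise LookupError(
            f"no constructor on {cls.__name__} accepts {len(args)} argument(s)"
        ) from exc
    return cls(*args)


class GoodDetector:
    # Hypothetical detector whose constructor matches what the loader passes.
    def __init__(self, config, event_stream):
        self.config = config
        self.event_stream = event_stream


class BadDetector:
    # Hypothetical detector with only a parameterless constructor.
    def __init__(self):
        pass


detector = load(GoodDetector, {"threshold": 10}, "event-stream")

try:
    load(BadDetector, {"threshold": 10}, "event-stream")
except LookupError as err:
    print(err)
```

The point of the sketch is only that a type loaded by name has to expose a constructor with the shape the loader expects; a test double that does not will fail at association time rather than at compile time.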

@Aaronontheweb
Member Author

@seungyongshim found it - the reproduction spec works.

[ERROR][9/13/2019 1:57:23 AM][Thread 0023][remoting] Association to [akka.tcp://TransportFailSpecConfig@localhost:1728] with UID [1436389813] is irrecoverably failed. Quarantining address.
Cause: Akka.Pattern.IllegalStateException: Error encountered while processing system message acknowledgement buffer: [Remote message  -> [akka.tcp://TransportFailSpecConfig@localhost:1728/user/subject#2052697585]] ack: ACK[1, 0] ---> Akka.Remote.ResendUnfulfillableException: Unable to fulfill resend request since negatively acknowledged payload is no longer in buffer. The resend states between two systems are compromised and cannot be recovered
   at Akka.Remote.AckedSendBuffer`1.Acknowledge(Ack ack)
   at Akka.Remote.ReliableDeliverySupervisor.<Receiving>b__33_3(Ack ack)
   --- End of inner exception stack trace ---
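
For readers unfamiliar with the failure in the log above, here is a minimal sketch of an acknowledged send buffer, loosely modeled on the idea behind `AckedSendBuffer` and not on its actual code. It is written in Python for brevity; the class, method, and exception names are illustrative assumptions. The sender keeps every unacknowledged message, the receiver reports the highest sequence number it has seen plus any gaps (nacks), and a nack for a message the sender has already discarded cannot be honored.

```python
from dataclasses import dataclass, field


class ResendUnfulfillableError(Exception):
    """Raised when a nacked sequence number is no longer in the buffer."""


@dataclass
class AckedSendBuffer:
    capacity: int
    non_acked: dict = field(default_factory=dict)  # seq -> payload awaiting ack
    max_seq: int = -1

    def buffer(self, seq, payload):
        """Remember a sent message until the peer acknowledges it."""
        if seq <= self.max_seq:
            raise ValueError("sequence numbers must be strictly increasing")
        if len(self.non_acked) >= self.capacity:
            raise OverflowError("send buffer is full")
        self.non_acked[seq] = payload
        self.max_seq = seq

    def acknowledge(self, cumulative_ack, nacks=frozenset()):
        """Drop confirmed messages and return the payloads the peer wants resent."""
        if cumulative_ack > self.max_seq:
            raise ValueError("ack for a sequence number that was never sent")
        missing = [seq for seq in sorted(nacks) if seq not in self.non_acked]
        if missing:
            # The peer asks for something we no longer hold: state is unrecoverable.
            raise ResendUnfulfillableError(f"cannot resend {missing}")
        resend = [self.non_acked[seq] for seq in sorted(nacks)]
        # Keep anything above the cumulative ack, plus the nacked gaps.
        self.non_acked = {
            seq: payload
            for seq, payload in self.non_acked.items()
            if seq > cumulative_ack or seq in nacks
        }
        return resend
```

In this sketch, once the two sides disagree about which sequence numbers are still buffered, the only safe response is to give up on the association, which is consistent with the quarantine reported in the log.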

node2__second__net.txt
node1__first__net.txt

Going to work on implementing the fix and this test should pass without any issues afterwards.

Aaronontheweb marked this pull request as ready for review on September 18, 2019 21:33
@Aaronontheweb
Member Author

This is finished - going to see if CI agrees with me. This PR is a port of akka/akka#23129

@Aaronontheweb
Member Author

Reproducer spec still failed - the other unit test failures appear to be usual racy stuff. Taking a look at it.

@Aaronontheweb
Member Author

No more quarantine, but....

[Node1:first][FAIL] Akka.Remote.Tests.MultiNode.TransportFailSpec.TransportFail_should_reconnect
[Node1:first][FAIL-EXCEPTION] Type: System.NullReferenceException
--> [Node1:first][FAIL-EXCEPTION] Message: Object reference not set to an instance of an object.
--> [Node1:first][FAIL-EXCEPTION] StackTrace:    at Akka.Remote.Tests.MultiNode.TransportFailSpec.<TransportFail_should_reconnect>b__6_2() in D:\a\1\s\src\core\Akka.Remote.Tests.MultiNode\TransportFailSpec.cs:line 131
   at Akka.Remote.Tests.MultiNode.TransportFailSpec.TransportFail_should_reconnect() in D:\a\1\s\src\core\Akka.Remote.Tests.MultiNode\TransportFailSpec.cs:line 115

@Aaronontheweb
Member Author

Ah, ok - that means that the ActorSelection failed to resolve the first time around, so that message may have been sent too early.
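
One common way to avoid acting on a lookup that has not resolved yet is to poll until it returns something or a deadline passes. The sketch below shows that pattern in Python; `await_resolved` and its parameters are hypothetical names for illustration, not part of the Akka.NET TestKit, and the actual fix applied to the spec is not shown in this thread.

```python
import time


def await_resolved(resolve, timeout=10.0, interval=0.1):
    """Call resolve() until it returns a non-None value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = resolve()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("target did not resolve before the deadline")
        time.sleep(interval)


# Usage: a fake lookup that only succeeds on the third attempt.
attempts = []


def flaky_lookup():
    attempts.append(1)
    return "subject-ref" if len(attempts) >= 3 else None


ref = await_resolved(flaky_lookup, timeout=2.0, interval=0.01)
```

Only after the helper returns would the test go on to send its message, so a slow first resolution no longer surfaces as a null reference.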

@Aaronontheweb
Member Author

Looks like there's also a definite failure in the ActorsLeakSpec in Akka.Remote.Tests - going to investigate that as well.

@Aaronontheweb
Member Author

Fixed the final issue - it was an equality-by-value issue with some of the fixes I introduced to the Endpoint actors in Akka.Remote. The reproducer spec and the entire Akka.Remote test suite now pass locally.

Aaronontheweb changed the title from "Working on fixing akka.remote.ResendUnfulfillableException: Unable to fulfill resend request since negatively acknowledged payload is no longer in buffer." to "Fixed Akka.Remote.ResendUnfulfillableException: Unable to fulfill resend request since negatively acknowledged payload is no longer in buffer." on Sep 19, 2019
@Aaronontheweb
Member Author

Looks like there's an issue with the ActorsLeakSpec: as a result of this bugfix we now keep the ReliableEndpointWriter alive by design. Going to take a look at that and see what the issue is.

@Aaronontheweb
Member Author

Found the issue - we were only waiting 5 seconds for the TooLongIdle message to trigger the ReliableDeliverySupervisor shutdown, which takes at least 3 seconds to fire. Bumped the wait to 10 seconds, which matches what the JVM does.

Aaronontheweb merged commit 65b4f55 into akkadotnet:dev on Sep 20, 2019
Aaronontheweb deleted the fix-3905-resendUnfulfillableException branch on September 20, 2019 22:21
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Sep 23, 2019
…end request since negatively acknowledged payload is no longer in buffer. (akkadotnet#3914)

close akkadotnet#3905 - Fixed Akka.Remote.ResendUnfulfillableException: Unable to fulfill resend request since negatively acknowledged payload is no longer in buffer.
Successfully merging this pull request may close these issues.

Found missing porting of Akka.Remote.ResendUnfulfillableException patch from Akka