Cluster.Tools.Client.ClusterReceptionist: An entry with the same key already exists. #2535

Closed
cgstevens opened this issue Mar 1, 2017 · 6 comments

@cgstevens
commented Mar 1, 2017

Version 1.3
After my member calls .Terminate, shuts down, and then comes back up, ALL the other members log the following error when it rejoins the cluster.

[akka://mysystem/system/receptionist] - An entry with the same key already exists.

System.ArgumentException: An entry with the same key already exists.
at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
at System.Collections.Generic.TreeSet`1.AddIfNotPresent(T item)
at System.Collections.Generic.SortedDictionary`2.Add(TKey key, TValue value)
at Akka.Routing.ConsistentHash.Create[T](IEnumerable`1 nodes, Int32 virtualNodesFactor)
at Akka.Cluster.Tools.Client.ClusterReceptionist.Receive(Object message)
at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
at Akka.Actor.ActorCell.ReceiveMessage(Object message)
at Akka.Actor.ActorCell.Invoke(Envelope envelope)

@Horusiath

Contributor

commented Mar 6, 2017

@cgstevens as I understand it, shutting down a node gets acknowledged by the other nodes by marking it as down? Or does it remain unreachable, waiting for reconnection to happen?

It looks like either we forgot to remove an entry somewhere when a node is downed, or we should use the indexer setter instead of the Add method on SortedDictionary.

@cgstevens

Author

commented Mar 9, 2017

@Horusiath I don't have that answer, as the logs didn't say when running at Info level.
I am currently setting up a new Dev environment and have not had a chance to confirm this yet.

@alexvaluyskiy alexvaluyskiy modified the milestones: 1.2.0, 1.3.0 Apr 6, 2017

@alexvaluyskiy alexvaluyskiy removed this from the 1.3.0 milestone Jul 18, 2017

@mwpro

Contributor

commented Mar 6, 2019

I think I've encountered a similar issue. I'm running an Akka.NET cluster on Kubernetes with Cluster Receptionist, and a second application running Cluster Client. After I deploy a new version of the cluster application (it's important that, since we're using K8s, the nodes are started on random IP/port), the cluster receptionist sometimes returns to the client contact points for old nodes that have already left the cluster.

I've tried to reproduce the issue locally by adding some extra logging to the ClusterReceptionist code, and I've found that the Receptionist handles the MemberRemoved event correctly, but the ImmutableSortedSet with the RingOrder comparer used underneath does not always handle removing nodes correctly.

Issue reproduction:

Akka.NET version: 1.3.11 and dev branch

Platform: Happens both in a Windows dev environment and in a Docker container running on Kubernetes.

@Aaronontheweb

Member

commented Mar 18, 2019

@mwpro thanks for reporting this - we'll take a closer look at it.

@Aaronontheweb Aaronontheweb self-assigned this Mar 18, 2019

@Aaronontheweb Aaronontheweb added this to the 1.4.0 milestone Mar 18, 2019

Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Jul 24, 2019

@Aaronontheweb

Member

commented Jul 25, 2019

Looks like the issue is here:

public int Compare(Address x, Address y)
{
    var ha = HashFor(x);
    var hb = HashFor(y);
    if (ha == hb) return 0;
    return ha < hb || Member.AddressOrdering.Compare(x, y) < 0 ? -1 : 1;
}

This Compare function is, for lack of a better word, useless: it does not define a consistent ordering. The basic problem is that if the hash of addr1 is lower than the hash of addr2, the first half of the condition decides the result. But if I reverse the call, the second half decides it: the Member.AddressOrdering comparison. There's no guarantee that the hash values and the member addresses sort in the same order, so Compare(x, y) and Compare(y, x) can both return -1.
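To make the inconsistency concrete, here is a minimal sketch that copies the shape of the comparer above onto a stand-in type. It is not the Akka.NET code: strings replace `Address`, a lookup table replaces `HashFor`, and `string.CompareOrdinal` replaces `Member.AddressOrdering`; the name `BrokenRingOrder` is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the ring comparer: strings replace Address,
// and a lookup table replaces HashFor so the hashes are fully controlled.
public sealed class BrokenRingOrder : IComparer<string>
{
    private readonly IReadOnlyDictionary<string, int> _hashes;

    public BrokenRingOrder(IReadOnlyDictionary<string, int> hashes) => _hashes = hashes;

    public int Compare(string x, string y)
    {
        var ha = _hashes[x];
        var hb = _hashes[y];
        if (ha == hb) return 0;
        // Same shape as the original: either condition alone yields -1.
        return ha < hb || string.CompareOrdinal(x, y) < 0 ? -1 : 1;
    }
}

public static class Program
{
    public static void Main()
    {
        // "a" sorts before "b" as a string, but has the larger hash.
        var hashes = new Dictionary<string, int> { ["a"] = 20, ["b"] = 10 };
        var cmp = new BrokenRingOrder(hashes);

        // The hash comparison fails, but the string ordering says "less".
        Console.WriteLine(cmp.Compare("a", "b")); // -1
        // The hash comparison alone says "less".
        Console.WriteLine(cmp.Compare("b", "a")); // -1
    }
}
```

Both calls return -1, so each element claims to sort before the other. A sorted collection built on such a comparer can lose track of where an item lives, which fits the failed removals and duplicate entries reported in this thread.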

I rewrote the method to use the following code:

public int Compare(Address x, Address y)
{
    var ha = HashFor(x);
    var hb = HashFor(y);

    if (ha == hb) return Member.AddressOrdering.Compare(x, y);
    return ha.CompareTo(hb);
}

The Member.AddressOrdering only gets called in the rare event that the hashes are equal. Otherwise, we simply return the comparison values of the hashes themselves. All of the tests written by @mwpro now pass as a result.
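As a rough illustration of why this version behaves, the sketch below applies the same logic to a stand-in type and uses it in a `SortedSet<T>`. The assumptions are mine, not the library's: strings replace `Address`, a lookup table replaces `HashFor`, `string.CompareOrdinal` replaces `Member.AddressOrdering`, and `FixedRingOrder` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the rewritten comparer: order by hash first,
// and fall back to the natural ordering only when the hashes collide.
public sealed class FixedRingOrder : IComparer<string>
{
    private readonly IReadOnlyDictionary<string, int> _hashes;

    public FixedRingOrder(IReadOnlyDictionary<string, int> hashes) => _hashes = hashes;

    public int Compare(string x, string y)
    {
        var ha = _hashes[x];
        var hb = _hashes[y];
        if (ha == hb) return string.CompareOrdinal(x, y);
        return ha.CompareTo(hb);
    }
}

public static class Program
{
    public static void Main()
    {
        // "b" and "c" deliberately share a hash to exercise the tie-break.
        var hashes = new Dictionary<string, int> { ["a"] = 20, ["b"] = 10, ["c"] = 10 };
        var cmp = new FixedRingOrder(hashes);

        // Reversing the arguments now reverses the sign.
        Console.WriteLine(Math.Sign(cmp.Compare("a", "b"))); // 1
        Console.WriteLine(Math.Sign(cmp.Compare("b", "a"))); // -1

        var ring = new SortedSet<string>(new[] { "a", "b", "c" }, cmp);
        Console.WriteLine(string.Join(",", ring)); // b,c,a
        Console.WriteLine(ring.Remove("b"));       // True
        Console.WriteLine(string.Join(",", ring)); // c,a
    }
}
```

Because ties are broken by a second total ordering instead of returning 0, two distinct items with colliding hashes stay distinct in the set, and removal finds the item it is asked to remove.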

@Aaronontheweb

Member

commented Jul 25, 2019

Just wanted to point something out:

It looks like either we forgot to remove an entry somewhere when a node is downed, or we should use the indexer setter instead of the Add method on SortedDictionary.

This would have made the problem worse - the reason why there's an error in the first place inside the ConsistentHash.Create method is because we passed in a range of duplicate items from a SortedSet<T> as a result of the bug in the IComparer - papering over that with a hack would have made the underlying problem more difficult to find.
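For readers unfamiliar with the difference being discussed, this small sketch shows the general `SortedDictionary<TKey, TValue>` behavior only; the keys and node names are made up, and it does not reproduce `ConsistentHash.Create`.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    public static void Main()
    {
        // Illustrative keys and values only; not the real hash ring contents.
        var ring = new SortedDictionary<int, string>();
        ring.Add(42, "node-a");

        try
        {
            // Add refuses a key that is already present.
            ring.Add(42, "node-b");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Add threw: " + ex.GetType().Name); // Add threw: ArgumentException
        }

        // The indexer setter overwrites silently, hiding the duplicate.
        ring[42] = "node-b";
        Console.WriteLine(ring[42]);   // node-b
        Console.WriteLine(ring.Count); // 1
    }
}
```

The exception from `Add` is what surfaced the duplicate input here; with the indexer setter the duplicates would have been overwritten silently and the broken comparer would have stayed hidden.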

Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Jul 25, 2019

Aaronontheweb added a commit that referenced this issue Jul 26, 2019

ClusterClient fixes (#3866)
Fixed broken `IComparer` for ClusterClient hash ring and ported over other handoff fixes.

Close #2535
Close #2312
Close #3840

* implemented akka/akka#24167

* implemented akka/akka#22992

Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Jul 26, 2019

ClusterClient fixes (akkadotnet#3866)
Fixed broken `IComparer` for ClusterClient hash ring and ported over other handoff fixes.

Close akkadotnet#2535
Close akkadotnet#2312
Close akkadotnet#3840

* implemented akka/akka#24167

* implemented akka/akka#22992
5 participants