
Fix deserialization of similar token networks #2766

Merged
merged 3 commits into from
Oct 12, 2018
Conversation


@palango palango commented Oct 12, 2018

So that turned out to be quite a bug, much more foundational than I initially thought.

From initial debugging with @CosminNechifor we saw that there was a mismatch in the internal network graph, which is part of the `TokenNetworkGraphState`. However, all attempts to reproduce this failed: the correct state changes were written to the WAL, and replaying the WAL produced the right state.
We finally found that we could reproduce it when starting from a snapshot. From there the problem was easy to track down.

It turns out that there was a problem during deserialization which made multiple `TokenNetworkState` objects reference the same `TokenNetworkGraphState`. This only happened under very specific constraints, so it's not surprising that we didn't find it earlier:
A snapshot needs to be deserialized while there are multiple token networks with the same topology and exactly the same channel participants at the time of the snapshot.
Thanks to the scenario player, @CosminNechifor was able to trigger that.

For the fix: it turns out that the `TokenNetworkGraphState` currently doesn't know which token network it belongs to. We therefore need a breaking change that adds the token network identifier to the `TokenNetworkGraphState`. With this, the objects no longer compare equal, and proper instances for each token network are created during deserialization.

This is the easy and quick fix, but in the future we might want to move the `TokenNetworkGraphState` into the `TokenNetworkState`, or remove it altogether when the PFS gets introduced.

Fixes #2662
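To make the failure mode concrete, here is a minimal, self-contained sketch (not the actual Raiden code; all names are hypothetical) of how an equality-keyed reference cache can hand out one shared graph-state instance for two distinct token networks whose graphs happen to have identical topology:

```python
class GraphState:
    """Hypothetical stand-in for TokenNetworkGraphState: equality is
    based purely on graph topology (sorted edge list), with no notion
    of which token network the graph belongs to."""

    def __init__(self, edges):
        self.edges = edges

    def _to_comparable_graph(self):
        return sorted(sorted(edge) for edge in self.edges)

    def __eq__(self, other):
        return (
            isinstance(other, GraphState) and
            self._to_comparable_graph() == other._to_comparable_graph()
        )

    def __hash__(self):
        return hash(tuple(tuple(edge) for edge in self._to_comparable_graph()))


class RefCache:
    """Hypothetical equality-keyed cache, as a deserializer might use
    to deduplicate object references."""

    def __init__(self):
        self._cache = {}

    def get(self, obj):
        # Returns a previously seen object that compares equal, if any.
        return self._cache.setdefault(obj, obj)


cache = RefCache()
# Two *different* token networks that happen to have identical topology
# and the same channel participants:
graph_a = GraphState([("node1", "node2")])
graph_b = GraphState([("node1", "node2")])

# Because they compare equal, the cache hands back one shared instance,
# so both deserialized token network states end up referencing it:
assert cache.get(graph_a) is cache.get(graph_b)
```

Under these assumptions, any later channel update applied through one token network state silently mutates the graph seen by the other, which matches the mismatch observed above.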


@rakanalh rakanalh left a comment


I think the change makes sense. As I understand it, the RefCache's logic caused this: one instance was created instead of the multiple expected `TokenNetworkGraphState` instances, correct?


palango commented Oct 12, 2018

> As I understand it, the RefCache's logic caused this: one instance was created instead of the multiple expected `TokenNetworkGraphState` instances, correct?

Exactly. This happened easily when running the same scenario multiple times with the same nodes. Then the comparison method would return true, even though there should have been different graph states.

```python
def __eq__(self, other):
    return (
        isinstance(other, TokenNetworkGraphState) and
        self._to_comparable_graph() == other._to_comparable_graph() and
        self.channel_identifier_to_participants == other.channel_identifier_to_participants
    )

def __ne__(self, other):
    return not self.__eq__(other)

def _to_comparable_graph(self):
    return sorted([
        sorted(edge) for edge in self.network.edges()
    ])
```
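The shape of the fix can be sketched as follows: include the token network identifier as a field and make it part of equality, so topologically identical graphs from different token networks no longer compare equal. This is an illustrative sketch, not the merged code; the field name `token_network_id` and the class name are assumptions.

```python
class GraphState:
    """Sketch of the fix: the graph state carries the identifier of the
    token network it belongs to, and that identifier participates in
    equality checks.  Names here are illustrative, not Raiden's."""

    def __init__(self, token_network_id, edges):
        self.token_network_id = token_network_id
        self.edges = edges

    def _to_comparable_graph(self):
        return sorted(sorted(edge) for edge in self.edges)

    def __eq__(self, other):
        return (
            isinstance(other, GraphState) and
            self.token_network_id == other.token_network_id and
            self._to_comparable_graph() == other._to_comparable_graph()
        )
        # __ne__ is derived from __eq__ automatically in Python 3.


# Identical topology, but distinct token networks -> distinct states:
graph_a = GraphState("0xAAA", [("node1", "node2")])
graph_b = GraphState("0xBBB", [("node1", "node2")])
assert graph_a != graph_b
```

With this change an equality-keyed cache no longer conflates the two objects, so deserialization creates a separate graph-state instance per token network.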

@rakanalh rakanalh merged commit 4846952 into raiden-network:master Oct 12, 2018
@palango palango deleted the fix-2662 branch October 12, 2018 10:44

Successfully merging this pull request may close these issues.

Routing problem on one hop transfer between long running nodes