Cluster unstable when running tcp example #84

Closed
sakno opened this issue Nov 9, 2021 Discussed in #83 · 1 comment

Comments

@sakno
Collaborator

sakno commented Nov 9, 2021

Discussed in #83

Originally posted by AntonWerenberg November 8, 2021
I've been experimenting with the Raft node example code for some time, and one issue keeps recurring: instability in the cluster.

The nodes often run fine at first, but sooner or later they go out of sync and fail. One node shows this TCP warning:
"warn: DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer[74022]
Request has timed out"
while the others start running elections but do not reach consensus.

The behaviour can be seen in the attached image.

I'm running with default election timeout settings from the example.

image

I'm not a great programmer, and I'm really having trouble seeing which direction to go to get to the bottom of this.

My main concern right now is whether this could be due to my own setup, for example something not configured correctly.

I measured the broadcast time using the metrics collector. It shows broadcast times of around 3 ms.
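
For context, the Raft paper asks that the broadcast time be roughly an order of magnitude smaller than the election timeout. A minimal sketch of that sanity check follows. The function names and the 150 ms lower election timeout are hypothetical, since the example's actual defaults are not quoted in this thread; only the 3 ms broadcast time comes from the report above.

```python
def election_timeout_margin(broadcast_ms: float, election_timeout_ms: float) -> float:
    """Return how many broadcast round trips fit into one election timeout."""
    if broadcast_ms <= 0 or election_timeout_ms <= 0:
        raise ValueError("both durations must be positive")
    return election_timeout_ms / broadcast_ms


def timing_looks_safe(broadcast_ms: float, election_timeout_ms: float,
                      min_ratio: float = 10.0) -> bool:
    """Apply the rule of thumb: election timeout >= min_ratio * broadcast time."""
    return election_timeout_margin(broadcast_ms, election_timeout_ms) >= min_ratio


if __name__ == "__main__":
    measured_broadcast_ms = 3.0        # value reported in this issue
    assumed_lower_timeout_ms = 150.0   # hypothetical; not taken from the example
    print(election_timeout_margin(measured_broadcast_ms, assumed_lower_timeout_ms))
    print(timing_looks_safe(measured_broadcast_ms, assumed_lower_timeout_ms))
```

Under these assumed numbers the ratio is 50, so a 3 ms broadcast time on its own would not explain repeated failed elections, which points toward something other than steady-state latency.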

@sakno sakno self-assigned this Nov 9, 2021
@sakno sakno added the bug Something isn't working label Nov 9, 2021
@sakno sakno added this to the 4.0 milestone Nov 9, 2021
@sakno sakno added this to Opened in Cluster via automation Nov 9, 2021
@sakno
Collaborator Author

sakno commented Nov 11, 2021

RC1 has been published. A ConnectTimeout configuration option has been added to the TCP transport configuration, and it is now explicitly defined in the Raft example.
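
This thread does not show how ConnectTimeout is implemented in the library. As a generic illustration of the idea (bounding connection establishment separately from the time allowed for a request), here is a small sketch using Python's asyncio. The helper name `connect_with_timeout` is hypothetical and is not part of DotNext.

```python
import asyncio


async def connect_with_timeout(host: str, port: int, connect_timeout: float):
    """Open a TCP connection, giving up if it takes longer than connect_timeout seconds.

    Returns a (reader, writer) pair on success, or None if the attempt
    timed out or the peer could not be reached.
    """
    try:
        return await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout=connect_timeout
        )
    except (asyncio.TimeoutError, OSError):
        # Treat a slow or unreachable peer as "not connected" so the caller
        # can move on instead of blocking behind a dead node.
        return None
```

With a bound like this, a caller contacting several peers can skip an unresponsive one quickly rather than spending its whole request budget waiting for the connection to be established.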

@sakno sakno closed this as completed Nov 11, 2021
Cluster automation moved this from Opened to Closed Nov 11, 2021