Question about Quick Benchmark Results #205
Hi @Steamgjk, as you can see in the benchmark program, the client and the leader run in the same process. More precisely, there is no separate client: the benchmark program itself is both client and server and invokes the Raft API directly, in order to measure pure Raft performance: NuRaft/tests/bench/raft_bench.cxx, lines 303 to 304 in f004f4c
Hence, there is no network cost between client and leader, and each replication can be done within a single RTT.
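To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the NuRaft benchmark; the function name and parameters are hypothetical, and it assumes the leader contacts followers in parallel, so reaching a majority costs one round trip. The 180 microsecond RTT is the figure quoted in the question.

```python
def commit_latency_us(rtt_us, client_is_remote, processing_us=0.0):
    """Rough commit latency for one Raft request, in microseconds.

    Hypothetical model: followers are contacted in parallel, so the
    leader -> follower -> leader exchange costs a single RTT.
    """
    # Leader -> follower -> leader: one round trip to collect a majority ack.
    replication_us = rtt_us
    # Client -> leader -> client: a second round trip, only for a remote client.
    client_hop_us = rtt_us if client_is_remote else 0.0
    return replication_us + client_hop_us + processing_us


# Client and leader in the same process, as in the benchmark: one RTT.
print(commit_latency_us(180.0, client_is_remote=False))
# Separate client machine: two RTTs.
print(commit_latency_us(180.0, client_is_remote=True))
```

Under these assumptions the in-process case sits at one RTT plus processing time, which is consistent with a median close to the RTT itself, while a remote client would add a second round trip.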
Hi, @greensky00
Hi @Steamgjk
Hi, @greensky00. Do you think it is bounded by network bandwidth or by something else? I am using n1-standard-32 VMs as replicas, and the bandwidth is 32 Gbps. That seems quite large and should be sufficient. So I think it may be more reasonable to attribute the bottleneck to CPU, because replicas need to serialize, deserialize, and process more messages when there are more replicas. What do you think is a convincing explanation: CPU, bandwidth, or something else?
@Steamgjk Most likely the bottleneck comes from serialization. As I already mentioned in the other comment (#207 (comment)), it is not random, independent data broadcasting. Replication should be strictly and globally ordered, which means log
I agree. Serialization/deserialization should be the bottleneck.
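As an illustration of what "strictly and globally ordered" implies, the sketch below applies the textbook Raft commit rule: the leader commits the highest log index that a majority of nodes has stored. This is a simplification (the current-term check is omitted), it is not NuRaft's implementation, and the function and parameter names are hypothetical.

```python
def commit_index(leader_last_index, follower_match_indexes):
    """Highest log index stored on a majority of nodes (leader included).

    Simplified textbook rule; the current-term check is omitted.
    """
    # Count the leader's own log alongside the followers' acknowledged indexes.
    indexes = sorted([leader_last_index, *follower_match_indexes], reverse=True)
    majority = len(indexes) // 2 + 1
    # The majority-th largest index is held by at least `majority` nodes.
    return indexes[majority - 1]


# Three nodes: leader at 10, followers acknowledged up to 9 and 7.
print(commit_index(10, [9, 7]))
# Five nodes: leader at 10, followers at 10, 8, 5, and 3.
print(commit_index(10, [10, 8, 5, 3]))
```

Because the commit point only moves forward along a single log, entries cannot be acknowledged independently of one another, unlike unordered broadcast where each message could complete on its own.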
Hi, I am a bit curious about the latency results in https://github.com/eBay/NuRaft/blob/master/docs/bench_results.md
The network RTT is about 180 microseconds. Raft needs two RTTs for one request to be committed (Client->Leader->Follower->Leader->Client). In that case, the median latency should be much larger than 180 microseconds, so why are the two almost the same (the median is 187 microseconds)?