Server crashes when trying to produce large packets because of buffer overflow #37

pfons · 2016-04-13T00:11:49Z

We found a bug that causes a stack overflow on the leader when a lagging follower tries to recover. The overflow seems to occur within the recursive function “restore_from_log” (Shim.ml), while a very large packet is being constructed and before the leader actually tries to send it.
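
To illustrate the kind of failure described above, here is a minimal OCaml sketch. It is not the code from Shim.ml: the `entry` type and the functions `collect_naive` and `collect_tailrec` are hypothetical names invented for this example. The sketch only assumes that the overflow comes from a recursion whose depth grows with the number of log entries. Whether the naive version actually overflows depends on the OCaml runtime and the configured stack limit.

```ocaml
(* Hypothetical stand-in for a log entry; not the type used in Shim.ml. *)
type entry = { index : int; payload : string }

(* Non-tail-recursive: the cons is performed after the recursive call
   returns, so every pending entry keeps a stack frame alive. Stack use
   grows linearly with the number of entries. *)
let rec collect_naive (from : int) (upto : int) : entry list =
  if from > upto then []
  else { index = from; payload = "op" } :: collect_naive (from + 1) upto

(* Tail-recursive: the list is built back to front in an accumulator that
   lives on the heap, so stack use stays constant. *)
let collect_tailrec (from : int) (upto : int) : entry list =
  let rec go i acc =
    if i < from then acc
    else go (i - 1) ({ index = i; payload = "op" } :: acc)
  in
  go upto []

let () =
  (* Both versions agree on a small range. *)
  assert (collect_naive 1 5 = collect_tailrec 1 5);
  (* Same order of magnitude as the entry count in the report. *)
  let entries = collect_tailrec 1 521_881 in
  Printf.printf "collected %d entries\n" (List.length entries)
```

The accumulator version walks the range from the highest index down, so the resulting list is still in ascending order without a final reversal.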

This problem can be reproduced through the following process:
a) start 3 servers;
b) execute one client request;
c) stop a follower server;
d) execute many client requests (in our tests, at least 521,932 requests);
e) restart the server that was stopped.

Here’s a sample output produced by the leader when it crashes:

   [Term 1] Sending 50 entries to 2 (currently have 521932 entries), commitIndex=521882
   [Term 1] Sending 521881 entries to 3 (currently have 521932 entries), commitIndex=521882
   [Term 1] Received AppendEntriesReply 50 entries true, commitIndex 521883
   Fatal error: exception Stack overflow

palmskog · 2017-05-21T03:29:29Z

This issue was moved to uwplse/verdi-raft#49

palmskog closed this as completed May 21, 2017
