DotNext.Net.Cluster: System.ArgumentOutOfRangeException: Non-negative number required. (Parameter 'length') #244

sakno · 2024-06-25T09:24:56Z

Discussed in #243

^{Originally posted by LarsWithCA June 25, 2024}
Hi @sakno,

Once in a while we get a series of this exception during startup (possibly after restart/power-cycle) hindering the cluster from getting fully up and running:

2024-06-24 11:36:24.9866|ERROR|DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer|Failed to process request from 192.168.100.154:52480|System.ArgumentOutOfRangeException: Non-negative number required. (Parameter 'length')
at System.IO.RandomAccess.SetLength(SafeFileHandle handle, Int64 length)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.UnsealIfNeededAsync(Int64 truncatePosition, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.Partition.cs:line 647
--- End of stack trace from previous location --- 
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.Partition.cs:line 605
at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.cs:line 481
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.cs:line 520
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.cs:line 550
at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/RaftCluster.cs:line 620
at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
at DotNext.Net.Cluster.Consensus.Raft.TransportServices.ConnectionOriented.Server.AppendEntriesAsync(ProtocolStream protocol, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/TransportServices/ConnectionOriented/Server.cs:line 142
at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
at DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer.HandleConnection(Socket remoteClient) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/Tcp/TcpServer.cs:line 134

Our setup:

Linux ARM 32bit (little endian)
.NET8 + DotNext.Net.Cluster 5.7.0
Cluster with 6 nodes
10Hz communication frequency
MemoryBasedStateMachine.Options:
- recordsPerPartition = 50
- BufferSize = 8192 * 64 * 10
- InitialPartitionSize = 50 * 8192 * 10
- CompactionMode = CompactionMode.Sequential
- WriteMode = WriteMode.AutoFlush
- CacheEvictionPolicy = LogEntryCacheEvictionPolicy.OnSnapshot

Here are the values of various arguments/variables/fields inside PersistentState.Table.WriteThroughAsync when the exception happens:

index=1
fileOffset=512
FirstIndex=150
LastIndex=199
PartitionNumber=3
footer.Length=2000
entry.Length=1425
GetMetadataBuffer(index - 1).Length=40
writeAddress=LogEntryMetadata.GetEndOfLogEntry(GetMetadataBuffer(index - 1).Span)=-7426796073504137408

Are there any other values I should try and capture that might help you investigate/solve this?

In the previous version we were running (5.5.0), apart from this exception we also saw "System.ArgumentOutOfRangeException: Specified file length was too large for the file system." in the same area of the code - we might not be seeing that in the newest version.

The text was updated successfully, but these errors were encountered:

sakno · 2024-06-25T09:33:37Z

Related #242.

LarsWithCA · 2024-06-26T13:34:10Z

Hi @sakno

I've attached a zip of the node.state-file of the problematic node202: node.zip

The scenario is as follows: I restart the 6 Raspberry PIs in our cluster and our software including Raft is automatically started on startup. They all start from a clean slate. Here's what made it into their respective log-files (the callstack for the exceptions in node202 has been modified by my re-throwing in order to get various values):

node1:
2024-06-26 12:23:17.1807|WARN|leaderchange|New cluster leader is elected - Leader address 192.168.100.154:4250 - Term 1 - Election timeout 00:00:01.7110000

node149:
2024-06-26 12:23:05.2737|WARN|leaderchange|New cluster leader is elected - Leader address 192.168.100.154:4250 - Term 1 - Election timeout 00:00:02.7880000

node154:
2024-06-26 12:23:05.0097|WARN|DotNext.Net.Cluster.Consensus.Raft.RaftCluster|Cluster member 192.168.100.202:4250 is unavailable
2024-06-26 12:23:05.0954|WARN|DotNext.Net.Cluster.Consensus.Raft.RaftCluster|Cluster member 192.168.100.1:4250 is unavailable
2024-06-26 12:23:05.1880|WARN|leaderchange|New cluster leader is elected - Leader address 192.168.100.154:4250 - Term 1 - Election timeout 00:00:02.0870000
2024-06-26 12:23:05.5372|WARN|DotNext.Net.Cluster.Consensus.Raft.RaftCluster|Replication of 192.168.100.232:4250 is failed. Retry replication from entry 1
2024-06-26 12:23:05.5372|WARN|DotNext.Net.Cluster.Consensus.Raft.RaftCluster|Replication of 192.168.100.31:4250 is failed. Retry replication from entry 1
2024-06-26 12:23:05.5395|WARN|DotNext.Net.Cluster.Consensus.Raft.RaftCluster|Replication of 192.168.100.149:4250 is failed. Retry replication from entry 1

node232:
2024-06-26 12:23:05.3018|WARN|leaderchange|New cluster leader is elected - Leader address 192.168.100.154:4250 - Term 1 - Election timeout 00:00:01.7320000

node286:
2024-06-26 12:23:05.0852|WARN|leaderchange|New cluster leader is elected - Leader address 192.168.100.154:4250 - Term 1 - Election timeout 00:00:01.9890000

node202:
2024-06-26 12:23:15.7147|WARN|leaderchange|New cluster leader is elected - Leader address 192.168.100.154:4250 - Term 1 - Election timeout 00:00:02.4300000
2024-06-26 12:23:16.4185|ERROR|DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer|Failed to process request from 192.168.100.154:38352|System.Exception: ### IsLittleEndian=True 64bit=False footer.Span.Length=2000 footer.Length=2000 index=1 writeAddress=59392164992516328 GetMetadataBuffer(index - 1).Length=40 LogEntryMetadata.GetEndOfLogEntry(GetMetadataBuffer(index - 1).Span)=59392164992516328 entry.Length=1425 fileOffset=512 FirstIndex=50 LastIndex=99 PartitionNumber=1
 ---> System.ArgumentOutOfRangeException: Specified file length was too large for the file system. (Parameter 'value')
   at System.IO.RandomAccess.WriteGatherAtOffset(SafeFileHandle handle, IReadOnlyList`1 buffers, Int64 fileOffset)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.ThreadPoolValueTaskSource.ExecuteInternal()
--- End of stack trace from previous location ---
   at Microsoft.Win32.SafeHandles.SafeFileHandle.ThreadPoolValueTaskSource.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.Partition.cs:line 616
   --- End of inner exception stack trace ---
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.Partition.cs:line 625
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.cs:line 481
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.cs:line 520
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.cs:line 550
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\RaftCluster.cs:line 620
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.TransportServices.ConnectionOriented.Server.AppendEntriesAsync(ProtocolStream protocol, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\TransportServices\ConnectionOriented\Server.cs:line 142
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer.HandleConnection(Socket remoteClient) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\Tcp\TcpServer.cs:line 134
2024-06-26 12:23:16.5328|ERROR|DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer|Failed to process request from 192.168.100.154:38364|System.Exception: ### IsLittleEndian=True 64bit=False footer.Span.Length=2000 footer.Length=2000 index=1 writeAddress=59392164992516328 GetMetadataBuffer(index - 1).Length=40 LogEntryMetadata.GetEndOfLogEntry(GetMetadataBuffer(index - 1).Span)=59392164992516328 entry.Length=1425 fileOffset=512 FirstIndex=50 LastIndex=99 PartitionNumber=1
 ---> System.ArgumentOutOfRangeException: Specified file length was too large for the file system. (Parameter 'value')
   at System.IO.RandomAccess.WriteGatherAtOffset(SafeFileHandle handle, IReadOnlyList`1 buffers, Int64 fileOffset)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.ThreadPoolValueTaskSource.ExecuteInternal()
--- End of stack trace from previous location ---
   at Microsoft.Win32.SafeHandles.SafeFileHandle.ThreadPoolValueTaskSource.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.Partition.cs:line 616
   --- End of inner exception stack trace ---
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.Partition.cs:line 625
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.cs:line 481
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.cs:line 520
   at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\PersistentState.cs:line 550
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\RaftCluster.cs:line 620
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.TransportServices.ConnectionOriented.Server.AppendEntriesAsync(ProtocolStream protocol, CancellationToken token) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\TransportServices\ConnectionOriented\Server.cs:line 142
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer.HandleConnection(Socket remoteClient) in C:\Users\Dell\source\repos\dotNext-master\src\cluster\DotNext.Net.Cluster\Net\Cluster\Consensus\Raft\Tcp\TcpServer.cs:line 134
...

Is this the data you were asking for?

sakno · 2024-06-26T15:20:07Z

node.zip has only node.state but not partition files.

LarsWithCA · 2024-06-26T15:31:26Z

Do you mean these?
node.zip

sakno · 2024-06-26T15:53:09Z

Do you mean these?

Yep, thanks! You've posted WAL in a legacy binary format. But you're trying to open it as a new format which is not binary compatible with the previous one. See Release Notes. There a few ways to fix that:

If you don't need your existing WAL, just erase it on all nodes and start from scratch. A new WAL will use new binary format.
If you prefer to keep data from existing WAL, set UseLegacyBinaryFormat to true as mentioned in release notes.

LarsWithCA · 2024-06-26T17:43:55Z

That surprises me since we store the node.state and partition files in a RAM disk folder, and I know the cluster was both power cycled and rebooted multiple times yesterday.
But thanks a lot for looking into it - I'll clear all the partition files and continue testing tomorrow.
Is there a way for me to see if a WAL is in the legacy binary format or not?

sakno · 2024-06-26T17:48:49Z

Is there a way for me to see if a WAL is in the legacy binary format or not?

Yep, the easiest way is to look at first 512 bytes. In a new format, all of them are zeroes (except first byte, it can be 1 or 0). Old format doesn't have such a header so its first 40 bytes are mostly non-zero.

sakno · 2024-06-26T18:03:55Z

I've published release 5.7.1 with a patch for new binary format that improves reliability of WAL in case of power loss or process crash.

LarsWithCA · 2024-06-27T12:49:29Z

Hi @sakno
I've been testing today with 5.7.1 and I still get these errors sometimes.
On each startup we always clear the folder containing the node.state and partition-files, so there's no chance of leftover files using the legacy binary format. We don't set UseLegacyBinaryFormat to true. Is there any other way the legacy binary format might get enforced by mistake? Could the partition file you looked at be corrupted (instead of being in the legacy binary format)?

sakno · 2024-06-28T09:40:36Z

Could the partition file you looked at be corrupted (instead of being in the legacy binary format)?

Definitely not. You can check it by yourself. Instantiate your WAL class inside of test and open those files with it. For instance, for a log entry with index 0 and partition number 96 I see the following log entry metadata in the debugger:

{DotNext.Net.Cluster.Consensus.Raft.PersistentState.LogEntryMetadata}
    End: 3409
    Id: null
    Length: 1409
    Offset: 2000
    Term: 1
    Timestamp: 638423899776894644

Offset for the first log entry matches to the size of legacy format header (50 entries per partition X 40 bytes metadata = 2000 bytes). Length is meaningful, as well as timestamp (which is 02/01/2024 13:12:57 +00:00).

sakno · 2024-06-28T10:10:11Z

Is there any other way the legacy binary format might get enforced by mistake?

There are two places in the code inside of PersistentState. All of them rely on maxLogEntrySize private field. If possible, you can dump the value of this field.

sakno · 2024-06-28T10:13:23Z

You mentioned ramfs. Is there any chance that the file content shared between deletions of the file? Could you reset Linux page cache or reboot the device?

LarsWithCA · 2024-06-28T11:50:29Z

I'm rebooting all the devices between each test run.

LarsWithCA · 2024-06-28T14:39:47Z

Is there any other way the legacy binary format might get enforced by mistake?

There are two places in the code inside of PersistentState. All of them rely on maxLogEntrySize private field. If possible, you can dump the value of this field.

maxLogEntrySize is always 0

sakno · 2024-06-28T14:59:25Z

Another approach to catch mistake: go to PersistentState.LegacyPartition class and replace its constructor with throw statement to see if there is any caller of it.

LarsWithCA · 2024-06-29T14:38:58Z

Another approach to catch mistake: go to PersistentState.LegacyPartition class and replace its constructor with throw statement to see if there is any caller of it.

I did this and PersistentState.LegacyPartition is not getting used/constructed.

sakno · 2024-06-29T15:56:09Z

New format uses a footer of the same size as header in the old format. Probably, due some bug, footer placed at position 0. I'll try to reproduce this.

LarsWithCA · 2024-06-29T16:10:51Z

OK 👍 let me know if you want me to try out something.

sakno · 2024-06-29T17:44:50Z

I've added some assertions to catch this situation. You can try develop branch. However, I don't see how is it possible.

LarsWithCA · 2024-06-30T12:01:19Z

I've added some assertions to catch this situation. You can try develop branch. However, I don't see how is it possible.

Yours assertions are not hit 😕

sakno · 2024-06-30T19:35:28Z

Table.FlushAndSealAsync is the only place inside of PersistentState class where footer is written. Assertion Debug.Assert(writer.FilePosition > HeaderSize); means that the footer is not written at position 0.

LarsWithCA · 2024-07-01T12:10:26Z

Does it make sense to add the same assertion in Table.WriteThroughAsync (Debug.Assert(writeAddress >= HeaderSize))?
I tried doing that and can see that we try to write on address 0. Is that expected or the bug we're after?

writeAddress=0 HeaderSize=512
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.Partition.cs:line 600
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Partition.WriteAsync[TEntry](TEntry entry, Int64 absoluteIndex, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.Partition.cs:line 245
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 481
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.UnsafeAppendAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 422
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 520
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 550
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.DotNext.IO.Log.IAuditTrail<DotNext.Net.Cluster.Consensus.Raft.IRaftLogEntry>.AppendAndCommitAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 537
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token)
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster.DotNext.Net.Cluster.Consensus.Raft.TransportServices.ILocalMember.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftRaftCluster.DefaultImpl.cs:line 300

Btw, are you counting on the footer memory buffer being initialized with all 0s (in the Table constructor)?

sakno · 2024-07-02T12:27:27Z

Please check dependency version

at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.UnsealIfNeededAsync(Int64 truncatePosition, CancellationToken token) in /_/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.Partition.cs:line 590

Line 590 points to a different location in develop branch:

dotNext/src/cluster/DotNext.Net.Cluster/Net/Cluster/Consensus/Raft/PersistentState.Partition.cs

Lines 590 to 597 in 2edc792

    
           // This is expensive operation in terms of I/O. However, it is needed only when 
        
           // the consumer decided to rewrite the existing log entry, which is rare. 
        
           IsSealed = false; 
        
           await WriteHeaderAsync(token).ConfigureAwait(false); 
        
           RandomAccess.FlushToDisk(Handle); 
        
           // destroy all entries in the tail of partition 
        
           RandomAccess.SetLength(Handle, truncatePosition);

LarsWithCA · 2024-07-02T12:36:26Z

The line numbering is shifted a bit due to my added assertions and catching/throwing exception to get more data.
I've tried using both the latest Develop branch and DotNext.Net.Cluster 5.7.2 nuget package with the same result.

sakno · 2024-07-03T14:09:06Z

I've changed the way how the partition restores metadata table. You can try develop branch. But I was not able to reproduce this.

LarsWithCA · 2024-07-03T14:19:23Z

Thank you 👍 I'll try it out tomorrow

LarsWithCA · 2024-07-04T09:03:02Z

It still happens with the latest changes on the develop branch. Below is the stack trace of Debug.Assert(writeAddress >= HeaderSize) getting hit in WriteThroughAsync.

I see this happening for all nodes receiving a snapshot.
I've attached the raft.log (Debug) for node 192.168.100.1 (crashing after restarting and then receiving snapshot) and node 192.168.100.286 (leader).
Between 2024-07-04 08:34:11.4617 and 08:34:20.5587 the node 192.168.100.1 was restarted.
logs.zip

at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.Partition.cs:line 539
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Table.WriteThroughAsync(CachedLogEntry entry, Int32 index, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.Partition.WriteAsync[TEntry](TEntry entry, Int64 absoluteIndex, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.Partition.cs:line 193
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 483
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendUncachedAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.UnsafeAppendAsync[TEntry](ILogEntryProducer`1 supplier, Int64 startIndex, Boolean skipCommitted, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 423
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 523
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, CancellationToken
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 553
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.AppendAndCommitSlowAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.PersistentState.DotNext.IO.Log.IAuditTrail<DotNext.Net.Cluster.Consensus.Raft.IRaftLogEntry>.AppendAndCommitAsync[TEntry](ILogEntryProducer`1 entries, Int64 startIndex, Boolean skipCommitted, Int64 commitIndex, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftPersistentState.cs:line 540
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token)
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster`1.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.RaftCluster.DotNext.Net.Cluster.Consensus.Raft.TransportServices.ILocalMember.AppendEntriesAsync[TEntry](ClusterMemberId sender, Int64 senderTerm, ILogEntryProducer`1 entries, Int64 prevLogIndex, Int64 prevLogTerm, Int64 commitIndex, IClusterConfiguration config, Boolean applyConfig, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftRaftCluster.DefaultImpl.cs:line 300
at DotNext.Net.Cluster.Consensus.Raft.TransportServices.ConnectionOriented.Server.AppendEntriesAsync(ProtocolStream protocol, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftTransportServicesConnectionOrientedServer.cs:line 142
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at DotNext.Net.Cluster.Consensus.Raft.TransportServices.ConnectionOriented.Server.AppendEntriesAsync(ProtocolStream protocol, CancellationToken token)
at DotNext.Net.Cluster.Consensus.Raft.TransportServices.ConnectionOriented.Server.ProcessRequestAsync(MessageType type, ProtocolStream protocol, CancellationToken token) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftTransportServicesConnectionOrientedServer.cs:line 47
at DotNext.Net.Cluster.Consensus.Raft.Tcp.TcpServer.HandleConnection(Socket remoteClient) in C:UsersDellsourcereposdotNextsrcclusterDotNext.Net.ClusterNetClusterConsensusRaftTcpTcpServer.cs:line 134

sakno · 2024-07-04T17:36:31Z

Finally I've reproduced it. You can execute RegressionIssue244 test from develop branch and see that Debug.Assert(writeAddress >= HeaderSize); assertion failed.

sakno · 2024-07-04T17:49:38Z

This is how it works on production environment. Let's assume that we have three nodes A, B, C in the cluster. A is the leader. AppendAndCommitAsync method is used by the follower to process AppendEntries message:

A sends log entries with indices (4, 5) and commitIndex=2 to B and C. recordsPerPartition=4.
C accepted the replication, but B is unavailable. However, consensus is reached. New commitIndex is 5
On the next heartbeat round, B is now available. A sends (4, 5) and commitIndex=5 to it. As you can see here, index 5 is not aligned to the partition size.

Sequential compaction squashes partitions only when commitIndex reaches the end of the partition. In other words, the index is always aligned to the logical size of partition. However, AppendAndCommitAsync ignores this assumption (not a bug, just behavioral feature I was not aware of, 😄 ) and squashes everything up to index 5 from the example above. Index 5 means relative index 1 in a partition 1. Now we have a snapshot up to that index, but the partition doesn't contain log entry with index 0. When we trying to append a log entry at index 6 (relative index 1), the implementation trying to read previous log entry at index 0 which doesn't exist (physically, its metadata header filled with zeroes). Zero bits gives zero writeAddress.

It's not a bug of a new binary format, it's unexpected behavior of compaction on the follower side that leads to failed assertion. With legacy format it was not a problem, because previous log entry is not needed to calculate the position of a new log entry.

sakno · 2024-07-05T07:32:21Z

Done, please check develop branch.

LarsWithCA · 2024-07-05T09:01:24Z

Hi @sakno
Good to hear and thanks for the explanation!
I just tried it twice with my 6-node cluster, and both times did not go well - a quick glance at the raft log shows the nodes going into an election-spree. Here are the logs - with a restart around 2024-07-05 08:00:00
logs.zip

sakno · 2024-07-05T09:09:28Z

Yeah, I forgot to fix it in one more place, inside of Table.Initialize.

…244)

LarsWithCA · 2024-07-05T11:39:04Z

Not sure if you were done with the fix, but I tried your changes to Table.Initialize and I still get continuous election timeouts.

sakno · 2024-07-08T08:29:54Z

I see elections timeouts locally with RaftNode example as well. I had to fix IsCompactionRequired also to fix a determinism of sequential snapshots. Now it's done.

sakno · 2024-07-08T23:51:19Z

Found more issues with partition sealing. Fixed and pushed to develop.

sakno · 2024-07-09T09:06:01Z

Release 5.7.3 has been published.

sakno · 2024-07-09T18:12:47Z

Reopened for confirmation.

freddyrios · 2024-07-10T06:38:41Z

Hi, thanks, it seems to be gone (tested in the same test system). I think it makes sense to close it now. We will do some longer tests, roll it out to more systems and will let you know if we see it again.

sakno added the bug Something isn't working label Jun 25, 2024

sakno self-assigned this Jun 25, 2024

sakno closed this as completed Jun 27, 2024

sakno reopened this Jun 28, 2024

sakno added a commit that referenced this issue Jun 29, 2024

Added assertions to catch #244

ab3bf96

sakno reopened this Jul 2, 2024

sakno added a commit that referenced this issue Jul 4, 2024

Repro #244

620cde0

sakno added a commit that referenced this issue Jul 5, 2024

Align sequential compaction scope to the partition border (#244)

292ea41

sakno added a commit that referenced this issue Jul 5, 2024

Fill metadata table with stub entries if they are zeroed on the disk (#…

70a8edf

…244)

sakno added a commit that referenced this issue Jul 8, 2024

Fixed consequences of patch for #244

f4f0088

sakno added a commit that referenced this issue Jul 8, 2024

Fixed incorrect index comparison when seal partition (#244)

9deaf12

sakno closed this as completed in 38e8395 Jul 9, 2024

sakno reopened this Jul 9, 2024

sakno closed this as completed Jul 10, 2024

DotNext.Net.Cluster: System.ArgumentOutOfRangeException: Non-negative number required. (Parameter 'length') #244

DotNext.Net.Cluster: System.ArgumentOutOfRangeException: Non-negative number required. (Parameter 'length') #244

Comments

sakno commented Jun 25, 2024

Discussed in #243

sakno commented Jun 25, 2024

LarsWithCA commented Jun 26, 2024

sakno commented Jun 26, 2024

LarsWithCA commented Jun 26, 2024

sakno commented Jun 26, 2024 • edited Loading

LarsWithCA commented Jun 26, 2024

sakno commented Jun 26, 2024 • edited Loading

sakno commented Jun 26, 2024

LarsWithCA commented Jun 27, 2024

sakno commented Jun 28, 2024

sakno commented Jun 28, 2024

sakno commented Jun 28, 2024

LarsWithCA commented Jun 28, 2024

LarsWithCA commented Jun 28, 2024

sakno commented Jun 28, 2024

LarsWithCA commented Jun 29, 2024 • edited Loading

sakno commented Jun 29, 2024

LarsWithCA commented Jun 29, 2024

sakno commented Jun 29, 2024

LarsWithCA commented Jun 30, 2024

sakno commented Jun 30, 2024

LarsWithCA commented Jul 1, 2024 • edited Loading

sakno commented Jul 2, 2024

LarsWithCA commented Jul 2, 2024 • edited Loading

sakno commented Jul 3, 2024

LarsWithCA commented Jul 3, 2024

LarsWithCA commented Jul 4, 2024

sakno commented Jul 4, 2024

sakno commented Jul 4, 2024 • edited Loading

sakno commented Jul 5, 2024

LarsWithCA commented Jul 5, 2024

sakno commented Jul 5, 2024

LarsWithCA commented Jul 5, 2024 • edited Loading

sakno commented Jul 8, 2024

sakno commented Jul 8, 2024

sakno commented Jul 9, 2024

sakno commented Jul 9, 2024

freddyrios commented Jul 10, 2024

sakno commented Jun 26, 2024 •

edited

Loading

sakno commented Jun 26, 2024 •

edited

Loading

LarsWithCA commented Jun 29, 2024 •

edited

Loading

LarsWithCA commented Jul 1, 2024 •

edited

Loading

LarsWithCA commented Jul 2, 2024 •

edited

Loading

sakno commented Jul 4, 2024 •

edited

Loading

LarsWithCA commented Jul 5, 2024 •

edited

Loading