5.0.6

This upgrade contains patches to mitigate an ongoing network incident. More details here: https://hackmd.io/KY7VklgAR72Bg7ETTQZAxA

These changes do not affect Move code (no changes to system state or policies). They only upgrade diem-node.

TL;DR

Just update the binaries to pick up the patches.


# install new binaries
cd ~/libra
git fetch
git checkout v5.0.6 -f
make bins install
<stop diem-node>
<restart diem-node>

Summary

TBD

Changes

Move Changes

None

Rust Changes
- prevent DiemAccount::1003 txs from remaining in mempool

A one-line change that stops VM transaction verification on admission from accepting transactions from clients that are ahead of the validator's current state. The change prevents transactions that fail with the DiemAccount::1003 PROLOGUE_ESEQUENCE_NUMBER_TOO_NEW error from remaining in the mempool.

This error occurs when a tx has a sequence number newer than the one the validator holds for that account. In other words, the validator is behind in sync compared to the client.
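The sequence number check described above can be sketched roughly as follows. This is a minimal illustration, not the diem-vm code: the `Admission` enum, the `check_sequence_number` function, and the `allow_too_new` parameter name are all assumptions introduced for this example.

```rust
/// Outcome of comparing a transaction's sequence number with the
/// sequence number the validator currently holds for the sender.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Admission {
    /// Sequence numbers match: the tx can be admitted.
    Accept,
    /// The tx is older than the account state: rejected.
    RejectTooOld,
    /// The tx is ahead of the account state and is rejected.
    RejectTooNew,
    /// The tx is ahead of the account state but is kept in the mempool.
    HoldTooNew,
}

/// Hypothetical admission check. `allow_too_new` stands in for the flag
/// mentioned in this note; its real name and location are not confirmed here.
fn check_sequence_number(tx_seq: u64, account_seq: u64, allow_too_new: bool) -> Admission {
    if tx_seq < account_seq {
        Admission::RejectTooOld
    } else if tx_seq == account_seq {
        Admission::Accept
    } else if allow_too_new {
        Admission::HoldTooNew
    } else {
        Admission::RejectTooNew
    }
}

fn main() {
    // The validator is behind: it holds sequence number 5, the client sends 8.
    println!("{:?}", check_sequence_number(8, 5, true)); // prints "HoldTooNew"
    println!("{:?}", check_sequence_number(8, 5, false)); // prints "RejectTooNew"
}
```

With the flag on, a validator that is behind on sync keeps every "too new" tx it receives. With the flag off, those txs are rejected at admission and do not accumulate.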

This change is a possible fix for the flood of DiemAccount::1003 prologue errors during the Dec 16-18 network incident described here: https://hackmd.io/KY7VklgAR72Bg7ETTQZAxA.

It seems txs with prologue errors keep accumulating in the mempool and then keep getting rebroadcast to other nodes that were disconnected (related PR: diem/diem#9437). This seems to contribute to the network halt. But it's still unclear why this would cause transport errors and eventually prevent validators from contacting each other. cc @gregnazario

Tip from user shb (https://discord.com/channels/903339070925721652/916100220855648296/921932021884928030): "One idea for a mitigation: try setting this flag to false (which should tell mempool to drop any txs with new sequence #'s on the floor instead of holding on to them) https://github.com/OLSF/libra/blob/main/language/diem-vm/src/diem_transaction_validator.rs#L91"

#906

- disable mempool resending all txns after disconnect

With this patch, validators no longer resend txns after a disconnect to peers that have already confirmed them. Resending them could cause issues when many nodes restart or get stuck.

This patch already exists upstream in diem, in a newer version (1.4) than 0L's fork (1.3). The upstream PR is diem/diem#9437.
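As a rough model of the behavior described above (not the upstream implementation), the difference between resending everything and resending only unconfirmed txns might look like this. `PeerBroadcastState`, its methods, and the `resend_all` switch are hypothetical names used only for this sketch.

```rust
use std::collections::BTreeSet;

/// Hypothetical per-peer broadcast state; not a Diem type.
struct PeerBroadcastState {
    /// Ids of txns this peer has already confirmed receiving.
    acked: BTreeSet<u64>,
}

impl PeerBroadcastState {
    fn new() -> Self {
        PeerBroadcastState { acked: BTreeSet::new() }
    }

    /// Record that the peer confirmed a txn.
    fn record_ack(&mut self, txn_id: u64) {
        self.acked.insert(txn_id);
    }

    /// Select the txns to send when the peer reconnects.
    /// `resend_all = true` models resending the whole mempool;
    /// `resend_all = false` models skipping txns the peer already confirmed.
    fn on_reconnect(&self, mempool: &[u64], resend_all: bool) -> Vec<u64> {
        mempool
            .iter()
            .copied()
            .filter(|id| resend_all || !self.acked.contains(id))
            .collect()
    }
}

fn main() {
    let mut peer = PeerBroadcastState::new();
    peer.record_ack(1);
    peer.record_ack(2);

    let mempool = [1, 2, 3];
    println!("{:?}", peer.on_reconnect(&mempool, true)); // prints "[1, 2, 3]"
    println!("{:?}", peer.on_reconnect(&mempool, false)); // prints "[3]"
}
```

When many nodes reconnect at once, the first mode multiplies traffic by the size of every mempool, while the second sends only what each peer is missing.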

#905

Tests

- All continuous integration tests passed.
- The diem-node code was tested on production nodes that were both synced and out of sync.

Compatibility

The Move framework changes are backwards compatible with diem-node from v5.0.1

Rust changes

Diem Node

Changes to diem-node are compatible with on-chain state from v5.0.5.