Skip to content

Latest commit

 

History

History
115 lines (73 loc) · 6.92 KB

5_0_5.md

File metadata and controls

115 lines (73 loc) · 6.92 KB

5.0.5

This upgrade has changes to the network Move bytecode, diem-node JSON RPC, and command-line tools.

TL;DR

Submit an upgrade vote with hash, then update diem-node to v5.0.5

# do a lazy vote
txs oracle-upgrade -v -h 3f46ba5768165dfb4503e5ab2474c3bbf81cbf0cf2550f7ff5252b15a5b985e4

# install new binaries
cd ~/libra
git checkout v5.0.5 -f
make bins install
<stop diem-node>
<restart diem-node>

Summary

5.0.5 predominantly includes bugfixes to system policies (Move code) which had diverged from the specification.

The upgrade can be done without halting the network with a Hot Upgrade. A hot network upgrade is required before the new policies are in place. Read about network upgrades.

The upgrade includes an autonomous state migration, which will occur on-the-fly without operator intervention. THIS IS THE FIRST TIME such a state migration attempted on mainnet. While this upgrade was tested extensively in CI, and went through manual QA awith pre-flight-check, THERE IS A NON-ZERO RISK OF NETWORK HALT due to the change in data structures and APIs.

Compatibility: The design is the upgrade is universally compatible (backward and forward).

  • The new 5.0.5 diem-node is backwards compatible with the current network bytecode (5.0.4).

  • The 5.0.5 Move bytecode is backwards compatible with previous diem-node 5.0.4

The stdlib payload hash for voting is:

3f46ba5768165dfb4503e5ab2474c3bbf81cbf0cf2550f7ff5252b15a5b985e4

You can build and check the hash of stdlib from project root with: make stdlib. Read about Hot Upgrade procedures and tools.

Changes

Move Changes
- Carpe underpayment

5.0.5 solves an unfortunate code regression when mainnet was launched and Carpe alpha users are having their Identity Subsidy grossly undercounted, and decreasing each day.

Post-mortem: An investigation showed that the root-cause of the error was that Continuous Integration tests were presenting a false positive of the behavior of the Move code policy. The relevant counters were not getting reset at epoch boundaries in production code, only in test code. This prevented the dev team from catching this pre-flight. Additionally there were not sufficient metrics being collected until the Web Explorer was developed to demonstrate the divergence in the calculations, it was only when a Carpe user presented the conflicting data that the developers behind the Explorer, Carpe, and Move core could identigfy the cause. Many thanks to @mannybothanz who correctly identified the issue, @gnudrew25 who rapidly deployed the analytics, @jamesm who identified the source of the bug, and @0o-de-lally for developing the fixes.

Fixes: This upgrade solves the issue going forward. This upgrade does not correct for historical account balances that are below what was expected. A future upgrade will do this.

PR resolves the underpayment (incorrect resetting of proofs per epoch) that was affecting Carpe miners receiving exponentially less rewards every new epoch.

These changes include modifications to how TowerState collects proof counts, but also requires a state migration. A state migration is necessary for this upgrade, the TowerState struct will be deprecated in favor of TowerCounter. The state migration needs to happen at the start of the epoch (after the upgrade writeset takes place).

#890

- Epochs validator compliant

This upgrade also solves the issue with rate-limiting of validator account creation being miscounted. There was an edge case where end-users which had a Tower height, and later changed their account type to validators were able to onboard users before the appropriate amount of time passed.

The issue was relatively simple, on the epoch boundary the updating of epoch statistics for Tower was happening for all miners, instead of the subset which were validators. This is a reversion, likely due to inclusion of Carpe use cases, and a faulty merge. The tests were also not robust enough for these cases.

The practical matter is that there is a corner case where a miner (not yet a validator) builds a tower for n periods, and then is upgraded to validator (not the workflow we anticipate for validator onboarding) and then instead of epochs_mining_and_validating being 0, it was n.

We take the opportunity to patch this to implement "lazy computation" of resetting the epoch proof count instead of iterating through the entire list of miners, for a relatively small benefit. This change may cause issues with client software (web explorer, Carpe, web monitor) so devs should get familiarized with the changes.

#880

- End-user transfers enabled

As a policy end-users transfers have been enabled since genesis. However an issue with extracting the "withdrawal capability token" was implementing a deprecated policy from the experimental network (where no transfers were possible). The functional tests were giving a false positive because the Test settings use different payment settings to facilitate in writing tests.

This patch now allows for transfers from end-users in an unrestricted manner. Functional tests were updated. CLI tools were already implemented and are correct. Carpe could also introduce this feature.

#887

- Operator balance in validator audit

Prior to entering a validator set, the VM does an audit on the validator settings. One check was causing issues with onboarding, and making prospective validator accounts "bricked" when the operator balance fell below a threshold after repeatedly failing to enter the validator set. This check had little benefit to the network and just caused confusion for onboarding validators. The check is now removed.

Tests
  • Many Move functional tests were added to the test suite for each of the changes above.
  • All continuous integration tests have passed.
Compatibility

The Move framework changes are backwards compatible with diem-node from v5.0.1

Rust changes

Diem Node

The diem-node service also serves JSON-RPC requests. A new method for RPC requests related to TowerState was developed with the change to lazy computation of the current proofs in epoch. Queries to that method will return an error if the node serving responses is not on v5.0.5.

All Carpe fullnodes should update to v5.0.5.

NOTE: If you have not yet upgraded to previous versions (e.g. v5.0.2) there are critical upgrade to diem-node. It's safe to skip 5.0.2 directly to 5.0.5 if that upgrade was not yet done.

Preflight checks on Rex (devnet)

- [x] ol: web monitor starts
- [x] tower: miner tx submit
- [x] txs: set community wallet
- [x] confirm autopay values
- [x] txs: create end user account "eve"
- [x] txs: eve submits miner proof
- [x] epoch change
- [x] txs: send stdlib upgrade tx
- [x] second epoch change after upgrade vote
- [x] stdlib upgrade
- [x] txs: validator onboarding