Update online_weight_quorum default to 67 #3052

Merged
merged 2 commits into from
Jan 5, 2021


SergiySW
Contributor

PR #2884 was originally intended to include changes to the default quorum value from 50% to 67%. This update includes this change.

Previously the online weight minimum of 60M was applied as the floor for the number of votes required for quorum, but this caused the actual quorum % required to increase as online weight dropped towards that 60M minimum instead of staying consistent. With #2884 the quorum % is consistent regardless of online weight, so the more online weight there is, the higher the vote amount required. To provide security similar to what the network had prior to #2884, this update sets a 67% default value for quorum.

Given the current online weight of 92M on the main network, this 67% requirement equates to ~61.6M votes required, just above the previous 60M, so little impact on confirmation times is expected.
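
The ~61.6M figure follows from simple percentage arithmetic. A minimal sketch, assuming amounts are expressed in whole Nano rather than the node's 128-bit raw units; `quorum_weight` is a hypothetical helper name, not a function from the node:

```cpp
#include <cassert>
#include <cstdint>

// Amounts in whole Nano for readability (assumption; the node works in raw units).
using amount = std::uint64_t;

// Hypothetical helper: vote weight required for quorum at a given percentage.
amount quorum_weight (amount online_weight, unsigned quorum_percent)
{
    assert (quorum_percent <= 100);
    // Divide before multiplying so the intermediate value cannot overflow;
    // this truncates when online_weight is not a multiple of 100.
    return (online_weight / 100) * quorum_percent;
}
```

With 92M online, `quorum_weight (92000000, 67)` gives 61,640,000, while the previous 50% default would give 46,000,000.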

@SergiySW SergiySW added quality improvements This item indicates the need for or supplies changes that improve maintainability configuration default change Change to default configuration values beta test cases labels Nov 17, 2020
@SergiySW SergiySW added this to the V22.0 milestone Nov 17, 2020
@SergiySW SergiySW added this to Pending testing evaluation in V22.0DB10 via automation Nov 17, 2020
@SergiySW SergiySW self-assigned this Nov 17, 2020
@zhyatt zhyatt moved this from Pending testing evaluation to Pending targeted testing in V22.0DB10 Nov 18, 2020
@Srayman
Contributor

Srayman commented Dec 7, 2020

Adding some notes from discussions on Discord in #protocol.

Are there any concerns about the lower floor/minimum that results from this change along with #2884? Prior to either of these PRs the network would stall below 60M weight, whereas now it seems that it would stall below 40.2M weight. If an attacker controlled 40.2M, could they send two forks to different nodes, along with votes for each, and cause the forks to be confirmed on different nodes? Previously it would require a minimum of 60M weight to confirm, so this reduces the weight required to confirm a block if the online weight drops to 60M or less.

I think it's certainly debatable what should happen in that circumstance: should the protocol adapt automatically, or should a change in the node be required if the weight drops that low?

In the short term, should the minimum online weight be increased to 90M, which would result in a stall happening at 60M again instead of 40.2M, to keep that consistent?

The concern is reducing security in favor of liveness as the weight drops below 90M compared to what has historically been in place.
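
The 40.2M and ~60M figures in this comment can be reproduced with a short sketch. It assumes, based only on the numbers quoted here, that the quorum requirement is 67% of whichever is larger: the observed online weight or the configured online_weight_minimum. `stall_threshold` is a hypothetical name, and amounts are in whole Nano:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Amounts in whole Nano for readability (assumption).
using amount = std::uint64_t;

// Hypothetical helper: the vote weight below which confirmation stalls,
// assuming quorum is taken as a percentage of max(online weight, minimum).
amount stall_threshold (amount online_weight, amount online_weight_minimum, unsigned quorum_percent)
{
    assert (quorum_percent <= 100);
    amount basis = std::max (online_weight, online_weight_minimum);
    return (basis / 100) * quorum_percent;
}
```

Under that assumption, an online weight of 50M with a 60M minimum stalls below 40,200,000, and raising the minimum to 90M moves the stall point to 60,300,000, which is roughly the historical 60M floor.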

@keerifox

online_weight_minimum
Online weight minimum required to confirm a block.

is it possible to keep this description accurate? if the value across every PR has to be set to 90M to stall at 60M, and is effectively 40.2M when it's set to 60M, that introduces a lot of confusion

argakiig pushed a commit that referenced this pull request Dec 18, 2020
@argakiig argakiig merged commit 7af41fb into nanocurrency:develop Jan 5, 2021
@clemahieu
Contributor

clemahieu commented Jan 8, 2021

> Adding some notes from discussions on Discord in #protocol.
>
> Are there any concerns about the lower floor/minimum that results from this change along with #2884? Prior to either of these PRs the network would stall below 60M weight, whereas now it seems that it would stall below 40.2M weight. If an attacker controlled 40.2M, could they send two forks to different nodes, along with votes for each, and cause the forks to be confirmed on different nodes? Previously it would require a minimum of 60M weight to confirm, so this reduces the weight required to confirm a block if the online weight drops to 60M or less.
>
> I think it's certainly debatable what should happen in that circumstance: should the protocol adapt automatically, or should a change in the node be required if the weight drops that low?
>
> In the short term, should the minimum online weight be increased to 90M, which would result in a stall happening at 60M again instead of 40.2M, to keep that consistent?
>
> The concern is reducing security in favor of liveness as the weight drops below 90M compared to what has historically been in place.

Most of the discussion, I think, comes from the purpose of the online_weight_minimum term. Internally the node tracks the network's total online weight. The online_weight_minimum setting is a failsafe intended to be the lowest online weight that is expected. It would come into play if, over a long period of time, the actual online voting weight from reps approached this value.

The confirmation quorum requirement of 2/3, tolerating 1/3 byzantine actors, now fully scales with changes in calculated online weight. Setting the quorum too high eats into the "stall budget", where a byzantine actor can withhold their votes to stall the network. Setting it too low eats into the "byzantine budget", where the network is less tolerant of byzantine actors.

Prior to these changes online_weight_minimum was also being used as a quorum delta. As online network weight was trending down, this was eating into our stall budget. Now it will always approximate the optimal value going forward.
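
The trade-off between the two budgets can be illustrated with textbook quorum-intersection arithmetic. This is a simplified sketch, not the node's actual model (the node compares the gap between the top two tallies against a delta). Both function names are hypothetical, and weights are expressed as percentages of online weight:

```cpp
#include <cassert>

// Largest percentage of weight that can be withheld without stalling.
// Withholding more than this leaves less than quorum_percent available to vote.
int stall_budget_percent (int quorum_percent)
{
    assert (quorum_percent >= 50 && quorum_percent <= 100);
    return 100 - quorum_percent;
}

// Minimum percentage of weight that would have to vote on both sides for two
// conflicting quorums to form, since two quorums of size q out of 100 must
// overlap by at least 2q - 100.
int byzantine_budget_percent (int quorum_percent)
{
    assert (quorum_percent >= 50 && quorum_percent <= 100);
    return 2 * quorum_percent - 100;
}
```

At 67% the two budgets are nearly balanced (33 and 34), whereas at 50% the stall budget is 50 but the overlap needed for conflicting quorums drops to 0 in this simplified model.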

@Vyryn

Vyryn commented Jan 8, 2021

Thanks for the discussion Colin.
My concern is that removing the 60M minimum like this increases the potential for irreconcilable network forks. My understanding is that if a node cements any block that didn't win the gossip vote on the absolute majority of nodes, that node can never be reconciled with the main network, potentially leading to double spends and lost funds. I understand that the only reasons forks arise are either programming error or malicious action, but if those funds are then able to be spent it still creates a nasty situation.
If two fragments of the network were disconnected from each other with the hard 60M minimum, the smaller network fragment would stall in such a situation. Now, the smaller fragment will stall for up to two weeks (and the larger fragment may stall as well, for less time), but the moving average will make the new reduced peer weight acceptable. So if an attacker were to fragment a node from the network for an extended period of time, with the removal of the 60M minimum it is now conceivable that any honest node, when thus disconnected, can be co-opted by a malicious node with a 33% higher stake into accepting a fork cemented block. Furthermore, co-opted peers can potentially even be 'used' to magnify the co-opted weight by repeating this attack on a new target with the previously co-opted peer connected. Nodes will never undo their cemented transactions, so if the network is reconnected, it could perpetually stall if a large enough minority of peers were co-opted. I won't explore that in further detail, but I will sketch out a potential attack.

Here is a proposed attack for a malicious actor Bob, who doesn't even need a substantial vote weight:
Bob, who happens to share the first letter of his name with a different large actor on the nano network, performs an impressive traffic-intercepting or DoS-based attack wherein he partitions the network into two parts, Primary and Fragment.
Bob is connected to Primary, consisting of 80M weight, and continues to vote and validate transactions as normal to them. He makes one change to his node: he will no longer gossip about the nodes in Fragment to Primary or about the nodes in Primary to Fragment. To Primary, a potentially substantial but relatively small number of nodes have unexpectedly gone offline. It poses no apparent threat to the network security.
Fragment, meanwhile, is formed of a relatively small number of nodes whose vote weight is 45M and whose connections to the rest of the network have been cut off by Bob and remain cut off for an extended period of time. The most plausible such scenario is perhaps the China firewall disconnecting from the rest of the internet, but other such attacks could potentially accomplish the same thing. To these nodes, the large loss of voting weight is potentially, but not imminently, dangerous. If the 60M minimum threshold applied, they would cease voting until being reconnected to the rest of the network. With this change, they will cease voting for some time until the trended weight is reduced to 45M. I believe this will take six days, as (100*6 + 40*8)/14 * 0.67 < 45M. However, this time can be reduced if Bob holds a significant stake, and reduced further if each node in Fragment can be further separated from its peers; a 1M weight node which only sees itself would have its trended online weight reduced to 45M after less than 5 days.

After the trended online vote weight of Fragment has been reduced to Fragment's weight, Bob reconnects to Fragment, while remaining connected to Primary. Primary sees no change and a secure network. Fragment now has 45M vote weight, sufficient to resume verifying and cementing transactions. Bob now spends a potentially large sum on Fragment, broadcasting to Fragment transactions he hasn't broadcast on Primary. These can be confirmed and cemented. Then Bob broadcasts different transactions, from the same accounts, to Primary, and has them confirmed and cemented. Bob was able to earn twice whatever nano stake he acquired, potentially depositing at exchanges and cashing out. Furthermore, when the networks are allowed to interact once again, both versions of his transactions are cemented which will presumably cause issues for those funds down the road.

Here's a summary.
With 60M minimum:
Partition A -> Unaffected.
Partition B -> Stalled for 2 weeks. No double spends, no forks.

Without 60M minimum:
Partition A -> spent and cemented one version of Bob's spends.
Partition B -> Stalled for 6 days. Spent and cemented the other version of Bob's spends.
Network now must somehow reconcile two different cemented versions, in addition to suffering Bob's theft.

These are difficult and unlikely attacks, to be sure, but they were impossible with the 60M minimum, and their consequences seem grave. As such they should be strongly defended against. The removal of the 60M minimum marginally helps liveness, but it also appears to immediately open new attack vectors that were not previously possible. Surely security should take priority over liveness in all situations where the liveness benefits haven't been demonstrated to be substantial and significant?
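
The six-day estimate in the attack description can be checked numerically. This sketch assumes the expression in the comment reads (100*6 + 40*8)/14 * 0.67, with weights in millions, and that the trended weight is a plain mean over a 14-day window; the node's actual trending calculation may differ. `trended_quorum_millions` is a hypothetical name:

```cpp
#include <cassert>

// Hypothetical helper: quorum requirement (in millions) when the trended weight
// is modelled as a day-weighted mean of two weight levels over one window.
double trended_quorum_millions (double weight_a, int days_a, double weight_b, int days_b, double quorum_fraction)
{
    int window = days_a + days_b;
    assert (window > 0);
    // Day-weighted mean of the two weight levels over the whole window.
    double trended = (weight_a * days_a + weight_b * days_b) / window;
    return trended * quorum_fraction;
}
```

Calling `trended_quorum_millions (100.0, 6, 40.0, 8, 0.67)` yields roughly 44.0, which is below the 45M held by Fragment in the scenario.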

@qwahzi
Collaborator

qwahzi commented Jan 8, 2021

@Vyryn Maybe I'm misunderstanding your comment, Colin's comment, or the code changes, but I believe that the 60M online_weight_minimum still exists, so your 80M vs 45M partition scenario can't happen, even after two weeks. The minority partition won't confirm transactions:

https://github.com/nanocurrency/nano-node/blob/develop/nano/node/online_reps.cpp

And if the malicious entity is operating in both partitions, then the quorum requirement is still the max of "2 week trended online weight" or the "online peers' stake" times .67

EDIT:

@Srayman corrected me on Discord, and it looks like Vyryn's right. Per have_quorum in election.cpp, online_weight_minimum would not get checked in this change, so theoretically you could have an 80M and 45M partition (after 2 weeks trended) and achieve confirmation in both partitions (though the non-communicating partition requirement + 2 week timeframe would admittedly make this extremely unlikely).

Proposed develop branch:

bool nano::election::have_quorum (nano::tally_t const & tally_a) const
{
    auto i (tally_a.begin ());
    ++i;
    auto second (i != tally_a.end () ? i->first : 0);
    auto delta_l (node.online_reps.delta ());
    release_assert (tally_a.begin ()->first >= second);
    bool result{ (tally_a.begin ()->first - second) >= delta_l };
    return result;
}

Current V21 branch

bool nano::election::have_quorum (nano::tally_t const & tally_a, nano::uint128_t tally_sum) const
{
    bool result = false;
    if (tally_sum >= node.config.online_weight_minimum.number ())
    {
        auto i (tally_a.begin ());
        ++i;
        auto second (i != tally_a.end () ? i->first : 0);
        auto delta_l (node.delta ());
        result = tally_a.begin ()->first > (second + delta_l);
    }
    return result;
}

While manually setting the online_weight_minimum number to online would effectively re-enable a v21-style online_weight_minimum, I have to agree with Vyryn that this seems like a change that would slightly prioritize liveness over security by default, and given Nano's deterministic finality with no easy way to re-merge partitions, I personally would prefer the default behavior be to stall.
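
The difference between the two have_quorum versions quoted above can be seen in a simplified, self-contained sketch. The types and function names here are hypothetical stand-ins: the tally is a plain vector of weights sorted in descending order, amounts are in whole Nano, and the delta is passed in rather than read from the node. The same delta is used for both versions purely to isolate the effect of the minimum check:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using amount = std::uint64_t;
// Tally weights sorted in descending order; stand-in for nano::tally_t.
using tally = std::vector<amount>;

// Mirrors the proposed develop-branch logic: no online_weight_minimum check.
bool have_quorum_proposed (tally const & tally_a, amount delta)
{
    assert (!tally_a.empty ());
    amount first = tally_a[0];
    amount second = tally_a.size () > 1 ? tally_a[1] : 0;
    assert (first >= second);
    return (first - second) >= delta;
}

// Mirrors the V21 logic: the total tally must first reach online_weight_minimum.
bool have_quorum_v21 (tally const & tally_a, amount tally_sum, amount delta, amount online_weight_minimum)
{
    if (tally_sum < online_weight_minimum)
    {
        return false;
    }
    assert (!tally_a.empty ());
    amount first = tally_a[0];
    amount second = tally_a.size () > 1 ? tally_a[1] : 0;
    return first > (second + delta);
}
```

For a 45M partition whose trended weight has fallen to 45M (delta of 30,150,000 at 67%), the proposed check reports quorum for a unanimous 45M tally, while the V21-style check rejects it because 45M is below the 60M minimum.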

@Vyryn

Vyryn commented Jan 9, 2021

@qwahzi I do agree it is unlikely, but if allowed it would become the least expensive double spend that can be done on Nano. You can also DoS individual nodes for a while and double spend on them, eventually de-syncing all the nodes on the network. DoSing just one node is something quite achievable for an attacker.
Also, the two week timeline can potentially be reduced to as little as 5 days if the malicious actor has a 25M stake, or less if the malicious actor has a higher stake. (My original scenario involved a 25M stake, but I realized the attacker, if they can DoS for two weeks, doesn't actually require any stake at all to execute this double spend, and simplified it accordingly.)

@Vyryn

Vyryn commented Jan 10, 2021

Some options to mitigate the added security risk discussed on discord:
[image attachment; contents not preserved]


8 participants