Force new era only if 1/3 validators is disabled. by tomusdrw · Pull Request #3533 · paritytech/substrate

tomusdrw · 2019-09-02T15:55:19Z

Currently validators that are slashed are disabled till the end of the era (i.e. we ignore anything they do). We also trigger a new era right after the end of current session, which seems completely unnecessary.

This PR forces a new era only if we disable at least one third of the validators to prevent stalling block production.

Marking as WiP as I'm unsure if this is actually needed either or if we should rather never force a new era.

burdges · 2019-09-02T16:22:15Z

We're still missing an implementation for the finality fall back from GRANDPA to BABE. We therefore cannot actually finalize anything if 1/3 get disabled and thus cannot switch, so this should probably happen lower like 1/5 or 1/6. cc @AlistairStewart

srml/session/src/lib.rs

srml/staking/src/lib.rs

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

Demi-Marie

Pinging @rphmeier ― does this make us vulnerable to attacks? If not, this should be fine.

srml/session/src/lib.rs

kianenigma

I am not the biggest expert here (consensus stuff) but afaik this we are screwed only if 1/3 are disabled seem to be Grandpa-specific. I think it might be worthwhile to make this configurable in session:

/// Force a new era if we are left with less than the given ratio of validators as _active_.
ForceEraThreshold: Get<Perbill>

or sth like this.

rphmeier · 2019-09-05T13:54:14Z

we are screwed only if 1/3 are disabled seem to be Grandpa-specific.

See the FLP impossibility result - 1/3 offline is the boundary for asynchronous liveness. But yes, could be made configurable. Many protocols dealing with a common coin to achieve async liveness can only tolerate 1/5 or 1/7 byzantine.

tomusdrw · 2019-09-06T07:37:48Z

Ready for review. Added parametrisation and set to 1/5 for node-runtime.

kianenigma

nice

gavofyork · 2019-09-06T17:27:32Z

srml/authority-discovery/src/lib.rs

 	}

+	parameter_types! {
+		pub const DisabledValidatorsThreshold: Perbill = Perbill::from_percent(33);


why 33 here?

I'ts only a test, I don't think it matters at all.

gavofyork · 2019-09-06T17:35:06Z

I'm not convinced this is necessary and it's strictly less safe than before.

It makes us more open to liveness attacks where the attacker cannot control the exact timing of each individual node that they take down or where they can't take down all nodes at once.

Basically, by further delaying the replacement of offline or even outright Byzantine validators by upto 24 hours, we reduce our liveness tolerance. In the simplest case with 4 validators, then as soon as as one validator goes offline (accidentally or otherwise), then a Byzantine party has up to 24 hours in which to mount an attack to stall the network. In reality, the sooner we replace known faulty nodes the safer.

CC @AlistairStewart

gavofyork · 2019-09-06T17:35:48Z

on ice until I have proof from the research team that this is not strictly less safe.

tomusdrw · 2019-09-08T08:16:44Z

@gavofyork afaict, there is nothing currently in code preventing the same validator to be reelected in the next era, so forcing it does not guarantee that the offline/byzantine validators are replaced at all.
Actually before we do elections they are at least disabled, if they are elected again they are considered back online afaict.

burdges · 2019-09-08T11:46:23Z

@gavofyork If we rerun elections due to slashing, then small shashes can influence the randomness, maybe influencing parachain assignments, elections, smart contract results, etc., so this improves security for those functions. We cannot do 1/3rd here like I said above, so the issue is balancing safer randomness with any power attackers gain by luck or imprecise attacks.

I gather: We currently oversupply validator candidates in Phragmen, which helps ensure all elected validators actually start when an era starts somehow? We cannot however reuse those extras when one gets kicked because we cannot expect a validator who fails to be elected to remain online? We must therefore rerun Phragmen to begin a new era, right?

We could allow extras to make blocks and do secondary validity checks, so that they'd stay online for the profits, and so that inserting them cannot influence the randomness.

We'll discuss the options some this week.

@tomusdrw I discussed that issues in the new Slashing with NPoS (md) document, thanks to @rphmeier noticing our slashing issues. Anytime a nominator N gets slashed due to actions by a validator V then we should:

First, we post a slashing transaction to the chain, which drops V from the active validator list by invalidating their session keys, which makes everyone ignore V for the remainder of the era, and also invalidates any future blocks that do not ignore V. We also remove all nomination approval votes by any nominator for V, even those who currently allocate V zero stake.

As a result, V should stay out until nominators manually reinstate them.

Second, we remove all N's nomination approval votes for future eras. We do not remove N's current nominations for the current era or reduce the stake currently backing other validators. Also we permit N to add new nomination approval votes for future eras during the current era. We also notify N that V cause them to be slashed.

It'd remain possible for nominators to reinstate V in the next era with manual effort though. We could permit reinstating in the next session even without an era change, but not without manual approval since this approval switches the slashing computation from a max to a sum.

gavofyork · 2019-09-11T13:27:11Z

We currently oversupply validator candidates in Phragmen

incorrect. you might be thinking of our usage of phragmen for council elections.

If we rerun elections due to slashing

we don't rerun them immediately; only at the end of the session. it has no effect on the rotation of babe authorities - the next epoch's validator set will be the same. all it means is that we rotate them off a bit faster than we otherwise would. for polkadot, assuming a validator wanted to grief the network in this way, they would essentially rotate themselves off/on every epoch, thereby causing 6x more elections than normal (there are only 6 epochs/sessions in an era). this isn't such a big deal.

Actually before we do elections they are at least disabled

this currently has no effect.

--

the only question in my mind is: is a 6x increase in elections (that can be caused by only 1 bad validator) sufficiently annoying to bring in this logic that would essentially increase the requirement from only 1 validator to some minimum number of validators.

@AlistairStewart @burdges

burdges · 2019-09-11T19:48:41Z

Anytime we rerun elections we should wait at least one full sessions so that nobody can bias their voting based on the expected randomness. I gather this is already baked into the code pretty deep, so actually not problem there.

I doubt a the factor of 6 matters here. I'll push Al to weigh in.

node/runtime/src/lib.rs

gavofyork · 2019-09-19T08:25:29Z

Updated to 1/6 of validators as per @AlistairStewart 's suggestion.

* Force new era only when over third validators is disabled. * Update srml/session/src/lib.rs Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com> * Update srml/staking/src/lib.rs Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com> * Parametrize the threshold. * Bump runtime version. * Update node/runtime/src/lib.rs * Fix build. * Fix im-online test.

tomusdrw added 2 commits September 2, 2019 17:18

Force new era only when over third validators is disabled.

3fa6c4a

Merge branch 'master' into td-new-era

09d1397

tomusdrw added A0-please_review Pull request needs code review. M4-core labels Sep 2, 2019

tomusdrw requested a review from kianenigma as a code owner September 2, 2019 15:55

bkchr reviewed Sep 3, 2019

View reviewed changes

srml/session/src/lib.rs Outdated Show resolved Hide resolved

srml/staking/src/lib.rs Outdated Show resolved Hide resolved

tomusdrw and others added 2 commits September 3, 2019 10:58

Update srml/session/src/lib.rs

e5b5c84

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

Update srml/staking/src/lib.rs

52dee45

Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

Demi-Marie reviewed Sep 3, 2019

View reviewed changes

srml/session/src/lib.rs Outdated Show resolved Hide resolved

kianenigma reviewed Sep 4, 2019

View reviewed changes

srml/session/src/lib.rs Outdated Show resolved Hide resolved

kianenigma reviewed Sep 4, 2019

View reviewed changes

tomusdrw added 3 commits September 5, 2019 18:32

Merge branch 'master' into td-new-era

e3b32ce

Parametrize the threshold.

edcb349

Bump runtime version.

f3083d9

tomusdrw changed the title ~~WiP: Force new era only if 1/3 validators is disabled.~~ Force new era only if 1/3 validators is disabled. Sep 6, 2019

kianenigma approved these changes Sep 6, 2019

View reviewed changes

kianenigma added A8-looksgood and removed A0-please_review Pull request needs code review. labels Sep 6, 2019

gavofyork reviewed Sep 6, 2019

View reviewed changes

gavofyork added the A1-onice label Sep 6, 2019

gavofyork removed the A1-onice label Sep 19, 2019

Merge branch 'master' into td-new-era

e1da8bb

gavofyork reviewed Sep 19, 2019

View reviewed changes

node/runtime/src/lib.rs Outdated Show resolved Hide resolved

Update node/runtime/src/lib.rs

0bf14e9

paritytech-ci added the B2-breaksapi label Sep 19, 2019

tomusdrw added 2 commits September 19, 2019 10:50

Fix build.

0326823

Fix im-online test.

4b0de1e

gavofyork merged commit d320249 into master Sep 19, 2019

gavofyork deleted the td-new-era branch September 19, 2019 10:04

Conversation

tomusdrw commented Sep 2, 2019

Uh oh!

burdges commented Sep 2, 2019

Uh oh!

Uh oh!

Uh oh!

Demi-Marie left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

kianenigma left a comment

Choose a reason for hiding this comment

Uh oh!

rphmeier commented Sep 5, 2019

Uh oh!

tomusdrw commented Sep 6, 2019

Uh oh!

kianenigma left a comment

Choose a reason for hiding this comment

Uh oh!

gavofyork Sep 6, 2019

Choose a reason for hiding this comment

Uh oh!

tomusdrw Sep 8, 2019

Choose a reason for hiding this comment

Uh oh!

gavofyork commented Sep 6, 2019

Uh oh!

gavofyork commented Sep 6, 2019

Uh oh!

tomusdrw commented Sep 8, 2019

Uh oh!

burdges commented Sep 8, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gavofyork commented Sep 11, 2019

Uh oh!

burdges commented Sep 11, 2019

Uh oh!

Uh oh!

gavofyork commented Sep 19, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

burdges commented Sep 8, 2019 •

edited

Loading