New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Ban scoring #1543

Merged
merged 13 commits into from Feb 1, 2019

Conversation

Projects
None yet
4 participants
@psteckler
Copy link
Contributor

psteckler commented Jan 29, 2019

This is a proposed RFC for ban scoring.

Paul Steckler

@psteckler psteckler force-pushed the rfc/ban-scoring branch from 5330ea1 to 49aac0b Jan 29, 2019

@bkase bkase changed the title Ban scoring [RFC] Ban scoring Jan 30, 2019

Paul Steckler and others added some commits Jan 30, 2019

Paul Steckler
@enolan

This comment has been minimized.

Copy link
Contributor

enolan commented Jan 30, 2019

I've been working with the rule that a peer gets banned if its trust score is ≤ -1. It makes the math slightly cleaner, but is arguably less human-readable. The difference is purely cosmetic since we can scale the punishment values to match.

One way of thinking about how much to punish peers is to decide the maximum rate they can do whatever we're talking about without being banned. From the code I'm working on:

(** Trust increment that sets a maximum rate of doing a bad thing (presuming the
    peer does no good things) in seconds/action. *)
(* The amount of trust that decays in the specified time period, when we're at
   the ban threshold (-1) *)
let max_rate secs = ((Record.decay_rate ** secs) *. -1.) +. 1.

Where decay_rate is the discount factor applied to trust scores per second.

Paul Steckler
Let's classify those places where punishment has been flagged, and annotate
them with suggested constructors:

- in `bootstrap_controller.ml`, for bad proofs (SEV), and a validation error when

This comment has been minimized.

@enolan

enolan Jan 30, 2019

Contributor

It'd be best if we separated the set of offenses in two.

  1. Absent hardware failures, an honest peer never does this. Should be an insta-ban, since it's a clear signal the peer is faulty or malicious.

  2. Honest peers may do this if they are out of sync with us. Set punishment level based on the cost to us of receiving/processing the bad message. Not sure how that converts into a number

This comment has been minimized.

@psteckler

psteckler Jan 30, 2019

Author Contributor

Maybe there's still room for trivial infractions, the kind where Bitcoin increments the ban score by one, like a duplicate version message. We could see that kind of thing if there are multiple Coda implementations.

@psteckler

This comment has been minimized.

Copy link
Contributor Author

psteckler commented Jan 30, 2019

One way of thinking about how much to punish peers is to decide the maximum rate they can do whatever we're talking about without being banned.

That seems to be a consideration for DoS behavior as opposed to giving-bad-information behavior, which may be Bitcoin considers those separately.

locations where punishment is warranted.

In the current code, there are often calls to the logger where
punishment is mentioned in a `TODO`. There could be an API that calls

This comment has been minimized.

@enolan

enolan Jan 30, 2019

Contributor

In the trust scores code I have:

  val record : t -> peer -> action -> unit
  (** Record an action a peer took. This may result in a ban event being
      emitted *)

Where action is a type we functor over. The intent is that calling that function emits a log message, updates the trust score, and potentially writes a ban event to a pipe. The module passed to the functor needs to provide a function action -> string so we can write useful log messages. It's an open question whether action should be a variant that enumerates everything that affects trust scores or just something like float * string. The former keeps all the trust increments together and makes changing them as a set easier, but is likely to lead to a really big variant. I'm kinda leaning towards the latter now, but 🤷‍♂️.

This comment has been minimized.

@bkase

bkase Jan 30, 2019

Contributor

I like the idea of a variant so we can keep infraction types next to each other and indexed in one spot

This comment has been minimized.

@psteckler

psteckler Jan 30, 2019

Author Contributor

In the Bitcoin sources, where misbehavior is flagged, sometimes they provide a message, sometimes not. We should always provide a message.

This comment has been minimized.

@psteckler

psteckler Jan 31, 2019

Author Contributor

The implementation of record can use the variant of bad behavior severities.

@bkase

This comment has been minimized.

Copy link
Contributor

bkase commented Jan 30, 2019

@enolan could put up a PR with the trust stuff you've been working on in an mli and you can stub the implementations -- so we have something to refer to for this RFC

@enolan

This comment has been minimized.

Copy link
Contributor

enolan commented Jan 30, 2019

I can put that up in the AM.

Show resolved Hide resolved rfcs/0011-ban-scoring.md
- in `ledger_catchup.ml`, when a root hash can't be found (SEV), or a peer returns an empty list
of transitions (instead of `None`) (TRV)
- in `linked_tree.ml`, for peers requesting nonexistent ancestor paths (MOD)
- in `parallel_scan.ml`, in `update_new_job` for unneeded merges (?) (MOD)

This comment has been minimized.

@bkase

This comment has been minimized.

@bkase

bkase Jan 30, 2019

Contributor

I think this should be SEV though

@psteckler

This comment has been minimized.

Copy link
Contributor Author

psteckler commented Jan 30, 2019

One way of thinking about how much to punish peers is to decide the maximum rate they can do whatever we're talking about without being banned.

So the higher the rate, the less the punishment?

How does this rate idea combine with the potential harm of the bad action? I'm wondering if this is more complexity than needed. I was thinking that the potential harm alone would determine the severity of punishment.

@psteckler

This comment has been minimized.

Copy link
Contributor Author

psteckler commented Jan 30, 2019

Addressed @bkase comments.

- in `catchup_scheduler.ml` and `processor.ml`, when a breadcrumb can't be built from a
transition (SEV) (same failure as in `bootstrap_controller.ml`, above)
- in `ledger_catchup.ml`, a transition could not be validated (SEV)
- in `transaction_pool.ml`, a payment check fails (SEV)

This comment has been minimized.

@bkase

bkase Jan 30, 2019

Contributor

also missing from here is the bit that Echo is adding about sending a transaction that doesn't make sense on the current staged ledger.

This is the bit that justifies the trust scoring I think. I think Echo was mentioning this somewhere, but if you receive transactions at high-velocity from some node and 99% of them work on the staged ledger, but some of them don't due to inconsistent state between the two nodes, we'd want that to be okay.

Whereas if you get one transaction each day from a node, maybe on the second or third wrong one, you'd want to ban it.

This is something unique to Coda w.r.t. Bitcoin because we expect to support high-transaction throughput.

However, on the one hand, you could argue that we should be banning based on how much pain or wasted work was inflicted by the accused. In that case, you'd want to ban if you ever receive some constant amount of failures from a node, despite seeing correct ones and no matter in what rate you're receiving them.

This comment has been minimized.

@psteckler

psteckler Jan 31, 2019

Author Contributor

added section about trust scoring

Show resolved Hide resolved rfcs/0011-ban-scoring.md Outdated

Paul Steckler added some commits Jan 31, 2019

Paul Steckler
Paul Steckler
Paul Steckler
@@ -144,7 +144,8 @@ history leading to a ban score.

## Prior art

There is existing code in Coda to maintain a set of peers banned by IP address.
There is existing code in Coda to maintain a set of peers banned by IP address in
`banlist_lib`. It will be superseded this RFC and RFC 0010 are implemented.

This comment has been minimized.

@enolan

enolan Feb 1, 2019

Contributor

Missing word: "when"

@enolan

This comment has been minimized.

Copy link
Contributor

enolan commented Feb 1, 2019

Other than the typo, I'm good.

Paul Steckler
@enolan

enolan approved these changes Feb 1, 2019

@psteckler psteckler merged commit 8a753a4 into master Feb 1, 2019

12 checks passed

ci/circleci: build Your tests passed on CircleCI!
Details
ci/circleci: build-macos Your tests passed on CircleCI!
Details
ci/circleci: build_public Your tests passed on CircleCI!
Details
ci/circleci: build_withsnark_sig Your tests passed on CircleCI!
Details
ci/circleci: build_withsnark_stake Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
ci/circleci: test-all_sig_integration_tests Your tests passed on CircleCI!
Details
ci/circleci: test-all_stake_integration_tests Your tests passed on CircleCI!
Details
ci/circleci: test-fake_hash_full_test Your tests passed on CircleCI!
Details
ci/circleci: test-unit-test Your tests passed on CircleCI!
Details
ci/circleci: test-withsnark Your tests passed on CircleCI!
Details
ci/circleci: tracetool Your tests passed on CircleCI!
Details

@psteckler psteckler deleted the rfc/ban-scoring branch Feb 1, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment