[RFC] Ban scoring #1543

Merged
merged 13 commits into from Feb 1, 2019

psteckler
Member

This is a proposed RFC for ban scoring.

@bkase changed the title from "Ban scoring" to "[RFC] Ban scoring" on Jan 30, 2019
@enolan
Contributor

enolan commented Jan 30, 2019

I've been working with the rule that a peer gets banned if its trust score is ≤ -1. It makes the math slightly cleaner, but is arguably less human-readable. The difference is purely cosmetic since we can scale the punishment values to match.

One way of thinking about how much to punish peers is to decide the maximum rate they can do whatever we're talking about without being banned. From the code I'm working on:

(** Trust increment that sets a maximum rate of doing a bad thing (presuming the
    peer does no good things) in seconds/action. *)
(* The amount of trust that decays in the specified time period, when we're at
   the ban threshold (-1) *)
let max_rate secs = ((Record.decay_rate ** secs) *. -1.) +. 1.

Where decay_rate is the discount factor applied to trust scores per second.
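
A small sketch of how this formula plays out, assuming a made-up decay factor of 0.999 per second (the real `Record.decay_rate` is not shown in this thread) and the ≤ -1 ban threshold mentioned above. The helpers `decay` and `after_one_cycle` are hypothetical names added for illustration.

```ocaml
(* Assumed value: trust is multiplied by this factor once per second. *)
let decay_rate = 0.999

let ban_threshold = -1.

(* Trust score after [secs] seconds with no new actions. *)
let decay trust secs = trust *. (decay_rate ** secs)

(* Same formula as above, using the assumed [decay_rate]. *)
let max_rate secs = ((decay_rate ** secs) *. -1.) +. 1.

(* A peer at the threshold waits [secs] seconds, then is punished by
   [max_rate secs]: the decay and the punishment cancel out. *)
let after_one_cycle secs = decay ban_threshold secs -. max_rate secs
```

This is the limiting case: a peer repeating the bad action every `secs` seconds lands back on the threshold each time, and a peer doing it more often sinks further below it.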

Let's classify those places where punishment has been flagged, and annotate
them with suggested constructors:

- in `bootstrap_controller.ml`, for bad proofs (SEV), and a validation error when
Contributor

It'd be best if we separated the set of offenses in two.

  1. Absent hardware failures, an honest peer never does this. Should be an insta-ban, since it's a clear signal the peer is faulty or malicious.

  2. Honest peers may do this if they are out of sync with us. Set punishment level based on the cost to us of receiving/processing the bad message. Not sure how that converts into a number
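
One possible way to encode the two classes, as a rough sketch only. The type, constructor, and function names are hypothetical, and `cost` is an assumed estimate in trust units, since how to turn processing cost into a number is exactly the open question here.

```ocaml
(* Hypothetical names throughout; the thread does not define these types. *)
type offense =
  | Never_honest of string
      (* Class 1: an honest peer never does this, barring hardware failure. *)
  | Maybe_out_of_sync of { description : string; cost : float }
      (* Class 2: [cost] is an assumed estimate, in trust units, of what
         receiving and processing the bad message cost us. *)

let ban_threshold = -1.

(* New trust score after an offense. Class 1 jumps straight to the
   threshold (an insta-ban); class 2 subtracts the estimated cost. *)
let punish trust = function
  | Never_honest _ -> ban_threshold
  | Maybe_out_of_sync { cost; _ } -> Float.max ban_threshold (trust -. cost)

let is_banned trust = trust <= ban_threshold
```

With this shape, `punish 0.5 (Never_honest "bad proof")` bans a peer regardless of its prior score, while a class 2 offense only bans once the accumulated cost reaches the threshold.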

Member Author

Maybe there's still room for trivial infractions, the kind where Bitcoin increments the ban score by one, like a duplicate version message. We could see that kind of thing if there are multiple Coda implementations.

@psteckler
Member Author

> One way of thinking about how much to punish peers is to decide the maximum rate they can do whatever we're talking about without being banned.

That seems to be a consideration for DoS behavior as opposed to giving-bad-information behavior, which may be why Bitcoin considers those separately.

locations where punishment is warranted.

In the current code, there are often calls to the logger where
punishment is mentioned in a `TODO`. There could be an API that calls
Contributor

In the trust scores code I have:

  val record : t -> peer -> action -> unit
  (** Record an action a peer took. This may result in a ban event being
      emitted *)

Where action is a type we functor over. The intent is that calling that function emits a log message, updates the trust score, and potentially writes a ban event to a pipe. The module passed to the functor needs to provide a function action -> string so we can write useful log messages. It's an open question whether action should be a variant that enumerates everything that affects trust scores or just something like float * string. The former keeps all the trust increments together and makes changing them as a set easier, but is likely to lead to a really big variant. I'm kinda leaning towards the latter now, but 🤷‍♂️.
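
To make the shape of that API concrete, here is a minimal sketch and not the actual trust-score code. It assumes peers are plain strings, uses a callback in place of the pipe of ban events, and leaves out decay. `Action`, `Make`, `create`, `on_ban`, `to_trust_increment`, and `Example_action` are all hypothetical names.

```ocaml
(* Everything below is a hypothetical sketch, not the trust-score module
   discussed in this thread. *)
module type Action = sig
  type t
  val to_trust_increment : t -> float
  val to_string : t -> string
end

module Make (A : Action) = struct
  type peer = string (* assumed: peers keyed by an address string *)

  type t = { scores : (peer, float) Hashtbl.t; on_ban : peer -> unit }

  let create ~on_ban = { scores = Hashtbl.create 16; on_ban }

  (* Log the action, update the score, and report a ban at or below -1.
     A callback stands in for the pipe of ban events. *)
  let record t peer action =
    let old_score =
      Option.value (Hashtbl.find_opt t.scores peer) ~default:0. in
    let score = old_score +. A.to_trust_increment action in
    Printf.eprintf "peer %s: %s (trust %.2f -> %.2f)\n"
      peer (A.to_string action) old_score score;
    Hashtbl.replace t.scores peer score;
    if score <= -1. then t.on_ban peer
end

(* The "variant" option: every trust-affecting action listed in one place. *)
module Example_action = struct
  type t = Sent_bad_proof | Sent_useful_transition

  let to_trust_increment = function
    | Sent_bad_proof -> -1.
    | Sent_useful_transition -> 0.1

  let to_string = function
    | Sent_bad_proof -> "sent bad proof"
    | Sent_useful_transition -> "sent useful transition"
end

module Trust = Make (Example_action)
```

The `float * string` alternative would fit the same functor: an `Action` module with `type t = float * string`, `to_trust_increment = fst`, and `to_string = snd`. The trade-off is that the increments then live at each call site instead of in one variant.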

Member

I like the idea of a variant so we can keep infraction types next to each other and indexed in one spot

Member Author

In the Bitcoin sources, where misbehavior is flagged, sometimes they provide a message, sometimes not. We should always provide a message.

Member Author

The implementation of record can use the variant of bad behavior severities.

@bkase
Member

bkase commented Jan 30, 2019

@enolan could you put up a PR with the trust stuff you've been working on in an mli? You can stub the implementations, so we have something to refer to for this RFC.

@enolan
Contributor

enolan commented Jan 30, 2019

I can put that up in the AM.

rfcs/0011-ban-scoring.md
- in `ledger_catchup.ml`, when a root hash can't be found (SEV), or a peer returns an empty list
of transitions (instead of `None`) (TRV)
- in `linked_tree.ml`, for peers requesting nonexistent ancestor paths (MOD)
- in `parallel_scan.ml`, in `update_new_job` for unneeded merges (?) (MOD)
Member

I think this should be SEV though

@psteckler
Member Author

> One way of thinking about how much to punish peers is to decide the maximum rate they can do whatever we're talking about without being banned.

So the higher the rate, the less the punishment?

How does this rate idea combine with the potential harm of the bad action? I'm wondering if this is more complexity than needed. I was thinking that the potential harm alone would determine the severity of punishment.

@psteckler
Member Author

Addressed @bkase comments.

- in `catchup_scheduler.ml` and `processor.ml`, when a breadcrumb can't be built from a
transition (SEV) (same failure as in `bootstrap_controller.ml`, above)
- in `ledger_catchup.ml`, a transition could not be validated (SEV)
- in `transaction_pool.ml`, a payment check fails (SEV)
Member

also missing from here is the bit that Echo is adding about sending a transaction that doesn't make sense on the current staged ledger.

This is the bit that justifies the trust scoring I think. I think Echo was mentioning this somewhere, but if you receive transactions at high-velocity from some node and 99% of them work on the staged ledger, but some of them don't due to inconsistent state between the two nodes, we'd want that to be okay.

Whereas if you get one transaction each day from a node, maybe on the second or third wrong one, you'd want to ban it.

This is something unique to Coda w.r.t. Bitcoin because we expect to support high-transaction throughput.

On the other hand, you could argue that we should be banning based on how much pain or wasted work was inflicted by the accused. In that case, you'd want to ban once you've received some constant number of failures from a node, regardless of how many correct ones you've seen and no matter at what rate you're receiving them.
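
The two positions can be contrasted with a toy simulation. This is only a sketch: the increments (+0.01, -0.3), the clamping to [-1, 1], the failure limit, and all names are assumptions, and time-based decay is left out, so it captures the ratio of good to bad actions but not the one-per-day timing.

```ocaml
(* Hypothetical numbers and names, chosen only to make the contrast visible. *)
type outcome = Good | Bad

(* Policy A: trust score. Good actions earn a little trust, bad ones cost
   a lot; the score is clamped to [-1, 1] and the peer is banned at <= -1.
   Time-based decay is left out to keep the sketch short. *)
let trust_bans outcomes =
  let step score o =
    let delta = match o with Good -> 0.01 | Bad -> -0.3 in
    Float.max (-1.) (Float.min 1. (score +. delta))
  in
  let rec go score = function
    | [] -> false
    | o :: rest ->
        let score = step score o in
        score <= -1. || go score rest
  in
  go 0. outcomes

(* Policy B: ban after a fixed number of failures, ignoring successes. *)
let count_bans ~limit outcomes =
  List.length (List.filter (fun o -> o = Bad) outcomes) >= limit

(* A busy peer: 1000 transactions, every 100th one fails (99% succeed). *)
let busy_peer =
  List.init 1000 (fun i -> if (i + 1) mod 100 = 0 then Bad else Good)

(* A quiet peer: four transactions, all of them failures. *)
let quiet_peer = [ Bad; Bad; Bad; Bad ]
```

Under these assumed numbers, `trust_bans` tolerates `busy_peer` and bans `quiet_peer`, while `count_bans ~limit:3` bans both, which is the difference between the two arguments above.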

Member Author

added section about trust scoring

@@ -144,7 +144,8 @@ history leading to a ban score.

## Prior art

-There is existing code in Coda to maintain a set of peers banned by IP address.
+There is existing code in Coda to maintain a set of peers banned by IP address in
+`banlist_lib`. It will be superseded this RFC and RFC 0010 are implemented.
Contributor

Missing word: "when"

@enolan
Contributor

enolan commented Feb 1, 2019

Other than the typo, I'm good.

@psteckler merged commit 8a753a4 into master on Feb 1, 2019
@psteckler deleted the rfc/ban-scoring branch on February 1, 2019 20:10
4 participants