Skip to content

Commit

Permalink
Draft of CAP-0034 (change closetime semantics)
Browse files Browse the repository at this point in the history
First draft of CAP-0034, intended to address
stellar-protocol issue stellar#622, by changing the semantics of
the combination of candidates by the SCP ballot protocol
from producing an unsigned form of union of the candidates
(taking, for example, the maximum closetime of all
candidates as the closetime to externalize) to selecting
one particular signed, nominated StellarValue (comprising
transaction set, closetime, and upgrades) as the one to
externalize.
  • Loading branch information
rokopt committed Jul 7, 2020
1 parent 284dd16 commit d415114
Show file tree
Hide file tree
Showing 2 changed files with 296 additions and 0 deletions.
1 change: 1 addition & 0 deletions core/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,7 @@
| [CAP-0029](cap-0029.md) | AllowTrust when not AUTH_REQUIRED | Tomer Weller | Draft |
| [CAP-0032](cap-0032.md) | Trustline Preauthorization | Jonathan Jove | Draft |
| [CAP-0033](cap-0033.md) | Sponsored Reserve with EphemeralSponsorshipEntry | Jonathan Jove | FCP |
| [CAP-0034](cap-0034.md) | Single-Candidate Externalized Values | Terence Rokop | Draft |

### Rejected Proposals
| Number | Title | Author | Status |
Expand Down
295 changes: 295 additions & 0 deletions core/cap-0034.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,295 @@
## Preamble

```
CAP: To Be Assigned
Title: Single-Candidate Externalized Values
Recommender: Nicolas Barry, Terence Rokop
Owner: Terence Rokop
Consulted: Nicolas Barry, Jonathan Jove, David Mazières, Eric Saunders, Leigh McCulloch
Status: Draft
Created: 2020-07-06
Discussion: TBD
Protocol version: TBD
Protocol Request: https://github.com/stellar/stellar-protocol/issues/622
```

## Simple Summary
To date, the ballot protocol has generated a value to externalize by combining
all candidates in a deterministic way: the transaction set comes from a
"best-transaction-set" heuristic (generally, bigger transaction sets are
"better"), the closetime is the maximum of all of the candidates' closetimes,
and the set of ledger upgrades is one ledger upgrade type per type represented
among all candidates, where multiple upgrades of a given type are combined into
a single upgrade of that type by a type-specific algorithm (in general --
though, as it happens, for current upgrade types, all of the algorithms are
"maximum value", like the closetime combination algorithm).

This CAP proposes to change the externalized-value generation to remove the
combining and simply select one candidate value from the set of candidates.
It proposes to keep the selection heuristic for "best transaction set", which
would result in the closetime and upgrade set being precisely the ones proposed
in the candidate which contained the "best transaction set".

Because the entire externalized value would therefore be precisely one which
had been nominated by some particular node, that would also allow us to
preserve the signature from the nomination, and therefore use a signed
externalized value (currently we externalize an unsigned, or "basic", value,
because, as a combination of nominated values, it may not be equal to any one
nominated value for which we have a signature). This CAP therefore proposes
to do that -- it does not propose any specific use of that signature yet, but
preserving the signature as part of the same protocol change that allows it
to be preserved will allow us to use it in the future if we discover a way to
do so.

## Motivation
With the current protocol, a transaction with an expiration time might make it
into a candidate `StellarValue`'s transaction set, with its expiration being no
later than the `closeTime` in that `StellarValue`, but then fail as "too late"
during ledger close, as the transaction applies during ledger close happen
after the "current" ledger header, including the `closeTime`, has been updated
to reflect the externalized `StellarValue`.

When that happens, a transaction is charged a fee, and consumes a sequence
number, despite having been delayed long enough to exceed its expiration time.
The fee and sequence number aspects are potentially both problems:

- Transactions submitted by SDKs generally set expiration times so that they
know when it's safe to re-submit them (with new sequence numbers and perhaps
higher fees) if they don't seem to be making it into transaction sets with their
current offered fees (or because of transient network congestion or memory
constraints on a validator). It would make more sense if they were only charged
the fee that they were willing to pay that actually got them accepted into a
transaction set.

- Probably worse, there is a design pattern in some smart contracts which
breaks, allowing inconsistent transactions to be committed, if it encounters
this race: a smart contract submits two transactions, `A` and `B`, which it
intends to be mutually exclusive, and which it tries to ensure are mutually
exclusive by marking `A` as being valid only up to some time `t` (it will fail
with `txTOO_LATE` if it has not been committed by time `t`) and `B` as being
valid only after time `t` (it will fail with `txTOO_EARLY` if it is accepted
into a transaction set before `t`). If the race that this CAP is intended to
prevent occurs, as it can in the current protocol, then `A` might consume a
sequence number but then fail. But its consumption of the sequence number will
allow transactions dependent on `A` (which the smart contract chains after `A`)
to be executed, and if any of them succeeds, then they will perform operations
which the smart contract intended to be performed only if the operations of `A`
had been performed. This race is probably uncommon, because smart contracts
tend to set expiration times much farther out than SDKs (on the order of
anywhere from hours to years), but potentially arbitrarily bad for a smart
contract that did hit it.

Additionally, externalizing transaction sets that contain significant numbers
of transactions that are doomed to fail with `txTOO_LATE` during ledger close
might mean that we're choosing sets that we would have considered worse than
other available candidates if we had taken into account that those transactions
were doomed.

### Goals Alignment
* The Stellar Network should be a blockchain that people trust. Though the
events that could lead to a smart contract executing inconsistently because of
the problem that this CAP intends to address might be very rare, they might
also be of a kind of which even a single event could significantly damage trust.
* The Stellar Network should be fast and reliable (the stellar-core OKRs contain
specific targets). Keeping transactions which are guaranteed to fail later in
externalized transaction sets is wasteful.

## Abstract
We change the (deterministic) mechanism by which the ballot protocol computes
the `StellarValue` to be externalized from producing a value comprising the
best transaction set (selected by a deterministic heuristic which favors larger
sets) of any candidate, the maximum `closeTime` of any candidate, and a set of
"maximal" ledger `upgrades` from all candidates to one which simply produces the
exact candidate `StellarValue` which contains the best transaction set (the
heuristic which decides which transaction set is best does not change).

In particular, this affects how soon the core notices when transactions expire,
and also how soon ledger upgrades are performed (as different proposed upgrades
from different nodes in the same set of candidate `StellarValues` will no
longer be performed all at once; only one set will be selected at a time).

## Specification

### XDR
There are no changes to any XDR in this CAP. The treatment of the `closeTime`
and the `upgrades` in the `StellarValue` XDR in some code paths changes, but
not the XDR itself.

### Semantics

The changes required are limited to stellar-core, and comprise the following:

- The protocol version is bumped (currently anticipated to be to 14).
- `HerderSCPDriver::validateValueHelper()` changes to expect, in new protocols
only, that values used by the nomination protocol are `SIGNED`.
- `HerderSCPDriver::combineCandidates()`, in new protocols only (it preserves
its old behavior in old protocols), changes in the following ways:
- It produces a "combined" `closeTime` and set of `upgrades` directly from
the `StellarValue` containing the winning transaction set. In particular,
it does not attempt any trimming of the winning transaction set. (That set
should already be consistent with `closeTime` and `upgrades`, since they
all came from one nominated `StellarValue`.)
- It returns `SIGNED` `StellarValue`s.

## Design Rationale
The proposal in this CAP is the third that we have considered as a means of
fixing the problems described in the "Motivation" section.

- The first idea was to change the point at which the current ledger header's
`closeTime` was updated from before transactions are applied to afterwards.
(Ledger `upgrades` are already performed after transaction applies.) This
would prevent an increase in the `closeTime` brought about by candidates other
than the one whose transaction set was selected from causing transactions to
fail (it would not occur until after they had succeeded). However, this
raised the concern that a long delay between ledger closes would lead to
transactions succeeding despite having expiration times arbitrarily earlier
than the real time of their committing -- possibly long after clients had
already assumed that they would expire (and acted on that assumption).

- The second idea was to remove transactions from the best transaction set
selected by the combine-candidates code in the ballot protocol based on the
(maximal) `closeTime` just selected by that code. However, that would seem to
have opened up a new denial-of-service attack on the ledger:
an attacker could create large numbers of transactions with very high fees,
making it likely that they would be accepted into transaction sets, but with
extremely short expiration times, that would almost certainly come before the
next ledger close. They could therefore fill up the transaction sets with
such transactions, yet avoid the transaction fees, leaving the transaction
sets produced by the ballot protocol with little or no room for legitimate
transactions.

After rejecting those two possibilities for the reasons given above, we hit
on the idea in the current proposal of making the ballot protocol simply
select the `closeTime` from the `StellarValue` containing the best transaction
set (rather than doing any combining with the `closeTimes` from other
`StellarValue`s). A (well-behaved) node that had nominated the ultimately
winning `StellarValue` would already have left any expired transactions out of
its transaction set, and those transactions would neither pay fees nor consume
sequence numbers.

Always choosing a `closeTime` from the same `StellarValue` as the winning
transaction set raised the possibility, already described in the "Simple
Summary", of also preserving the _signature_ of that `StellarValue`,
from the node that nominated it (specifically, returning a `SIGNED` rather
than `BASIC` `StellarValue` from `HerderSCPDriver::combineCandidates()`).
Experimenting with implementing that possibility raised a further design choice:
because the `SIGNED` `StellarValue` also contains a set of ledger `upgrades`,
we could only start using `SIGNED` `StellarValue`s in the ballot protocol if
we also changed `upgrades` to come from the single winning `StellarValue`
rather than being combined in a way similar to the `closeTime`s as they are
in the current protocol.

Here are some factors we considered before deciding to propose to change the
`upgrade` selection as well as the `closeTime` selection, and thereby to allow
(and to propose) preserving `SIGNED` `StellarValue`s in the ballot protocol.

- Upgrades do happen independently of the transactions: we could view the ledger
closes as an optimization of alternating transaction set applies and ledger
upgrades. So we did have a free choice between proposing to change upgrades in
the same way as we were proposing to change closetime, or not doing so and
letting them work differently (single-candidate-selection for closetimes and
txsets, candidate-combination for upgrades).

- It's clear that we could gain the new information of `SIGNED` `StellarValues`,
identifying one particular node which nominated the whole `StellarValue`, only
if we do change `upgrades` in the same way as `closeTime`s. What isn't clear
to us yet is how valuable that is (if at all). If someone were to come up with
a clearly-valuable way of putting the new information to use, then that would
provide a push in the direction of changing `upgrades` to match `closeTimes`.

- There isn't any correlation that we've found among the different upgrades that
we combine, in the sense that today, if changing any given parameter `A` from
`1` to `2` and changing any given parameter `B` from `1` to `2` would each be
sensible individually, then it would also be sensible to make both changes at
once. But if the type of possible upgrades could expand to one in which certain
combinations of parameter changes would be insensible, then it might no longer
make sense just to combine each parameter individually -- the simple union of
individual changes could produce a result which was inconsistent. In that case,
the right way to combine them consistently would become ambiguous; there could
be multiple options of which none would include all candidate parameter changes.
That would not be inherently wrong -- we have the same situation with
transaction sets, and we make a heuristic but deterministic choice as to which
is the best. But it might be some argument for choosing a single nominated
upgrade in the same way as we already choose a single nominated tx set, and are
proposing to choose a single nominated closetime, just because choosing the best
single nominated set of upgrades could be simpler (there might be far fewer
options to choose among) than choosing the best consistent combination of all
tx sets.

- A risk of not combining upgrades might be that upgrades that some nodes
proposed, which would be positives from their perspective, and happened to be
neutral from other nodes' perspectives, and therefore good for the network as a
whole, might never happen, because the nodes that considered the upgrades
neutral (and therefore did not propose them) always had their tx sets nominated,
perhaps because they were much higher-traffic validators. (Open question: is
that what would happen, or would the high-traffic validators, upon seeing
upgrade nominations that they found neutral, start flooding them too on the
grounds that some nodes' wanting them and the others' not actively not wanting
them meant that they were a net good for the network as a whole?)

## Protocol Upgrade Transition

### Backwards Incompatibilities
Semantic changes resulting from this CAP we hope to be limited to the
correction of the potential inconsistency of smart contract behavior
described in the "Motivation". Smart contracts, and other clients, should
not have _relied_ on that inconsistency, as it requires a race to occur,
and one which is probably very rare.

### Impacts on layers outside Core
We do not expect any layers outside of stellar-core to have to make any code
changes as a result of this CAP. The principal potentially visible (though
perhaps not yet seen) effect that we intend is to prevent smart contracts
structured as described in the "Motivation" section from encountering the race
also described therein.

### Resource Utilization
Using `SIGNED` `StellarValue`s in the ballot protocol will result in the ballot
protocol checking signatures in some code paths where there were previously no
signatures to check. However, the signatures will be from nodes that had
nominated values in the latest round of the nomination protocol, so the
signature cache might ameliorate any noticeable effects of this.

## Security Concerns
Two classes of security concerns in particular have come up repeatedly during
consideration of this proposal:

- Consistency in treatment of transactions across different phases of the
stellar-core protocol. The most severe problem that we are attempting to
address is the result of what smart contracts might at least interpret as an
inconsistency between the consumption of a sequence number and the rejection
of a transaction as not having committed soon enough; the problem which caused
us to reject our first considered solution was what clients might at least
interpret as an inconsistency between the clock time and the success of a
transaction (which had been set to expire long before the clock time when the
client finds out that it has succeeded).

- Denial-of-service attacks on the ledger. The problem which caused us to
reject our second considered solution was that it would have allowed attackers
to spam transactions but avoid fees.

So far, we have not found analogous problems introduced by this third attempt.

## Test Cases
- The "SCP Driver" test, which, using the current protocol and an older one,
confirms that a set of candidate `StellarValue`s produces the expected
combined `closeTime`, changes in three ways:
- It tests that old protocols continue to behave the same way, but that
in new protocols, the combined `closeTime` is simply that of the
`StellarValue` containing the best transaction set, not the maximum
`closeTime` of all candidates.
- It is enhanced to test for expected `upgrades` as well -- and these, like
`closeTime`, behave differently between old and new protocols.
- It tests that old protocols produce `BASIC` `StellarValue`s, but new
protocols produce `SIGNED` ones.
- The `PendingEnvelopes recvSCPEnvelope` and `quorum tracker` change to
expect `BASIC` `StellarValue`s from old protocols (as before this CAP) but
`SIGNED` `StellarValue`s from new protocols (as after this CAP).

## Implementation
An implementation of this protocol has been written, which as far as the
author knows is complete and correct in its semantics; it will certainly need
refactoring, but it might constitute a demonstration of the practicality of
implementing this proposal, and of a bound on the scope of the changes required,
as well as a way of experimenting with its consequences:
https://github.com/stellar/stellar-core/compare/master...rokopt:proto-622-closetime?w=1

0 comments on commit d415114

Please sign in to comment.