Skip to content

Commit

Permalink
Draft of CAP-0034 (change closetime semantics) (#671)
Browse files Browse the repository at this point in the history
First draft of CAP-0034, intended to address
stellar-protocol issue #622, by changing the semantics of
the selection of the closetime when combining candidates in
the SCP nomination protocol.

The current nomination protocol produces an unsigned
("basic") form of union of the candidates which uses the
maximum closetime of all candidates as the closetime to
nominate.

The protocol proposed by this CAP would produce a signed
StellarValue, whose transaction set and upgrades are chosen
in the same way as in the current protocol, but whose
closetime is precisely the one in the same StellarValue as
the best transaction set.

(The change to closetime semantics is sufficient by itself
to address stellar-protocol issue #622; the change to make
the nominated StellarValue signed is an opportunity that we
take which the closetime semantics change provides.  The
signature is on the <txSet, closetime> pair, so by taking
the exact closetime from the same StellarValue as the chosen
transaction set, we allow ourselves to use the signature
from the node that proposed that StellarValue.)
  • Loading branch information
rokopt committed Jul 9, 2020
1 parent a0e4114 commit 1a155df
Show file tree
Hide file tree
Showing 2 changed files with 315 additions and 0 deletions.
1 change: 1 addition & 0 deletions core/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,7 @@
| [CAP-0029](cap-0029.md) | AllowTrust when not AUTH_REQUIRED | Tomer Weller | Draft |
| [CAP-0032](cap-0032.md) | Trustline Preauthorization | Jonathan Jove | Draft |
| [CAP-0033](cap-0033.md) | Sponsored Reserve with EphemeralSponsorshipEntry | Jonathan Jove | FCP |
| [CAP-0034](cap-0034.md) | Preserve Transaction-Set/Close-Time Affinity During Nomination | Terence Rokop | Draft |

### Rejected Proposals
| Number | Title | Author | Status |
Expand Down
314 changes: 314 additions & 0 deletions core/cap-0034.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,314 @@
# CAP-0034: Preserve Transaction-Set/Close-Time Affinity During Nomination

## Preamble

```text
CAP: To Be Assigned
Title: Preserve Transaction-Set/Close-Time Affinity During Nomination
Authors: Terence Rokop, Nicolas Barry
Status: Draft
Created: 2020-07-06
Discussion: https://github.com/stellar/stellar-protocol/issues/622
Protocol version: TBD
```

## Simple Summary

Make the nomination protocol select the close time from the single candidate
whose transaction set it selects (rather than the current protocol's selection
of the maximum close time of all candidates), and preserve the signature of the
chosen `<close time, transaction set>` pair.

## Motivation

With the current protocol, the following sequence of events can occur:

- A transaction `T` with an expiration time makes it into a candidate
`StellarValue`'s transaction set, with its expiration being no later than the
`closeTime` in that `StellarValue`.
- That transaction set is nominated, but with a later `closeTime`, as some
other candidate(s) had a later `closeTime`, and the current protocol selects
the maximum.
- The ledger-closing process begins, and the ledger header is updated to reflect
the newly-chosen `closeTime`.
- The ledger-closing process reaches the point of applying transactions.
Each transaction is charged a fee and consumes its sequence number, and is then
re-validated (and then applied if and only if the re-validation succeeds).
`T`'s `closeTime` turns out to be before the newly-chosen `closeTime`. `T`
therefore returns `txTOO_LATE` without being applied.

So `T` consumes a sequence number (and is charged a fee) but is never applied.
There is a design pattern in some smart contracts which breaks, allowing
inconsistent transactions to be committed, if it encounters this race: a smart
contract might submit a chain of transactions, with incrementing sequence
numbers, each intended to be executed only if the previous ones were. If there
were transactions that were intended to be constrained by the incrementing
sequence numbers only to execute if `T` executed, they would do so because `T`
had consumed its sequence number, even though it was never applied. The
dependent transactions would then perform operations which the smart contract
had intended to be performed only if `T` had been executed. This would
potentially be arbitrarily bad for the smart contract.

A less catastrophic problem, but an inefficiency, is that externalizing
transaction sets that contain significant numbers of transactions that are
doomed to return `txTOO_LATE` during ledger close might mean that we're
choosing sets that we would have considered worse than other available
candidates if we had taken into account that those transactions were doomed.

More cosmetically, it might also come across as unfair, or at least
counterintuitive, considering that getting into a transaction set is what a
client pays a fee for, that the occasional transaction might be charged a fee
despite being rejected for not having made it into a transaction set in time.
Although this race is different from the perspective of stellar-core from the
usual case of `txTOO_LATE` in which the transaction is discovered to have
expired before being accepted into the transaction set, the race looks the same
from the client's perspective as any other `txTOO_LATE` -- except that when the
race is the cause, the client is, inexplicably from its perspective, charged a
fee.

### Goals Alignment

This CAP is aligned with the following Stellar Network Goals:

- The Stellar Network should be secure and reliable.

- The Stellar Network should run at scale and at low cost to all users.

It also aligns with the following Stellar Network Value:

- The protocol should bias toward simplicity.

## Abstract

Currently, the nomination protocol produces, from a set of candidate
`StellarValue`s, a nominated `StellarValue` comprising the best transaction
set (as selected by a deterministic heuristic which favors larger sets) of
any candidate, the maximum `closeTime` of any candidate, and a set of
"maximal" ledger `upgrades` from all candidates.

We propose to change the protocol to take the `closeTime` from the same
candidate `StellarValue` as the best transaction set (the heuristic which
decides which transaction set is best does not change; nor does the method of
generating `upgrades`).

In particular, this affects how soon the core notices when transactions expire.

## Specification

There are no changes to any XDR in this CAP. The treatment of the `closeTime`
in the `StellarValue` XDR in some code paths changes, as does the use of
`SIGNED` `StellarValue`s in some places where `BASIC` ones are currently used,
but the XDR itself does not change.

This CAP proposes to change the externalized-value generation to remove the
current combining of closetimes and simply select the one from the same
nominated value as the best transaction set. (It proposes to preserve the
selection heuristic for "best transaction set" unchanged.)

This CAP also proposes to make the nomination protocol produce a signed
value (currently we externalize an unsigned, or "basic", value, because, as a
combination of candidate values, it may not be equal to any one candidate value
for which we have a signature). It does not propose any specific use of that
signature yet, but preserving the signature as part of the same protocol
change that allows it to be preserved will allow us to use it in the future if
we discover a way to do so.

As with any protocol change, the new code must remember and maintain
older protocols' behavior, and continue to use old behavior until after
the network externalizes the ledger upgrade to the new protocol, through
consensus.

Once the network has upgraded to the new protocol, the following semantic
changes occur:

- The selection during the nomination protocol of a `StellarValue` to
externalize from the set of candidates occurs by choosing the `closeTime`
from the `StellarValue` which contains the "best" transaction set (according
to the same heuristic metric that already exists for "best transaction set").
The signature of that `StellarValue` is preserved, so it remains `SIGNED`.
The externalized `upgrades` set is still computed as a maximal combination of
all the candidate upgrade sets, but that does not affect the signature -- we
were already (even in older protocols) signing only a version of the
`StellarValue` with the `upgrades` stripped.

- Validation of values during the ballot protocol phase expects to find
`SIGNED` rather than `BASIC` values, and checks their signatures.

## Design Rationale

The proposal in this CAP is the third that we have considered as a means of
fixing the problems described in the "Motivation" section. It has seemed
from the beginning that the problem could be addressed only through a CAP:
it is a question of what relationships to maintain among the payment of
fees, the consumption of sequence numbers, and the validity of transactions,
that also leads to a question of what value to externalize given a set of
candidates, and all of those questions must be answered by the Core through
consensus.

- The first idea was to change the point at which the current ledger header's
`closeTime` was updated from before transactions are applied to afterwards.
(Ledger `upgrades` are already performed after transaction applies.) This
would prevent an increase in the `closeTime` brought about by candidates other
than the one whose transaction set was selected from causing transactions to
fail (it would not occur until after they had been applied). However, this
raised the concern that a long delay between ledger closes would lead to
transactions succeeding despite having expiration times arbitrarily earlier
than the real time of their committing -- possibly long after clients had
already assumed that they would expire (and acted on that assumption).

- The second idea was to remove transactions from the best transaction set
selected by the combine-candidates code in the nomination protocol based on the
(maximal) `closeTime` just selected by that code. However, that would seem to
have opened up a new denial-of-service attack on the ledger:
an attacker could create large numbers of transactions with very high fees,
making it likely that they would be accepted into transaction sets, but with
extremely short expiration times, that would almost certainly come before the
next ledger close. They could therefore fill up the transaction sets with
such transactions, yet avoid the transaction fees, leaving the transaction
sets produced by the nomination protocol with little or no room for legitimate
transactions.

After rejecting those two possibilities for the reasons given above, we hit
on the idea in the current proposal of making the nomination protocol simply
select the `closeTime` from the `StellarValue` containing the best transaction
set (rather than doing any combining with the `closeTimes` from other
`StellarValue`s). A (well-behaved) node that had nominated the ultimately
winning `StellarValue` would already have left any expired transactions out of
its transaction set, and those transactions would neither pay fees nor consume
sequence numbers.

Always choosing a `closeTime` from the same `StellarValue` as the winning
transaction set raised the possibility, already described in the "Simple
Summary", of also preserving the _signature_ of that `StellarValue`,
from the node that nominated it (making it a `SIGNED` rather than a `BASIC`
`StellarValue`). Experimenting with implementing that possibility raised a
further design choice: because the `SIGNED` `StellarValue` also contains a set
of ledger `upgrades`, we could choose additionally to change `upgrades` to come
from the single winning `StellarValue` rather than being combined in a way
similar to the `closeTime`s as they are in the current protocol.

Here are some factors we considered before deciding not to propose to change the
`upgrade` selection while changing the `closeTime` selection.

- Upgrades do happen independently of the transactions: we could view the ledger
closes as an optimization of alternating transaction set applies and ledger
upgrades. So we did have a free choice between proposing to change upgrades in
the same way as we were proposing to change closetime, or not doing so and
letting them work differently (single-candidate-selection for closetimes and
txsets, candidate-combination for upgrades).

- There isn't any correlation that we've found among the different upgrades that
we combine, in the sense that today, if changing any given parameter `A` from
`1` to `2` and changing any given parameter `B` from `1` to `2` would each be
sensible individually, then it would also be sensible to make both changes at
once. But if the type of possible upgrades could expand to one in which certain
combinations of parameter changes would be insensible, then it might no longer
make sense just to combine each parameter individually -- the simple union of
individual changes could produce a result which was inconsistent. In that case,
the right way to combine them consistently would become ambiguous; there could
be multiple options of which none would include all candidate parameter changes.
That would not be inherently wrong -- we have the same situation with
transaction sets, and we make a heuristic but deterministic choice as to which
is the best. But it might be some argument for choosing a single nominated
upgrade in the same way as we already choose a single nominated tx set, and are
proposing to choose a single nominated closetime, just because choosing the best
single nominated set of upgrades could be simpler (there might be far fewer
options to choose among) than choosing the best consistent combination of all
tx sets.

- The selection of a single `StellarValue` from the set of candidates
during the nomination protocol reflects the literature on and formal models of
the Stellar Consensus Protocol better than the current implementation's
combination of candidates.

- The selection of a single `StellarValue` from the set of candidates
is the simplest way of ensuring that the combination function, given a
set of self-consistent candidates, produces a consistent externalized value.

Overall, changing the selection of `upgrades` to match that of `closeTime`
could be said to make the resulting protocol simpler -- but the CAP change
itself more complicated. (We did start with and do some testing of an
implementation that also changed `upgrades` selection, but in the latest
implementation we have undone that part and are changing only `closeTime`.)

Regarding potential performance impacts, using `SIGNED` `StellarValue`s in the
ballot protocol will result in the ballot protocol checking signatures in some
code paths where there were previously no signatures to check. However, the
signatures will be from nodes that had nominated values in the latest round of
the nomination protocol, so the signature cache might ameliorate any noticeable
effects of this.

## Backwards Incompatibilities

Semantic changes resulting from this CAP we hope to be limited to the
correction of the potential inconsistency of smart contract behavior
described in the "Motivation". Smart contracts, and other clients, should
not have _relied_ on that inconsistency, as it requires a race to occur,
and one which is probably very rare.

## Security Concerns

So far, we have not found new potential issues with safety or denial-of-service
introduced by the proposal in this CAP, as we did with earlier ideas that we
then rejected, as described above in the "Design Rationale".

## Test Cases

- The "SCP Driver" test, which, using the current protocol and an older one,
confirms that a set of candidate `StellarValue`s produces the expected
combined `closeTime`, changes in three ways:
- It tests that old protocols continue to behave the same way, but that
in new protocols, the combined `closeTime` is simply that of the
`StellarValue` containing the best transaction set, not the maximum
`closeTime` of all candidates.
- It is enhanced to test for expected `upgrades` as well (but these should
behave the same between old and new protocols).
- It tests that old protocols produce `BASIC` `StellarValue`s, but new
protocols produce `SIGNED` ones.
- The `PendingEnvelopes recvSCPEnvelope` and `quorum tracker` tests change to
expect `BASIC` `StellarValue`s from old protocols (as before this CAP) but
`SIGNED` `StellarValue`s from new protocols (as after this CAP).
- A test must be added to confirm that when a node selects a `StellarValue`
to nominate, it trims from the selected transaction set precisely those
transactions which would be expired according to the `closeTime` in that
`StellarValue`, and no others (in previous protocol versions, it should
trim those which it estimates are likely to expire by next transaction close).
- A test must be added to confirm that nominated `StellarValue`s are checked
for internal consistency between their close times and transaction sets --
that is, that they contain no transactions which are expired according to
their own close times (a nominated `StellarValue` that does not meet that
condition should be rejected by other nodes).

## Implementation

An implementation of this protocol has been written, which as far as the
author knows is complete and correct in its semantics; it will certainly need
refactoring, but it might constitute a demonstration of the practicality of
implementing this proposal, and of a bound on the scope of the changes required,
as well as a way of experimenting with its consequences:

[Closetime non-whitespace diffs](https://github.com/stellar/stellar-core/compare/master...rokopt:proto-622-closetime?w=1)

The changes therein are limited to stellar-core, and comprise the following:

- The protocol version is bumped (currently anticipated to be to 14).
- The herder's choice of which transactions to include in the transaction set
that it is going to nominate to depend, in new protocols only, on the close
time that it is going to nominate, rather than (as in old protocols) an
estimate of when the next ledger is likely to close.
- The herder's validation of `StellarValue`s nominated by other nodes changes
to expect, in new protocols only, that values used by the ballot protocol
are `SIGNED`.
- The herder's validation of `StellarValue`s nominated by other nodes changes
which `closeTime` to validate the transactions in the `StellarValue` against
from the last ledger close time to the `closeTime` in the `StellarValue`.
- The herder's combining of nominated `StellarValue`s into a single value to
externalize, in new protocols only (it preserves its old behavior in old
protocols), changes in the following ways:
- It chooses a "combined" `closeTime` directly from the `StellarValue`
containing the winning transaction set. In particular, it does not attempt
any trimming of the winning transaction set. (That set should already be
consistent with `closeTime`, since it came from one nominated `StellarValue`,
which the herder has already validated by the time it combines them.)
- It returns `SIGNED` `StellarValue`s.
- Tests are modified and extended as described above under "Test Cases".

0 comments on commit 1a155df

Please sign in to comment.