Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the LedgerHD initial sketch #4296

Merged
merged 1 commit into from
Jan 25, 2023

Conversation

jasagredo
Copy link
Contributor

Closes #4074

@jasagredo jasagredo added consensus issues related to ouroboros-consensus UTxO-HD 📒💽 labels Jan 20, 2023
@jasagredo jasagredo self-assigned this Jan 20, 2023
@jasagredo jasagredo linked an issue Jan 20, 2023 that may be closed by this pull request
Copy link
Contributor

@JaredCorduan JaredCorduan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, this looks really excellent, thank you @jasagredo ! I made a few small comments.

ouroboros-consensus/docs/future-ledger-hd.md Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
ouroboros-consensus/docs/future-ledger-hd.md Outdated Show resolved Hide resolved
@jasagredo
Copy link
Contributor Author

Thanks for your comments @JaredCorduan and @dnadales. I think I have addressed all of them. Also have in mind that once this document is good enough to be considered "a rough sketch of a plan" it should be good to go, we shouldn't invest much time on it.

Base automatically changed from dnadales/add-ad-hoc-benchmark-results to feature/utxo-hd January 23, 2023 15:16
Copy link
Contributor

@JaredCorduan JaredCorduan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, thank you @jasagredo !

@jasagredo jasagredo force-pushed the jasagredo/ledger-hd-initial-sketch branch from 5485b74 to 4ce37c9 Compare January 25, 2023 08:55
@jasagredo jasagredo merged commit ba3a171 into feature/utxo-hd Jan 25, 2023
@jasagredo jasagredo deleted the jasagredo/ledger-hd-initial-sketch branch January 25, 2023 08:55
Copy link
Contributor

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments below.

One thing that is not clear from this doc, but needs to become clear is what precisely are the new on-disk tables, and how do they correspond to the existing ledger state.

In the style of the prototype that would be done by paramaterising the ledger state types by the on-disk table types and using them in the appropriate places (at the appropriate types). That would make it precise.

Comment on lines +72 to +75
newtype PoolDistr c = PoolDistr
{ unPoolDistr ::
Map (KeyHash 'StakePool c) (IndividualPoolStake c)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that stake distribution by pool is not a large table, and thus should remain in memory, not on disk.

That may be the intention but it's unclear since above we say "the tables that are expected to be moved to the disk ...".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the funny thing, though, @dcoutts : once the large structures are moved to a database, ledger doesn't even need the per-pool stake distribution! it is only used by consensus for the leader check, and it entirely derived from the other large data structures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but that doesn't obviously mean that it should be moved across the ledger/consensus boundary. I'm not saying it shouldn't but we should be clear about the reasons.

The way I would look at this is that the on-disk tables still belong to the ledger. The consensus provides the ledger with the facility to have on-disk tables (just as it provides state persistence, rollbacks etc). So something being in an on-disk table doesn't itself imply it has moved across the consensus/ledger boundary. It's like the consensus layer is the interpreter and the ledger is the logic or program being interpreted. The consensus layer shouldn't need to know about the content or purpose of the tables, it should just manage them as instructed by the ledger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's totally reasonable, but we are in the gray grey zone here for the consensus/ledger split. The "ledger protocol" logic/rules were moved to the consensus repo at the vasil hard fork. and all the per-pool stake distribution really is is an aggregation of the table that we know is not up for debate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this particular case, of the "protocol" rules needing access to the stake distribution, I think that does exactly fall either side of the boundary.

At a high level we would say that the Praos protocol (which is part of consensus) depends on the ledger for access to the stake distribution (or some suitable notion of a stake distribution). So maintaining that stake distribution (updating it as transactions are validated) should be the responsibility of the ledger, and then it needs to provide it to the Praos layer when needed.

So I think in this case, where one would logically draw the boundary at the high level tells us fairly clearly where we should draw the boundary here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try this from another angle. The ledger should never be given the entire per-stake-credential stake distribution (henceforth the PSCSD 😁 ), since that violates the rule that the ledger never hold "large" structures in memory. Instead, the ledger should give consensus instructions for what to do.

The per-pool stake distribution (PPSD) is small, and so the ledger is fine to hold it. But how does it the ledger give consensus instructions on how to make it? The instructions are the delegation map, which could be provided incrementally (each delegation is a "delta", as is each pool retirement"). And this is already a part of the plan, we know that the delegation relation will be a table (probably as a part of the "unified map").

So I can see two ways to get the PPSD to the ledger:

  1. it maintains it's own incremental computation, using the txins, txouts, delegations, and retirements. and hopes that this always stays in since with the PSCSD maintained by consensus using the tables and the deltas provided by ledger.
  2. consensus uses the PSCSD and the delegation relation to maintain another table for the small PPSD.

And just remember, that the PPSD is not used by any code in the ledger repo any more, it is only used for the leader check. It is not even a member of what consensus deems "the ledger state".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the reasonable designs to consider are continuously incrementally maintaining the by-pool distribution or doing a bulk computation over some phase of the epoch (implemented as an incremental traversal of the snapshot of the previous epoch by-stake-cred distribution).

Continuously incrementally maintaining the by-pool distribution sounds preferable, but also sounds tricky in the presence of fractional delegation. But if you think we can do it, then it means it's all in memory which is a lot easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And just remember, that the PPSD is not used by any code in the ledger repo any more

@dcoutts as soon as I make a bold claim things begin too change. 😆 SPOs will get a stake-based vote in the Voltaire phase.

Comment on lines +83 to +100
As noted by [@JaredCorduan](https://github.com/JaredCorduan), after CIP-1694 is
complete, there will probably also be a `ssStakeMarkDRepDistr :: Map (KeyHash
'DRep c) Coin` and a `ssDRepDelegations :: VMap VB VB (Credential 'Staking c)
(KeyHash 'DRep c)`.

- The reward update

```haskell
( ne :: NewEpochState era )
& ( nesRu :: NewEpochState era -> StrictMaybe (PulsingRewUpdate (EraCrypto era)) )

data RewardUpdate c = RewardUpdate
{ deltaT :: !DeltaCoin
, deltaR :: !DeltaCoin
, rs :: !(Map (Credential 'Staking c) (Set (Reward c)))
, deltaF :: !DeltaCoin
, nonMyopic :: !(NonMyopic c)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't as clear as it could be. I don't see what the proposed on-disk table actually is.

Comment on lines +134 to +151
The update to this map is performed by `updateStakeDistribution`. It is known
before calling the `BBODY` rule which UTxOs are going to be deleted and the rule
execution logic itself knows the UTxOs that are going to be added. Therefore we
can follow a logic similar to the above:

- Before calling the ledger rule we can query for the needed entries
- We can present the ledger with the values they asked for, but restricted to
the data available in the disk + changelog
- The ledger will either provide diffs for the given values or will return
updated values that we can then diff with the provided ones
- The resulting differences can be included in the current definition of the
`DbChangelog`.

> **_PLAN:_** Consensus will have a new table on the `LedgerTables` that will
> represent the incremental stake distribution (maybe two tables, one for creds
> one for ptrs). It will have its own place on the `DbChangelog` too. The flow
> above describes the general strategy for calling the `BBODY` rule, i.e.
> `(re)applyBlockOpts`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incrementally maintained stake distribution should be a table where we normally only need to do monoidal updates. This means we turn a read/write workload into a write only workload. This is covered in the original report and is the motivation for the inclusion of monoidal update tables (including in the prototype).

This is important for performance, otherwise we are going to be doing almost as many lookups in the stake distribution table as we do lookups in the utxo table. Lookups are the bottleneck for a write-optimised disk backend.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting that we do something more like double entry booking, where we (monoidally) send updates to two separate things, additions and deletions, and only combine them when needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about two separate things (what two?) but yes I am suggesting a table that can support monoidal updates, as well as insert/delete/lookup.

So for an incrementally updated stake distribution, that maps stake address to lovelace, it would be updated by the ledger working out the delta in stake arising for a stake address from a tx, and that is the table update. So in particular, the lovelace value of a spent input is a negative delta for that input's stake address, while lovelace outputs contribute positively to the balance of their corresponding stake addresses. Note that no lookup is needed: the sum total of the lovelace for a stake address is not needed.

And the practical consequence is the update-only workload is a write-only workload, which is much cheaper than a read/write workload.

Comment on lines +231 to +248
The `checkIsLeader` functions for Praos and TPraos makes use of the stake
distribution by stake pool in the `LedgerView` (see the definitions in
`ouroboros-consensus-protocol`). If the snapshots (and therefore the
`ssStakeMarkPoolDistr` field) reside in the Consensus side, we can produce the
relevant stake distributions when needed and don't involve the ledger. In any
case this functionality is in between Ledger and Consensus so it makes sense to
move it out of the ledger.

> **_PLAN:_** Consensus will manage the snapshots to produce stake distribution
> by pool that can be used by Consensus later to resolve queries about the
> LeaderSchedule. Ledger will not know about the Snapshots. In particular, the
> UTxO-HD report includes the concept of Snapshots of tables, which would be
> used to manage and access these snapshots.

Note that this implies creating a new package or component at the
Consensus-Ledger boundary whose owner would probably be the Consensus team as
its responsibilities would be related with computations required for the
Consensus protocol (leader checks, and similar).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly unclear about what this section is porposing. The stake distribution by pool is not large, so will not be an on-disk table. It will remain as an in-memory structure as it is now, within the ledger state and ledger view. I don't see the need for any consensus/ledger interface changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented on this above.

Comment on lines +252 to +255
- Should the Ledger return the diffs? it actually internally compute diffs, but
they the diffs are applied to the values before returning. If we wanted to
return the diffs instead, there are many intermediate layers through which
they will have to be floated, but it should be doable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this refers to what I think it does, it's covered in the second report and in the prototype. The solution is to paramaterise the ledger rules by the table type, and then to use a "partial tracking map" which fits the map interface and also tracks the diffs. This gives us the diffs we want without any "real" changes in the ledger rules (just abstracting over the map interface for the on-disk tables).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
consensus issues related to ouroboros-consensus UTxO-HD 📒💽
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Validate prototype by drafting future steps for other tables
4 participants