# SIMD: (APExB) Asynchronous Program Execution and Broadcast #45

## Conversation
> * Leader - the current leader for the slot that will propose a PoH
>   ledger full of Votes and UserBlockEntry
>
> * Builder - a node that is scheduled to propose a block with non
proposer?
Isn't builder a better name? Proposer is too generic, since the leader also proposes a block
a proposer proposes the block to the network that others vote on. in this case i think they'd be proposing the ordering?
leader proposes the block technically. The builder builds UserBlocks that are broadcast over turbine, but it's the leader, just like today, that actually makes the block that validators vote on.
> The N concurrent Builders have 200ms slots to create blocks out of
> user transactions. These are transmitted to the network. The leader
> receives and decodes them and generates a UserBlockEntry, and adds it
> to PoH as soon as the leader's PoH has passed the UserBlockSlot.
how do they determine which one to add? is this in the world of bankless leader (leader isn't executing transactions, just doing poh recording + shred signing?)
what if multiple UserBlockEntries contain the same tx?
The second one is skipped.
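A minimal sketch of that first-seen-wins rule, using hypothetical `Transaction` and `UserBlock` types (the SIMD doesn't pin down the data structures in this thread):

```rust
use std::collections::HashSet;

// Hypothetical types for illustration only; not defined by the SIMD.
struct Transaction {
    signature: [u8; 64],
}

struct UserBlock {
    transactions: Vec<Transaction>,
}

/// Merge UserBlocks in PoH order, keeping only the first occurrence of
/// each transaction signature; the second one is skipped.
fn merge_dedup(blocks: Vec<UserBlock>) -> Vec<Transaction> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for block in blocks {
        for tx in block.transactions {
            // `insert` returns false if the signature was already seen.
            if seen.insert(tx.signature) {
                merged.push(tx);
            }
        }
    }
    merged
}
```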
Leader adds the UserBlockEntry to PoH as soon as it is received. Since UserBlocks are transmitted over turbine, if a leader sees it, it's very likely the rest of the network has seen it as well.
i thought the leader was doing the ordering by poh?
> "Each UserBlock is assumed to have been created simultaneously during
> the UserBlockSlot that it was encoded by the leader. For each
> UserBlock, the transactions are ordered by priority fee before
> execution."
The leader has 200ms to include all the valid UserBlocks. Each UserBlock is assumed to have been created at the same time, so the TX ordering between the user blocks is based on priority fees in the transactions. The PoH ordering of user blocks is just used as a tie breaker.
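A sketch of that merge rule, again with hypothetical types: all transactions from the slot's UserBlocks are sorted by priority fee, and the PoH order of the containing UserBlock breaks ties.

```rust
// Hypothetical types; field names are illustrative, not from the SIMD.
struct Transaction {
    priority_fee: u64, // lamports per requested CU
}

struct UserBlock {
    transactions: Vec<Transaction>,
}

/// Return (user_block_index, tx_index) pairs in execution order:
/// higher priority fee first, earlier PoH order as the tie breaker.
fn execution_order(user_blocks: &[UserBlock]) -> Vec<(usize, usize)> {
    let mut order: Vec<(usize, usize)> = user_blocks
        .iter()
        .enumerate()
        .flat_map(|(b, blk)| (0..blk.transactions.len()).map(move |t| (b, t)))
        .collect();
    order.sort_by(|&(b1, t1), &(b2, t2)| {
        let fee1 = user_blocks[b1].transactions[t1].priority_fee;
        let fee2 = user_blocks[b2].transactions[t2].priority_fee;
        fee2.cmp(&fee1) // higher fee first
            .then(b1.cmp(&b2)) // tie breaker: PoH order of the UserBlock
    });
    order
}
```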
> ### Fork Choice
>
> If a validator doesn't have the builder's UserBlock, the validator
im confused where they repair from here if they're the only one getting the UserBlock. unless its propagated to the entire cluster somehow
UserBlocks are sent over turbine. So repair and everything else works as is.
> im confused where they repair from here if they're the only one getting the UserBlock. unless its propagated to the entire cluster somehow
All validators that would vote on the fork containing the user block must get all user blocks for that fork before voting on it. Voting on a fork without even verifying that the user blocks it references exist would allow attacks on the network by malicious nodes that get user block entries inserted into blocks but never provide the actual transaction data. Not sure how anyone can expect to sanely evaluate the state of a fork for which certain transactions are hidden and never made available.
That is correct. Before voting on a fork, each validator has to have all the data, including all the data from all the user blocks in that fork. Otherwise there is no way to guarantee that everyone can execute, because the data could be withheld.
> ### UserBlock execution
>
> Each UserBlock is assumed to have been created simultaneously
a little confused in this paragraph; is the builder hashing their block with PoH? if the leader processes userblocks at the 200ms mark, why does the userblock need poh? can the userblock just be submitted with a poh in the past to get in the block earlier, which means all userblock submitters will not advance poh to game the system?
UserBlocks don't need PoH, but follow the same ledger format with entries. Fixed it in the update.
> of transactions for execution by paying a priority fee for all of
> them, and executing the whole batch together.
>
> Priority fees now also imply execution priority.
LFG
> transition instead of the full 2/3+, because if 1/3+ are incorrect
> the network will halt anyways.
>
> ## Impact
looks like we might want to include a bundle primitive in here too to avoid splitting up bundles that get reordered/not guaranteed to execute atomically. can imagine bundle like:
`[user_high_priority_tx, arb_low_priority_tx]`
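A rough sketch of what such a bundle primitive could look like; all names here are hypothetical illustrations, since the BundleTransaction added later is not specified in this thread:

```rust
// Hypothetical layout, not the SIMD's actual definition.
struct Transaction {
    priority_fee: u64,
}

/// A bundle groups transactions that must execute together, in order,
/// and atomically: either every transaction lands or none do.
struct BundleTransaction {
    /// Extra fee bid for the bundle as a whole, on top of the
    /// per-transaction fees, so the whole batch gets scheduled together.
    bundle_priority_fee: u64,
    /// Executed in this exact order, e.g.
    /// [user_high_priority_tx, arb_low_priority_tx].
    transactions: Vec<Transaction>,
}
```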
Added the BundleTransaction
thoughts:
> of user transactions. These are transmitted to the network. The
> leader receives and decodes them and generates a UserBlockEntry,
> and adds it to PoH as soon as the leader's PoH has passed the
> UserBlockSlot.
this is also nice bc you can spend more time scheduling ahead of time and massively parallelize the execution of these transactions (rough sketch after this list):
- sort by priority
- DAG
- batch
- execute
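A hedged sketch of that pipeline with hypothetical types: the DAG step is approximated greedily here by repeatedly peeling off a batch of transactions whose write sets don't conflict, so each batch can run in parallel.

```rust
use std::collections::HashSet;

// Hypothetical transaction shape; real account metadata is richer.
struct Tx {
    priority_fee: u64,
    write_accounts: Vec<u64>, // account keys this tx write-locks
}

/// Sort by priority, then greedily form conflict-free batches.
/// Batches execute sequentially; transactions within a batch can
/// execute in parallel, since their write sets are disjoint.
fn schedule(mut txs: Vec<Tx>) -> Vec<Vec<Tx>> {
    // 1. sort by priority (highest fee first)
    txs.sort_by(|a, b| b.priority_fee.cmp(&a.priority_fee));
    let mut batches: Vec<Vec<Tx>> = Vec::new();
    while !txs.is_empty() {
        // 2./3. greedy stand-in for the DAG + batch steps
        let mut locked: HashSet<u64> = HashSet::new();
        let mut batch = Vec::new();
        let mut deferred = Vec::new();
        for tx in txs {
            if tx.write_accounts.iter().all(|a| !locked.contains(a)) {
                locked.extend(tx.write_accounts.iter().copied());
                batch.push(tx); // no conflict with this batch
            } else {
                deferred.push(tx); // conflicts; try in a later batch
            }
        }
        batches.push(batch); // 4. each batch is executed in parallel
        txs = deferred;
    }
    batches
}
```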
> ## New Terminology
>
> * UserBlock - a block full of non-vote transactions.
to keep terminology, does RecordBatch or EntryBatch make more sense? this is essentially what the current validator sends to poh but in batch form? Vec<Vec>
> during its scheduled slot. For each UserBlock, the transactions
> are ordered by priority fee before execution. If two transactions
is this within a single UserBlock, or within the stage where multiple UserBlocks are merged?
Within the stage when they are merged.
> leader. Leaders also have to spend a ton of resources on prioritization
> of transactions.
>
> 2. Executing programs before voting is a bottleneck. Fork choice
Fork choice does depend on program execution: the stake program can change stake weight of voters which influences fork choice. Stake weight can only change at epoch boundaries though so there must be a 'sync' at epoch boundaries, where validators must have "caught up" on executing all tx at an epoch boundary before they can vote beyond that epoch boundary.
Is this wrong?
Yes. Nodes need to be able to compute a snapshot once an epoch.
> Yes. Nodes need to be able to compute a snapshot once an epoch.
Would prefer if this issue was addressed more completely than a one-liner. Given that the whole purpose of this proposal is to increase asynchronous execution, having a big synchronization point once per epoch seems like a big deal. It's not clear to me that having a period of asynchronous execution followed by a synchronization point is an overall win, given that it will introduce a period of time at the beginning of each epoch where "nothing new happens until everyone is caught up". Is asynchronous execution during later parts of an epoch worth the reduction in throughput during early parts of an epoch?
Nodes can't fall behind that much because the overall CU limits are set for synchronous execution. But with the option of async execution it is much easier to catch up. Raw ledger processing without dealing with the network is 20-30x faster.
Might want to temper your enthusiasm. This will make arbitrage a lot more difficult, since unless you can execute transactions faster than your competitors, you will be at a disadvantage. They will know the "current state" but you will only know "state at some time in the past". Trying to write arbitrage tx against old state seems like a really good way to waste priority fees.
this will make arbitrage easier to be competitive, not harder, bc its less about latency and more about how much you're willing to pay
> Each UserBlock is assumed to have been created simultaneously during
> the UserBlockSlot that it was encoded by the leader. For each
> UserBlock, the transactions are ordered by priority fee before
This predefined scheduling mechanism should be greatly expanded. It is making the same mistake that the original Solana design made: not thinking through all of the expected costs of transaction execution and ensuring that transactions are prioritized for execution in a way that maximizes efficiency. Transactions should be ordered in decreasing order by (total_fees / expected_execution_cost). This would mean that assuming that the expected_execution_cost can be approximated closely, then transactions will execute in priority order based on how much they pay to execute versus how much they cost to execute.
However, since you want to predefine the scheduling mechanism here and bake it into the consensus algorithm, you will have to predefine how to compute expected execution costs. This should include write locks, read locks, total accounts referenced, compute units paid for, byte size of tx, etc.
You should also define all the mandatory fees - it would be a good idea to enforce a mandatory fee for each of the "cost" categories mentioned in the previous paragraph. This will improve fee predictability for users.
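A minimal sketch of the ordering rule this comment proposes; the cost categories come from the comment, but the weights are invented for illustration and would have to be fixed by the protocol:

```rust
// Cost inputs per transaction; the weights below are made-up placeholders.
struct TxCosts {
    total_fees: u64,
    write_locks: u64,
    read_locks: u64,
    accounts_referenced: u64,
    compute_units: u64, // requested up front
    byte_size: u64,
}

/// Approximate expected execution cost as a weighted sum of resources.
fn expected_execution_cost(c: &TxCosts) -> u64 {
    1_000 * c.write_locks
        + 200 * c.read_locks
        + 50 * c.accounts_referenced
        + c.compute_units
        + 10 * c.byte_size
}

/// Effective priority: fees paid per unit of expected cost.
/// Transactions would be sorted in decreasing order of this value.
fn effective_priority(c: &TxCosts) -> f64 {
    c.total_fees as f64 / expected_execution_cost(c).max(1) as f64
}
```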
TL;DR ordering by priority fee alone is wholly inadequate.
I am not sure I agree. The UserBlock will be limited in the amount of CUs it can take up, so the order of execution can be based on priority fee, which is lamports per CU. CUs will take into account the cost of locks and reads and writes. That is outside the scope of this design. A different design should cover how each resource is counted towards the CUs used by the transaction. TXs must request the CUs up front, so the amount of compute used is known ahead of time.
> Fork choice and voting are not blocked by user block propagation or
> execution of user transactions.
>
> Builders are going to capture all the MEV.
I do not believe this is correct. Builders will have to transmit some amount of the MEV to validators via priority fees in order to ensure that their BundleTransactions execute. A leader does not have to accept a BundleTransaction, it can ignore it, if the total fees paid by all tx in the bundle + the bundle priority fee do not make the bundle attractive enough to schedule ahead of other tx/bundles.
For this reason, builders will have to compete on priority fees for bundles, which will naturally transmit a significant fraction of MEV to validators.
How can the leader ignore a BundleTransaction? It's part of the UserBlock, so the entire block must be accepted, including the BundleTransaction.
added some clarifications
Hey, kinda confused here: the builders can reorder txns and capture the MEV, but what makes a block attractive enough for the leader to include it in the slot, rather than just aiming for lower latency?
Builders must offer some reward (from the priority fee) to the validator, right?
priority fees and base fees go to the leader
It would be really awesome to get a state machine diagram for how clients are supposed to behave under this new scheme.
> Multiple nodes can operate as Builders on the network concurrently.
> So clients can pick the nearest one, and the bandwidth to schedule
> and prioritize transactions is doubled. There needs to be a design
> for BundleTransactions that allows the bundler to prioritize a batch
> of transactions for execution by paying a priority fee for all of
> them, and executing the whole batch together.
>
> Priority fees now also imply execution priority.
Can you elaborate? Why is that the case now?
See the UserBlock execution section.
@lheeger-jump can make one on excali if you like, been reading about the eth proposal to have multiple concurrent block producers and this kinda feels related.
yes, everything is converging. I think the main difference is that solana has turbine and will provision validators to saturate the bandwidth available to them
But latency is also "how much you're willing to pay".
By clients do you mean wallets? How do they decide if something has been confirmed?
> UserBlockEntry.
>
> The N concurrent Builders create blocks out of user transactions.
> These are transmitted to the cluster via turbine. The leader
Is the leader just another node in the turbine path receiving user blocks?
Yep, if the leader observes the UserBlocks, it's very likely that the supermajority of the cluster has as well.
Some initial comments/concerns. Nice write up!
> This feature changes how the ledger is broadcast and executed. It
> separates proposing blocks full of user transactions from blocks
> with votes. It allows for N concurrent builders of user transaction
is `N` here meant to be a fixed limit or an arbitrary number?
Configured by the cluster. Builders need to be scheduled ahead of time.
> ### UserBlock Compute Limits
>
> If the overall compute capacity for user transactions per leader
> block is 48m CU, and the cluster is configured with 2 builders, then
> each UserBlock can use no more than 48m/4, or 12m CU.
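The 48m/4 arithmetic with 2 builders appears to assume two 200ms UserBlockSlots per 400ms leader slot; a tiny sketch under that assumption (which the quoted text does not state explicitly):

```rust
/// Per-UserBlock CU budget, assuming the leader block's CU budget is
/// split evenly across builders and UserBlockSlots. The two-slot
/// factor is an assumption made to match the 48m/4 = 12m example.
fn user_block_cu_limit(leader_block_cu: u64, builders: u64, slots_per_leader_slot: u64) -> u64 {
    leader_block_cu / (builders * slots_per_leader_slot)
}

fn main() {
    // 48m CU, 2 builders, 2 UserBlockSlots -> 12m CU per UserBlock.
    assert_eq!(user_block_cu_limit(48_000_000, 2, 2), 12_000_000);
}
```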
This is a reasonable place to start imo. However, I have concerns that this could have unintentional side effects that lead to smaller blocks.
Validators are not spread evenly across the world, nor are users. There is significant clustering in North America and Europe.
As a hypothetical, let's say we have 2 builders: one in Amsterdam and the other in Wellington, NZ (roughly opposite sides of the world). The majority of users would simply target their nearest builder, which will likely be Amsterdam. The Amsterdam block gets packed and leaves people out while the Wellington block sits (relatively) idle.
It's certainly not a liveness issue, but I wonder if there is something that could be done to reduce this potential impact.
NOT WELL THOUGHT OUT SPITBALL IDEA
Could we have more builders than can possibly fit into the UserBlockSlot, and the leader takes the most heavily packed UserBlockEntries? This might be more complicated since then the leader has to scan these entries before selecting it.
Example above: 3 builders (AMS, WEL, +TOR).
AMS, TOR get packed because they are geographically (reduced latency) closer to users, and WEL gets excluded by the leader.
ugh, thinking about this more...maybe not a good idea. There's no way for non-leader to verify the leader chose the most packed block. Though the leader would (hopefully) be incentivized to choose the most packed due to fee-collection?
Yea I generally agree, we could figure out some way to do work sharing. But I think this is a v2 optimization.
Another reason for probably very small blocks: the builder only knows the compute limits requested, not the compute units consumed.
> Builders should be shuffled and scheduled according to stake weight.
>
> TBD, deciding how builders and leaders should split the fees from
Something to consider here is how to handle duplicate transactions.
sorry I might be dumb but don't we already have dedup features?
No problem, my previous message was not clear 😄
We have deduping that prevents txs from appearing in multiple different slots. With this proposal, we have multiple builders who could potentially include the same transaction in their UserBlock, since they can't know ahead of time what the other will include. We need to ignore one of them so we don't process the transaction twice, which is straightforward to do.
My previous comment was mainly highlighting an edge case for the economics of this deduplication. They both included a valid transaction in their UserBlock, so how do we handle fees?
Potentially a few options:
1. builder proportion of fees for the tx split evenly between the two builders
2. first builder's block to reach the leader
Option 1 seems better, but it's also going to entirely depend on how we split fees between the leader and builders in general.
Could the transaction being sent contain information specifying which leader should process it? That would avoid duplicates, and it's easier to filter the block for transactions referencing one of the particular leaders to know which fee rewards it should receive.
Duplicate TXs are skipped. See line 141
Maybe being able to specify the leader is also a feature though? Having your transaction processed by a certain leader might be undesirable for MEV reasons, so there could be reason for specificity?
Also, if filling blocks is based on requested/estimated CUs, skipping dupe transactions would lead to empty space, whereas this solution wouldn't.
Wdym processed by a certain leader? Either the UserBlock is included or it's not. The big question is if it can be included in any slot, or only the scheduled slot.
I believe what is meant is a mechanism to make sure a transaction is processed by a specific builder.
> Having your transaction processed by a certain leader might be undesirable for mev reasons so there could be reason for specificity?

If there's a certain builder I know does some MEV that is undesirable for my transaction, then I would request my tx is processed by the other.
## Problem

Transmitting blocks of user transactions and executing those transactions blocks fork choice. Only a single leader can propose user transactions at the moment, which means that client latency is going to be, on average, halfway around the world.

## Solution

Propose a design to split out user transactions into separate user blocks that are transmitted concurrently with blocks of consensus votes. This naturally leads into asynchronous execution of the user blocks.