
Proposal for transaction scheduler based on fee priority #23438

Merged
merged 5 commits into solana-labs:master on May 5, 2022

Conversation

carllin
Contributor

@carllin carllin commented Mar 2, 2022

Problem

Summary of Changes

Fixes #

related #23211

@carllin carllin changed the title Proposal for scheduler based on fee priority Proposal for transaction scheduler based on fee priority Mar 2, 2022
Additional fees were introduced to transactions as a method to allow users to bid for priority for
their transactions in the leader's queue.

Let the additional fee for a transaction `T` be defined as `F(T)`.
Contributor

`F(T)` should be defined as fee-per-CU, `(additional_fee + base_fee) / requested_cu`, to prioritize transactions, instead of just using `additional_fee`; that way, 100 additional lamports for a 1,000 CU transaction has lower priority than 10 additional lamports for a 10 CU transaction. The catch is that the base_fee (e.g. signature, write lock, etc.) is bank dependent, as the fee_structure changes over epochs.
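A minimal sketch of the suggested metric (the function name and plain integer math here are illustrative assumptions, not Solana's actual API):

```rust
// Illustrative sketch of the suggested priority metric; `fee_per_cu` and its
// integer arithmetic are assumptions for this example, not the real fee code.
fn fee_per_cu(additional_fee: u64, base_fee: u64, requested_cu: u64) -> u64 {
    (additional_fee + base_fee) / requested_cu.max(1)
}

fn main() {
    // With base_fee = 0 for clarity: 100 lamports over 1,000 CUs ranks below
    // 10 lamports over 10 CUs, as the comment above argues.
    assert!(fee_per_cu(100, 0, 1000) < fee_per_cu(10, 0, 10));
    println!("ok");
}
```

A real implementation would likely need higher-precision math (e.g. scaling before dividing) so that most small ratios don't truncate to zero.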

Contributor Author

updated!


Pipeline:
1. Sigverify
2. Scheduler
Contributor

+1 for a separate scheduler.

Member

I would add a step 0: a filter stage. Once a state auction is saturated, we can start dropping txs that are below the minimum price to be considered for inclusion.

@carllin
Contributor Author

carllin commented Mar 2, 2022

One question, are we doing away with packet_batch?

@taozhu-chicago, yes, I think we'll have to push individual transactions into the scheduler's heap, which means deserializing the transactions from the PacketBatches that SigVerify is sending to this new scheduler (maybe we could have a pool of threads doing just this).

I think this has the added benefit that we don't have to keep deserializing transactions from packets in the BankingStage threads, because they will now receive transactions instead of Packets.

@tao-stones
Contributor

One question, are we doing away with packet_batch?

@taozhu-chicago, yes, I think we'll have to push individual transactions into the scheduler's heap, which means deserializing the transactions from the PacketBatches that SigVerify is sending to this new scheduler (maybe we could have a pool of threads doing just this).

I think this has the added benefit that we don't have to keep deserializing transactions from packets in the BankingStage threads, because they will now receive transactions instead of Packets.

Right, deserializing packets into versioned_transactions isn't added cost; I did just that in my proposed PR. But it needs a bank to sanitize them into sanitized_transactions.

I was trying to avoid the additional copy from PacketBatch -> Transactions.

#### Components of the `Scheduler`:

1. `default_transaction_queue` - A max-heap `BinaryHeap<Transaction>` that tracks all pending transactions.
The priority in the heap is the additional fee of the transaction. Transactions are added to this queue
Contributor

The priority in the heap should be F(T) = fee-per-CU. Part of the calculation is the base_fee, which includes the signature, write-lock, and compute fees that all depend on the current bank's feature_set and fee_structure (in `bank::calculate_fee(...)`). We probably need to pass the leader's current bank to the Scheduler somehow.
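To make a `BinaryHeap` order by fee-per-CU rather than by the transaction's natural ordering, one option is a wrapper with a custom `Ord`. This is a sketch; `PrioritizedTransaction` and its field are hypothetical stand-ins:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Hypothetical wrapper so the scheduler's max-heap pops the highest
// fee-per-CU transaction first; the real type would also carry the tx itself.
#[derive(Eq, PartialEq)]
struct PrioritizedTransaction {
    fee_per_cu: u64,
}

impl Ord for PrioritizedTransaction {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap is a max-heap, so comparing on fee_per_cu directly
        // yields highest-priority-first popping.
        self.fee_per_cu.cmp(&other.fee_per_cu)
    }
}

impl PartialOrd for PrioritizedTransaction {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    let mut queue = BinaryHeap::new();
    queue.push(PrioritizedTransaction { fee_per_cu: 10 });
    queue.push(PrioritizedTransaction { fee_per_cu: 100 });
    assert_eq!(queue.pop().unwrap().fee_per_cu, 100); // highest first
    println!("ok");
}
```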

Contributor

@t-nelson t-nelson left a comment


This sounds familiar 🤔

I need to go over the algorithms again in the morning. Looking good though! Thanks for writing it up!

channel.

Once a BankingStage thread finishes processing a transaction `T`, it sends `T` back
to the scheduler via the same channel to signal completion.
Contributor

Do we need the whole `T`? Seems like a `()` would be sufficient.
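A minimal sketch of that completion signal with `std::sync::mpsc` (names are illustrative): sending `()` works if the scheduler keeps its own record of which locks each thread holds, while sending `T` itself back lets the scheduler derive the locks from the transaction.

```rust
use std::sync::mpsc;
use std::thread;

// Sketch: a BankingStage worker signals completion over a channel with a
// unit value instead of echoing the whole transaction back.
fn process_and_signal() -> &'static str {
    let (completion_tx, completion_rx) = mpsc::channel::<()>();
    let worker = thread::spawn(move || {
        // ... process the scheduled transaction(s) here ...
        completion_tx.send(()).unwrap(); // no copy of `T` required
    });
    completion_rx.recv().unwrap(); // scheduler side unblocks here
    worker.join().unwrap();
    "completion received"
}

fn main() {
    println!("{}", process_and_signal());
}
```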

Contributor Author

struct BlockedTransactionsQueue {
// The higher-priority transaction blocking all the other transactions in
// `blocked_transactions` below
highest_priority_blocked_transaction: Transaction,
Contributor

Isn't the root of the heap already this by definition?

Contributor Author

I had it not part of the heap. Since this transaction would be referenced/checked a lot it made sense to me to clearly delineate it from the other transactions.
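A runnable sketch of this structure, with a stand-in `Transaction` type; the `pop_head` promotion logic is an assumption about how the head would be replaced once it becomes schedulable:

```rust
use std::collections::BinaryHeap;

// Stand-in for the real transaction type; derived Ord orders by fee.
#[derive(Eq, PartialEq, PartialOrd, Ord, Debug)]
struct Transaction {
    fee: u64,
}

struct BlockedTransactionsQueue {
    // Kept outside the heap because, as discussed above, it is the
    // transaction checked on every unlock event.
    highest_priority_blocked_transaction: Transaction,
    // Lower-priority transactions waiting behind it, a max-heap by fee.
    other_blocked_transactions: BinaryHeap<Transaction>,
}

impl BlockedTransactionsQueue {
    // When the head becomes schedulable, return it and promote the
    // next-highest blocked transaction into its place.
    fn pop_head(&mut self) -> Option<Transaction> {
        let next = self.other_blocked_transactions.pop()?;
        Some(std::mem::replace(
            &mut self.highest_priority_blocked_transaction,
            next,
        ))
    }
}

fn main() {
    let mut q = BlockedTransactionsQueue {
        highest_priority_blocked_transaction: Transaction { fee: 300 },
        other_blocked_transactions: BinaryHeap::from(vec![
            Transaction { fee: 100 },
            Transaction { fee: 200 },
        ]),
    };
    let head = q.pop_head().unwrap();
    assert_eq!(head.fee, 300);
    assert_eq!(q.highest_priority_blocked_transaction.fee, 200);
    println!("ok");
}
```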

Write(Pubkey),
}
```
4. `blocked_transactions` - A `HashMap<Signature, Rc<BlockedTransactionsQueue>>` keyed by
Contributor

One entry per transaction in the BlockedTransactionQueue, right?

Contributor Author

Yeah, a transaction should only be entered into this heap once; guess we need an existence check/dedup.

other_blocked_transactions: BinaryHeap<Transaction>
}
```
5. `blocked_transaction_queues_by_accounts` - A `HashMap<Pubkey, Rc<BlockedTransactionsQueue>>` keyed by
Contributor

How will this work if the same account is referenced in multiple transactions?

Contributor Author

they all get stuffed into the heap in the same BlockedTransactionsQueue
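A small sketch of that sharing, with stand-in types (a `RefCell<Vec<u64>>` of tx ids plays the role of `BlockedTransactionsQueue`): each account key maps to the same `Rc`, so an unlock on any of the accounts reaches the one shared queue.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Stand-in for BlockedTransactionsQueue; tx ids stand in for transactions.
type Queue = RefCell<Vec<u64>>;

// Map every account a blocked transaction needs to the SAME shared queue.
fn index_queue(
    accounts: &[&'static str],
    queue: &Rc<Queue>,
) -> HashMap<&'static str, Rc<Queue>> {
    accounts.iter().map(|&a| (a, Rc::clone(queue))).collect()
}

fn main() {
    let queue = Rc::new(RefCell::new(vec![42])); // one blocked tx, id 42
    let by_account = index_queue(&["A", "B", "C"], &queue);
    // A mutation through one account key is visible through every other key:
    by_account["A"].borrow_mut().push(7);
    assert_eq!(*by_account["C"].borrow(), vec![42, 7]);
    println!("ok");
}
```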

docs/src/proposals/fee_transaction_priority.md — two outdated review threads, resolved

#### Algorithm (Main Loop):

Assume `N` BankingStage threads:
Contributor

If we're smart about the implementation, I think we can track a separate state per banking thread for each iteration

Contributor Author

what kind of state? Was thinking the banking thread state in the scheduler would be pretty lightweight, just channels to send transactions.

docs/src/proposals/fee_transaction_priority.md — outdated review thread, resolved
2. If `T1` cannot be processed before `T2` because there's already a transaction currently being
processed that contends on an account `A`, then `T2` should not be scheduled if it would grab
any account locks needed by `T1`. This prevents lower fee transactions like `T2` from starving
higher paying transactions like `T1`.
Contributor

@tao-stones tao-stones Mar 3, 2022


To clarify, the starvation:

  • is a pre-existing issue, because banking-stage threads are isolated from each other: a tx that needs many accounts in one thread can be starved when other threads keep submitting txs that take one of those accounts.
  • becomes necessary to solve now, because we are promising to prioritize txs by fee/CU.
  • requires some kind of central scheduling scheme across banking-stage threads to solve.

Are these the correct premises?

Contributor

if you have the following 5 txs:

  1. fee rate 300. write locks accounts: A, B, C
  2. fee rate 200: write locks accounts: B, C, D
  3. fee rate 250: write locks accounts: C, D
  4. fee rate 400. write locks accounts: E, F
  5. fee rate 500. write locks accounts: D

would the tx batching then be: [[5, 4, 1], [3], [2]]?
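Under the blocking rule described in the proposal, where a deferred transaction also reserves its locks so lower-fee transactions can't starve it, a greedy sketch does produce that batching. This is a hypothetical model for intuition, not the actual scheduler:

```rust
use std::collections::HashSet;

// Greedily build batches in fee order. A tx is deferred if it conflicts with
// locks held by the current batch OR with locks needed by a higher-fee
// deferred tx, so low-fee txs cannot starve high-fee ones.
fn schedule_batches(mut pending: Vec<(u32, Vec<char>)>) -> Vec<Vec<u32>> {
    let mut batches = Vec::new();
    while !pending.is_empty() {
        let mut batch_locks: HashSet<char> = HashSet::new();
        let mut blocked_locks: HashSet<char> = HashSet::new();
        let mut batch = Vec::new();
        let mut deferred = Vec::new();
        for (id, accounts) in pending {
            let conflict = accounts
                .iter()
                .any(|a| batch_locks.contains(a) || blocked_locks.contains(a));
            if conflict {
                // Reserve the deferred tx's locks so nothing cheaper takes them.
                blocked_locks.extend(accounts.iter().copied());
                deferred.push((id, accounts));
            } else {
                batch_locks.extend(accounts.iter().copied());
                batch.push(id);
            }
        }
        batches.push(batch);
        pending = deferred;
    }
    batches
}

fn main() {
    // The five txs from the comment above, sorted by fee rate descending:
    let txs = vec![
        (5, vec!['D']),
        (4, vec!['E', 'F']),
        (1, vec!['A', 'B', 'C']),
        (3, vec!['C', 'D']),
        (2, vec!['B', 'C', 'D']),
    ];
    assert_eq!(schedule_batches(txs), vec![vec![5, 4, 1], vec![3], vec![2]]);
    println!("ok");
}
```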

Contributor

  1. 500: [A, B, C(read)]
  2. 450: [A, C(write)]
  3. 400: [C(read), D]

What about these transactions with accounts and read/write flags? You could build batches like:

  • [[1, 3], [2]]
  • [[1], [2], [3]]

The second option would respect the fee ordering, assuming you care about read-lock fees; the first one would result in fewer batches.

@carllin
Contributor Author

carllin commented Mar 3, 2022

@buffalu feel free to take a look as well :)

1. Once a BankingStage thread finishes processing a batch of transactions `completed_transactions_batch`,
it sends the `completed_transactions_batch` back to the scheduler via the same channel to signal completion.

2. Upon receiving this signal, the BankingStage thread processes the locked accounts
Member

typo? maybe Upon receiving this signal, the Scheduler thread processes

Comment on lines +170 to +178
`transaction_accounts` for each `completed_transaction` in `completed_transactions_batch`:
```
let mut unlocked_accounts = vec![];
// First remove all the locks from the tracking list
for locked_account in transaction_accounts {
    if self.locked_accounts.remove_reference(locked_account) {
        unlocked_accounts.push(locked_account.key());
    }
}
```
Member

@ryoqun ryoqun Mar 9, 2022


fyi, I'm proposing another drastic change, departing from the batching altogether: #23548

if the scheduler thread still doesn't unlock at all until the whole batch of (completed) transactions is returned from the banking stage, I think we still suffer from somewhat constrained TPS due to the problem described there (or I might be wrong...)


@nikhayes nikhayes Mar 9, 2022


Agreed. What I think will happen is that people participating in a Raydium IDO or NFT drop will submit larger numbers of fee-prioritized transactions than other users, and because those fee-prioritized transactions get batch preference, they will drown out the ability of other transactions in the pool to be executed in parallel, resulting in the same TPS drops. It'll also be tough to deal with these acute spamming periods via congestion fee raises, since it takes time to increase the fees. Maximizing throughput/parallelism as much as possible will resolve demand spikes and relieve the spamming more quickly.

Contributor

Maximizing throughput/parallelism as much as possible will more quickly solve demand issues and relieve the spamming more quickly

gm! https://en.wikipedia.org/wiki/Induced_demand


@nikhayes nikhayes Mar 9, 2022


Hmm, yeah, understandable, but I think induced demand beats the reduced demand that comes from not meeting the expectations set by Solana's main selling point, high TPS. It seems like supply is going down while demand is high. During congestion I was seeing blocks that at times had zero votes or non-votes, which seems strange (Zan spotted the first one, and then I saw others). I just think it would be ideal to have a design where other types of transactions (i.e. payments) can better flow around jams caused by certain groups of transactions; right now they can't when batches are filled with a monoculture of transactions. I mentioned a design idea in response to ryoqun's new thread.

Anyway, just looking to brainstorm with you Trent :)

#23548 (comment)

Member

spamming periods with the congestion fee raises since it will take time to increase the fees.

Well, I don't think Solana's priority fee is quite the same thing as ETH gas. ETH gas serves a global fee market while Solana's serves local fee markets, and the former operates under persistently saturated conditions while the latter sees temporary spikes. That's why I want the bidding info to be on chain, so that people can react quickly to demand.

It seems like supply is going down with a lot of demand. I was seeing blocks during congestion that had 0 votes or non-votes at times which seems strange

This is true, but this is a bug, not design intention. With this proposal plus #23548 and #21883, we're trying to localize the heavily contended accounts, while payments stay as fast and cheap as possible, per Solana's selling point. :)


Ah yeah, agreed that it seems like a bug-type situation. With the current situation, though, where parallelization becomes compromised (probably?), what should be a local fee market essentially turns into a global one. That's why I worry about the scenario where an NFT drop happens -> NFT people prioritize their transactions -> parallelism drops with the batch design -> people outside the NFT drop need to add more priority than the NFT people. It looks like the newer transaction scheduler should handle this situation better though, so that's good to see.

@ryoqun
Member

ryoqun commented Mar 16, 2022

Overall, I think this is quite a good direction with concrete algorithms.

some random thoughts:

  • Maybe apply a non-linear (yet small?) tax on txs with a huge number (~30) of accounts, because of their scheduling cost (many collection lookups in the scheduling stage).
  • Provide the ability to displace an existing pending transaction with a higher priority fee, while avoiding replay risk?

That being said, I still think all_transaction_queues should be put on-chain:

  • Clients will still spam the network with their txs, as they can't get rid of the fear of others' spamming. ;)
  • Worse, they still can't get any clue whether their transaction is in the all_transaction_queues or not (i.e. recognized by the cluster).
  • It is also hard to know the current prevailing priority fee for given accounts of interest.
  • Exposing this off-chain state would be a bit challenging operationally: RPC nodes aren't staked, and staked nodes shouldn't expose RPC.

@carllin
Contributor Author

carllin commented Mar 16, 2022

Maybe apply a non-linear (yet small?) tax on txs with a huge number (~30) of accounts, because of their scheduling cost (many collection lookups in the scheduling stage).

Yeah this is one of the tradeoffs here for the single-threaded scheduler, which is now the bottleneck for all of banking stage.

For now this might be ok at <10k TPS. My intuition is that these hashmap lookups should be fast, and since they are only incurred for the next-highest-fee transaction, they only occur on transactions that are about to be scheduled. Compared to the rest of transaction execution, I think they will be rather cheap. Even at 10k TPS with 30 accounts per transaction, that's 300,000 lookups, which should be worst case a few ms?

And +1 for adding a fee for data accesses that will have to be factored in later I think.

@stale

stale bot commented Apr 16, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added and then removed the stale label Apr 16, 2022
@stale

stale bot commented Apr 30, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Apr 30, 2022
@carllin carllin merged commit 6a9a7df into solana-labs:master May 5, 2022
@carllin carllin requested a review from jarry-xiao May 6, 2022 17:54
for account_key in unlocked_accounts {
    if let Some(blocked_transaction_queue) = self.blocked_transaction_queues_by_accounts.get(account_key) {
        // Check if the transaction blocking this queue can be run now, thereby unblocking this queue
        if blocked_transaction_queue.highest_priority_blocked_transaction.can_get_locks() {
Contributor

Instead of unblocking only the top of queue, you can unblock the first k elements to potentially avoid starvation. After processing you can also allow any of the first q elements of the global blocked queue to skip the line if they are now freed up.
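A sketch of that idea with hypothetical names (`can_get_locks` is modeled as a closure, and plain fee values stand in for transactions): pop up to `k` blocked entries, schedule the ones whose locks are now free, and push the rest back so fee ordering is preserved.

```rust
use std::collections::BinaryHeap;

// Examine up to `k` blocked txs in fee order; schedule those whose locks are
// free, return the rest to the heap. Lower-fee txs thus get a chance to run
// without being permanently hidden behind a still-blocked head.
fn drain_unblocked<F>(
    blocked: &mut BinaryHeap<u64>, // fees as stand-ins for transactions
    k: usize,
    can_get_locks: F,
) -> Vec<u64>
where
    F: Fn(u64) -> bool,
{
    let mut schedulable = Vec::new();
    let mut still_blocked = Vec::new();
    for _ in 0..k {
        match blocked.pop() {
            Some(tx) if can_get_locks(tx) => schedulable.push(tx),
            Some(tx) => still_blocked.push(tx),
            None => break,
        }
    }
    blocked.extend(still_blocked);
    schedulable
}

fn main() {
    let mut blocked = BinaryHeap::from(vec![500, 400, 300, 200]);
    // Suppose only the 400 and 200 txs can take their locks right now:
    let got = drain_unblocked(&mut blocked, 3, |tx| tx == 400 || tx == 200);
    assert_eq!(got, vec![400]); // examined 500, 400, 300; only 400 was free
    assert_eq!(blocked.len(), 3);
    println!("scheduled: {:?}", got);
}
```

As the follow-up comment notes, letting txs skip the line this way would also need a counter of how often the head was cut, so the highest-priority tx doesn't get starved in turn.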

Contributor

We discussed separately, but it probably also makes sense to keep track of the number of times that the leader was cut, as to prevent the highest prio tx from also getting starved

@buffalu
Contributor

buffalu commented May 19, 2022

I have a WIP scheduler here that I'm still trying to convince myself works as designed 😆

https://github.com/buffalu/solana/pull/1/files#diff-ef114a0b70f4af706d610895893f1d3146402fdc427971af41db5dc1fe866186
