
Ptrus/benchmark/runtime scheduler #3434

Merged 6 commits into master on Oct 21, 2020

Conversation

@ptrus (Member) commented Oct 19, 2020

Closes #3358

Done:

  • generalized incoming_queue into a txpool interface (a rough sketch of the interface follows this list)
  • added map and ordered_map txpool implementations
  • added benchmarks for the simple txpool and the simple scheduler
  • changed the scheduler implementation to use ordered_map
  • minor "fix" to the simple scheduler
  • ran some more benchmarks and gathered the results
  • code reviews (make sure to re-run the benchmarks in case there are any faults in the current implementation)
  • considered removing the incoming_queue and map txpool implementations before merging, as only the ordered queue will be used (the commits are left in, so that the implementations can easily be brought back if ever needed in the future)
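For orientation, here is a rough sketch of what the generalized txpool interface looks like. The doc comments and the RemoveBatch/IsQueued signatures are taken from the review snippets further down in this conversation; the Add and GetBatch signatures, the Size return type, and the import path are assumptions, not the exact API added by this PR.

```go
// txpool_sketch.go: a rough sketch of the generalized transaction pool
// interface. RemoveBatch/IsQueued and the doc comments come from the review
// snippets below; the other signatures and the import path are assumptions.
package txpool

import "github.com/oasisprotocol/oasis-core/go/common/crypto/hash"

// TxPool is the interface implemented by the incoming_queue, map and
// ordered_map transaction pool backends.
type TxPool interface {
	// Name is the transaction pool implementation name.
	Name() string

	// Add adds a single transaction into the transaction pool.
	Add(tx []byte) error

	// GetBatch returns a batch of transactions ready for dispatch.
	GetBatch() [][]byte

	// RemoveBatch removes a batch from the transaction pool.
	RemoveBatch(batch [][]byte) error

	// IsQueued returns whether a transaction is in the queue already.
	IsQueued(txHash hash.Hash) bool

	// Size returns the number of transactions in the transaction pool.
	Size() uint64
}
```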

Benchmark results:

Max Batch size 100, 10000 txs.

// Instant dispatch, this directly compares queue implementations.
BenchmarkSimpleSchedulerQueue/BenchmarkNoOpDispatch-8          76      299131505 ns/op   173677836 B/op      959513 allocs/op
BenchmarkSimpleSchedulerMap/BenchmarkNoOpDispatch-8          1278       19598526 ns/op     7015840 B/op       41988 allocs/op
BenchmarkSimpleSchedulerOrderedMap/BenchmarkNoOpDispatch-8    661       37891445 ns/op     8223897 B/op       62877 allocs/op

// Dispatch after some delay. This shows the overall impact of the queue implementation
// on the scheduler performance, in a more "real-world" scenario, where dispatch is not instant.
BenchmarkSimpleSchedulerQueue/BenchmarkDelayDispatch-8           4    5405653214 ns/op   188766716 B/op     1039317 allocs/op
BenchmarkSimpleSchedulerMap/BenchmarkDelayDispatch-8             5    5247941659 ns/op     6997904 B/op       41920 allocs/op
BenchmarkSimpleSchedulerOrderedMap/BenchmarkDelayDispatch-8      4    5304495489 ns/op     8130016 B/op       62004 allocs/op

Max Batch size 1000, 10000 txs.
BenchmarkSimpleSchedulerQueue/BenchmarkNoOpDispatch-8          534      45646138 ns/op    21148272 B/op      123215 allocs/op
BenchmarkSimpleSchedulerMap/BenchmarkNoOpDispatch-8           1274      20124504 ns/op     6871086 B/op       41866 allocs/op
BenchmarkSimpleSchedulerOrderedMap/BenchmarkNoOpDispatch-8    1101      23204846 ns/op     8028711 B/op       62228 allocs/op

BenchmarkSimpleSchedulerQueue/BenchmarkDelayDispatch-8         52      552231109 ns/op    26307832 B/op      149203 allocs/op
BenchmarkSimpleSchedulerMap/BenchmarkDelayDispatch-8           48      497977268 ns/op     6734791 B/op       40491 allocs/op
BenchmarkSimpleSchedulerOrderedMap/BenchmarkDelayDispatch-8    81      512892692 ns/op     7968430 B/op       61632 allocs/op
 
* "DelayDispatcher" delays the dispatch by random interval (uniformly distributed `0-0.1s`)
* not the most scientific benchmarks, ran on my laptop - but I did make sure to close Chrome.
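For reference, a minimal, self-contained sketch of the benchmark shape: drain a pool of 10000 transactions in batches of at most 100, once with an instant (no-op) dispatcher and once with a dispatcher that sleeps a uniformly distributed 0-0.1s per batch. The slicePool type and all names here are stand-ins, not the benchmark code from this PR, and the real benchmarks submit transactions concurrently with scheduling, while this sketch fills the pool up front for brevity.

```go
// benchmark_sketch_test.go: a stand-in for the PR's benchmarks, not the
// actual code. It drains a trivial slice-backed pool in batches, with either
// an instant or a randomly delayed dispatcher.
package scheduling

import (
	"fmt"
	"math/rand"
	"testing"
	"time"
)

// slicePool is a trivial stand-in transaction pool that keeps submission order.
type slicePool struct{ txs [][]byte }

func (p *slicePool) Add(tx []byte) { p.txs = append(p.txs, tx) }
func (p *slicePool) Size() int     { return len(p.txs) }

// GetBatch returns (without removing) up to max transactions in order.
func (p *slicePool) GetBatch(max int) [][]byte {
	if max > len(p.txs) {
		max = len(p.txs)
	}
	return p.txs[:max]
}

// RemoveBatch drops the first n transactions from the pool.
func (p *slicePool) RemoveBatch(n int) { p.txs = p.txs[n:] }

// benchmarkScheduler submits numTxs transactions and dispatches them in
// batches of at most maxBatchSize using the given dispatch function.
func benchmarkScheduler(b *testing.B, maxBatchSize, numTxs int, dispatch func([][]byte)) {
	for i := 0; i < b.N; i++ {
		pool := &slicePool{}
		for t := 0; t < numTxs; t++ {
			pool.Add([]byte(fmt.Sprintf("tx-%d", t)))
		}
		for pool.Size() > 0 {
			batch := pool.GetBatch(maxBatchSize)
			dispatch(batch)
			pool.RemoveBatch(len(batch))
		}
	}
}

// Instant dispatch: isolates the cost of the pool implementation itself.
func BenchmarkNoOpDispatch(b *testing.B) {
	benchmarkScheduler(b, 100, 10000, func(batch [][]byte) {})
}

// Delayed dispatch: sleeps a uniformly distributed 0-0.1s per batch.
func BenchmarkDelayDispatch(b *testing.B) {
	benchmarkScheduler(b, 100, 10000, func(batch [][]byte) {
		time.Sleep(time.Duration(rand.Int63n(int64(100 * time.Millisecond))))
	})
}
```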

As expected, in a scenario where many transactions (relative to the batch size) are submitted concurrently, the current queue implementation performs an order of magnitude worse, because each removal, which happens after every dispatch, is an O(queueSize) operation. The map-backed tx-pool serves as a baseline, as it does not maintain transaction order. The ordered-map implementation's performance is comparable to that baseline while also maintaining transaction order (a rough sketch of the ordered-map idea is included below).
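The general idea behind ordered_map, a hash map for O(1) membership checks and removals combined with a linked list that preserves submission order, can be sketched roughly as follows. This is an approximation under assumed names and a stand-in hash function; it is not the ordered_map implementation added by this PR.

```go
// ordered_map_sketch.go: a rough approximation of the ordered-map idea, not
// the implementation from this PR. A map gives O(1) membership checks and
// removals; a linked list preserves transaction submission order.
package orderedmap

import (
	"container/list"
	"crypto/sha256"
)

type OrderedMap struct {
	order  *list.List                 // transactions in submission order
	lookup map[[32]byte]*list.Element // tx hash -> list element, for O(1) removal
}

func New() *OrderedMap {
	return &OrderedMap{
		order:  list.New(),
		lookup: make(map[[32]byte]*list.Element),
	}
}

// Add queues a transaction unless it is already present.
func (m *OrderedMap) Add(tx []byte) {
	h := sha256.Sum256(tx) // stand-in hash; the pool in this PR keys transactions by hash.Hash
	if _, ok := m.lookup[h]; ok {
		return
	}
	m.lookup[h] = m.order.PushBack(tx)
}

// GetBatch returns up to max transactions in submission order, without removing them.
func (m *OrderedMap) GetBatch(max int) [][]byte {
	batch := make([][]byte, 0, max)
	for e := m.order.Front(); e != nil && len(batch) < max; e = e.Next() {
		batch = append(batch, e.Value.([]byte))
	}
	return batch
}

// RemoveBatch removes each dispatched transaction in O(1), instead of the
// O(queueSize) scan the old incoming_queue needed after every dispatch.
func (m *OrderedMap) RemoveBatch(batch [][]byte) {
	for _, tx := range batch {
		h := sha256.Sum256(tx)
		if e, ok := m.lookup[h]; ok {
			m.order.Remove(e)
			delete(m.lookup, h)
		}
	}
}

// Size returns the number of queued transactions.
func (m *OrderedMap) Size() int { return m.order.Len() }
```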

The delay benchmarks, however, show that the overall impact of the queue implementation is much less drastic when dispatch is not instant.

Based on the benchmark results, I think it is reasonable to switch to the ordered-map implementation.

@ptrus ptrus force-pushed the ptrus/benchmark/runtime-scheduler branch 3 times, most recently from 7d21467 to 27045d9 Compare October 20, 2020 07:43

codecov bot commented Oct 20, 2020

Codecov Report

Merging #3434 into master will decrease coverage by 0.05%.
The diff coverage is 83.22%.


@@            Coverage Diff             @@
##           master    #3434      +/-   ##
==========================================
- Coverage   66.12%   66.07%   -0.06%     
==========================================
  Files         371      371              
  Lines       33368    33399      +31     
==========================================
+ Hits        22065    22067       +2     
- Misses       8059     8098      +39     
+ Partials     3244     3234      -10     
Impacted Files Coverage Δ
go/runtime/scheduling/simple/simple.go 78.68% <79.31%> (+0.11%) ⬆️
go/worker/compute/executor/committee/node.go 62.78% <80.00%> (+0.11%) ⬆️
...scheduling/simple/txpool/orderedmap/ordered_map.go 83.78% <83.78%> (ø)
go/runtime/scheduling/init.go 60.00% <100.00%> (ø)
go/worker/compute/executor/init.go 100.00% <100.00%> (ø)
go/worker/compute/executor/worker.go 85.71% <100.00%> (ø)
go/consensus/tendermint/abci/state/state.go 54.54% <0.00%> (-9.10%) ⬇️
go/consensus/tendermint/epochtime/epochtime.go 85.18% <0.00%> (-7.41%) ⬇️
go/staking/api/grpc.go 51.26% <0.00%> (-6.73%) ⬇️
go/consensus/tendermint/apps/staking/query.go 35.71% <0.00%> (-4.77%) ⬇️
... and 24 more

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bd24b60...63fbd09.

@ptrus ptrus marked this pull request as ready for review October 20, 2020 10:00

@kostko (Member) left a comment:
Nice work! I only have some minor nits.

> consider removing the incoming_queue & map txpool implementations before merging, as only the ordered queue will be used

Yeah, this makes sense.

Resolved (outdated) review comments on:
  • go/runtime/scheduling/init.go
  • go/runtime/scheduling/simple/simple.go (4 comments)
  • go/runtime/scheduling/simple/txpool/api/api.go (4 comments)
@ptrus ptrus force-pushed the ptrus/benchmark/runtime-scheduler branch 4 times, most recently from 90147e1 to a99f9c5 Compare October 21, 2020 08:51
// Name is the transaction pool implementation name.
Name() string

// Add adds a single transaction into the transaction.

Suggested change:
-// Add adds a single transaction into the transaction.
+// Add adds a single transaction into the transaction pool.

// RemoveBatch removes a batch from the transaction pool.
RemoveBatch(batch [][]byte) error

// IsQueued returns whether a call is in the queue already.

Suggested change:
-// IsQueued returns whether a call is in the queue already.
+// IsQueued returns whether a transaction is in the queue already.

// IsQueued returns whether a call is in the queue already.
IsQueued(txHash hash.Hash) bool

// Size returns size of the transaction pool.

Suggested change:
-// Size returns size of the transaction pool.
+// Size returns the number of transactions in the transaction pool.

@@ -565,8 +565,8 @@ func (n *Node) HandleNewBlockLocked(blk *block.Block) {
}
}

// checkTx checks the given call in the node's runtime.
func (n *Node) checkTx(ctx context.Context, call []byte) error {
// checkTx checks the given tx in the node's runtime.

Suggested change:
-// checkTx checks the given tx in the node's runtime.
+// checkTx requests the runtime to check the validity of the given transaction.

@ptrus ptrus force-pushed the ptrus/benchmark/runtime-scheduler branch from a99f9c5 to 63fbd09 Compare October 21, 2020 09:45
@ptrus ptrus merged commit 127997d into master Oct 21, 2020
@ptrus ptrus deleted the ptrus/benchmark/runtime-scheduler branch October 21, 2020 10:09
Linked issue: Benchmark runtime transaction queue implementation (#3358)