Introduce parallel processing #39

matjazv · 2022-08-26T13:48:49Z

Description

Introduce parallel processing on places in the code where it is applicable and make sense from a code execution performance perspective. An average CPU nowadays has a multiple cores so it makes sense to transform CPU intensive calculations into parallel processing to gain code execution speed

a crate called Rayon might be used for parallel processing
among other features, Rayon has parallel iterators which may be used to transform code from sequential processing into parallel processing
example code how it's possible to make a code to run in parallel using Rayon parallel iterators:

// instead of writing code like this
let mut sum = 0;
for element in vec_array {
  sum += element*element
}

// it would be better to translate above code into this
vec_array.iter().map(|&el| el * el).sum()

// because the above code can be trivially converted to be executed in parallel with Rayon
vec_array.par_iter().map(|&el| el * el).sum()

When using parallelism in a code, shared resources between different threads need to be protected by not having simultaneous operations on them. If two different threads want to change a shared resource it might occur that this resource will be in invalid state if it won't be locked before any write occurs.

a crate called parking_lot might be used for locking a shared resources
where applicable RwLock may be used because multiple threads can read a shared resource at the same time and only one thread can lock a shared resource for writing purpose

Acceptance Criteria

where applicable code is running in a parallel
some tests before and after introducing parallelism are taken and measurements are documented

Additional Information

some functions in smt.rs file are CPU intensive with a lot of calculations
concentrate on those functions

The text was updated successfully, but these errors were encountered:

matjazv · 2022-10-10T07:48:15Z

Optimization description	Performance result	Included in code
Use `par_iter()` from `Rayon` crate to parallel iterations over vector elements	Decreased	No
Use parking_lot instead of `std`	There is no big difference between these two libs, so `sdt` could be a better option for us	No
Convert loop to use `iter` every where it is possible	Decreased	No
Refactor `SparseMerkleTree` to use thread	With current implementation and some restrictions there is no option to implement multithreading for `SparseMerkleTree`	No
Refactor `Subtree` to use thread	Decreased	No
Introduce `Hasher` as a new `struct` instead of function type	Small improvement	Yes
Do not calculate empty hash but use constant instead	Quite big improvement	Yes
Remove `.clone()` function calls and use references instead	Improved	Yes
Remove copying objects when passing them to a functions and use references instead	Improved	Yes
Use 4 as a subtree height in `SparseMerkleTree` module	Big improvement	Yes
Initialize `vector` with a fixed size if it's final size is known in advance	Small improvement	Yes
Refactor `bytes_to_bools` function to use function from `bitvec` crate instead of our own implementation	Improved	Yes

matjazv · 2022-10-13T08:18:54Z

Below table shows benchmarking results from before and after code optimizations.
Commit used for before optimizations: 1e5cb1e
Commit used for after optimizations: 7ac861a
Script used for benchmarking: state_long.js
System used: Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz, 8 GB RAM, 50 GB SSD

Original data	Addition	Time to set (before optimization)	Time to update (before optimization)	Time to set (after optimization)	Time to update (after optimization)
0.5M	1k	89 ms	648 ms	70 ms	206 ms
1M	1k	71 ms	771 ms	81 ms	247 ms
0.5M	5k	433 ms	1354 ms	539 ms	789 ms
1M	5k	510 ms	1922 ms	314 ms	925 ms
0.5M	10k	683 ms	2145 ms	681 ms	1295 ms
1M	10k	794 ms	3250 ms	721 ms	1616 ms

Optimize processing time - Closes #39

matjazv added the type: improvement label Aug 26, 2022

shuse2 mentioned this issue Aug 26, 2022

Optimize lisk-db for production #26

Closed

20 tasks

matjazv mentioned this issue Sep 20, 2022

Profile Rust code to discover slow parts #53

Closed

shuse2 added this to the Sprint 9 milestone Sep 26, 2022

shuse2 mentioned this issue Sep 26, 2022

Improve performance of SMT using multi-thread #5

Closed

matjazv self-assigned this Sep 30, 2022

shuse2 modified the milestones: Sprint 9, Sprint 10 Oct 10, 2022

matjazv mentioned this issue Oct 13, 2022

Optimize processing time - Closes #39 #69

Merged

shuse2 closed this as completed in #69 Oct 13, 2022

shuse2 added a commit that referenced this issue Oct 13, 2022

Merge pull request #69 from LiskHQ/39-introduce-parallel-processing

658d4f1

Optimize processing time - Closes #39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce parallel processing #39

Introduce parallel processing #39

matjazv commented Aug 26, 2022

matjazv commented Oct 10, 2022 •

edited

Loading

matjazv commented Oct 13, 2022

Introduce parallel processing #39

Introduce parallel processing #39

Comments

matjazv commented Aug 26, 2022

Description

Acceptance Criteria

Additional Information

matjazv commented Oct 10, 2022 • edited Loading

matjazv commented Oct 13, 2022

matjazv commented Oct 10, 2022 •

edited

Loading