
Improve performance of TransactionLog.add() #15111

Merged

Conversation

@gdela
Contributor

gdela commented May 30, 2019

Improve performance of TransactionLog.add() by avoiding LinkedList.remove() call.

Code may be considered cleaner and simpler now, but the reason for this change is performance. When entries in a single transaction were changed and then changed again (TransactionalMap.put() called on the same entry), the TransactionLog.add() method performed a LinkedList.remove() call, which has O(n) complexity. The new implementation, which uses a LinkedHashMap instead of a LinkedList plus a HashMap, is up to 20% faster for a 1000-entry transaction and up to three times faster for a 10000-entry transaction. On small transactions the performance is the same.
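
For illustration, here is a minimal sketch of the idea (my simplification, not the actual Hazelcast class; the Record interface stands in for TransactionLogRecord): a single LinkedHashMap keyed by the record key replaces the LinkedList plus HashMap pair, so re-adding a record for an already changed entry is an O(1) put() instead of an O(n) LinkedList.remove(), and iteration for commit/rollback still happens in insertion order.

import java.util.LinkedHashMap;
import java.util.Map;

final class TransactionLogSketch {

    // Simplified stand-in for Hazelcast's TransactionLogRecord.
    interface Record {
        Object getKey();
    }

    // Insertion-ordered map: put() on an existing key replaces the value in O(1)
    // without changing its position in the iteration order.
    private final Map<Object, Record> recordMap = new LinkedHashMap<>();

    void add(Record record) {
        recordMap.put(record.getKey(), record);
    }

    Iterable<Record> records() {
        return recordMap.values();
    }
}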

In this gist you can find the benchmark that I used to make sure that this change improves performance. Here are the results:

Original Implementation:

MyTransactionBenchmark._10000_baseline  avgt    6   287,601 ±   5,284  ms/op
MyTransactionBenchmark._10000_backward  avgt    6   901,295 ±   9,973  ms/op
MyTransactionBenchmark._10000_forward   avgt    6   307,499 ±  20,451  ms/op
MyTransactionBenchmark._10000_random    avgt    6  1082,282 ± 164,960  ms/op
MyTransactionBenchmark._1000_baseline   avgt    6    28,056 ±   1,384  ms/op
MyTransactionBenchmark._1000_backward   avgt    6    35,975 ±   0,623  ms/op
MyTransactionBenchmark._1000_forward    avgt    6    30,154 ±   1,075  ms/op
MyTransactionBenchmark._1000_random     avgt    6    36,131 ±   1,472  ms/op
MyTransactionBenchmark._100_baseline    avgt    6     2,839 ±   0,078  ms/op
MyTransactionBenchmark._100_backward    avgt    6     2,984 ±   0,042  ms/op
MyTransactionBenchmark._100_forward     avgt    6     2,990 ±   0,155  ms/op
MyTransactionBenchmark._100_random      avgt    6     3,051 ±   0,112  ms/op
MyTransactionBenchmark._10_baseline     avgt    6     0,320 ±   0,010  ms/op
MyTransactionBenchmark._10_backward     avgt    6     0,348 ±   0,006  ms/op
MyTransactionBenchmark._10_forward      avgt    6     0,344 ±   0,011  ms/op
MyTransactionBenchmark._10_random       avgt    6     0,346 ±   0,010  ms/op

After Change:

MyTransactionBenchmark._10000_baseline  avgt    6  294,238 ± 13,064  ms/op
MyTransactionBenchmark._10000_backward  avgt    6  305,480 ± 15,631  ms/op
MyTransactionBenchmark._10000_forward   avgt    6  300,084 ±  5,390  ms/op
MyTransactionBenchmark._10000_random    avgt    6  301,633 ±  8,136  ms/op
MyTransactionBenchmark._1000_baseline   avgt    6   28,368 ±  1,159  ms/op
MyTransactionBenchmark._1000_backward   avgt    6   30,114 ±  1,072  ms/op
MyTransactionBenchmark._1000_forward    avgt    6   29,761 ±  0,653  ms/op
MyTransactionBenchmark._1000_random     avgt    6   29,659 ±  0,729  ms/op
MyTransactionBenchmark._100_baseline    avgt    6    2,802 ±  0,033  ms/op
MyTransactionBenchmark._100_backward    avgt    6    2,921 ±  0,097  ms/op
MyTransactionBenchmark._100_forward     avgt    6    2,927 ±  0,045  ms/op
MyTransactionBenchmark._100_random      avgt    6    3,059 ±  0,503  ms/op
MyTransactionBenchmark._10_baseline     avgt    6    0,317 ±  0,007  ms/op
MyTransactionBenchmark._10_backward     avgt    6    0,339 ±  0,004  ms/op
MyTransactionBenchmark._10_forward      avgt    6    0,344 ±  0,025  ms/op
MyTransactionBenchmark._10_random       avgt    6    0,332 ±  0,002  ms/op

As you can see, when there are 1000 or 10000 entries in a transaction, the performance after the change is better and more stable.
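
For context, the sketch below shows a JMH scenario along the lines of what such a benchmark can look like against the Hazelcast 3.x API. It is my own reconstruction, not the gist's code: the map name, the sizes, and the assumption that "backward" means re-putting the same keys in reverse order are all mine.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.TransactionalMap;
import com.hazelcast.transaction.TransactionContext;
import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class TransactionPutBenchmarkSketch {

    @Param({"10", "100", "1000", "10000"})
    int size;

    HazelcastInstance hz;

    @Setup
    public void setup() {
        hz = Hazelcast.newHazelcastInstance();
    }

    @TearDown
    public void tearDown() {
        hz.shutdown();
    }

    @Benchmark
    public void backward() {
        TransactionContext ctx = hz.newTransactionContext();
        ctx.beginTransaction();
        TransactionalMap<Integer, Integer> map = ctx.getMap("map");
        for (int i = 0; i < size; i++) {
            map.put(i, i);         // first put: a new TransactionLogRecord is added
        }
        for (int i = size - 1; i >= 0; i--) {
            map.put(i, i + 1);     // second put in reverse order: the existing record is replaced
        }
        ctx.commitTransaction();
    }
}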

@devOpsHazelcast

Contributor

devOpsHazelcast commented May 30, 2019

Can one of the admins verify this patch?

@@ -38,53 +37,40 @@
  * the TransactionLog will contains 4 {@link TransactionLogRecord} instances.
  *
  * planned optimization:
- * Most transaction will be small, but an linkedlist + hashmap is created. Instead use an array and do a
+ * Most transaction will be small, but an linkedhashmap is created. Instead use an array and do a
  * linear search in that array. When there are too many items added, then enable the hashmap.

@gdela

gdela May 30, 2019

Author Contributor

Out of curiosity I've implemented the TransactionLog using a plain ArrayList and linear search. My benchmark showed that there is actually no performance benefit in such an implementation, even for small transactions:

LinkedHashMap-based implementation:

MyTransactionBenchmark._10_baseline     avgt    6    0,317 ±  0,007  ms/op
MyTransactionBenchmark._10_backward     avgt    6    0,339 ±  0,004  ms/op
MyTransactionBenchmark._10_forward      avgt    6    0,344 ±  0,025  ms/op
MyTransactionBenchmark._10_random       avgt    6    0,332 ±  0,002  ms/op

ArrayList-based implementation:

MyTransactionBenchmark._10_baseline    avgt    6   0,315 ± 0,007  ms/op
MyTransactionBenchmark._10_backward    avgt    6   0,341 ± 0,009  ms/op
MyTransactionBenchmark._10_forward     avgt    6   0,328 ± 0,006  ms/op
MyTransactionBenchmark._10_random      avgt    6   0,336 ± 0,005  ms/op

Though maybe my ArrayList-based implementation was too naive.
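
For reference, here is a minimal sketch of what such an ArrayList-plus-linear-search variant could look like (my reconstruction of the idea, not the code that produced the numbers above):

import java.util.ArrayList;
import java.util.List;

final class ArrayBasedTransactionLogSketch {

    // Simplified stand-in for Hazelcast's TransactionLogRecord.
    interface Record {
        Object getKey();
    }

    private final List<Record> records = new ArrayList<>();

    void add(Record record) {
        // Linear search: cheap while the transaction is small, but O(n) per add()
        // once the same keys are put repeatedly in a large transaction.
        for (int i = 0; i < records.size(); i++) {
            if (records.get(i).getKey().equals(record.getKey())) {
                records.set(i, record);   // replace in place, keeping the original position
                return;
            }
        }
        records.add(record);
    }
}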

@TomaszGaweda


TomaszGaweda commented May 30, 2019

FYI, this patch has already been battle-tested in many of our environments. We haven't seen any problems.

@gdela

Contributor Author

gdela commented Jul 10, 2019

Any chance of merging this rather simple pull request into Hazelcast 3.12.x? The performance difference is even more evident for transactions in which 100000 entries are modified. And yes, we do have such large transactions in our project that uses Hazelcast.

Original implementation:

MyTransactionBenchmark._100000_baseline  avgt    6    3496,810 ±   569,101  ms/op
MyTransactionBenchmark._100000_backward  avgt    6  103513,763 ± 51709,247  ms/op
MyTransactionBenchmark._100000_forward   avgt    6    3665,922 ±   381,869  ms/op
MyTransactionBenchmark._100000_random    avgt    6  168188,393 ± 47363,600  ms/op
MyTransactionBenchmark._10000_random     avgt    6    1157,420 ±    42,528  ms/op

After Change:

MyTransactionBenchmark._100000_baseline  avgt    6  3383,798 ± 977,911  ms/op
MyTransactionBenchmark._100000_backward  avgt    6  3778,254 ± 412,208  ms/op
MyTransactionBenchmark._100000_forward   avgt    6  3718,097 ± 215,745  ms/op
MyTransactionBenchmark._100000_random    avgt    6  3727,629 ± 758,245  ms/op
@mmedenjak

Contributor

mmedenjak commented Aug 7, 2019

run-lab-run

Contributor

mmedenjak left a comment

Looks great, thank you for this improvement. I've added some minor comments that you can address or not, as you wish. I think we may backport this to 3.12.3 since it looks backwards compatible. So please cherry-pick this commit and send one more PR for the maintenance-3.x branch.

gdela added a commit to gdela/hazelcast that referenced this pull request Sep 17, 2019
mmedenjak merged commit 71da564 into hazelcast:master Sep 18, 2019
@mmedenjak

Contributor

mmedenjak commented Sep 18, 2019

Merged. Thank you again for the PR, @gdela, and for the review, @petrpleshachkov!
