Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use SnapTreeMap instead of ConcurrentSkipListMap #6719

Closed
wants to merge 1 commit into from

Conversation

dyf6372
Copy link
Contributor

@dyf6372 dyf6372 commented Dec 10, 2018

Using jprofiler, I found that the function in ConcurrentSkipListMap was cost more than 30% cpu in our case.

image

SnapTreeMap is a drop-in replacement for ConcurrentSkipListMap. Form the paper http://ppl.stanford.edu/papers/ppopp207-bronson.pdf:
Experimental evidence shows that our algorithm outperforms a highly tuned concurrent skip list for many access patterns, with an average of 39% higher singlethreaded throughput and 32% higher multi-threaded throughput over a range of contention levels and operation mixes.

After I changing to SnapTreeMap , it's 20%~30% faster than ConcurrentSkipListMap.

image

@QiuMM
Copy link
Member

QiuMM commented Dec 10, 2018

Maybe other places that use ConcurrentSkipListMap can be replaced too.

@leventov
Copy link
Member

@QiuMM interesting, thanks. FYI, this PR addresses the same problem: #6327

The project's README (https://github.com/nbronson/snaptree) states that the latest released version is 0.2.

@b-slim
Copy link
Contributor

b-slim commented Dec 10, 2018

While 35% is spent in the .get call is true but most of that time 34% is consumed by the Druid comparator code, so not sure how the snapTreeMap is going to solve this? can you present some results or benchmarks ?
Plus the last commit in this project is 6 years old and seems that some issues are not fixed https://github.com/nbronson/snaptree/issues @dyf6372 are you aware of other projects using this ?

@dyf6372
Copy link
Contributor Author

dyf6372 commented Dec 10, 2018

@leventov just like nbronson/snaptree#6 said, v0.2 is not deployed at maven central repo.
@b-slim Cassandra use SnapTreeMap before version 2.1 (https://github.com/apache/cassandra/blob/cassandra-2.0/build.xml). In version 2.1, it used a custom in-memory BTree to replace SnapTreeMap(https://www.datastax.com/dev/blog/cassandra-2-1-now-over-50-faster).

@b-slim
Copy link
Contributor

b-slim commented Dec 10, 2018

as per readme

VERSION 0.2:  Incorporates an important bugfix to isEmpty()

I do not think it is a good idea to use code that is missing important bugfixes for such critical
section in Druid.
And as you pointed out Cassandra is not using it anymore.

@dyf6372 seems to me the project is not alive and does not have maintainers thus no one to fix bugs.

@dyf6372
Copy link
Contributor Author

dyf6372 commented Dec 11, 2018

While 35% is spent in the .get call is true but most of that time 34% is consumed by the Druid comparator code, so not sure how the snapTreeMap is going to solve this? can you present some results or benchmarks ?

@b-slim I'm not sure, I think it may decrease the number of compare or optimize the concurrency. Since the project seems to be not alive, I agree that it's may not a good idea to use this code now. However, it's really necessary for me to improve the performance in index since we have more than 1200 peon task for one cluster and data growth is fast. I'll try it in my cluster and if it's really work without any mistake I'll give the detail result or benchmark in the future.

@b-slim
Copy link
Contributor

b-slim commented Dec 11, 2018

@dyf6372 Thanks for doing this.
BTW there is this ongoing effort #6327
Maybe worth taking a look and perhaps joint effort if it make sense.
Happy coding !

@stale
Copy link

stale bot commented Feb 28, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Feb 28, 2019
@stale
Copy link

stale bot commented Mar 7, 2019

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@stale stale bot closed this Mar 7, 2019
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants