Benchmark(local) result of SQLParserEngine(MySQL) #8208
Comments
Thank you for the great performance test report.
Yes, I just tested the worst case. SkyWalking, as an APM, faces countless strange performance issues; we are trying to avoid being the next one.
@wu-sheng Now the cache in the parser engine is no longer static, and there is a new API that lets users customize its initial size. So creating multiple parser engines will lead to a linear improvement in performance.
I think this issue is outdated now; the performance has kept improving with each release.
Hi @tristaZero, as promised, I ran some tests on the performance of the SQL parser. I chose the MySQL grammar at random.
Here are the tested SQLs, along with the code I ran in the tests.
Single Thread Case W/O Cache
The length of the SQL affects performance quite a lot: throughput drops from 200k to 34k ops/s as the SQLs grow more complex, with about ±10% fluctuation.
Active Cache With 1k/100k possible SQLs
I simply added a variable to provide 1k different SQL samples to warm up the cache. The performance increases clearly: for the most complex SQL case, throughput rises from 34k to 1.6m ops/s, nearly 47x.
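To make the warm-up step concrete, here is a minimal sketch of an LRU-style cache built on `LinkedHashMap` in access order, pre-filled with distinct SQL samples. This is an illustration only: the parser engine's real cache type, capacity, and warm-up hook are assumptions, and `"parsed-ast-..."` is a stand-in for the actual parse result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheWarmUpSketch {

    // Access-ordered LinkedHashMap that evicts the least recently used entry
    // once it grows past the given capacity.
    public static <K, V> Map<K, V> newLruCache(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> cache = newLruCache(1024);
        // Warm up: pre-parse 1k distinct statements so later lookups hit the cache.
        for (int i = 0; i < 1000; i++) {
            String sql = "SELECT * FROM t_order WHERE order_id = " + i;
            cache.put(sql, "parsed-ast-" + i); // stand-in for the real parse result
        }
        System.out.println(cache.size()); // all 1000 samples fit under the capacity
    }
}
```

With the working set smaller than the capacity, every sample survives warm-up, which is why the 1k-sample case sees near-perfect hit rates.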
Then I changed the pool of possible SQLs to 100k, so the LRU cache is no longer very effective, and something interesting happened.
When the LRU cache doesn't work efficiently, it is worse than no cache at all: throughput drops from 34k to 18k, nearly -50%.
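The thrashing effect can be reproduced with a small simulation, assuming uniform random access over the SQL pool (the real cache capacity and access pattern are assumptions here): a 1k-entry LRU over 100k distinct SQLs hits almost never, so every request pays the parse cost plus the cache maintenance overhead on top, which matches the "worse than no cache" result.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public class LruThrashDemo {

    public static double hitRate(final int capacity, int distinctSqls, int lookups) {
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > capacity;
            }
        };
        Random random = new Random(42); // fixed seed for repeatability
        int hits = 0;
        for (int i = 0; i < lookups; i++) {
            Integer key = random.nextInt(distinctSqls);
            if (cache.get(key) != null) {
                hits++; // cache hit: no parse needed
            } else {
                cache.put(key, "parsed-ast"); // miss: parse, insert, possibly evict
            }
        }
        return (double) hits / lookups;
    }

    public static void main(String[] args) {
        // 1k-entry cache over 100k possible SQLs vs. over 1k possible SQLs.
        System.out.printf("thrashing hit rate: %.3f%n", hitRate(1000, 100_000, 50_000));
        System.out.printf("warm hit rate:      %.3f%n", hitRate(1000, 1_000, 50_000));
    }
}
```

The steady-state hit rate under uniform access is roughly capacity/pool, i.e. about 1% for the 100k case, versus ~98% for the 1k case.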
Concurrency Tests
I switched to the concurrency cases. First, with the LRU cache working (1k SQL samples) and 6 threads active:
168k -> 625k ops/s, about 4x faster with 6 threads (my local laptop is a 4-core Intel Core i7, MacBook Pro (15-inch, 2017)).
Then I made the LRU cache fail again (100k SQL samples), still with 6 threads active.
Only 18k -> 27k ops/s.
One More
I was curious about the performance without the LRU cache in the concurrent case, so I kept the 6 threads and turned the LRU cache off.
The most interesting thing happened: it reached 88k ops/s, compared with 27k with the LRU cache on but a large SQL set.
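For reference, the shape of the concurrent harness I used looks roughly like the sketch below. It is an approximation, not the exact benchmark code: `parseOnce` is a stand-in for the real call into the parser engine, and the thread count and iteration count are the knobs varied in the tests above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

public class ConcurrentParseBench {

    // Runs parseOnce on the given number of threads and returns the total
    // number of completed operations.
    public static long run(int threads, final int iterationsPerThread, Runnable parseOnce)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong ops = new AtomicLong();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < iterationsPerThread; i++) {
                    parseOnce.run();
                    ops.incrementAndGet();
                }
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return ops.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        long total = run(6, 100_000, () -> { /* call the SQL parser here */ });
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d ops, %.0f ops/s%n", total, total / seconds);
    }
}
```

If the shared LRU cache is guarded by a single lock, all 6 threads serialize on it, which would explain why the cache-off run (88k) beats the cache-on run with a large SQL set (27k).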
Conclusion
The concurrent performance makes me a little concerned (not a blocker) about using this in the trace analysis core. The throughput of one OAP backend node can easily reach 10k+ segments/s, each segment normally contains 5-10 SQL statements, and the analysis will certainly run these parses concurrently.
So the current performance looks like a bottleneck for this scenario.
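A quick back-of-the-envelope check of the numbers in that conclusion (the segment rate and SQL-per-segment figures are taken from the paragraph above; nothing else is measured):

```java
public class RequiredThroughput {
    public static void main(String[] args) {
        long segmentsPerSecond = 10_000;          // one OAP node, easily reached
        long minSqlPerSegment = 5;
        long maxSqlPerSegment = 10;

        long minRequired = segmentsPerSecond * minSqlPerSegment; // 50k parses/s
        long maxRequired = segmentsPerSecond * maxSqlPerSegment; // 100k parses/s
        long measuredWorst = 27_000;  // 6 threads, LRU on, 100k SQL set

        System.out.printf("need %d-%d parses/s, measured %d%n",
                minRequired, maxRequired, measuredWorst);
    }
}
```

Even the best concurrent no-cache figure (88k ops/s) sits inside the required 50k-100k band, and the worst case (27k) is roughly 2-4x short of it.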
Could we have a deeper discussion about what we could do to improve this?