MCOL-4590 UNION Performance Improvement with the focus on the normalize functions. #2528
…lize functions. This patch improves the runtime performance of UNION processing in ColumnStore (CS), as reported in JIRA issue MCOL-4590. The idea of the optimization is to infer the separate normalize functions beforehand and apply them individually later, instead of running every value through one huge switch body that covers all normalizations. This patch also covers engineering optimizations that remove hotspots in UNION processing. After applying this patch, the normalize part takes only about 25% of the whole UNION query runtime in the average case of our experiments.
Signed-off-by: Jigao Luo <luojigao@outlook.com>
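The idea described in the commit message can be illustrated with a minimal C++17 sketch. This is not the code from the patch: the type names (ColType, Value, Row), the helper functions, and the assumption that one function is chosen per column are all hypothetical and simplified. It only shows the general technique of resolving the conversion once, before any rows are processed, so that the per-row loop makes a plain function call instead of re-evaluating a large switch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified column types; not ColumnStore's real type system.
enum class ColType { Int, Double, Text };

using Value = std::variant<std::int64_t, double, std::string>;
using Row = std::vector<Value>;

// One small normalize function per (input type, output type) pair.
using NormalizeFn = Value (*)(const Value&);

Value keep(const Value& v) { return v; }
Value intToDouble(const Value& v) { return static_cast<double>(std::get<std::int64_t>(v)); }
Value intToText(const Value& v) { return std::to_string(std::get<std::int64_t>(v)); }
Value doubleToText(const Value& v) { return std::to_string(std::get<double>(v)); }

// The type dispatch happens here, once per column, not once per value.
// Returns nullptr for conversions this sketch does not support.
NormalizeFn inferNormalizeFn(ColType in, ColType out) {
  if (in == out) return keep;
  if (in == ColType::Int && out == ColType::Double) return intToDouble;
  if (in == ColType::Int && out == ColType::Text) return intToText;
  if (in == ColType::Double && out == ColType::Text) return doubleToText;
  return nullptr;
}

// Build the list of normalize functions for one UNION input.
std::vector<NormalizeFn> buildPlan(const std::vector<ColType>& in,
                                   const std::vector<ColType>& out) {
  assert(in.size() == out.size());
  std::vector<NormalizeFn> plan;
  plan.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    plan.push_back(inferNormalizeFn(in[i], out[i]));
  }
  return plan;
}

// Hot loop: one indirect call per column, no type switch.
void normalizeRow(const std::vector<NormalizeFn>& plan, Row& row) {
  assert(plan.size() == row.size());
  for (std::size_t i = 0; i < row.size(); ++i) {
    assert(plan[i] != nullptr);
    row[i] = plan[i](row[i]);
  }
}
```

A caller would build the plan once per UNION input with buildPlan and then call normalizeRow for every row of that input. The cost of deciding which conversion applies is paid once per column rather than once per value.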
Performance Testing of This PR

Experiment Environment
The experiments are run on the following hardware configuration:
Dataset
The benchmark dataset is provided by the community: https://github.com/mariadb-corporation/mariadb-columnstore-samples/

Schema
Here are the details of the table flights:

Benchmark Query Q1
The following query Q1 is the benchmark query in our experiments and the query to be optimized.

Benchmark Query Q2
The following Query Q2 has no

Benchmark Tool
The benchmark tool and script are provided by the community: https://github.com/drrtuy/cs-docker-tools
Here is how I run Q1:

Q2 Performance
The average runtime of Q2 is 632.42ms.

Q1 Performance Without This PR
I benchmarked with this commit https://github.com/mariadb-corporation/mariadb-columnstore-engine/commits/develop, which is the latest commit on develop and the one this PR is based on.
The average runtime of Q1 without the optimization of this PR is 3229.35ms.

Q1 Performance With This PR
The average runtime of Q1 with the optimization of this PR is 1312.31ms.

Summary
Q2 AVG Runtime: 0.63s
The Runtime Slowdown Ratio is the runtime of Q1 divided by the runtime of Q2. Without this PR it is ~5x, which means Q1 takes more than 5 times as long as Q2. Ideally, the Runtime Slowdown Ratio should be close to 2. The current
With this patch applied, the runtime of Q1 drops to 1.31s, resulting in a Runtime Slowdown Ratio of 2.07, which is very close to the ideal ratio. Moreover, the theoretical minimum is 2, so the ratio cannot be optimized below 2.
In summary, I have optimized the UNION processing in ColumnStore. The performance improvement is satisfying and close to the theoretical limit.
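The ratios quoted in the summary can be rechecked from the three reported averages. The short C++ sketch below only redoes that arithmetic; the constant and function names are made up for illustration, and the runtimes are the ones stated in this description.

```cpp
#include <cstdio>

// Average runtimes in milliseconds, as reported in this description.
constexpr double kQ2Ms = 632.42;        // Q2
constexpr double kQ1BeforeMs = 3229.35; // Q1 without this PR
constexpr double kQ1AfterMs = 1312.31;  // Q1 with this PR

// Runtime Slowdown Ratio: runtime of Q1 divided by runtime of Q2.
constexpr double slowdownRatio(double q1Ms, double q2Ms) {
  return q1Ms / q2Ms;
}

// Prints both ratios and the resulting Q1 speedup.
void printSummary() {
  std::printf("ratio without PR: %.3f\n", slowdownRatio(kQ1BeforeMs, kQ2Ms));
  std::printf("ratio with PR:    %.3f\n", slowdownRatio(kQ1AfterMs, kQ2Ms));
  std::printf("Q1 speedup:       %.3f\n", kQ1BeforeMs / kQ1AfterMs);
}
```

The first ratio comes out slightly above 5 and the second slightly above 2, in line with the figures quoted in the summary.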
Hello,
Hi @Hinal-Srivastava,
The Jira issue number for this PR is: MCOL-4590
NOTE: This project is for the Google Summer of Code 2022.
Task