MCOL-4590 UNION Performance Improvement with the focus on the normalize functions. #2528

Merged
merged 1 commit into from Sep 14, 2022

Conversation

JigaoLuo
Contributor

@JigaoLuo JigaoLuo commented Aug 24, 2022

The Jira issue number for this PR is: MCOL-4590

NOTE: This project is for the Google Summer of Code 2022.

Task

  • Simple Approach: use separate functions instead of the huge switch
  • Code Review and correction in simple approach

@tntnatbry tntnatbry self-requested a review August 24, 2022 16:47
@JigaoLuo JigaoLuo force-pushed the MCOL-4590 branch 2 times, most recently from d481ac3 to 523df4a Compare September 5, 2022 14:54
@JigaoLuo JigaoLuo closed this Sep 5, 2022
@JigaoLuo JigaoLuo deleted the MCOL-4590 branch September 5, 2022 14:57
@JigaoLuo JigaoLuo reopened this Sep 5, 2022
@JigaoLuo JigaoLuo force-pushed the MCOL-4590 branch 2 times, most recently from eee769b to 27e14ec Compare September 5, 2022 15:17
@JigaoLuo JigaoLuo marked this pull request as ready for review September 5, 2022 15:26
@JigaoLuo JigaoLuo changed the title [WIP] MCOL-4590 UNION Performance Improvement [WIP] MCOL-4590 UNION Performance Improvement with the focus on the normalize functions. Sep 7, 2022
@tntnatbry tntnatbry changed the title [WIP] MCOL-4590 UNION Performance Improvement with the focus on the normalize functions. MCOL-4590 UNION Performance Improvement with the focus on the normalize functions. Sep 8, 2022
…lize functions.

This patch improves the runtime performance of UNION processing in ColumnStore (CS), as reported in JIRA issue MCOL-4590. The idea of the optimization is to infer the separate normalize functions beforehand and then perform the normalization with those individual functions, instead of running through one huge switch body that covers all normalization cases. This patch also covers engineering optimizations that remove hotspots in UNION processing. With this patch applied, the normalize part takes only about 25% of the whole UNION query in the average case of our experiments.

Signed-off-by: Jigao Luo <luojigao@outlook.com>
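
The approach described in the commit message can be illustrated with a small sketch. This is not the code in tupleunion.cpp: the types `ColType`, `Value` and `Row` and the functions `pickNormalizer`, `inferNormalizers` and `normalizeRow` are hypothetical names made up for illustration, and the sketch assumes one normalize function is chosen per column. It only shows the general technique of moving the type switch out of the per-row loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified stand-ins for the engine's column and row types.
enum class ColType { Int, Double, Text };
using Value = std::variant<int64_t, double, std::string>;
using Row = std::vector<Value>;

// One small function per conversion instead of one case per conversion.
using NormalizeFn = void (*)(const Value& in, Value& out);

void copyValue(const Value& in, Value& out) { out = in; }
void intToDouble(const Value& in, Value& out) { out = static_cast<double>(std::get<int64_t>(in)); }
void intToText(const Value& in, Value& out) { out = std::to_string(std::get<int64_t>(in)); }

// The type dispatch lives here and runs once per column, not once per value.
NormalizeFn pickNormalizer(ColType in, ColType out)
{
  if (in == out)
    return copyValue;
  if (in == ColType::Int && out == ColType::Double)
    return intToDouble;
  if (in == ColType::Int && out == ColType::Text)
    return intToText;
  throw std::invalid_argument("conversion not covered by this sketch");
}

// Inferred beforehand: one function pointer per output column.
std::vector<NormalizeFn> inferNormalizers(const std::vector<ColType>& in, const std::vector<ColType>& out)
{
  if (in.size() != out.size())
    throw std::invalid_argument("column count mismatch");
  std::vector<NormalizeFn> fns;
  fns.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    fns.push_back(pickNormalizer(in[i], out[i]));
  return fns;
}

// Hot loop: no switch on column types, only calls through the prepared pointers.
Row normalizeRow(const Row& in, const std::vector<NormalizeFn>& fns)
{
  Row out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    fns[i](in[i], out[i]);
  return out;
}
```

In this sketch `inferNormalizers` would be called once for each UNION input, and `normalizeRow` once per row, so the cost of deciding which conversion applies is paid per column rather than per value.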
@JigaoLuo
Contributor Author

JigaoLuo commented Sep 10, 2022

Performance Testing of This PR

Experiment Environment

The experiments are run on the following hardware configuration:

  • AWS Instance type: c5.4xlarge
  • Debian 11
  • 30GiB EBS SSD

Dataset

The benchmark dataset is provided by the community: https://github.com/mariadb-corporation/mariadb-columnstore-samples/
This work focuses on the flights table, which has 32 columns and over 38M tuples.

Schema

Details of the flights table:

MariaDB [bts]> describe flights;
+---------------------+-------------+------+-----+---------+-------+
| Field               | Type        | Null | Key | Default | Extra |
+---------------------+-------------+------+-----+---------+-------+
| year                | smallint(6) | YES  |     | NULL    |       |
| month               | tinyint(4)  | YES  |     | NULL    |       |
| day                 | tinyint(4)  | YES  |     | NULL    |       |
| day_of_week         | tinyint(4)  | YES  |     | NULL    |       |
| fl_date             | date        | YES  |     | NULL    |       |
| carrier             | varchar(2)  | YES  |     | NULL    |       |
| tail_num            | varchar(6)  | YES  |     | NULL    |       |
| fl_num              | smallint(6) | YES  |     | NULL    |       |
| origin              | varchar(5)  | YES  |     | NULL    |       |
| dest                | varchar(5)  | YES  |     | NULL    |       |
| crs_dep_time        | varchar(4)  | YES  |     | NULL    |       |
| dep_time            | varchar(4)  | YES  |     | NULL    |       |
| dep_delay           | smallint(6) | YES  |     | NULL    |       |
| taxi_out            | smallint(6) | YES  |     | NULL    |       |
| wheels_off          | varchar(4)  | YES  |     | NULL    |       |
| wheels_on           | varchar(4)  | YES  |     | NULL    |       |
| taxi_in             | smallint(6) | YES  |     | NULL    |       |
| crs_arr_time        | varchar(4)  | YES  |     | NULL    |       |
| arr_time            | varchar(4)  | YES  |     | NULL    |       |
| arr_delay           | smallint(6) | YES  |     | NULL    |       |
| cancelled           | smallint(6) | YES  |     | NULL    |       |
| cancellation_code   | smallint(6) | YES  |     | NULL    |       |
| diverted            | smallint(6) | YES  |     | NULL    |       |
| crs_elapsed_time    | smallint(6) | YES  |     | NULL    |       |
| actual_elapsed_time | smallint(6) | YES  |     | NULL    |       |
| air_time            | smallint(6) | YES  |     | NULL    |       |
| distance            | smallint(6) | YES  |     | NULL    |       |
| carrier_delay       | smallint(6) | YES  |     | NULL    |       |
| weather_delay       | smallint(6) | YES  |     | NULL    |       |
| nas_delay           | smallint(6) | YES  |     | NULL    |       |
| security_delay      | smallint(6) | YES  |     | NULL    |       |
| late_aircraft_delay | smallint(6) | YES  |     | NULL    |       |
+---------------------+-------------+------+-----+---------+-------+
32 rows in set (0.000 sec)

MariaDB [bts]> select count(*) from INFORMATION_SCHEMA.COLUMNS where table_name = "flights";
+----------+
| count(*) |
+----------+
|       32 |
+----------+
1 row in set (0.003 sec)

MariaDB [bts]> select count(*) from flights;
+----------+
| count(*) |
+----------+
| 38083735 |
+----------+
1 row in set (0.166 sec)

Benchmark Query Q1

The following Query Q1 is the benchmark query in our experiments and the query to be optimized.

MariaDB [bts]> select count(*) from (select * from flights union all select * from flights) as Q1;
+----------+
| count(*) |
+----------+
| 76167470 |
+----------+
1 row in set (3.011 sec)

Benchmark Query Q2

The following Query Q2 has no UNION statement. Ideally, the runtime of Q1 should be close to 2x the runtime of Q2.

MariaDB [bts]> select count(*) from (select * from flights) as Q2;
+----------+
| count(*) |
+----------+
| 38083735 |
+----------+
1 row in set (1.049 sec)

Benchmark Tool

The benchmark tool & script are provided by the community: https://github.com/drrtuy/cs-docker-tools

This is how I run Q1: ~/cs-docker-tools/slapit$ sudo ./sysbench_wrapper.sh 1 100 slapunion.lua bts. For this run, slapunion.lua contains only Q1.
Q2 is benchmarked in the same way.

Q2 Performance

The average runtime of Q2 is 632.42ms.

Latency histogram (values are in milliseconds)
       value  ------------- distribution ------------- count
     623.335 |**************************************** 63
     634.661 |*************                            21
     646.192 |******                                   10
     657.933 |**                                       3
     694.452 |**                                       3
 
SQL statistics:
    queries performed:
        read:                            100
        write:                           0
        other:                           0
        total:                           100
    transactions:                        100    (1.58 per sec.)
    queries:                             100    (1.58 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          63.2444s
    total number of events:              100

Latency (ms):
         min:                                  623.54
         avg:                                  632.42
         max:                                  698.28
         95th percentile:                      657.93
         sum:                                63242.44

Threads fairness:
    events (avg/stddev):           100.0000/0.00
    execution time (avg/stddev):   63.2424/0.00

Q1 Performance Without This PR

The baseline is the last commit on develop at the time, https://github.com/mariadb-corporation/mariadb-columnstore-engine/commits/develop, which is the commit this PR is based on.

The average runtime of Q1 without the optimization of this PR is 3229.35ms.

Latency histogram (values are in milliseconds)
       value  ------------- distribution ------------- count
    2828.869 |***                                      1
    2880.269 |******                                   2
    2932.602 |***********                              4
    2985.887 |**************                           5
    3040.139 |*******************************          11
    3095.377 |***********************                  8
    3151.619 |***********************                  8
    3208.883 |*******************************          11
    3267.187 |*************************************    13
    3326.551 |**************************************** 14
    3386.993 |**************************               9
    3448.533 |*****************                        6
    3511.192 |*********                                3
    3574.989 |***                                      1
    3639.945 |*********                                3
    3706.081 |***                                      1
 
SQL statistics:
    queries performed:
        read:                            100
        write:                           0
        other:                           0
        total:                           100
    transactions:                        100    (0.31 per sec.)
    queries:                             100    (0.31 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          322.9365s
    total number of events:              100

Latency (ms):
         min:                                 2831.68
         avg:                                 3229.35
         max:                                 3707.17
         95th percentile:                     3511.19
         sum:                               322934.52

Threads fairness:
    events (avg/stddev):           100.0000/0.00
    execution time (avg/stddev):   322.9345/0.00

Q1 Performance With This PR

The average runtime of Q1 with the optimization of this PR is 1312.31ms.

Latency histogram (values are in milliseconds)
       value  ------------- distribution ------------- count
    1280.934 |*************************                30
    1304.208 |**************************************** 48
    1327.905 |*********                                11
    1352.033 |****                                     5
    1376.599 |**                                       2
    1401.611 |*                                        1
    1427.078 |*                                        1
    1453.007 |*                                        1
    1771.289 |*                                        1
 
SQL statistics:
    queries performed:
        read:                            100
        write:                           0
        other:                           0
        total:                           100
    transactions:                        100    (0.76 per sec.)
    queries:                             100    (0.76 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          131.2332s
    total number of events:              100

Latency (ms):
         min:                                 1285.18
         avg:                                 1312.31
         max:                                 1768.78
         95th percentile:                     1376.60
         sum:                               131231.23

Threads fairness:
    events (avg/stddev):           100.0000/0.00
    execution time (avg/stddev):   131.2312/0.00

Summary

Q2 AVG Runtime: 0.63s
Q1 AVG Runtime: 3.23s
Optimized Q1 AVG Runtime: 1.31s

The Runtime Slowdown Ratio of Q1 to Q2 is about 5.1 (3.23s / 0.63s), which means Q1 takes more than 5 times as long as Q2. Ideally, the Runtime Slowdown Ratio should be close to 2, so the UNION processing logic on the current develop branch carries a large overhead.

With this patch applied, the runtime of Q1 drops to 1.31s, resulting in a Runtime Slowdown Ratio of about 2.08 (1312.31ms / 632.42ms). This ratio is very close to the ideal ratio. Moreover, because Q1 reads the flights table twice while Q2 reads it once, the theoretical minimum is 2, which makes it impossible to optimize this ratio below 2.
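
For anyone who wants to recheck the ratios in this summary, here is a minimal sketch that recomputes them from the average latencies in the sysbench reports above. The constant and function names are made up for this example; only the three latency values come from the measurements.

```cpp
#include <cstdio>

// Average latencies in milliseconds, copied from the sysbench reports above.
constexpr double kQ2AvgMs = 632.42;          // Q2, no UNION
constexpr double kQ1BaselineAvgMs = 3229.35; // Q1 on develop, without this PR
constexpr double kQ1PatchedAvgMs = 1312.31;  // Q1 with this PR

// Plain quotient of two latencies; used for both slowdown and speedup.
constexpr double ratio(double numeratorMs, double denominatorMs)
{
  return numeratorMs / denominatorMs;
}

// Prints the Runtime Slowdown Ratio before and after the patch,
// plus the speedup of Q1 itself.
void printSummary()
{
  std::printf("Q1/Q2 slowdown without PR: %.2f\n", ratio(kQ1BaselineAvgMs, kQ2AvgMs));
  std::printf("Q1/Q2 slowdown with PR:    %.2f\n", ratio(kQ1PatchedAvgMs, kQ2AvgMs));
  std::printf("Q1 speedup from PR:        %.2f\n", ratio(kQ1BaselineAvgMs, kQ1PatchedAvgMs));
}
```

Calling `printSummary()` from a `main` function prints the three ratios. The last one, the speedup of Q1 itself, is not quoted in the summary but follows directly from the same numbers.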

In summary, I have optimized the UNION processing in ColumnStore. The performance improvement is satisfying and close to a theoretical limit.

@tntnatbry tntnatbry merged commit ef4c931 into mariadb-corporation:develop Sep 14, 2022
@Hinal-Srivastava

Hello,
I am interested in contributing to this project. Could you please point me to some resources to get started and guide me towards some first-timer tasks?
Thank you! Looking forward to hearing from you

@drrtuy
Collaborator

drrtuy commented Jan 29, 2023

Hi @Hinal-Srivastava,
I am glad to hear that. Could you get in touch with me in Zulip?
