Regression Test: TPC-DS/H20AI and other adjustments #2263

pdet · 2021-09-10T11:17:36Z

I've added TPC-DS to the regression tests (except queries 64, 72, and 85).
I've increased the number of repetitions from 3 to 5 (it should take 1 to 1:30 to run) in an attempt to reduce false negatives.

The output of a run looks something like the log below. It seems like a bit too much information to me, so I could adjust it to print only the queries that were slower, or only the faster and slower ones, or maybe cluster them into groups. In the end, I guess a simple Ctrl+F is enough to find the slower queries if the test breaks, but I'm happy to hear suggestions:

######## Status TPC-H Benchmark Regression #######
Q1 fast (0.69 vs 0.77). 
Q2 fast (0.05 vs 0.06). 
Q3 fast (0.19 vs 0.22). 
Q4 same (0.25 vs 0.25). 
Q5 same (0.18 vs 0.18). 
Q6 fast (0.08 vs 0.1). 
Q7 same (0.38 vs 0.41). 
Q8 same (0.19 vs 0.21). 
Q9 same (2.61 vs 2.64). 
Q10 slow (0.51 vs 0.44). 
Q11 same (0.02 vs 0.02). 
Q12 same (0.47 vs 0.43). 
Q13 slow (0.24 vs 0.21). 
Q14 slow (0.11 vs 0.1). 
Q15 same (0.18 vs 0.17). 
Q16 same (0.1 vs 0.09). 
Q17 slow (0.32 vs 0.29). 
Q18 same (0.62 vs 0.59). 
Q19 same (0.66 vs 0.66). 
Q20 same (0.21 vs 0.22). 
######## End TPC-H Benchmark Regression #######
######## Status TPC-DS Benchmark Regression #######
Q1 same (0.03 vs 0.03). 
Q2 same (0.29 vs 0.29). 
Q3 slow (0.02 vs 0.01). 
Q4 same (1.12 vs 1.13). 
Q5 same (0.25 vs 0.24). 
Q6 same (0.33 vs 0.31). 
Q7 same (0.22 vs 0.2). 
Q8 same (0.04 vs 0.04). 
Q9 same (0.34 vs 0.36). 
Q10 fast (0.23 vs 0.26). 
Q11 same (0.73 vs 0.72). 
Q12 same (0.02 vs 0.02). 
Q13 same (0.47 vs 0.46). 
Q14 same (5.25 vs 4.99). 
Q15 same (0.15 vs 0.15). 
Q16 slow (0.09 vs 0.08). 
Q17 same (0.07 vs 0.07). 
Q18 same (1.83 vs 1.72). 
Q19 same (0.21 vs 0.19). 
Q20 same (0.04 vs 0.03). 
Q21 same (0.1 vs 0.11). 
Q22 same (1.85 vs 1.85). 
Q23 same (4.35 vs 4.54). 
Q24 same (0.3 vs 0.28). 
Q25 same (0.05 vs 0.05). 
Q26 same (0.29 vs 0.28). 
Q27 slow (0.7 vs 0.61). 
Q28 same (0.28 vs 0.26). 
Q29 slow (0.07 vs 0.06). 
Q30 slow (0.05 vs 0.04). 
Q31 slow (0.17 vs 0.14). 
Q32 slow (0.01 vs 0.01). 
Q33 slow (0.0 vs 0.0). 
Q34 slow (0.06 vs 0.05). 
Q35 slow (0.22 vs 0.2). 
Q36 same (0.34 vs 0.31). 
Q37 slow (0.07 vs 0.06). 
Q38 same (0.17 vs 0.16). 
Q39 same (0.27 vs 0.26). 
Q40 same (0.07 vs 0.07). 
Q41 same (0.01 vs 0.01). 
Q42 same (0.02 vs 0.02). 
Q43 slow (0.0 vs 0.0). 
Q44 same (0.08 vs 0.07). 
Q45 same (0.03 vs 0.03). 
Q46 slow (0.12 vs 0.11). 
Q47 same (3.73 vs 3.95). 
Q48 same (0.43 vs 0.46). 
Q49 same (0.01 vs 0.01). 
Q50 same (0.08 vs 0.07). 
Q51 same (6.59 vs 6.94). 
Q52 same (0.02 vs 0.02). 
Q53 same (0.06 vs 0.06). 
Q54 same (0.2 vs 0.19). 
Q55 same (0.02 vs 0.02). 
Q56 fast (0.01 vs 0.01). 
Q57 same (2.21 vs 2.13). 
Q58 same (0.1 vs 0.11). 
Q59 same (0.53 vs 0.52). 
Q60 same (0.01 vs 0.01). 
Q61 fast (0.01 vs 0.01). 
Q62 same (0.05 vs 0.05). 
Q63 same (0.04 vs 0.04). 
Q64 same (0.1 vs 0.11). 
Q66 same (0.11 vs 0.12). 
Q67 same (5.24 vs 5.39). 
Q68 same (0.09 vs 0.1). 
Q69 same (0.23 vs 0.26). 
Q70 same (0.37 vs 0.39). 
Q71 same (0.05 vs 0.05). 
Q72 same (0.06 vs 0.06). 
Q73 same (0.38 vs 0.37). 
Q75 same (0.28 vs 0.31). 
Q76 fast (0.0 vs 0.01). 
Q77 same (0.14 vs 0.15). 
Q78 same (0.56 vs 0.58). 
Q79 same (0.09 vs 0.1). 
Q80 fast (0.09 vs 0.11). 
Q81 same (0.06 vs 0.06). 
Q82 same (0.12 vs 0.12). 
Q83 same (0.03 vs 0.03). 
Q84 fast (0.07 vs 0.08). 
Q85 same (0.07 vs 0.06). 
Q86 fast (0.17 vs 0.19). 
Q88 same (0.21 vs 0.21). 
Q89 same (0.1 vs 0.11). 
Q90 same (0.01 vs 0.01). 
Q91 same (0.0 vs 0.0). 
Q92 fast (0.0 vs 0.01). 
Q93 same (0.15 vs 0.16). 
Q94 same (0.04 vs 0.03). 
Q95 slow (0.95 vs 0.8). 
Q96 same (0.02 vs 0.02). 
Q97 same (0.13 vs 0.13). 
Q98 same (0.06 vs 0.06). 
######## End TPC-DS Benchmark Regression #######
Traceback (most recent call last):
  File "scripts/regression_test.py", line 121, in <module>
    regression_test(0.1)
  File "scripts/regression_test.py", line 119, in regression_test
    assert(0)
AssertionError
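
One way the "cluster them into groups" idea could look is sketched below. This is only an illustration, not the code in scripts/regression_test.py: the names `LINE_RE` and `group_by_status` are hypothetical, and it assumes each report line has the shape `Q<n> <status> (<a> vs <b>).` as in the log above, with the first number being the PR build's time (which is how the fast/slow labels in the log appear to read).

```python
import re
from collections import defaultdict

# Hypothetical pattern for one report line, e.g. "Q10 slow (0.51 vs 0.44)."
LINE_RE = re.compile(r"^Q(\d+) (fast|same|slow) \(([\d.]+) vs ([\d.]+)\)\.")


def group_by_status(report):
    """Group report lines by status; banner lines and tracebacks are skipped."""
    groups = defaultdict(list)
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            number, status, first, second = match.groups()
            # Assumption: `first` is the PR timing, `second` the master timing.
            groups[status].append((int(number), float(first), float(second)))
    return dict(groups)


if __name__ == "__main__":
    sample = "Q1 fast (0.69 vs 0.77). \nQ4 same (0.25 vs 0.25). \nQ10 slow (0.51 vs 0.44). "
    for number, first, second in group_by_status(sample).get("slow", []):
        print(f"Q{number} slow ({first} vs {second})")
```

Printing only the `slow` group would keep a failing run short while still leaving the full log available for a text search.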


pdet · 2021-09-11T15:58:43Z

So, I've changed how things work a bit.

I'm still comparing against the latest master Python wheel, but the PR is now built under a different library name. This lets me re-run only the queries whose timings did not come out the same.
So basically, if a query (the median of 5 runs) is either faster or slower, it is executed up to 25 times to check the result. I've also increased the threshold from 10% to 15%.

This should speed up the regression tests and hopefully reduce the chance of false positives or false negatives.
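
A rough sketch of the re-run logic described above, under stated assumptions. It is not the actual code in scripts/regression_test.py: `classify` and `compare_query` are hypothetical names, the timing callables are stand-ins for running a query once against each build, and re-running one pair of timings at a time is just one possible reading of "executed up to 25x".

```python
import statistics


def classify(new_time, old_time, threshold=0.15):
    """Label the PR timing relative to master using a relative threshold."""
    if new_time > old_time * (1 + threshold):
        return "slow"
    if new_time < old_time * (1 - threshold):
        return "fast"
    return "same"


def compare_query(time_new, time_old, initial_reps=5, max_reruns=25, threshold=0.15):
    """Compare medians first, then re-run only while the builds still differ.

    time_new and time_old are zero-argument callables (assumed here) that
    run the query once and return the elapsed time in seconds.
    """
    new_median = statistics.median(time_new() for _ in range(initial_reps))
    old_median = statistics.median(time_old() for _ in range(initial_reps))
    status = classify(new_median, old_median, threshold)

    reruns = 0
    # Queries that already look the same are never re-run, which is where
    # the time saving comes from.
    while status != "same" and reruns < max_reruns:
        status = classify(time_new(), time_old(), threshold)
        reruns += 1
    return status, reruns
```

With a 15% threshold, 0.51 vs 0.44 (TPC-H Q10 in the log above) would still count as slow, since 0.44 × 1.15 ≈ 0.506.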

Mytherin

Thanks for the PR! Looks good. Some minor comments:

tools/pythonpkg/lib_name.hpp

Mytherin · 2021-09-13T07:50:56Z

Looks good, thanks!

pdet added 13 commits September 8, 2021 11:07

Check perfhj impact on tpch

99fb095

push

ee2be1c

Run 10 reps and TPC-DS on regression

22167cd

1 rep

f1de527

adding prints

6de9c57

Force flush

0d305a1

More debug info, skip Q72

d696aee

Regression adjustments

94ec0a8

extra space in yml file

0978743

This file should not exist yet

fd840b1

Allowing to instal duckdb under other package names, changing regress…

e2bc7a8

…ion script to only re-run queries if they were not on the same time

Prolly need to remove the depends on for debugging

9b2e957

Run up to 25 reps, increase threshold to 15

2ee81cd

pdet added 4 commits September 11, 2021 19:54

crank repetitions up to 100, only rerun for slow queries

2d81e1f

Let go to 200

7290b07

Merge branch 'master' of github.com:pdet/duckdb into checkperfhj

c1c9e8d

Separating regression tests to different machines on CI, adding H2OAI

b6e0250

pdet changed the title ~~Regression Test: TPC-DS and other adjustments~~ Regression Test: TPC-DS/H20AI and other adjustments Sep 12, 2021

pdet added 5 commits September 12, 2021 15:01

Build directly from master instead of using --pre

457a438

pip install numpy

bc62382

Adjusting other CI tests and small adjustment

de22b5b

Reverting the other CI jobs back

d259537

space in yml file

1fa5347

Mytherin reviewed Sep 12, 2021


tools/pythonpkg/lib_name.hpp (two review comments, outdated and resolved)

PR requests

d362eef

Mytherin merged commit 74a14eb into duckdb:master Sep 13, 2021
