Regression Test: TPC-DS/H20AI and other adjustments #2263

Merged
merged 23 commits into from Sep 13, 2021

Conversation

pdet
Member

@pdet pdet commented Sep 10, 2021

I've added TPC-DS to the regression tests (except queries 64, 72, and 85).
I've increased the number of repetitions from 3 to 5 (should take 1 - 1:30 to run), in an attempt to reduce false negatives.

The output of a run looks something like the listing below. That seems like a bit too much information to me, so I can also adjust it to print only the queries that were slower, or only those that were faster or slower, or maybe cluster them into groups. In the end, I guess that if the test breaks, a simple Ctrl+F is enough to find the slower queries, but I'm happy to hear suggestions:

######## Status TPC-H Benchmark Regression #######
Q1 fast (0.69 vs 0.77). 
Q2 fast (0.05 vs 0.06). 
Q3 fast (0.19 vs 0.22). 
Q4 same (0.25 vs 0.25). 
Q5 same (0.18 vs 0.18). 
Q6 fast (0.08 vs 0.1). 
Q7 same (0.38 vs 0.41). 
Q8 same (0.19 vs 0.21). 
Q9 same (2.61 vs 2.64). 
Q10 slow (0.51 vs 0.44). 
Q11 same (0.02 vs 0.02). 
Q12 same (0.47 vs 0.43). 
Q13 slow (0.24 vs 0.21). 
Q14 slow (0.11 vs 0.1). 
Q15 same (0.18 vs 0.17). 
Q16 same (0.1 vs 0.09). 
Q17 slow (0.32 vs 0.29). 
Q18 same (0.62 vs 0.59). 
Q19 same (0.66 vs 0.66). 
Q20 same (0.21 vs 0.22). 
######## End TPC-H Benchmark Regression #######
######## Status TPC-DS Benchmark Regression #######
Q1 same (0.03 vs 0.03). 
Q2 same (0.29 vs 0.29). 
Q3 slow (0.02 vs 0.01). 
Q4 same (1.12 vs 1.13). 
Q5 same (0.25 vs 0.24). 
Q6 same (0.33 vs 0.31). 
Q7 same (0.22 vs 0.2). 
Q8 same (0.04 vs 0.04). 
Q9 same (0.34 vs 0.36). 
Q10 fast (0.23 vs 0.26). 
Q11 same (0.73 vs 0.72). 
Q12 same (0.02 vs 0.02). 
Q13 same (0.47 vs 0.46). 
Q14 same (5.25 vs 4.99). 
Q15 same (0.15 vs 0.15). 
Q16 slow (0.09 vs 0.08). 
Q17 same (0.07 vs 0.07). 
Q18 same (1.83 vs 1.72). 
Q19 same (0.21 vs 0.19). 
Q20 same (0.04 vs 0.03). 
Q21 same (0.1 vs 0.11). 
Q22 same (1.85 vs 1.85). 
Q23 same (4.35 vs 4.54). 
Q24 same (0.3 vs 0.28). 
Q25 same (0.05 vs 0.05). 
Q26 same (0.29 vs 0.28). 
Q27 slow (0.7 vs 0.61). 
Q28 same (0.28 vs 0.26). 
Q29 slow (0.07 vs 0.06). 
Q30 slow (0.05 vs 0.04). 
Q31 slow (0.17 vs 0.14). 
Q32 slow (0.01 vs 0.01). 
Q33 slow (0.0 vs 0.0). 
Q34 slow (0.06 vs 0.05). 
Q35 slow (0.22 vs 0.2). 
Q36 same (0.34 vs 0.31). 
Q37 slow (0.07 vs 0.06). 
Q38 same (0.17 vs 0.16). 
Q39 same (0.27 vs 0.26). 
Q40 same (0.07 vs 0.07). 
Q41 same (0.01 vs 0.01). 
Q42 same (0.02 vs 0.02). 
Q43 slow (0.0 vs 0.0). 
Q44 same (0.08 vs 0.07). 
Q45 same (0.03 vs 0.03). 
Q46 slow (0.12 vs 0.11). 
Q47 same (3.73 vs 3.95). 
Q48 same (0.43 vs 0.46). 
Q49 same (0.01 vs 0.01). 
Q50 same (0.08 vs 0.07). 
Q51 same (6.59 vs 6.94). 
Q52 same (0.02 vs 0.02). 
Q53 same (0.06 vs 0.06). 
Q54 same (0.2 vs 0.19). 
Q55 same (0.02 vs 0.02). 
Q56 fast (0.01 vs 0.01). 
Q57 same (2.21 vs 2.13). 
Q58 same (0.1 vs 0.11). 
Q59 same (0.53 vs 0.52). 
Q60 same (0.01 vs 0.01). 
Q61 fast (0.01 vs 0.01). 
Q62 same (0.05 vs 0.05). 
Q63 same (0.04 vs 0.04). 
Q64 same (0.1 vs 0.11). 
Q66 same (0.11 vs 0.12). 
Q67 same (5.24 vs 5.39). 
Q68 same (0.09 vs 0.1). 
Q69 same (0.23 vs 0.26). 
Q70 same (0.37 vs 0.39). 
Q71 same (0.05 vs 0.05). 
Q72 same (0.06 vs 0.06). 
Q73 same (0.38 vs 0.37). 
Q75 same (0.28 vs 0.31). 
Q76 fast (0.0 vs 0.01). 
Q77 same (0.14 vs 0.15). 
Q78 same (0.56 vs 0.58). 
Q79 same (0.09 vs 0.1). 
Q80 fast (0.09 vs 0.11). 
Q81 same (0.06 vs 0.06). 
Q82 same (0.12 vs 0.12). 
Q83 same (0.03 vs 0.03). 
Q84 fast (0.07 vs 0.08). 
Q85 same (0.07 vs 0.06). 
Q86 fast (0.17 vs 0.19). 
Q88 same (0.21 vs 0.21). 
Q89 same (0.1 vs 0.11). 
Q90 same (0.01 vs 0.01). 
Q91 same (0.0 vs 0.0). 
Q92 fast (0.0 vs 0.01). 
Q93 same (0.15 vs 0.16). 
Q94 same (0.04 vs 0.03). 
Q95 slow (0.95 vs 0.8). 
Q96 same (0.02 vs 0.02). 
Q97 same (0.13 vs 0.13). 
Q98 same (0.06 vs 0.06). 
######## End TPC-DS Benchmark Regression #######
Traceback (most recent call last):
  File "scripts/regression_test.py", line 121, in <module>
    regression_test(0.1)
  File "scripts/regression_test.py", line 119, in regression_test
    assert(0)
AssertionError
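
For readers who want to reproduce the fast/slow/same labels in the listing above, here is a minimal sketch. It assumes the first number is the PR build, the second is master, and a query is flagged when it deviates from master by more than the threshold (0.1, as passed to `regression_test` in the traceback). The exact comparison in `scripts/regression_test.py` is not shown in this thread, so `classify`, `format_status`, and `slow_queries` are hypothetical names, not the script's own functions.

```python
def classify(new_time, old_time, threshold=0.1):
    """Label one query as 'slow', 'fast' or 'same'.

    Assumption: the deviation is measured relative to the old (master) timing.
    """
    if new_time > old_time * (1 + threshold):
        return "slow"
    if new_time < old_time * (1 - threshold):
        return "fast"
    return "same"


def format_status(query_number, new_time, old_time, threshold=0.1):
    """Build one status line in the style of the listing above."""
    status = classify(new_time, old_time, threshold)
    return f"Q{query_number} {status} ({round(new_time, 2)} vs {round(old_time, 2)})."


def slow_queries(results, threshold=0.1):
    """Return only the query numbers that got slower.

    `results` maps a query number to a (new_time, old_time) pair.
    """
    return [
        query
        for query, (new_time, old_time) in results.items()
        if classify(new_time, old_time, threshold) == "slow"
    ]


if __name__ == "__main__":
    timings = {1: (0.69, 0.77), 7: (0.38, 0.41), 10: (0.51, 0.44)}
    for query, (new_time, old_time) in timings.items():
        print(format_status(query, new_time, old_time))
    print("slower:", slow_queries(timings))
```

Printing only the result of `slow_queries` would be one way to get the shorter output discussed above, at the cost of hiding queries that became faster.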

@pdet
Member Author

pdet commented Sep 11, 2021

So, I've changed how things work a bit.

I'm still comparing against the latest master Python wheel, but now the PR is built under a different library name, so both builds can be loaded side by side. This allows me to re-run only the queries that did not produce the same result.
So basically, if a query (the median of 5 runs) is either faster or slower, it will be executed up to 25 times to check it. I've also increased the threshold from 10% to 15%.

This should speed up the regression tests and, hopefully, reduce the chance of false positives or false negatives.
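
A rough sketch of the re-run idea, under stated assumptions: each check takes the median of 5 timings per build, and a query that falls outside the 15% threshold is re-measured until the two builds agree or 25 re-runs have been used. How the actual script counts its 25 executions is not spelled out here, so the loop below is only one possible reading. `run_new` and `run_old` are hypothetical callables that execute the query once on the PR build or on master and return the elapsed seconds.

```python
import statistics


def median_time(run_query, repetitions=5):
    """Median of `repetitions` timings, each returned by one call to run_query()."""
    return statistics.median(run_query() for _ in range(repetitions))


def differs(new_time, old_time, threshold=0.15):
    """True if the new timing deviates from the old one by more than the threshold."""
    return abs(new_time - old_time) > old_time * threshold


def check_query(run_new, run_old, threshold=0.15, max_reruns=25):
    """Return True if the query is still outside the threshold after all re-runs.

    Assumption: one re-run means measuring both builds again (median of 5 each).
    """
    new_time = median_time(run_new)
    old_time = median_time(run_old)
    reruns = 0
    while differs(new_time, old_time, threshold) and reruns < max_reruns:
        new_time = median_time(run_new)
        old_time = median_time(run_old)
        reruns += 1
    return differs(new_time, old_time, threshold)
```

With this shape, queries that agree on the first measurement cost only 5 runs per build, and only the suspicious ones pay for the extra repetitions.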

@pdet pdet changed the title from "Regression Test: TPC-DS and other adjustments" to "Regression Test: TPC-DS/H20AI and other adjustments" Sep 12, 2021
Collaborator

@Mytherin Mytherin left a comment

Thanks for the PR! Looks good. Some minor comments:

tools/pythonpkg/lib_name.hpp (two review comments, outdated and resolved)
@Mytherin Mytherin merged commit 74a14eb into duckdb:master Sep 13, 2021
@Mytherin
Collaborator

Looks good, thanks!

2 participants