[KYUUBI #5925] Kyuubi TPC-DS support running benchmark with skipping some queries #5925
how about changing |
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #5925 +/- ##
============================================
- Coverage 61.57% 61.48% -0.10%
Complexity 23 23
============================================
Files 616 616
Lines 36388 36388
Branches 4979 4979
============================================
- Hits 22407 22374 -33
- Misses 11568 11586 +18
- Partials 2413 2428 +15
☔ View full report in Codecov by Sentry. |
would it be clear if we replace the for priority, especially when
I lean to option 1, but both are fine, as long as we document the behavior clearly |
It's cleaner to keep only include and exclude in the end; I can make the change. My earlier description of the priority may have been misleading: exclude taking precedence over include means the same thing as option 1. |
that's fine, please go ahead |
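The agreed behavior (keep only include and exclude, with exclude taking precedence) could be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `QueryFilter`, `parseNames`, and `select` are hypothetical names, and matching on exact query names is an assumption made for simplicity.

```scala
// Hypothetical helper for illustration only; not the actual Kyuubi implementation.
object QueryFilter {

  /** Split a comma-separated option value such as "q1,q2" into a set of names. */
  def parseNames(value: String): Set[String] =
    value.split(",").map(_.trim).filter(_.nonEmpty).toSet

  /**
   * Select the queries to run.
   * Assumption: an empty include set means "run every query".
   * Exclude is applied last, so it always wins over include.
   */
  def select(
      all: Seq[String],
      include: Set[String],
      exclude: Set[String]): Seq[String] =
    all
      .filter(q => include.isEmpty || include.contains(q))
      .filterNot(exclude.contains)
}
```

With this sketch, a query listed in both sets is skipped, which is the "exclude takes precedence" reading discussed above; the original order of the query list is preserved.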
…(list)/skip(list)
dev/kyuubi-tpcds/README.md
Outdated
| include | none(optional) | name of the queries to run, use , split multiple name, e.g. q1,q2 |
| exclude | none(optional) | name of the queries to exclude, use , split multiple name, e.g. q2,q4 |
| include | none(optional) | name of the queries to run, use comma to split multiple names, e.g. q1,q2 |
| exclude | none(optional) | name of the queries to exclude, use comma to split multiple names, e.g. q2,q4 |
@@ -66,11 +63,16 @@ object RunBenchmark {
       opt[String]('r', "results-dir")
         .action((x, c) => c.copy(resultsDir = x))
         .text("dir to store benchmark results, e.g. hdfs://hdfs-nn:9870/pref")
-      opt[String]('q', "queries")
+      opt[String]('I', "include")
the short opt is confusing, let's keep the long one only
updated.
LGTM, only minor comments |
UT failure is unrelated, thanks, merged to master |
@rhh777 BTW, you may want to link your email haorenhui@kingsoft.com to your GitHub account, otherwise, your contribution will not be counted by GitHub :) |
Thanks, I'll adjust it later. |
BTW, it seems that 📝 Committer Pre-Merge Checklist is not welcomed at all. @pan3793 |
Is it my job to fill in this list? I wasn't sure before. |
thank you for your contribution and there is extra effort for you. @rhh777 |
@rhh777 it's my responsibility to fill in the |
TBH, the checklist is too long. Especially when there are images in the description and the network is slow, ticking the long list of checkboxes makes the screen flicker many times. I prefer the previous brief template, which listed only a few of the most important items and left the rest to scripts, e.g. checking that assignees and a milestone are selected. |
BTW, I recommend avoiding emoji and using plain text in PR titles and descriptions; emoji don't render well on some platforms |
…pping some queries

# 🔍 Description
## Issue References 🔗

When running Kyuubi's TPCDS, some SQL runs slowly, but there are no parameters to skip it.

## Describe Your Solution 🔧

Add the skip parameter, specifying a comma-separated list of SQL

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️

no parameters to skip it.

#### Behavior With This Pull Request 🎉

```
$SPARK_HOME/bin/spark-submit \
  --class org.apache.kyuubi.tpcds.benchmark.RunBenchmark \
  kyuubi-tpcds_*.jar --db tpcds_sf10 --exclude q2,q4
```

> == QUERY LIST ==
> q1-v2.4
> q3-v2.4
> q5-v2.4
> q6-v2.4
> q7-v2.4
> q8-v2.4
> q9-v2.4
> .....

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes apache#5925 from rhh777/tpcds-support-skip-queries.
Closes apache#5925
682f30c [haorenhui] Update some descriptions
cd90fb5 [haorenhui] Use include(list) and exclude(list) to replace filter(string)/queries(list)/skip(list)
13744e5 [haorenhui] kyuubi tpcds RunBenchmark support skip some of the queries

Authored-by: haorenhui <haorenhui@kingsoft.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>