Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #5925] Kyuubi TPC-DS support running benchmark with skipping some queries #5925

Closed
wants to merge 3 commits into from

Conversation

rhh777
Copy link

@rhh777 rhh777 commented Dec 27, 2023

🔍 Description

Issue References 🔗

When running Kyuubi's TPCDS, some SQL runs slowly, but there are no parameters to skip it.

Describe Your Solution 🔧

Add the skip parameter, specifying a comma-separated list of SQL

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

no parameters to skip it.

Behavior With This Pull Request 🎉

$SPARK_HOME/bin/spark-submit \
  --class org.apache.kyuubi.tpcds.benchmark.RunBenchmark \
  kyuubi-tpcds_*.jar --db tpcds_sf10 --exclude q2,q4

== QUERY LIST ==
q1-v2.4
q3-v2.4
q5-v2.4
q6-v2.4
q7-v2.4
q8-v2.4
q9-v2.4
.....

Related Unit Tests


Checklists

📝 Author Self Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • This patch was not authored or co-authored using Generative Tooling

📝 Committer Pre-Merge Checklist

  • Pull request title is okay.
  • No license issues.
  • Milestone correctly set?
  • Test coverage is ok
  • Assignees are selected.
  • Minimum number of approvals
  • No changes are requested

Be nice. Be informative.

@rhh777 rhh777 changed the title Kyuubi tpcds RunBenchmark support skip some of the queries [KYUUBI #5925]Kyuubi tpcds RunBenchmark support skip some of the queries Dec 27, 2023
@cxzl25 cxzl25 changed the title [KYUUBI #5925]Kyuubi tpcds RunBenchmark support skip some of the queries [KYUUBI #5925] Kyuubi tpcds RunBenchmark support skip some of the queries Dec 27, 2023
@pan3793
Copy link
Member

pan3793 commented Dec 27, 2023

how about changing filter to include and skip to exclude? And we should define the priority of these two options

@pan3793 pan3793 changed the title [KYUUBI #5925] Kyuubi tpcds RunBenchmark support skip some of the queries [KYUUBI #5925] Kyuubi TPC-DS support running benchmark with skipping some queries Dec 27, 2023
@codecov-commenter
Copy link

codecov-commenter commented Dec 27, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c6177ab) 61.57% compared to head (cd90fb5) 61.48%.
Report is 1 commits behind head on master.

❗ Current head cd90fb5 differs from pull request most recent head 682f30c. Consider uploading reports for the commit 682f30c to get more accurate results

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5925      +/-   ##
============================================
- Coverage     61.57%   61.48%   -0.10%     
  Complexity       23       23              
============================================
  Files           616      616              
  Lines         36388    36388              
  Branches       4979     4979              
============================================
- Hits          22407    22374      -33     
- Misses        11568    11586      +18     
- Partials       2413     2428      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rhh777
Copy link
Author

rhh777 commented Dec 27, 2023

how about changing filter to include and skip to exclude? And we should define the priority of these two options

  1. It sounds great, but now there is also the queries parameter. Its function is similar to filter. Filter can only include one SQL, and queries can include multiple SQLs.

  2. Regarding priority, I think exclude should always take precedence over include.

@pan3793
Copy link
Member

pan3793 commented Dec 27, 2023

would it be clear if we replace the queries and filter with include(list) and exclude(list)? it's kind of a developer tool, such breaking changes are acceptable.

for priority, especially when include and exclude are present at the same time, there are some candidates:

  1. the final result is include minus exclude
  2. as you proposed, simply ignore the include

I lean to option 1, but both are fine, as long as we document the behavior clearly

@rhh777
Copy link
Author

rhh777 commented Dec 27, 2023

It's cleaner to just keep include and exclude in the end. I can make changes.

My previous description of priority may have some misunderstandings. The meaning of exclude taking precedence over include is the same as option 1.

@pan3793
Copy link
Member

pan3793 commented Dec 27, 2023

that's fine, please go ahead

Comment on lines 58 to 59
| include | none(optional) | name of the queries to run, use , split multiple name, e.g. q1,q2 |
| exclude | none(optional) | name of the queries to exclude, use , split multiple name, e.g. q2,q4 |
Copy link
Member

@pan3793 pan3793 Dec 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| include | none(optional) | name of the queries to run, use , split multiple name, e.g. q1,q2 |
| exclude | none(optional) | name of the queries to exclude, use , split multiple name, e.g. q2,q4 |
| include | none(optional) | name of the queries to run, use comma to split multiple names, e.g. q1,q2 |
| exclude | none(optional) | name of the queries to exclude, use comma to split multiple names, e.g. q2,q4 |

@@ -66,11 +63,16 @@ object RunBenchmark {
opt[String]('r', "results-dir")
.action((x, c) => c.copy(resultsDir = x))
.text("dir to store benchmark results, e.g. hdfs://hdfs-nn:9870/pref")
opt[String]('q', "queries")
opt[String]('I', "include")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the short opt is confusing, let's keep the long one only

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@pan3793
Copy link
Member

pan3793 commented Dec 27, 2023

LGTM, only minor comments

@pan3793 pan3793 added this to the v1.9.0 milestone Dec 28, 2023
@pan3793 pan3793 closed this in e49af52 Dec 28, 2023
@pan3793
Copy link
Member

pan3793 commented Dec 28, 2023

UT failure is unrelated, thanks, merged to master

@pan3793
Copy link
Member

pan3793 commented Dec 28, 2023

@rhh777 BTW, you may want to link your email haorenhui@kingsoft.com to your GitHub account, otherwise, your contribution will not be counted by GitHub :)

@rhh777
Copy link
Author

rhh777 commented Dec 28, 2023

@rhh777 BTW, you may want to link your email haorenhui@kingsoft.com to your GitHub account, otherwise, your contribution will not be counted by GitHub :)

Thanks, I'll adjust it later.

@yaooqinn
Copy link
Member

BTW, it seems that 📝 Committer Pre-Merge Checklist is not welcomed at all. @pan3793

@rhh777
Copy link
Author

rhh777 commented Dec 28, 2023

BTW, it seems that 📝 Committer Pre-Merge Checklist is not welcomed at all. @pan3793

Is it my job to fill in this list? I wasn't sure before.

@yaooqinn
Copy link
Member

thank you for your contribution and there is extra effort for you. @rhh777
this is a reviewer's field

@pan3793
Copy link
Member

pan3793 commented Dec 28, 2023

@rhh777 it's my responsibility to fill in the committer checklist, you just need to fill in the "Author Self Checklist"

@pan3793
Copy link
Member

pan3793 commented Dec 28, 2023

BTW, it seems that 📝 Committer Pre-Merge Checklist is not welcomed at all. @pan3793

TBH, the checklist is too long, especially when there are images in the description and the network is slow, selecting the long checkbox makes the screen flicker many times. I prefer to the previous brief template, only listing a few most important items and leave others to scripts, e.g. checking assignees, milestone are selected.

@pan3793
Copy link
Member

pan3793 commented Dec 28, 2023

BTW, I recommend avoid using emoji but prefer plain text in PR title and description, it doesn't render well on some platforms

zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Mar 21, 2024
…pping some queries

# 🔍 Description
## Issue References 🔗

When running Kyuubi's TPCDS, some SQL runs slowly, but there are no parameters to skip it.

## Describe Your Solution 🔧

Add the skip parameter, specifying a comma-separated list of SQL

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

#### Behavior Without This Pull Request ⚰️
no parameters to skip it.

#### Behavior With This Pull Request 🎉
```
$SPARK_HOME/bin/spark-submit \
  --class org.apache.kyuubi.tpcds.benchmark.RunBenchmark \
  kyuubi-tpcds_*.jar --db tpcds_sf10 --exclude q2,q4
```

> == QUERY LIST ==
> q1-v2.4
> q3-v2.4
> q5-v2.4
> q6-v2.4
> q7-v2.4
> q8-v2.4
> q9-v2.4
> .....

#### Related Unit Tests

---

# Checklists
## 📝 Author Self Checklist

- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

## 📝 Committer Pre-Merge Checklist

- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested

**Be nice. Be informative.**

Closes apache#5925 from rhh777/tpcds-support-skip-queries.

Closes apache#5925

682f30c [haorenhui] Update some descriptions
cd90fb5 [haorenhui] Use include(list) and exclude(list) to replace filter(string)/queries(list)/skip(list)
13744e5 [haorenhui] kyuubi tpcds RunBenchmark support skip some of the queries

Authored-by: haorenhui <haorenhui@kingsoft.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants