Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable window-group-limit optimization on [databricks] #10550

Merged
merged 5 commits into from
Mar 11, 2024

Conversation

mythrocks
Copy link
Collaborator

Fixes #10531.

This is a followup to #10500, which added support to push down window-group-limit filters before the shuffle phase.

#10500 inadvertently neglected to ensure that the optimization works on Databricks. (It turns out that window-group-limit was cherry-picked into Databricks 13.3, despite the nominal Spark version being 3.4.1.)

This change ensures that the same optimization is available on Databricks 13.3 (and beyond).

@mythrocks mythrocks added the feature request New feature or request label Mar 5, 2024
@mythrocks mythrocks self-assigned this Mar 5, 2024
@mythrocks mythrocks marked this pull request as draft March 5, 2024 21:54
Signed-off-by: MithunR <mythrocks@gmail.com>
@mythrocks
Copy link
Collaborator Author

Build

@mythrocks mythrocks marked this pull request as ready for review March 6, 2024 04:15
@mythrocks mythrocks changed the title [DRAFT] Enable window-group-limit optimization on [databricks] Enable window-group-limit optimization on [databricks] Mar 6, 2024
@mythrocks mythrocks requested a review from revans2 March 6, 2024 19:28
@sameerz sameerz added performance A performance related task/issue and removed feature request New feature or request labels Mar 6, 2024
Leaning on shimplify to generate this for DB and non-DB.
@mythrocks
Copy link
Collaborator Author

Build

1 similar comment
@mythrocks
Copy link
Collaborator Author

Build

@mythrocks
Copy link
Collaborator Author

Build

@mythrocks
Copy link
Collaborator Author

I've manually verified that this optimization works on Databricks.
The CI run seems to be failing with something unrelated to this functionality:

[2024-03-07T02:24:19.427Z] org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: Scripts not permitted to use method org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper getRawBuild
[2024-03-07T02:24:19.427Z] 	at org.jenkinsci.plugins.scriptsecurity.sandbox.whitelists.StaticWhitelist.rejectMethod(StaticWhitelist.java:229)
[2024-03-07T02:24:19.427Z] 	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.rejectMethod(SandboxInterceptor.java:594)
[2024-03-07T02:24:19.427Z] 	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:167)
[2024-03-07T02:24:19.427Z] 	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:178)
[2024-03-07T02:24:19.427Z] 	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:182)
[2024-03-07T02:24:19.427Z] 	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
[2024-03-07T02:24:19.427Z] 	at WorkflowScript.run(WorkflowScript:291)
[2024-03-07T02:24:19.427Z] 	at ___cps.transform___(Native Method)
...
[2024-03-07T02:24:19.427Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2024-03-07T02:24:19.427Z] 	at java.base/java.lang.Thread.run(Thread.java:829)
[2024-03-07T02:24:20.203Z] Finished: FAILURE

@mythrocks
Copy link
Collaborator Author

Build

@NvTimLiu
Copy link
Collaborator

I've manually verified that this optimization works on Databricks. The CI run seems to be failing with something unrelated to this functionality:

[2024-03-07T02:24:19.427Z] org.jenkinsci.plugins.scriptsecurity.sandbox.RejectedAccessException: Scripts not permitted to use method org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper getRawBuild
[2024-03-07T02:24:19.427Z] 	at org.jenkinsci.plugins.scriptsecurity.sandbox.whitelists.StaticWhitelist.rejectMethod(StaticWhitelist.java:229)
[2024-03-07T02:24:19.427Z] 	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.rejectMethod(SandboxInterceptor.java:594)
[2024-03-07T02:24:19.427Z] 	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onMethodCall(SandboxInterceptor.java:167)
[2024-03-07T02:24:19.427Z] 	at org.kohsuke.groovy.sandbox.impl.Checker$1.call(Checker.java:178)
[2024-03-07T02:24:19.427Z] 	at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:182)
[2024-03-07T02:24:19.427Z] 	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
[2024-03-07T02:24:19.427Z] 	at WorkflowScript.run(WorkflowScript:291)
[2024-03-07T02:24:19.427Z] 	at ___cps.transform___(Native Method)
...
[2024-03-07T02:24:19.427Z] 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2024-03-07T02:24:19.427Z] 	at java.base/java.lang.Thread.run(Thread.java:829)
[2024-03-07T02:24:20.203Z] Finished: FAILURE

Looks like getRawBuild was removed from Jenkins ScriptApproval list by someone, added it back. this Scripts not permitted to use method org.jenkinsci.plugins.workflow.support.steps.build.RunWrapper getRawBuild error should be not exist

@revans2
Copy link
Collaborator

revans2 commented Mar 11, 2024

build

@mythrocks mythrocks merged commit 9cf2acb into NVIDIA:branch-24.04 Mar 11, 2024
42 of 43 checks passed
@mythrocks
Copy link
Collaborator Author

This change has been merged. Thank you for the review (@revans2), and the help fixing CI (@NvTimLiu).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance A performance related task/issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Support "WindowGroupLimit" optimization on GPU for Databricks 13.3 ML LTS+
4 participants