Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-32923][FOLLOW-UP] Clean up older shuffleMergeId shuffle files when finalize request for higher shuffleMergeId is received #33605

Closed
wants to merge 2 commits into from

Conversation

venkata91
Copy link
Contributor

@venkata91 venkata91 commented Aug 2, 2021

What changes were proposed in this pull request?

Clean up older shuffleMergeId shuffle files when finalize request for higher shuffleMergeId is received when no blocks pushed for the corresponding shuffleMergeId. This is identified as part of #33034 (comment).

Why are the changes needed?

Without this change, older shuffleMergeId files won't be cleaned up properly.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added changes to existing unit test to address this case.

@github-actions github-actions bot added the CORE label Aug 2, 2021
@venkata91
Copy link
Contributor Author

cc @Ngone51 @mridulm follow-up fix for the cleanup discussed here #33034 (comment). Thanks!

@SparkQA
Copy link

SparkQA commented Aug 2, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46450/

@SparkQA
Copy link

SparkQA commented Aug 2, 2021

Test build #141940 has finished for PR 33605 at commit 936ddc0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 3, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46487/

@Ngone51 Ngone51 closed this Aug 3, 2021
@Ngone51 Ngone51 reopened this Aug 3, 2021
@SparkQA
Copy link

SparkQA commented Aug 3, 2021

Test build #141976 has finished for PR 33605 at commit 1019d7a.

  • This patch fails from timeout after a configured wait of 500m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm
Copy link
Contributor

mridulm commented Aug 4, 2021

Looks good to me, thanks for fixing this @venkata91 !

@asfgit asfgit closed this in d816949 Aug 4, 2021
asfgit pushed a commit that referenced this pull request Aug 4, 2021
…when finalize request for higher shuffleMergeId is received

### What changes were proposed in this pull request?

Clean up older shuffleMergeId shuffle files when finalize request for higher shuffleMergeId is received when no blocks pushed for the corresponding shuffleMergeId. This is identified as part of #33034 (comment).

### Why are the changes needed?

Without this change, older shuffleMergeId files won't be cleaned up properly.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added changes to existing unit test to address this case.

Closes #33605 from venkata91/SPARK-32923-follow-on.

Authored-by: Venkata krishnan Sowrirajan <vsowrirajan@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit d816949)
Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
@mridulm
Copy link
Contributor

mridulm commented Aug 4, 2021

Merged to master and branch-3.2
+CC @gengliangwang

Thanks for fixing this @venkata91.
Thanks for reviewing this @Ngone51 !

domybest11 pushed a commit to domybest11/spark that referenced this pull request Jun 15, 2022
…when finalize request for higher shuffleMergeId is received

### What changes were proposed in this pull request?

Clean up older shuffleMergeId shuffle files when finalize request for higher shuffleMergeId is received when no blocks pushed for the corresponding shuffleMergeId. This is identified as part of apache#33034 (comment).

### Why are the changes needed?

Without this change, older shuffleMergeId files won't be cleaned up properly.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added changes to existing unit test to address this case.

Closes apache#33605 from venkata91/SPARK-32923-follow-on.

Authored-by: Venkata krishnan Sowrirajan <vsowrirajan@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit d816949)
Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
wangyum pushed a commit that referenced this pull request May 26, 2023
…when finalize request for higher shuffleMergeId is received

### What changes were proposed in this pull request?

Clean up older shuffleMergeId shuffle files when finalize request for higher shuffleMergeId is received when no blocks pushed for the corresponding shuffleMergeId. This is identified as part of #33034 (comment).

### Why are the changes needed?

Without this change, older shuffleMergeId files won't be cleaned up properly.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added changes to existing unit test to address this case.

Closes #33605 from venkata91/SPARK-32923-follow-on.

Authored-by: Venkata krishnan Sowrirajan <vsowrirajan@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants