Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-29072][CORE] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and SortShuffleWriter #25780

Closed
wants to merge 3 commits into from

Conversation

mccheah
Copy link
Contributor

@mccheah mccheah commented Sep 13, 2019

What changes were proposed in this pull request?

The previous refactors of the shuffle writers using the shuffle writer plugin resulted in shuffle write metric updates - particularly write times - being lost in particular situations. This patch restores the lost metric updates.

Why are the changes needed?

This fixes a regression. I'm pretty sure that without this, the Spark UI will lose shuffle write time information.

Does this PR introduce any user-facing change?

No change from Spark 2.4. Without this, there would be a user-facing bug in Spark 3.0.

How was this patch tested?

Existing unit tests.

@mccheah
Copy link
Contributor Author

mccheah commented Sep 13, 2019

@squito @vanzin

@squito
Copy link
Contributor

squito commented Sep 13, 2019

thanks for catching this. Looks good, though I'll take a fresh look tomorrow.

I did double check BypassMergeSortShuffleWriter and it looks fine

@SparkQA
Copy link

SparkQA commented Sep 13, 2019

Test build #110552 has finished for PR 25780 at commit fe5b93f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 13, 2019

Test build #110553 has finished for PR 25780 at commit 08a6068.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-29072] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and SortShuffleWriter [SPARK-29072][CORE] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and SortShuffleWriter Sep 13, 2019
@asfgit asfgit closed this in 67751e2 Sep 16, 2019
@squito
Copy link
Contributor

squito commented Sep 16, 2019

merged to master

@gatorsmile
Copy link
Member

@mccheah Could you explain which PR causes this regression?

@HyukjinKwon
Copy link
Member

Seems it was simply mistakenly missed during refactoring at https://github.com/apache/spark/pull/25304/files#diff-a1d00506391c1c4b2209f9bbff590c5bL390 at #25304

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants