
[Improvement] ShuffleBlock should be released when finished reading #74

Merged · 5 commits · Jul 29, 2022

Conversation

xianjingfeng (Member) commented on Jul 26, 2022

What changes were proposed in this pull request?

Release the ShuffleBlock once it has been fully read.

Why are the changes needed?

We found that Spark executors are easily killed by YARN, and the cause is that the executor uses too much off-heap memory when reading shuffle data. Most of that off-heap memory holds uncompressed shuffle data, and it is released only when a GC is triggered.
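The idea is to free a block's backing buffer eagerly instead of waiting for a GC cycle. Below is a minimal sketch (not the PR's actual code; the class name is hypothetical) of how a direct ByteBuffer can be released explicitly on JDK 8:

```java
import java.nio.ByteBuffer;

// Hypothetical helper: frees a direct ByteBuffer's native memory
// immediately instead of waiting for its wrapper object to be GC'd.
public final class DirectBufferReleaser {

  public static void release(ByteBuffer buffer) {
    if (buffer == null || !buffer.isDirect()) {
      return; // heap buffers are reclaimed by the GC as usual
    }
    // JDK 8 internal API; JDK 9+ needs reflection or --add-opens, or a
    // library call such as Netty's PlatformDependent.freeDirectBuffer.
    ((sun.nio.ch.DirectBuffer) buffer).cleaner().clean();
  }
}
```

A reader would call release(...) on each block's buffer as soon as iteration over that block completes.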

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a new UT.

jerqi (Contributor) commented on Jul 26, 2022

Did you verify this patch? I think we'd better have a UT for every PR; at the least, we should test the feature by hand.

xianjingfeng (Member, Author) commented on Jul 26, 2022

> Did you verify this patch? I think we'd better have a UT for every PR; at the least, we should test the feature by hand.

We have tested it in our production environment. Sometimes it is difficult to write a UT; I will try to write one for this PR later.

jerqi (Contributor) commented on Jul 26, 2022

> > Did you verify this patch? I think we'd better have a UT for every PR; at the least, we should test the feature by hand.
>
> We have tested it in our production environment. Sometimes it is difficult to write a UT; I will try to write one for this PR later.

Has your company used Uniffle in your production environment? Just curious: could you tell me the name of your company?

jerqi (Contributor) commented on Jul 26, 2022

> > Did you verify this patch? I think we'd better have a UT for every PR; at the least, we should test the feature by hand.
>
> We have tested it in our production environment. Sometimes it is difficult to write a UT; I will try to write one for this PR later.

I know it's sometimes difficult. If you can't write a UT for it, you should provide some detailed data from your production environment to prove the effect of the PR. A screenshot would be best.

xianjingfeng (Member, Author)

> > > Did you verify this patch? I think we'd better have a UT for every PR; at the least, we should test the feature by hand.
> >
> > We have tested it in our production environment. Sometimes it is difficult to write a UT; I will try to write one for this PR later.
>
> Has your company used Uniffle in your production environment? Just curious: could you tell me the name of your company?

SF

xianjingfeng (Member, Author)

I think using a HeapByteBuffer would be better here. What do you think?

jerqi (Contributor) commented on Jul 27, 2022

> I think using a HeapByteBuffer would be better here. What do you think?

We found that Uniffle's GC time is longer than Spark's original shuffle in our tests when tasks read shuffle data, so we'd better use off-heap memory. But gRPC uses heap memory, so we can't use off-heap memory entirely. We will replace gRPC with Netty in the future; then we hope to use off-heap memory throughout.
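For context, here is a small illustrative sketch (not Uniffle code) of the trade-off under discussion: heap buffers are reclaimed promptly by the GC but add GC pressure when they are large and short-lived, while direct buffers live off-heap and linger until their small wrapper objects are collected, which is why the PR releases them explicitly:

```java
import java.nio.ByteBuffer;

public class BufferChoiceDemo {
  public static void main(String[] args) {
    // Heap buffer: backed by a byte[] on the Java heap. Cheap to allocate
    // and promptly reclaimed, but large short-lived buffers add GC
    // pressure and may be copied to native memory during I/O.
    ByteBuffer heap = ByteBuffer.allocate(4 * 1024 * 1024);

    // Direct buffer: backed by native (off-heap) memory. Avoids heap GC
    // pressure and extra I/O copies, but the native memory is reclaimed
    // only when the small Java-side wrapper object is collected, hence
    // the need to release it explicitly once reading finishes.
    ByteBuffer direct = ByteBuffer.allocateDirect(4 * 1024 * 1024);

    System.out.println("heap.isDirect()   = " + heap.isDirect());
    System.out.println("direct.isDirect() = " + direct.isDirect());
  }
}
```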

xianjingfeng (Member, Author)

> We will replace gRPC with Netty in the future; then we hope to use off-heap memory throughout.

When? Do we have a detailed plan?

jerqi (Contributor) commented on Jul 27, 2022

> > We will replace gRPC with Netty in the future; then we hope to use off-heap memory throughout.
>
> When? Do we have a detailed plan?

Maybe October; we hope to do that, but there are always other important things to do. For version 0.6, we plan to support deploying Uniffle on K8S. For version 0.7, we plan to replace gRPC with Netty, but only for part of the interface (reading and writing shuffle data).

codecov-commenter commented on Jul 29, 2022

Codecov Report

Merging #74 (92c1b0a) into master (aa18be0) will decrease coverage by 0.01%.
The diff coverage is 68.75%.

@@             Coverage Diff              @@
##             master      #74      +/-   ##
============================================
- Coverage     56.39%   56.38%   -0.02%     
- Complexity     1173     1178       +5     
============================================
  Files           149      149              
  Lines          7953     7992      +39     
  Branches        761      766       +5     
============================================
+ Hits           4485     4506      +21     
- Misses         3226     3243      +17     
- Partials        242      243       +1     
| Impacted Files | Coverage | Δ |
| --- | --- | --- |
| .../apache/uniffle/common/exception/RssException.java | 0.00% <0.00%> | (ø) |
| ...e/spark/shuffle/reader/RssShuffleDataIterator.java | 89.70% <50.00%> | (-3.95%) ⬇️ |
| ...ava/org/apache/uniffle/common/RssShuffleUtils.java | 95.65% <100.00%> | (+2.31%) ⬆️ |
| .../apache/uniffle/coordinator/CoordinatorServer.java | 68.67% <0.00%> | (-2.22%) ⬇️ |
| .../java/org/apache/uniffle/server/ShuffleServer.java | 63.41% <0.00%> | (-1.30%) ⬇️ |
| ...he/uniffle/client/impl/ShuffleWriteClientImpl.java | 25.95% <0.00%> | (-0.10%) ⬇️ |
| ...he/uniffle/coordinator/CoordinatorGrpcService.java | 2.36% <0.00%> | (-0.06%) ⬇️ |
| ...java/org/apache/uniffle/common/rpc/GrpcServer.java | 0.00% <0.00%> | (ø) |
| .../uniffle/storage/handler/impl/LocalFileWriter.java | 90.00% <0.00%> | (ø) |
| .../hadoop/mapreduce/task/reduce/RssEventFetcher.java | 88.57% <0.00%> | (+1.68%) ⬆️ |


jerqi (Contributor) left a review

LGTM, thanks for your contribution. It's really great work.

jerqi merged commit ccb39ed into apache:master on Jul 29, 2022
xianjingfeng deleted the issue_73 branch on Apr 5, 2023