Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement][AQE] Support getting memory data skip by upstream task ids #358

Merged
merged 2 commits into from
Nov 29, 2022

Conversation

zuston
Copy link
Member

@zuston zuston commented Nov 25, 2022

What changes were proposed in this pull request?

Support getting memory data skip by upstream task ids

Why are the changes needed?

In current codebase, when the shuffle-server memory is large and
job is optimized by AQE skew rule, the multiple readers of the same
partition will get the shuffle data from the same shuffle-server.

To avoid reading unused localfile/HDFS data, the PR of #137 has
introduce the LOCAL_ORDER mechanism to filter the most of data.

But for the storage of MEMORY, it still suffer from this. So this PR is to avoid
reading unused data for one reader, by expectedTaskIds bitmap to
filter.

And this optimization is only enabled when AQE skew is applied.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  1. UTs

Benchmark

Table

Table1: 100g, dtypes: Array[(String, String)] = Array((v1,StringType), (k1,IntegerType)).
And all columns of k1 have the same value (value = 10)

Table2: 10 records, dtypes: Array[(String, String)] = Array((k2,IntegerType), (v2,StringType)).
And it has the only one record of k2=10

Env

Spark Resource Profile: 10 executors(1core2g)
Shuffle-server Environment: 10 shuffle servers, 10g for buffer read and write.
Spark Shuffle Client Config: storage type: MEMORY_LOCALFILE with LOCAL_ORDER
SQL: spark.sql("select * from Table1,Table2 where k1 = k2").write.mode("overwrite").parquet("xxxxxx")

Result

ESS: cost 3min
Uniffle without patch: cost 11.6min (2.1 + 9.5)
Uniffle with patch: cost 3.5min (2.1 + 1.4)

@zuston
Copy link
Member Author

zuston commented Nov 25, 2022

I have proposed a draft implementation @jerqi @xianjingfeng If you have time, please take a look.

@jerqi
Copy link
Contributor

jerqi commented Nov 28, 2022

Should we give this pr a performance test?

@codecov-commenter
Copy link

codecov-commenter commented Nov 28, 2022

Codecov Report

Merging #358 (1554faa) into master (ad51341) will increase coverage by 0.55%.
The diff coverage is 17.30%.

@@             Coverage Diff              @@
##             master     #358      +/-   ##
============================================
+ Coverage     58.01%   58.56%   +0.55%     
- Complexity     1361     1586     +225     
============================================
  Files           171      193      +22     
  Lines          9006    10881    +1875     
  Branches        787      953     +166     
============================================
+ Hits           5225     6373    +1148     
- Misses         3449     4132     +683     
- Partials        332      376      +44     
Impacted Files Coverage Δ
...e/uniffle/client/factory/ShuffleClientFactory.java 0.00% <0.00%> (ø)
...client/request/CreateShuffleReadClientRequest.java 0.00% <0.00%> (ø)
...pache/uniffle/server/ShuffleServerGrpcService.java 0.81% <0.00%> (-0.02%) ⬇️
.../org/apache/uniffle/server/ShuffleTaskManager.java 77.22% <0.00%> (ø)
...uniffle/storage/factory/ShuffleHandlerFactory.java 0.00% <0.00%> (ø)
.../storage/handler/impl/MemoryClientReadHandler.java 0.00% <0.00%> (ø)
...orage/request/CreateShuffleReadHandlerRequest.java 0.00% <0.00%> (ø)
...che/uniffle/client/impl/ShuffleReadClientImpl.java 87.87% <33.33%> (-1.82%) ⬇️
...rg/apache/uniffle/server/buffer/ShuffleBuffer.java 93.33% <100.00%> (+0.28%) ⬆️
...he/uniffle/server/buffer/ShuffleBufferManager.java 82.67% <100.00%> (+0.06%) ⬆️
... and 24 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@zuston
Copy link
Member Author

zuston commented Nov 28, 2022

Should we give this pr a performance test?

Performance test has been attached in description. It works well

@zuston zuston requested a review from jerqi November 28, 2022 10:28
@jerqi
Copy link
Contributor

jerqi commented Nov 28, 2022

There are some conflicts with #276. I will merge #276 first. LGTM except some nits.

@@ -119,6 +120,9 @@ public RssShuffleReader(
this.partitionToShuffleServers = rssShuffleHandle.getPartitionToServers();
this.rssConf = rssConf;
this.dataDistributionType = dataDistributionType;
// This mechanism of expectedTaskIdsBitmap filter is to filter out the most of data.
// especially for AQE skew optimization
this.expectedTaskIdsBitmapFilterEnable = mapEndIndex == Integer.MAX_VALUE ? false : true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the last reduce partition, may the range of mapId be [n, Integer.MAX_VALUE]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do we need to use the startMapIndex==0 and mapEndIndex==max_value to judge?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it's more accurate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@jerqi
Copy link
Contributor

jerqi commented Nov 29, 2022

@xianjingfeng Do you have another suggestion?

Copy link
Member

@xianjingfeng xianjingfeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @zuston @xianjingfeng , I will add @xianjingfeng as this pr's co-author. Because this pr is based on pr #294

@jerqi jerqi merged commit 0e45f2d into apache:master Nov 29, 2022
@jerqi
Copy link
Contributor

jerqi commented Nov 29, 2022

cc @bin41215 Maybe you have interest about this pr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants