Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE-301][Subtask][Improvement][AQE] Merge continuous ShuffleDataSegment into single one #303

Merged
merged 1 commit into from
Nov 7, 2022

Conversation

zuston
Copy link
Member

@zuston zuston commented Nov 6, 2022

What changes were proposed in this pull request?

  1. Merge continuous ShuffleDataSegment into single one
  2. Throw exception when reading the discontinuous blocks into segments directly

Why are the changes needed?

Currently, the LocalOrderSegmentSplitter will split the index file into
multiple shuffleDataSegments. But the split scope is limited in the range of local order.

For example:
The blocks are as follow

block-a (taskId-1)
block-b (taskId-2)
block-c (taskId-1)
block-d (taskId-2)

When the reader want to get the range of taskIds: [1, 3), the strategy will return
two shuffleDataSegments. But we'd better to merge them into single one to
reduce the network interaction times, because they are continuous.

Does this PR introduce any user-facing change?

No

How was this patch tested?

  1. UTs

@codecov-commenter
Copy link

codecov-commenter commented Nov 6, 2022

Codecov Report

Merging #303 (d05e086) into master (74949f5) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master     #303      +/-   ##
============================================
+ Coverage     60.76%   60.79%   +0.02%     
- Complexity     1454     1457       +3     
============================================
  Files           179      179              
  Lines          9201     9207       +6     
  Branches        882      883       +1     
============================================
+ Hits           5591     5597       +6     
  Misses         3316     3316              
  Partials        294      294              
Impacted Files Coverage Δ
...ffle/common/segment/LocalOrderSegmentSplitter.java 88.23% <100.00%> (+1.56%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@zuston
Copy link
Member Author

zuston commented Nov 7, 2022

@jerqi PTAL

Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although i feel the probability of occurrence is not high, it still is an improvement which don't bring too much complexity.

@jerqi jerqi merged commit cf63eae into apache:master Nov 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] [Improvement][AQE][LocalOrder] Merge continuous ShuffleDataSegment into single one
3 participants