Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE-257] RssMRUtils#getBlockId change the partitionId of int type to long #266

Merged
merged 5 commits into from
Oct 19, 2022

Conversation

fpkgithub
Copy link
Contributor

@fpkgithub fpkgithub commented Oct 17, 2022

What changes were proposed in this pull request?

Why are the changes needed?

1: In the RssMRUtils#getBlockId method, change the partitionId of int type to long, make sure no overflow

Does this PR introduce any user-facing change?

Fix #257 issue ,view this issue for details

How was this patch tested?

org.apache.hadoop.mapreduce.RssMRUtilsTest#partitionIdConvertBlockTest

@jerqi
Copy link
Contributor

jerqi commented Oct 17, 2022

Could you add some uts for this pr?

@jerqi
Copy link
Contributor

jerqi commented Oct 17, 2022

Did this pr fix the issue #257?

@fpkgithub
Copy link
Contributor Author

Did this pr fix the issue #257?
Yes,After the partitionId type is modified, data can be obtained when the reduce task number exceds 1024, and result is correct. I'll add a test case later

@@ -312,7 +312,7 @@ ShuffleBlockInfo createShuffleBlock(SortWriteBuffer wb) {
final byte[] compressed = RssShuffleUtils.compressData(data);
final long crc32 = ChecksumUtils.getCrc32(compressed);
compressTime += System.currentTimeMillis() - start;
final long blockId = RssMRUtils.getBlockId(partitionId, taskAttemptId, getNextSeqNo(partitionId));
final long blockId = RssMRUtils.getBlockId(Long.valueOf(partitionId), taskAttemptId, getNextSeqNo(partitionId));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long.valueof(partitionId) -> (long)partitionId

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduce the cost of box.

Copy link
Contributor Author

@fpkgithub fpkgithub Oct 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok , change to implicit type casting

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to change this line? Pass int value to long argument should be fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to change this line? Pass int value to long argument should be fine.

Just tell the reader of code that we're sure that we need this type cast.

@codecov-commenter
Copy link

codecov-commenter commented Oct 17, 2022

Codecov Report

Merging #266 (908b078) into master (8be8390) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##             master     #266   +/-   ##
=========================================
  Coverage     59.70%   59.71%           
  Complexity     1377     1377           
=========================================
  Files           166      166           
  Lines          8916     8918    +2     
  Branches        853      853           
=========================================
+ Hits           5323     5325    +2     
  Misses         3318     3318           
  Partials        275      275           
Impacted Files Coverage Δ
...n/java/org/apache/hadoop/mapreduce/RssMRUtils.java 31.70% <ø> (ø)
...g/apache/hadoop/mapred/SortWriteBufferManager.java 79.89% <100.00%> (ø)
...ache/uniffle/coordinator/SimpleClusterManager.java 86.82% <0.00%> (+0.20%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@jerqi
Copy link
Contributor

jerqi commented Oct 18, 2022

@fpkgithub Could you resolve the comment?

@jerqi jerqi changed the title RssMRUtils#getBlockId change the partitionId of int type to long [ISSUE-257] RssMRUtils#getBlockId change the partitionId of int type to long Oct 18, 2022
Copy link
Contributor

@kaijchen kaijchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fpkgithub for digging into this issue. This change looks good.
Just some nitpicking here.

@@ -74,6 +75,22 @@ public void blockConvertTest() {
assertEquals(taskAttemptId, newTaskAttemptId);
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use @ParameterizedTest here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example (remove the outer for loop):

@ParameterizedTest
@ValueSource(ints = {0, 1, 233, 1023, 1024, 1234, (1 << Constants.PARTITION_ID_MAX_LENGTH) - 1})
public void partitionIdConvertBlockTest(int partitionId) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok now. Let's merge this pr first.

Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fpkgithub @kaijchen Thanks, LGTM

@jerqi jerqi merged commit 7a2f0ef into apache:master Oct 19, 2022
@kaijchen
Copy link
Contributor

@jerqi Please backport this fix into 0.6.0, since it hasn't been released yet.

jerqi pushed a commit that referenced this pull request Oct 20, 2022
…to long (#266)

### What changes were proposed in this pull request?
In the RssMRUtils#getBlockId method, change the partitionId of int type to long, make sure no overflow

### Why are the changes needed?
1: In the RssMRUtils#getBlockId method, change the partitionId of int type to long, make sure no overflow


### Does this PR introduce _any_ user-facing change?
Fix #257 issue ,view this issue for details


### How was this patch tested?
org.apache.hadoop.mapreduce.RssMRUtilsTest#partitionIdConvertBlockTest

Co-authored-by: fengpeikun <kanas.feng@vipshop.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] No data can be read from ShuffleServer when the reduce task number exceeds 1024
4 participants