Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1341] fix(mr): Fix MR Combiner ArrayIndexOutOfBoundsException Bug. #1666

Merged
merged 3 commits into from
Apr 30, 2024

Conversation

qijiale76
Copy link
Contributor

@qijiale76 qijiale76 commented Apr 19, 2024

What changes were proposed in this pull request?

The current implementation of the SortBufferIterator.getKey() and .getValue() methods in the SortWriteBuffer class assumes that keys and values are always stored within a single buffer. This assumption can lead to runtime exceptions, specifically ArrayIndexOutOfBoundsException, when keys or values span multiple WrappedBuffer instances.

In the current implementation, due to the execution of the compact(), the data of the key will not span buffers. However, the data of the value may be located on different buffers from the key and may span multiple buffers.

This PR update getKey() to use fetchDataFromBuffers() to retrieve the key data. And this PR update getValue() to first adjust the starting index and offset based on the length of the key and then use fetchDataFromBuffers() to retrieve the value data.

Another solution to this bug is to modify addRecord() method by keeping key and value in the same WrappedBuffer.

Why are the changes needed?

Fix: #1341

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tested manually. I'll submit a new integration test later.

@qijiale76
Copy link
Contributor Author

@sunshineJK Would you please review this PR?

Copy link

github-actions bot commented Apr 19, 2024

Test Results

 2 384 files  + 84   2 384 suites  +84   4h 38m 28s ⏱️ + 37m 10s
   921 tests +  2     920 ✅ +  3   1 💤 ±0  0 ❌ ±0 
10 677 runs  +182  10 663 ✅ +185  14 💤 ±0  0 ❌ ±0 

Results for commit 76e15b8. ± Comparison against base commit 45ad0b8.

This pull request removes 1 and adds 3 tests. Note that renamed tests count towards both.
org.apache.uniffle.shuffle.manager.ShuffleManagerServerFactoryTest ‑ testShuffleManagerServerType
org.apache.hadoop.mapred.SortWriteBufferTest ‑ testSortBufferIterator
org.apache.uniffle.shuffle.manager.ShuffleManagerServerFactoryTest ‑ testShuffleManagerServerType{ServerType}[1]
org.apache.uniffle.shuffle.manager.ShuffleManagerServerFactoryTest ‑ testShuffleManagerServerType{ServerType}[2]

♻️ This comment has been updated with latest results.

@jerqi
Copy link
Contributor

jerqi commented Apr 19, 2024

Could you a ut for this case?

@Lobo2008
Copy link

Lobo2008 commented Apr 22, 2024

@qijiale76 hi, i made some tests and found out that this PR works fine with input datas-212.txt (212MB) or any file size <= 212MB, but failed with datas-213.txt (213MB ).
here is the exception:

org.apache.hadoop.mapred.MapTask$NewOutputCollector@66971f6b
org.apache.uniffle.common.exception.RssException: Can't support sequence [4097], the max value should be 4095
	at org.apache.hadoop.mapreduce.RssMRUtils.getBlockId(RssMRUtils.java:237)
	at org.apache.hadoop.mapred.SortWriteBufferManager.createShuffleBlock(SortWriteBufferManager.java:390)

other info:

hadoop client: hadoop-3.2.1
nodemanager: hadoop-3.2.1
rss: master branch (Apri 19, 2024) with or without PR 1666

@qijiale76
Copy link
Contributor Author

@Lobo2008 Thanks so much for your review, This bug of overflowed nextSeqNo is discussed in #1418. I'll solve it in that PR.

@jerqi jerqi requested a review from zhengchenyu April 29, 2024 04:02
@zhengchenyu
Copy link
Collaborator

zhengchenyu commented Apr 29, 2024

@qijiale76 Can you add UT?

@zhengchenyu
Copy link
Collaborator

LGTM+1

@jerqi
Copy link
Contributor

jerqi commented Apr 29, 2024

This pull request will copy data one more time. Will it only copy data when we don't use the combine?

@qijiale76
Copy link
Contributor Author

This pull request will copy data one more time. Will it only copy data when we don't use the combine?

SortBufferIterator is only used by Combiner. When Combiner is null, these methods are never called.

@jerqi
Copy link
Contributor

jerqi commented Apr 30, 2024

This pull request will copy data one more time. Will it only copy data when we don't use the combine?

SortBufferIterator is only used by Combiner. When Combiner is null, these methods are never called.

OK, I got it.

@jerqi
Copy link
Contributor

jerqi commented Apr 30, 2024

Merged to master & branch 0.9 . Thanks @qijiale76 @Lobo2008 @zhengchenyu

@jerqi jerqi merged commit 636d0be into apache:master Apr 30, 2024
41 checks passed
jerqi pushed a commit that referenced this pull request Apr 30, 2024
…1666)

### What changes were proposed in this pull request?

The current implementation of the SortBufferIterator.getKey() and .getValue() methods in the SortWriteBuffer class assumes that keys and values are always stored within a single buffer. This assumption can lead to runtime exceptions, specifically ArrayIndexOutOfBoundsException, when keys or values span multiple WrappedBuffer instances.

In the current implementation, due to the execution of the `compact()`, the data of the key will not span buffers. However, the data of the value may be located on different buffers from the key and may span multiple buffers.

This PR update getKey() to use fetchDataFromBuffers() to retrieve the key data. And this PR update getValue() to first adjust the starting index and offset based on the length of the key and then use fetchDataFromBuffers() to retrieve the value data.

Another solution to this bug is to modify `addRecord()` method by keeping key and value in the same WrappedBuffer.

### Why are the changes needed?

Fix: #1341

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested manually. I'll submit a new integration test later.
zhengchenyu pushed a commit to zhengchenyu/incubator-uniffle that referenced this pull request Jun 13, 2024
… Bug. (apache#1666)

### What changes were proposed in this pull request?

The current implementation of the SortBufferIterator.getKey() and .getValue() methods in the SortWriteBuffer class assumes that keys and values are always stored within a single buffer. This assumption can lead to runtime exceptions, specifically ArrayIndexOutOfBoundsException, when keys or values span multiple WrappedBuffer instances.

In the current implementation, due to the execution of the `compact()`, the data of the key will not span buffers. However, the data of the value may be located on different buffers from the key and may span multiple buffers.

This PR update getKey() to use fetchDataFromBuffers() to retrieve the key data. And this PR update getValue() to first adjust the starting index and offset based on the length of the key and then use fetchDataFromBuffers() to retrieve the value data.

Another solution to this bug is to modify `addRecord()` method by keeping key and value in the same WrappedBuffer.

### Why are the changes needed?

Fix: apache#1341

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested manually. I'll submit a new integration test later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] java.lang.ArrayIndexOutOfBoundsException
4 participants