[Serious Bug] Inaccurate flow control leads to Shuffle server OOM when enabling Netty #1472

Closed
3 tasks done
rickyma opened this issue Jan 19, 2024 · 4 comments

rickyma commented Jan 19, 2024

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

Under high load, inaccurate flow control (possibly in the usedMemory or preAllocatedMemory accounting) leads to a Shuffle server OOM.

The SQL used to reproduce the bug:
tpcds:
select * from (
  select s.*, c.* from store_sales s join customer c on s.ss_customer_sk = c.c_customer_sk
) sc DISTRIBUTE BY sc.ss_customer_sk, sc.ss_item_sk;

Affects Version(s)

master

Uniffle Server Log Output

[2024-01-19 03:18:50.483] [Grpc-714] [DEBUG] org.apache.uniffle.server.buffer.ShuffleBufferManager.requireMemory - Require memory succeeded with 1023891 bytes, usedMemory[117888810505] include preAllocation[180869913], inFlushSize[110004009963]
[2024-01-19 03:18:50.485] [Grpc-19] [DEBUG] org.apache.uniffle.server.buffer.ShuffleBufferManager.requireMemory - Require memory succeeded with 3093288 bytes, usedMemory[117891903793] include preAllocation[183963201], inFlushSize[110004009963]
[2024-01-19 03:18:50.485] [epollEventLoopGroup-3-19] [DEBUG] org.apache.uniffle.server.netty.ShuffleServerNettyHandler.handleSendShuffleDataRequest - Cache Shuffle Data for appId[application_1703049085550_1651917_1705170927963], shuffleId[0], cost 0 ms with 52 blocks and 1189625 bytes
[2024-01-19 03:18:50.487] [epollEventLoopGroup-3-44] [WARN] org.apache.uniffle.common.netty.handle.TransportChannelHandler.exceptionCaught - Exception in connection from /9.23.12.144:45176
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 171798691840, max: 171798691840)
        at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:843)
        at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:772)
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:710)
        at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:685)
        at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:212)
        at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:194)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:136)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
        at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:397)
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
        at org.apache.uniffle.common.netty.protocol.Decoders.decodeShuffleBlockInfo(Decoders.java:50)
        at org.apache.uniffle.common.netty.protocol.SendShuffleDataRequest.decodePartitionData(SendShuffleDataRequest.java:92)
        at org.apache.uniffle.common.netty.protocol.SendShuffleDataRequest.decode(SendShuffleDataRequest.java:104)
        at org.apache.uniffle.common.netty.protocol.Message.decode(Message.java:145)
        at org.apache.uniffle.common.netty.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:72)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:509)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:750)
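
For orientation, the figures in the log above can be compared directly. The following is a minimal sketch in Java that only does arithmetic on the logged values. The class name `LogGap` is hypothetical, and the comparison assumes that `usedMemory` and Netty's direct-memory counter are meant to cover roughly the same buffers, which the log itself does not confirm.

```java
final class LogGap {
    static final long GIB = 1L << 30;

    // Values copied from the server log above.
    static final long USED_MEMORY = 117_891_903_793L; // usedMemory in the last requireMemory line
    static final long NETTY_USED = 171_798_691_840L;  // "used" in OutOfDirectMemoryError
    static final long NETTY_MAX = 171_798_691_840L;   // "max" in OutOfDirectMemoryError

    // Bytes that Netty counts as used but the server's usedMemory does not cover.
    static long untrackedBytes() {
        return NETTY_USED - USED_MEMORY;
    }

    static double toGiB(long bytes) {
        return (double) bytes / GIB;
    }

    public static void main(String[] args) {
        System.out.printf("usedMemory tracked by the server: %.2f GiB%n", toGiB(USED_MEMORY));
        System.out.printf("direct memory counted by Netty:   %.2f GiB%n", toGiB(NETTY_USED));
        System.out.printf("difference:                       %.2f GiB%n", toGiB(untrackedBytes()));
    }
}
```

Netty's limit is exactly 160 GiB, which matches `max_direct_mem:160g`, while the server's own accounting is still below `capacity:110g`. Under the assumption stated above, roughly 50 GiB of direct memory is in use without being reflected in `usedMemory`.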

Uniffle Engine Log Output

No response

Uniffle Server Configurations

xmx:120g
capacity:110g
read.capacity:20g
max_direct_mem:160g
rss.server.netty.epoll.enable true
rss.rpc.server.type GRPC_NETTY

Uniffle Engine Configurations

set spark.sql.files.maxPartitionBytes=1073741824;
set spark.executor.cores=8;
set spark.task.cpus=4;
set spark.executor.memory=49g;
set spark.driver.memory=20g;
set spark.dynamicAllocation.maxExecutors=150;
set spark.dynamicAllocation.minExecutors=150;
set spark.rss.client.type = GRPC_NETTY;
set spark.rss.client.netty.io.mode = EPOLL;

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

zuston commented Jan 19, 2024

PTAL @leixm @jerqi

connorlwilkes commented

This only affects off heap Netty?


jerqi commented Jan 25, 2024

> This only affects off heap Netty?

Currently, it only affects off heap Netty.

@rickyma rickyma changed the title [Serious Bug] Inaccurate flow control leads to Shuffle server OOM [Serious Bug] Inaccurate flow control leads to Shuffle server OOM when enabling Netty Jan 31, 2024

rickyma commented Jan 31, 2024

Progress: I already have a version that can solve this problem, and I am still testing it.

rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 5, 2024
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 12, 2024
…ntException issues in extremely rare scenarios
jerqi pushed a commit that referenced this issue Feb 13, 2024
### What changes were proposed in this pull request?

Upgrade Netty and GRPC

### Why are the changes needed?

A sub PR for: #1519

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 13, 2024
jerqi pushed a commit that referenced this issue Feb 14, 2024
…Memory and usedDirectMemory (#1524)

### What changes were proposed in this pull request?

We need to know the exact direct memory usage of `PooledByteBufAllocator` in `Netty`.
So we should introduce metrics for Netty's `pinnedDirectMemory` and `usedDirectMemory`.
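
The difference between the two metrics can be illustrated with a toy model. This is a sketch only, not Netty's allocator: the class `ChunkedPoolModel`, its 4 MiB chunk size (chosen to match the 4194304-byte allocation in the issue's stack trace) and the simple bump-style allocation are all assumptions made for illustration. Here "used" counts memory reserved in whole chunks, and "pinned" counts bytes handed out to live buffers.

```java
final class ChunkedPoolModel {
    private final long chunkSize;
    private long chunks;      // chunks reserved from the operating system
    private long freeInChunk; // unused bytes left in the newest chunk
    private long pinned;      // bytes handed out to live buffers

    ChunkedPoolModel(long chunkSize) {
        this.chunkSize = chunkSize;
    }

    // Hands out `size` bytes, reserving a whole new chunk when the current one is too small.
    void allocate(long size) {
        if (size <= 0 || size > chunkSize) {
            throw new IllegalArgumentException("size must be in (0, chunkSize]");
        }
        if (size > freeInChunk) {
            chunks++;
            freeInChunk = chunkSize;
        }
        freeInChunk -= size;
        pinned += size;
    }

    long usedBytes() {
        return chunks * chunkSize;
    }

    long pinnedBytes() {
        return pinned;
    }

    public static void main(String[] args) {
        ChunkedPoolModel pool = new ChunkedPoolModel(4L * 1024 * 1024);
        pool.allocate(1_023_891); // sizes borrowed from the server log in the issue
        pool.allocate(3_093_288);
        pool.allocate(1_189_625);
        System.out.println("pinned = " + pool.pinnedBytes() + ", used = " + pool.usedBytes());
    }
}
```

In this model the third request does not fit into the remainder of the first chunk, so a second chunk is reserved and "used" grows by 4 MiB for a request of about 1.1 MiB. Accounting that only sums request sizes would follow "pinned" and miss that difference.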

### Why are the changes needed?

A sub PR for: #1519

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 14, 2024
…ntException issues in extremely rare scenarios
jerqi pushed a commit that referenced this issue Feb 15, 2024
…ption issues in extremely rare scenarios (#1522)

### What changes were proposed in this pull request?

Improve the robustness of methods `ShuffleDataResult.release()` and `ShuffleIndexResult.release()` to fix occasional IllegalReferenceCountException issues in extremely rare scenarios.
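
One common way to make a `release()` method tolerant of repeated calls is to guard it with an atomic flag. The sketch below is illustrative only and is not taken from the PR: `ReleasableResult` and `Resource` are hypothetical stand-ins, and `Resource` merely mimics a reference-counted buffer that throws when it is released twice.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ReleasableResult {
    // Stand-in for a reference-counted buffer: releasing it twice is an error.
    static final class Resource {
        private int refCnt = 1;

        void release() {
            if (refCnt == 0) {
                throw new IllegalStateException("refCnt: 0");
            }
            refCnt--;
        }

        int refCnt() {
            return refCnt;
        }
    }

    private final Resource resource;
    private final AtomicBoolean released = new AtomicBoolean(false);

    ReleasableResult(Resource resource) {
        this.resource = resource;
    }

    // Safe to call repeatedly and with a null resource; only the first call releases.
    void release() {
        if (resource != null && released.compareAndSet(false, true)) {
            resource.release();
        }
    }

    public static void main(String[] args) {
        Resource buffer = new Resource();
        ReleasableResult result = new ReleasableResult(buffer);
        result.release();
        result.release(); // second call is a no-op instead of an exception
        System.out.println("refCnt after two release() calls: " + buffer.refCnt());
    }
}
```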

### Why are the changes needed?

A sub PR for: #1519

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
jerqi pushed a commit that referenced this issue Feb 15, 2024
…ks instead of reallocating it (#1521)

### What changes were proposed in this pull request?

Reuse ByteBuf when decoding shuffle blocks instead of reallocating it

### Why are the changes needed?

A sub PR for: #1519

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 21, 2024
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 22, 2024
…on issues when exceptions happened in clientReadHandler.readShuffleData()
zuston pushed a commit that referenced this issue Feb 23, 2024
…mory issue causing OOM (#1534)

### What changes were proposed in this pull request?

When we use `UnpooledByteBufAllocator` to allocate an off-heap `ByteBuf`, Netty requests off-heap memory directly from the operating system instead of allocating it according to `pageSize` and `chunkSize`. As a result, the exact `ByteBuf` size is known when memory is pre-allocated, which avoids distorting metrics such as `usedMemory`.

Moreover, we have restored the code submission of the PR [#1521](#1521). By taking the `encodedLength` of the `ByteBuf` into account in advance, when memory is pre-allocated, we ensure that the Netty server has sufficient direct memory while decoding `sendShuffleDataRequest`, and thus avoid OOM at that stage.
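
The idea of reserving memory for the encoded length before decoding can be sketched as follows. This is a minimal illustration, not Uniffle's code: `DirectMemoryBudget`, `tryRequire` and `release` are hypothetical names, the capacity and request sizes in `main` are made up, and the sketch assumes the encoded length is an upper bound on what decoding will allocate.

```java
import java.util.concurrent.atomic.AtomicLong;

final class DirectMemoryBudget {
    private final long capacity;
    private final AtomicLong used = new AtomicLong();

    DirectMemoryBudget(long capacity) {
        this.capacity = capacity;
    }

    // Reserves the full encoded length of a request before it is decoded.
    // Returns false, reserving nothing, if the budget cannot cover it.
    boolean tryRequire(long encodedLength) {
        if (encodedLength < 0) {
            throw new IllegalArgumentException("encodedLength must be >= 0");
        }
        while (true) {
            long current = used.get();
            if (encodedLength > capacity - current) {
                return false;
            }
            if (used.compareAndSet(current, current + encodedLength)) {
                return true;
            }
        }
    }

    // Returns previously reserved bytes to the budget.
    void release(long bytes) {
        used.addAndGet(-bytes);
    }

    long used() {
        return used.get();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        DirectMemoryBudget budget = new DirectMemoryBudget(10 * mib);
        System.out.println(budget.tryRequire(6 * mib)); // first request fits
        System.out.println(budget.tryRequire(6 * mib)); // second would exceed the budget
        budget.release(6 * mib);
        System.out.println(budget.tryRequire(6 * mib)); // fits again after the release
    }
}
```

A request that is refused here would be rejected or retried before any buffer is allocated, so the failure shows up in flow control rather than as an allocation error during decoding.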

Since we are not using `PooledByteBufAllocator`, the PR [#1524](#1524) is no longer needed.

### Why are the changes needed?

A sub PR for: #1519

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
zuston pushed a commit that referenced this issue Feb 24, 2024
…ler.readShuffleData (#1536)

### What changes were proposed in this pull request?

Fix `IllegalReferenceCountException` issues that arise when exceptions occur in `clientReadHandler.readShuffleData()`.

### Why are the changes needed?

A follow-up PR for: #1522

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 24, 2024
@rickyma rickyma closed this as completed Feb 26, 2024
jerqi pushed a commit that referenced this issue Feb 26, 2024
### What changes were proposed in this pull request?

Fix [#1008](#1008). It does not actually test `GRPC_NETTY` mode, because it uses `ShuffleServerGrpcClient` everywhere instead of `ShuffleServerGrpcNettyClient`. 
Setting the shuffle server's tags to `GRPC_NETTY,GRPC` is useless, because we are not using `ShuffleServerGrpcNettyClient` at all.

### Why are the changes needed?

It is a sub PR for: #1519
Also, it is a follow-up PR for: #1008

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Feb 28, 2024
zuston pushed a commit that referenced this issue Feb 29, 2024
…le data requests (#1551)

### What changes were proposed in this pull request?

Refresh `timestamp` when sending `SendShuffleDataRequest`.

### Why are the changes needed?

A follow-up PR for: #1534

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
rickyma added a commit to rickyma/incubator-uniffle that referenced this issue Mar 21, 2024
zuston pushed a commit that referenced this issue Mar 25, 2024
… when failing to cache shuffle data (#1597)

### What changes were proposed in this pull request?

Release memory more accurately when failing to cache shuffle data.

### Why are the changes needed?

A follow-up PR for: #1534.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.