[Bug] Reading local shuffle data in high-pressure scenarios may lead to high system load #1596
Comments
I will use a thread pool to control the concurrency when reading shuffle data. If anyone has any good suggestions, please feel free to let me know.
Read concurrency is not limited in the current codebase. +1 for this issue.
We once measured the number of write threads: 2 threads per HDD or 3 threads per SSD gives the best performance.
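For illustration, here is a minimal sketch of the thread-pool idea discussed above. The `LocalShuffleReadExecutor` class is hypothetical and not part of the current codebase; the per-disk thread counts simply reuse the numbers mentioned in the comment about write threads.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not in the Uniffle codebase): bounds the number of
// concurrent local shuffle data reads so disk contention stays manageable.
public class LocalShuffleReadExecutor {
  private final ThreadPoolExecutor readPool;

  public LocalShuffleReadExecutor(int hddCount, int ssdCount, int queueCapacity) {
    // Sizing borrowed from the comment above: 2 threads per HDD, 3 per SSD.
    int maxConcurrentReads = 2 * hddCount + 3 * ssdCount;
    this.readPool = new ThreadPoolExecutor(
        maxConcurrentReads, maxConcurrentReads,
        60L, TimeUnit.SECONDS,
        // Bounded queue: once it is full, further submissions are rejected
        // (default AbortPolicy), which back-pressures callers instead of
        // letting unlimited reads pile onto the disks.
        new LinkedBlockingQueue<>(queueCapacity));
  }

  public void submitRead(Runnable readTask) {
    readPool.execute(readTask);
  }
}
```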
… release readMemory (#1605)

### What changes were proposed in this pull request?

1. Add a `ChannelFutureListener` and use its callback mechanism to release `readMemory` only after the `writeAndFlush` method has truly completed.
2. Change the descriptions of the configurations `rss.server.buffer.capacity.ratio` and `rss.server.read.buffer.capacity.ratio`.

### Why are the changes needed?

This is actually a bug, introduced by PR #879. The issue has been present since the very beginning, when the Netty feature was first integrated. Fix #1596.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

I don't think we need new tests. Tested in our env. The new log will be:

```
[2024-03-26 23:11:51.039] [epollEventLoopGroup-3-158] [INFO] ShuffleServerNettyHandler.operationComplete - Successfully executed getLocalShuffleData for appId[application_1703049085550_7359933_1711463990606], shuffleId[0], partitionId[1328], offset[0], length[14693742]. Took 1457 ms and retrieved 14693742 bytes of data
[2024-03-26 23:11:51.040] [epollEventLoopGroup-3-130] [INFO] ShuffleServerNettyHandler.operationComplete - Successfully executed getMemoryShuffleData for appId[application_1703049085550_7359933_1711463990606], shuffleId[0], partitionId[1262]. Took 1 ms and retrieved 0 bytes of data
[2024-03-26 23:11:51.068] [epollEventLoopGroup-3-177] [INFO] ShuffleServerNettyHandler.operationComplete - Successfully executed getLocalShuffleIndex for appId[application_1703049085550_7359933_1711463990606], shuffleId[0], partitionId[1366]. Took 918 ms and retrieved 1653600 bytes of data
```
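As a minimal sketch of the listener-based release described in item 1: the helper class, its signature, and the `releaseReadMemory` callback below are illustrative stand-ins for the server's actual buffer accounting, not the PR's exact code.

```java
import io.netty.buffer.ByteBuf;
import io.netty.channel.Channel;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelFutureListener;
import java.util.function.LongConsumer;

public final class ReadMemoryRelease {

  // Illustrative only: write the response, then release the tracked read
  // memory in the listener callback, i.e. only once writeAndFlush has
  // actually completed (successfully or not).
  public static void writeAndRelease(Channel channel, ByteBuf response,
                                     long readMemoryBytes, LongConsumer releaseReadMemory) {
    ChannelFuture future = channel.writeAndFlush(response);
    future.addListener((ChannelFutureListener) f -> {
      // Releasing here instead of immediately after writeAndFlush avoids
      // freeing readMemory before the data has been handed to the socket.
      releaseReadMemory.accept(readMemoryBytes);
      if (!f.isSuccess()) {
        // A real handler would log f.cause() and send a failure response.
      }
    });
  }
}
```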
…nnel is writable (#1641)

### What changes were proposed in this pull request?

1. Send failed responses only when the channel is writable.
2. Print debug logs when the data is successfully sent, reducing log output.
3. Reduce the duplicated error log.

### Why are the changes needed?

A follow-up PR for: #1605.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
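A minimal sketch of the writability check from item 1; the class and method names here are illustrative, not the PR's actual code.

```java
import io.netty.channel.Channel;

public final class FailedResponseSender {

  // Illustrative only: skip sending an error response when the channel's
  // outbound buffer is already above its high-water mark, since queuing
  // more writes onto an unwritable channel only grows the backlog.
  public static void sendFailedResponseIfWritable(Channel channel, Object errorResponse) {
    if (channel.isWritable()) {
      channel.writeAndFlush(errorResponse);
    }
    // else: drop the response; a real implementation would log this at debug level.
  }
}
```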
Is this error message also affected by this problem? @rickyma
No.
Code of Conduct
Search before asking
Describe the bug
When Netty is enabled, the current timing of releasing `readMemory` is incorrect, because `client.getChannel().writeAndFlush()` is asynchronous. If we release `readMemory` directly, it is released before it has a chance to take effect, and the file reading only occurs after the `writeAndFlush` method is called. We should add a `ChannelFutureListener` and use its callback mechanism to release `readMemory`; this ensures the release happens only after `writeAndFlush` has truly completed.

We also haven't set a limit on the maximum number of concurrent requests (for reading local shuffle data) that can be processed at the same time; we only control the maximum `readCapacity` of the buffer used when reading local shuffle data. This approach falls short. In our tests, we discovered that the metric `read_used_buffer_size` only reached about 3.75 GB, yet the system load on the shuffle server was already high. This suggests that relying solely on `readCapacity` to manage the reading of local shuffle data is inadequate.
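For illustration, a minimal sketch of one way to cap the number of in-flight local-read requests in addition to the existing `readCapacity` accounting; the class and limit are hypothetical, not a change that exists in the codebase.

```java
import java.util.concurrent.Semaphore;

// Hypothetical admission control: bounds how many getLocalShuffleData
// requests may be processed concurrently, independently of how much read
// buffer memory they consume.
public class LocalReadConcurrencyLimiter {
  private final Semaphore permits;

  public LocalReadConcurrencyLimiter(int maxConcurrentReads) {
    this.permits = new Semaphore(maxConcurrentReads);
  }

  // Returns false if the server is already at its read-concurrency limit,
  // letting the caller reject or defer the request instead of piling on load.
  public boolean tryStartRead() {
    return permits.tryAcquire();
  }

  public void finishRead() {
    permits.release();
  }
}
```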
Affects Version(s)
master
Uniffle Server Log Output
No response
Uniffle Engine Log Output
No response
Uniffle Server Configurations
No response
Uniffle Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?