
Netty performance tracking #1161

Closed
zuston opened this issue Aug 21, 2023 · 3 comments · Fixed by #1162

Comments
@zuston (Member) commented Aug 21, 2023

Netty performance tracking

Sub-tasks

  1. [#1161] improvement: Reduce the data copy #1162
  2. [Improvement] Return the direct byte buffer when reading localfile data via Netty #1163
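Sub-task 2 is about handing the direct buffer from a localfile read straight to the network layer instead of copying through the heap. A minimal sketch of that idea with plain `java.nio` (class and method names here are illustrative, not Uniffle's actual API):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LocalFileDirectRead {
    // Read a slice of a local file into a direct buffer so it can be
    // passed downstream without an intermediate heap byte[] copy.
    static ByteBuffer readDirect(Path file, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(length);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (buf.hasRemaining()) {
                // Position-based read: fills buf without moving the channel position.
                if (ch.read(buf, offset + buf.position()) < 0) {
                    break; // EOF before the requested length was available
                }
            }
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("shuffle", ".data");
        Files.write(tmp, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
        ByteBuffer data = readDirect(tmp, 4, 4);
        System.out.println(data.isDirect());  // true
        System.out.println(data.remaining()); // 4
        Files.delete(tmp);
    }
}
```

With a transport like Netty, such a direct buffer can be wrapped and written out without ever materializing the bytes on the Java heap.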

Benchmark

Tested with both the gRPC-based and the Netty-based server.

Environment

Software: Uniffle master / Hadoop 3.2.2 / Spark 3.1.2

Hardware: 96 cores, 512 GB memory, 4 × 1 TB SSD per machine, 8 GB/s network bandwidth

Hadoop YARN cluster: 1 ResourceManager + 40 NodeManagers, each machine with 4 × 1 TB SSD

Uniffle cluster: 1 Coordinator + 5 Shuffle Servers, each machine with 4 × 1 TB SSD

Configuration

Spark conf

spark.executor.instances 400
spark.executor.cores 1
spark.executor.memory 2g
spark.shuffle.manager org.apache.spark.shuffle.RssShuffleManager
spark.rss.storage.type MEMORY_LOCALFILE

Uniffle gRPC-based server conf

JVM XMX=200g

...
rss.server.buffer.capacity 100g
rss.server.read.buffer.capacity 20g
rss.server.flush.thread.alive 20
rss.server.flush.threadPool.size 50
rss.server.high.watermark.write 80
rss.server.low.watermark.write 70
...

Uniffle Netty-based server conf

XMX_SIZE="140g"
MAX_DIRECT_MEMORY_SIZE=200g

...
rss.server.buffer.capacity 100g
rss.server.read.buffer.capacity 20g
rss.server.flush.thread.alive 20
rss.server.flush.threadPool.size 50
rss.server.high.watermark.write 80
rss.server.low.watermark.write 70
...
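The `rss.server.high.watermark.write` / `rss.server.low.watermark.write` settings above are percentages of the buffer capacity. A hypothetical sketch of the usual watermark semantics (flushing starts once usage crosses the high watermark and stops once it drops below the low one; the class and method names are illustrative, not Uniffle's actual code):

```java
public class WatermarkDemo {
    // Flushing should begin once used memory reaches highPct% of capacity.
    static boolean shouldStartFlush(long used, long capacity, int highPct) {
        return used * 100 >= capacity * highPct;
    }

    // Flushing should stop once used memory falls to lowPct% of capacity or below.
    static boolean shouldStopFlush(long used, long capacity, int lowPct) {
        return used * 100 <= capacity * lowPct;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        long capacity = 100 * gb; // matches rss.server.buffer.capacity 100g
        System.out.println(shouldStartFlush(85 * gb, capacity, 80)); // true
        System.out.println(shouldStopFlush(65 * gb, capacity, 70));  // true
    }
}
```

The gap between the two watermarks keeps the server from oscillating between flushing and accepting writes.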

Report

type          5T (run with 400 executors)
grpc-based    3.6 min / 5.9 min
netty-based   3.4 min / 7.7 min

I also found that the Spark executor GC time with the Netty-based Uniffle server is higher than with the gRPC-based one.

@jerqi (Contributor) commented Aug 21, 2023

Could you enable spark.rss.client.off.heap.memory.enable?

jerqi added a commit to jerqi/incubator-uniffle that referenced this issue Aug 21, 2023
@zuston (Member, Author) commented Aug 22, 2023

> Could you enable spark.rss.client.off.heap.memory.enable?

That option only takes effect for HDFS.

@zuston (Member, Author) commented Aug 22, 2023

Another problem: remote fetches from localfile via Netty are unstable; compared with gRPC, they take much more time.

@zuston changed the title from "[Improvement] netty performance tracking" to "Netty performance tracking" on Aug 22, 2023
zuston pushed a commit that referenced this issue Aug 22, 2023
…buffer len (#1162)

### What changes were proposed in this pull request?

If we use off-heap memory and then call the `getData` method, the off-heap memory is copied into heap memory. So we should avoid calling it in Netty mode.
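The copy described above can be shown with plain `java.nio` (a hedged sketch; `copyToHeap` is a hypothetical helper standing in for the copying path, not Uniffle's actual `getData`):

```java
import java.nio.ByteBuffer;

public class DirectCopyDemo {
    // Copying path: materialize a direct (off-heap) buffer as a fresh heap
    // byte[]. Every call allocates heap garbage proportional to the data size,
    // which is what drives up executor GC time.
    static byte[] copyToHeap(ByteBuffer direct) {
        byte[] heap = new byte[direct.remaining()];
        direct.duplicate().get(heap); // duplicate() so the source's position is untouched
        return heap;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(8);
        direct.putLong(42L);
        direct.flip();

        byte[] copied = copyToHeap(direct);
        System.out.println(copied.length);  // 8

        // Zero-copy alternative: pass a view of the direct buffer downstream
        // instead of a heap copy; no heap allocation for the payload.
        ByteBuffer view = direct.slice();
        System.out.println(view.isDirect()); // true
    }
}
```

Keeping the payload in the direct buffer end to end avoids both the extra copy and the heap allocation.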

### Why are the changes needed?

Fix: #1161

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?
Code Review