"Blocks read inconsistent" happened when shuffle read #76

Closed
Augus-smile opened this issue Feb 9, 2022 · 5 comments

Comments

@Augus-smile

Hello, when I run a Spark job, the shuffle read stage fails with a "Blocks read inconsistent" error. What could be the cause?

server.conf:

rss.server.flush.thread.alive=2
rss.server.flush.threadPool.size=4
rss.server.buffer.capacity=6g
rss.server.read.buffer.capacity=3g
rss.server.disk.capacity=180g

server_rss_env.sh:
....
XMX_SIZE="12g"
....

@colinmjj
Collaborator

colinmjj commented Feb 9, 2022

During shuffle read, after all blocks have been processed, the client checks whether every expected block was actually processed. "Blocks read inconsistent" is thrown when some blocks are missing. This may be caused by a failure while writing shuffle data.
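
The check described above can be sketched roughly as follows. This is only an illustration of the idea, not the project's actual code: the function name `check_blocks_consistent` and the integer block ids are hypothetical, and the real client may track blocks differently.

```python
def check_blocks_consistent(expected_block_ids, processed_block_ids):
    """Raise if any expected block was never processed during shuffle read.

    Hypothetical helper: both arguments are iterables of block ids.
    """
    expected = set(expected_block_ids)
    missing = expected - set(processed_block_ids)
    if missing:
        # Blocks that were expected but never read, e.g. because the
        # write of shuffle data failed earlier.
        raise RuntimeError(
            f"Blocks read inconsistent: expected {len(expected)} blocks, "
            f"{len(missing)} missing"
        )


# Hypothetical ids: block 3 was expected but never read.
try:
    check_blocks_consistent({1, 2, 3, 4}, {1, 2, 4})
except RuntimeError as err:
    print(err)
```

In this sketch the error fires only after the whole read finishes, which matches the description: the failure shows up at read time even though the likely cause is on the write side.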

@Augus-smile
Author

Thank you for your reply. Does LOCALFILE_AND_HDFS mode write data to HDFS and to the shuffle server at the same time?

@colinmjj
Collaborator

colinmjj commented Feb 9, 2022

@Augus-smile please try release 0.2.0, which has a lot of improvements and bug fixes. You can refer to the README for detailed configuration. In 0.2.0, MEMORY_LOCALFILE_HDFS is introduced for multiple storage types. We will publish related docs soon.

@Augus-smile
Author

The shuffle server process was killed due to OOM. Relevant server configuration: physical memory 16g, XMX_SIZE 12g. However, during Spark task execution, the shuffle server process occupies nearly 16g of memory. What consumes memory in the shuffle server? Is there a parameter to restrict the shuffle server's memory usage?

@colinmjj
Collaborator

colinmjj commented Feb 10, 2022

Set XMX_SIZE in rss-env.sh.
In server.conf:
rss.server.buffer.capacity # memory cache for write and read
rss.server.read.buffer.capacity # memory cache for read
BTW, please leave an extra 5G of memory for the shuffle server.
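
As a rough sanity check of the advice above, the arithmetic can be sketched like this. It assumes that "extra 5G" means headroom between physical memory and the JVM heap (XMX_SIZE); that reading is an interpretation, not something the thread confirms. The helper names `parse_size` and `headroom_ok` are hypothetical and not part of the project.

```python
# Multipliers for size suffixes such as "12g" (assumed binary units).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size string like '12g' to bytes (hypothetical helper)."""
    text = text.strip().lower()
    return int(float(text[:-1]) * UNITS[text[-1]])


def headroom_ok(physical, xmx, extra="5g"):
    """True if physical memory minus the heap leaves at least `extra`."""
    return parse_size(physical) - parse_size(xmx) >= parse_size(extra)


# Values reported in this thread: 16g physical memory, XMX_SIZE=12g.
print(headroom_ok("16g", "12g"))  # False: only 4g is left outside the heap
```

Under this assumption, the reported setup leaves 4g outside the heap, which is below the suggested 5g; lowering XMX_SIZE to 11g or less on a 16g machine would satisfy the check.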

@jerqi jerqi closed this as completed Feb 25, 2022