[QUESTION] Does the executor write to local disk during shuffle write/read? #134

Closed
fuhaiq opened this issue May 11, 2022 · 2 comments

Comments

@fuhaiq

fuhaiq commented May 11, 2022

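Each node has only one disk, and that disk is slow. Before we adopted RSS, multiple executors hit this disk with heavy random I/O during the shuffle stage, so performance was very poor.

After switching to RSS, we observed that the shuffle write stage performs very well, but performance drops off quickly in the shuffle read stage.

A few questions:

During shuffle write, does the executor send shuffle data straight to the rss-server, or does it write the data to local disk first and then send it?
During shuffle read, after the executor fetches data from the rss-server, does it compute on the data directly, or does it write the data to local disk first?

When spark.rss.storage.type=MEMORY_LOCAL, does LOCAL (the local disk) refer to disk writes in the shuffle write stage or in the shuffle read stage?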

@colinmjj
Collaborator

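During shuffle write, the executor sends shuffle data directly to the rss-server and no longer writes it to local disk.
During shuffle read, the executor fetches data from the rss-server and computes on it directly.
With MEMORY_LOCAL, LOCAL refers to the shuffle server's local disk, which may be used in both the shuffle write and shuffle read stages.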
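
To make the data path above concrete, here is a toy Python model. It is only a sketch under stated assumptions, not the RSS implementation: `ToyShuffleServer`, `memory_limit`, and the "flush everything once the buffer exceeds the limit" policy are all hypothetical names and behavior chosen for illustration. The one point taken from the answer is that any disk I/O happens on the shuffle server, never on the executor.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShuffleServer:
    """Hypothetical stand-in for a remote shuffle server; not the real RSS API."""

    memory_limit: int  # assumed in-memory buffer capacity, in bytes
    memory: dict = field(default_factory=dict)      # partition -> blocks held in memory
    local_disk: dict = field(default_factory=dict)  # partition -> blocks on the server's disk

    def _buffered_bytes(self) -> int:
        return sum(len(b) for blocks in self.memory.values() for b in blocks)

    def write(self, partition: int, block: bytes) -> None:
        # The executor pushes the block to the server; the executor's own disk is not used.
        self.memory.setdefault(partition, []).append(block)
        if self._buffered_bytes() > self.memory_limit:
            self._flush()

    def _flush(self) -> None:
        # Assumed policy: move every buffered block to the server's local disk.
        for partition, blocks in self.memory.items():
            self.local_disk.setdefault(partition, []).extend(blocks)
        self.memory.clear()

    def read(self, partition: int) -> list:
        # Flushed (older) blocks first, then blocks still held in memory.
        return self.local_disk.get(partition, []) + self.memory.get(partition, [])


server = ToyShuffleServer(memory_limit=8)
server.write(0, b"aaaa")
server.write(1, b"bbbb")   # 8 bytes buffered, not over the limit yet
server.write(0, b"cc")     # 10 bytes > 8, so the buffer is flushed to the server's disk
server.write(0, b"dd")     # stays in server memory
blocks = server.read(0)    # the reducer consumes these directly, with no local spill
```

In this model a read can be served partly from the server's disk and partly from its memory, which is one way to picture LOCAL being involved in both the write and the read stage.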

@fuhaiq
Author

fuhaiq commented May 11, 2022

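During shuffle write, the executor sends shuffle data directly to the rss-server and no longer writes it to local disk. During shuffle read, the executor fetches data from the rss-server and computes on it directly. With MEMORY_LOCAL, LOCAL refers to the shuffle server's local disk, which may be used in both the shuffle write and shuffle read stages.

Thanks a lot, that makes sense now.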

@fuhaiq fuhaiq closed this as completed May 11, 2022