How to evaluate rss cluster size? #92

Open
Lobo2008 opened this issue Jan 11, 2023 · 1 comment

Comments

@Lobo2008

Lobo2008 commented Jan 11, 2023

Hi, we run about 20,000 Spark applications daily. In total, these apps produce about 100 TB of shuffle write data and 100 TB of shuffle read data. The largest app produces 6 TB, but most produce less than 100 GB.

We ran some online Spark apps on a test RSS cluster and found that the RSS nodes consume noticeably more memory than CPU, disk, or other resources, which even caused some RSS servers/nodes to go down.

Now we would like to run all apps on RSS. Do you have any suggestions on RSS cluster size and machine selection with replicas=2, e.g. memory, CPU, number of nodes, disk, and memory-to-CPU ratio? Any suggestion will help.

One more question: mappers send data to StreamServer, which holds it in memory and then flushes it to disk, without much computation. So RSS consumes far more memory than CPU. Is that right?

@hiboyang
Contributor

Hi @Lobo2008, how much memory usage do you see on the RSS server? The RSS server receives shuffle data in small blocks from Spark mappers and writes each block to disk. It does not cache large amounts of data in memory, so I am curious how much memory you see the RSS server using.

Normally the bottleneck of an RSS server is disk I/O and network bandwidth, since Spark applications write and read a lot of data there. You could start by mapping 10 to 50 Spark executors to one RSS server, then observe the disk and network metrics on the RSS server and adjust accordingly.
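
As a rough starting point, the 10-to-50 ratio can be turned into a back-of-envelope estimate. The sketch below is only an illustration, not part of RSS: the function names are made up, the peak concurrent executor count (5,000) is a hypothetical figure that does not come from this thread, and it assumes the 100 TB of shuffle writes is a daily total and that replicas=2 doubles the bytes written to disk. It also gives only an average rate; peak load will be well above it, so the observed disk and network metrics should drive the final size.

```python
import math


def estimate_rss_servers(peak_concurrent_executors: int,
                         executors_per_server: int) -> int:
    """Number of RSS servers for a given executors-per-server ratio."""
    return math.ceil(peak_concurrent_executors / executors_per_server)


def avg_write_mb_per_sec_per_server(daily_shuffle_tb: float,
                                    replicas: int,
                                    servers: int) -> float:
    """Average disk write rate per server in MB/s (decimal units).

    Assumes every replica writes a full copy of the shuffle data and
    that the load is spread evenly over the day and over the servers.
    """
    total_bytes = daily_shuffle_tb * 1e12 * replicas
    seconds_per_day = 24 * 60 * 60
    return total_bytes / seconds_per_day / servers / 1e6


if __name__ == "__main__":
    peak_executors = 5000  # hypothetical; replace with your own measurement
    for ratio in (10, 50):  # the suggested starting range
        servers = estimate_rss_servers(peak_executors, ratio)
        rate = avg_write_mb_per_sec_per_server(100, 2, servers)
        print(f"{ratio} executors/server: {servers} servers, "
              f"about {rate:.1f} MB/s average write per server")
```

Starting at the conservative end of the range (fewer executors per server) and consolidating once the disk and network metrics show headroom is probably safer than the reverse.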
