Spark cluster #21
First try, same 1 master + 1 slave:
64 partitions: runs 1180 sec, AUC=0.7313, RAM 73GB (1 slave)
32 partitions: runs 1560 sec, AUC=0.7298, RAM 72GB
local again: runs 830 sec, AUC=0.7308, RAM 125GB
local with 64 partitions: runs 830 sec, AUC=0.7292, RAM 123GB
100M, 10 trees:
single server - x1e.8xlarge (32 cores, 960GB RAM)
cluster with 10 slave nodes - r4.8xlarge (32 cores, 240GB RAM)
4.3x time ratio (but we should really compare the 10-node cluster against a 1-node cluster, since we've seen the cluster overhead penalty). If the penalty on the cluster is the same as for 10M above (1.4-1.7x), that gives a 6-7.3x speedup from 1 slave to 10.
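The estimate above is just the measured time ratio multiplied by the assumed cluster overhead penalty. A minimal sketch of that arithmetic, using the figures quoted in this issue (the function name `implied_scaling` is made up for illustration, and the assumption that the 10M penalty carries over to 100M is exactly that, an assumption):

```python
def implied_scaling(time_ratio: float, cluster_penalty: float) -> float:
    """Estimated speedup from 1 slave to N slaves.

    time_ratio      -- measured time ratio between the single server and the cluster
    cluster_penalty -- assumed slowdown of a 1-slave cluster vs. local mode
    """
    return time_ratio * cluster_penalty


# Penalty observed on 10M with 64 partitions: 1180 sec (1 slave) vs. 830 sec (local).
observed_penalty_10m = 1180 / 830

# Range quoted in the issue for the 10M penalty: 1.4x to 1.7x.
low = implied_scaling(4.3, 1.4)
high = implied_scaling(4.3, 1.7)

print(f"10M penalty (64 partitions): {observed_penalty_10m:.2f}x")
print(f"implied 1 -> 10 slave speedup: {low:.1f}x to {high:.1f}x")
```

An actual 1-slave run on 100M would replace the assumed penalty with a measured one.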
320 partitions: run stopped after 2 trees as it was getting slower:
10M:
match partitions to cores:
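For reference, "matching partitions to cores" here means one partition per core across all worker nodes. A small sketch of that count, assuming the 32-core nodes listed in this issue (`partitions_for_cluster` and its parameters are hypothetical names, not part of Spark):

```python
def partitions_for_cluster(num_workers: int, cores_per_worker: int,
                           partitions_per_core: int = 1) -> int:
    """Partition count that gives each core `partitions_per_core` partitions."""
    if num_workers < 1 or cores_per_worker < 1 or partitions_per_core < 1:
        raise ValueError("all arguments must be positive integers")
    return num_workers * cores_per_worker * partitions_per_core


# 1 slave with 32 cores -> 32 partitions
print(partitions_for_cluster(1, 32))
# 10 slaves with 32 cores each -> 320 partitions
print(partitions_for_cluster(10, 32))
# 2 partitions per core on 1 slave -> 64 partitions
print(partitions_for_cluster(1, 32, partitions_per_core=2))
```

Note that in the 1-slave runs above, 64 partitions (2 per core) finished faster than 32, so a one-to-one match is a starting point rather than a guaranteed optimum.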
10 trees, depth 10
best of partition (auto/manual):
previous results single server:
10M ran on:
r4.8xlarge (32 cores, 1 NUMA, 240GB RAM)
100M ran on:
x1e.8xlarge (32 cores, 1 NUMA, 960GB RAM)