join pooled clusters based on yarn cluster metrics #2191

Closed
coyotemarin opened this issue Aug 12, 2020 · 2 comments · Fixed by #2197

@coyotemarin
Collaborator

Currently, pooling checks whether a cluster is "big enough" in terms of memory, CPU, and a few other aspects (e.g. EBS volume size).

We can instead compare memory and CPU needs by SSHing to the cluster's YARN resource manager and querying its metrics API for availableMB and availableVirtualCores.

Not only would this provide more useful information about a cluster that can run multiple jobs simultaneously, it would also let us skip listing the cluster's instances with ListInstanceGroups/ListInstanceFleets, saving an API call.
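
A minimal sketch of the metrics lookup, assuming the resource manager's REST endpoint at `/ws/v1/cluster/metrics` on the default web port 8088. The helper names `metrics_curl_args` and `parse_available_resources` are hypothetical (not existing mrjob functions), and running the command over SSH on the master node is left out:

```python
import json

# Path of the YARN ResourceManager cluster metrics REST endpoint
YARN_METRICS_PATH = '/ws/v1/cluster/metrics'


def metrics_curl_args(host='localhost', port=8088):
    """Build a curl command to run on the master node (e.g. over SSH).

    host and port are assumptions; 8088 is YARN's default web port.
    """
    return ['curl', '-s', 'http://%s:%d%s' % (host, port, YARN_METRICS_PATH)]


def parse_available_resources(response_body):
    """Return (availableMB, availableVirtualCores) from the JSON response."""
    metrics = json.loads(response_body)['clusterMetrics']
    return metrics['availableMB'], metrics['availableVirtualCores']
```

The response wraps all fields in a top-level `clusterMetrics` object, so only two keys need to be read from it.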

coyotemarin added this to the v0.7.4 milestone Aug 12, 2020
@coyotemarin
Collaborator Author

Probably should call these options min_available_mb and min_available_virtual_cores. If either is set, we can bypass checking the cluster's instance information.
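
A rough sketch of how the two options might gate the check. Only the option names come from the comment above; the function `cluster_has_capacity` and its return convention are assumptions for illustration:

```python
def cluster_has_capacity(available_mb, available_virtual_cores,
                         min_available_mb=None,
                         min_available_virtual_cores=None):
    """Decide whether to join a pooled cluster based on YARN metrics.

    Returns None if neither option is set, meaning the caller should
    fall back to checking the cluster's instance information.
    """
    if min_available_mb is None and min_available_virtual_cores is None:
        return None  # options unset: YARN metrics are not used

    if min_available_mb is not None and available_mb < min_available_mb:
        return False

    if (min_available_virtual_cores is not None and
            available_virtual_cores < min_available_virtual_cores):
        return False

    return True
```

For example, `cluster_has_capacity(8192, 4, min_available_mb=4096)` accepts the cluster without looking at its instances, since only the memory option is set.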

@coyotemarin
Collaborator Author

core_instance_type, num_core_instances, etc. will still be relevant when there is no pooled cluster available and we need to start our own.

coyotemarin changed the title from "joined pooled clusters based on yarn cluster metrics" to "join pooled clusters based on yarn cluster metrics" Aug 26, 2020
coyotemarin pushed a commit that referenced this issue Aug 29, 2020
pooling: query YARN resource manager for available memory and CPU (fixes #2191)