Should we move GPUs from OpenStack to OpenShift? #668

Closed
3 tasks done
joachimweyl opened this issue Jul 31, 2024 · 4 comments
Labels
openshift: This issue pertains to NERC OpenShift
openstack: This issue pertains to NERC OpenStack

Comments

@joachimweyl
Contributor

joachimweyl commented Jul 31, 2024

Motivation

We have 16 V100s at 4%-20% usage and 16 A100SXM4s at 0% usage in OpenStack. Is OpenShift a better location for them? Also worth noting: we have a pending request from RH for 4 SXM4 BM nodes. We might not need to move those into OpenShift right away; we could move them to ESI, hand them over directly as BM until RH is done with them, and then move them into OpenShift Prod.
We already have one request to move a V100 GPU out of OpenStack.

Completion Criteria

Decide if we should move some of the GPU nodes from OpenStack to OpenShift.

Description

  • Decide if we should move GPUs from OpenStack to OpenShift
  • Decide how many of each GPU node we should move
  • Create an issue to move them

Original Move Suggestion

  • 6 V100 nodes with 12 V100 GPUs
  • 6 A100SXM4 nodes with 12 A100SXM4 GPUs

Completion dates

Desired - 2024-08-09
Required - 2024-08-21

@joachimweyl added the openstack and openshift labels Jul 31, 2024
@joachimweyl
Contributor Author

@syockel, your thoughts on this would be helpful.

@joachimweyl
Contributor Author

If we can't come to a decision by next Wednesday, I will bring it up in the Operations meeting.

@joachimweyl
Contributor Author

Decision made with Scott & Michael: move 8 V100s and 8 A100SXM4s out of OpenStack. 1 V100 goes to the OpenShift testing cluster and 7 V100s go to OpenShift production. All 8 A100SXM4s go to OpenShift production.

@joachimweyl
Contributor Author

#680 created.


2 participants