GPU serving setting #935
5fb8d05 to 0c6e7df
Looks pretty good at face value -- I'll need to test it tomorrow; maybe @lucferbux can test this in his morning.
Given the number of conflicts we will have, you should probably merge first, and I can go back through and fix all my PRs. Otherwise we'll both be dealing with the conflicts across my various PRs.
@cfchase I see that this is a WIP -- hopefully it's reasonably done, otherwise we're going to have a race of conflicts tomorrow.
@andrewballantyne I should have removed the WIP last night after I finished testing. Removed it.
@andrewballantyne never mind... WIP back on while I rebase off of yours.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andrewballantyne
Allow ServingRuntimes to specify allowed number of GPUs
Closes: #940
Description
I've added an annotation to the ServingRuntime specification in servingruntimes-configmap.yaml.
This is meant as a stopgap until full GPU support is implemented and tested, but it should follow the same development flow as servingruntimes-configs that do not carry the annotation.
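To make the annotation values concrete, here is a minimal sketch of how a frontend might interpret `opendatahub.io/gpu-setting`. The annotation key and the values (`'0'`, `'1'`, `'5'`, `'hidden'`, or no annotation) come from this PR's test plan; the type names, the `parseGpuSetting` helper, and the treatment of unrecognized values are hypothetical and not taken from the PR's code.

```typescript
// Hypothetical sketch: the helper and type names are illustrative assumptions,
// not the implementation in this PR.
const GPU_SETTING_ANNOTATION = 'opendatahub.io/gpu-setting';

type GpuSetting =
  | { kind: 'unset' } // annotation is absent
  | { kind: 'hidden' } // annotation is the literal string 'hidden'
  | { kind: 'count'; value: number } // non-negative integer such as '0', '1', '5'
  | { kind: 'invalid'; raw: string }; // anything else (assumed handling)

function parseGpuSetting(annotations?: Record<string, string>): GpuSetting {
  const raw = annotations?.[GPU_SETTING_ANNOTATION];
  if (raw === undefined) {
    return { kind: 'unset' };
  }
  if (raw === 'hidden') {
    return { kind: 'hidden' };
  }
  // Accept only plain decimal digits so values like '1.5' or '-1' are rejected.
  if (/^\d+$/.test(raw)) {
    return { kind: 'count', value: Number(raw) };
  }
  return { kind: 'invalid', raw };
}
```

How each case maps onto the UI (for example, whether a number is a maximum or a fixed value, and what an absent annotation shows) is decided by the PR itself; the sketch only separates the cases.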
How Has This Been Tested?
Change redhat-ods-applications servingruntimes-config to add an annotation to the override servingruntime
opendatahub.io/gpu-setting: '1'
opendatahub.io/gpu-setting: '5'
opendatahub.io/gpu-setting: '0'
opendatahub.io/gpu-setting: 'hidden'
no annotation
Verify the gpu selection shows as expected and test .
Configure a model server and verify it is spawned with the correct limits/requests/affinities
Verify any pods spawned for the model server contain the correct limits/requests/affinities
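The limits/requests part of these checks could be sketched as below. It assumes GPUs are requested through the `nvidia.com/gpu` extended resource (the name exposed by the NVIDIA device plugin); the `gpuResourcesMatch` helper and the trimmed-down types are hypothetical, and affinities are not covered.

```typescript
// Hypothetical sketch: a subset of a container's `resources` field, enough for the check.
type ResourceList = Record<string, string | number>;
type ContainerResources = { limits?: ResourceList; requests?: ResourceList };

// Assumption: GPUs are requested via the NVIDIA device plugin resource name.
const GPU_RESOURCE = 'nvidia.com/gpu';

// Returns true when both the GPU limit and the GPU request equal the expected count.
// A missing entry is treated as 0, so an expected count of 0 matches a spec
// that has no GPU entries at all.
function gpuResourcesMatch(resources: ContainerResources, expectedGpus: number): boolean {
  const limit = Number(resources.limits?.[GPU_RESOURCE] ?? 0);
  const request = Number(resources.requests?.[GPU_RESOURCE] ?? 0);
  return limit === expectedGpus && request === expectedGpus;
}
```

For example, `gpuResourcesMatch(container.resources, 1)` would be expected to hold for a pod's container after selecting one GPU, where `container` is whatever object the pod spec was read into.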
Tested on OCP 4.11 with GPU operator installed using live build
quay.io/cfchase/rhods-operator-live-catalog:1.22.0-w7