Describe the solution you'd like
I would like InferenceService to support suspension, similar to the `suspend` field of batch/v1 Job.
Suspend semantics would let us manage capacity planning appropriately in a hybrid cluster (one that mixes training and inference workloads).
It would also let Kueue manage all capacity allocation.
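To make the request concrete, here is a minimal sketch of the patch body involved. `spec.suspend` is an existing field on batch/v1 Job; the same field on InferenceService is only an assumption about what this feature could look like, and the helper name `suspend_patch` is made up for illustration.

```python
import json


def suspend_patch(suspend: bool) -> dict:
    """Build a JSON merge patch body that sets spec.suspend."""
    return {"spec": {"suspend": suspend}}


# batch/v1 Job: spec.suspend exists today. Setting it to True stops the Job's
# pods, and setting it back to False lets the Job run again.
job_patch = suspend_patch(True)

# InferenceService: HYPOTHETICAL. This assumes the requested feature is exposed
# as a boolean spec.suspend, mirroring Job. No such field is confirmed here.
isvc_patch = suspend_patch(True)

print(json.dumps(job_patch))   # {"spec": {"suspend": true}}
print(json.dumps(isvc_patch))  # {"spec": {"suspend": true}}
```

With a field like this, a queueing controller such as Kueue could create or admit the workload in a suspended state and flip the flag once capacity is available, as it does for Jobs.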
Anything else you would like to add:
Links to the design documents:
@yuzisun As we discussed at the last KubeCon, I would like to support suspend semantics.
cc: @terrytangyuan I heard elsewhere that you are interested in this feature.
/kind feature
xref: kubernetes-sigs/kueue#1603