Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 70fd52b, commit 7d151bc
Showing 14 changed files with 184 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
<!--- | ||
This is an autogenerated file! | ||
Please do not edit this file directly, but instead make changes to the | ||
sigs.yaml file in the project root. | ||
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md | ||
---> | ||
# Serving Working Group | ||
|
||
Discuss and enhance the support of inference serving for accelerated workloads in Kubernetes. Make Kubernetes the natural choice for hosting production inference reliably, and improve all serving workloads along the way. | ||
|
||
The [charter](charter.md) defines the scope and governance of the Serving Working Group. | ||
|
||
## Stakeholder SIGs | ||
* [SIG Apps](/sig-apps) | ||
* [SIG Architecture](/sig-architecture) | ||
* [SIG Autoscaling](/sig-autoscaling) | ||
* [SIG Instrumentation](/sig-instrumentation) | ||
* [SIG Network](/sig-network) | ||
* [SIG Node](/sig-node) | ||
* [SIG Scheduling](/sig-scheduling) | ||
* [SIG Storage](/sig-storage) | ||
|
||
## Meetings | ||
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-serving) for the group will typically add invites for the following meetings to your calendar.* | ||
* WG Serving weekly meeting: [Wednesdays at 9:00 PT (Pacific Time)](https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09) (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=9:00&tz=PT%20%28Pacific%20Time%29). | ||
* [Meeting notes and Agenda](https://docs.google.com/document/d/1aExJFtaLnO-TM6_2uILgI8NI0IjOm7FcwLABBKEMEo0/edit). | ||
* [Meeting recordings](https://www.youtube.com/playlist?list=TODO). | ||
|
||
## Organizers | ||
|
||
* Sergey Kanzhelev (**[@SergeyKanzhelev](https://github.com/SergeyKanzhelev)**), Google | ||
* TBD (**[@TBD](https://github.com/TBD)**), TBD | ||
|
||
## Contact | ||
- Slack: [#wg-serving](https://kubernetes.slack.com/messages/wg-serving) | ||
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-serving) | ||
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fserving) | ||
- Steering Committee Liaison: TODO TODO (**[@TODO](https://github.com/TODO)**) | ||
<!-- BEGIN CUSTOM CONTENT --> | ||
|
||
<!-- END CUSTOM CONTENT --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
# WG Serving Charter | ||
|
||
This charter adheres to the conventions described in the [Kubernetes Charter README] and uses | ||
the Roles and Organization Management outlined in [wg-governance]. | ||
|
||
[Kubernetes Charter README]: /committee-steering/governance/README.md | ||
|
||
## Scope | ||
|
||
Discuss and enhance the support for AI/ML inference workloads in Kubernetes. | ||
|
||
|
||
### In scope | ||
|
||
- Gather requirements for serving workloads (inference primarily, but benefiting | ||
other non-batch use cases where possible) that have broad community alignment | ||
from practitioners, distros, and vendors. Provide concrete input to other SIGs | ||
and WGs on needs for the identified requirements. Do this in partnership | ||
with existing ecosystem projects like KServe, Seldon, Kaito, and | ||
others to identify, extract, or implement solutions to common shared problems (as Kueue | ||
abstracted deferred scheduling for multiple batch frameworks). | ||
- Specific areas of improvement include: | ||
  - Directly improve key Kubernetes workload controllers when used with | ||
    accelerators and the most common inference serving frameworks and model | ||
    servers. | ||
  - Explore new projects that improve orchestration, scaling, and load balancing | ||
    of inference workloads and compose well with other workloads on Kubernetes. | ||
  - Run serving workloads safely while giving up | ||
    available slack capacity to batch frameworks. | ||
|
||
### Out of scope | ||
|
||
- Training and batch inference, which are covered by WG Batch. | ||
- Addition of new API kinds that serve specific models. The focus should be on | ||
general APIs that frameworks can build on top of. | ||
- The ability to describe workflows for serving workloads is out of scope; | ||
Kubernetes will offer building blocks for MLOps platforms to build those. | ||
|
||
## Stakeholders | ||
|
||
Stakeholders in this working group span multiple SIGs that own parts of the | ||
code in core Kubernetes components and addons. | ||
|
||
- SIG Apps as the primary SIG | ||
- SIG Architecture | ||
- SIG Node | ||
- SIG Scheduling | ||
- SIG Autoscaling | ||
- SIG Network | ||
- SIG Instrumentation | ||
- SIG Storage | ||
|
||
## Deliverables | ||
|
||
The list of deliverables includes the following high-level features: | ||
|
||
- To SIG Apps: | ||
  - Ability to express model serving workloads with easy-to-understand logical | ||
    objects that can scale to multi-host | ||
- To SIG Scheduling and SIG Autoscaling: | ||
  - Faster scaling up and down | ||
  - Ability to preempt workloads | ||
- To SIG Node: | ||
  - Runtime support for Pod preemption | ||
  - Runtime support for device partitioning | ||
|
||
## Roles and Organization Management | ||
|
||
This WG adheres to the Roles and Organization Management outlined in [wg-governance] | ||
and opts in to updates and modifications to [wg-governance]. | ||
|
||
[wg-governance]: /committee-steering/governance/wg-governance.md | ||
|
||
Additionally, the WG commits to: | ||
|
||
- maintain a solid communication line between the Kubernetes groups and the wider CNCF community; | ||
- submit a proposal to the KubeCon/CloudNativeCon maintainers track. | ||
|
||
## Timelines and Disbanding | ||
|
||
As a first mandate, the WG will define a roadmap in the first quarter of operation. | ||
We believe there will be a set of features the Working Group can identify and deliver | ||
that will enable the majority of frameworks to operate natively on Kubernetes. | ||
|
||
There is no expectation that the Working Group will be converted into a SIG long term; | ||
however, there is a chance that a separate project or a sizeable sub-component of SIG Apps will be | ||
created as a result of the Working Group.