WG serving proposal
SergeyKanzhelev committed Apr 29, 2024
1 parent 2d34a98 commit 0743da8
Showing 14 changed files with 200 additions and 0 deletions.
4 changes: 4 additions & 0 deletions OWNERS_ALIASES
@@ -142,6 +142,10 @@ aliases:
    - JimBugwadia
    - poonam-lamba
    - sudermanjr
  wg-serving-leads:
    - ArangoGutierrez
    - SergeyKanzhelev
    - terrytangyuan
  wg-structured-logging-leads:
    - mengjiao-liu
    - pohly
1 change: 1 addition & 0 deletions liaisons.md
@@ -60,6 +60,7 @@ members will assume one of the departing members groups.
| [WG Device Management](wg-device-management/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
| [WG LTS](wg-lts/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
| [WG Policy](wg-policy/README.md) | Patrick Ohly (**[@pohly](https://github.com/pohly)**) |
| [WG Serving](wg-serving/README.md) | Maciej Szulik (**[@soltysh](https://github.com/soltysh)**) |
| [WG Structured Logging](wg-structured-logging/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
| [Committee Code of Conduct](committee-code-of-conduct/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
| [Committee Security Response](committee-security-response/README.md) | Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**) |
1 change: 1 addition & 0 deletions sig-apps/README.md
@@ -59,6 +59,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
The following [working groups][working-group-definition] are sponsored by sig-apps:
* [WG Batch](/wg-batch)
* [WG Data Protection](/wg-data-protection)
* [WG Serving](/wg-serving)


## Subprojects
1 change: 1 addition & 0 deletions sig-architecture/README.md
@@ -60,6 +60,7 @@ The following [working groups][working-group-definition] are sponsored by sig-ar
* [WG Device Management](/wg-device-management)
* [WG LTS](/wg-lts)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-autoscaling/README.md
@@ -48,6 +48,7 @@ The Chairs of the SIG run operations and processes governing the SIG.
The following [working groups][working-group-definition] are sponsored by sig-autoscaling:
* [WG Batch](/wg-batch)
* [WG Device Management](/wg-device-management)
* [WG Serving](/wg-serving)


## Subprojects
1 change: 1 addition & 0 deletions sig-instrumentation/README.md
@@ -53,6 +53,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
## Working Groups

The following [working groups][working-group-definition] are sponsored by sig-instrumentation:
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-list.md
@@ -67,6 +67,7 @@ When the need arises, a [new SIG can be created](sig-wg-lifecycle.md)
|[Device Management](wg-device-management/README.md)|[device-management](https://github.com/kubernetes/kubernetes/labels/wg%2Fdevice-management)|* Architecture<br>* Autoscaling<br>* Network<br>* Node<br>* Scheduling<br>|* [John Belamaric](https://github.com/johnbelamaric), Google<br>* [Kevin Klues](https://github.com/klueska), NVIDIA<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-device-management)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-device-management)|* Regular WG Meeting: [Tuesdays at 8:30 PT (Pacific Time) (biweekly)](TBD)<br>
|[LTS](wg-lts/README.md)|[lts](https://github.com/kubernetes/kubernetes/labels/wg%2Flts)|* Architecture<br>* Cluster Lifecycle<br>* K8s Infra<br>* Release<br>* Security<br>* Testing<br>|* [Jeremy Rickard](https://github.com/jeremyrickard), Microsoft<br>* [Jordan Liggitt](https://github.com/liggitt), Google<br>* [Micah Hausler](https://github.com/micahhausler), Amazon<br>|* [Slack](https://kubernetes.slack.com/messages/wg-lts)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-lts)|* Regular WG Meeting: [Tuesdays at 07:00 PT (Pacific Time) (biweekly)](https://zoom.us/j/92480197536?pwd=dmtSMGJRQmNYYTIyZkFlQ25JRngrdz09)<br>
|[Policy](wg-policy/README.md)|[policy](https://github.com/kubernetes/kubernetes/labels/wg%2Fpolicy)|* Architecture<br>* Auth<br>* Multicluster<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Jim Bugwadia](https://github.com/JimBugwadia), Kyverno/Nirmata<br>* [Poonam Lamba](https://github.com/poonam-lamba), Google<br>* [Andy Suderman](https://github.com/sudermanjr), Fairwinds<br>|* [Slack](https://kubernetes.slack.com/messages/wg-policy)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-policy)|* Regular WG Meeting: [Wednesdays at 8:00 PT (Pacific Time) (semimonthly)](https://zoom.us/j/7375677271)<br>
|[Serving](wg-serving/README.md)|[serving](https://github.com/kubernetes/kubernetes/labels/wg%2Fserving)|* Apps<br>* Architecture<br>* Autoscaling<br>* Instrumentation<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Eduardo Arango](https://github.com/ArangoGutierrez), NVIDIA<br>* [Sergey Kanzhelev](https://github.com/SergeyKanzhelev), Google<br>* [Yuan Tang](https://github.com/terrytangyuan), Red Hat<br>|* [Slack](https://kubernetes.slack.com/messages/wg-serving)<br>* [Mailing List](https://groups.google.com/a/kubernetes.io/g/wg-serving)|* WG Serving weekly meeting ([Calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com)): [Wednesdays at 9:00 AM PT (Pacific Time) (weekly)](https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09)<br>
|[Structured Logging](wg-structured-logging/README.md)|[structured-logging](https://github.com/kubernetes/kubernetes/labels/wg%2Fstructured-logging)|* API Machinery<br>* Architecture<br>* Cloud Provider<br>* Instrumentation<br>* Network<br>* Node<br>* Scheduling<br>* Storage<br>|* [Mengjiao Liu](https://github.com/mengjiao-liu), DaoCloud<br>* [Patrick Ohly](https://github.com/pohly), Intel<br>|* [Slack](https://kubernetes.slack.com/messages/wg-structured-logging)<br>* [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-wg-structured-logging)|

### Committees
1 change: 1 addition & 0 deletions sig-network/README.md
@@ -74,6 +74,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
The following [working groups][working-group-definition] are sponsored by sig-network:
* [WG Device Management](/wg-device-management)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-node/README.md
@@ -55,6 +55,7 @@ The following [working groups][working-group-definition] are sponsored by sig-no
* [WG Batch](/wg-batch)
* [WG Device Management](/wg-device-management)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-scheduling/README.md
@@ -65,6 +65,7 @@ The following [working groups][working-group-definition] are sponsored by sig-sc
* [WG Batch](/wg-batch)
* [WG Device Management](/wg-device-management)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)


1 change: 1 addition & 0 deletions sig-storage/README.md
@@ -58,6 +58,7 @@ subprojects, and resolve cross-subproject technical issues and decisions.
The following [working groups][working-group-definition] are sponsored by sig-storage:
* [WG Data Protection](/wg-data-protection)
* [WG Policy](/wg-policy)
* [WG Serving](/wg-serving)
* [WG Structured Logging](/wg-structured-logging)


44 changes: 44 additions & 0 deletions sigs.yaml
@@ -3481,6 +3481,50 @@ workinggroups:
      liaison:
        github: pohly
        name: Patrick Ohly
  - dir: wg-serving
    name: Serving
    mission_statement: >
      Discuss and enhance the support of inference serving for accelerated workloads
      in Kubernetes. Make Kubernetes the natural choice for hosting production inference
      reliably, and improve all serving workloads along the way.
    charter_link: charter.md
    stakeholder_sigs:
    - Apps
    - Architecture
    - Autoscaling
    - Instrumentation
    - Network
    - Node
    - Scheduling
    - Storage
    label: serving
    leadership:
      chairs:
      - github: ArangoGutierrez
        name: Eduardo Arango
        company: NVIDIA
      - github: SergeyKanzhelev
        name: Sergey Kanzhelev
        company: Google
      - github: terrytangyuan
        name: Yuan Tang
        company: Red Hat
    meetings:
    - description: WG Serving weekly meeting ([Calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com))
      day: Wednesday
      time: 9:00 AM
      tz: PT (Pacific Time)
      frequency: weekly
      url: https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09
      archive_url: https://docs.google.com/document/d/1aExJFtaLnO-TM6_2uILgI8NI0IjOm7FcwLABBKEMEo0/edit
      recordings_url: https://www.youtube.com/playlist?list=TODO
    contact:
      slack: wg-serving
      mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-serving
      liaison:
        github: soltysh
        name: Maciej Szulik
  - dir: wg-structured-logging
    name: Structured Logging
    mission_statement: >
44 changes: 44 additions & 0 deletions wg-serving/README.md
@@ -0,0 +1,44 @@
<!---
This is an autogenerated file!
Please do not edit this file directly, but instead make changes to the
sigs.yaml file in the project root.
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
--->
# Serving Working Group

Discuss and enhance the support of inference serving for accelerated workloads in Kubernetes. Make Kubernetes the natural choice for hosting production inference reliably, and improve all serving workloads along the way.

The [charter](charter.md) defines the scope and governance of the Serving Working Group.

## Stakeholder SIGs
* [SIG Apps](/sig-apps)
* [SIG Architecture](/sig-architecture)
* [SIG Autoscaling](/sig-autoscaling)
* [SIG Instrumentation](/sig-instrumentation)
* [SIG Network](/sig-network)
* [SIG Node](/sig-node)
* [SIG Scheduling](/sig-scheduling)
* [SIG Storage](/sig-storage)

## Meetings
*Joining the [mailing list](https://groups.google.com/a/kubernetes.io/g/wg-serving) for the group will typically add invites for the following meetings to your calendar.*
* WG Serving weekly meeting ([Calendar](https://calendar.google.com/calendar/embed?src=e896b769743f3877edfab2d4c6a14132b2aa53287021e9bbf113cab676da54ba%40group.calendar.google.com)): [Wednesdays at 9:00 AM PT (Pacific Time)](https://zoom.us/j/93517402529?pwd=RnkwUUQ4L3J2QmNYYlNBcnZGbXcvQT09) (weekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=9:00 AM&tz=PT%20%28Pacific%20Time%29).
* [Meeting notes and Agenda](https://docs.google.com/document/d/1aExJFtaLnO-TM6_2uILgI8NI0IjOm7FcwLABBKEMEo0/edit).
* [Meeting recordings](https://www.youtube.com/playlist?list=TODO).

## Organizers

* Eduardo Arango (**[@ArangoGutierrez](https://github.com/ArangoGutierrez)**), NVIDIA
* Sergey Kanzhelev (**[@SergeyKanzhelev](https://github.com/SergeyKanzhelev)**), Google
* Yuan Tang (**[@terrytangyuan](https://github.com/terrytangyuan)**), Red Hat

## Contact
- Slack: [#wg-serving](https://kubernetes.slack.com/messages/wg-serving)
- [Mailing list](https://groups.google.com/a/kubernetes.io/g/wg-serving)
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/wg%2Fserving)
- Steering Committee Liaison: Maciej Szulik (**[@soltysh](https://github.com/soltysh)**)
<!-- BEGIN CUSTOM CONTENT -->

<!-- END CUSTOM CONTENT -->
98 changes: 98 additions & 0 deletions wg-serving/charter.md
@@ -0,0 +1,98 @@
# WG Serving Charter

This charter adheres to the conventions described in the [Kubernetes Charter README] and uses
the Roles and Organization Management outlined in [wg-governance].

[Kubernetes Charter README]: /committee-steering/governance/README.md

## Scope

Discuss and enhance serving workloads on Kubernetes, specifically focusing on
hardware-accelerated AI/ML inference. The working group will focus on the novel
challenges of compute-intensive online inference. Use cases involving
non-fungible accelerators will be prioritized over solutions targeting generic
CPUs. However, all improvements should, where possible, benefit other
serving workloads like web services or stateful databases, be usable as
primitives by multiple ecosystem projects, and compose well into the workflows
of those deploying models to production.

WG Batch has a similar scope. As a simplified definition, WG Serving will
generally concentrate on workloads whose Pods run with `restartPolicy=Always`,
while WG Batch will generally look at Pods with `restartPolicy=OnFailure`.
There are edge cases to this definition, but it creates an easy enough
framework to differentiate the scope of these two Working Groups.
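
A minimal sketch of the `restartPolicy` distinction, assuming hypothetical workload names and a placeholder image (`example.com/model-server:latest`), none of which come from the charter. A Deployment's Pod template only accepts `restartPolicy: Always`, while a Job's Pod template must use `OnFailure` or `Never`:

```yaml
# Serving-style workload: long-running Pods that are restarted whenever they exit.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server             # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      restartPolicy: Always      # the only value a Deployment accepts (also the default)
      containers:
      - name: server
        image: example.com/model-server:latest   # placeholder image
---
# Batch-style workload: Pods run to completion and are restarted only on failure.
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-inference          # hypothetical name
spec:
  template:
    spec:
      restartPolicy: OnFailure   # Jobs allow only OnFailure or Never
      containers:
      - name: worker
        image: example.com/model-server:latest   # placeholder image
```

Under the simplified definition, the first manifest would fall in WG Serving's area and the second in WG Batch's; the edge cases the charter mentions are not captured here.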

### In scope

- Gather requirements for serving workloads (inference primarily, but benefiting
  other non-batch use cases where possible) that have broad community alignment
  from practitioners, distros, and vendors. Provide concrete input to other SIGs
  and WGs around needs for identified requirements. Do it in partnership
  with existing ecosystem projects like KServe, Seldon, Kaito, and
  others to identify, extract, or implement solutions to common shared problems
  (like Kueue abstracted deferred scheduling for multiple batch frameworks).
- Specific areas of improvement include:
  - Directly improve key Kubernetes workload controllers when used with
    accelerators and the most common inference serving frameworks and model
    servers.
  - Explore new projects that improve orchestration, scaling, and load balancing
    of inference workloads and compose well with other workloads on Kubernetes.
  - Run serving workloads safely while giving up available slack capacity to
    batch frameworks.

### Out of scope

- Training and batch inference, which are covered by WG Batch.
- Describing the workflows for serving workloads. Kubernetes will instead offer
  building blocks that MLOps platforms can use to build those workflows.

## Stakeholders

Stakeholders in this working group span multiple SIGs that own parts of the
code in core Kubernetes components and addons.

- SIG Apps as a primary SIG
- SIG Architecture
- SIG Node
- SIG Scheduling
- SIG Autoscaling
- SIG Network
- SIG Instrumentation
- SIG Storage

## Deliverables

The list of deliverables includes the following high-level features:

- To SIG Apps:
  - Ability to express model serving workloads with easy-to-understand logical
    objects with the ability to scale to multi-host
- To SIG Scheduling and SIG Autoscaling:
  - Faster scaling up and down
  - Ability to preempt workloads
- To SIG Node:
  - Runtime support for Pod preemption
  - Runtime support for device partitioning

## Roles and Organization Management

This WG adheres to the Roles and Organization Management outlined in [wg-governance]
and opts in to updates and modifications to [wg-governance].

[wg-governance]: /committee-steering/governance/wg-governance.md

Additionally, the WG commits to:

- maintain a solid communication line between the Kubernetes groups and the wider CNCF community;
- submit a proposal to the KubeCon/CloudNativeCon maintainers track.

## Timelines and Disbanding

As a first mandate, the WG will define a roadmap in the first quarter of operation.
We believe there will be a set of features the Working Group can identify and deliver
that will enable the majority of frameworks to operate natively on Kubernetes.

Achieving the aforementioned deliverables, also mentioned in the `In Scope`
section, will allow us to decide when to disband this WG. There is no
expectation that the Working Group will be converted into a SIG in the long
term; however, there is a chance that a separate project or a sizeable
sub-component of SIG Apps will be created as a result of the Working Group.
