Insights: kubernetes-sigs/gateway-api-inference-extension
Overview
10 Pull requests merged by 4 people
- Configure gpu-deployment.yaml to force vLLM v1 with LoRA (#573, merged Mar 25, 2025)
- Configure the vllm deployment with best practices for startup (#550, merged Mar 25, 2025)
- Bump sigs.k8s.io/controller-runtime from 0.20.3 to 0.20.4 (#570, merged Mar 25, 2025)
- Bump github.com/onsi/gomega from 1.36.2 to 1.36.3 (#569, merged Mar 25, 2025)
- Bump google.golang.org/protobuf from 1.36.5 to 1.36.6 (#568, merged Mar 25, 2025)
- Removing unsafe lib by switching to atomic.Pointer (#567, merged Mar 25, 2025; see the sketch after this list)
- Allow partial metric updates (#561, merged Mar 24, 2025)
- Update boilerplate template (#566, merged Mar 24, 2025)
- Swapping out flow image (#562, merged Mar 24, 2025)
- remove controller-runtime dependency from API (#565, merged Mar 24, 2025)
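Among the merged changes, #567 replaces direct use of the unsafe package with atomic.Pointer from Go's sync/atomic. The sketch below is illustrative only: the Metrics type and its fields are hypothetical stand-ins, not the repository's actual code. atomic.Pointer[T] gives type-safe, lock-free publication of an immutable snapshot without unsafe.Pointer casts:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Metrics is a hypothetical immutable snapshot shared between goroutines.
type Metrics struct {
	WaitingQueueSize int
	KVCacheUsage     float64
}

// store holds the current snapshot; atomic.Pointer avoids both mutexes
// and the unsafe.Pointer casts that older atomic code required.
var store atomic.Pointer[Metrics]

func main() {
	// A writer publishes a fresh snapshot atomically.
	store.Store(&Metrics{WaitingQueueSize: 3, KVCacheUsage: 0.42})

	// Readers load the pointer without locking; since a snapshot is
	// never mutated in place, no further synchronization is needed.
	if m := store.Load(); m != nil {
		fmt.Printf("queue=%d kv=%.2f\n", m.WaitingQueueSize, m.KVCacheUsage)
	}
}
```

Because each Store publishes a new, never-mutated value, this is the standard type-safe replacement for the atomic.LoadPointer/StorePointer-over-unsafe.Pointer pattern.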
3 Pull requests opened by 3 people
- Add benchmark automation tool (#563, opened Mar 24, 2025)
- Adding printer columns to inference model (#574, opened Mar 25, 2025)
- Docs: Updates getting started guide for kgateway (#575, opened Mar 25, 2025)
3 Issues closed by 2 people
- Handle response body parsing for both streaming and non-streaming cases (#178, closed Mar 25, 2025)
- Remove Controller-Runtime Dependencies from API Types (#564, closed Mar 24, 2025)
- Refactor the vllm specific code to become model server agnostic (#383, closed Mar 23, 2025)
1 Issue opened by 1 person
- Record the limitation that only a single EPP replica has been tested so far (#572, opened Mar 25, 2025)
9 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [WIP] Groundwork to support OpenAI API endpoints that vLLM supports (#526, commented on Mar 24, 2025; 2 new comments)
- update benchmarking guide with latest results with vllm v1 (#559, commented on Mar 24, 2025; 2 new comments)
- Prefix Cache Aware Proposal (#498, commented on Mar 24, 2025; 0 new comments)
- Expose baseline algorithm parameters as configurable (#16, commented on Mar 24, 2025; 0 new comments)
- Improve vLLM upstream health checks to only pass when models are servable (#558, commented on Mar 24, 2025; 0 new comments; see the sketch after this list)
- We should encourage all InferencePool deployments to gracefully rollout and drain (#549, commented on Mar 25, 2025; 0 new comments)
- Create env vars for the algorithm's scheduling parameters (#447, commented on Mar 25, 2025; 0 new comments)
- v0.3.0 Release Tracker (#493, commented on Mar 25, 2025; 0 new comments)
- Validate model/adapter is available on the model server before sending requests to a model server (#49, commented on Mar 25, 2025; 0 new comments)
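On #558, the discussion is about making upstream health checks pass only once models are actually servable, rather than once the server process is up. A minimal sketch of one way to probe this, assuming the model server exposes an OpenAI-compatible /v1/models listing (as vLLM's OpenAI-compatible server does); the base URL and model name below are placeholder assumptions, not values from this repository:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// modelList mirrors the shape of an OpenAI-style /v1/models response.
type modelList struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// modelServable reports whether the server currently lists the given
// model, i.e. whether requests for it could actually be served.
func modelServable(baseURL, model string) (bool, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(baseURL + "/v1/models")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var list modelList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		return false, err
	}
	for _, m := range list.Data {
		if m.ID == model {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Placeholder endpoint and model name for illustration only.
	ok, err := modelServable("http://localhost:8000", "example/model")
	fmt.Println(ok, err)
}
```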