Insights: kubernetes-sigs/gateway-api-inference-extension
Overview
10 Pull requests merged by 4 people
- Configure gpu-deployment.yaml to force vLLM v1 with LoRA (#573, merged Mar 25, 2025)
- Configure the vllm deployment with best practices for startup (#550, merged Mar 25, 2025)
- Bump sigs.k8s.io/controller-runtime from 0.20.3 to 0.20.4 (#570, merged Mar 25, 2025)
- Bump github.com/onsi/gomega from 1.36.2 to 1.36.3 (#569, merged Mar 25, 2025)
- Bump google.golang.org/protobuf from 1.36.5 to 1.36.6 (#568, merged Mar 25, 2025)
- Removing unsafe lib by switching to atomic.Pointer (#567, merged Mar 25, 2025; see the sketch after this list)
- Allow partial metric updates (#561, merged Mar 24, 2025)
- Update boilerplate template (#566, merged Mar 24, 2025)
- Swapping out flow image (#562, merged Mar 24, 2025)
- remove controller-runtime dependency from API (#565, merged Mar 24, 2025)
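Among the merged changes, #567 replaces direct use of the unsafe package with atomic.Pointer from Go's sync/atomic. The sketch below is illustrative only: the Metrics type and its fields are hypothetical stand-ins, not the repository's actual code. atomic.Pointer[T] gives type-safe, lock-free publication of an immutable snapshot without unsafe.Pointer casts:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Metrics is a hypothetical immutable snapshot shared between goroutines.
type Metrics struct {
	WaitingQueueSize int
	KVCacheUsage     float64
}

// store holds the current snapshot; atomic.Pointer avoids both mutexes
// and the unsafe.Pointer casts that older atomic code required.
var store atomic.Pointer[Metrics]

func main() {
	// A writer publishes a fresh snapshot atomically.
	store.Store(&Metrics{WaitingQueueSize: 3, KVCacheUsage: 0.42})

	// Readers load the pointer without locking; since a snapshot is
	// never mutated in place, no further synchronization is needed.
	if m := store.Load(); m != nil {
		fmt.Printf("queue=%d kv=%.2f\n", m.WaitingQueueSize, m.KVCacheUsage)
	}
}
```

Because each Store publishes a new, never-mutated value, this is the standard type-safe replacement for the atomic.LoadPointer/StorePointer-over-unsafe.Pointer pattern.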
3 Pull requests opened by 3 people
- Add benchmark automation tool (#563, opened Mar 24, 2025)
- Adding printer columns to inference model (#574, opened Mar 25, 2025)
- Docs: Updates getting started guide for kgateway (#575, opened Mar 25, 2025)
3 Issues closed by 2 people
- Handle response body parsing for both streaming and non-streaming cases (#178, closed Mar 25, 2025)
- Remove Controller-Runtime Dependencies from API Types (#564, closed Mar 24, 2025)
- Refactor the vllm specific code to become model server agnostic (#383, closed Mar 23, 2025)
1 Issue opened by 1 person
- Record the limitation that only a single EPP replica has been tested so far (#572, opened Mar 25, 2025)
9 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [WIP] Groundwork to support OpenAI API endpoints that vLLM supports (#526, commented on Mar 24, 2025; 2 new comments)
- update benchmarking guide with latest results with vllm v1 (#559, commented on Mar 24, 2025; 2 new comments)
- Prefix Cache Aware Proposal (#498, commented on Mar 24, 2025; 0 new comments)
- Expose baseline algorithm parameters as configurable (#16, commented on Mar 24, 2025; 0 new comments)
- Improve vLLM upstream health checks to only pass when models are servable (#558, commented on Mar 24, 2025; 0 new comments; see the sketch after this list)
- We should encourage all InferencePool deployments to gracefully rollout and drain (#549, commented on Mar 25, 2025; 0 new comments)
- Create env vars for the algorithm's scheduling parameters (#447, commented on Mar 25, 2025; 0 new comments)
- v0.3.0 Release Tracker (#493, commented on Mar 25, 2025; 0 new comments)
- Validate model/adapter is available on the model server before sending requests to a model server (#49, commented on Mar 25, 2025; 0 new comments)
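On #558, the discussion is about making upstream health checks pass only once models are actually servable, rather than once the server process is up. A minimal sketch of one way to probe this, assuming the model server exposes an OpenAI-compatible /v1/models listing (as vLLM's OpenAI-compatible server does); the base URL and model name below are placeholder assumptions, not values from this repository:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// modelList mirrors the shape of an OpenAI-style /v1/models response.
type modelList struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// modelServable reports whether the server currently lists the given
// model, i.e. whether requests for it could actually be served.
func modelServable(baseURL, model string) (bool, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(baseURL + "/v1/models")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var list modelList
	if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
		return false, err
	}
	for _, m := range list.Data {
		if m.ID == model {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Placeholder endpoint and model name for illustration only.
	ok, err := modelServable("http://localhost:8000", "example/model")
	fmt.Println(ok, err)
}
```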