
Conversation

sjberman
Collaborator

This commit adds support for the control plane to watch InferencePools. A feature flag has been added to enable/disable processing these resources. By default, it is disabled.
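For reference, enabling the feature would look roughly like the values override below. The actual flag and value names aren't spelled out in this description, so the keys here are purely illustrative.

```yaml
# Hypothetical Helm values override -- the real key for this feature flag lives in
# the chart and may be named differently. This only illustrates that InferencePool
# processing is opt-in and disabled by default.
nginxGateway:
  gwAPIInferenceExtension:
    enable: true
```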

When an HTTPRoute references an InferencePool, we will create a headless Service associated with that InferencePool, and reference it internally in the graph config for that Route. This allows us to use all of our existing logic to get the endpoints and build the proper nginx config for those endpoints.
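As a rough sketch (resource names, ports, and the InferencePool API group/version below come from the Gateway API Inference Extension and are illustrative; the generated Service is only an approximation of what the control plane creates):

```yaml
# HTTPRoute whose backendRef points at an InferencePool instead of a Service.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
  - name: gateway
  rules:
  - backendRefs:
    - group: inference.networking.x-k8s.io   # Gateway API Inference Extension group
      kind: InferencePool
      name: vllm-llama-pool
---
# Approximate shape of the headless Service the control plane creates for the pool
# so the existing endpoint-resolution logic can be reused. The name, selector, and
# port are placeholders; the real object is derived from the InferencePool's spec.
apiVersion: v1
kind: Service
metadata:
  name: vllm-llama-pool-headless
spec:
  clusterIP: None        # headless: we only need the endpoints, not a virtual IP
  selector:
    app: vllm-llama
  ports:
  - port: 8000
```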

In a future commit, the nginx config will be updated to handle the proper load balancing for the AI workloads; for now, we fall back to our default behavior of proxy_passing to the upstream.
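In other words, the interim generated config amounts to a plain upstream plus a proxy_pass, along the lines of the simplified sketch below (upstream name, addresses, and location are made up; the real output contains many more directives):

```nginx
# Simplified sketch: endpoints resolved via the headless Service become ordinary
# upstream servers, and requests are proxied with the default (round-robin) method.
upstream default_vllm-llama-pool_8000 {
    server 10.0.1.12:8000;
    server 10.0.1.13:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://default_vllm-llama-pool_8000;
    }
}
```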

Testing: Manually verified

  • a single InferencePool results in a headless Service and the proper nginx config
  • multiple InferencePools result in multiple Services and the proper nginx config
  • Services are cleaned up when their InferencePool is deleted
  • backendRef conditions are set properly if an InferencePool doesn't exist
  • ReferenceGrants are used properly for InferencePools in different namespaces (see the sketch below)
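For the cross-namespace case, the grant looks roughly like this (names and namespaces are placeholders):

```yaml
# Illustrative ReferenceGrant allowing HTTPRoutes in "default" to reference
# InferencePools in the "inference" namespace. Without it, the backendRef is rejected.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-routes-to-inference-pools
  namespace: inference            # namespace of the referenced InferencePool
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    namespace: default            # namespace of the referring HTTPRoute
  to:
  - group: inference.networking.x-k8s.io
    kind: InferencePool
```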

Closes #3835

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

Release notes

If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.


@github-actions github-actions bot added documentation Improvements or additions to documentation enhancement New feature or request dependencies Pull requests that update a dependency file helm-chart Relates to helm chart labels Sep 11, 2025
@sjberman sjberman marked this pull request as ready for review September 11, 2025 19:26
@sjberman sjberman requested a review from a team as a code owner September 11, 2025 19:26
@sjberman sjberman force-pushed the feat/inference-pools branch from dc5e130 to 7ca30f8 Compare September 11, 2025 20:20
@sjberman sjberman force-pushed the feat/inference-pools branch from 7ca30f8 to dcde38c Compare September 11, 2025 20:28
@sjberman sjberman requested a review from salonichf5 September 15, 2025 15:01
@sjberman sjberman force-pushed the feat/inference-pools branch from e5b2167 to 53872cf Compare September 15, 2025 15:02
@sjberman sjberman force-pushed the feat/inference-pools branch from 26f2966 to 5162110 Compare September 15, 2025 19:16
@sjberman sjberman force-pushed the feat/inference-pools branch from 5162110 to 2dc7e03 Compare September 15, 2025 19:17
Contributor

@salonichf5 salonichf5 left a comment


just had one last comment but it looks good to me otherwise

@github-project-automation github-project-automation bot moved this from 🆕 New to 🏗 In Progress in NGINX Gateway Fabric Sep 15, 2025
@sjberman sjberman requested a review from bjee19 September 16, 2025 15:02
Contributor

@bjee19 bjee19 left a comment


great job!

@sjberman sjberman merged commit e9a3568 into feat/inference-extension Sep 16, 2025
45 checks passed
@sjberman sjberman deleted the feat/inference-pools branch September 16, 2025 18:07
@github-project-automation github-project-automation bot moved this from 🏗 In Progress to ✅ Done in NGINX Gateway Fabric Sep 16, 2025
sjberman added a commit that referenced this pull request Sep 18, 2025
salonichf5 pushed a commit that referenced this pull request Oct 2, 2025
salonichf5 pushed a commit that referenced this pull request Oct 15, 2025
salonichf5 pushed a commit that referenced this pull request Oct 15, 2025
ciarams87 pushed a commit that referenced this pull request Oct 16, 2025
ciarams87 pushed a commit that referenced this pull request Oct 17, 2025
ciarams87 pushed a commit that referenced this pull request Oct 17, 2025