Feature: Add a new "model-proxy" service to redirect requests to all available model serving jobs in the cluster #61
Pull Request Overview
This PR introduces a new model-proxy service to redirect requests to available model serving jobs in a Kubernetes cluster. The service acts as a reverse proxy with authentication, load balancing, and request tracing capabilities.
Key Changes
- Added a comprehensive Go-based model-proxy service with authentication, load balancing, and request tracing
- Integrated the model-proxy service into the existing pylon reverse proxy configuration
- Provided complete deployment infrastructure including Kubernetes manifests, Docker builds, and management scripts
Reviewed Changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 8 comments.
File | Description |
---|---|
src/pylon/deploy/pylon.yaml.template | Adds environment variable for MODEL_PROXY_URI |
src/pylon/deploy/pylon-config/location.conf.template | Adds nginx location block for model-proxy routing |
src/model-proxy/src/types/*.go | Defines core data structures for requests, responses, configs, and tracing |
src/model-proxy/src/trace/trace.go | Implements JSON file-based request/response logging |
src/model-proxy/src/proxy/*.go | Core proxy functionality including authentication, load balancing, and model server discovery |
src/model-proxy/src/main.go | Main entry point with command-line configuration |
src/model-proxy/deploy/* | Kubernetes deployment templates and management scripts |
src/model-proxy/config/* | Service configuration files and validation logic |
src/model-proxy/build/model-proxy.common.dockerfile | Docker build configuration |
Please add a description, arguments, and return value for each function.
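For reference, one way to satisfy this request in Go is a doc comment that starts with the function name and then lists arguments and return values. The function below, `selectEndpoint`, is a hypothetical example written for this comment and is not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoEndpoints is returned when there is nothing to select from.
var errNoEndpoints = errors.New("no endpoints available")

// selectEndpoint picks one endpoint from the list in rotation.
//
// Args:
//   - endpoints: candidate base URLs of model serving jobs; may be empty.
//   - counter: a monotonically increasing request counter used as the
//     rotation index.
//
// Returns:
//   - the endpoint at position counter modulo len(endpoints).
//   - errNoEndpoints if endpoints is empty.
func selectEndpoint(endpoints []string, counter uint64) (string, error) {
	if len(endpoints) == 0 {
		return "", errNoEndpoints
	}
	return endpoints[counter%uint64(len(endpoints))], nil
}

func main() {
	endpoints := []string{"http://job-a:8000", "http://job-b:8000"}
	for i := uint64(0); i < 3; i++ {
		ep, err := selectEndpoint(endpoints, i)
		fmt.Println(ep, err)
	}
}
```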
Add a new "model-proxy" service to redirect requests to all available model serving jobs in the cluster
This pull request introduces the initial implementation of the model-proxy service, including its deployment configuration, core source code, and supporting scripts. The changes lay the foundation for running a model proxy as a Kubernetes DaemonSet, handling authentication, load balancing, and configuration management.
Core service implementation:
- Main entry point (main.go), authentication handler (authenticator.go), and load balancer (load_balancer.go). These files provide the basic proxy functionality, request authentication, and endpoint selection logic.
- Configuration files (config.json.example, model-proxy.yaml) and Go module definition (go.mod) to support service setup and local development.
Deployment and orchestration:
- Kubernetes manifests (model-proxy.yaml.template, service.yaml) and a Dockerfile for building the service container. These files enable the model-proxy to be deployed as a DaemonSet and managed within a cluster.
- Management scripts (start.sh, stop.sh, refresh.sh, delete.sh) to automate service lifecycle operations such as startup, shutdown, refresh, and cleanup.
Configuration and validation:
- model_proxy.py to merge service configs and ensure correct types for critical parameters (e.g., port).
Development tooling:
- A .gitignore file to exclude build artifacts, logs, and virtual environments from source control.
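The config step described above (merging service configs and making sure the port has the right type) lives in Python in model_proxy.py. The sketch below expresses the same idea in Go, purely as an illustration; `mergeConfig`, `normalizePort`, the config keys, and the example values are assumptions rather than the PR's code.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// mergeConfig returns a shallow merge in which overrides win over defaults.
func mergeConfig(defaults, overrides map[string]any) map[string]any {
	merged := make(map[string]any, len(defaults)+len(overrides))
	for k, v := range defaults {
		merged[k] = v
	}
	for k, v := range overrides {
		merged[k] = v
	}
	return merged
}

// normalizePort coerces a config value to an int port. It accepts an int,
// a whole-number float64 (what encoding/json yields for numbers), or a
// numeric string, and rejects anything outside 1..65535.
func normalizePort(v any) (int, error) {
	var port int
	switch p := v.(type) {
	case int:
		port = p
	case float64:
		if p != math.Trunc(p) {
			return 0, fmt.Errorf("port %v is not a whole number", p)
		}
		port = int(p)
	case string:
		n, err := strconv.Atoi(strings.TrimSpace(p))
		if err != nil {
			return 0, fmt.Errorf("port %q is not numeric: %w", p, err)
		}
		port = n
	default:
		return 0, fmt.Errorf("port has unsupported type %T", v)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d is out of range", port)
	}
	return port, nil
}

func main() {
	// Hypothetical keys and values, chosen only for the example.
	defaults := map[string]any{"port": 9999, "log-level": "info"}
	overrides := map[string]any{"port": "8080"}

	merged := mergeConfig(defaults, overrides)
	port, err := normalizePort(merged["port"])
	if err != nil {
		panic(err)
	}
	merged["port"] = port
	fmt.Println(merged["port"], merged["log-level"])
}
```

Coercing the value once, right after the merge, lets later consumers of the config rely on the port being an integer.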