Open
Labels
area/test-and-release, enhancement, help wanted, priority/P2
Description
Add a new production-stack profile to the E2E testing framework to test vLLM Production Stack configurations with Semantic Router.
Background
The E2E testing framework introduced in #655 provides an extensible profile-based architecture. We need to add a production-stack profile to test Semantic Router deployment and functionality in production-grade vLLM stack environments.
Tasks
- Create e2e/profiles/production-stack/ directory structure
- Implement Profile interface for production-stack
  - Setup: Deploy vLLM production stack components
  - Setup: Deploy Semantic Router with production configurations
  - Setup: Configure high availability and load balancing
  - Setup: Configure monitoring and observability
  - Teardown: Clean up production stack resources
- Implement test cases:
  - Multi-replica deployment health check
  - Load balancing verification
  - High availability failover testing
  - Performance and throughput testing
  - Resource utilization monitoring
- Add documentation for production-stack profile usage
- Update CI workflow to run production-stack tests
Implementation Details
Profile Structure
type Profile struct {
	verbose bool
}

func (p *Profile) Setup(ctx context.Context, opts *framework.SetupOptions) error {
	// 1. Deploy vLLM production stack (multiple replicas, load balancer)
	// 2. Deploy Semantic Router with production settings
	// 3. Configure monitoring (Prometheus, Grafana)
	// 4. Configure high availability settings
	return nil // placeholder until the steps above are implemented
}
Test Cases
- Multi-Replica Health Check: Verify all replicas are healthy and serving
- Load Balancing: Test request distribution across replicas
- Failover: Verify graceful failover when a replica fails
- Performance: Measure throughput and latency under load
- Resource Monitoring: Check CPU, memory, and GPU utilization
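The load balancing case above could be checked by tallying which replica served each request and comparing every replica's share with an even split. This is a minimal sketch, assuming the test can attribute each response to a replica (for example via a pod name in a response header, which is an assumption, not something the framework is known to provide). `isBalanced` and the tolerance value are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// isBalanced reports whether every replica's share of requests is within
// tolerance (a fraction such as 0.10) of the ideal even share 1/len(counts).
// Hypothetical helper for illustration; not part of the E2E framework.
func isBalanced(counts map[string]int, tolerance float64) bool {
	if len(counts) == 0 {
		return false
	}
	total := 0
	for _, c := range counts {
		total += c
	}
	if total == 0 {
		return false // no traffic observed, nothing to verify
	}
	ideal := 1.0 / float64(len(counts))
	for _, c := range counts {
		share := float64(c) / float64(total)
		if math.Abs(share-ideal) > tolerance {
			return false
		}
	}
	return true
}

func main() {
	// Made-up tallies for three replicas after 100 requests.
	counts := map[string]int{"replica-0": 34, "replica-1": 33, "replica-2": 33}
	fmt.Println("balanced:", isBalanced(counts, 0.10))
}
```

The map should be pre-seeded with every expected replica at zero. Otherwise a replica that never receives traffic is missing from the tally and the check passes on the remaining ones.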
Acceptance Criteria
- Production-stack profile can be run with make e2e-test PROFILE=production-stack
- All test cases pass successfully
- Documentation is complete and clear
- CI integration works correctly
- Performance benchmarks are documented
References
- E2E Framework PR: [Feat] Add automate e2e test framework for extensible integration tests #655
- vLLM Production Stack Documentation: (add link)
- E2E Framework README: e2e/README.md
Related Issues
Part of the E2E testing framework expansion effort.
- Related to [E2E] Add Istio profile for E2E testing framework #656 (Istio profile)