MAF-19232: feat(e2e): use template on e2e tests#51

Merged
hhk7734 merged 11 commits into main from MAF-19232_e2e_use_template
Feb 10, 2026

Conversation

@ghost ghost commented Feb 4, 2026

No description provided.

…plates for meta-llama model

- Changed the test model to "meta-llama/Llama-3.2-1B-Instruct" and added a new test template "quickstart-vllm-meta-llama-llama-3.2-1b-instruct-amd-mi250-tp2" in e2e workflows.
- Removed deprecated InferenceServiceTemplates and updated the InferenceService data structure to include template references.
- Introduced new inference service YAML templates for performance and quality benchmarks, enhancing the testing framework.
- Updated environment variable handling and resource management in the e2e tests for improved clarity and maintainability.
@gitgod-bot gitgod-bot assigned ghost Feb 4, 2026
…mance and quality tests

- Added functions to create and delete model Persistent Volumes (PV) and Persistent Volume Claims (PVC) in the e2e testing framework.
- Updated performance and quality test cases to utilize the new PV and PVC management, enhancing resource handling during tests.
- Refactored existing code to remove deprecated PV and PVC creation methods, ensuring cleaner and more maintainable test scripts.
- Updated the ParseImage function to return an error if the image format is invalid, ensuring better validation.
- Modified the createInferencePerfJob function to handle parsing errors gracefully, improving robustness in performance tests.
- Reorganized the deletion of Heimdall in quality tests for better clarity and consistency.
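
The ParseImage change described above (reject an invalid image string instead of silently returning an empty tag) could look roughly like the following. This is a minimal sketch, not the code from the PR: the name `parseImage`, its signature, and the exact validation rules are assumptions, and digest references such as `repo@sha256:...` are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// parseImage is a hypothetical stand-in for the PR's ParseImage helper.
// It splits "repository:tag" and returns an error when no tag is present.
// Digest references (repo@sha256:...) are out of scope for this sketch.
func parseImage(image string) (repo, tag string, err error) {
	lastSlash := strings.LastIndex(image, "/")
	lastColon := strings.LastIndex(image, ":")
	// A colon that appears before the last slash is a registry port, not a tag.
	if lastColon <= lastSlash {
		return "", "", fmt.Errorf("invalid image %q: missing tag", image)
	}
	repo, tag = image[:lastColon], image[lastColon+1:]
	if repo == "" || tag == "" {
		return "", "", fmt.Errorf("invalid image %q: empty repository or tag", image)
	}
	return repo, tag, nil
}

func main() {
	// The image names below are made up for illustration.
	for _, image := range []string{"example.io/vllm:v0.1.0", "localhost:5000/vllm", "vllm:"} {
		repo, tag, err := parseImage(image)
		if err != nil {
			// A caller such as a perf-job builder can surface this instead of
			// launching a job with an empty tag.
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("repo=%s tag=%s\n", repo, tag)
	}
}
```

Returning the error lets the caller fail the test early with a readable message rather than creating a job that references a malformed image.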
… tests

- Changed the paths for ModelPV and ModelPVC constants to reflect the new directory structure, ensuring alignment with the latest configuration standards.
- This update enhances maintainability and clarity in the e2e testing framework.
…functions and streamline gateway service retrieval

- Eliminated the CreateInferenceServiceTemplate and DeleteInferenceServiceTemplate functions to reduce redundancy.
- Introduced a new GetGatewayServiceName function for improved clarity in retrieving the Gateway service name.
- Enhanced error handling in the GetGatewayServiceName function to ensure better feedback when the service is not found.
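
A lookup with explicit not-found handling, as the commit above describes for GetGatewayServiceName, might be sketched like this. Everything here is illustrative: the `service` type, the function name, and the choice to match on the `gateway.networking.k8s.io/gateway-name` label are assumptions, and the real helper most likely queries the cluster rather than an in-memory slice.

```go
package main

import (
	"errors"
	"fmt"
)

// gatewayNameLabel is assumed to be set on the Service by the gateway
// implementation; the PR does not say how the Service is identified.
const gatewayNameLabel = "gateway.networking.k8s.io/gateway-name"

// service is a hypothetical, trimmed-down view of a Kubernetes Service.
type service struct {
	Name   string
	Labels map[string]string
}

// gatewayServiceName returns the name of the Service that belongs to the
// given Gateway, or a descriptive error when none matches.
func gatewayServiceName(services []service, gatewayName string) (string, error) {
	if gatewayName == "" {
		return "", errors.New("gateway name must not be empty")
	}
	for _, svc := range services {
		if svc.Labels[gatewayNameLabel] == gatewayName {
			return svc.Name, nil
		}
	}
	return "", fmt.Errorf("no Service found for Gateway %q (checked %d services)", gatewayName, len(services))
}

func main() {
	services := []service{
		{Name: "unrelated"},
		{Name: "example-gateway-svc", Labels: map[string]string{gatewayNameLabel: "example-gateway"}},
	}
	name, err := gatewayServiceName(services, "example-gateway")
	fmt.Println(name, err)

	_, err = gatewayServiceName(services, "missing-gateway")
	fmt.Println(err)
}
```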
…AML templates for performance and quality tests

- Modified GetInferenceServiceData function to include an 'isKind' parameter for better context handling.
- Updated performance and quality test files to utilize the new parameter in data retrieval.
- Enhanced inference service YAML templates to conditionally include resource requests and limits based on the 'isKind' flag, improving resource management in tests.
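
The conditional rendering described above can be illustrated with Go's `text/template`. This is a sketch under assumptions: the struct fields, the manifest keys, and the direction of the condition (resources omitted when running on kind) are not confirmed by the PR, and the manifest is a simplified fragment rather than the real InferenceService schema.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// inferenceServiceData is a hypothetical subset of the data passed to the template.
type inferenceServiceData struct {
	Name        string
	TemplateRef string
	IsKind      bool
}

// manifest is an illustrative fragment, not the CRD's actual layout.
// The "{{-" markers trim the preceding newline so no blank lines are left behind.
const manifest = `name: {{ .Name }}
templateRef: {{ .TemplateRef }}
{{- if not .IsKind }}
resources:
  limits:
    amd.com/gpu: "2"
{{- end }}
`

// render executes the manifest template with the given data.
func render(data inferenceServiceData) (string, error) {
	tmpl, err := template.New("inference-service").Parse(manifest)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	for _, isKind := range []bool{true, false} {
		out, err := render(inferenceServiceData{Name: "svc", TemplateRef: "tpl", IsKind: isKind})
		if err != nil {
			panic(err)
		}
		fmt.Printf("isKind=%v\n%s---\n", isKind, out)
	}
}
```

Keeping the flag in the template data means the same template file serves both local kind runs and product-cluster runs.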
…te data retrieval for inference tests

- Added new environment variables for test templates: TEST_TEMPLATE_PREFILL and TEST_TEMPLATE_DECODE.
- Updated performance and quality test files to utilize the new template variables for data retrieval, improving clarity and maintainability in test configurations.
- Ensured consistency in the handling of inference service data across different test scenarios.
…ource management

- Introduced new YAML templates for Persistent Volume (PV) and Persistent Volume Claim (PVC) to support model storage in the e2e testing framework.
- Updated constants to reflect the new template paths, ensuring consistency and maintainability in test configurations.
- These additions improve the handling of storage resources during performance and quality tests.
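
A PV/PVC template pair of this kind might be rendered as shown below. This is only a sketch: the field values, the `hostPath` backing store, and the static binding through `volumeName` are assumptions chosen to make a small valid example, and the actual `model-pv.yaml.tmpl` and `model-pvc.yaml.tmpl` may differ.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"text/template"
)

// modelVolume is a hypothetical set of values shared by both templates.
type modelVolume struct {
	Name      string
	Namespace string
	Size      string
	HostPath  string
}

const pvTemplate = `apiVersion: v1
kind: PersistentVolume
metadata:
  name: {{ .Name }}
spec:
  capacity:
    storage: {{ .Size }}
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  hostPath:
    path: {{ .HostPath }}
`

// The claim pins itself to the volume above through spec.volumeName.
const pvcTemplate = `apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ .Name }}
  namespace: {{ .Namespace }}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: {{ .Name }}
  resources:
    requests:
      storage: {{ .Size }}
`

// renderManifest fills one of the templates with the volume data.
func renderManifest(text string, vol modelVolume) (string, error) {
	if vol.Name == "" {
		return "", errors.New("volume name must not be empty")
	}
	tmpl, err := template.New("manifest").Parse(text)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, vol); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	vol := modelVolume{Name: "model-store", Namespace: "e2e", Size: "50Gi", HostPath: "/mnt/models"}
	for _, text := range []string{pvTemplate, pvcTemplate} {
		out, err := renderManifest(text, vol)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s---\n", out)
	}
}
```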
Author

ghost commented Feb 10, 2026

When running the quality benchmark in the PD disaggregation environment, a tokenizer-related issue causes the error below, so for now I will switch the test to the aggregated environment.

(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482] AsyncLLM output_handler failed.
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482] Traceback (most recent call last):
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 459, in output_handler
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     processed_outputs = output_processor.process_outputs(
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/output_processor.py", line 428, in process_outputs
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     req_state.logprobs_processor.update_from_output(
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/logprobs.py", line 201, in update_from_output
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     self._update_prompt_logprobs(output.new_prompt_logprobs_tensors)
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/logprobs.py", line 113, in _update_prompt_logprobs
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     convert_ids_list_to_tokens(self.tokenizer,
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/detokenizer_utils.py", line 100, in convert_ids_list_to_tokens
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     token_str = tokenizer.decode([token_id])
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/transformers/tokenization_utils_base.py", line 3897, in decode
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     return self._decode(
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]            ^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/transformers/tokenization_utils_fast.py", line 682, in _decode
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482] OverflowError: out of range integral type conversion attempted

@ghost ghost marked this pull request as ready for review February 10, 2026 08:02
@ghost ghost self-requested a review as a code owner February 10, 2026 08:02
@ghost ghost requested review from bongwoobak, Copilot and hhk7734 February 10, 2026 08:02
Contributor

Copilot AI left a comment

Pull request overview

Updates the E2E test harness to create InferenceService resources using templateRefs directly (instead of creating/deleting InferenceServiceTemplate CRs), and centralizes model PV/PVC creation into reusable test utils. This aligns the E2E suites with the new “template-based” flow and updates CI inputs accordingly.

Changes:

  • Refactor quality/performance E2E suites to create InferenceService from new per-suite templates that use templateRefs.
  • Add reusable utils + templates for creating/deleting model PV/PVC for product-cluster runs.
  • Update E2E env var set (model + template names) and workflows for product-cluster executions.

Reviewed changes

Copilot reviewed 19 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/utils/settings/constants.go | Adds model PV/PVC template paths; removes legacy inference service template constants. |
| test/utils/models.go | New utils for creating/deleting model PV/PVC via templates. |
| test/utils/kind.go | Skips kind cluster creation if it already exists. |
| test/utils/inference_service.go | Simplifies InferenceServiceData to templateRefs-based flow; adds helper to fetch container image. |
| test/utils/common.go | Adds ParseImage helper for extracting image tag. |
| test/utils/config/model-pv.yaml.tmpl | New model PV template used by E2E. |
| test/utils/config/model-pvc.yaml.tmpl | New model PVC template used by E2E. |
| test/e2e/envs/env_vars.go | Adds TEST_TEMPLATE_PREFILL/DECODE env vars; updates defaults. |
| test/e2e/quality/quality_test.go | Switches to new inference-service template flow and utils-based PV/PVC management. |
| test/e2e/quality/config/inference-service.yaml.tmpl | New InferenceService template using templateRefs. |
| test/e2e/performance/performance_test.go | Switches to new inference-service template flow; uses live image tag for perf job. |
| test/e2e/performance/config/inference-service.yaml.tmpl | New InferenceService template using templateRefs. |
| test/e2e/performance/config/heimdall-values.yaml.tmpl | Updates Heimdall plugin chain / scheduling profiles for PD-style routing. |
| test/config/base/resources/inference-service-template-*.yaml.tmpl | Removes legacy InferenceServiceTemplate manifests no longer used by tests. |
| test/config/base/resources/inference-service-{prefill,decode}.yaml.tmpl | Removes legacy prefill/decode InferenceService manifests. |
| .github/workflows/e2e-quality-p-cluster.yaml | Updates quality workflow env vars (model + template name). |
| .github/workflows/e2e-pd-p-cluster.yaml | Updates PD workflow env vars (model + template names). |
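
The "skip kind cluster creation if it already exists" behavior noted for test/utils/kind.go could be approximated as below. This is a hedged sketch, not the file's actual contents: the function names are invented, and it assumes the check is done by parsing the output of `kind get clusters`, which prints one cluster name per line.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// clusterExists reports whether name appears as a full line in the output
// of `kind get clusters`. Partial matches are intentionally rejected.
func clusterExists(kindGetClustersOutput, name string) bool {
	for _, line := range strings.Split(kindGetClustersOutput, "\n") {
		if strings.TrimSpace(line) == name {
			return true
		}
	}
	return false
}

// ensureCluster creates the kind cluster only when it is not already present.
// It shells out to the kind CLI, so it requires kind on the PATH.
func ensureCluster(name string) error {
	out, err := exec.Command("kind", "get", "clusters").Output()
	if err != nil {
		return fmt.Errorf("list kind clusters: %w", err)
	}
	if clusterExists(string(out), name) {
		return nil // reuse the existing cluster
	}
	if err := exec.Command("kind", "create", "cluster", "--name", name).Run(); err != nil {
		return fmt.Errorf("create kind cluster %q: %w", name, err)
	}
	return nil
}

func main() {
	// Demonstrates only the parsing step, so the example runs without kind installed.
	sample := "kind\ne2e-cluster\n"
	fmt.Println(clusterExists(sample, "e2e-cluster"))
	fmt.Println(clusterExists(sample, "e2e"))
}
```

Splitting the parsing from the CLI call keeps the existence check testable without a container runtime.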

Comment thread .github/workflows/e2e-quality-p-cluster.yaml
Comment thread test/utils/kind.go
Comment thread test/e2e/quality/quality_test.go
Comment thread test/e2e/performance/performance_test.go
Comment thread test/utils/models.go Outdated
Comment thread test/e2e/performance/config/inference-service.yaml.tmpl Outdated
Comment thread test/e2e/quality/config/inference-service.yaml.tmpl Outdated
…larity

- Revised comments in CreateModelPV and DeleteModelPV functions to specify that they handle PersistentVolumes, enhancing clarity for future developers.
…ence service templates

- Eliminated the 'parallelism' section from the performance and quality inference service YAML templates to streamline configuration and improve clarity.
Copilot AI review requested due to automatic review settings February 10, 2026 08:14
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 21 changed files in this pull request and generated 5 comments.

Comment thread test/e2e/quality/quality_test.go
Comment thread test/e2e/performance/performance_test.go
Comment thread test/e2e/quality/config/inference-service.yaml.tmpl
Comment thread test/e2e/performance/config/inference-service.yaml.tmpl
Comment thread test/utils/settings/constants.go
Member

hhk7734 commented Feb 10, 2026

Are the tests finished?

Author

ghost commented Feb 10, 2026

> Are the tests finished?

Yes, both performance and quality passed.

@hhk7734 hhk7734 merged commit b4080c0 into main Feb 10, 2026
8 of 9 checks passed
@hhk7734 hhk7734 deleted the MAF-19232_e2e_use_template branch February 10, 2026 08:27