MAF-19232: feat(e2e): use template on e2e tests · Pull Request #51 · moreh-dev/mif

ghost · 2026-02-04T06:51:00Z

No description provided.

…plates for meta-llama model
- Changed the test model to "meta-llama/Llama-3.2-1B-Instruct" and added a new test template "quickstart-vllm-meta-llama-llama-3.2-1b-instruct-amd-mi250-tp2" in e2e workflows.
- Removed deprecated InferenceServiceTemplates and updated the InferenceService data structure to include template references.
- Introduced new inference service YAML templates for performance and quality benchmarks, enhancing the testing framework.
- Updated environment variable handling and resource management in the e2e tests for improved clarity and maintainability.

…mance and quality tests
- Added functions to create and delete model Persistent Volumes (PV) and Persistent Volume Claims (PVC) in the e2e testing framework.
- Updated performance and quality test cases to utilize the new PV and PVC management, enhancing resource handling during tests.
- Refactored existing code to remove deprecated PV and PVC creation methods, ensuring cleaner and more maintainable test scripts.

- Updated the ParseImage function to return an error if the image format is invalid, ensuring better validation.
- Modified the createInferencePerfJob function to handle parsing errors gracefully, improving robustness in performance tests.
- Reorganized the deletion of Heimdall in quality tests for better clarity and consistency.
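The validation described in this commit could look roughly like the sketch below. It is not the code from `test/utils/common.go`: the signature, the `repository:tag` input format, and the error text are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseImage splits an image reference of the assumed form "repository:tag".
// Hypothetical sketch; the real helper in the PR may use a different signature.
func ParseImage(image string) (repo, tag string, err error) {
	// Drop a digest suffix such as "@sha256:..." if one is present.
	if i := strings.Index(image, "@"); i >= 0 {
		image = image[:i]
	}
	// The tag separator must be a colon after the last slash, so that a
	// registry port ("localhost:5000/app") is not mistaken for a tag.
	slash := strings.LastIndex(image, "/")
	colon := strings.LastIndex(image, ":")
	if colon <= slash || colon == 0 || colon == len(image)-1 {
		return "", "", fmt.Errorf("invalid image format %q: expected repository:tag", image)
	}
	return image[:colon], image[colon+1:], nil
}

func main() {
	// The image names below are placeholders, not values from the PR.
	for _, img := range []string{"registry.example.com:5000/vllm:v1.2.3", "localhost:5000/app"} {
		repo, tag, err := ParseImage(img)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("repo=%s tag=%s\n", repo, tag)
	}
}
```

Returning an error instead of an empty tag lets a caller such as createInferencePerfJob fail the test early with a clear message rather than launching a job with a malformed image.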

… tests
- Changed the paths for ModelPV and ModelPVC constants to reflect the new directory structure, ensuring alignment with the latest configuration standards.
- This update enhances maintainability and clarity in the e2e testing framework.

…functions and streamline gateway service retrieval
- Eliminated the CreateInferenceServiceTemplate and DeleteInferenceServiceTemplate functions to reduce redundancy.
- Introduced a new GetGatewayServiceName function for improved clarity in retrieving the Gateway service name.
- Enhanced error handling in the GetGatewayServiceName function to ensure better feedback when the service is not found.

…AML templates for performance and quality tests
- Modified GetInferenceServiceData function to include an 'isKind' parameter for better context handling.
- Updated performance and quality test files to utilize the new parameter in data retrieval.
- Enhanced inference service YAML templates to conditionally include resource requests and limits based on the 'isKind' flag, improving resource management in tests.
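As a rough illustration of the conditional rendering mentioned here, the sketch below uses Go's `text/template` to emit a resources block only when the flag is false. The struct, the YAML fragment, the `amd.com/gpu` value, and the direction of the condition (resources omitted on kind) are assumptions, not the contents of the PR's templates.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// InferenceServiceData is a hypothetical stand-in for the data passed to the template.
type InferenceServiceData struct {
	Name   string
	IsKind bool
}

// fragment is an illustrative YAML snippet, not the template from the PR.
// The "{{-" markers trim the preceding newline so no blank lines are emitted.
const fragment = `name: {{ .Name }}
{{- if not .IsKind }}
resources:
  limits:
    amd.com/gpu: "2"
{{- end }}
`

// Render executes the fragment with the given data and returns the resulting YAML.
func Render(data InferenceServiceData) (string, error) {
	tmpl, err := template.New("isvc").Parse(fragment)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	for _, isKind := range []bool{true, false} {
		out, err := Render(InferenceServiceData{Name: "example", IsKind: isKind})
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("isKind=%v\n%s---\n", isKind, out)
	}
}
```

Keeping the branch inside the template means one file can serve both a local kind cluster and a GPU cluster, at the cost of slightly less readable YAML.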

…te data retrieval for inference tests
- Added new environment variables for test templates: TEST_TEMPLATE_PREFILL and TEST_TEMPLATE_DECODE.
- Updated performance and quality test files to utilize the new template variables for data retrieval, improving clarity and maintainability in test configurations.
- Ensured consistency in the handling of inference service data across different test scenarios.
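A minimal sketch of how the two variables might be read with fallbacks is shown below. Only the variable names TEST_TEMPLATE_PREFILL and TEST_TEMPLATE_DECODE come from the commit; the struct, the helper, and the default values are placeholders and do not reflect `test/e2e/envs/env_vars.go`.

```go
package main

import (
	"fmt"
	"os"
)

// Placeholder defaults; the real defaults in the repository are not shown in this PR page.
const (
	defaultPrefillTemplate = "example-prefill-template"
	defaultDecodeTemplate  = "example-decode-template"
)

// TemplateNames holds the template names used for the prefill and decode services.
type TemplateNames struct {
	Prefill string
	Decode  string
}

// LoadTemplateNames reads both variables through lookup, falling back to the
// defaults when a variable is unset or empty. Passing the lookup function in
// keeps the logic testable without touching the process environment.
func LoadTemplateNames(lookup func(string) (string, bool)) TemplateNames {
	get := func(key, fallback string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return fallback
	}
	return TemplateNames{
		Prefill: get("TEST_TEMPLATE_PREFILL", defaultPrefillTemplate),
		Decode:  get("TEST_TEMPLATE_DECODE", defaultDecodeTemplate),
	}
}

func main() {
	names := LoadTemplateNames(os.LookupEnv)
	fmt.Printf("prefill=%s decode=%s\n", names.Prefill, names.Decode)
}
```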

…ource management
- Introduced new YAML templates for Persistent Volume (PV) and Persistent Volume Claim (PVC) to support model storage in the e2e testing framework.
- Updated constants to reflect the new template paths, ensuring consistency and maintainability in test configurations.
- These additions improve the handling of storage resources during performance and quality tests.

ghost · 2026-02-10T07:29:26Z

When running the quality benchmark in a PD disaggregation environment, a tokenizer-related issue causes the error below, so for now I am switching the test to an aggregated environment.

(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482] AsyncLLM output_handler failed.
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482] Traceback (most recent call last):
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 459, in output_handler
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     processed_outputs = output_processor.process_outputs(
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/output_processor.py", line 428, in process_outputs
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     req_state.logprobs_processor.update_from_output(
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/logprobs.py", line 201, in update_from_output
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     self._update_prompt_logprobs(output.new_prompt_logprobs_tensors)
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/logprobs.py", line 113, in _update_prompt_logprobs
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     convert_ids_list_to_tokens(self.tokenizer,
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/detokenizer_utils.py", line 100, in convert_ids_list_to_tokens
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     token_str = tokenizer.decode([token_id])
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/transformers/tokenization_utils_base.py", line 3897, in decode
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     return self._decode(
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]            ^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]   File "/usr/local/lib/python3.12/dist-packages/transformers/tokenization_utils_fast.py", line 682, in _decode
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]     text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 02-10 07:25:42 [async_llm.py:482] OverflowError: out of range integral type conversion attempted

Copilot

Pull request overview

Updates the E2E test harness to create InferenceService resources using templateRefs directly (instead of creating/deleting InferenceServiceTemplate CRs), and centralizes model PV/PVC creation into reusable test utils. This aligns the E2E suites with the new “template-based” flow and updates CI inputs accordingly.

Changes:

- Refactor quality/performance E2E suites to create InferenceService from new per-suite templates that use templateRefs.
- Add reusable utils + templates for creating/deleting model PV/PVC for product-cluster runs.
- Update E2E env var set (model + template names) and workflows for product-cluster executions.

Reviewed changes

Copilot reviewed 19 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `test/utils/settings/constants.go` | Adds model PV/PVC template paths; removes legacy inference service template constants. |
| `test/utils/models.go` | New utils for creating/deleting model PV/PVC via templates. |
| `test/utils/kind.go` | Skips kind cluster creation if it already exists. |
| `test/utils/inference_service.go` | Simplifies `InferenceServiceData` to `templateRefs`-based flow; adds helper to fetch container image. |
| `test/utils/common.go` | Adds `ParseImage` helper for extracting image tag. |
| `test/utils/config/model-pv.yaml.tmpl` | New model PV template used by E2E. |
| `test/utils/config/model-pvc.yaml.tmpl` | New model PVC template used by E2E. |
| `test/e2e/envs/env_vars.go` | Adds `TEST_TEMPLATE_PREFILL/DECODE` env vars; updates defaults. |
| `test/e2e/quality/quality_test.go` | Switches to new inference-service template flow and utils-based PV/PVC management. |
| `test/e2e/quality/config/inference-service.yaml.tmpl` | New `InferenceService` template using `templateRefs`. |
| `test/e2e/performance/performance_test.go` | Switches to new inference-service template flow; uses live image tag for perf job. |
| `test/e2e/performance/config/inference-service.yaml.tmpl` | New `InferenceService` template using `templateRefs`. |
| `test/e2e/performance/config/heimdall-values.yaml.tmpl` | Updates Heimdall plugin chain / scheduling profiles for PD-style routing. |
| `test/config/base/resources/inference-service-template-*.yaml.tmpl` | Removes legacy `InferenceServiceTemplate` manifests no longer used by tests. |
| `test/config/base/resources/inference-service-{prefill,decode}.yaml.tmpl` | Removes legacy prefill/decode `InferenceService` manifests. |
| `.github/workflows/e2e-quality-p-cluster.yaml` | Updates quality workflow env vars (model + template name). |
| `.github/workflows/e2e-pd-p-cluster.yaml` | Updates PD workflow env vars (model + template names). |
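
The `test/utils/kind.go` entry above describes skipping cluster creation when the cluster already exists. One way such a check could be written is sketched below, based on the `kind get clusters` command, which prints one cluster name per line. The function names and the cluster name `mif-e2e` are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// clusterExists reports whether name appears as a full line in the output of
// `kind get clusters`.
func clusterExists(output, name string) bool {
	for _, line := range strings.Split(output, "\n") {
		if strings.TrimSpace(line) == name {
			return true
		}
	}
	return false
}

// ensureCluster creates the kind cluster only when it is not already listed.
// It shells out to the kind CLI, which must be on PATH.
func ensureCluster(name string) error {
	out, err := exec.Command("kind", "get", "clusters").Output()
	if err != nil {
		return fmt.Errorf("listing kind clusters: %w", err)
	}
	if clusterExists(string(out), name) {
		return nil // reuse the existing cluster
	}
	return exec.Command("kind", "create", "cluster", "--name", name).Run()
}

func main() {
	// Demonstrates only the pure check, so running this does not create a cluster.
	sample := "kind\nmif-e2e\n"
	fmt.Println(clusterExists(sample, "mif-e2e"))
}
```

Matching whole lines rather than substrings avoids treating a cluster named `mif-e2e-2` as a match for `mif-e2e`.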

Copilot

Pull request overview

Copilot reviewed 19 out of 21 changed files in this pull request and generated 5 comments.

hhk7734 · 2026-02-10T08:23:41Z

Are the tests finished?

ghost · 2026-02-10T08:26:16Z

> Are the tests finished?

Yes, both the performance and quality tests passed.

gitgod-bot assigned ghost Feb 4, 2026

seongsu-dev added 7 commits February 4, 2026 16:00

MAF-19232: refactor(e2e): Disable PD disaggregation on Quality test

3ba4d41

ghost marked this pull request as ready for review February 10, 2026 08:02

ghost self-requested a review as a code owner February 10, 2026 08:02

ghost requested review from bongwoobak, Copilot and hhk7734 February 10, 2026 08:02

Copilot started reviewing on behalf of ghost February 10, 2026 08:02

Copilot AI reviewed Feb 10, 2026

Comment thread .github/workflows/e2e-quality-p-cluster.yaml

Comment thread test/utils/kind.go

Comment thread test/e2e/quality/quality_test.go

Comment thread test/e2e/performance/performance_test.go

Comment thread test/utils/models.go Outdated

hhk7734 requested changes Feb 10, 2026

Comment thread test/e2e/performance/config/inference-service.yaml.tmpl Outdated

Comment thread test/e2e/quality/config/inference-service.yaml.tmpl Outdated

seongsu-dev added 2 commits February 10, 2026 17:13

MAF-19232: refactor(e2e): Update model PV and PVC documentation for c…

58e17a3

…larity - Revised comments in CreateModelPV and DeleteModelPV functions to specify that they handle PersistentVolumes, enhancing clarity for future developers.

MAF-19232: refactor(e2e): Remove parallelism configuration from infer…

63b2870

…ence service templates - Eliminated the 'parallelism' section from the performance and quality inference service YAML templates to streamline configuration and improve clarity.

Copilot AI review requested due to automatic review settings February 10, 2026 08:14

Copilot started reviewing on behalf of ghost February 10, 2026 08:14

Copilot AI reviewed Feb 10, 2026

Comment thread test/e2e/quality/quality_test.go

Comment thread test/e2e/performance/performance_test.go

Comment thread test/e2e/quality/config/inference-service.yaml.tmpl

Comment thread test/e2e/performance/config/inference-service.yaml.tmpl

Comment thread test/utils/settings/constants.go

hhk7734 approved these changes Feb 10, 2026

hhk7734 merged commit b4080c0 into main Feb 10, 2026
8 of 9 checks passed

hhk7734 deleted the MAF-19232_e2e_use_template branch February 10, 2026 08:27

hhk7734 merged 11 commits into main from MAF-19232_e2e_use_template

3 participants
