Available to run simple arch by george-wu20250203 · Pull Request #38 · otterscale/model-operator

george-wu20250203 · 2026-03-17T00:54:47Z

No description provided.

gemini-code-assist · 2026-03-17T00:55:17Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and ease of use of the ModelService operator. It introduces a more flexible model artifact provisioning mechanism, provides sensible defaults for EPP and HTTPRoute configurations, and refines resource naming and labeling for better Kubernetes integration. These changes aim to make the operator more resilient to transient resource deletions and simplify the deployment of inference services.

Highlights

Model Artifact Provisioning: The method for provisioning model artifacts has been updated from using Kubernetes ImageVolumeSource to employing an init container that utilizes the kit tool to unpack OCI ModelPacks into an emptyDir volume. This change enhances compatibility and flexibility across different Kubernetes environments and custom registries.
Default Endpoint Picker (EPP) and HTTPRoute Creation: The operator now automatically creates a default Endpoint Picker (EPP) Deployment, Service, ConfigMap, ServiceAccount, and RBAC, even when spec.inferencePool is not explicitly defined. Additionally, a default HTTPRoute is created if spec.httpRoute is not set and a defaultGatewayName is provided, simplifying initial setup.
Enhanced Naming and Labeling Conventions: A new SanitizeDNS1035Label helper function has been introduced, and various resource naming functions have been updated to ensure DNS-1035 compliance. New labels (llm-d.ai/inferenceServing, llm-d.ai/model, inferencepool) were added to improve resource selection and management within the cluster.
Simplified EPP Configuration: The default EPP ConfigMap has been streamlined by removing metrics-data-source and core-metrics-extractor plugins. This simplifies the EPP setup and increases compatibility with EPP images that may not register these specific plugins.
Improved Resource Reconciliation Resilience: Logic has been added to recreate Kubernetes resources if they are detected as deleted between a Get and Update operation, enhancing the operator's ability to maintain desired states. Special handling for Istio DestinationRule deep-equality checks was also introduced to prevent panics.

Changelog

cmd/main.go
- Added a new command-line flag --default-gateway-name to specify the default HTTPRoute parentRef gateway.
- Passed the defaultGatewayName and kitImage to the ModelService controller's EPP configuration.
internal/controller/modelservice_controller.go
- Added KitImage field to the ModelServiceReconciler struct.
- Updated EnsureDecodeDeployment and EnsurePrefillDeployment calls to include the kitImage parameter.
- Modified EnsureHTTPRoute call to pass the EPPConfig for default HTTPRoute creation logic.
internal/controller/modelservice_controller_test.go
- Configured KitImage in the ModelServiceReconciler for tests.
- Updated deployment tests to reflect the new model provisioning mechanism using emptyDir volumes and an init container instead of ImageVolumeSource.
internal/controller/suite_test.go
- Added a minimal InferencePool CustomResourceDefinition (CRD) YAML for envtest to enable testing of InferencePool resource creation.
- Implemented logic to create a temporary directory and write the InferencePool CRD for envtest setup and cleanup.
internal/modelservice/deployment.go
- Introduced constants ModelTmpVolumeName and ModelTmpMountPath for temporary model storage.
- Modified BuildDeployment to accept kitImage as a parameter.
- Refactored buildInitContainers to include a model-unpack init container that uses kit to unpack OCI ModelPacks into an emptyDir volume.
- Added buildModelUnpackInitContainer to create the model unpacking init container, supporting imagePullSecrets.
- Updated buildVolumes to use emptyDir volumes for model storage and a secret volume for Docker config when imagePullSecrets are present.
internal/modelservice/epp_config.go
- Added DefaultGatewayName field to the EPPConfig struct to support default HTTPRoute creation.
internal/modelservice/epp_configmap.go
- Removed metrics-data-source and core-metrics-extractor plugins from the defaultPluginsConfig to simplify EPP setup.
internal/modelservice/epp_configmap_test.go
- Updated tests to reflect the removal of metrics-data-source and core-metrics-extractor from the default EPP configuration.
internal/modelservice/epp_deployment.go
- Added DefaultEPPImage constant for the default Endpoint Picker container image.
- Removed poolName from BuildEPPDeployment parameters, as it is now derived internally.
- Introduced PoolGroupAPIGroup constant and passed it as an argument to the EPP container.
- Removed the --metrics-port argument from the EPP container command-line arguments.
- Updated Selector and Template.Labels in the EPP Deployment to include LabelInferencePool.
internal/modelservice/epp_service.go
- Added DefaultEPPExtProcPort constant for the default EPP external processing port.
- Updated BuildEPPService to use EPPNameForService for the service name and to specify TargetPort for service ports.
internal/modelservice/helpers.go
- Added strings and unicode imports for string manipulation.
- Introduced new constants: DockerConfigVolumeName, DockerConfigMountPath, LabelInferenceServing, LabelModel, and LabelInferencePool.
- Implemented SanitizeDNS1035Label function to create DNS-1035 compliant labels.
- Updated EPPName, EPPConfigMapName, EPPSecretName, EPPServiceMonitorName, and EPPClusterRBACName to use SanitizeDNS1035Label.
- Modified SelectorLabelsForRole and PodLabelsForRole to include new labels for improved selection.
- Added HTTPRouteLabels and InferencePoolLabels helper functions.
- Updated InferencePoolSelectorLabels to use LabelInferenceServing and LabelModel.
- Adjusted EPPLabels and EPPSelectorLabels to use EPPName(msName) for consistency.
internal/modelservice/helpers_test.go
- Added new test cases for SanitizeDNS1035Label, EPPNameForService, InferencePoolLabels, and HTTPRouteLabels.
- Updated existing tests for InferencePoolSelectorLabels to reflect new label expectations.
internal/modelservice/httproute.go
- Introduced BuildDefaultHTTPRoute function to construct an HTTPRoute when spec.httpRoute is not explicitly set.
- Defined new constants: DefaultGatewayGroup, DefaultGatewayKind, DefaultGatewayNamespace, and HeaderOtterScaleModelName.
- Added modelRouteMatch helper function to create HTTPRoute matches based on model name headers and path prefixes.
- Modified BuildHTTPRoute to incorporate modelRouteMatch for consistent routing.
internal/modelservice/httproute_test.go
- Added test cases for the new BuildDefaultHTTPRoute function.
- Updated BuildHTTPRoute tests to verify the inclusion of modelRouteMatch in the route rules.
internal/modelservice/inferencepool.go
- Added ptrToGroup helper function for creating pointers to Gateway API Group types.
- Introduced BuildDefaultInferencePool function to create a default InferencePool when ms.Spec.InferencePool is nil.
- Changed the default EndpointPickerFailureMode from FailOpen to FailClose.
- Updated EndpointPickerRef in BuildInferencePool to explicitly include Group and Kind.
internal/modelservice/inferencepool_test.go
- Added test cases for the new BuildDefaultInferencePool function.
- Updated BuildInferencePool tests to reflect changes in EndpointPickerRef and FailureMode defaults.
internal/modelservice/reconcile.go
- Updated EnsureDecodeDeployment and EnsurePrefillDeployment to pass the kitImage parameter.
- Modified ensureDeployment to use SelectorLabelsForRole with roleName and adjust deployment labels.
- Enhanced EnsureInferencePool to create a default InferencePool if ms.Spec.InferencePool is nil, and to clean up legacy naming conventions.
- Updated EnsureHTTPRoute to create a default HTTPRoute if ms.Spec.HTTPRoute is nil and eppConfig.DefaultGatewayName is provided.
- Removed conditional cleanup logic for EPP-related resources (ServiceAccount, ConfigMap, Service, RBAC) to ensure they are always present.
- Modified EnsureEPPDeployment to always create an EPP Deployment, even if ms.Spec.InferencePool is nil, by using a default InferencePoolSpec.
- Adjusted EnsureEPPRBAC to derive replicas more robustly.
- Added logic to ensureResource and ensureClusterResource to handle IsNotFound errors during updates by recreating the resource.
- Introduced special handling for Istio DestinationRule deep-equality checks in ensureResource.
- Updated EnsureDestinationRule to always ensure the rule if the provider is Istio, regardless of ms.Spec.InferencePool.
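
The HTTPRoute defaulting rule described above (use spec.httpRoute when set, otherwise fall back to a default route only if a default gateway name is configured) can be sketched as a small decision function. This is a minimal illustration, not the operator's code: the types `HTTPRouteSpec` and `ModelServiceSpec`, the `routeAction` values, and `decideHTTPRoute` are hypothetical names, and the "skip" branch is an assumption about what happens when neither input is present.

```go
package main

import "fmt"

// HTTPRouteSpec and ModelServiceSpec are hypothetical stand-ins for the
// operator's API types; only the field needed for the decision is modeled.
type HTTPRouteSpec struct {
	Hostnames []string
}

type ModelServiceSpec struct {
	HTTPRoute *HTTPRouteSpec
}

// routeAction names the three possible outcomes of the decision.
type routeAction string

const (
	routeFromSpec routeAction = "build from spec.httpRoute"
	routeDefault  routeAction = "build default route"
	routeSkip     routeAction = "skip"
)

// decideHTTPRoute mirrors the rule from the changelog: an explicit
// spec.httpRoute wins; otherwise a default route is built only when a
// default gateway name was supplied (for example via --default-gateway-name).
func decideHTTPRoute(spec ModelServiceSpec, defaultGatewayName string) routeAction {
	switch {
	case spec.HTTPRoute != nil:
		return routeFromSpec
	case defaultGatewayName != "":
		return routeDefault
	default:
		// Assumption: with no spec and no default gateway, nothing is created.
		return routeSkip
	}
}

func main() {
	fmt.Println(decideHTTPRoute(ModelServiceSpec{}, "example-gateway"))
	fmt.Println(decideHTTPRoute(ModelServiceSpec{}, ""))
	fmt.Println(decideHTTPRoute(ModelServiceSpec{HTTPRoute: &HTTPRouteSpec{}}, "example-gateway"))
}
```

Keeping the decision separate from resource construction makes each branch easy to cover in a table-driven test.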

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces significant changes to enable a "simple architecture", primarily by changing the model provisioning mechanism from Kubernetes image volumes to an init container using kit unpack. It also introduces default creation of InferencePool and HTTPRoute resources, making the operator more self-contained. The changes are extensive and touch many parts of the codebase, including controllers, resource builders, and tests. My review has identified a critical bug in a shell script, a high-severity issue in a string sanitization function, and a medium-severity issue related to code duplication that should be addressed.

gemini-code-assist · 2026-03-17T00:58:05Z

+const modelUnpackScript = `set -euo pipefail
+export KITOPS_HOME="${KITOPS_HOME:-/tmp/model-oci}"
+mkdir -p "$KITOPS_HOME"
+kit unpack "$OCI_MODEL_IMAGE" --plain-http -o -d "$MODEL_MOUNT_PATH"


The -o flag in the kit unpack command seems incorrect. The -o flag is typically used for specifying output format in other kit commands like list or describe, but it's not a standard flag for unpack. Its presence here will likely cause the script to fail if the kit CLI doesn't recognize it. Given that the script uses set -e, any failure would cause the init container to crash.

Suggested change

-kit unpack "$OCI_MODEL_IMAGE" --plain-http -o -d "$MODEL_MOUNT_PATH"
+kit unpack "$OCI_MODEL_IMAGE" --plain-http -d "$MODEL_MOUNT_PATH"

gemini-code-assist · 2026-03-17T00:58:05Z

+func SanitizeDNS1035Label(s string) string {
+	var b strings.Builder
+	for _, r := range s {
+		switch {
+		case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
+			b.WriteRune(r)
+		case r >= 'A' && r <= 'Z':
+			b.WriteRune(unicode.ToLower(r))
+		case r == '.' || r == '_' || r == '-':
+			b.WriteRune('-')
+		}
+	}
+	s = strings.Trim(b.String(), "-")
+	for strings.Contains(s, "--") {
+		s = strings.ReplaceAll(s, "--", "-")
+	}
+	if s == "" {
+		return "modelservice"
+	}
+	if s[0] >= '0' && s[0] <= '9' {
+		s = "m-" + s
+	}
+	if len(s) > 0 && s[len(s)-1] == '-' {
+		s = strings.TrimSuffix(s, "-")
+	}
+	return s
 }


This function does a good job of sanitizing the string to meet most of the DNS-1035 label requirements. However, it's missing a check for the maximum length of 63 characters. A long ModelService name could result in a resource name that is too long, causing resource creation to fail. Please add a check to truncate the string to 63 characters.

Suggested change

func SanitizeDNS1035Label(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
			b.WriteRune(r)
		case r >= 'A' && r <= 'Z':
			b.WriteRune(unicode.ToLower(r))
		case r == '.' || r == '_' || r == '-':
			b.WriteRune('-')
		}
	}
	s = strings.Trim(b.String(), "-")
	for strings.Contains(s, "--") {
		s = strings.ReplaceAll(s, "--", "-")
	}
	if s == "" {
		return "modelservice"
	}
	if s[0] >= '0' && s[0] <= '9' {
		s = "m-" + s
	}
	if len(s) > 63 {
		s = s[:63]
	}
	if len(s) > 0 && s[len(s)-1] == '-' {
		s = strings.TrimSuffix(s, "-")
	}
	return s
}
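
To see what the length cap changes in practice, the suggested function can be run on a few inputs. The sketch below copies the function from the suggestion above unchanged; the sample inputs and the `main` wrapper are illustrative and do not come from the pull request.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// SanitizeDNS1035Label is the suggested version, including the 63-character cap.
func SanitizeDNS1035Label(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
			b.WriteRune(r)
		case r >= 'A' && r <= 'Z':
			b.WriteRune(unicode.ToLower(r))
		case r == '.' || r == '_' || r == '-':
			b.WriteRune('-')
		}
	}
	s = strings.Trim(b.String(), "-")
	for strings.Contains(s, "--") {
		s = strings.ReplaceAll(s, "--", "-")
	}
	if s == "" {
		return "modelservice"
	}
	if s[0] >= '0' && s[0] <= '9' {
		s = "m-" + s
	}
	// Truncate first, then drop a hyphen that the cut may have left at the end.
	if len(s) > 63 {
		s = s[:63]
	}
	if len(s) > 0 && s[len(s)-1] == '-' {
		s = strings.TrimSuffix(s, "-")
	}
	return s
}

func main() {
	inputs := []string{
		"Qwen2.5_7B-Instruct",              // mixed case, dot, underscore
		"7b.model",                         // leading digit gets an "m-" prefix
		"___",                              // nothing usable falls back to "modelservice"
		strings.Repeat("a", 62) + "-tail", // cut lands right after a hyphen
	}
	for _, in := range inputs {
		out := SanitizeDNS1035Label(in)
		fmt.Printf("%2d chars: %s\n", len(out), out)
	}
}
```

The last input shows why the trailing-hyphen check has to run after the truncation: cutting at 63 characters leaves a hyphen at the end, which the final step removes.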

gemini-code-assist · 2026-03-17T00:58:05Z

 	if err := c.Update(ctx, existing); err != nil {
+		// If the resource was deleted between Get and Update, treat it as NotFound and recreate.
+		if apierrors.IsNotFound(err) {
+			if err := c.Create(ctx, desired); err != nil {
+				return fmt.Errorf("recreating %s %s after NotFound: %w", kind, name, err)
+			}
+			log.FromContext(ctx).Info("Recreated "+kind, "name", name)
+			return nil
+		}
 		return fmt.Errorf("updating %s %s: %w", kind, name, err)
 	}


This update-or-recreate-on-notfound logic is very useful for making the reconciliation loop more robust. However, this same block of code is duplicated in the ensureResource function (in two places, one for the DestinationRule special case and one for the general case). To improve maintainability and reduce code duplication, I suggest refactoring this logic into a separate helper function.
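
One possible shape for such a helper is sketched below. To stay free of external dependencies it replaces the controller-runtime client with a tiny hypothetical `writer` interface and replaces `apierrors.IsNotFound(err)` with a sentinel `errNotFound`; `object`, `fakeClient`, `errConflict`, `updateOrRecreate`, and `runDemo` are likewise illustrative names, not code from the pull request.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errNotFound stands in for the API server's NotFound error; the real code
// would check apierrors.IsNotFound(err) instead.
var errNotFound = errors.New("not found")

// errConflict is an arbitrary non-NotFound failure used by the demo.
var errConflict = errors.New("conflict")

// object is a hypothetical placeholder for a Kubernetes resource.
type object struct{ Name string }

// writer is a hypothetical, trimmed-down view of the client used by the operator.
type writer interface {
	Update(ctx context.Context, obj object) error
	Create(ctx context.Context, obj object) error
}

// updateOrRecreate updates existing and, if it was deleted between Get and
// Update, creates desired instead. It reports whether a recreate happened.
func updateOrRecreate(ctx context.Context, c writer, existing, desired object, kind string) (bool, error) {
	err := c.Update(ctx, existing)
	if err == nil {
		return false, nil
	}
	if !errors.Is(err, errNotFound) {
		return false, fmt.Errorf("updating %s %s: %w", kind, existing.Name, err)
	}
	if err := c.Create(ctx, desired); err != nil {
		return false, fmt.Errorf("recreating %s %s after NotFound: %w", kind, desired.Name, err)
	}
	return true, nil
}

// fakeClient returns a fixed error from Update and records every Create call.
type fakeClient struct {
	updateErr error
	created   []string
}

func (f *fakeClient) Update(_ context.Context, _ object) error { return f.updateErr }

func (f *fakeClient) Create(_ context.Context, obj object) error {
	f.created = append(f.created, obj.Name)
	return nil
}

// runDemo runs the helper once against a fake client with the given Update error.
func runDemo(updateErr error) (bool, []string, error) {
	c := &fakeClient{updateErr: updateErr}
	obj := object{Name: "demo"}
	recreated, err := updateOrRecreate(context.Background(), c, obj, obj, "Deployment")
	return recreated, c.created, err
}

func main() {
	for _, e := range []error{nil, errNotFound, errConflict} {
		recreated, created, err := runDemo(e)
		fmt.Printf("updateErr=%v recreated=%v created=%v err=%v\n", e, recreated, created, err)
	}
}
```

With a helper like this, each call site would shrink to one call plus a log line on recreate, so the general path and the DestinationRule special case could share the same logic.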

codecov · 2026-03-17T13:51:32Z

Codecov Report

❌ Patch coverage is 65.70513% with 107 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
internal/modelservice/reconcile.go	0.00%	74 Missing ⚠️
internal/modelservice/helpers.go	58.69%	16 Missing and 3 partials ⚠️
internal/modelservice/defaults.go	0.00%	7 Missing ⚠️
internal/controller/modelservice_controller.go	28.57%	0 Missing and 5 partials ⚠️
internal/modelservice/deployment.go	97.40%	1 Missing and 1 partial ⚠️

george-wu20250203 added 7 commits March 15, 2026 19:23

Add podlabels llm-d.ai/model

3e36203

create epp automatically

ba11084

Create httproute automatically

685abcc

Finish workable version

94105bf

Modify lables

0ff4f0d

fix lint issue

240c6f0

fix test error

77e77b5

gemini-code-assist Bot reviewed Mar 17, 2026

iasthc added 5 commits March 17, 2026 21:38

feat: add default image configurations and update EPP deployment logic

d86b532

chore: update dependencies to latest versions

df1e92c

feat: add GAIE version and update CRD download logic in Makefile

d4bc46b

fix: remove unnecessary blank line in suite_test.go

522e223

fix: add DefaultImages to ModelServiceReconciler in tests

7cde32e

iasthc added 2 commits March 17, 2026 21:51

fix: update default EPP image description in main.go

2eeef34

Merge remote-tracking branch 'origin/main' into fix/initial_container_unpack_modelpack

565798a

iasthc merged commit d7c0ff0 into main Mar 17, 2026
12 checks passed

iasthc deleted the fix/initial_container_unpack_modelpack branch March 17, 2026 13:54

iasthc merged 14 commits into main from fix/initial_container_unpack_modelpack
