
Available to run simple arch #38

Merged
iasthc merged 14 commits into main from fix/initial_container_unpack_modelpack
Mar 17, 2026

Conversation

@george-wu20250203
Contributor

No description provided.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and ease of use of the ModelService operator. It introduces a more flexible model artifact provisioning mechanism, provides sensible defaults for EPP and HTTPRoute configurations, and refines resource naming and labeling for better Kubernetes integration. These changes aim to make the operator more resilient to transient resource deletions and simplify the deployment of inference services.

Highlights

  • Model Artifact Provisioning: The method for provisioning model artifacts has been updated from using Kubernetes ImageVolumeSource to employing an init container that utilizes the kit tool to unpack OCI ModelPacks into an emptyDir volume. This change enhances compatibility and flexibility across different Kubernetes environments and custom registries.
  • Default Endpoint Picker (EPP) and HTTPRoute Creation: The operator now automatically creates a default Endpoint Picker (EPP) Deployment, Service, ConfigMap, ServiceAccount, and RBAC, even when spec.inferencePool is not explicitly defined. Additionally, a default HTTPRoute is created if spec.httpRoute is not set and a defaultGatewayName is provided, simplifying initial setup.
  • Enhanced Naming and Labeling Conventions: A new SanitizeDNS1035Label helper function has been introduced, and various resource naming functions have been updated to ensure DNS-1035 compliance. New labels (llm-d.ai/inferenceServing, llm-d.ai/model, inferencepool) were added to improve resource selection and management within the cluster.
  • Simplified EPP Configuration: The default EPP ConfigMap has been streamlined by removing metrics-data-source and core-metrics-extractor plugins. This simplifies the EPP setup and increases compatibility with EPP images that may not register these specific plugins.
  • Improved Resource Reconciliation Resilience: Logic has been added to recreate Kubernetes resources if they are detected as deleted between a Get and Update operation, enhancing the operator's ability to maintain desired states. Special handling for Istio DestinationRule deep-equality checks was also introduced to prevent panics.
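
The default HTTPRoute rule in the highlights above can be read as a small decision function. The following is a minimal sketch, not the operator's actual code: `decideHTTPRoute`, `routeAction`, and its values are hypothetical names, and the "skip" outcome when neither input is set is an assumption rather than something the summary states.

```go
package main

import "fmt"

// routeAction names what a reconciler might do for the HTTPRoute.
// The type and its values are hypothetical, for illustration only.
type routeAction string

const (
	routeFromSpec routeAction = "build-from-spec"
	routeDefault  routeAction = "build-default"
	routeSkip     routeAction = "skip" // assumed behavior when nothing is configured
)

// decideHTTPRoute mirrors the rule in the summary: a default HTTPRoute is
// built only when spec.httpRoute is unset and a default gateway name is given.
func decideHTTPRoute(specHasHTTPRoute bool, defaultGatewayName string) routeAction {
	switch {
	case specHasHTTPRoute:
		return routeFromSpec
	case defaultGatewayName != "":
		return routeDefault
	default:
		return routeSkip
	}
}

func main() {
	fmt.Println(decideHTTPRoute(true, "gw"))
	fmt.Println(decideHTTPRoute(false, "gw"))
	fmt.Println(decideHTTPRoute(false, ""))
}
```

An explicit spec.httpRoute takes precedence in this sketch, so the --default-gateway-name flag only matters for ModelServices that leave the route unset.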

Changelog

  • cmd/main.go
    • Added a new command-line flag --default-gateway-name to specify the default HTTPRoute parentRef gateway.
    • Passed the defaultGatewayName and kitImage to the ModelService controller's EPP configuration.
  • internal/controller/modelservice_controller.go
    • Added KitImage field to the ModelServiceReconciler struct.
    • Updated EnsureDecodeDeployment and EnsurePrefillDeployment calls to include the kitImage parameter.
    • Modified EnsureHTTPRoute call to pass the EPPConfig for default HTTPRoute creation logic.
  • internal/controller/modelservice_controller_test.go
    • Configured KitImage in the ModelServiceReconciler for tests.
    • Updated deployment tests to reflect the new model provisioning mechanism using emptyDir volumes and an init container instead of ImageVolumeSource.
  • internal/controller/suite_test.go
    • Added a minimal InferencePool CustomResourceDefinition (CRD) YAML for envtest to enable testing of InferencePool resource creation.
    • Implemented logic to create a temporary directory and write the InferencePool CRD for envtest setup and cleanup.
  • internal/modelservice/deployment.go
    • Introduced constants ModelTmpVolumeName and ModelTmpMountPath for temporary model storage.
    • Modified BuildDeployment to accept kitImage as a parameter.
    • Refactored buildInitContainers to include a model-unpack init container that uses kit to unpack OCI ModelPacks into an emptyDir volume.
    • Added buildModelUnpackInitContainer to create the model unpacking init container, supporting imagePullSecrets.
    • Updated buildVolumes to use emptyDir volumes for model storage and a secret volume for Docker config when imagePullSecrets are present.
  • internal/modelservice/epp_config.go
    • Added DefaultGatewayName field to the EPPConfig struct to support default HTTPRoute creation.
  • internal/modelservice/epp_configmap.go
    • Removed metrics-data-source and core-metrics-extractor plugins from the defaultPluginsConfig to simplify EPP setup.
  • internal/modelservice/epp_configmap_test.go
    • Updated tests to reflect the removal of metrics-data-source and core-metrics-extractor from the default EPP configuration.
  • internal/modelservice/epp_deployment.go
    • Added DefaultEPPImage constant for the default Endpoint Picker container image.
    • Removed poolName from BuildEPPDeployment parameters, as it is now derived internally.
    • Introduced PoolGroupAPIGroup constant and passed it as an argument to the EPP container.
    • Removed the --metrics-port argument from the EPP container command-line arguments.
    • Updated Selector and Template.Labels in the EPP Deployment to include LabelInferencePool.
  • internal/modelservice/epp_service.go
    • Added DefaultEPPExtProcPort constant for the default EPP external processing port.
    • Updated BuildEPPService to use EPPNameForService for the service name and to specify TargetPort for service ports.
  • internal/modelservice/helpers.go
    • Added strings and unicode imports for string manipulation.
    • Introduced new constants: DockerConfigVolumeName, DockerConfigMountPath, LabelInferenceServing, LabelModel, and LabelInferencePool.
    • Implemented SanitizeDNS1035Label function to create DNS-1035 compliant labels.
    • Updated EPPName, EPPConfigMapName, EPPSecretName, EPPServiceMonitorName, and EPPClusterRBACName to use SanitizeDNS1035Label.
    • Modified SelectorLabelsForRole and PodLabelsForRole to include new labels for improved selection.
    • Added HTTPRouteLabels and InferencePoolLabels helper functions.
    • Updated InferencePoolSelectorLabels to use LabelInferenceServing and LabelModel.
    • Adjusted EPPLabels and EPPSelectorLabels to use EPPName(msName) for consistency.
  • internal/modelservice/helpers_test.go
    • Added new test cases for SanitizeDNS1035Label, EPPNameForService, InferencePoolLabels, and HTTPRouteLabels.
    • Updated existing tests for InferencePoolSelectorLabels to reflect new label expectations.
  • internal/modelservice/httproute.go
    • Introduced BuildDefaultHTTPRoute function to construct an HTTPRoute when spec.httpRoute is not explicitly set.
    • Defined new constants: DefaultGatewayGroup, DefaultGatewayKind, DefaultGatewayNamespace, and HeaderOtterScaleModelName.
    • Added modelRouteMatch helper function to create HTTPRoute matches based on model name headers and path prefixes.
    • Modified BuildHTTPRoute to incorporate modelRouteMatch for consistent routing.
  • internal/modelservice/httproute_test.go
    • Added test cases for the new BuildDefaultHTTPRoute function.
    • Updated BuildHTTPRoute tests to verify the inclusion of modelRouteMatch in the route rules.
  • internal/modelservice/inferencepool.go
    • Added ptrToGroup helper function for creating pointers to Gateway API Group types.
    • Introduced BuildDefaultInferencePool function to create a default InferencePool when ms.Spec.InferencePool is nil.
    • Changed the default EndpointPickerFailureMode from FailOpen to FailClose.
    • Updated EndpointPickerRef in BuildInferencePool to explicitly include Group and Kind.
  • internal/modelservice/inferencepool_test.go
    • Added test cases for the new BuildDefaultInferencePool function.
    • Updated BuildInferencePool tests to reflect changes in EndpointPickerRef and FailureMode defaults.
  • internal/modelservice/reconcile.go
    • Updated EnsureDecodeDeployment and EnsurePrefillDeployment to pass the kitImage parameter.
    • Modified ensureDeployment to use SelectorLabelsForRole with roleName and adjust deployment labels.
    • Enhanced EnsureInferencePool to create a default InferencePool if ms.Spec.InferencePool is nil, and to clean up legacy naming conventions.
    • Updated EnsureHTTPRoute to create a default HTTPRoute if ms.Spec.HTTPRoute is nil and eppConfig.DefaultGatewayName is provided.
    • Removed conditional cleanup logic for EPP-related resources (ServiceAccount, ConfigMap, Service, RBAC) to ensure they are always present.
    • Modified EnsureEPPDeployment to always create an EPP Deployment, even if ms.Spec.InferencePool is nil, by using a default InferencePoolSpec.
    • Adjusted EnsureEPPRBAC to derive replicas more robustly.
    • Added logic to ensureResource and ensureClusterResource to handle IsNotFound errors during updates by recreating the resource.
    • Introduced special handling for Istio DestinationRule deep-equality checks in ensureResource.
    • Updated EnsureDestinationRule to always ensure the rule if the provider is Istio, regardless of ms.Spec.InferencePool.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces significant changes to enable a "simple architecture", primarily by changing the model provisioning mechanism from Kubernetes image volumes to an init container using kit unpack. It also introduces default creation of InferencePool and HTTPRoute resources, making the operator more self-contained. The changes are extensive and touch many parts of the codebase, including controllers, resource builders, and tests. My review has identified a critical bug in a shell script, a high-severity issue in a string sanitization function, and a medium-severity issue related to code duplication that should be addressed.

const modelUnpackScript = `set -euo pipefail
export KITOPS_HOME="${KITOPS_HOME:-/tmp/model-oci}"
mkdir -p "$KITOPS_HOME"
kit unpack "$OCI_MODEL_IMAGE" --plain-http -o -d "$MODEL_MOUNT_PATH"

critical

The -o flag in the kit unpack command seems incorrect. The -o flag is typically used for specifying output format in other kit commands like list or describe, but it's not a standard flag for unpack. Its presence here will likely cause the script to fail if the kit CLI doesn't recognize it. Given that the script uses set -e, any failure would cause the init container to crash.

Suggested change
- kit unpack "$OCI_MODEL_IMAGE" --plain-http -o -d "$MODEL_MOUNT_PATH"
+ kit unpack "$OCI_MODEL_IMAGE" --plain-http -d "$MODEL_MOUNT_PATH"

Comment on lines +150 to 176
func SanitizeDNS1035Label(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
			b.WriteRune(r)
		case r >= 'A' && r <= 'Z':
			b.WriteRune(unicode.ToLower(r))
		case r == '.' || r == '_' || r == '-':
			b.WriteRune('-')
		}
	}
	s = strings.Trim(b.String(), "-")
	for strings.Contains(s, "--") {
		s = strings.ReplaceAll(s, "--", "-")
	}
	if s == "" {
		return "modelservice"
	}
	if s[0] >= '0' && s[0] <= '9' {
		s = "m-" + s
	}
	if len(s) > 0 && s[len(s)-1] == '-' {
		s = strings.TrimSuffix(s, "-")
	}
	return s
}

high

This function does a good job of sanitizing the string to meet most of the DNS-1035 label requirements. However, it's missing a check for the maximum length of 63 characters. A long ModelService name could result in a resource name that is too long, causing resource creation to fail. Please add a check to truncate the string to 63 characters.

func SanitizeDNS1035Label(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
			b.WriteRune(r)
		case r >= 'A' && r <= 'Z':
			b.WriteRune(unicode.ToLower(r))
		case r == '.' || r == '_' || r == '-':
			b.WriteRune('-')
		}
	}
	s = strings.Trim(b.String(), "-")
	for strings.Contains(s, "--") {
		s = strings.ReplaceAll(s, "--", "-")
	}
	if s == "" {
		return "modelservice"
	}
	if s[0] >= '0' && s[0] <= '9' {
		s = "m-" + s
	}
	if len(s) > 63 {
		s = s[:63]
	}
	if len(s) > 0 && s[len(s)-1] == '-' {
		s = strings.TrimSuffix(s, "-")
	}
	return s
}

Comment on lines 520 to 530
if err := c.Update(ctx, existing); err != nil {
	// If the resource was deleted between Get and Update, treat it as NotFound and recreate.
	if apierrors.IsNotFound(err) {
		if err := c.Create(ctx, desired); err != nil {
			return fmt.Errorf("recreating %s %s after NotFound: %w", kind, name, err)
		}
		log.FromContext(ctx).Info("Recreated "+kind, "name", name)
		return nil
	}
	return fmt.Errorf("updating %s %s: %w", kind, name, err)
}

medium

This update-or-recreate-on-notfound logic is very useful for making the reconciliation loop more robust. However, this same block of code is duplicated in the ensureResource function (in two places, one for the DestinationRule special case and one for the general case). To improve maintainability and reduce code duplication, I suggest refactoring this logic into a separate helper function.

@codecov

codecov Bot commented Mar 17, 2026

@iasthc iasthc merged commit d7c0ff0 into main Mar 17, 2026
12 checks passed
@iasthc iasthc deleted the fix/initial_container_unpack_modelpack branch March 17, 2026 13:54