Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the robustness and ease of use of the ModelService operator. It introduces a more flexible model artifact provisioning mechanism, provides sensible defaults for EPP and HTTPRoute configurations, and refines resource naming and labeling for better Kubernetes integration. These changes aim to make the operator more resilient to transient resource deletions and simplify the deployment of inference services. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces significant changes to enable a "simple architecture", primarily by changing the model provisioning mechanism from Kubernetes image volumes to an init container using kit unpack. It also introduces default creation of InferencePool and HTTPRoute resources, making the operator more self-contained. The changes are extensive and touch many parts of the codebase, including controllers, resource builders, and tests. My review has identified a critical bug in a shell script, a high-severity issue in a string sanitization function, and a medium-severity issue related to code duplication that should be addressed.
| const modelUnpackScript = `set -euo pipefail | ||
| export KITOPS_HOME="${KITOPS_HOME:-/tmp/model-oci}" | ||
| mkdir -p "$KITOPS_HOME" | ||
| kit unpack "$OCI_MODEL_IMAGE" --plain-http -o -d "$MODEL_MOUNT_PATH" |
There was a problem hiding this comment.
The -o flag in the kit unpack command seems incorrect. The -o flag is typically used for specifying output format in other kit commands like list or describe, but it's not a standard flag for unpack. Its presence here will likely cause the script to fail if the kit CLI doesn't recognize it. Given that the script uses set -e, any failure would cause the init container to crash.
| kit unpack "$OCI_MODEL_IMAGE" --plain-http -o -d "$MODEL_MOUNT_PATH" | |
| kit unpack "$OCI_MODEL_IMAGE" --plain-http -d "$MODEL_MOUNT_PATH" |
| func SanitizeDNS1035Label(s string) string { | ||
| var b strings.Builder | ||
| for _, r := range s { | ||
| switch { | ||
| case r >= 'a' && r <= 'z' || r >= '0' && r <= '9': | ||
| b.WriteRune(r) | ||
| case r >= 'A' && r <= 'Z': | ||
| b.WriteRune(unicode.ToLower(r)) | ||
| case r == '.' || r == '_' || r == '-': | ||
| b.WriteRune('-') | ||
| } | ||
| } | ||
| s = strings.Trim(b.String(), "-") | ||
| for strings.Contains(s, "--") { | ||
| s = strings.ReplaceAll(s, "--", "-") | ||
| } | ||
| if s == "" { | ||
| return "modelservice" | ||
| } | ||
| if s[0] >= '0' && s[0] <= '9' { | ||
| s = "m-" + s | ||
| } | ||
| if len(s) > 0 && s[len(s)-1] == '-' { | ||
| s = strings.TrimSuffix(s, "-") | ||
| } | ||
| return s | ||
| } |
There was a problem hiding this comment.
This function does a good job of sanitizing the string to meet most of the DNS-1035 label requirements. However, it's missing a check for the maximum length of 63 characters. A long ModelService name could result in a resource name that is too long, causing resource creation to fail. Please add a check to truncate the string to 63 characters.
func SanitizeDNS1035Label(s string) string {
var b strings.Builder
for _, r := range s {
switch {
case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
b.WriteRune(r)
case r >= 'A' && r <= 'Z':
b.WriteRune(unicode.ToLower(r))
case r == '.' || r == '_' || r == '-':
b.WriteRune('-')
}
}
s = strings.Trim(b.String(), "-")
for strings.Contains(s, "--") {
s = strings.ReplaceAll(s, "--", "-")
}
if s == "" {
return "modelservice"
}
if s[0] >= '0' && s[0] <= '9' {
s = "m-" + s
}
if len(s) > 63 {
s = s[:63]
}
if len(s) > 0 && s[len(s)-1] == '-' {
s = strings.TrimSuffix(s, "-")
}
return s
}| if err := c.Update(ctx, existing); err != nil { | ||
| // If the resource was deleted between Get and Update, treat it as NotFound and recreate. | ||
| if apierrors.IsNotFound(err) { | ||
| if err := c.Create(ctx, desired); err != nil { | ||
| return fmt.Errorf("recreating %s %s after NotFound: %w", kind, name, err) | ||
| } | ||
| log.FromContext(ctx).Info("Recreated "+kind, "name", name) | ||
| return nil | ||
| } | ||
| return fmt.Errorf("updating %s %s: %w", kind, name, err) | ||
| } |
There was a problem hiding this comment.
This update-or-recreate-on-notfound logic is very useful for making the reconciliation loop more robust. However, this same block of code is duplicated in the ensureResource function (in two places, one for the DestinationRule special case and one for the general case). To improve maintainability and reduce code duplication, I suggest refactoring this logic into a separate helper function.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
No description provided.