chore: repo polish — PROJECT path, ROADMAP/SECURITY refresh, hygiene sweep, Phase enum#309
Merged
The kubebuilder PROJECT metadata still referenced the pre-org-move path (github.com/defilan/llmkube) in three places: the top-level repo field and both Model/InferenceService resource paths.
go.mod has been github.com/defilantech/llmkube for some time; align PROJECT to match so future kubebuilder scaffolding regenerates with the correct import paths.
Signed-off-by: Christopher Maher <chris@mahercode.io>
ROADMAP was frozen at v0.5.0 / March 2026 with entire Q1-Q4 2026
sections that had already shipped or slipped. Rewrite to reflect the
v0.7.0 reality: multi-runtime support (vLLM, TGI, PersonaPlex, oMLX,
Ollama), HPA, hybrid CPU/GPU offload, CUDA 13, agentic-coding flags.
Replace dated quarterly buckets with direction-of-travel sections
(Near-term focus on v1beta1 API prep + supply-chain hardening +
operator decomposition; Medium-term on distributed inference, more
GPU vendors, edge). Past Releases backfilled through v0.7.0.
SECURITY supported-versions matrix was two minors behind. Bump to
0.7.x + 0.6.x and state the support policy explicitly ("latest minor
plus one prior").
Signed-off-by: Christopher Maher <chris@mahercode.io>
…ifacts
The repo root has been collecting untracked cruft from editor/MCP sessions: playwright screenshots (blog-*.png, features-*.png, reddit-*.png, home-mobile-*.png, etc.), an audit.log file, a home-snapshot.md playwright artifact, a .env file, and .playwright-mcp/ session state. None were tracked but all show up in "git status" on every clone.
Add patterns so future runs stay out of the working tree:
- .env / .env.* (with !.env.example escape hatch)
- .playwright-mcp/ local MCP state
- /*.png (root-level PNGs only; docs/images/*.png still tracked)
- /home-snapshot.md
Also sweep the existing stray artifacts out of the working tree.
Signed-off-by: Christopher Maher <chris@mahercode.io>
gpu-performance-phase0.md and hybrid-offload-phase2-spec.md read like internal design RFCs: they reference "Phase 0" / "Phase 2", internal issue numbers, and implementation plans rather than user guidance. They belong with the rest of the internal design record, not in the public docs tree where new users scanning docs/ will mistake them for user-facing instructions.
Files have been moved to the internal llmkube-internal/design-docs/ folder (not in this repo). The public docs tree now contains only user-facing guides.
Signed-off-by: Christopher Maher <chris@mahercode.io>
Ship the two Qwen3 sample manifests that pair with the v0.7.0 hybrid
GPU/CPU offload and agentic-coding flag features, so new users trying
those capabilities have working starting points:
- qwen3-30b-hybrid-offload.yaml exercises MoE expert offloading to
CPU RAM via the hybrid offloading spec introduced in #283.
- qwen3-coder-30b.yaml shows an agentic-coding configuration suitable
for the Qwen Code CLI.
Signed-off-by: Christopher Maher <chris@mahercode.io>
Status.Phase was declared as an unvalidated string on both CRDs. The doc comments listed the allowed values, but nothing prevented a typo in the reconciler from silently propagating to stored status.
Lock the allowed values down with kubebuilder:validation:Enum so the apiserver rejects any unexpected phase value:
- InferenceService: Pending, Creating, Progressing, Ready, WaitingForGPU, Failed
- Model: Pending, Downloading, Copying, Ready, Failed
The enum values are derived from every string literal currently assigned to Status.Phase in the reconcilers (confirmed via grep over internal/controller/). Fields remain +optional so fresh resources without a set Phase still validate. Regenerated CRDs include the enum constraint.
First step toward the longer v1beta1 direction of migrating Phase into standard Conditions; until then, at least the enum shields us from silent drift.
Signed-off-by: Christopher Maher <chris@mahercode.io>
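For readers unfamiliar with the marker, the following is a minimal sketch of what the enum constraint described above could look like in the API types. The struct names follow the PR description, but the structs are trimmed to the one field, and the `validInferenceServicePhase` helper and its lookup map are hypothetical additions for illustration only (in the real flow the apiserver enforces the enum from the generated CRD schema, not Go code).

```go
package main

import "fmt"

// InferenceServiceStatus is a trimmed, illustrative version of the status
// type; the real struct carries additional fields.
type InferenceServiceStatus struct {
	// Phase is the high-level lifecycle state of the InferenceService.
	// +kubebuilder:validation:Enum=Pending;Creating;Progressing;Ready;WaitingForGPU;Failed
	// +optional
	Phase string `json:"phase,omitempty"`
}

// ModelStatus is likewise trimmed to the Phase field only.
type ModelStatus struct {
	// Phase is the high-level lifecycle state of the Model.
	// +kubebuilder:validation:Enum=Pending;Downloading;Copying;Ready;Failed
	// +optional
	Phase string `json:"phase,omitempty"`
}

// inferenceServicePhases mirrors the enum marker above. Hypothetical: it
// exists only so this sketch can demonstrate the accept/reject behavior.
var inferenceServicePhases = map[string]bool{
	"Pending":       true,
	"Creating":      true,
	"Progressing":   true,
	"Ready":         true,
	"WaitingForGPU": true,
	"Failed":        true,
}

// validInferenceServicePhase approximates the schema check. An empty phase
// is treated as unset, because omitempty drops it from the serialized status.
func validInferenceServicePhase(p string) bool {
	return p == "" || inferenceServicePhases[p]
}

func main() {
	fmt.Println(validInferenceServicePhase("Ready")) // a listed value
	fmt.Println(validInferenceServicePhase("Redy"))  // a typo the enum would catch
}
```

The markers are plain comments to the Go compiler; they only take effect when the CRD manifests are regenerated, which is why the test plan checks the generated schema.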
Three kubebuilder-generated "TODO(user):" comments have been lingering since the initial project scaffold:
- cmd/main.go: cert-manager enable hint
- internal/controller/model_controller_test.go: two test scaffolding placeholders
Replace the cert-manager TODO with a plain documentation comment that still points to the relevant kustomization markers. Drop the two test TODOs; the cleanup logic they hinted at is already implemented inline.
Signed-off-by: Christopher Maher <chris@mahercode.io>
When the llama-server health check fails after StartProcess, the code attempted to stop the process and discarded any error from StopProcess with a blank identifier. If StopProcess itself failed (e.g., permission denied sending SIGTERM, process already gone, etc.), the orphaned process would linger on the Metal host with no log entry to explain why.
Surface the stop failure via the executor logger at warn level with pid and port context. The original health-check error is still returned wrapped to the caller so reconciliation flow is unchanged.
Signed-off-by: Christopher Maher <chris@mahercode.io>
This was referenced Apr 21, 2026
Summary
First pass of the v0.7.0 critical-issues audit — eight atomic commits that fix the metadata/hygiene items a new contributor or reviewer hits in the first minute. No behavior changes in the controller; one small agent bug fix.
Each commit is standalone and reviewable on its own:
- chore: correct PROJECT repo path - the kubebuilder `PROJECT` file still referenced the pre-org-move path (`github.com/defilan/llmkube`) in three places; align to `github.com/defilantech/llmkube` so future scaffolding regenerates with the correct imports.
- docs: refresh ROADMAP and SECURITY for v0.7.0 baseline - ROADMAP was frozen at v0.5.0/March 2026 with Q1–Q4 sections already shipped or stale; rewrite to reflect the v0.7.0 reality (multi-runtime, HPA, hybrid offload, CUDA 13, agentic-coding flags) and replace dated quarterly buckets with direction-of-travel sections. SECURITY supported-versions matrix bumped to 0.7.x + 0.6.x with an explicit support policy.
- chore: tighten .gitignore - patterns for `.env`, `.playwright-mcp/`, root-level `/*.png` (keeping `docs/images/*.png`), and `/home-snapshot.md` so editor/MCP session artifacts don't accumulate in the working tree.
- docs: remove internal phase specs from public docs/ - `gpu-performance-phase0.md` and `hybrid-offload-phase2-spec.md` read like internal RFCs and are moved to the internal design-docs record (not in this repo). Public `docs/` now contains only user-facing guides.
- docs(samples): add Qwen3 hybrid offload and agentic coding samples - ship two sample manifests paired with the v0.7.0 hybrid-offload and agentic-coding features so users have working starting points.
- feat(api): validate Status.Phase as enum - lock down both `InferenceServiceStatus.Phase` and `ModelStatus.Phase` with `+kubebuilder:validation:Enum`, derived from every string currently assigned in the reconcilers. The apiserver now rejects any unexpected phase value, which guards against silent drift; regenerated CRDs include the constraint. First step toward the longer v1beta1 Phase → Conditions migration.
- chore: drop kubebuilder scaffolding TODO(user) comments - three `TODO(user):` comments from the original project scaffold (cert-manager hint in `cmd/main.go`, two test placeholders in `model_controller_test.go`). Replaced with a plain doc comment for the cert-manager reference; test placeholders deleted since the inline logic they hinted at already exists.
- fix(agent): log swallowed StopProcess error on health check failure - if `StopProcess` failed after a health-check failure, the process would linger on the Metal host with no log entry. Now surfaced via the executor logger at warn level; the original health-check error is still wrapped and returned to the caller.
Context
Output of the v0.7.0 repo maturity audit. Companion workstreams to follow on separate branches:
- `install.sh` checksum verification, `govulncheck` in CI, `gosec`/`bodyclose` linters, codecov integration.
- `inferenceservice_controller.go` across ~7 small pure-move PRs.
Test plan
- `make manifests generate vet` - clean
- `make test` - all packages pass; `internal/controller` coverage 83.1%
- `bin/golangci-lint run ./...` - 0 issues
- `git check-ignore` on `.env`, `blog-*.png`, `.playwright-mcp/`, `docs/images/logo.png` - ignore rules behave as expected
- `inferenceservice` and `model` CRDs under `spec.versions[*].schema.openAPIV3Schema.properties.status.properties.phase.enum`