On a fresh cluster, victoria-metrics and grafana CRDs aren't established when Tilt tries to apply telemetry resources, even with resource_deps set. Add explicit kubectl wait gates (vm-crds-ready, grafana-crds-ready) so the k8s_yaml resources are only applied after their CRDs are established. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
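A gate of this kind boils down to one blocking `kubectl wait` call per CRD group. The sketch below only builds that command line; it is an illustration, not the Tiltfile's actual code. The helper name `crd_wait_command`, the timeout, and the CRD name in `VM_CRDS` are assumptions made for the example.

```python
import shlex


def crd_wait_command(crd_names, timeout="120s"):
    """Build a `kubectl wait` command that blocks until every CRD is Established."""
    if not crd_names:
        raise ValueError("at least one CRD name is required")
    argv = ["kubectl", "wait", "--for=condition=established", f"--timeout={timeout}"]
    argv += [f"crd/{name}" for name in crd_names]
    # shlex.join quotes each argument so the result is safe to hand to a shell.
    return shlex.join(argv)


# Assumed CRD name, for illustration only; the real list comes from the operator chart.
VM_CRDS = ["vmsingles.operator.victoriametrics.com"]

if __name__ == "__main__":
    print(crd_wait_command(VM_CRDS))
```

In a Tiltfile, a string like this would typically be the command of a gate resource that the `k8s_yaml`-backed resources then list in `resource_deps`.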
On a fresh cluster the Application and WeightsAndBiases CRDs applied by kustomize may not be established by the time the controller manager tries to start. Add operator-crds-ready gate using kubectl wait. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…etrics infra-metrics-dev.yaml contains a headless Service and VMServiceScrape for ClickHouse that were not listed in the Infrastructure-Metrics k8s_resource objects. Tilt was grouping them as unmatched resources alongside unrelated kustomize cert-manager objects. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
make manifests regenerates CRD YAML files, causing Tilt to re-apply them
concurrently with the initial apply. This produces a resourceVersion
conflict ("object has been modified"). Waiting for manifests and generate
to finish ensures CRDs are applied once with up-to-date content.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
vm-crds-ready only ensures CRDs are established, not that the operator pod is running. The victoria-metrics-operator registers a validating webhook for VMSingle/VLSingle/VTSingle — applying those resources while the pod is still starting causes "connection refused" on the webhook. Add vm-operator-ready gate using kubectl rollout status. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
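The difference from the CRD gate is the command: `kubectl rollout status` blocks until the operator Deployment has finished rolling out, which is a closer proxy for "the webhook can accept connections" than CRD establishment. A minimal sketch of building that command follows; the helper name, the Deployment name, the namespace, and the timeout are all hypothetical.

```python
import shlex


def rollout_status_command(deployment, namespace, timeout="180s"):
    """Build a `kubectl rollout status` command for a single Deployment."""
    argv = [
        "kubectl", "rollout", "status",
        f"deployment/{deployment}",
        "--namespace", namespace,
        f"--timeout={timeout}",
    ]
    return shlex.join(argv)


if __name__ == "__main__":
    # Hypothetical Deployment name and namespace; substitute the values from the chart.
    print(rollout_status_command("victoria-metrics-operator", "monitoring"))
```

Whether a completed rollout is enough depends on the operator's readiness probe reflecting webhook availability, so this is best read as a reasonable gate rather than a guarantee.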
The namespace, certificates, and issuer from kustomize build were not assigned to any k8s_resource, causing Tilt to display them unlabeled. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- scripts/setup_kind.sh: add DEV_PROFILE flag; use single-node cluster when wandbCRD contains 'dev', multi-node otherwise
- hack/scripts/kind-images-manager.sh: scrape/pull/load utility for caching cluster images locally and loading into kind
- .gitignore: ignore .k8s-images artifact
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
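The profile selection described above can be sketched as a function that emits a kind cluster config. This is an illustration in Python rather than the shell script itself. The function name, the worker count, and the example `wandbCRD` values are assumptions; only the "contains 'dev' means single-node" rule comes from the commit message.

```python
def kind_config(wandb_crd, workers=2):
    """Return a kind cluster config: single-node for dev CRDs, multi-node otherwise."""
    roles = ["control-plane"]
    if "dev" not in wandb_crd:
        # Non-dev profiles get extra worker nodes; the count is an assumed default.
        roles += ["worker"] * workers
    lines = ["kind: Cluster", "apiVersion: kind.x-k8s.io/v1alpha4", "nodes:"]
    lines += [f"- role: {role}" for role in roles]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical wandbCRD values, for illustration only.
    print(kind_config("wandb-dev.yaml"))
    print(kind_config("wandb-default.yaml"))
```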
…bgraph Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Operator-Certs now waits for cert-manager before applying certificates/issuer
- RBAC now waits for manifests/generate before applying generated roles
- Remove redundant direct manifests/generate dep from operator-controller-manager (transitively satisfied via operator-crds-ready)
- Remove redundant vm-crds-ready dep from metrics resources (transitively satisfied via Victoria-Metrics → vm-operator-ready)
- Add comment on Wandb CRD third-party-operators label explaining intent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
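Whether a direct dependency is redundant can be checked mechanically: a dep is implied if it is reachable through one of the resource's other direct deps. The sketch below shows that check on a small graph. The resource names come from the commit messages, but the exact edges in `DEPS` are an assumed reconstruction, not a dump of the Tiltfile.

```python
def reachable(graph, start):
    """Return every resource reachable from `start` by following dependency edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def redundant_deps(graph):
    """List (resource, dep) pairs where `dep` is already implied by another direct dep."""
    redundant = []
    for resource, deps in graph.items():
        for dep in deps:
            others = [d for d in deps if d != dep]
            if any(dep in reachable(graph, other) for other in others):
                redundant.append((resource, dep))
    return sorted(redundant)


# Assumed pre-cleanup edges, for illustration only.
DEPS = {
    "operator-controller-manager": ["operator-crds-ready", "manifests", "generate"],
    "operator-crds-ready": ["manifests", "generate"],
    "manifests": [],
    "generate": [],
}

if __name__ == "__main__":
    print(redundant_deps(DEPS))
```

On this graph the check flags the direct `manifests` and `generate` deps of `operator-controller-manager`, matching the removal described above.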
deploy_cert_manager() runs local() commands only and registers no named Tilt resource, so the dependency cannot be declared. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add codegen ==> rbac (RBAC gained manifests/generate deps)
- Remove codegen ==> controller (dep dropped as transitive via operator-crds-ready)
- Remove vm_crds --> kube_metrics/op_metrics/infra_metrics (deps dropped as transitive)
- Fix Victoria Stack section comment
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Replace the individual `helm_repo`/`helm_resource` calls with a single `helm_resource('third-party-operators')` backed by the `deploy/operator` umbrella chart (`wandb-operator.enabled=false`), mirroring how operators are installed in production
- `operator-crds-ready`, `vm-crds-ready`, `grafana-crds-ready`, and `vm-operator-ready` gates prevent resources from being applied before their CRDs or operator webhooks are ready, eliminating fresh-cluster race conditions
- Fix dependency declarations (`RBAC` → codegen, `Operator-Certs` → cert-manager intent documented), remove redundant transitive deps with explanatory comments
- Evaluate `wandb-operator.enabled` using `not (eq ... false)` to avoid Helm's `false | default true` trap
- Add `hack/scripts/kind-images-manager.sh` (scrape/pull/load subcommands for caching images across cluster recreations) and improve `scripts/setup_kind.sh` with single-node vs multi-node profile selection
- Add `docs/design/wandb_v2/tilt.md` with a Mermaid graph of all Tilt resource dependencies, plus agent instructions for keeping it in sync with the Tiltfile

Test plan
- `tilt up` from a fresh `kind` cluster reaches a healthy state without manual intervention
- Telemetry stack (`installTelemetry: true`) starts cleanly with all readiness gates passing in order
- `hack/scripts/kind-images-manager.sh scrape` + `pull` + `load` round-trip works against a running cluster
- `scripts/setup_kind.sh` creates a single-node cluster for dev CRDs and a multi-node cluster for non-dev CRDs
- `docs/design/wandb_v2/tilt.md` renders correctly on GitHub

🤖 Generated with Claude Code
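The `false | default true` trap from the summary comes from how the template `default` function works: it substitutes the fallback whenever the value is empty, and `false` counts as empty, so an explicit `false` is silently turned back into `true`. The sketch below simulates both forms in Python. It is a model of the behaviour, not Helm code; the function names are made up for the example, and treating an unset value as "not equal to false" is an assumption about how the template engine compares a missing value.

```python
def template_default(fallback, value):
    """Mimic the template `default` function: use `fallback` when `value` is empty."""
    # Empty means None, False, 0, "" or an empty collection, as with Python falsiness.
    return value if value else fallback


def enabled_via_default(value):
    """Model of `<value> | default true`."""
    return template_default(True, value)


def enabled_via_not_eq(value):
    """Model of `not (eq <value> false)`: only an explicit false disables."""
    return not (value is False)


if __name__ == "__main__":
    for value in (True, False, None):
        print(value, enabled_via_default(value), enabled_via_not_eq(value))
```

With the `default` form, an explicit `False` still yields enabled. The `not (eq ...)` form disables only on an explicit `False` and stays enabled when the value is unset.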