KubeBolt 1.15.0 — Enterprise foundation + Kobi grows up
1.15 lays the foundation for KubeBolt SaaS / Enterprise while staying a
clean, behavior-compatible upgrade for OSS. The headline is W1 — the
store-interface seams and the now-materialized organization → team → user
hierarchy — alongside a big round of Kobi maturation (a read-only MCP
server, conversation history, dry-run safety, and an in-vivo reliability
pass), first-class Cilium NetworkPolicy support, and a ground-up Overview
/ Capacity dashboard redesign.
Backend ships as KubeBolt 1.15.0. The agent is unchanged this cycle — it
stays at kubebolt-agent 1.1.0 (no agent image or chart bump). See the
compatibility matrix for supported pairings.
No breaking changes. Clean helm upgrade from 1.14.x. OSS behavior is
unchanged except that the org → team → user hierarchy is now materialized
(every user becomes a real member of the auto-seeded default team on first
boot; idempotent, additive) and the single-org/single-team boundary is now
enforced server-side. New BoltDB buckets are created automatically. The
bundled VictoriaMetrics moves to the -scratch image (see Fixes & security).
Enterprise / SaaS foundation (W1)
The plumbing a multi-tenant SaaS / Enterprise build needs, landed as no-op seams
in single-tenant OSS so the boundary is real before the EE features exist.
- Multi-tenancy substrate: tenant-scoped runtime with tenant context on every
request, per-cluster runtime keys, connection-pool eviction, and tenant-scoped
WebSocket fan-out. Inert in OSS (one implicit default tenant); the layer EE
builds on. (#77)
- API service tokens: kbs_ service tokens (default Autopilot scopes,
reach the read-only MCP endpoint out of the box) and kbk_ API keys
(fail-closed, no default scopes). The auth surface for headless automation and
the MCP server. (#78)
- W1 store interfaces + identity model: the org → team → user hierarchy is
now materialized in OSS (every user is a real member of the auto-seeded
default team; single-org / single-team enforced server-side), backed by the
store-interface seams (User / Tenant / Team / Cluster / Token / RefreshToken
stores) and a no-op UsageStore metering seam. Administration is
reorganized into domain hubs: Access, Agents & Ingest, AI (Kobi), and System,
plus standalone API Tokens. (#90)
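The two token families can be told apart by prefix alone. The sketch below illustrates that split and the fail-closed rule for API keys. Only the kbs_ and kbk_ prefixes come from these notes; the function names and the scope name used in the usage line are hypothetical, and this is not KubeBolt's actual auth code.

```shell
# Hypothetical helper: classify a token by its prefix.
# Only the kbs_ / kbk_ prefixes are taken from the release notes.
token_kind() {
  case "$1" in
    kbs_?*) echo "service" ;;   # service token: carries default scopes
    kbk_?*) echo "api-key" ;;   # API key: no default scopes
    *)      echo "unknown" ;;
  esac
}

# Fail-closed check for API keys: allow only if the required scope
# was granted explicitly.
# $1 = token, $2 = space-separated granted scopes, $3 = required scope
api_key_allows() {
  [ -n "$3" ] || return 1
  [ "$(token_kind "$1")" = "api-key" ] || return 1
  case " $2 " in
    *" $3 "*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

For example, `api_key_allows kbk_example "" read` fails because nothing was granted, while the same call with `"read write"` as the grant list succeeds.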
Kobi grows up
- Read-only MCP server: drive Kobi's investigation tools (overview,
resources, YAML, describe, logs, events, insights, topology, metrics) from any
MCP host (Claude Code, Cursor, a CI step, another agent) over Streamable
HTTP or stdio. 17 read tools, read-only enforced server-side (no
mutating verb is reachable even by name), Secret redaction inherited. BYO host
LLM. (#83, #87)
- Conversation history: Kobi conversations persist per user and resume
across refresh / re-login. (#81)
- Usage analytics: per-user and per-trigger cost breakdown plus a
reliability strip in the admin usage view. (#84)
- Action safety: every propose_* action now renders a dry-run preview
(dryRun=All server-side) in the proposal card before you confirm; a
configurable action-progress timeout with auto-root-cause when an
executed action stalls (e.g. a scale blocked by a ResourceQuota); and a
Stop button to cancel a streaming turn with latency hints. (#85, #86, #88,
#89)
- Reliability round (in-vivo): the multi-step loop's round budget is now
configurable (KUBEBOLT_AI_MAX_ROUNDS, default 20) with a graceful close
turn + a Continue control instead of a dead error on exhaustion;
propose_debug_pod takes a non-interactive command whose output Kobi reads
back from logs; and new reasoning rules make Kobi walk the dependency graph
(Service endpoints → pod readiness → logs) before blaming the network, and
recognize selectorless / external dependencies instead of guessing a
workload is missing. (#91)
- Prompt / UX quick-wins: per-action RBAC vs governance error clarity,
real-time fallback-model transparency, an explicit hallucination guard,
tool-error handling guidance, and persisted per-message timestamps. (#82)
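The configurable round budget with a graceful close can be pictured roughly as follows. This is a toy simulation, not Kobi's loop: KUBEBOLT_AI_MAX_ROUNDS and its default of 20 come from these notes, while the function names, the fallback to 20 on an invalid value, and the printed messages are assumptions made for illustration.

```shell
# Resolve the round budget: use KUBEBOLT_AI_MAX_ROUNDS when it is a
# positive integer, otherwise fall back to 20 (fallback on bad input
# is an assumption, not documented behavior).
max_rounds() {
  case "${KUBEBOLT_AI_MAX_ROUNDS:-}" in
    ''|*[!0-9]*|0*) echo 20 ;;
    *)              echo "$KUBEBOLT_AI_MAX_ROUNDS" ;;
  esac
}

# run_turn N: simulate a turn that needs N tool-call rounds to finish.
# Runs in a subshell so its variables do not leak into the caller.
run_turn() (
  needed=$1
  budget=$(max_rounds)
  round=0
  while [ "$round" -lt "$budget" ]; do
    round=$((round + 1))
    if [ "$round" -ge "$needed" ]; then
      echo "done after $round rounds"
      exit 0
    fi
  done
  # Budget exhausted: close gracefully instead of raising an error.
  echo "budget of $budget rounds reached: close the turn and offer Continue"
)
```

With the variable unset, `run_turn 3` finishes inside the default budget; with `KUBEBOLT_AI_MAX_ROUNDS=2` the same call takes the graceful-close branch.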
Cilium L3–L7 policies
CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy (cilium.io/v2)
are now first-class managed resources — the L3-L7 layer a standard
NetworkPolicy can't express and that KubeBolt previously only saw as Hubble
metrics. List + detail with a rendered rule breakdown (peers / ports / L7
http·dns·kafka), a matched-pods tab, YAML view/edit/delete, + New
starter templates, and Kobi can now read a Cilium policy to verify whether
an L7 deny actually blocks a flow. Present only on Cilium clusters (graceful
empty everywhere else). (#92)
Overview & Capacity dashboard redesign
- Overview KPIs become ring gauges + legend rows so the dashboard reads as
one family; the previously-invisible Degraded bucket (Pending /
CrashLoopBackOff / partially-ready) is surfaced via a new degraded
pseudo-status, with filtered deep links (?status=, ?severity=) so a KPI
links to a list that matches its count. Usage cards collapse into a graded
Used donut + bullet bar; Workload Health gains a Needs attention section;
namespace tiles segment ready / not-ready. (#97)
- Capacity trends gain cluster-wide request / limit / capacity reference
overlays (capacity ships hidden behind a pill), with toggle state persisted.
(#97)
- The top limited-access banner is now dismissible (persisted per
cluster + access shape, re-appears on an RBAC change). (#100)
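As a rough illustration of the Degraded bucket, the sketch below classifies a single pod. The function name and its four arguments are hypothetical, and reading "partially-ready" as "some but not all containers ready" is an assumption; KubeBolt's actual rule may differ.

```shell
# pod_is_degraded PHASE WAITING_REASON READY TOTAL
# Returns 0 (true) when the pod falls into the Degraded bucket as
# described in the notes: Pending, CrashLoopBackOff, or partially ready.
pod_is_degraded() {
  if [ "$1" = "Pending" ] || [ "$2" = "CrashLoopBackOff" ]; then
    return 0
  fi
  # Assumed meaning of "partially-ready": 0 < ready < total.
  if [ "$3" -gt 0 ] && [ "$3" -lt "$4" ]; then
    return 0
  fi
  return 1
}
```

A pod reported as Running with 1 of 2 containers ready would count as degraded here, while a fully ready 2 of 2 pod would not.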
Fixes & security
- Bundled VictoriaMetrics + vmagent → v1.145.0-scratch. The metrics images
move to VictoriaMetrics' application-binary-only scratch variant (no OS
packages, no OpenSSL), clearing the Go-stdlib and OpenSSL (CVE-2026-42504,
CVE-2026-45447) findings on both the release Trivy gate and the public
Artifact Hub security scan, with no suppression. Drop-in (same entrypoint +
flags), verified end to end.
- Configurable informer cache-sync timeout (KUBEBOLT_CACHE_SYNC_TIMEOUT_SECONDS,
default 45s; runtime-overridable in Settings → General) for large clusters
that flake on first connect. (#79)
- endpoints resolves by both keys, the EndpointSlice name and the
fronting Service name, closing a 404 that read downstream as "the endpoint
was deleted" and steered RCA toward recreating it. (#96)
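The dual-key lookup in the last fix can be sketched over a small in-memory index. In a real cluster an EndpointSlice records its fronting Service in the kubernetes.io/service-name label; here that pairing is flattened into a hypothetical two-column list, and the function name and sample data are invented for illustration.

```shell
# Hypothetical index: one "<endpointslice-name> <service-name>" pair per line.
SLICES='web-abc12 web
web-def34 web
db-xyz99 db'

# resolve_endpoints KEY: print every slice whose own name or whose
# fronting Service name equals KEY. Exits non-zero when nothing matches,
# so only a genuinely absent key is reported as "not found".
resolve_endpoints() {
  printf '%s\n' "$SLICES" |
    awk -v key="$1" '
      $1 == key || $2 == key { print $1; found = 1 }
      END { exit !found }
    '
}
```

Looking up `web` returns both slices behind that Service, and looking up `web-abc12` returns that one slice, so neither key style produces a false miss.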
Upgrade
helm upgrade kubebolt oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt --version 1.15.0
Preserve your overrides by capturing them and re-applying with -f (do not
use --reuse-values, which carries the old chart's bundled-image defaults
forward and keeps the previous VictoriaMetrics image):
helm get values kubebolt -n kubebolt-system -o yaml > values.yaml
helm upgrade kubebolt oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt \
-n kubebolt-system --version 1.15.0 -f values.yaml
The agent is unchanged: it stays at kubebolt-agent 1.1.0. New BoltDB
buckets auto-create and the org → team → user hierarchy materializes
idempotently on first boot. No configuration changes required.
Install
Helm (recommended for Kubernetes)
helm install kubebolt oci://ghcr.io/clm-cloud-solutions/kubebolt/helm/kubebolt
kubectl port-forward svc/kubebolt 3000:80
Single Binary (download below)
# macOS Apple Silicon
curl -LO https://github.com/clm-cloud-solutions/kubebolt/releases/download/v1.15.0/kubebolt-darwin-arm64
chmod +x kubebolt-darwin-arm64 && mv kubebolt-darwin-arm64 /usr/local/bin/kubebolt
# Linux amd64
curl -LO https://github.com/clm-cloud-solutions/kubebolt/releases/download/v1.15.0/kubebolt-linux-amd64
chmod +x kubebolt-linux-amd64 && sudo mv kubebolt-linux-amd64 /usr/local/bin/kubebolt
kubebolt --kubeconfig ~/.kube/config
All platform binaries are attached below. Verify with sha256sum -c CHECKSUMS.txt.
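If only one platform binary was downloaded, checking the whole list reports the other files as missing. One way around that is to verify just the matching line, as sketched below. It assumes CHECKSUMS.txt uses the standard sha256sum layout (hash, then file name) and sits in the same directory as the binary; the function name is hypothetical.

```shell
# verify_release FILE [CHECKSUMS]
# Check FILE against its own entry in the checksum list.
# Assumes the standard "<hash>  <name>" (or "<hash> *<name>") layout.
verify_release() (
  name=$1
  sums=${2:-CHECKSUMS.txt}
  # Keep only the line for this file, then let sha256sum verify it.
  # With no matching line, sha256sum receives no input and fails.
  awk -v f="$name" '$2 == f || $2 == "*" f' "$sums" | sha256sum -c -
)
```

For example, `verify_release kubebolt-linux-amd64` run in the download directory prints the usual OK or FAILED line for that one file.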
Docker (single container)
docker run -p 3000:3000 -v ~/.kube:/root/.kube:ro \
ghcr.io/clm-cloud-solutions/kubebolt:1.15.0
Homebrew (macOS, Linux)
brew install clm-cloud-solutions/tap/kubebolt
kubectl krew plugin
kubectl krew index add clm https://github.com/clm-cloud-solutions/krew-index.git
kubectl krew install clm/kubebolt
kubectl kubebolt
Docker Compose
git clone https://github.com/clm-cloud-solutions/kubebolt.git
cd kubebolt/deploy && docker compose up -d
Container Images
- Single container (binary + frontend): ghcr.io/clm-cloud-solutions/kubebolt:1.15.0
- API only: ghcr.io/clm-cloud-solutions/kubebolt/api:1.15.0
- Web only: ghcr.io/clm-cloud-solutions/kubebolt/web:1.15.0
Changelog
Features
- feat(web): make the limited-access banner dismissible ()
- feat(web): request/limit/capacity overlays on Capacity trends ()
- feat(api+web): degraded pseudo-status and filtered deep links ()
- feat(web): redesign Overview dashboard with gauges and legend rows ()
- feat(web): Cilium policies UI + grouped/complete + New picker ()
- feat(cluster): CiliumNetworkPolicy + CiliumClusterwideNetworkPolicy as first-class resources ()
- feat(web): step-limit notice + Continue control, Max-steps setting ()
- feat(copilot): configurable tool-call rounds + graceful close, debug_pod command ()
- feat(copilot): Kobi reasoning rules + max-rounds close directive ()
- feat(web): surface the org → team → user hierarchy in the UI ()
- feat(auth): W1 — operationalize the org→team→user hierarchy in OSS ()
- feat(api): W1 — wire the UsageStore metering seam end to end ()
- feat(usage): W1 — UsageStore metering seam (no-op in OSS) ()
- feat(auth): W1 — RefreshTokenStore + APITokenStorer interfaces (token seams) ()
- feat(cluster): W1 — ClusterStore interface (cluster persistence seam) ()
- feat(auth): W1 — UserStore interface (User seam) ()
- feat(auth): W1 — TenantStore interface (Organization seam) ()
- feat(auth): W1 — TeamStore + Team/Membership identity model ()
- feat(copilot): automatic dry-run preview on Kobi action proposals ()
- feat(copilot): cancel in-flight turn + streaming latency feedback ()
- feat(copilot): configurable action-progress timeout + auto-root-cause on stall ()
- feat(copilot): Kobi Usage breakdown — per-user / per-trigger + reliability strip ()
- feat(mcp): read-only Kobi MCP server (HTTP + stdio transports) ()
- feat(copilot): autofocus input on new conversation + timestamps in export ()
- feat(copilot): persist per-message timestamps for the conversation timeline ()
- feat(copilot): cross-reference action audit records to their conversation ()
- feat(copilot): prominent fallback badge + name the user's role on a 403 ()
- feat(copilot): prompt — tool errors are cluster signal + no hallucinated resources ()
- feat(copilot): Kobi conversation history + resume UI ()
- feat(copilot): persist Kobi conversations — per-user history + resume (backend) ()
- feat(ws): scope broadcasts by (tenant,cluster) — A.4 OSS seam ()
- feat(cluster): A.3d — parked-runtime pool eviction (idle + cap) ()
- feat(cluster): A.3c — pool-based instant switch (park + promote) ()
- feat(cluster): lazy-spin pooled runtimes with single-flight (W2 A.3b) ()
- feat(cluster): resolve getters through a per-(tenant,cluster) runtime (W2 A.3a) ()
- feat(cluster): thread request ctx into Connector/Collector/Engine (W2 A.2) ()
- feat(cluster): thread (tenant,cluster) RuntimeKey through the request (W2 A.1) ()
- feat(auth): REST API tokens — service (kbs_) + customer keys (kbk_) ()
- feat(auth): tenant-context seam for multi-tenant (W0) ()
Fixes
- fix(api): resolve by both slice name and service name ()
- fix(api): create-from-manifest registries were out of sync with the GVR map ()
- fix(mcp): review fixes — token scope, Content-Type, panic recovery, packaging ()
- fix(copilot): stop resumed/historical proposals from re-investigating a stall ()
- fix(copilot): stick with the fallback provider for the rest of the session ()
- fix(copilot): fall over to the secondary provider on a 404 ()
- fix(copilot): scope Kobi conversations per cluster, not just per user ()
- fix(copilot): record token usage for every LLM call — no untracked spend ()
- fix(copilot): isolate Kobi conversations per user across login/logout ()
- fix(security): bump VictoriaMetrics + vmagent to v1.145.0-rc0 (CVE-2026-42504) ()
- fix(cluster): configurable informer cache-sync timeout (default 45s) ()
- fix(insights): gate parked-runtime notifications to active cluster (temporary) ()
- fix(cluster): gate WS broadcasts of parked pool runtimes ()
- fix(cluster): stale overview after switch + cache slow ServerVersion ()
Documentation
- docs(auth): document why hashToken uses SHA-256, not bcrypt ()
- docs(incident-sim): Kobi incident-simulation lab — scenarios + 3-tier cascade app ()
- docs(mcp): make the service-token requirement explicit + record live verification ()
- docs(mcp): add live-server verification plan to the Kobi MCP guide ()
- docs: design for read-only multi-tenant Kobi MCP server ()
Other
- Merge pull request #102 from clm-cloud-solutions/release/1.15.0-ga ()
- Merge remote-tracking branch 'origin/main' into release/1.15.0-ga ()
- Merge pull request #101 from clm-cloud-solutions/chore/1.15.0-release ()
- chore: 1.15.0 release bump (GA) ()
- Merge pull request #100 from clm-cloud-solutions/feat/dismissible-limited-access-banner ()
- Merge pull request #99 from clm-cloud-solutions/chore/1.15.0-rc.3-release ()
- chore: 1.15.0-rc.3 release bump ()
- Merge pull request #98 from clm-cloud-solutions/chore/1.15.0-rc.2-release ()
- chore: 1.15.0-rc.2 release bump ()
- Merge pull request #97 from clm-cloud-solutions/feat/dashboard-metrics-improvements ()
- Merge pull request #96 from clm-cloud-solutions/feat/endpoints-lookup-both-keys ()
- Merge pull request #95 from clm-cloud-solutions/chore/1.15.0-rc.1-release ()
- chore: 1.15.0-rc.1 release bump ()
- Merge pull request #94 from clm-cloud-solutions/feat/kobi-incident-sim-lab ()
- Merge pull request #92 from clm-cloud-solutions/feat/cilium-network-policies ()
- Merge pull request #91 from clm-cloud-solutions/feat/kobi-reliability-invivo ()
- Merge pull request #90 from clm-cloud-solutions/feat/w1-store-interfaces ()
- refactor(web): group Administration into domain hubs ()
- refactor(auth): extract IngestTokenStore from inlined Tenant.IngestTokens ()
- Merge pull request #87 from clm-cloud-solutions/claude/kobi-mcp-readonly ()
- Merge pull request #89 from clm-cloud-solutions/feat/kobi-dryrun-preview ()
- Merge pull request #88 from clm-cloud-solutions/fix/kobi-stall-refire ()
- Merge remote-tracking branch 'origin/develop' into fix/kobi-stall-refire ()
- Merge pull request #86 from clm-cloud-solutions/feat/kobi-streaming-cancel ()
- Merge pull request #85 from clm-cloud-solutions/feat/kobi-action-progress-timeout ()
- Merge pull request #84 from clm-cloud-solutions/feat/kobi-usage-breakdown ()
- Merge pull request #83 from clm-cloud-solutions/claude/kobi-mcp-feasibility-vPSZS ()
- Merge pull request #82 from clm-cloud-solutions/feat/kobi-quick-wins ()
- test(copilot): update ActionProposalCard mocks for audit object + useAuth ()
- Merge pull request #81 from clm-cloud-solutions/feat/kobi-conversation-history ()
- Merge pull request #79 from clm-cloud-solutions/fix/cache-sync-timeout ()
- Merge pull request #80 from clm-cloud-solutions/fix/trivy-vm-cve ()
- Merge pull request #78 from clm-cloud-solutions/feat/api-service-tokens ()
- Merge remote-tracking branch 'origin/develop' into feat/api-service-tokens ()
- Merge pull request #77 from clm-cloud-solutions/feat/multitenant ()
- Merge origin/develop into feat/api-service-tokens ()
- style(copilot): bump chat type ramp one step for readability ()
- Merge remote-tracking branch 'origin/develop' into feat/multitenant ()