Ring v0.9.0

@Shine-neko Shine-neko released this 03 Jun 08:02
· 137 commits to main since this release

Changed (breaking)

  • POST /deployments now uses RFC 7807 with the same shape as POST /users (application/problem+json, violations[] with property_path, message, code). Existing 400/422 responses with {"message": "..."} body are replaced. Codes for the rules already in place:

    • deployment.runtime.unsupported — runtime must be one of: docker, cloud-hypervisor
    • deployment.command.cloud_hypervisor_unsupported
    • deployment.image.cloud_hypervisor_requires_absolute_path
    • deployment.network.host_runtime_unsupported
    • deployment.ports.host_network_conflict
    • deployment.replicas.host_network_conflict

    New rules (previously not validated, the manifest applied and broke at runtime):

    • deployment.ports.published.out_of_range / deployment.ports.target.out_of_range — port 0 is reserved
    • deployment.ports.published.duplicate — two entries publishing the same host port
    • deployment.ports.replicas_conflict + deployment.replicas.ports_conflict — publishing host ports with replicas > 1 causes inter-replica collisions
    • deployment.replicas.job_must_be_one (kind: job is one-shot)
    • deployment.health_checks.job_readiness_unsupported — readiness checks only gate rolling updates
    • deployment.environment.key.invalid — env var names must match [A-Za-z_][A-Za-z0-9_]*
    • deployment.resources.{limits,requests}.{cpu,memory}.invalid — invalid quantity string
    • deployment.config.image_pull_policy.unsupported — must be Always, IfNotPresent or Never
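
As a rough illustration of how a client could pre-check a manifest against a few of the new rules before calling POST /deployments, here is a minimal Python sketch. The function name check_manifest, the manifest field names (ports, published, target, environment) and the property_path used for environment keys are assumptions made for this example; only the codes, the port range and the env-var pattern come from the list above.

```python
import re
from typing import Dict, List

# Pattern quoted in the release notes for env var names.
ENV_KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_manifest(manifest: dict) -> List[Dict[str, str]]:
    """Return violations for a subset of the new rules (illustrative only)."""
    violations = []
    seen_published = set()
    for i, port in enumerate(manifest.get("ports", [])):
        # Port 0 is reserved, so the accepted range is 1..65535.
        for field in ("published", "target"):
            value = port.get(field)
            if value is not None and not 1 <= value <= 65535:
                violations.append({
                    "property_path": f"ports[{i}].{field}",
                    "message": "must be between 1 and 65535",
                    "code": f"deployment.ports.{field}.out_of_range",
                })
        # Two entries publishing the same host port.
        published = port.get("published")
        if published in seen_published:
            violations.append({
                "property_path": f"ports[{i}].published",
                "message": "host port is already published",
                "code": "deployment.ports.published.duplicate",
            })
        elif published is not None:
            seen_published.add(published)
    for key in manifest.get("environment", {}):
        if not ENV_KEY_RE.fullmatch(key):
            violations.append({
                # Hypothetical path shape for environment keys.
                "property_path": f"environment.{key}",
                "message": "must match [A-Za-z_][A-Za-z0-9_]*",
                "code": "deployment.environment.key.invalid",
            })
    return violations
```

Like the server, the sketch collects every failure instead of stopping at the first one. The server remains the source of truth; a local check like this only shortens the feedback loop.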

    property_path follows JSONPath conventions for nested collections: ports[0].published, volumes[2].source, resources.limits.cpu.
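
To point a user at the offending value, a client can walk a property_path through the manifest it submitted. The following Python sketch assumes the path only contains dotted names and [index] segments, as in the three examples above; the helper name resolve is hypothetical.

```python
import re

# A path segment is either a name (ports, published) or an index ([0]).
_TOKEN_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def resolve(manifest, property_path: str):
    """Follow a path such as 'ports[0].published' through dicts and lists."""
    current = manifest
    for name, index in _TOKEN_RE.findall(property_path):
        # Exactly one of the two groups is non-empty for each match.
        current = current[name] if name else current[int(index)]
    return current
```

For example, resolve(manifest, "volumes[2].source") returns the source of the third volume entry, and raises KeyError or IndexError if the path does not exist in the manifest.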

  • POST /namespaces now uses RFC 7807. Validation failures return application/problem+json (422) with codes:

    • namespace.name.length — must be 2 to 63 characters
    • namespace.name.format — lowercase DNS-label rules (a-z0-9 plus -, no leading/trailing dash)

    Conflicts (duplicate name) now return application/problem+json (409) with title: "Conflict" and a detail line naming the offending namespace, instead of the legacy {"error": "..."} shape.
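
The two namespace rules can be mirrored client-side. This is a minimal Python sketch under the assumption that the format rule is the usual DNS-label pattern described above; the function name and the regular expression are illustrative and are not taken from Ring's source.

```python
import re
from typing import List

# Lowercase letters, digits and '-', with no leading or trailing dash.
LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def namespace_name_violations(name: str) -> List[str]:
    """Return the violation codes a namespace name would trigger."""
    codes = []
    if not 2 <= len(name) <= 63:
        codes.append("namespace.name.length")
    if not LABEL_RE.fullmatch(name):
        codes.append("namespace.name.format")
    return codes
```

Both rules are evaluated independently, so a single-character uppercase name reports both codes.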

  • POST /secrets now uses RFC 7807. Validation codes:

    • secret.namespace.length / secret.namespace.format
    • secret.name.length — 2 to 253 characters
    • secret.name.format — DNS-subdomain rules (lowercase alphanumerics plus . and -)
    • secret.value.length — 1 to 1 MiB (matches Kubernetes' Secret limit)

    404 (namespace missing) and 409 (duplicate) responses are problem+json with Not Found / Conflict titles.

  • POST /configs and PUT /configs/{id} now use RFC 7807. Validation codes:

    • config.namespace.length / config.namespace.format
    • config.name.length — 1 to 253 characters
    • config.name.format — same DNS-subdomain rules as secrets
    • config.data.length — 1 to 1 MiB
    • config.data.invalid_json — on PUT, when data is non-empty but doesn't round-trip as JSON
    • config.labels.length — at most 1000 characters

    The previous 400 with {"error": "Validation failed", "details": ...} is replaced by a 422 with violations. 404 (config missing on PUT) and 409 (duplicate on POST) are problem+json.

  • POST /login now emits problem+json on 401/500 with title: "Unauthorized" and detail: "invalid credentials" (same shape on internal errors with a generic detail). The legacy {"errors": ["Invalid credentials"]} body is gone.

  • Validation errors on POST /users and PUT /users/{id} now use RFC 7807. The 422 response shape changed from the bare validator-derived {"errors": <map>} to application/problem+json:

    {
      "type": "about:blank",
      "title": "Validation failed",
      "status": 422,
      "detail": "username: must be 2 to 50 characters\npassword: must be 8 to 128 characters",
      "violations": [
        { "property_path": "username", "message": "must be 2 to 50 characters", "code": "user.username.length" },
        { "property_path": "password", "message": "must be 8 to 128 characters", "code": "user.password.length" }
      ]
    }

    Every violation carries a stable code slug (e.g. user.username.format) that clients can branch on without parsing the human message. All applicable rules run on every request — the response lists every failure in one shot instead of stopping at the first.
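
A client that wants to branch on the code slugs rather than the messages could group them per field. The Python sketch below assumes the response body has already been read as a string; the helper names codes_by_path and has_code are made up for this example.

```python
import json
from typing import Dict, List


def codes_by_path(body: str) -> Dict[str, List[str]]:
    """Group violation codes by property_path from a problem+json body."""
    problem = json.loads(body)
    grouped: Dict[str, List[str]] = {}
    # Non-validation problems (404, 409) carry no violations array.
    for violation in problem.get("violations", []):
        grouped.setdefault(violation["property_path"], []).append(violation["code"])
    return grouped


def has_code(body: str, code: str) -> bool:
    """True if any violation in the body carries the given code."""
    return any(code in codes for codes in codes_by_path(body).values())
```

With the 422 body shown above, codes_by_path returns one entry for username and one for password, so a form can highlight both fields after a single request.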

    Username format is now [a-zA-Z0-9][a-zA-Z0-9._-]* (2-50 chars), matching GitHub-style conventions for human-facing identifiers. Password rules unchanged (8-128 chars).

  • DeploymentStatus is now snake_case in the JSON API and DB. Previously the lifecycle states (pending, running, …) were lowercase while the error states (CrashLoopBackOff, ImagePullBackOff, …) were PascalCase — the mismatch silently dropped rows from string-matching filters elsewhere in the code (root cause of PR #84). All variants now share the same convention. Mapping for external consumers:

    • CrashLoopBackOff → crash_loop_back_off
    • ImagePullBackOff → image_pull_back_off
    • CreateContainerError → create_container_error
    • NetworkError → network_error
    • ConfigError → config_error
    • FileSystemError → file_system_error
    • Error → error (unchanged shape, lowercased)

    Migration 20220101000015_snake_case_deployment_status.sql rewrites existing rows. Update any script that does jq '.status == "CrashLoopBackOff"' or similar.

    Event reason strings (ImagePullBackOff, InstanceCreationFailed, …) stay PascalCase — those are event labels, not statuses.
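
Scripts that still hold old status strings (cached API responses, alert rules, fixtures) can translate them with a lookup table. This Python sketch copies the mapping listed above; the function name normalize_status is hypothetical, and values that are already snake_case pass through unchanged.

```python
# Old PascalCase error states mapped to their snake_case replacements.
LEGACY_TO_SNAKE = {
    "CrashLoopBackOff": "crash_loop_back_off",
    "ImagePullBackOff": "image_pull_back_off",
    "CreateContainerError": "create_container_error",
    "NetworkError": "network_error",
    "ConfigError": "config_error",
    "FileSystemError": "file_system_error",
    "Error": "error",
}


def normalize_status(status: str) -> str:
    """Translate a pre-0.9.0 status string; leave current ones untouched."""
    return LEGACY_TO_SNAKE.get(status, status)
```

Apply this to deployment statuses only. Event reason strings keep their PascalCase form and should not be passed through it.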

Added

  • Host-memory admission control. Before creating a Docker container or booting a Cloud Hypervisor VM, Ring now checks the deployment's requested memory (resources.requests.memory, falling back to resources.limits.memory) against the host's currently-available memory. If it doesn't fit, the deployment goes to a new terminal status insufficient_resources with an event naming the gap (needs X MiB but only Y MiB is available — free memory or lower requests/limits), instead of starting and being OOM-killed (Docker) or failing the VM spawn opaquely (CH). The status is terminal — Ring does not crash-loop, since the memory won't reappear on its own. The check is best-effort and point-in-time, and applies to memory only (CPU overcommit is left alone). Deployments that declare no memory request or limit are not gated.
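
The decision described above can be sketched in a few lines of Python. This is not Ring's implementation: it assumes memory quantities have already been parsed to bytes, treats an exact fit as admissible, and uses a hypothetical "admitted" label for the pass case. Only the fallback order, the insufficient_resources status and the wording of the gap message follow the notes.

```python
from typing import Optional, Tuple

MIB = 1024 * 1024


def effective_memory_bytes(resources: dict) -> Optional[int]:
    """requests.memory, falling back to limits.memory; None if neither is set."""
    requested = resources.get("requests", {}).get("memory")
    limit = resources.get("limits", {}).get("memory")
    return requested if requested is not None else limit


def admit(resources: dict, available_bytes: int) -> Tuple[str, Optional[str]]:
    """Return (status, event message) for a point-in-time memory check."""
    needed = effective_memory_bytes(resources)
    # Deployments with no memory request or limit are not gated.
    if needed is None or needed <= available_bytes:
        return ("admitted", None)
    message = (
        f"needs {needed // MIB} MiB but only "
        f"{available_bytes // MIB} MiB is available"
    )
    return ("insufficient_resources", message)
```

Because the check reads available memory once, right before the workload is created, it can still race with other processes on the host, which is why the notes call it best-effort.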

  • volumes: type: secret — mount a ring secret as a read-only file inside the container. The decrypted value becomes the file contents, with no key: field (a secret holds a single opaque value). Pattern matches type: config but reads from the encrypted secret store instead of the plaintext config store. Use when an app expects a credentials file path rather than an env var (Prometheus credentials_file, TLS material, etc.). The mount is always read-only; rotation requires a redeploy. See Deploy with secrets → Mount a secret as a file.

  • ring apply, ring namespace create and ring secret create render RFC 7807 problem details. On validation failure the CLI prints the title line plus every violation with its property path, e.g.

    Unable to apply deployment 'nginx': Validation failed (422)
      * ports[0].published: must be between 1 and 65535
      * replicas: replicas > 1 (3) is incompatible with `ports` — drop `ports` or reduce `replicas` to 1
    

    instead of the legacy API returned status 422: <raw body> one-liner. Non-validation problems (404, 409) print the same way with the server's title and detail. Non-7807 responses fall back to the previous behaviour.
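
For tools that call the API directly and want output in the same spirit as the CLI, the layout above can be approximated as follows. The Python function render_problem and its action parameter are assumptions for this sketch; how the CLI formats the detail line of non-validation problems is not shown in these notes, so the sketch only handles the title line and the violations.

```python
def render_problem(action: str, problem: dict) -> str:
    """Format a parsed problem+json document as a title line plus violations."""
    lines = [f"{action}: {problem['title']} ({problem['status']})"]
    for violation in problem.get("violations", []):
        lines.append(f"  * {violation['property_path']}: {violation['message']}")
    return "\n".join(lines)
```

Calling it with action set to "Unable to apply deployment 'nginx'" and a 422 body containing the ports[0].published violation reproduces the first two lines of the example above.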

Changed

  • Failed Docker image pulls now surface an actionable reason instead of a raw daemon dump. An ImagePullBackOff event previously read Failed to pull image '…': <bollard string>. Ring now classifies the failure and rewrites it: authentication refused → registry authentication failed … — check config.server, config.username and config.password; registry unreachable (connection refused, host not found, timeout) → cannot reach the registry … — is it up and the registry host correct?; missing tag/digest stays as not found. The original daemon string is preserved verbatim in (original error: …). The deployment status (image_pull_back_off) and event reason are unchanged.
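
The classification idea can be illustrated with simple substring matching. Everything specific in this Python sketch is an assumption: the marker strings, the category names and the helper names are invented for the example, and Ring's actual matching logic is not described in these notes. Only the three categories and the (original error: …) suffix come from the entry above.

```python
# Hypothetical markers; real daemon messages vary by registry and version.
AUTH_MARKERS = ("unauthorized", "authentication required")
UNREACHABLE_MARKERS = ("connection refused", "no such host", "timeout")
NOT_FOUND_MARKERS = ("not found",)


def classify_pull_failure(daemon_error: str) -> str:
    """Map a raw daemon error string to a coarse failure category."""
    text = daemon_error.lower()
    if any(marker in text for marker in AUTH_MARKERS):
        return "authentication_failed"
    if any(marker in text for marker in UNREACHABLE_MARKERS):
        return "registry_unreachable"
    if any(marker in text for marker in NOT_FOUND_MARKERS):
        return "not_found"
    return "other"


def with_original(summary: str, daemon_error: str) -> str:
    """Append the untouched daemon string so no detail is lost."""
    return f"{summary} (original error: {daemon_error})"
```

Keeping the original string next to the rewritten summary means a misclassified error can still be diagnosed from the event text.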

Dashboard (new)

  • Web dashboard (SvelteKit, served embedded by ring server start --dashboard or locally via ring dashboard). Login, Overview home page with summary cards (deployments by status, namespaces/secrets/configs counts, node health, failing deployments), and a deployments list with namespace filters and a created-at column.
  • Deployment detail page — overview, configured resources, running instances, live metrics (per-instance and aggregated CPU / memory / network I/O / disk I/O / PIDs), ports, volumes, environment, configured health checks, health-check probe history, streamed logs (live tail over SSE), and a recent-events timeline.
  • Node page — host info (hostname, OS, arch, uptime, CPU cores, memory, load average).
  • Read-only views for namespaces (with per-namespace audit trail), secrets, and configs.
  • Dark/light theme toggle (persisted, follows prefers-color-scheme) and the Ring version shown in the sidebar.
  • Per-page browser titles, copy-to-clipboard on IDs, and a responsive layout for small screens.

Added

  • ring init — interactive setup that prompts for runtime + port and generates RING_SECRET_KEY, plus --runtime / --port flags to script it non-interactively (CI, Ansible) without a TTY.
  • ring node get — host information for the server's machine.
  • Startup banner: ring server start prints a Vite-style banner with the API's Local/Network URLs, the dashboard URL (when enabled), and the registered runtimes.
  • Semantic CLI colours and aligned tables — errors red, success green, status column colour-coded; ANSI is dropped automatically in pipes / under NO_COLOR / with --output json.
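
The colour rule in the last item amounts to a small decision function. The Python sketch below is an approximation under stated assumptions: the function name use_color is hypothetical, NO_COLOR is interpreted per the common convention (set and non-empty disables colour), and "in pipes" is modelled as stdout not being a terminal.

```python
from typing import Mapping


def use_color(is_tty: bool, env: Mapping[str, str], output_format: str) -> bool:
    """Decide whether ANSI colour codes should be emitted."""
    if output_format == "json":
        return False  # machine-readable output stays free of escape codes
    if env.get("NO_COLOR"):
        return False  # NO_COLOR convention: any non-empty value disables colour
    return is_tty  # piped or redirected output gets plain text
```

In a real program the inputs would typically come from sys.stdout.isatty() and os.environ.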

Changed

  • CLI commands exit non-zero on failure. A command that can't reach the API or gets a non-2xx response now returns a categorised exit code (1 general, 2 auth, 3 connection, 4 not-found, 5 conflict) instead of silently exiting 0 — so set -e, CI gates and && chains detect the error. Network failures render a single human line instead of the raw reqwest chain.
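
A wrapper script can reason about the five categories with a mapping like the one below. The exit code numbers come from the entry above; which HTTP statuses fall into each category is an assumption of this Python sketch (in particular, treating 403 as an auth failure), as is the function name.

```python
from typing import Optional


def exit_code(http_status: Optional[int], connection_failed: bool = False) -> int:
    """Map an API outcome to a categorised exit code (illustrative mapping)."""
    if connection_failed:
        return 3  # connection
    if http_status is not None and 200 <= http_status < 300:
        return 0  # success
    if http_status in (401, 403):
        return 2  # auth
    if http_status == 404:
        return 4  # not-found
    if http_status == 409:
        return 5  # conflict
    return 1  # general
```

A CI step can then retry only on exit code 3 and fail immediately on the others.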

Fixed

  • Scheduler no longer gets stuck cleaning up a deleted deployment whose referenced secret/config/volume was removed — it reconciles straight to teardown instead of failing resolution every tick.
  • CreateContainerError converges to CrashLoopBackOff instead of looping forever; orphan Docker containers are cleaned up on create/connect-network failures; kind: job create errors converge to Failed.
  • Deployment status enum unified to snake_case end to end (DB, JSON, events), fixing rows silently dropped from string-matched filters.

Security

  • IDOR fix on PUT/DELETE /users/{id} (a user could modify/delete another); minimal-role authorization check.
  • Unique random salt per password hash (previously deterministic — identical passwords produced identical hashes).
  • Single auth middleware at the router with a fail-closed User extractor; namespace-scoped audit log for create/update/delete.
  • Dependency bumps closing OpenSSL CVEs.
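
To show why the per-hash salt in the second item matters, here is a small Python sketch. It uses PBKDF2 from the standard library purely as a stand-in: these notes do not say which hashing algorithm Ring uses, and the function names and iteration count are illustrative.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Hash a password with a fresh random salt unless one is supplied."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Hashing the same password twice yields two different digests because the salts differ, so identical passwords can no longer be spotted by comparing stored hashes. Verification still works because the salt is stored alongside the digest.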

Internal

  • CI: clippy is now blocking (-D warnings) and a dashboard lint job (biome + svelte-check) was added.
  • Refactor: presentation helpers (style, output, problem_json) moved to a cli module; shared OutputFormat ValueEnum for --output.