Epic: Phase 3 — Docker-free installer with native binaries #64

@jh-lee-cryptolab

Context

Phase 1 deleted Docker artifacts from the repo but left install.sh and the cloud-init scripts in their old Docker-based form. This phase rewrites them to install and run the native binary produced by Phase 2 (#63). The admin CLI becomes a real command on PATH.

Install directory is unified to /opt/rune-vault/ for both local and cloud targets. The prior local path ~/rune-vault/ existed to sidestep Docker Desktop's VirtioFS mount permission issue on macOS when mounting volumes under /opt/; with the native binary there is no VirtioFS layer and no permission issue, so /opt/rune-vault/ is usable on Linux and macOS alike.

Target platforms: Linux and macOS only — Windows is out of scope for rune-vault.
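A minimal sketch of the platform check this implies, assuming the installer keys off `uname`. The `linux`/`darwin` and `amd64`/`arm64` labels and the `os_arch` output shape are illustrative assumptions; the issue does not specify how release assets are named.

```shell
# Map `uname -s` output to an OS label; anything other than Linux or macOS is rejected.
# The label strings are assumptions, not confirmed asset names.
normalize_os() {
  case "$1" in
    Linux)  echo linux ;;
    Darwin) echo darwin ;;
    *)      echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Map `uname -m` output to an architecture label (assumed naming).
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "unsupported arch: $1" >&2; return 1 ;;
  esac
}

# Print "<os>_<arch>" for the current host, or fail on an unsupported platform.
detect_platform() {
  os=$(normalize_os "$(uname -s)") || return 1
  arch=$(normalize_arch "$(uname -m)") || return 1
  echo "${os}_${arch}"
}
```

Failing early here keeps Windows shells such as MSYS or Cygwin from getting as far as the download step.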

flowchart TB
  Start[install.sh] --> Det[Detect OS/arch]
  Det --> DL[Download archive + .sig + .pem<br/>+ SHA256SUMS from GH release]
  DL --> Ver{shasum + cosign<br/>verify-blob}
  Ver -->|fail| Abort[Abort<br/>no partial state]
  Ver -->|ok| Extract[Extract to install dir]
  Extract --> TLS[Generate self-signed TLS<br/>moved from docker-entrypoint.sh]
  TLS --> Svc{Platform}
  Svc -->|Linux| Systemd[systemd unit]
  Svc -->|macOS| Launchd[launchd plist]
  Systemd & Launchd --> Link[Symlink to PATH]
  Link --> Status[runevault status ✓]
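The verify step in the flow above could be sketched as follows. The assumptions are loud ones: the files are staged in a temporary directory, the signature and certificate are named `<archive>.sig` and `<archive>.pem`, and `COSIGN_IDENTITY` / `COSIGN_ISSUER` are hypothetical variables holding the expected signer identity and OIDC issuer for keyless verification.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is present
# (sha256sum on most Linux distros, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Succeed only if SHA256SUMS lists the archive name with a matching digest.
verify_checksum() {  # verify_checksum <archive> <SHA256SUMS>
  want=$(awk -v name="$(basename "$1")" \
    '{f=$2; sub(/^\*/, "", f)} f == name {print $1}' "$2")
  [ -n "$want" ] && [ "$want" = "$(sha256_of "$1")" ]
}

# Keyless cosign check. COSIGN_IDENTITY and COSIGN_ISSUER are hypothetical
# placeholders; the real values depend on how Phase 2 signs releases.
verify_signature() {  # verify_signature <archive> <sig> <pem>
  cosign verify-blob \
    --certificate "$3" \
    --signature "$2" \
    --certificate-identity "$COSIGN_IDENTITY" \
    --certificate-oidc-issuer "$COSIGN_ISSUER" \
    "$1"
}

# Run both checks against staged files; the caller extracts only on success.
verify_release() {  # verify_release <staging-dir> <archive-name>
  verify_checksum "$1/$2" "$1/SHA256SUMS" \
    || { echo "checksum mismatch: $2" >&2; return 1; }
  verify_signature "$1/$2" "$1/$2.sig" "$1/$2.pem" \
    || { echo "signature check failed: $2" >&2; return 1; }
}
```

Staging under `mktemp -d` and extracting only after `verify_release` succeeds is one way to satisfy the "no partial state" abort path, since the install directory is never touched on failure.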

Non-goals

  • Rewriting the installer in Go (bash stays)
  • Windows support (explicitly out of scope — Linux + macOS only)
  • In-place binary upgrade (install.sh --upgrade) — deferred to a follow-up PR; this phase covers fresh install and uninstall only
  • Installer-driven auto-update

Design

Service registration:

| OS | Method | Unit |
| --- | --- | --- |
| Linux | systemd | /etc/systemd/system/runevault.service (system-wide when installed with sudo) |
| macOS | launchd | ~/Library/LaunchAgents/com.cryptolabinc.runevault.plist |
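For the macOS row, a rough sketch of how the installer might template the plist. Only the label and the config path come from this issue; `RunAtLoad`, and the assumption that `daemon start` is the right command for launchd to supervise, are illustrative guesses.

```shell
# Print a launchd plist for the given install dir on stdout.
# RunAtLoad is an assumed choice; the issue does not list plist keys.
render_launchd_plist() {  # render_launchd_plist <install-dir>
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.cryptolabinc.runevault</string>
  <key>ProgramArguments</key>
  <array>
    <string>$1/runevault</string>
    <string>daemon</string>
    <string>start</string>
    <string>--config</string>
    <string>$1/configs/runevault.conf</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
EOF
}
```

The output would be redirected to `~/Library/LaunchAgents/com.cryptolabinc.runevault.plist` before the job is loaded.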

New flags:

| Flag | Behaviour |
| --- | --- |
| --uninstall | Remove service + unit file. Prompt before deleting the install dir. |
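The "prompt before deleting" behaviour might look like the sketch below. The function names and prompt wording are hypothetical, and the service and unit file are assumed to have been removed by an earlier step.

```shell
# Ask a yes/no question on stderr; only an explicit y/yes (any case) succeeds.
# Plain Enter or EOF counts as "no".
confirm() {  # confirm <question>
  printf '%s [y/N] ' "$1" >&2
  read -r reply || :
  case "$reply" in
    [yY]|[yY][eE][sS]) return 0 ;;
    *)                 return 1 ;;
  esac
}

# Delete the install dir only after confirmation.
remove_install_dir() {  # remove_install_dir <dir>
  case "$1" in
    ''|/) echo "refusing to remove '$1'" >&2; return 1 ;;
  esac
  if confirm "Delete $1 and all vault data?"; then
    rm -rf -- "$1"
  else
    echo "Keeping $1" >&2
  fi
}
```

Defaulting to "no" means an operator who presses Enter through the prompt keeps the keys and audit logs.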

Cloud deployments (deployment/{aws,gcp,oci}/): drop apt install docker-ce from cloud-init; download the asset with curl, verify with cosign, install the systemd unit. Terraform state layout unchanged.

Alias: the legacy runevault shell alias (docker exec -it rune-vault …) is removed — the binary is now literally named runevault and is placed on PATH, so operators just type runevault.

Configuration & port tuning: the installer writes runevault.conf (YAML) under configs/. The service unit's ExecStart invokes runevault daemon start --config /opt/rune-vault/configs/runevault.conf, with no EnvironmentFile= and no env templating. To apply a change, operators edit runevault.conf and then run either runevault daemon restart or systemctl restart runevault (the two are equivalent). This replaces the docker-compose.yml edit + docker compose up -d flow.
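As an illustration of the unit this describes, here is a sketch of a template function. `ExecStart` and `User=rune` follow this issue; the `Description`, `After=`/`Wants=` ordering, and `Restart=on-failure` lines are assumptions, and `Type=` is left as a parameter because that decision belongs to Phase 1.

```shell
# Print a systemd unit for the given install dir and service type on stdout.
# Lines other than Type, User, and ExecStart are assumed defaults.
render_systemd_unit() {  # render_systemd_unit <install-dir> <service-type>
  cat <<EOF
[Unit]
Description=rune-vault daemon
After=network-online.target
Wants=network-online.target

[Service]
Type=$2
User=rune
ExecStart=$1/runevault daemon start --config $1/configs/runevault.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

A call such as `render_systemd_unit /opt/rune-vault simple > /etc/systemd/system/runevault.service`, followed by `systemctl daemon-reload`, would register it. All tunables stay in runevault.conf, so the unit never needs re-templating when a port changes.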

Example /opt/rune-vault/configs/runevault.conf (see Phase 1 #61 for full schema):

daemon:
  pid_file: /opt/rune-vault/.runevault.pid

server:
  grpc:
    host: 0.0.0.0
    port: 50051
    tls:
      cert: /opt/rune-vault/certs/server.pem
      key: /opt/rune-vault/certs/server.key
  admin:
    socket: /opt/rune-vault/admin.sock

keys:
  path: /opt/rune-vault/vault-keys
  index_name: my-team

envector:
  endpoint: https://envector.example.com
  api_key: <opaque-token-string>

tokens:
  team_secret: <random-hex-32-bytes>
  roles_file: /opt/rune-vault/configs/roles.yml
  tokens_file: /opt/rune-vault/configs/tokens.yml

audit:
  mode: file+stdout
  path: /opt/rune-vault/logs/audit.log

runevault.conf is mode 0600 and holds both envector.api_key and tokens.team_secret inline. Splitting secrets into a separate secrets/ directory is only worthwhile when integrating with an external keystore (K8s Secrets, HashiCorp Vault, etc.); without one, two 0600 files under the same vault user is cosmetic, not a real security boundary. The *_file form stays available in the schema for the keystore-integration case but is not the default.
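Since the file carries secrets inline, the write itself should never leave it readable by others, even briefly. One possible approach is sketched below; the helper name and the temp-file suffix are hypothetical.

```shell
# Write stdin to <path> without the contents ever sitting in a file wider
# than 0600: create a temp file under umask 077, then rename it into place.
write_private_file() {  # write_private_file <path>
  tmp="$1.tmp.$$"
  (
    umask 077        # files created in this subshell get mode 0600
    cat > "$tmp"
  ) && mv -f "$tmp" "$1"
}
```

The installer would pipe the rendered YAML into `write_private_file /opt/rune-vault/configs/runevault.conf`. Renaming a fresh 0600 file over the target also handles the case where an older, more permissive runevault.conf already exists.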

Interactive configuration: install.sh collects settings through an interactive prompt session and writes the answers into configs/runevault.conf. Default values are shown in brackets; pressing Enter accepts the default. Fields collected:

  • gRPC host + port (defaults 0.0.0.0 / 50051)
  • Admin socket path (default /opt/rune-vault/admin.sock)
  • TLS cert + key path (default: auto-generate self-signed into certs/)
  • enVector endpoint + API key (written inline into runevault.conf as envector.api_key)
  • Team secret (prompted with "generate a new one?" option; written inline into runevault.conf as tokens.team_secret)
  • Vault index name, embedding dim (defaults <team-name> / 1024)
  • Audit log mode (default file+stdout) and path (default logs/audit.log)
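The bracketed-default prompt and the "generate a new one?" branch could be sketched like this. Both helper names are hypothetical; the 64-character hex output matches the `<random-hex-32-bytes>` placeholder in the example config.

```shell
# Show "<label> [<default>]: " on stderr and print the chosen value on stdout.
# An empty answer (plain Enter) or EOF selects the default.
prompt_default() {  # prompt_default <label> <default>
  printf '%s [%s]: ' "$1" "$2" >&2
  IFS= read -r answer || :
  printf '%s\n' "${answer:-$2}"
}

# Print 32 random bytes as 64 lowercase hex characters.
generate_team_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

Writing the prompt to stderr keeps stdout clean, so a caller can capture the answer with `port=$(prompt_default "gRPC port" 50051)`. Because the helper reads plain stdin, the same code path serves an operator at a terminal and a CI job feeding answers through a heredoc.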

The same prompts run for AWS/GCP/OCI cloud deployments (before the Terraform apply that provisions the VM); answers are baked into the cloud-init that drops runevault.conf on the VM at first boot.

Preflight port check: install.sh probes the target gRPC port (from the interactive answers) before writing the service unit. On conflict: abort with a diff-style message naming the occupying process (lsof -i:<port> / ss -ltnp hint). Admin socket skips preflight (daemon unlinks any stale socket file before Listen).
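A sketch of the preflight, assuming `ss` is preferred on Linux and `lsof` is the fallback on macOS. The function names and message text are illustrative. Without root, `ss -p` may omit the process name for sockets owned by other users.

```shell
# Accept only integers in 1..65535.
is_valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Print whatever is listening on a TCP port (empty output if nothing is).
# Fails only when no probing tool is available.
port_listeners() {  # port_listeners <port>
  if command -v ss >/dev/null 2>&1; then
    ss -H -ltnp "sport = :$1" || true
  elif command -v lsof >/dev/null 2>&1; then
    lsof -nP -iTCP:"$1" -sTCP:LISTEN || true
  else
    echo "neither ss nor lsof found; cannot probe port $1" >&2
    return 2
  fi
}

# Abort (non-zero) if the port is invalid, unprobeable, or already taken.
preflight_port() {  # preflight_port <port>
  is_valid_port "$1" || { echo "invalid port: $1" >&2; return 1; }
  listeners=$(port_listeners "$1") || return 2
  if [ -n "$listeners" ]; then
    printf 'port %s is already in use:\n%s\n' "$1" "$listeners" >&2
    return 1
  fi
}
```

Running this before the service unit is written means a conflict leaves no unit file behind to clean up.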

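Admin socket permissions: the service unit runs the daemon as the vault user (User=rune under systemd; the UserName key in the launchd plist), so the daemon-created admin.sock is owned by that user and has mode 0600. Note that launchd applies UserName only to jobs loaded into the privileged system domain, so a per-user LaunchAgent under ~/Library/LaunchAgents runs as the logged-in user regardless. The installer verifies the mode after the first daemon start.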
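The post-start verification might be done with `find`, which behaves the same on GNU and BSD userlands and avoids the differing `stat` flags between Linux and macOS. The helper names below are hypothetical.

```shell
# True when <path> has exactly <octal-mode> and is owned by <user> (name or uid).
has_mode_and_owner() {  # has_mode_and_owner <path> <octal-mode> <user>
  [ -n "$(find "$1" -maxdepth 0 -perm "$2" -user "$3" 2>/dev/null)" ]
}

# Check that the admin socket exists, is a socket, and is 0600 for the vault user.
check_admin_sock() {  # check_admin_sock <socket-path> <vault-user>
  [ -S "$1" ] || { echo "$1 is not a socket" >&2; return 1; }
  has_mode_and_owner "$1" 600 "$2" \
    || { echo "$1: expected mode 0600 owned by $2" >&2; return 1; }
}
```

For example, `check_admin_sock /opt/rune-vault/admin.sock rune` would run once the daemon reports it is up.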

Install directory layout (/opt/rune-vault/ on Linux and macOS alike):

/opt/rune-vault/
  runevault                          # binary, target of /usr/local/bin/runevault symlink
  admin.sock                         # UDS created by daemon (mode 0600, vault-user owned)
  .runevault.pid                     # hidden — daemon writes PID here, removes on clean exit
  configs/                           # operator-editable config + dynamic runtime state
    runevault.conf                   # YAML config (mode 0600 — holds inline api_key + team_secret)
    roles.yml                        # dynamic state, daemon rewrites (mode 0600)
    tokens.yml                       # dynamic state, daemon rewrites (mode 0600)
  certs/
    server.pem                       # TLS cert (auto-generated on fresh install unless supplied)
    server.key                       # mode 0600
  vault-keys/                        # FHE key files (Enc/Sec/Eval JSON)
  logs/
    audit.log                        # if audit.mode includes `file`

runevault.conf is visible (operators edit it routinely). .runevault.pid is hidden (daemon-internal bookkeeping). The daemon owns runtime artefacts (.runevault.pid, admin.sock, the contents of configs/roles.yml + configs/tokens.yml); the installer owns everything else on fresh install. Nothing references Docker volumes.
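To make the ownership split concrete, here is a sketch of the installer-owned part of the layout. The directory modes (0755 for the root, 0700 for subdirectories) are assumptions, as the issue specifies modes only for individual files. Ownership changes to the vault user are omitted.

```shell
# Create the install root and the installer-owned subdirectories.
# Daemon-owned artefacts (.runevault.pid, admin.sock) are deliberately not created.
create_layout() {  # create_layout <root>
  install -d -m 0755 "$1" || return 1
  for sub in configs certs vault-keys logs; do
    install -d -m 0700 "$1/$sub" || return 1
  done
}

# Point <bin-dir>/runevault at the binary inside the install root.
link_binary() {  # link_binary <root> <bin-dir>
  ln -sfn "$1/runevault" "$2/runevault"
}
```

On a real host this would be `create_layout /opt/rune-vault` followed by `link_binary /opt/rune-vault /usr/local/bin`.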

Acceptance

  • Fresh Ubuntu 22.04 / 24.04: install.sh leaves systemctl status runevault active, runevault status returns 200, Docker is not installed on the host
  • Fresh macOS 14: LaunchAgent running, runevault status returns 200
  • --uninstall cleanly removes the service and unit file; data removal is prompted
  • Interactive install session answering gRPC port: 51000 produces a runevault.conf with server.grpc.port: 51000; the running daemon listens on that port; runevault status succeeds via the default admin socket
  • Installer respects the "accept default" path (every prompt Enter-through) and writes a complete runevault.conf plus referenced secret files without further operator action
  • Scripted installs drive the prompts through stdin (heredoc / expect) — works in CI smoke tests
  • runevault daemon stop and runevault daemon restart, run directly by an operator (no installer or service manager involvement), successfully control the service
  • Preflight aborts with a clear message when the target gRPC port is already occupied, naming the occupying process
  • Admin socket has mode 0600 and vault-user ownership after first daemon start; connect() from another user fails with permission denied
  • Cloud AWS/GCP/OCI deployments succeed with no Docker on the VM
  • Signature-mismatch → abort before any write to disk (no partial state)

Open questions

  • Homebrew tap — deferred to a post-migration packaging effort.
  • systemd unit Type=simple vs Type=notify — decided in Phase 1 skeleton PR; this phase just consumes that decision when templating the unit.

Sequencing

Phase 3 of 3. Depends on Phase 1 (#61) and Phase 2 (#63). Final phase.

Metadata

Labels

  • epic: Large work item spanning multiple PRs / tasks
  • infra: Infrastructure and deployment
  • migration: Platform / stack migration work
