Context
Issue #220 landed pre-built LXC base images (matrix arm64 + x64, published to the `rolling-images` Release tag). Deploy time after the image is imported drops from 60–90s to ≤15s.
But on first run, fetching the ~300–500 MB image plus the ~10s import is roughly a wash with the current per-deploy build (~60–90s) on a typical home WiFi connection. For users who only ever deploy one agent, the image path can actually be slower than the build path.
Two improvements to make this a clear win in all cases
A. Shrink the published image
Current image likely includes:
- Full apt cache (`/var/cache/apt/archives`): 20–50 MB recoverable
- Locale data (`/usr/share/locale/*` minus C/POSIX): 30–80 MB recoverable
- npm install with devDependencies: 50–150 MB recoverable via `npm prune --omit=dev` (the current spelling of the deprecated `--production` flag)
- Doc directories (`/usr/share/doc`, `/usr/share/man`): 20–40 MB recoverable
- Possibly redundant Node.js binaries
In the build workflow, before `incus publish`:
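```bash
# extglob is off by default in non-interactive bash, so enable it for the !(...) pattern below.
incus exec build-base -- bash -O extglob -c '
apt-get clean
rm -rf /var/cache/apt/archives /var/lib/apt/lists/*
# Drop docs, man pages, and every locale except the ones listed.
rm -rf /usr/share/doc /usr/share/man /usr/share/locale/!(C|POSIX|en_US.utf8)
# "|| true" keeps the image build going even if the prune step fails.
cd /opt/openclaw && npm prune --omit=dev --omit=optional || true
'
```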
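The per-directory savings listed above are estimates, so it may be worth measuring them before and after the trim. A minimal sketch, assuming Python is available inside the build container and that the npm tree lives at `/opt/openclaw/node_modules` (an assumption based on the `cd /opt/openclaw` step; `tree_size_bytes` and `CANDIDATES` are illustrative names, not part of the codebase):

```python
import os


def tree_size_bytes(root: str) -> int:
    """Sum the apparent size of regular files under root, skipping symlinks."""
    total = 0
    # os.walk silently yields nothing if root does not exist.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


# Paths taken from the list above; the node_modules location is an assumption.
CANDIDATES = [
    "/var/cache/apt/archives",
    "/usr/share/locale",
    "/usr/share/doc",
    "/usr/share/man",
    "/opt/openclaw/node_modules",
]

if __name__ == "__main__":
    for path in CANDIDATES:
        print(f"{tree_size_bytes(path) / 1e6:8.1f} MB  {path}")
```

This reports uncompressed apparent sizes and counts hard-linked files once per link, so the numbers are an upper bound on what the compressed image would actually shed.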
Target: ≤200 MB compressed per arch (down from ~300–500). At ≤200 MB, even a 50 Mbps home connection downloads in ~30s, and the import step is the same regardless of size.
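The arithmetic behind these figures can be sketched as follows. It assumes decimal megabytes, a link that sustains its nominal rate, and the ~10s import time quoted in the context section; `first_run_seconds` is a hypothetical helper for illustration only.

```python
def first_run_seconds(image_mb: float, link_mbps: float, import_s: float = 10.0) -> float:
    """Estimated seconds to download an image and import it."""
    download_s = image_mb * 8 / link_mbps  # megabytes -> megabits, divided by link rate
    return download_s + import_s


for size_mb in (200, 300, 500):
    print(f"{size_mb} MB at 50 Mbps: ~{first_run_seconds(size_mb, 50):.0f}s")
# 200 MB at 50 Mbps: ~42s
# 300 MB at 50 Mbps: ~58s
# 500 MB at 50 Mbps: ~90s
```

At 50 Mbps the current 300–500 MB image lands in the same 60–90s range as the build path, while a 200 MB image comes in clearly under it.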
B. Make `ensure_image_present` non-blocking on first boot
Current behaviour: the startup hook in `tinyagentos/app.py` calls `ensure_image_present`, which downloads the image synchronously. On a fresh install, a user who just wants to see the UI is held up by a 300+ MB download before `/api` even reports healthy.
Options (pick one):
1. Background: kick off the download as a background asyncio task and return startup-healthy immediately. The deployer's existing fallback to the per-deploy build already covers the window before the image lands. The first deploy ever may be slow (build); subsequent deploys are fast.
2. Opt-in: only run `ensure_image_present` when the user explicitly clicks "Download fast-deploy image" in the providers / agents settings UI. Until they opt in, every deploy uses the build path.
3. Lazy on first deploy: defer the download to the first `POST /api/agents/deploy` call. That deploy uses the build path while the image downloads in the background; the second deploy onwards uses the image.
Option 3 is probably the best UX: the user pays the cost while they're already waiting for something, not during "taOS startup".
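A minimal sketch of the lazy-on-first-deploy idea, under stated assumptions: `ImagePrefetcher`, `deploy`, and `fake_ensure_image_present` are hypothetical names for illustration, and the real `ensure_image_present` signature and deployer fallback may differ.

```python
import asyncio


class ImagePrefetcher:
    """Starts the image download at most once at a time and never blocks the caller."""

    def __init__(self, download):
        self._download = download  # coroutine function that fetches and imports the image
        self._task = None          # created lazily on the first deploy

    @property
    def ready(self) -> bool:
        """True only once a download task has finished without raising."""
        t = self._task
        return (t is not None and t.done()
                and not t.cancelled() and t.exception() is None)

    def kick(self) -> None:
        """Start the download in the background; retry if a previous attempt failed."""
        if self._task is None or (self._task.done() and not self.ready):
            self._task = asyncio.create_task(self._download())


async def deploy(prefetcher: ImagePrefetcher) -> str:
    """Hypothetical deploy handler: use the image if it is present, else build."""
    if prefetcher.ready:
        return "image"
    prefetcher.kick()  # returns immediately; the download continues in the background
    return "build"


async def fake_ensure_image_present() -> None:
    """Stand-in for the real download; sleeps briefly instead of fetching."""
    await asyncio.sleep(0.01)


async def demo() -> list:
    prefetcher = ImagePrefetcher(fake_ensure_image_present)
    first = await deploy(prefetcher)   # no image yet: build path, download starts
    await asyncio.sleep(0.05)          # give the background download time to finish
    second = await deploy(prefetcher)  # image is now present
    return [first, second]


print(asyncio.run(demo()))  # ['build', 'image']
```

If the real `ensure_image_present` is a blocking function rather than a coroutine, one option would be to wrap it with `asyncio.to_thread` so the download does not stall the event loop.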
Acceptance
Out of scope
- Switching the base distro (Alpine, Wolfi) — separate experiment if size is still a concern after the trims above.
- Image-streaming / simplestreams server — only worth it if we hit GitHub's bandwidth costs, which we won't on the Free tier.
- Peer-share via the cluster (a host that already has the image serves it to other workers) — depends on the broader cluster work.
Related