Commit 19baefd

Infra: Add uptime monitoring, harden deploy script
- Add UptimeRobot + Pushover monitoring for getcmdr.com
- Build new image before stopping old container so failed builds don't take down the site
- Log deploy output to `/var/log/cmdr/deploy-website.log`
- Verify container is running after deploy
- Document journalctl access for `deploy-cmdr` user
1 parent f4e16d5 commit 19baefd

3 files changed

Lines changed: 74 additions & 5 deletions

docs/tooling/monitoring.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Uptime monitoring

We use [UptimeRobot](https://uptimerobot.com/) (free tier) to monitor `getcmdr.com`.

## What's monitored

- **HTTP monitor** on `https://getcmdr.com` — checks every 5 minutes from multiple regions.

## Alerts

- **Email**: Goes to the account owner's email (default alert contact).
- **Pushover**: Push notifications to phone via [Pushover](https://pushover.net/). Configured as an alert contact in UptimeRobot.

## Links

- **Dashboard**: https://uptimerobot.com/dashboard (login required)
- **Public status page**: https://stats.uptimerobot.com/MHKbVOfrcB

## Notes

- Free tier allows 50 monitors at 5-minute intervals.
- If we need to monitor more endpoints later (for example, the license server), add them in the UptimeRobot dashboard.
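
For a manual spot check between the monitor's 5-minute polls, something like the sketch below may be handy. It is not part of this commit. The helper names `is_up_status` and `http_status` are hypothetical, and treating 2xx and 3xx responses as "up" is an assumption about what should count as healthy, not a documented rule of the monitoring setup.

```bash
# Hypothetical helper: succeed when an HTTP status code looks healthy.
# Assumption: 2xx and 3xx count as "up"; anything else counts as "down".
is_up_status() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the HTTP status code for a URL.
# curl prints 000 when no HTTP response was received at all.
http_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Usage (not run here, since it needs network access):
#   if is_up_status "$(http_status https://getcmdr.com)"; then echo up; else echo down; fi
```

Keeping the classification separate from the network call makes the rule easy to adjust or test without hitting the site.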

infra/deploy-webhook/README.md

Lines changed: 25 additions & 0 deletions
@@ -10,11 +10,36 @@ Webhook listener for GitHub Actions to trigger deployments without requiring SSH
4. Webhook verifies the HMAC-SHA256 signature
5. If valid, runs `deploy-website.sh`

+The deploy script builds the new Docker image **before** stopping the old container. If the
+build fails, the existing site stays up.
+
## Files

- `hooks.json` — Webhook configuration (reads secret from env var)
- `deploy-website.sh` — The actual deployment script

+## Logs
+
+Deploy output is appended to `/var/log/cmdr/deploy-website.log` on the server.
+
+To view recent deploy logs:
+
+```bash
+ssh hetzner "tail -50 /var/log/cmdr/deploy-website.log"
+```
+
+## Granting journalctl access to the deploy user
+
+The `deploy-cmdr` user can't read systemd journal by default. To fix that without
+giving sudo, add it to the `systemd-journal` group:
+
+```bash
+sudo usermod -aG systemd-journal deploy-cmdr
+```
+
+After that, `deploy-cmdr` can run `journalctl` to see service logs (read-only, no
+sudo needed).
+
## Security

The webhook uses HMAC-SHA256 signature verification. Only requests signed with the correct
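
To illustrate what HMAC-SHA256 signature verification involves, here is a rough sketch using `openssl`. It is not the webhook listener's actual code. The helper names `hmac_sha256_hex` and `verify_signature` are hypothetical, and the way the real request carries its signature (header name, any prefix) is not shown in this diff, so the sketch compares bare hex digests only.

```bash
# Hypothetical helper: print the hex HMAC-SHA256 of stdin under the given secret.
# Caveat: passing the secret as an argument exposes it in the process list,
# so treat this as a debugging aid only.
hmac_sha256_hex() {
    openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# Hypothetical helper: succeed only if the claimed hex digest matches the body.
# Caveat: a plain string comparison is not constant-time.
verify_signature() {
    local secret="$1"
    local body="$2"
    local claimed="$3"
    local expected
    expected=$(printf '%s' "$body" | hmac_sha256_hex "$secret")
    [ "$expected" = "$claimed" ]
}

# Usage: sign a payload, then check it.
#   sig=$(printf '%s' "$payload" | hmac_sha256_hex "$secret")
#   verify_signature "$secret" "$payload" "$sig" && echo "signature ok"
```

Any change to the payload or the secret produces a different digest, which is why a request without the correct secret cannot trigger a deploy.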

infra/deploy-webhook/deploy-website.sh

Lines changed: 27 additions & 5 deletions
@@ -3,20 +3,42 @@ set -e

# Deploy website script
# Triggered by GitHub Actions via webhook after CI passes
+#
+# Safety: builds the new image BEFORE stopping the old container.
+# If the build fails, the existing site stays up.

+LOG_DIR="/var/log/cmdr"
+LOG_FILE="$LOG_DIR/deploy-website.log"
+mkdir -p "$LOG_DIR"
+
+# Redirect all output to both the log file and stdout
+exec > >(tee -a "$LOG_FILE") 2>&1
+
+echo ""
echo "=== Starting website deployment ==="
-echo "Time: $(date)"
+echo "Time: $(date --iso-8601=seconds)"

cd /opt/cmdr

echo "Pulling latest code..."
git pull origin main

-echo "Rebuilding website container..."
+echo "Building new image (old site stays up during build)..."
cd apps/website
-docker compose down
docker compose build --no-cache
+
+echo "Swapping containers..."
+docker compose down
docker compose up -d

-echo "=== Deployment complete ==="
-echo "Time: $(date)"
+echo "Verifying container is running..."
+sleep 2
+if docker compose ps --status running | grep -q getcmdr-static; then
+    echo "=== Deployment succeeded ==="
+else
+    echo "=== ERROR: Container not running after deploy ==="
+    docker compose logs --tail 20
+    exit 1
+fi
+
+echo "Time: $(date --iso-8601=seconds)"
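
The verification step above waits a fixed two seconds before checking the container. One possible variation, shown here only as a sketch and not as part of this commit, is to poll the same check a few times before giving up. The helpers `wait_until` and `container_running` are hypothetical names; the check inside `container_running` is copied from the script.

```bash
# Hypothetical helper: run a command up to <attempts> times, sleeping <delay>
# seconds between tries. Succeeds as soon as the command succeeds.
wait_until() {
    local attempts="$1"
    local delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Skip the sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Hypothetical wrapper around the check used in the deploy script.
container_running() {
    docker compose ps --status running | grep -q getcmdr-static
}

# Usage (not run here, since it needs Docker):
#   if wait_until 10 1 container_running; then echo "up"; else echo "not running"; fi
```

Polling tolerates a slow container start without lengthening the happy path, at the cost of a slightly longer wait before a genuine failure is reported.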
