feat: add Redis AOF persistence and pre-deploy migration backup #126

## Tech Story

As a platform engineer, I want Redis to persist data across restarts and the deploy pipeline to capture a database snapshot before running migrations, so that a Redis crash costs at most about a second of cached session data and every schema change has a safe rollback point.

## ELI5 Context

**What is RDB vs AOF persistence in Redis?**
By default Redis periodically saves a point-in-time snapshot of all data to disk (RDB); how often depends on the configured `save` thresholds and the write rate. If Redis crashes between snapshots, you lose everything written since the last one, typically minutes of data. AOF (Append-Only File) persistence appends every write command to a log file as it happens. With `appendfsync everysec`, the worst-case data loss is about 1 second of writes. Think of RDB as saving a Word document every 5 minutes vs AOF as Google Docs auto-saving every keystroke.

**Why does a migration need a backup before it runs?**
A database migration changes the schema — adding columns, dropping tables, renaming fields. If a migration runs halfway and then fails, the database can be left in an inconsistent state that prevents the app from starting. With a backup taken 30 seconds before the migration, the worst-case recovery is: restore backup, redeploy previous image, investigate. Without it, recovery means manually reverse-engineering what the half-run migration did.

**What is `appendfsync everysec`?**
This Redis config option tells Redis to fsync the AOF to disk once per second. It's the recommended balance between performance (not syncing on every single write) and safety (losing at most about 1 second of data). The alternative `always` syncs after every write, which is the safest setting but noticeably slower; `no` leaves flushing to the OS (on Linux typically every 30 seconds or so), so Redis itself gives no bound on the loss window.

## Technical Elaboration

### File: `docker-compose.prod.yml` — Redis service changes

```yaml
redis:
  image: redis:7-alpine
  restart: unless-stopped
  command: redis-server --requirepass ${REDIS_PASSWORD} --appendonly yes --appendfsync everysec
  volumes:
    - redis_aof:/data      # named volume — persists AOF file across container restarts
  healthcheck:
    test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
    interval: 10s
    timeout: 3s
    retries: 3

volumes:
  postgres_data:
  redis_aof:               # renamed from redis_data to reflect AOF usage
```

Apply the same change to `docker-compose.staging.yml` — staging should mirror production configuration.

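**Important**: if you previously had a volume named `redis_data`, rename it to `redis_aof` in the compose file. Docker has no `docker volume rename` command, so the old volume `station_redis_data` is not carried over automatically: either create `station_redis_aof` and copy the contents across with a temporary container, or simply recreate the Redis container and start empty (Redis data is ephemeral cache, so loss is acceptable during a planned rename). Remove the old volume with `docker volume rm station_redis_data` once it is no longer needed.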
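
If the existing cache contents should survive the rename, one option is to copy them into the new volume with a throwaway container, since Docker offers no rename operation for volumes. The following is a minimal sketch, not a tested runbook: the function name `copy_volume` is hypothetical, and the volume names assume the compose project is called `station`.

```shell
#!/usr/bin/env bash
# Sketch: copy the contents of one named Docker volume into a new one.
# copy_volume is a hypothetical helper, not part of the repository.
copy_volume() {
  local src="$1" dst="$2"

  # Create the destination volume up front so the copy has a target.
  docker volume create "$dst" >/dev/null || return 1

  # A temporary Alpine container mounts both volumes and copies everything,
  # preserving ownership, permissions, and timestamps (cp -a).
  docker run --rm \
    -v "${src}:/from:ro" \
    -v "${dst}:/to" \
    alpine sh -c 'cp -a /from/. /to/'
}

# Usage (stop Redis first so the files do not change during the copy):
#   docker compose -f docker-compose.prod.yml stop redis
#   copy_volume station_redis_data station_redis_aof
#   docker compose -f docker-compose.prod.yml up -d redis
```

Because the destination volume is created outside Compose, Compose may warn that it did not create the volume. Recreating the container with an empty volume avoids this entirely and is the simpler path for cache data.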

### Verifying AOF is active

After restarting the Redis container:
```bash
docker exec station-redis-1 redis-cli -a "${REDIS_PASSWORD}" INFO persistence | grep aof_enabled
# Expected output: aof_enabled:1
```
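
For scripted checks (for example in a deploy smoke test), the same verification can be turned into an exit code. This is a sketch under assumptions: `aof_is_enabled` is a hypothetical helper, and it reads the `INFO persistence` output from stdin so it can be exercised without a live Redis.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if INFO persistence reports aof_enabled:1.
# aof_is_enabled is a hypothetical helper name.
aof_is_enabled() {
  # Redis terminates INFO lines with CRLF, so strip the carriage return
  # before comparing the value; otherwise "1\r" would not equal "1".
  tr -d '\r' | awk -F: '
    $1 == "aof_enabled" { found = ($2 == "1") }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on the VPS (--no-auth-warning silences the -a password warning):
#   docker exec station-redis-1 redis-cli --no-auth-warning -a "${REDIS_PASSWORD}" \
#     INFO persistence | aof_is_enabled && echo "AOF active"
```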

### File: `.github/workflows/release.yml` — pre-migration backup step

In the `deploy-production` job, add this step **before** the migration step:

```yaml
- name: Pre-migration database backup
  run: |
    ssh ${{ secrets.VPS_USER }}@${{ secrets.VPS_HOST }} \
      "cd /opt/station && \
       LABEL=pre-deploy-${{ github.sha }} \
       bash infra/scripts/backup-db.sh"
  # If this step fails, the workflow stops — migrations never run against an un-backed-up database
```

The `backup-db.sh` script already exists from issue #125. This step just calls it with a label so the backup filename includes the git SHA for traceability.
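
The "workflow stops on failure" guarantee only holds if `backup-db.sh` exits non-zero whenever any part of the backup fails, because `ssh` simply returns the remote command's exit status. The contents of the script are not shown in this issue; assuming it pipes a dump command into `gzip`, the sketch below illustrates why `pipefail` matters. `run_backup` and its arguments are hypothetical stand-ins.

```shell
#!/usr/bin/env bash
# Sketch: a dump | gzip pipeline that fails when the dump fails.
# run_backup is a hypothetical stand-in; $1 is the output file and the
# remaining arguments are the dump command to run.
run_backup() {
  local out="$1"; shift
  (
    # Without pipefail the pipeline's status is gzip's, which is 0 even
    # when the dump command crashed and produced no data.
    set -o pipefail
    "$@" | gzip > "$out"
  )
}

# Usage:
#   run_backup /tmp/example.sql.gz some_dump_command --flags || exit 1
```

Placing `set -euo pipefail` at the top of the script is a common way to get the same behaviour for every pipeline in it.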

### File: `infra/scripts/backup-db.sh` — label support (small change)

Add an optional `LABEL` environment variable to the backup filename:
```bash
LABEL="${LABEL:-nightly}"
BACKUP_FILE="/tmp/station_backup_${TIMESTAMP}_${LABEL}.sql.gz"
```

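So pre-deploy backups are named like `station_backup_20260510_030000_pre-deploy-<sha>.sql.gz`, where `<sha>` is the full 40-character commit SHA that `github.sha` expands to (the workflow step passes it unshortened), and are distinguishable from nightly backups in B2.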
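
The naming rule can be sketched as a small function. This is an illustration, not the script's actual code: `backup_filename` is a hypothetical name, and the character filtering is an added safeguard that the issue does not require.

```shell
#!/usr/bin/env bash
# Sketch: build the backup filename from a timestamp and an optional label.
# backup_filename is a hypothetical helper name.
backup_filename() {
  local timestamp="$1" label="${2:-nightly}"

  # Replace anything outside [A-Za-z0-9._-] with "_" so a label can never
  # inject path separators or spaces into the filename (assumed safeguard).
  label="$(printf '%s' "$label" | tr -c 'A-Za-z0-9._-' '_')"

  printf '/tmp/station_backup_%s_%s.sql.gz\n' "$timestamp" "$label"
}

# Usage (the date format is an assumption matching the example filename):
#   BACKUP_FILE="$(backup_filename "$(date +%Y%m%d_%H%M%S)" "${LABEL:-}")"
```

With no label the function falls back to `nightly`, so the existing nightly job keeps working without changes.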

### New file: `infra/docs/redis.md`

Document:
1. Why AOF is enabled (data safety, 1-second loss window)
2. How to verify AOF is active (`INFO persistence`)
3. What to do if Redis data is lost (it's a cache — the app degrades gracefully, no restore needed; sessions are re-created on next login)
4. How to inspect the AOF file size: `docker exec station-redis-1 redis-cli -a "${REDIS_PASSWORD}" INFO persistence | grep aof_current_size`

## Definition of Done

- [ ] `docker-compose.prod.yml` Redis service uses `--appendonly yes --appendfsync everysec`
- [ ] `docker-compose.staging.yml` Redis service matches production configuration
- [ ] Redis data volume renamed to `redis_aof` in both compose files
- [ ] `docker exec ... redis-cli INFO persistence | grep aof_enabled` returns `aof_enabled:1` on the VPS
- [ ] `.github/workflows/release.yml` has a pre-migration backup step that runs before `migration:run`
- [ ] If the backup step fails, the workflow fails (no migration runs) — verified by testing with a broken B2 credential
- [ ] `infra/scripts/backup-db.sh` accepts optional `LABEL` env var and includes it in the backup filename
- [ ] `infra/docs/redis.md` written covering AOF, verification, and recovery
- [ ] Staging environment tested end-to-end with AOF enabled

## Dependencies

- Depends on: #125 (backup script must exist before adding the pre-migration step)
- Depends on: #108 (Docker Compose production config — modifying that file)
- Depends on: #90 (release workflow — adding the backup step there)
