Merged
66 changes: 66 additions & 0 deletions .github/workflows/redis-proxy-docker.yml
@@ -0,0 +1,66 @@
name: redis-proxy Docker Image

permissions:
  contents: read
  packages: write

on:
  push:
    branches: [main]
    paths:
      - 'cmd/redis-proxy/**'
      - 'proxy/**'
      - 'Dockerfile.redis-proxy'
      - '.github/workflows/redis-proxy-docker.yml'
      - 'go.mod'
      - 'go.sum'
  pull_request:
    paths:
      - 'cmd/redis-proxy/**'
      - 'proxy/**'
      - 'Dockerfile.redis-proxy'
      - '.github/workflows/redis-proxy-docker.yml'
      - 'go.mod'
      - 'go.sum'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v4

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v4
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Docker metadata
        id: meta
        uses: docker/metadata-action@v6
        with:
          images: ghcr.io/${{ github.repository }}/redis-proxy
          tags: |
            type=sha
            type=ref,event=branch

Copilot AI Mar 20, 2026

This workflow publishes a branch tag (e.g., :main) via type=ref,event=branch. The deployment guide currently documents only :latest and :sha-...; either drop the branch tag here or update the docs so users know all published tags.

Suggested change
type=ref,event=branch

            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push
        uses: docker/build-push-action@v7
        with:
          context: .
          file: ./Dockerfile.redis-proxy
          platforms: linux/amd64
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
13 changes: 13 additions & 0 deletions Dockerfile.redis-proxy
@@ -0,0 +1,13 @@
FROM golang:latest AS build

WORKDIR $GOPATH/src/app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
Contributor

medium

To improve Docker build performance and leverage layer caching more effectively, you should copy go.mod and go.sum first, download dependencies, and then copy the rest of the source code. This prevents re-downloading dependencies every time any file in the repository changes. The structure should be:

COPY go.mod go.sum ./
RUN go mod download
COPY . .


RUN CGO_ENABLED=0 go build -o /redis-proxy ./cmd/redis-proxy/

FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /redis-proxy /redis-proxy

ENTRYPOINT ["/redis-proxy"]
289 changes: 289 additions & 0 deletions docs/redis-proxy-deployment.md
@@ -0,0 +1,289 @@
# redis-proxy Deployment Guide

redis-proxy is a Redis-protocol reverse proxy that enables gradual migration from Redis to ElasticKV through dual-write, shadow-read comparison, and phased primary cutover.

## Docker Image

Pre-built images are published to GitHub Container Registry when relevant files change on `main` (see path filters in the workflow):

```
ghcr.io/bootjp/elastickv/redis-proxy:latest
ghcr.io/bootjp/elastickv/redis-proxy:sha-<commit>
```
Comment on lines +7 to +12
Copilot AI Mar 20, 2026

The docs list only latest and sha-... image tags, but the workflow configuration also publishes a branch tag (e.g., :main) via type=ref,event=branch. Either document the additional tag here or remove the branch tag from the workflow to match the docs.

The CI workflow (`.github/workflows/redis-proxy-docker.yml`) builds the image automatically when files under `cmd/redis-proxy/`, `proxy/`, `Dockerfile.redis-proxy`, `go.mod`, `go.sum`, or the workflow file itself change.

### Building locally

```bash
# Docker
docker build -f Dockerfile.redis-proxy -t redis-proxy .

# Binary
go build -o redis-proxy ./cmd/redis-proxy/
```

## Command-Line Options

| Flag | Default | Description |
|------|---------|-------------|
| `-listen` | `:6479` | Proxy listen address |
| `-primary` | `localhost:6379` | Primary (Redis) address |
Comment on lines +28 to +31
Copilot AI Mar 20, 2026

The markdown table header uses a double leading pipe (|| ... |), which renders as an extra empty column in most Markdown viewers. Use a single leading pipe for the header and separator row (and apply the same fix to the other tables in this doc).
| `-primary-db` | `0` | Primary Redis DB number |
| `-primary-password` | (empty) | Primary Redis password |
| `-secondary` | `localhost:6380` | Secondary (ElasticKV) address |
| `-secondary-db` | `0` | Secondary Redis DB number |
| `-secondary-password` | (empty) | Secondary Redis password |
| `-mode` | `dual-write` | Proxy mode (see below) |
| `-secondary-timeout` | `5s` | Secondary write timeout |
| `-shadow-timeout` | `3s` | Shadow read timeout |
| `-sentry-dsn` | (empty) | Sentry DSN (empty = disabled) |
| `-sentry-env` | (empty) | Sentry environment name |
| `-sentry-sample` | `1.0` | Sentry sample rate |
| `-metrics` | `:9191` | Prometheus metrics endpoint |

## Proxy Modes

Five modes support a phased migration strategy.

| Mode | Reads from | Writes to | Use case |
|------|-----------|-----------|----------|
| `redis-only` | Redis | Redis only | Transparent proxy. Route traffic through the proxy first |
| `dual-write` | Redis | Redis + ElasticKV | Begin data sync. Populate ElasticKV |
Comment on lines +49 to +52
Copilot AI Mar 20, 2026

The proxy modes table also starts with a double leading pipe (||), which will render an extra empty column. Use a single leading pipe for proper markdown table formatting.
| `dual-write-shadow` | Redis (+ shadow compare from ElasticKV) | Redis + ElasticKV | Verify read consistency between backends |
| `elastickv-primary` | ElasticKV (+ shadow compare from Redis) | ElasticKV + Redis | Promote ElasticKV to primary. Redis as fallback |
| `elastickv-only` | ElasticKV | ElasticKV only | Migration complete. Decommission Redis |

### Recommended Migration Path

```
redis-only -> dual-write -> dual-write-shadow -> elastickv-primary -> elastickv-only
```

Monitor metrics at each stage and roll back to the previous mode if issues arise. Mode changes require a proxy restart.
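
The mode table and migration path above can be restated as data, which makes the rollback rule explicit. This is a minimal illustrative sketch, not the proxy's implementation: the `route` type, the `routes` and `path` variables, and the `rollback` helper are hypothetical names, and only the mode strings and read/write targets are taken from this guide.

```go
package main

import "fmt"

// route describes where a mode reads from and writes to.
// Hypothetical type for illustration; not the proxy's internal representation.
type route struct {
	reads  string   // backend that serves client reads
	shadow string   // backend used for shadow comparison, "" if none
	writes []string // write targets, primary first
}

// path lists the modes in the recommended migration order.
var path = []string{
	"redis-only",
	"dual-write",
	"dual-write-shadow",
	"elastickv-primary",
	"elastickv-only",
}

// routes restates the Proxy Modes table.
var routes = map[string]route{
	"redis-only":        {reads: "redis", writes: []string{"redis"}},
	"dual-write":        {reads: "redis", writes: []string{"redis", "elastickv"}},
	"dual-write-shadow": {reads: "redis", shadow: "elastickv", writes: []string{"redis", "elastickv"}},
	"elastickv-primary": {reads: "elastickv", shadow: "redis", writes: []string{"elastickv", "redis"}},
	"elastickv-only":    {reads: "elastickv", writes: []string{"elastickv"}},
}

// rollback returns the previous stage on the migration path.
// ok is false for the first stage and for unknown modes.
func rollback(mode string) (prev string, ok bool) {
	for i, m := range path {
		if m == mode && i > 0 {
			return path[i-1], true
		}
	}
	return "", false
}

func main() {
	for _, m := range path {
		r := routes[m]
		prev, _ := rollback(m)
		fmt.Printf("%s: reads=%s shadow=%q writes=%v rollback=%q\n",
			m, r.reads, r.shadow, r.writes, prev)
	}
}
```

Because mode changes require a restart, rolling back amounts to restarting the proxy with the `-mode` value that `rollback` returns.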

## Deployment Examples

### Minimal (redis-only)

```bash
docker run --rm \
  -p 6379:6379 \
  ghcr.io/bootjp/elastickv/redis-proxy:latest \
  -listen :6379 \
  -primary redis.internal:6379 \
  -mode redis-only
```
Comment on lines +69 to +76
Copilot AI Mar 20, 2026

In the “Minimal (redis-only)” docker run example, the container listens on :6379 but no -p/--publish is set, so the proxy won’t be reachable from the host in a typical local run. Consider adding the appropriate -p 6379:6379 mapping (or clarify that this example assumes a user-defined Docker network / --network host).

Owner Author

@copilot open a new pull request to apply changes based on this feedback


Point your application at the proxy. Behavior is identical to connecting directly to Redis.

### Dual-Write with Shadow Comparison

```bash
docker run --rm \
  -p 6379:6479 \
  -p 9191:9191 \
  ghcr.io/bootjp/elastickv/redis-proxy:latest \
  -listen :6479 \
  -primary redis.internal:6379 \
  -primary-password "${REDIS_PASSWORD}" \
  -secondary elastickv.internal:6380 \
  -mode dual-write-shadow \
  -secondary-timeout 5s \
  -shadow-timeout 3s \
  -sentry-dsn "${SENTRY_DSN}" \
  -sentry-env production \
  -metrics :9191
```

### Docker Compose

```yaml
services:
  redis-proxy:
    image: ghcr.io/bootjp/elastickv/redis-proxy:latest
    ports:
      - "6379:6479"
      - "9191:9191"
    command:
      - -listen=:6479
      - -primary=redis:6379
      - -secondary=elastickv:6380
      - -mode=dual-write-shadow
      - -metrics=:9191
    depends_on:
      - redis
      - elastickv

  redis:
    image: redis:7
    ports:
      - "6379"

  elastickv:
    image: ghcr.io/bootjp/elastickv:latest
    ports:
      - "6380"
```

### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-proxy
  template:
    metadata:
      labels:
        app: redis-proxy
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9191"
    spec:
      containers:
        - name: redis-proxy
          image: ghcr.io/bootjp/elastickv/redis-proxy:latest
          args:
            - -listen=:6479
            - -primary=redis:6379
            - -secondary=elastickv:6380
            - -mode=dual-write-shadow
            - -metrics=:9191
          ports:
            - containerPort: 6479
              name: redis
            - containerPort: 9191
              name: metrics
          livenessProbe:
            tcpSocket:
              port: 6479
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            tcpSocket:
              port: 6479
            initialDelaySeconds: 3
            periodSeconds: 5
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: "1"
              memory: 512Mi
```

> **Note:** The distroless base image includes neither a shell nor `redis-cli`, so the `exec`-based probe below does not work with the published image. To use it, build a redis-proxy image that adds `/bin/sh` and `redis-cli` (or another ping tool) to the same container. Otherwise, prefer the `tcpSocket` probes shown in the Deployment spec above; the proxy does not expose an HTTP health endpoint.

```yaml
# Alternative: exec-based probe (requires a shell and redis-cli in the image)
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - 'redis-cli -p 6479 PING || exit 1'
  initialDelaySeconds: 5
  periodSeconds: 10
```

## Health Checks

The proxy does not expose an HTTP health endpoint. Use the Redis `PING` command to verify availability:

```bash
redis-cli -p 6479 PING
# PONG
```
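
Because the published image ships without `redis-cli`, a small standalone checker can be handy for scripts or sidecars. The following is a sketch under assumptions: `pingcheck`, `ping`, and `checkReply` are hypothetical names that are not part of this repository, and it assumes the proxy answers an unauthenticated `PING` with `+PONG`, as the `redis-cli` example above suggests.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"strings"
	"time"
)

// checkReply reports whether a RESP reply line is the simple string PONG.
func checkReply(line string) error {
	if strings.TrimRight(line, "\r\n") != "+PONG" {
		return fmt.Errorf("unexpected reply %q", line)
	}
	return nil
}

// ping sends one RESP-encoded PING to addr and validates the reply.
func ping(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	defer conn.Close()
	// Bound the whole exchange, not just the dial.
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	// PING encoded as a RESP array holding one bulk string.
	if _, err := conn.Write([]byte("*1\r\n$4\r\nPING\r\n")); err != nil {
		return err
	}
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return err
	}
	return checkReply(line)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pingcheck host:port")
		return
	}
	if err := ping(os.Args[1], 2*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, "unhealthy:", err)
		os.Exit(1)
	}
	fmt.Println("PONG")
}
```

Run it as `pingcheck localhost:6479`; a non-zero exit status signals an unhealthy proxy, which is the convention most probe runners expect.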

## Prometheus Metrics

Metrics are served at `/metrics` on the address given by `-metrics` (default `:9191`).

### Key Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `proxy_command_total` | Counter | Commands processed (labels: command, backend, status) |
| `proxy_command_duration_seconds` | Histogram | Backend command latency |
Comment on lines +211 to +214
Copilot AI Mar 20, 2026

The Key Metrics markdown table uses a double leading pipe (||), which typically renders as an extra empty column. Switch to a single leading pipe for the header and separator rows.
| `proxy_primary_write_errors_total` | Counter | Primary write errors |
| `proxy_secondary_write_errors_total` | Counter | Secondary write errors |
| `proxy_primary_read_errors_total` | Counter | Primary read errors |
| `proxy_shadow_read_errors_total` | Counter | Shadow read errors |
| `proxy_divergences_total` | Counter | Shadow read mismatches (labels: command, kind) |
| `proxy_migration_gap_total` | Counter | Expected mismatches from incomplete migration (labels: command) |
| `proxy_async_drops_total` | Counter | Async operations dropped due to backpressure |
| `proxy_active_connections` | Gauge | Current active client connections |
| `proxy_pubsub_shadow_divergences_total` | Counter | Pub/Sub shadow message mismatches (labels: kind) |
| `proxy_pubsub_shadow_errors_total` | Counter | Pub/Sub shadow operation errors |
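
To spot-check a counter without a Prometheus server, the `/metrics` output can be summed directly. The sketch below assumes the endpoint uses the standard Prometheus text exposition format; `sumSeries` is a hypothetical helper, and the `sample` text and its values are made up for illustration rather than captured from a running proxy.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample is made-up exposition text; real input would come from /metrics.
const sample = `# TYPE proxy_pubsub_shadow_divergences_total counter
proxy_pubsub_shadow_divergences_total{kind="data_mismatch"} 2
proxy_pubsub_shadow_divergences_total{kind="extra_data"} 1
# TYPE proxy_async_drops_total counter
proxy_async_drops_total 0
`

// sumSeries adds up every sample of the metric called name, across all label sets.
func sumSeries(exposition, name string) float64 {
	var total float64
	for _, line := range strings.Split(exposition, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		if !strings.HasPrefix(line, name) {
			continue
		}
		rest := line[len(name):]
		// Require an exact name match: labels or a space must follow the name.
		if rest == "" || (rest[0] != '{' && rest[0] != ' ') {
			continue
		}
		// Skip the label block, if any.
		if i := strings.LastIndex(rest, "}"); i >= 0 {
			rest = rest[i+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		if v, err := strconv.ParseFloat(fields[0], 64); err == nil {
			total += v
		}
	}
	return total
}

func main() {
	fmt.Println(sumSeries(sample, "proxy_pubsub_shadow_divergences_total"))
	fmt.Println(sumSeries(sample, "proxy_async_drops_total"))
}
```

Summing across label sets gives a single number per metric; keep the per-label lines when you need to know which `kind` or `command` is responsible.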

### Recommended Alerts

```yaml
groups:
  - name: redis-proxy
    rules:
      - alert: ProxyDivergenceHigh
        expr: rate(proxy_divergences_total[5m]) > 0
        for: 10m
        annotations:
          summary: "Data mismatch detected between primary and secondary"

      - alert: ProxySecondaryWriteErrors
        expr: rate(proxy_secondary_write_errors_total[5m]) > 1
        for: 5m
        annotations:
          summary: "Secondary backend write errors are elevated"

      - alert: ProxyAsyncDrops
        expr: rate(proxy_async_drops_total[5m]) > 0
        for: 5m
        annotations:
          summary: "Async goroutine limit reached; secondary may be slow"
```

## Internal Parameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| Connection pool size | 128 | go-redis pool size per backend |
| Dial timeout | 5s | Backend connection timeout |
| Read timeout | 3s | Backend read timeout |
Comment on lines +253 to +257
Copilot AI Mar 20, 2026

The Internal Parameters markdown table also starts with a double leading pipe (||), which usually creates an empty first column. Use a single leading pipe for correct table formatting.
| Write timeout | 3s | Backend write timeout |
| Async write goroutine limit | 4096 | Max concurrent secondary writes |
| Shadow read goroutine limit | 1024 | Max concurrent shadow comparisons |
| PubSub compare window | 2s | Message matching window |
| PubSub sweep interval | 500ms | Expired message scan interval |

## Graceful Shutdown

The proxy handles `SIGINT` / `SIGTERM` for graceful shutdown:

1. Stops accepting new connections
2. Waits for in-flight async goroutines to complete
3. Releases backend connection pools
4. Flushes Sentry buffers (up to 2 seconds)

Recommended shutdown order: `redis-proxy -> application -> Redis / ElasticKV`.
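
The shutdown sequence above can be sketched with the standard library. This is an illustration under assumptions rather than the proxy's actual code: the `hooks` type and `shutdown` function are hypothetical, the pool and Sentry steps are stubbed with print statements, and a one-second timeout is added only so the demo exits without a signal.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// hooks holds one callback per documented shutdown step.
// All names here are hypothetical; they are not the proxy's API.
type hooks struct {
	stopAccept func() // 1. stop accepting new connections
	drainAsync func() // 2. wait for in-flight async goroutines
	closePools func() // 3. release backend connection pools
	flush      func() // 4. flush Sentry buffers
}

// shutdown runs the steps strictly in the documented order.
func shutdown(h hooks) {
	h.stopAccept()
	h.drainAsync()
	h.closePools()
	h.flush()
}

func main() {
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()
	// Demo only: stop waiting after one second so the example ends on its own.
	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	var inflight sync.WaitGroup // would track async secondary writes

	<-ctx.Done() // SIGINT, SIGTERM, or the demo timeout
	shutdown(hooks{
		stopAccept: func() { ln.Close(); fmt.Println("listener closed") },
		drainAsync: func() { inflight.Wait(); fmt.Println("async work drained") },
		closePools: func() { fmt.Println("pools released") },
		flush:      func() { fmt.Println("sentry flushed") },
	})
}
```

Closing the listener first matters: it guarantees that no new async work is queued while the drain step is waiting.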

## Troubleshooting

### Secondary writes are falling behind
- Check `proxy_async_drops_total`. If it is increasing, the async write goroutine limit (4096) is being hit and secondary writes are being dropped.
- Reduce `-secondary-timeout` to fail fast on slow secondaries.
- Investigate secondary (ElasticKV) performance.
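
To make the drop behaviour behind `proxy_async_drops_total` concrete, here is a minimal sketch of a bounded async dispatcher that drops work instead of blocking once its limit is reached. It assumes a semaphore-channel design; the proxy's actual implementation may differ, and `limiter`, `tryGo`, and `drops` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limiter bounds the number of concurrently running async tasks.
type limiter struct {
	slots chan struct{} // one token per running task
	wg    sync.WaitGroup
	drops atomic.Int64 // stands in for proxy_async_drops_total
}

func newLimiter(n int) *limiter {
	return &limiter{slots: make(chan struct{}, n)}
}

// tryGo runs task in a goroutine if a slot is free. Otherwise it counts a
// drop and returns false without blocking the caller.
func (l *limiter) tryGo(task func()) bool {
	select {
	case l.slots <- struct{}{}:
	default:
		l.drops.Add(1)
		return false
	}
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		defer func() { <-l.slots }() // free the slot before signalling Done
		task()
	}()
	return true
}

// wait blocks until all accepted tasks have finished.
func (l *limiter) wait() { l.wg.Wait() }

func main() {
	l := newLimiter(2) // the Internal Parameters table lists 4096 for secondary writes
	release := make(chan struct{})
	slow := func() { <-release } // simulates a slow secondary

	for i := 0; i < 5; i++ {
		fmt.Println("accepted:", l.tryGo(slow))
	}
	close(release)
	l.wait()
	fmt.Println("drops:", l.drops.Load())
}
```

With this design a slow secondary never stalls client traffic. The cost is that dropped writes never reach the secondary, which is why a rising drop counter deserves an alert.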

### High divergence count
- Also check `proxy_migration_gap_total`. Keys that are missing from the secondary because migration is still incomplete are counted as gaps, not divergences.
- In `dual-write-shadow` mode, inspect `proxy_divergences_total` labels to identify which commands are mismatched.

### Pub/Sub messages missing
- Check `proxy_pubsub_shadow_divergences_total`.
  - `kind=data_mismatch`: message received by primary but not secondary.
  - `kind=extra_data`: message received by secondary only.