# Blue/Green Deployment with Automated Monitoring
A production-ready blue/green deployment system built on Docker Compose, with automated monitoring, error detection, and Slack notifications.
## Features
- **Zero-Downtime Deployments**: Seamless switching between blue and green pools
- **Automatic Failover**: Nginx automatically routes to backup pool when primary fails
- **Real-Time Monitoring**: Continuous monitoring of error rates and failover events
- **Slack Notifications**: Instant alerts for high error rates and failover events
- **Structured Logging**: JSON-formatted nginx logs with pool, release, and latency tracking
- **Health Checks**: Automated container health monitoring
- **Comprehensive Testing**: Automated test suite for all deployment scenarios
## Architecture
```
                ┌─────────────┐
                │   Clients   │
                └──────┬──────┘
                       │
                ┌──────▼──────┐
                │    Nginx    │ (Port 8080)
                │   Reverse   │
                │    Proxy    │
                └──┬───────┬──┘
                   │       │
    ┌──────────────┘       └──────────────┐
    │                                     │
┌───▼────┐                            ┌───▼────┐
│  Blue  │ (Primary)                  │ Green  │ (Backup)
│  Pool  │                            │  Pool  │
└────────┘                            └────────┘
    │                                     │
    └─────────────┬───────────────────────┘
                  │
          ┌───────▼────────┐
          │ Alert Watcher  │
          │   (Monitors    │
          │  Nginx Logs)   │
          └───────┬────────┘
                  │
          ┌───────▼────────┐
          │     Slack      │
          │ Notifications  │
          └────────────────┘
```
## Quick Start
1. **Setup environment**
```bash
# Make entrypoint executable
chmod +x ./nginx/entrypoint.sh
# Configure your environment (update with your values)
cat > .env << EOF
BLUE_IMAGE=your-image:tag
GREEN_IMAGE=your-image:tag
ACTIVE_POOL=blue
RELEASE_ID_BLUE=v1.0.0-blue
RELEASE_ID_GREEN=v1.0.0-green
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
ERROR_RATE_THRESHOLD=2
WINDOW_SIZE=200
ALERT_COOLDOWN_SEC=300
MAINTENANCE_MODE=false
PORT=3000
EOF
```
2. **Start the stack**
```bash
docker compose up -d
```
3. **Verify deployment**
```bash
# Check all services are healthy
docker compose ps
# Test the application
curl http://localhost:8080
# Check which pool is active
curl -sI http://localhost:8080 | grep -i x-app-pool
# Monitor logs
docker compose logs -f alert_watcher
```
4. **Run automated tests**
```bash
./test_deployment.sh
```
## System Components
### 1. Application Pools
- **Blue Pool** (`app_blue`): Primary deployment pool
- **Green Pool** (`app_green`): Secondary deployment pool for zero-downtime updates
### 2. Nginx Reverse Proxy
- Routes traffic to active pool
- Automatic failover to backup pool on health check failures
- Structured JSON logging with pool tracking
### 3. Alert Watcher
- Monitors nginx access logs in real-time
- Detects error rates exceeding threshold (default: 2%)
- Tracks pool failover events
- Sends alerts to Slack, with a cooldown between alerts of the same type (default: 300 seconds, i.e. 5 minutes)
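
To illustrate how a per-alert-type cooldown and the `MAINTENANCE_MODE` switch could interact, here is a minimal Python sketch. It is not taken from `watcher/watcher.py`; the `AlertGate` class, its method names, and the injectable `clock` parameter are hypothetical and exist only for this example.

```python
import time


class AlertGate:
    """Hypothetical helper: decides whether an alert type may be sent now."""

    def __init__(self, cooldown_sec=300, maintenance_mode=False, clock=time.monotonic):
        self.cooldown_sec = cooldown_sec
        self.maintenance_mode = maintenance_mode
        self._clock = clock      # injectable so the logic can be tested without waiting
        self._last_sent = {}     # alert type -> time the last alert was allowed

    def allow(self, alert_type):
        # Maintenance mode suppresses every alert.
        if self.maintenance_mode:
            return False
        now = self._clock()
        last = self._last_sent.get(alert_type)
        # Still inside the cooldown window for this alert type: drop it.
        if last is not None and now - last < self.cooldown_sec:
            return False
        self._last_sent[alert_type] = now
        return True
```

Because the cooldown is tracked per alert type, a failover alert can still go out while error-rate alerts are being throttled.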
## Monitoring & Alerts
### Alert Types
#### 1. High Error Rate Alert
Triggered when the error rate exceeds `ERROR_RATE_THRESHOLD` (default: 2%) over the last `WINDOW_SIZE` requests (default: 200).
**Example:**
```
Error Rate Alert
High error rate detected: 25.00% over last 200 requests (threshold: 2.0%)
• Error Rate: 25.00%
• Threshold: 2.0%
• Window Size: 200
• Errors: 50
• Current Pool: green
• Timestamp: 2025-10-30 15:23:35 UTC
```
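
The sliding-window check behind this alert can be sketched as follows. This is an illustrative approximation, not the code in `watcher/watcher.py`. It assumes that an "error" is any response with status 500 or higher (matching the `jq` filter used elsewhere in this README) and that the rate is only evaluated once the window is full. The `ErrorRateWindow` class is a hypothetical name.

```python
from collections import deque


class ErrorRateWindow:
    """Hypothetical sliding window over the most recent response statuses."""

    def __init__(self, window_size=200, threshold_pct=2.0):
        self.window_size = window_size
        self.threshold_pct = threshold_pct
        # deque(maxlen=...) discards the oldest status automatically.
        self._statuses = deque(maxlen=window_size)

    def record(self, status):
        self._statuses.append(int(status))

    def error_rate(self):
        """Percentage of 5xx responses in the current window."""
        if not self._statuses:
            return 0.0
        errors = sum(1 for s in self._statuses if s >= 500)
        return 100.0 * errors / len(self._statuses)

    def breached(self):
        # Assumption: wait for a full window so a single early error
        # does not register as a 100% error rate.
        full = len(self._statuses) == self.window_size
        return full and self.error_rate() > self.threshold_pct
```

With the numbers from the example alert, 50 errors in a window of 200 requests give a rate of 25%, well above the 2% threshold.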
#### 2. Failover Alert
Triggered when traffic fails over from primary to backup pool.
**Example:**
```
Failover Alert
Failover detected: Traffic switched from blue to green
• Previous Pool: blue
• Current Pool: green
• Upstream Address: 172.20.0.3:3000
• Release ID: v1.0.0-green-2024
• Timestamp: 2025-10-30 15:26:37 UTC
```
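
One way to detect a failover from the structured logs is to watch for a change in the `pool` field between consecutive entries. The sketch below assumes log entries are already parsed into dictionaries with the `pool`, `upstream_addr`, and `release` keys shown in this README. `FailoverDetector` is a hypothetical name, and the real watcher may apply different logic.

```python
class FailoverDetector:
    """Hypothetical helper: reports when the serving pool changes."""

    def __init__(self):
        self._last_pool = None

    def observe(self, entry):
        """Return failover details if the pool changed, otherwise None."""
        pool = entry.get("pool")
        if not pool:
            return None  # entry carries no pool information; ignore it
        previous, self._last_pool = self._last_pool, pool
        if previous is not None and previous != pool:
            return {
                "previous_pool": previous,
                "current_pool": pool,
                "upstream_addr": entry.get("upstream_addr"),
                "release": entry.get("release"),
            }
        return None
```

The first entry seen only establishes a baseline, so starting the watcher never produces a spurious failover event.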
## Environment Variables (.env)
```bash
# Application Images
BLUE_IMAGE=your-image:tag
GREEN_IMAGE=your-image:tag
# Active Pool Configuration
ACTIVE_POOL=blue # Current active pool (blue or green)
# Release Identifiers
RELEASE_ID_BLUE=v1.0.0-blue
RELEASE_ID_GREEN=v1.0.0-green
# Slack Webhook for Alerts
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
# Monitoring Configuration
ERROR_RATE_THRESHOLD=2 # Error rate percentage threshold
WINDOW_SIZE=200 # Number of requests to analyze
ALERT_COOLDOWN_SEC=300 # Seconds between same alert type
MAINTENANCE_MODE=false # Set to true to suppress alerts
# Application Configuration
PORT=3000 # Application port
```
**Important:** No spaces around `=` in the .env file.
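
For reference, a Python process such as the alert watcher could read these monitoring settings along the following lines. This is a sketch under assumptions: the `load_config` function and the dictionary keys are hypothetical, and the defaults simply mirror the values listed above.

```python
import os


def load_config(env=None):
    """Hypothetical loader: read monitoring settings with the documented defaults."""
    env = os.environ if env is None else env
    return {
        "slack_webhook_url": env.get("SLACK_WEBHOOK_URL", ""),
        "error_rate_threshold": float(env.get("ERROR_RATE_THRESHOLD", "2")),
        "window_size": int(env.get("WINDOW_SIZE", "200")),
        "alert_cooldown_sec": int(env.get("ALERT_COOLDOWN_SEC", "300")),
        # Only the literal string "true" (case-insensitive) enables maintenance mode.
        "maintenance_mode": env.get("MAINTENANCE_MODE", "false").strip().lower() == "true",
    }
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check in isolation, for example `load_config({"ERROR_RATE_THRESHOLD": "5"})`.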
## Operations
### Viewing Logs
```bash
# All containers
docker compose logs -f
# Nginx access logs (structured JSON)
docker compose exec nginx tail -f /var/log/nginx/access.log | jq .
# Alert watcher
docker compose logs -f alert_watcher
# Specific app pool
docker compose logs -f app_blue
docker compose logs -f app_green
```
### Switching Active Pool
```bash
# Manual pool switch (zero-downtime)
sed -i 's/ACTIVE_POOL=blue/ACTIVE_POOL=green/' .env
docker compose up -d nginx
# Verify switch
curl -sI http://localhost:8080 | grep -i x-app-pool
```
### Zero-Downtime Deployment
```bash
# Use the deployment script
./deploy.sh your-image:new-version
# The script will:
# 1. Pull new image
# 2. Update inactive pool
# 3. Wait for health checks
# 4. Switch traffic
# 5. Verify deployment
```
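
The core decisions in that sequence, which pool to update and when it is safe to switch traffic, can be sketched in Python. This is not the contents of `deploy.sh`. The function names and the injectable `probe` and `sleep` callables are hypothetical, and the retry defaults are borrowed from the health check settings shown under Advanced Configuration.

```python
import time


def inactive_pool(active_pool):
    """Return the pool that is currently not serving traffic."""
    pools = {"blue": "green", "green": "blue"}
    if active_pool not in pools:
        raise ValueError(f"unknown pool: {active_pool!r}")
    return pools[active_pool]


def wait_until_healthy(probe, retries=3, interval_sec=10.0, sleep=time.sleep):
    """Call probe() up to `retries` times; return True as soon as it succeeds."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(interval_sec)  # pause before the next attempt
    return False
```

In this sketch, traffic would only be switched to `inactive_pool(active)` after `wait_until_healthy` returns `True`; otherwise the active pool keeps serving and the deployment is aborted.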
### Testing
```bash
# Run comprehensive test suite
./test_deployment.sh
# Tests include:
# - Health checks
# - Baseline traffic (150 requests)
# - Failover simulation
# - Error rate detection
# - Slack notification verification
```
### Viewing Structured Logs
```bash
# Single formatted log entry
docker compose exec nginx tail -1 /var/log/nginx/access.log | jq .
# Example output:
# {
#   "time_local": "30/Oct/2025:15:27:28 +0000",
#   "remote_addr": "172.20.0.1",
#   "request": "GET / HTTP/1.1",
#   "status": 200,
#   "body_bytes_sent": 1247,
#   "request_time": 0.008,
#   "upstream_addr": "172.20.0.3:3000",
#   "upstream_status": "200",
#   "upstream_response_time": "0.008",
#   "pool": "green",
#   "release": "v1.0.0-green-2024"
# }
# Filter errors only
docker compose exec nginx tail -100 /var/log/nginx/access.log | jq 'select(.status >= 500)'
```
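
The same filtering can be done in Python when the log lines need further processing. The sketch below assumes one JSON object per line with a `status` field, as in the example entry above. `iter_server_errors` is a hypothetical helper, not part of the repository.

```python
import json


def iter_server_errors(lines):
    """Yield parsed log entries whose status is 500 or higher."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        try:
            status = int(entry.get("status", 0))
        except (TypeError, ValueError):
            continue  # skip entries with a non-numeric status
        if status >= 500:
            yield entry
```

It accepts any iterable of strings, so it works equally on an open file object or on lines captured from `docker compose exec`.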
## File Structure
```
.
├── docker-compose.yml # Main orchestration file
├── .env # Environment configuration
├── nginx/
│ └── entrypoint.sh # Nginx dynamic configuration
├── watcher/
│ ├── Dockerfile # Alert watcher container
│ ├── requirements.txt # Python dependencies
│ └── watcher.py # Monitoring script
├── test_deployment.sh # Automated test suite
├── runbook.md # Operational runbook
└── README.md # This file
```
## Operational Runbook
For detailed operational procedures, troubleshooting, and alert response guidelines, see [runbook.md](runbook.md).
The runbook includes:
- Alert types and response procedures
- Deployment procedures
- Troubleshooting guides
- Emergency procedures
- System health checks
- Configuration reference
## Troubleshooting
### Not Receiving Slack Alerts
```bash
# Test webhook manually
curl -X POST -H 'Content-type: application/json' \
--data '{"text":"Test alert"}' \
"$SLACK_WEBHOOK_URL"
# Check watcher has correct URL
docker compose exec alert_watcher env | grep SLACK_WEBHOOK_URL
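# Recreate the watcher so it picks up any .env changes
# (a plain `docker compose restart` does not reload environment variables)
docker compose up -d --force-recreate alert_watcher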
```
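
The manual `curl` test corresponds to a simple JSON `POST`. As a minimal sketch using only the Python standard library, the request could be built and sent as shown below. The function names are hypothetical, and `watcher/watcher.py` may use a different HTTP client or a richer message format.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build the POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_alert(webhook_url, text, timeout=5):
    """Send the message; return True if Slack answered with HTTP 200."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

Note that `urlopen` raises `urllib.error.HTTPError` on 4xx and 5xx responses, so an invalid webhook URL surfaces as an exception rather than a `False` return value.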
### Both Pools Unhealthy
```bash
# Check container status
docker compose ps
# Restart all services
docker compose restart app_blue app_green
# Wait for health checks
sleep 15
# Verify health
docker compose ps
```
### High Error Rate
```bash
# Check nginx logs for errors
docker compose exec nginx grep '"status":5' /var/log/nginx/access.log | tail -20 | jq .
# Switch to backup pool if needed
sed -i 's/ACTIVE_POOL=blue/ACTIVE_POOL=green/' .env
docker compose up -d nginx
```
## Advanced Configuration
### Adjusting Error Thresholds
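Update the value in `.env`, then recreate the watcher so the new value takes effect (`docker compose restart` alone does not reload environment variables):
```bash
# Raise the threshold to 5%
sed -i 's/^ERROR_RATE_THRESHOLD=.*/ERROR_RATE_THRESHOLD=5/' .env
docker compose up -d alert_watcher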
```
### Customizing Health Checks
Edit `docker-compose.yml`:
```yaml
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:3000/healthz || exit 1"]
  interval: 10s   # Check every 10 seconds
  timeout: 5s     # Time out after 5 seconds
  retries: 3      # Mark unhealthy after 3 consecutive failures
```
### Custom Alert Messages
Modify `watcher/watcher.py` to customize Slack message format and content.
## Production Checklist
- [ ] Slack webhook configured and tested
- [ ] Both pools using correct images
- [ ] Health check endpoints verified
- [ ] Error thresholds configured appropriately
- [ ] Test deployment script executed successfully
- [ ] Runbook reviewed by operations team
- [ ] Monitoring dashboard set up (optional)
- [ ] Backup and rollback procedures documented
## Support
For operational issues and alert responses, refer to:
- [runbook.md](runbook.md) - Complete operational guide
- `docker compose logs` - Container logs
- Slack alerts channel - Real-time notifications
## License
MIT