Merged
299 changes: 8 additions & 291 deletions docker/docker-compose-examples/analytics/README.md
@@ -1,294 +1,11 @@
# dotCMS Analytics Complete Stack
# Content Analytics Infrastructure

This docker-compose setup provides a complete dotCMS instance pre-configured with the full analytics stack including CubeJS, ClickHouse, Jitsu, and Keycloak.
The Docker Compose setup for the dotCMS Content Analytics infrastructure
(ClickHouse cluster + `ca-event-manager`) lives in its own repository to avoid
duplicating configuration files across repos:

## Architecture Overview
**https://github.com/dotCMS/dot-ca-event-manager**

```
┌─────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ dotCMS │────│ Analytics Stack │────│ Data Layer │
│ │ │ │ │ │
│ - dotCMS │ │ - Keycloak (IDP) │ │ - ClickHouse │
│ - OpenSearch│ │ - Jitsu (Events) │ │ - PostgreSQL │
│ - Database │ │ - Cube (Read) │ │ - Redis │
└─────────────┘ │ - Configurator │ └─────────────────┘
└──────────────────┘
```

## Services and Ports

### Core dotCMS Services
- **dotCMS**: http://localhost:8082 (HTTPS: 8443)
- **dotCMS Database**: PostgreSQL (internal only)
- **OpenSearch**: http://localhost:9200 (internal + external)
- **Glowroot**: http://localhost:4000 (monitoring)

### Analytics Services
- **Keycloak (IDP)**: http://localhost:61111
- **dotCMS Analytics Configurator**: http://localhost:8088
- **Jitsu (Event Collection)**: http://localhost:8081
- **CubeJS (Analytics Read)**: http://localhost:4001
- **ClickHouse (Data Warehouse)**: http://localhost:8124
- **Analytics Database**: PostgreSQL (internal only)

## Pre-configured Analytics Settings

The dotCMS instance is pre-configured with the following analytics settings via environment variables:

### Internal URLs (Container-to-Container)
```bash
ANALYTICS_IDP_URL="http://keycloak:8080/realms/dotcms/protocol/openid-connect/token"
ANALYTICS_APP_CONFIG_URL="http://dotcms-analytics:8080/c/customer1/cluster1/keys"
ANALYTICS_APP_WRITE_URL="http://jitsu:8001/api/v1/event"
ANALYTICS_APP_READ_URL="http://cube:4000"
```

### External URLs (Host Access)
For browser/external access, these map to:
```bash
ANALYTICS_IDP_URL="http://localhost:61111/realms/dotcms/protocol/openid-connect/token"
ANALYTICS_APP_CONFIG_URL="http://localhost:8088/c/customer1/cluster1/keys"
ANALYTICS_APP_WRITE_URL="http://localhost:8081/api/v1/event"
ANALYTICS_APP_READ_URL="http://localhost:4001"
```

### Client Configuration (customer1:cluster1)
```bash
Analytics Client ID: "analytics-customer-customer1"
Analytics Client Secret: "testsecret"
Analytics Key: [Auto-generated by configurator]
```
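
As a rough illustration of how the IDP URL and the client credentials above fit together, the sketch below builds the Keycloak token endpoint and requests a token with the OAuth 2.0 client-credentials grant. The function names `token_url` and `request_token` are hypothetical helpers, not part of this stack, and the sketch assumes the `analytics-customer-customer1` client is allowed to use that grant.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the Keycloak token endpoint for a realm.
# Usage: token_url BASE_URL REALM
token_url() {
  printf '%s/realms/%s/protocol/openid-connect/token\n' "$1" "$2"
}

# Hypothetical helper: request an access token with the OAuth 2.0
# client-credentials grant and print the raw JSON response.
# Usage: request_token BASE_URL REALM CLIENT_ID CLIENT_SECRET
# Assumption: the client is permitted to use this grant in Keycloak.
request_token() {
  curl -fsS "$(token_url "$1" "$2")" \
    -d 'grant_type=client_credentials' \
    --data-urlencode "client_id=$3" \
    --data-urlencode "client_secret=$4"
}
```

With the stack running, `request_token http://localhost:61111 dotcms analytics-customer-customer1 testsecret` would be expected to return a JSON document from Keycloak; a non-zero exit status suggests the IDP is unreachable or rejected the credentials.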

## Getting Started

### Quick Start Options

Choose your startup method based on your needs:

#### Option 1: Using the Startup Script (Recommended)
```bash
# Analytics services only (faster startup, less resources)
./start-analytics.sh --analytics-only

# Full stack with dotCMS (complete development environment)
./start-analytics.sh

# Force recreate containers (required for environment variable changes)
./start-analytics.sh --force-recreate
./start-analytics.sh --analytics-only --force-recreate

# Show help and service details
./start-analytics.sh --help
```

#### Option 2: Using Docker Compose Directly
```bash
# Analytics services only
docker-compose up -d

# Full stack with dotCMS
docker-compose --profile full up -d

# Force recreate containers (for environment variable changes)
docker-compose up -d --force-recreate
docker-compose --profile full up -d --force-recreate

# Stop everything (including dotCMS services)
docker-compose --profile full down
```

### Startup Modes

**Analytics Only Mode** (`--analytics-only`):
- Faster startup and lower resource usage
- Includes: Keycloak, Analytics API, Jitsu, Cube, ClickHouse, Redis, PostgreSQL
- Best for: Analytics development, testing API integrations

**Full Stack Mode** (Default):
- Complete development environment
- Includes: All analytics services + dotCMS + OpenSearch + dotCMS Database
- Best for: End-to-end testing, content + analytics workflows

### Wait for Services

```bash
# Check service health
docker-compose ps

# Monitor startup logs
docker-compose logs -f keycloak dotcms-analytics

# For full stack, monitor dotCMS startup
docker-compose logs -f dotcms
```
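
Instead of watching logs by hand, the wait can be scripted. The following is a minimal sketch of a polling helper; `wait_for_http` is a hypothetical name, and it assumes `curl` is installed on the host and that the chosen URL answers with a success status once the service is ready.

```shell
#!/usr/bin/env bash
# wait_for_http URL [TIMEOUT_SECONDS]
# Hypothetical helper: poll URL every 2 seconds until curl receives a
# successful response, or give up after TIMEOUT_SECONDS (default 120).
wait_for_http() {
  wf_url="$1"
  wf_timeout="${2:-120}"
  wf_waited=0
  until curl -fsS -o /dev/null "$wf_url" 2>/dev/null; do
    if [ "$wf_waited" -ge "$wf_timeout" ]; then
      echo "Timed out after ${wf_waited}s waiting for $wf_url" >&2
      return 1
    fi
    sleep 2
    wf_waited=$((wf_waited + 2))
  done
  echo "$wf_url is responding"
}
```

For example, `wait_for_http http://localhost:61111 180` would block until Keycloak's host port responds or three minutes pass. Which path on each service is a reliable readiness signal is not stated here, so treat the URL as something to adjust.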

### Access Your Services

**Analytics Services (Always Available):**
- **Keycloak Admin**: http://localhost:61111 (username `admin`, password `keycloak`)
- **Analytics API**: http://localhost:8088
- **Cube Analytics**: http://localhost:4001
- **Jitsu Events**: http://localhost:8081
- **ClickHouse**: http://localhost:8124

**dotCMS Services (Full Stack Only):**
- **dotCMS**: http://localhost:8082 (username `admin@dotcms.com`, password `admin`)
- **Glowroot**: http://localhost:4000

### Verify Analytics Configuration (Full Stack)

1. Access dotCMS at http://localhost:8082
2. Navigate to: Apps → dotExperiments-config
3. Analytics should be pre-configured with the URLs above
4. Test connection to verify all services are communicating

## Network Architecture

### Networks
- **dotcms-net**: Isolated network for dotCMS core services (dotCMS, database, OpenSearch)
- **analytics-net**: Isolated network for analytics services (Keycloak, Jitsu, CubeJS, ClickHouse)
- **Bridge**: dotCMS connects to both networks to communicate with analytics services

### Security
- Internal service communication uses container names (e.g., `keycloak:8080`)
- External access uses host ports (e.g., `localhost:61111`)
- Databases are isolated and only accessible within their respective networks
- Analytics uses JWT-based authentication with Keycloak

## Environment Variables

Key environment variables that can be customized:

```bash
# Ports
KEYCLOAK_HOST_PORT=61111
DOTCMS_ANALYTICS_HOST_PORT=8088
JITSU_HOST_PORT=8081
CUBE_HOST_PORT=4001
CH_HOST_PORT=8124

# Database
POSTGRESQL_DB=postgres
POSTGRESQL_USER=postgres
POSTGRESQL_PASS=postgres

# ClickHouse
CH_DB=clickhouse_test_db
CH_USER=clickhouse_test_user
CH_PWD=clickhouse_password

# Keycloak
KEYCLOAK_ADMIN=admin
KEYCLOAK_ADMIN_PASSWORD=keycloak

# dotCMS Experiment Features
DOT_ENABLE_EXPERIMENTS_AUTO_JS_INJECTION=true
```

### ⚠️ Important: Environment Variable Changes

**Environment variables are set when containers are first created and are NOT automatically updated when you restart services.**

To apply changes to environment variables:

1. **Stop and recreate containers:**
```bash
docker-compose down
docker-compose up -d --force-recreate
```

2. **Or use the startup script with force recreate:**
```bash
./start-analytics.sh --force-recreate
./start-analytics.sh --analytics-only --force-recreate
```

3. **For individual services:**
```bash
docker-compose up -d --force-recreate [service-name]
```

**Why this happens:** A container's environment variables are fixed when the container is created. `docker-compose restart` reuses the existing container, so it keeps the old values. To pick up new values from `docker-compose.yml`, the container must be recreated: `docker-compose up -d` recreates containers whose configuration has changed, and `--force-recreate` recreates them unconditionally.
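
One way to check whether a running container still carries an old value is to compare it with the value you expect. The sketch below uses two hypothetical helpers, `running_env` and `env_is_stale`; it assumes `printenv` exists inside the image and that the service name matches the one in `docker-compose.yml`.

```shell
#!/usr/bin/env bash
# running_env SERVICE VAR
# Hypothetical helper: print the value of VAR inside the running container.
# Assumption: the image provides `printenv`.
running_env() {
  docker-compose exec -T "$1" printenv "$2"
}

# env_is_stale RUNNING_VALUE DESIRED_VALUE
# Hypothetical helper: succeed (exit 0) when the two values differ,
# meaning the container must be recreated to pick up the new value.
env_is_stale() {
  [ "$1" != "$2" ]
}

# Example (not executed here):
#   if env_is_stale "$(running_env dotcms ANALYTICS_APP_READ_URL)" "http://cube:4000"; then
#     docker-compose up -d --force-recreate dotcms
#   fi
```

Keeping the comparison separate from the Docker call means the decision logic can be exercised without a running stack.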

## Key Features

### ✅ Complete Analytics Integration
- **Pre-configured dotCMS** with analytics URLs and client credentials
- **Defense-in-depth security** with multi-layer filtering
- **ClickHouse optimization** for customer partition elimination
- **JWT-based authentication** via Keycloak

### ✅ Development Ready
- **Hot-reload** CubeJS schema changes via volume mounts
- **Debug logging** enabled for troubleshooting
- **Health checks** for all critical services
- **Glowroot monitoring** for performance analysis

### ✅ Production Patterns
- **Separate databases** for dotCMS and analytics
- **Network isolation** between service layers
- **Persistent volumes** for data retention
- **Environment-based configuration**

## Troubleshooting

### Common Issues

1. **Services not starting:**
```bash
# Check logs
docker-compose logs [service-name]

# Restart specific service
docker-compose restart [service-name]
```

2. **Analytics connection issues:**
- Verify all services are running: `docker-compose ps`
- Check network connectivity: `docker-compose exec dotcms ping keycloak`
- Verify URLs in dotCMS analytics configuration

3. **Permission issues:**
```bash
# Fix volume permissions
sudo chown -R 1000:1000 ./setup/
```

4. **Port conflicts:**
- Modify port mappings in docker-compose.yml
- Update corresponding environment variables

### Useful Commands

```bash
# View service logs
docker-compose logs -f dotcms
docker-compose logs -f cube

# Access service shells
docker-compose exec dotcms bash
docker-compose exec analytics-postgres psql -U postgres

# Restart specific services
docker-compose restart dotcms keycloak

# Clean restart (-v also deletes the named volumes and the data in them)
docker-compose down -v
docker-compose up -d
```

## Next Steps

1. **Configure A/B Testing**: Set up experiments in dotCMS
2. **Create Dashboards**: Build analytics dashboards using CubeJS
3. **Monitor Performance**: Use Glowroot for application monitoring
4. **Scale Services**: Add replicas for high availability
5. **Production Hardening**: Implement proper secrets management and SSL certificates

## Security Considerations

- Change default passwords in production
- Use proper SSL certificates for external access
- Implement proper firewall rules
- Apply regular security updates to all services
- Monitor access logs and authentication attempts
Refer to the `docker/` directory in that repository for the full setup,
including the ClickHouse keeper, replica nodes, initialization scripts, and
the event manager service.