Version: 17.06
Source: https://github.com/nicolaka/checklist
docker run -t nicolaka/checklist:17.06
- Cluster Sizing and Zoning
- Supported and compatible versions (OS, Docker Engine, UCP, DTR)
- Adequate resources (manager vs. worker nodes)
  - Manager: 16 GB memory, 4 vCPUs, 1+ Gbps network, 32+ GB disk
  - Worker (minimum): 4 GB memory, 2 vCPUs, 100+ Mbps network, 8 GB disk
- Resources
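The sizing figures above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of the original checklist: `NodeSpec`, `MINIMUMS`, and `shortfalls` are hypothetical names, and the thresholds are simply the numbers listed above with 1 Gbps written as 1000 Mbps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Resources of one node (hypothetical helper type)."""
    mem_gb: float
    vcpus: int
    net_mbps: float
    disk_gb: float


# Minimums copied from the checklist; 1 Gbps is expressed as 1000 Mbps.
MINIMUMS = {
    "manager": NodeSpec(mem_gb=16, vcpus=4, net_mbps=1000, disk_gb=32),
    "worker": NodeSpec(mem_gb=4, vcpus=2, net_mbps=100, disk_gb=8),
}


def shortfalls(role: str, node: NodeSpec) -> list:
    """Return the names of the resources where `node` is below the minimum."""
    minimum = MINIMUMS[role]
    return [
        field
        for field in ("mem_gb", "vcpus", "net_mbps", "disk_gb")
        if getattr(node, field) < getattr(minimum, field)
    ]


if __name__ == "__main__":
    candidate = NodeSpec(mem_gb=8, vcpus=4, net_mbps=1000, disk_gb=64)
    print("as manager:", shortfalls("manager", candidate))
    print("as worker: ", shortfalls("worker", candidate))
```

An empty list means the node meets every listed minimum for that role. How the actual values are collected from each host is left out on purpose.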
- Redundant/Highly Available UCP managers
- Deployed in odd numbers (3,5,7) to maintain quorum
- Distributed across data centers or availability zones (1-1-1, 2-2-1, etc.)
- Fine-tuned orchestration settings
- Upstream TCP load balancing
- No application workloads on managers
- Automated join and leave process
- Labeled resources (networks, volumes, containers, services, secrets, nodes)
- Resources:
  - Docker EE Reference Architecture
  - UCP Architecture
  - Limiting Application Deployment Workers
  - Resource Labeling
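The advice on odd manager counts and zone layouts follows from majority-quorum arithmetic. The sketch below works through it under the usual assumption that the cluster needs a strict majority of managers to stay available. The function names are illustrative only and are not part of UCP or Swarm.

```python
def quorum(managers: int) -> int:
    """Smallest strict majority of `managers` nodes."""
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """How many managers can be lost while a majority still remains."""
    return managers - quorum(managers)


def survives_any_zone_loss(layout) -> bool:
    """True if a majority remains after losing any one whole zone.

    `layout` lists the manager count per zone, e.g. (2, 2, 1).
    """
    total = sum(layout)
    return all(total - zone >= quorum(total) for zone in layout)


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 7):
        print(f"{n} managers: quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
    for layout in ((1, 1, 1), (2, 2, 1), (3, 2)):
        print(layout, "survives loss of any zone:",
              survives_any_zone_loss(layout))
```

Going from 3 to 4 managers (or 5 to 6) does not raise the number of tolerated failures, which is why odd counts are preferred. The 1-1-1 and 2-2-1 layouts keep a majority when any single zone is lost, whereas a two-zone 3-2 split does not.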
- Redundant (3,5,7) DTR Replicas
- Replicated and secured image backend storage (NFS, S3, Azure Storage, etc.)
- Garbage collection enabled
- Security scanning enabled
- Resources
- Utilize the Docker EE RBAC model (subjects, grants, roles, collections, resources)
- AD/LDAP groups mapped to teams and organizations
- Docker Content Trust Signing and Enforcement
- Regular runs of Docker Bench for Security
- Restricted direct access (SSH/RDP)
- Utilize built-in Secrets functionality (encrypted, controlled)
- Rotate orchestration join keys
- Use built-in or your own CA for intra-cluster mTLS (Node Identity, Mgmt Traffic)
- Valid SSL/TLS certificates for UCP and DTR
- Resources:
- Pick the right networking driver for your application
- Select the proper publishing mode (ingress vs. host mode)
- Pick a suitable load-balancing mode (client-side = dnsrr, server-side = vip)
- Network latency < 100 ms
- Segment applications at L3 with overlays (one overlay network per application)
- Utilize the built-in encrypted overlay feature (app-to-app traffic encrypted)
- Pick the application subnet size carefully
- Designated non-overlapping subnets to be used by Docker for overlay networks
- Resources:
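Subnet sizing and the non-overlap requirement can be checked before any network is created. The following is a minimal sketch using Python's standard `ipaddress` module. The CIDR ranges are made-up examples and `find_overlaps` and `address_count` are hypothetical helper names.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs):
    """Return every pair of CIDR strings whose address ranges overlap."""
    nets = [(c, ipaddress.ip_network(c)) for c in cidrs]
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets, 2)
        if net_a.overlaps(net_b)
    ]


def address_count(cidr):
    """Total number of addresses in the subnet (not all are usable)."""
    return ipaddress.ip_network(cidr).num_addresses


if __name__ == "__main__":
    # Example ranges only; substitute the subnets planned for your site.
    planned = ["10.0.0.0/24", "10.0.1.0/24", "10.0.0.128/25", "192.168.0.0/16"]
    for a, b in find_overlaps(planned):
        print(f"overlap: {a} and {b}")
    for cidr in ("10.0.0.0/24", "10.0.0.0/27"):
        print(cidr, "holds", address_count(cidr), "addresses")
```

The count is the raw size of the range. The number of addresses actually available to containers is smaller, because some are consumed by the network itself, so it is safer to leave headroom when choosing a prefix length.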
- Production-ready configured engine storage backend
- Replicated and secure DTR storage backend
- Certified and tested application data storage plugin for replicating application data
- Resources:
- External centralized logging for engine and application container logs
- Local logging for active troubleshooting
- Host-level and container-level resource monitoring
- DTR image backend storage monitoring
- Docker engine storage monitoring
- Use built-in application health checking functionality
- Resources:
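As one illustration of the storage monitoring items above, the sketch below flags a filesystem whose usage crosses a threshold, using only the standard library. The 80% threshold and the helper names are assumptions. On a Linux Docker host the path to watch would typically be the engine's data root (`/var/lib/docker` by default), while the demo uses the current directory so that it runs anywhere.

```python
import shutil


def used_percent(total: int, used: int) -> float:
    """Percentage of `total` bytes that are in use."""
    return 100.0 * used / total


def storage_alert(path: str, threshold: float = 80.0) -> bool:
    """True if the filesystem holding `path` is at or above `threshold` percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return used_percent(usage.total, usage.used) >= threshold


if __name__ == "__main__":
    path = "."  # assumption for the demo; point this at the storage to monitor
    usage = shutil.disk_usage(path)
    print(f"{path}: {used_percent(usage.total, usage.used):.1f}% used, "
          f"alert={storage_alert(path)}")
```

In practice a check like this would feed an existing monitoring system rather than print to the console.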
- UCP and DTR are well integrated (SSO, DCT, etc.)
- CI/CD tooling (Jenkins, Bamboo, CircleCI, etc.)
- Development tooling (dev machines, IDEs)
- Configuration automation tools (Puppet, Chef, Ansible, Salt)
- Resource provisioning systems (Terraform, etc.)
- Change management systems
- Internal/external DNS or other service discovery and registration systems
- Load balancing for both the management plane and each of the applications (L4/L7)
- Incident/ticketing management systems (ServiceNow, etc.)
- Regular backups (weekly recommended) of UCP, DTR, and Swarm
- Well-tested, automated, and documented procedures for:
  - Platform restoration
  - Upgrade and downgrade
  - Application recovery
- Resources:
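The weekly backup recommendation is easy to verify automatically. The following is a minimal sketch, assuming the time of the last successful backup of each component is already known. `stale_backups` is a hypothetical helper and the timestamps are invented for the example.

```python
from datetime import datetime, timedelta


def stale_backups(last_backups, now, max_age=timedelta(days=7)):
    """Return the sorted names of components whose last backup is older than `max_age`."""
    return sorted(
        name for name, taken_at in last_backups.items()
        if now - taken_at > max_age
    )


if __name__ == "__main__":
    # Invented timestamps for illustration only.
    now = datetime(2024, 1, 15, 12, 0)
    last_backups = {
        "ucp": datetime(2024, 1, 14, 2, 0),
        "dtr": datetime(2024, 1, 5, 2, 0),
        "swarm": datetime(2024, 1, 9, 2, 0),
    }
    print("stale:", stale_backups(last_backups, now))
```

A backup taken exactly seven days ago is still treated as current here. Tighten `max_age` if the recovery objectives call for more frequent backups.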
- Multi-platform image pull and push to DTR
- Confirm users have the right level of access to their respective resources
- Confirm application resource limits work as expected
- End-to-end stack deployment from CLI and UI
- Updating applications with new configuration, images, and networks using rolling updates