**Chapter 9: Docker Image Optimization**

Efficient container images are not merely about disk space—they directly impact deployment velocity, security exposure, and runtime performance. Large images slow CI/CD pipelines, increase vulnerability surface area, and delay autoscaling responses. This chapter provides systematic approaches to analyzing, measuring, and optimizing Docker images using industry-standard tools and methodologies.

---

### 9.1 Image Size Analysis

Before optimizing, you must measure. Docker provides built-in inspection tools, but specialized utilities offer deeper visibility into layer composition and inefficiencies.

#### Docker Built-in Inspection

**Image History:**
```bash
docker history myapp:latest
```

Output shows each layer with size and creation command:
```
IMAGE          CREATED        CREATED BY                                      SIZE      COMMENT
abc123...      5 minutes ago  CMD ["node" "server.js"]                        0B        buildkit.dockerfile.v0
def456...      5 minutes ago  COPY . . # buildkit                             2.45MB    buildkit.dockerfile.v0
789ghi...      6 minutes ago  RUN npm ci # buildkit                           156MB     buildkit.dockerfile.v0
```
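
To rank layers by size instead of reading them in build order, the history can be printed in raw bytes and sorted. The following is a minimal sketch: `top_layers` is a hypothetical helper name, and it assumes the history is emitted as one `bytes<TAB>command` line per layer via `--human=false` and `--format`.

```bash
# top_layers N: read "bytes<TAB>command" lines on stdin and print the N largest.
# Hypothetical helper; N defaults to 3.
top_layers() {
  sort -t "$(printf '\t')" -k1,1nr | head -n "${1:-3}"
}

# Intended usage, assuming a local image tagged myapp:latest:
#   docker history --human=false --no-trunc \
#     --format '{{.Size}}\t{{.CreatedBy}}' myapp:latest | top_layers 3
```

Keeping the parsing separate from the `docker` call makes the helper easy to try on saved output from CI logs.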

**Image Inspect:**
```bash
docker inspect myapp:latest --format='{{.Size}}'
# Returns size in bytes (convert: /1024/1024 for MB)
```
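
The byte count is easier to read after conversion. A small sketch of that conversion follows; `bytes_to_mib` is a hypothetical helper name, and it divides by 1024 × 1024, so the result is in mebibytes.

```bash
# bytes_to_mib BYTES: convert a byte count to mebibytes with one decimal place.
bytes_to_mib() {
  awk -v bytes="$1" 'BEGIN { printf "%.1f\n", bytes / (1024 * 1024) }'
}

# Intended usage, assuming a local image tagged myapp:latest:
#   bytes_to_mib "$(docker inspect --format '{{.Size}}' myapp:latest)"
```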

**System Usage:**
```bash
docker system df -v
# Shows images, containers, volumes, and build cache usage
```

#### Dive: Layer Exploration Tool

**Dive** visualizes Docker image layers and helps identify wasted space.

**Installation:**
```bash
# macOS
brew install dive

# Linux
wget https://github.com/wagoodman/dive/releases/download/v0.12.0/dive_0.12.0_linux_amd64.deb
sudo dpkg -i dive_0.12.0_linux_amd64.deb

# Docker (no install)
docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  wagoodman/dive:latest myapp:latest
```

**Using Dive:**
```bash
dive myapp:latest
```

**Key Metrics to Watch:**
- **Efficiency Score:** Percentage of files that are not duplicated or wasted
- **Wasted Space:** Files overwritten or removed in subsequent layers
- **Layer Sizes:** Individual layer contributions to total size

**Interpretation:**
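- Green entries are files added in the selected layer
- Yellow entries are files modified relative to lower layers
- Red entries are files removed in the selected layer; if they were added by an earlier layer, they still occupy space there (classic layer bloat)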

#### Container Diff

Compare image versions to identify size regressions:

```bash
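# Install container-diff and put it on the PATH
curl -LO https://storage.googleapis.com/container-diff/latest/container-diff-linux-amd64
chmod +x container-diff-linux-amd64
sudo mv container-diff-linux-amd64 /usr/local/bin/container-diff

# Compare two images (daemon:// reads them from the local Docker daemon)
container-diff diff \
  --type=file \
  --type=size \
  daemon://myapp:v1.0.0 \
  daemon://myapp:v1.0.1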
```

**CI/CD Integration:**
```bash
# Fail build if image exceeds 200MB
SIZE=$(docker inspect -f "{{ .Size }}" myapp:latest)
MAX_SIZE=$((200 * 1024 * 1024))  # 200MB in bytes

if [ "$SIZE" -gt "$MAX_SIZE" ]; then
  echo "Image size $SIZE exceeds maximum $MAX_SIZE"
  exit 1
fi
```

**Key Takeaway:** Measurement precedes optimization. Use `docker history` for quick checks and **Dive** for detailed layer analysis. Establish image size budgets in CI/CD pipelines to prevent regression.
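
An absolute budget catches large images, but a relative check catches regressions earlier. The sketch below compares two byte counts; the function names and the 5% threshold in the usage comment are illustrative assumptions, and the previous size is assumed to be greater than zero.

```bash
# growth_pct OLD NEW: print the percentage change from OLD to NEW bytes.
growth_pct() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) * 100 / old }'
}

# within_growth_budget OLD NEW MAX_PCT: succeed only if growth is at most MAX_PCT percent.
within_growth_budget() {
  awk -v old="$1" -v new="$2" -v max="$3" \
    'BEGIN { exit !((new - old) * 100 / old <= max) }'
}

# Intended usage, assuming both tags exist locally:
#   old=$(docker inspect -f '{{ .Size }}' myapp:v1.0.0)
#   new=$(docker inspect -f '{{ .Size }}' myapp:v1.0.1)
#   within_growth_budget "$old" "$new" 5 || { echo "grew $(growth_pct "$old" "$new")%"; exit 1; }
```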

---

### 9.2 Layer Inspection and Optimization

Each Dockerfile instruction that changes the filesystem (`RUN`, `COPY`, `ADD`) creates a layer; other instructions only add metadata. Understanding layer mechanics enables strategic optimization: balancing cache efficiency against layer count.

#### Layer Fundamentals

**Union Filesystem Behavior:**
- Layers are read-only except the top container layer
- Files deleted in upper layers hide (but don't remove) lower layer files
- Each layer stores the delta from the previous state
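
The cost of "delete in a later layer" can be seen without Docker. The sketch below simulates two layers as tar archives, using the `.wh.` whiteout naming from the OCI image format; the file names are made up for illustration.

```bash
# Layer 1 adds a 1 MiB file. Layer 2 "deletes" it with a whiteout marker,
# which hides the file but removes nothing from layer 1.
work=$(mktemp -d)
mkdir -p "$work/layer1" "$work/layer2"
dd if=/dev/zero of="$work/layer1/big.bin" bs=1024 count=1024 2>/dev/null
: > "$work/layer2/.wh.big.bin"

tar -cf "$work/layer1.tar" -C "$work/layer1" .
tar -cf "$work/layer2.tar" -C "$work/layer2" .

layer1_bytes=$(wc -c < "$work/layer1.tar" | tr -d ' ')
layer2_bytes=$(wc -c < "$work/layer2.tar" | tr -d ' ')
total_bytes=$((layer1_bytes + layer2_bytes))
echo "layer1=$layer1_bytes layer2=$layer2_bytes total=$total_bytes"
rm -rf "$work"
```

The total is larger than layer 1 alone, even though the merged filesystem no longer shows `big.bin`.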

**The Layer Count Balance:**
- **Few layers:** Poor cache granularity, slow rebuilds
- **Many layers:** More metadata, plus bloat whenever files are created in one layer and cleaned up in a later one

**Target:** As a rule of thumb, 15-25 layers for complex applications and 5-10 for minimal images.

#### Squashing Layers

**Caution:** Layer squashing destroys cache benefits. Use only for final production images where size trumps build speed.

```bash
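# Experimental squash flag (legacy builder only; the daemon must run in experimental mode)
DOCKER_BUILDKIT=0 docker build --squash -t myapp:squashed .

# BuildKit has no squash option. To flatten, export a container's filesystem and
# re-import it; docker import drops metadata such as CMD, so restore it with --change
docker create --name flatten myapp:latest
docker export flatten | docker import --change 'CMD ["node", "server.js"]' - myapp:squashed
docker rm flatten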
```

**When to Squash:**
- Security scrubbing (removing build tools entirely)
- Final production artifacts where rebuild frequency is low
- Base image creation

#### Minifying with SlimToolkit (formerly Docker-Slim)

**SlimToolkit** (formerly Docker-Slim) analyzes runtime behavior and removes unused files:

```bash
# Install
curl -sL https://raw.githubusercontent.com/slimtoolkit/slim/master/scripts/install-slim.sh | sudo -E bash

# Use
slim build --target myapp:latest --tag myapp:slim

# The project reports reductions of up to 30x; results vary widely by application
```

**How it works:**
1. Runs container with instrumentation
2. Monitors file system access
3. Creates new image with only accessed files
4. Removes package managers, unused libraries, documentation

**Limitations:**
- Dynamic loading (plugins, reflection) may be missed
- Requires comprehensive test coverage during build
- Adds build complexity
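
Because files that were never touched during the probe run are dropped, it helps to check the minified image for paths the application needs at runtime. The following is a sketch only: `missing_paths` is a hypothetical helper, and the paths in the usage comment are placeholders for your own list.

```bash
# missing_paths PATH...: read a file listing on stdin (one path per line, as
# printed by `tar -t`) and print every required PATH that is absent from it.
missing_paths() {
  listing=$(cat)
  for required in "$@"; do
    printf '%s\n' "$listing" | grep -qxF -- "$required" || printf '%s\n' "$required"
  done
}

# Intended usage, assuming the minified image is tagged myapp:slim:
#   cid=$(docker create myapp:slim)
#   docker export "$cid" | tar -t | missing_paths app/server.js etc/ssl/certs/ca-certificates.crt
#   docker rm "$cid"
```

An empty result means every listed path survived minification; it does not prove that dynamically loaded files outside the list are present.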

#### Chain Optimization

Combine RUN commands to reduce layers, but maintain readability:

```dockerfile
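# Instead of (the final rm runs in its own layer, so it cannot shrink earlier ones):
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN rm -rf /var/lib/apt/lists/*

# Use (install and clean up in the same layer):
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*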
```

**Key Takeaway:** Optimize layers by **coalescing related operations** (install + cleanup) while keeping independent steps (dependencies vs. source code) separate for cache efficiency. Use squashing and minification only when cache preservation is unnecessary.

---

### 9.3 The .dockerignore File

The build context (the files sent to the Docker daemon) directly affects build performance and can leak sensitive data into images. A comprehensive `.dockerignore` is as critical as `.gitignore`.

#### Syntax and Patterns

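`.dockerignore` patterns follow Go's `filepath.Match` rules, extended with `**` and `!`, and are matched relative to the root of the build context: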

```gitignore
# Comment
*.log
temp/
node_modules/
.git/
```

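**Special patterns:**
- `**/*.go`: any `.go` file at any depth
- `!important.log`: exception (re-include this file)
- `[a-z]*`: names starting with a character in the given range
- `?ingle`: `?` matches exactly one character
- `*.log`: matches only at the context root; unlike `.gitignore`, nested files need `**/*.log`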

#### Comprehensive Template

```gitignore
# Patterns match relative to the context root; prefix with **/ to match at any depth

# Git
.git
.gitignore
.gitattributes
.gitmodules

# CI/CD
.github/
.gitlab-ci.yml
.travis.yml
.azure-pipelines/
jenkins/
drone.yml

# Documentation
*.md
docs/
README*
LICENSE
CHANGELOG

# IDE
.idea/
.vscode/
*.swp
*.swo
*~
.project
.classpath
.settings/

# Dependencies (installed in container)
**/node_modules/
vendor/
.pnp.*
.yarn/*
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/sdks
!.yarn/versions

# Build artifacts (built in container)
dist/
build/
target/
out/
*.exe
*.dll
*.so
*.dylib

# Test files
test/
tests/
__tests__/
**/*.test.js
**/*.spec.js
coverage/
.nyc_output/
jest.config.js

# Environment and secrets
**/.env
**/.env.*
**/.envrc
**/*.pem
**/*.key
**/*.crt
secrets/
credentials/
.aws/
.gcp/
.kube/

# Docker itself
Dockerfile*
docker-compose*.yml
.dockerignore
.docker/

# Logs
logs/
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Temporary files
tmp/
temp/
*.tmp
.cache/

# OS files
**/.DS_Store
**/Thumbs.db
**/desktop.ini

# Language specific
**/__pycache__/
**/*.py[cod]
**/*$py.class
.Python
*.egg-info/
.pytest_cache/
.mypy_cache/
.ruff_cache/
.venv/
venv/
ENV/

# Compiled files
*.com
*.class
*.o
*.a
```

#### Context Size Verification

```bash
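# Report the context size that BuildKit transfers
docker build --no-cache --progress=plain -t temp . 2>&1 | grep "transferring context"

# Or use du to estimate (GNU du; --exclude is not available in macOS/BSD du)
du -sh .  # Total size
du -sh --exclude=node_modules --exclude=.git .  # Approximate size with common exclusions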
```
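
When the context is larger than expected, listing the biggest top-level entries usually points to what belongs in `.dockerignore`. A minimal sketch, with `largest_entries` as a hypothetical helper name; it measures the directory on disk and does not apply existing `.dockerignore` rules.

```bash
# largest_entries DIR [N]: print the N largest top-level entries of DIR as
# "kilobytes<TAB>path", biggest first. N defaults to 5.
largest_entries() {
  find "$1" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n "${2:-5}"
}

# Intended usage from the project root:
#   largest_entries . 10
```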

#### Negative Patterns (Exceptions)

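Re-include specific files that an earlier pattern excluded. Order matters: the last matching pattern wins, so exceptions must follow the exclusion.

```gitignore
# Ignore all markdown except README and the API docs
**/*.md
!README.md
!docs/API.md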

# Ignore config except production
config/*
!config/production.yml
```

#### Impact on Build Cache

`.dockerignore` affects cache invalidation:
- Files ignored are not considered in `COPY` checksums
- Changing ignored files does not invalidate layers
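
The effect can be modeled with a content checksum that skips ignored files. This is a simplified illustration, not BuildKit's actual algorithm (which also considers file metadata); `context_digest` is a hypothetical helper, and `*.log` stands in for whatever `.dockerignore` excludes.

```bash
# context_digest DIR: checksum every file under DIR except *.log, then checksum
# the sorted list so the whole tree is summarized by a single value.
context_digest() {
  ( cd "$1" && find . -type f ! -name '*.log' -exec cksum {} + | LC_ALL=C sort | cksum )
}

# Intended usage: compare the value before and after editing a file.
#   context_digest .
```

Appending to a `.log` file leaves the digest unchanged, while editing a source file changes it, which mirrors why ignored files cannot invalidate a `COPY` layer.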

**Key Takeaway:** A well-crafted `.dockerignore` reduces **build context transfer time**, prevents **secret leakage** into images, and improves **cache hit rates** by excluding files that change frequently but aren't needed (like logs or local config).

---

### 9.4 Base Image Selection Strategies

Base image selection is the single most impactful decision for image size and security. This section provides decision matrices for choosing between Alpine, Debian Slim, Distroless, and Scratch.

#### Alpine Linux

**Characteristics:**
- Size: ~5MB base
- musl libc (not glibc)
- busybox utilities
- Package manager: `apk`

**Best for:**
- Statically compiled languages (Go, Rust)
- Simple services with no complex dependencies
- Resource-constrained environments (IoT, edge)

**Considerations:**
- musl vs glibc compatibility issues (DNS resolution, threading)
- Timezone data requires explicit installation
- Debugging harder (no bash by default)
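
When debugging a compatibility problem, a first step is confirming which libc an image actually ships. The sketch below classifies the text printed by `ldd --version`; `libc_flavor` is a hypothetical helper, and the matched strings are assumptions based on typical musl and glibc output.

```bash
# libc_flavor: read the output of `ldd --version` on stdin and print
# "musl", "glibc", or "unknown".
libc_flavor() {
  text=$(cat)
  case "$text" in
    *musl*)                        echo "musl" ;;
    *GLIBC*|*"GNU libc"*|*glibc*)  echo "glibc" ;;
    *)                             echo "unknown" ;;
  esac
}

# Intended usage inside a container (musl prints its banner on stderr):
#   ldd --version 2>&1 | libc_flavor
```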

```dockerfile
FROM alpine:3.18
RUN apk add --no-cache ca-certificates tzdata
```

#### Debian Slim (bookworm-slim, bullseye-slim)

**Characteristics:**
- Size: ~30-50MB base
- glibc (full compatibility)
- Standard GNU utilities
- Package manager: `apt`

**Best for:**
- Python, Node.js, Ruby applications
- Applications requiring glibc compatibility
- Teams needing familiar debugging tools

```dockerfile
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```

#### Distroless (Google)

**Characteristics:**
- Size: ~2MB (`static`) to ~20MB (`base`); language runtime images are larger
- No package manager
- No shell
- Minimal system libraries
- Runs as root by default; use the `:nonroot` tag or `USER nonroot` to drop privileges

**Best for:**
- Production applications where debug access isn't needed
- Maximum security (minimal attack surface)
- Java, Python, Node.js, and statically compiled binaries

**Variants:**
- `gcr.io/distroless/static-debian12`: static binaries (Go, Rust)
- `gcr.io/distroless/base-debian12`: glibc + openssl
- `gcr.io/distroless/java17-debian12`: JRE 17
- `gcr.io/distroless/nodejs20-debian12`: Node.js 20

```dockerfile
FROM gcr.io/distroless/nodejs20-debian12
COPY --chown=nonroot:nonroot . /app
WORKDIR /app
USER nonroot
CMD ["index.js"]
```

#### Scratch (Empty)

**Characteristics:**
- Size: 0 bytes base
- Nothing included
- No shell, no utilities, no libraries

**Best for:**
- Statically compiled Go/Rust binaries
- When you bundle all dependencies

**Requirements:**
- Must statically link (`CGO_ENABLED=0` for Go)
- Must copy CA certificates and timezone data manually
- No libc, `/etc/passwd`, or `/etc/nsswitch.conf`: lookups that depend on glibc NSS won't work, although Go's built-in resolver handles DNS in static builds

```dockerfile
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
```

#### Selection Decision Matrix

| Requirement | Alpine | Debian Slim | Distroless | Scratch |
|-------------|--------|-------------|------------|---------|
| **Size** | ★★★★★ | ★★★☆☆ | ★★★★☆ | ★★★★★ |
| **Compatibility** | ★★★☆☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
| **Security** | ★★★☆☆ | ★★☆☆☆ | ★★★★★ | ★★★★★ |
| **Debuggability** | ★★★☆☆ | ★★★★★ | ★☆☆☆☆ | ☆☆☆☆☆ |
| **Build Complexity** | Low | Low | Medium | High |

**Recommendation:**
- **Development:** Debian Slim (debugging ease)
- **Production (General):** Distroless (security + compatibility)
- **Production (Max Security):** Scratch (Go/Rust only)
- **Small image with a shell and package manager:** Alpine (if musl works)

**Key Takeaway:** Choose **Distroless** for production microservices where you don't need shell access. Use **Alpine** only after verifying musl compatibility. Reserve **Scratch** for expertly crafted static binaries.

---

### 9.5 Multi-Platform Builds

Modern infrastructure requires images for multiple CPU architectures (AMD64 for most servers, ARM64 for Apple Silicon and cloud ARM instances, 32-bit ARM for IoT). BuildKit can build all of them from a single machine, either by cross-compiling or by emulating the target architecture with QEMU.

#### Buildx Setup

```bash
# Create multi-platform builder
docker buildx create --name multiplatform --use
docker buildx inspect --bootstrap

# Verify platforms
docker buildx ls
```

#### Cross-Platform Dockerfile

```dockerfile
# syntax=docker/dockerfile:1
FROM --platform=$BUILDPLATFORM golang:1.21 AS builder
ARG TARGETOS
ARG TARGETARCH
WORKDIR /app
COPY . .
RUN GOOS=$TARGETOS GOARCH=$TARGETARCH CGO_ENABLED=0 go build -o app .

FROM gcr.io/distroless/static:nonroot
COPY --from=builder /app/app /app
ENTRYPOINT ["/app"]
```

**Build Commands:**
```bash
# Build for current platform (--load imports the result into the local image store)
docker buildx build --load -t myapp:latest .

# Build for specific platform
docker buildx build --platform linux/amd64 --load -t myapp:amd64 .

# Build for multiple platforms (creates manifest list)
docker buildx build \
  --platform linux/amd64,linux/arm64,linux/arm/v7 \
  -t myapp:latest \
  --push .
```

#### Platform-Specific Instructions

```dockerfile
# Install different dependencies per architecture
FROM node:20-alpine
RUN case "$(uname -m)" in \
    x86_64) echo "AMD64 detected" ;; \
    aarch64) echo "ARM64 detected" ;; \
    armv7l) echo "ARMv7 detected" ;; \
    *) echo "Unknown: $(uname -m)" ;; \
    esac
```
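
The same `uname -m` values can be translated into the platform strings that `--platform` expects, which is handy in scripts that build for the host architecture. A minimal sketch; `docker_platform` is a hypothetical helper and covers only the three architectures discussed here.

```bash
# docker_platform MACHINE: translate a `uname -m` value into the platform string
# accepted by `docker buildx build --platform`.
docker_platform() {
  case "$1" in
    x86_64|amd64)   echo "linux/amd64" ;;
    aarch64|arm64)  echo "linux/arm64" ;;
    armv7l)         echo "linux/arm/v7" ;;
    *) echo "unsupported machine type: $1" >&2; return 1 ;;
  esac
}

# Intended usage:
#   docker buildx build --platform "$(docker_platform "$(uname -m)")" -t myapp:latest .
```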

#### Manifest Lists

Buildx creates **manifest lists** (multi-arch images) automatically. When users pull `myapp:latest`, Docker automatically selects the matching architecture.

**Inspect manifests:**
```bash
docker manifest inspect myapp:latest
```

**Key Takeaway:** Multi-platform builds ensure your applications run natively on **ARM64 Macs** (M1/M2) and **cloud ARM instances** (AWS Graviton, Azure Cobalt) without emulation overhead. This is essential for modern CI/CD supporting diverse development environments.

---

### 9.6 Performance Benchmarking

Optimization requires measurement. Establish baseline metrics for build time, startup latency, and runtime resource consumption.

#### Build Time Benchmarking

```bash
# Hyperfine for repeatable timing (clear the build cache before each run to measure cold builds)
hyperfine --prepare 'docker builder prune -af' \
  'docker build -t test1 -f Dockerfile.slow .' \
  'docker build -t test2 -f Dockerfile.optimized .'
```

**CI/CD Integration:**
```bash
# Time the build
start=$(date +%s)
docker build -t myapp:$CI_COMMIT_SHA .
end=$(date +%s)
build_time=$((end-start))

# Report metric
echo "build_time_seconds $build_time" | curl -X POST \
  --data-binary @- \
  http://metrics-server/metrics/job/docker-build
```
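
The start/end arithmetic can be wrapped in a reusable function so the same timing logic applies to any step. This is a sketch under the assumption that whole-second resolution is enough; `elapsed_seconds` is a hypothetical helper name.

```bash
# elapsed_seconds CMD...: run CMD, send its output to stderr, print the
# wall-clock duration in whole seconds on stdout, and return CMD's exit status.
elapsed_seconds() {
  start=$(date +%s)
  status=0
  "$@" >&2 || status=$?
  end=$(date +%s)
  echo $((end - start))
  return "$status"
}

# Intended usage in a CI job:
#   build_time=$(elapsed_seconds docker build -t "myapp:$CI_COMMIT_SHA" .)
```

Sending the command's own output to stderr keeps stdout clean, so the captured value is only the duration.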

#### Startup Time Measurement

Container startup speed affects autoscaling responsiveness.

```bash
# Measure container start overhead (this overrides CMD and needs `echo` in the image)
time docker run --rm myapp:latest echo "Started"

# Detailed analysis with instrumentation
docker run --rm \
  -e LOG_LEVEL=debug \
  myapp:latest 2>&1 | grep -E "(boot|start|init|ready)"
```

**Application-Level Instrumentation:**
```javascript
// Node.js startup timing
const start = Date.now();
// ... initialization ...
console.log(`Startup time: ${Date.now() - start}ms`);
```
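
From the outside, startup time can be approximated by polling a readiness check until it succeeds. The sketch below is one way to do that; `wait_until` is a hypothetical helper, and the port and `/healthz` path in the usage comment are assumptions about the application.

```bash
# wait_until TIMEOUT CMD...: retry CMD once per second until it succeeds, then
# print the seconds waited. Returns 1 if TIMEOUT seconds pass first.
wait_until() {
  timeout=$1
  shift
  start=$(date +%s)
  until "$@" >/dev/null 2>&1; do
    if [ $(( $(date +%s) - start )) -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
  echo $(( $(date +%s) - start ))
}

# Intended usage, assuming the app listens on port 8080 and serves /healthz:
#   docker run -d --rm -p 8080:8080 myapp:latest
#   wait_until 30 curl -fsS http://localhost:8080/healthz
```

The one-second polling interval limits precision; for sub-second numbers, rely on the application-level timing above.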

#### Runtime Performance

**Memory Usage:**
```bash
# Monitor container stats
docker stats myapp --no-stream --format "table {{.Container}}\t{{.MemUsage}}\t{{.MemPerc}}"

# Or inspect cgroup limits
docker inspect myapp --format='{{.HostConfig.Memory}}'
```
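
`docker stats` reports memory in human-readable units such as `12.5MiB`, which is awkward to compare against a numeric limit. A minimal conversion sketch follows; `mem_to_bytes` is a hypothetical helper and handles only the `B`, `KiB`, `MiB`, and `GiB` suffixes.

```bash
# mem_to_bytes VALUE: convert a value such as "12.5MiB" to a whole number of bytes.
# Fails for unit suffixes other than B, KiB, MiB, and GiB.
mem_to_bytes() {
  awk -v v="$1" 'BEGIN {
    n = v + 0                       # numeric prefix, e.g. 12.5
    u = v; sub(/^[0-9.]+/, "", u)   # unit suffix, e.g. MiB
    m["B"] = 1; m["KiB"] = 1024; m["MiB"] = 1024 * 1024; m["GiB"] = 1024 * 1024 * 1024
    if (!(u in m)) exit 1
    printf "%.0f\n", n * m[u]
  }'
}

# Intended usage, assuming a running container named myapp:
#   used=$(docker stats myapp --no-stream --format '{{.MemUsage}}' | cut -d' ' -f1)
#   mem_to_bytes "$used"
```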

**CPU Stress Testing:**
```bash
# Stress test (assumes a shell and stress-ng are installed in the image)
docker run --rm -it myapp:latest /bin/sh -c "stress-ng --cpu 4 --timeout 60s"
```

#### Image Pull Performance

Measure deployment speed (simulating Kubernetes image pull):

```bash
# Clear local cache
docker rmi registry/myapp:latest
docker system prune -f

# Time the pull
time docker pull registry/myapp:latest
```

**Layer Parallelization:**
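The Docker daemon downloads image layers in parallel (three at a time by default, tunable with `max-concurrent-downloads` in `daemon.json`), so several moderately sized layers can pull faster than one very large layer.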

**Key Takeaway:** Establish **SLOs (Service Level Objectives)** for image metrics:
- Build time: < 5 minutes for incremental builds
- Image size: < 500MB for application images
- Startup time: < 10 seconds from `docker run` to ready state
- Base image size: < 100MB
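
Targets like these are only useful if something enforces them. The sketch below checks several metrics in one pass; `check_slos` and the metric names are hypothetical, and the input format (`name value limit` per line) is an assumption made for this example.

```bash
# check_slos: read "name value limit" lines on stdin, print every metric whose
# value exceeds its limit, and exit non-zero if any did.
check_slos() {
  awk '$2 > $3 { printf "SLO violated: %s = %s (limit %s)\n", $1, $2, $3; bad = 1 }
       END { exit bad }'
}

# Intended usage in CI, with values collected by earlier steps:
#   printf '%s\n' "build_seconds $build_time 300" "image_mb $image_mb 500" | check_slos
```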

---

### 9.7 BuildKit Optimization Features

Expanding on Chapter 8, BuildKit provides advanced optimization techniques specifically for image size and build speed.

#### Inline Cache for CI/CD

Embed cache metadata in the pushed image itself, so later CI runs can reuse its layers without maintaining a separate cache artifact:

```bash
# Build with inline cache metadata
docker buildx build \
  --cache-to type=inline \
  --cache-from type=registry,ref=myapp:latest \
  -t myapp:latest \
  --push .
```

#### Garbage Collection Tuning

Configure how aggressively BuildKit prunes its cache:

```toml
# /etc/buildkit/buildkitd.toml
[worker.oci]
  gc = true
  gckeepstorage = 10000000000  # Keep roughly 10 GB of cache
```

#### Optimization Frontend

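Pin the Dockerfile frontend version to opt in to newer syntax features; experimental features ship in the separate labs channel (for example `docker/dockerfile:1-labs`):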

```dockerfile
# syntax=docker/dockerfile:1.6
FROM node:20-alpine
# Access to newest syntax features
```

#### No-Cache for Specific Steps

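Force a specific `RUN` step to re-execute by changing a build argument that it references:

```dockerfile
RUN --mount=type=cache,target=/root/.npm \
    npm ci
# The cache mount persists across builds. Shell substitutions such as $(date ...)
# are not evaluated in Dockerfile flags, so pass a changing value from the CLI:
#   docker build --build-arg CACHE_BUST=$(date +%Y%m%d) .
ARG CACHE_BUST
RUN --mount=type=cache,target=/root/.npm \
    echo "cache bust: ${CACHE_BUST}" && npm ci
```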

**Key Takeaway:** BuildKit's advanced features—**inline caching**, **garbage collection**, and **mount semantics**—provide the fine-grained control needed for enterprise CI/CD pipelines where build performance directly impacts developer productivity.

---

### 9.8 Image Signing and Provenance

Optimization includes trust and verification. Signed images prevent tampering and establish supply chain integrity.

#### Cosign (Sigstore)

**Installation:**
```bash
brew install cosign
# or
go install github.com/sigstore/cosign/v2/cmd/cosign@latest
```

**Signing:**
```bash
# Generate key pair (or use KMS)
cosign generate-key-pair

# Sign image
cosign sign --key cosign.key myregistry/myapp:1.0.0

# Verify
cosign verify --key cosign.pub myregistry/myapp:1.0.0
```
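
Tags are mutable, so a signature is most meaningful when it targets an immutable digest reference. The sketch below checks that a reference is pinned by digest before signing; `is_digest_ref` is a hypothetical helper, and the usage comment assumes the image has been pushed so that a repo digest exists.

```bash
# is_digest_ref REF: succeed only if REF pins an image by sha256 digest
# (name@sha256:<64 hex characters>) rather than by a tag.
is_digest_ref() {
  printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}

# Intended usage:
#   ref=$(docker inspect --format '{{index .RepoDigests 0}}' myregistry/myapp:1.0.0)
#   is_digest_ref "$ref" && cosign sign --key cosign.key "$ref"
```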

**Keyless Signing (Recommended):**
Uses OIDC identity (GitHub Actions, Google, etc.) without managing keys:

```bash
# In CI/CD (GitHub Actions; the job needs `permissions: id-token: write`)
cosign sign --yes myregistry/myapp:${{ github.sha }}
```

#### SBOM Generation

Software Bill of Materials lists all components:

```bash
# Generate SBOM with Syft
syft myapp:latest -o spdx-json > sbom.spdx.json

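# Attach to image as a signed attestation (`cosign attach sbom` is deprecated)
cosign attest --key cosign.key --type spdxjson \
  --predicate sbom.spdx.json myregistry/myapp:1.0.0

# Verify the SBOM attestation
cosign verify-attestation --key cosign.pub --type spdxjson myregistry/myapp:1.0.0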
```

#### Provenance Attestation

Record build metadata (source repo, build timestamp, build system):

```bash
# Attestations are stored with the image in the registry, so push the result
docker buildx build \
  --provenance=true \
  --sbom=true \
  -t myapp:latest \
  --push .
```

**Verification:**
```bash
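# Inspect the provenance BuildKit attached to the pushed image
docker buildx imagetools inspect myapp:latest --format '{{ json .Provenance }}'

# Verify attestations signed with `cosign attest` (requires a key or identity)
cosign verify-attestation --key cosign.pub --type slsaprovenance myapp:latest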
```

**Key Takeaway:** Image signing and SBOMs transform Docker from convenient packaging into **trusted, auditable artifacts**. This is increasingly mandatory for compliance (FedRAMP, SOC2) and supply chain security (SLSA).

---

### Chapter Summary and Preview

In this chapter, you mastered the measurement and optimization of Docker images. You analyzed image composition using **Dive** and `docker history`, implemented comprehensive **.dockerignore** patterns to minimize build context, selected optimal **base images** (Alpine, Distroless, Scratch) based on application requirements, enabled **multi-platform builds** for ARM64 and AMD64 support, established **performance benchmarks** for build and startup times, leveraged advanced **BuildKit features** for cache optimization, and implemented **image signing and SBOMs** for supply chain security.

Your containers are now optimized for size, speed, and security—ready for the orchestration challenges ahead. These images represent the immutable artifacts that Kubernetes will schedule, scale, and manage across distributed clusters.

In **Chapter 10: Docker Security Best Practices**, we harden these optimized images further with comprehensive security scanning, least-privilege configurations, secret management, and compliance patterns. You will learn to scan for CVEs, implement read-only root filesystems, drop capabilities, and configure security contexts. This security-focused chapter completes our Docker expertise before transitioning to **Part III: Kubernetes Fundamentals**, where we scale from single-host containers to distributed cluster orchestration.