AWS EKS Cluster with Pulumi

This Pulumi application provisions an AWS Elastic Kubernetes Service (EKS) cluster using Python. It is built with a modern Python toolchain, including uv for dependency management and ruff for code quality. The infrastructure code follows DRY principles, with helper functions for common operations and centralized configuration management.

Prerequisites

  1. Pulumi CLI: Install Pulumi.
  2. Python 3.12: Required for the Pulumi Python runtime.
  3. uv: Fast Python package manager; see the uv documentation for installation instructions.
  4. AWS CLI: Install and configure with credentials that have permissions to create EKS clusters and related resources.
  5. kubectl: Install kubectl to interact with the Kubernetes cluster.
  6. Task CLI: Install Task for running automation commands.

Directory Structure

aws-eks/
├── .venv/                # Virtual environment managed by uv
├── pulumi/
│   ├── eks.py            # Main Pulumi application with EKS cluster configuration
│   ├── Pulumi.yaml       # Pulumi project configuration
│   ├── Pulumi.staging.yaml    # Stack-specific configuration for staging
│   └── Pulumi.production.yaml # Stack-specific configuration for production
├── .pre-commit-config.yaml # Pre-commit hook configurations
├── pyproject.toml        # Project metadata and dependencies (for uv)
├── Taskfile.yml          # Task definitions for common operations
├── uv.lock               # Locked dependency versions (generated by uv)
└── README.md             # This file

Project Automation with Task

This project uses Task as a command runner. All common development and operational tasks are defined in Taskfile.yml.

First-Time Setup

To get started, run the main setup task. This single command will:

  1. Sync Python dependencies using uv.
  2. Create the uv.lock file.
  3. Install the pre-commit hooks into your local git repository.
  4. Initialize a default staging Pulumi stack.

# Run the complete setup process
task setup

Code Quality

Code quality is enforced using ruff (for linting and formatting) and mypy (for static type checking). These checks are integrated into pre-commit hooks and can also be run manually.

  • task lint: Check for linting errors with ruff.
  • task format: Format code with ruff.
  • task typecheck: Run static type analysis with mypy.
  • task check: Run all quality checks (lint, format, typecheck).
  • task precommit: Manually run all pre-commit hooks against all files.

Pulumi Operations

All Pulumi commands are wrapped in Tasks for consistency.

  • task preview env=<staging|production>: Preview a deployment.
  • task deploy env=<staging|production>: Deploy infrastructure changes.
  • task destroy env=<staging|production>: Tear down all infrastructure.
  • task validate env=<staging|production>: Run all code quality checks and then a Pulumi preview.
  • task config:set env=<stack> key=<key> value=<value>: Set a configuration value for a stack.
  • task eks:kubeconfig env=<stack>: Configure kubectl to connect to the deployed cluster.

Configuration Management

All configuration is managed through Pulumi's native config system. Use the task config:set command as shown above.

Key Configuration Values

  • AWS: aws:region
  • Cluster: node-disk-size, node-ami-type, vpc-cidr, cluster-name-prefix
  • Feature Toggles: enable-vpc-endpoints, enable-cluster-logging, enable-public-endpoint, enable-private-endpoint
  • Environment Sizing: staging-instance-type, staging-min-size, production-instance-type, etc.

Refer to pulumi/eks.py for all available configuration keys and their default values.
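
As an illustration, the program reads these keys through Pulumi's Config API. A minimal sketch, with hypothetical fallback values (the real defaults live in pulumi/eks.py):

import pulumi

config = pulumi.Config()

# Read the keys listed above; the fallback values here are illustrative only.
node_disk_size = config.get_int("node-disk-size") or 100
vpc_cidr = config.get("vpc-cidr") or "10.0.0.0/16"
enable_cluster_logging = config.get_bool("enable-cluster-logging") or False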


Service URLs After Setup

Default AWS Domain Mode:

  • API service: http://api.k8s-default-xxx.us-east-1.elb.amazonaws.com
  • ArgoCD service: http://argocd.k8s-default-xxx.us-east-1.elb.amazonaws.com

Custom Domain Mode:

  • API service: https://api.yourdomain.com
  • ArgoCD service: https://argocd.yourdomain.com

For detailed ALB setup instructions, see ALB-SETUP.md.

Scaling the Cluster

To scale the cluster, run:

task eks:scale env=staging desiredSize=3
# or with all parameters
task eks:scale env=staging desiredSize=3 minSize=1 maxSize=5
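
Behind this task, scaling maps to the managed node group's scaling configuration. The following sketch shows how the bounds might be derived from stack config; the helper name and the assumption that config keys mirror the task parameters are hypothetical:

import pulumi
import pulumi_aws as aws

def scaling_from_config(config: pulumi.Config) -> aws.eks.NodeGroupScalingConfigArgs:
    # Fall back to illustrative defaults when a bound is not configured.
    return aws.eks.NodeGroupScalingConfigArgs(
        desired_size=config.get_int("desiredSize") or 2,
        min_size=config.get_int("minSize") or 1,
        max_size=config.get_int("maxSize") or 5,
    )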

Destroying the Cluster

To tear down the resources, run:

task destroy env=staging
# or
task destroy env=production

Code Quality Features

This implementation follows software engineering best practices:

DRY (Don't Repeat Yourself) Principles

  • Helper Functions: Reusable functions for IAM role creation and policy attachment (see the sketch after this list)
  • Centralized Configuration: All settings managed through config_vars dictionary
  • Environment Abstraction: Single codebase handles multiple environments
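
For illustration, such a helper might look like the sketch below; the function name and signature are hypothetical rather than the actual code in pulumi/eks.py:

import json

import pulumi_aws as aws

def create_service_role(name: str, service: str, policy_arns: list[str]) -> aws.iam.Role:
    # Create a role assumable by the given AWS service (e.g., ec2.amazonaws.com).
    role = aws.iam.Role(
        name,
        assume_role_policy=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }],
        }),
    )
    # Attach each managed policy with a deterministic resource name.
    for i, arn in enumerate(policy_arns):
        aws.iam.RolePolicyAttachment(f"{name}-policy-{i}", role=role.name, policy_arn=arn)
    return role

A node group role, for example, would attach AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and AmazonEC2ContainerRegistryReadOnly.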

Easy Maintenance

  • Modular Design: Separated concerns with dedicated functions
  • Clear Structure: Logical grouping of related resources
  • Consistent Naming: Standardized resource naming patterns
  • Type Hints: Python type annotations for better code clarity

Configuration Management

  • Pulumi Config: Native configuration management with encryption for secrets
  • Stack Isolation: Environment-specific configuration per stack
  • Validation: Input validation and sensible defaults (see the sketch after this list)
  • Documentation: Comprehensive inline comments and README
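
As a sketch of the validation idea, assuming a hypothetical check on the vpc-cidr key:

import ipaddress

import pulumi

config = pulumi.Config()

def validated_vpc_cidr(default: str = "10.0.0.0/16") -> str:
    # Fail fast at preview time if the configured CIDR is malformed.
    cidr = config.get("vpc-cidr") or default
    ipaddress.ip_network(cidr)  # raises ValueError on an invalid CIDR block
    return cidr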

Modern Tooling

  • uv Package Manager: Lightning-fast Python package installation and dependency resolution
  • Dependency Locking: Reproducible builds with the uv.lock file
  • Virtual Environment: Automatic virtual environment management
  • Pulumi State: Automatic state management with collaboration features

Useful Commands

Additional tasks are available for inspecting and managing stacks:

# View stack outputs
task outputs env=staging

# Check stack status
task status env=staging

# View stack information
task info env=staging

# Validate configuration
task validate env=staging

# Update dependencies and create lock file
task pulumi:sync

# View and manage configuration
task config:list env=staging
task config:set env=staging key=node-disk-size value=200

Configuration Examples

This project uses Pulumi's native configuration system:

# List all configuration for a stack
task config:list env=staging

# Set configuration values
task config:set env=staging key=vpc-cidr value="172.16.0.0/16"
task config:set env=production key=production-instance-type value="m5.xlarge"

# Set secrets (encrypted)
task config:set-secret env=staging key=admin-user-arn value="arn:aws:iam::123:user/admin"

# Get specific configuration value
task config:get env=staging key=node-disk-size
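
On the program side, a secret set this way can be read with require_secret, which keeps the value encrypted in state and masked in console output. A minimal sketch:

import pulumi

config = pulumi.Config()

# Values set via `task config:set-secret` stay encrypted in the stack's state.
admin_user_arn = config.require_secret("admin-user-arn")
pulumi.export("admin-user-arn", admin_user_arn)  # displayed as [secret]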

Dependency Management with uv

This project uses uv for fast and reliable Python package management:

# Install dependencies (done automatically by task setup)
task pulumi:deps

# Update and lock dependencies
task pulumi:sync

Task Reference

View all available tasks:

task --list

Important Notes

  • Kubernetes Version: The cluster version is set in pulumi/eks.py with a default (e.g., 1.32). Always check the official AWS EKS documentation for the latest supported Kubernetes versions and update accordingly.
  • VPC and Subnets: This configuration creates a new VPC specifically for the EKS cluster, including public and private subnets across the specified availability zones, NAT gateways (configurable, e.g., one per AZ or a single NAT gateway), and the tags EKS requires (e.g., kubernetes.io/cluster/<cluster_name>=shared).
  • IAM Permissions: The AWS credentials used to run Pulumi need sufficient permissions to create and manage EKS clusters, IAM roles, EC2 instances, security groups, and other related resources.
  • EKS Access Entries: Cluster access for IAM principals is managed via EKS Access Entries, which is the successor to the aws-auth ConfigMap method. This configuration uses API_AND_CONFIG_MAP mode to support both modern access entries and node group bootstrapping.
  • Worker Node Labeling: Managed node groups are automatically labeled with node-role.kubernetes.io/worker = "worker". This standard label helps kubectl display node roles correctly and can be used for scheduling workloads (see the sketch after this list).
  • ALB Configuration: Two modes available - default AWS domain for testing (subdomain routing without SSL) or custom domain for production (subdomain routing with SSL and Route 53).
  • State Management:
    • Pulumi manages state through its backend (Pulumi Cloud by default), which provides locking, history, and collaboration features
    • Stacks isolate the staging and production environments, each with its own state and configuration
  • Cost: Running an EKS cluster and associated resources will incur costs on your AWS bill. Make sure to destroy resources when they are no longer needed if you are experimenting.
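
A minimal sketch of a managed node group carrying the worker label noted above; the helper, sizing, and parameter wiring are illustrative:

import pulumi_aws as aws

WORKER_LABELS = {"node-role.kubernetes.io/worker": "worker"}

def worker_node_group(
    name: str,
    cluster: aws.eks.Cluster,
    role: aws.iam.Role,
    subnet_ids: list[str],
) -> aws.eks.NodeGroup:
    # Managed node group labeled so `kubectl get nodes` reports the worker role.
    return aws.eks.NodeGroup(
        name,
        cluster_name=cluster.name,
        node_role_arn=role.arn,
        subnet_ids=subnet_ids,
        labels=WORKER_LABELS,
        scaling_config=aws.eks.NodeGroupScalingConfigArgs(
            desired_size=2, min_size=1, max_size=3,
        ),
    )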
