This Pulumi application provisions an AWS Elastic Kubernetes Service (EKS) cluster using Python. It is built with a modern Python toolchain, including `uv` for dependency management and `ruff` for code quality. The infrastructure follows best practices with DRY principles, helper functions for common operations, and centralized configuration management.

## Prerequisites

- Pulumi CLI: Install the Pulumi CLI.
- Python 3.12: Required for the Pulumi Python runtime.
- uv: Fast Python package manager; see the uv installation instructions.
- AWS CLI: Install and configure with credentials that have permissions to create EKS clusters and related resources.
- kubectl: Install kubectl to interact with the Kubernetes cluster.
- Task CLI: Install Task for running automation commands.

## Project Structure

```
aws-eks/
├── .venv/                      # Virtual environment managed by uv
├── pulumi/
│   ├── eks.py                  # Main Pulumi application with EKS cluster configuration
│   ├── Pulumi.yaml             # Pulumi project configuration
│   ├── Pulumi.staging.yaml     # Stack-specific configuration for staging
│   └── Pulumi.production.yaml  # Stack-specific configuration for production
├── .pre-commit-config.yaml     # Pre-commit hook configurations
├── pyproject.toml              # Project metadata and dependencies (for uv)
├── Taskfile.yml                # Task definitions for common operations
├── uv.lock                     # Locked dependency versions (generated by uv)
└── README.md                   # This file
```

## Getting Started

This project uses Task as a command runner. All common development and operational tasks are defined in `Taskfile.yml`.

To get started, run the main setup task. This single command will:

- Sync Python dependencies using `uv`.
- Create the `uv.lock` file.
- Install the `pre-commit` hooks into your local git repository.
- Initialize a default `staging` Pulumi stack.

```bash
# Run the complete setup process
task setup
```

## Code Quality

Code quality is enforced using `ruff` (for linting and formatting) and `mypy` (for static type checking). These checks are integrated into pre-commit hooks and can also be run manually.

- `task lint`: Check for linting errors with `ruff`.
- `task format`: Format code with `ruff`.
- `task typecheck`: Run static type analysis with `mypy`.
- `task check`: Run all quality checks (`lint`, `format`, `typecheck`).
- `task precommit`: Manually run all pre-commit hooks against all files.
All Pulumi commands are wrapped in Tasks for consistency.

- `task preview env=<staging|production>`: Preview a deployment.
- `task deploy env=<staging|production>`: Deploy infrastructure changes.
- `task destroy env=<staging|production>`: Tear down all infrastructure.
- `task validate env=<staging|production>`: Run all code quality checks and then a Pulumi preview.
- `task config:set env=<stack> key=<key> value=<value>`: Set a configuration value for a stack.
- `task eks:kubeconfig env=<stack>`: Configure `kubectl` to connect to the deployed cluster.

## Configuration

All configuration is managed through Pulumi's native config system. Use the `task config:set` command as shown above. The main configuration keys are:

- AWS: `aws:region`
- Cluster: `node-disk-size`, `node-ami-type`, `vpc-cidr`, `cluster-name-prefix`
- Feature Toggles: `enable-vpc-endpoints`, `enable-cluster-logging`, `enable-public-endpoint`, `enable-private-endpoint`
- Environment Sizing: `staging-instance-type`, `staging-min-size`, `production-instance-type`, etc.

Refer to `pulumi/eks.py` for all available configuration keys and their default values.
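
For orientation, the sketch below shows how keys like these are typically read with Pulumi's config API. The defaults shown here are assumptions for illustration only; the authoritative defaults are defined in `pulumi/eks.py`.

```python
import pulumi

config = pulumi.Config()

# Project-namespaced keys from Pulumi.<stack>.yaml; defaults are illustrative assumptions.
node_disk_size = config.get_int("node-disk-size") or 100
vpc_cidr = config.get("vpc-cidr") or "10.0.0.0/16"
cluster_name_prefix = config.get("cluster-name-prefix") or "eks"

# Feature toggles: distinguish "unset" from an explicit false.
enable_cluster_logging = config.get_bool("enable-cluster-logging")
if enable_cluster_logging is None:
    enable_cluster_logging = False

# Provider-namespaced config, e.g. aws:region.
aws_region = pulumi.Config("aws").require("region")
```
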
### Service URLs After Setup
**Default AWS Domain Mode:**
- API service: `http://api.k8s-default-xxx.us-east-1.elb.amazonaws.com`
- ArgoCD service: `http://argocd.k8s-default-xxx.us-east-1.elb.amazonaws.com`
**Custom Domain Mode:**
- API service: `https://api.yourdomain.com`
- ArgoCD service: `https://argocd.yourdomain.com`
For detailed ALB setup instructions, see [ALB-SETUP.md](ALB-SETUP.md).
## Scaling the Cluster
To scale the cluster, run:
```bash
task eks:scale env=staging desiredSize=3
# or with all parameters
task eks:scale env=staging desiredSize=3 minSize=1 maxSize=5
```
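
Under the hood, scaling a managed node group in Pulumi comes down to its scaling configuration. The sketch below is a minimal illustration, assuming the task forwards the sizes into stack config; the resource names, placeholder ARNs/IDs, and config keys are assumptions, not the code in `pulumi/eks.py`.

```python
import pulumi
import pulumi_aws as aws

config = pulumi.Config()

# Hypothetical managed node group sized from stack config; all ARNs and IDs are placeholders.
workers = aws.eks.NodeGroup(
    "workers",
    cluster_name="example-cluster",
    node_role_arn="arn:aws:iam::123456789012:role/example-node-role",
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    scaling_config=aws.eks.NodeGroupScalingConfigArgs(
        desired_size=config.get_int("desired-size") or 3,
        min_size=config.get_int("min-size") or 1,
        max_size=config.get_int("max-size") or 5,
    ),
)
```
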

## Tearing Down

To tear down the resources, run:

```bash
task destroy env=staging
# or
task destroy env=production
```

## Best Practices

This implementation follows software engineering best practices:

- Helper Functions: Reusable functions for IAM role creation and policy attachment (see the sketch below).
- Centralized Configuration: All settings managed through a `config_vars` dictionary.
- Environment Abstraction: Single codebase handles multiple environments.
- Modular Design: Separated concerns with dedicated functions.
- Clear Structure: Logical grouping of related resources.
- Consistent Naming: Standardized resource naming patterns.
- Type Hints: Python type annotations for better code clarity.
- Pulumi Config: Native configuration management with encryption for secrets.
- Stack Isolation: Environment-specific configuration per stack.
- Validation: Input validation and sensible defaults.
- Documentation: Comprehensive inline comments and README.
- uv Package Manager: Lightning-fast Python package installation and dependency resolution.
- Dependency Locking: Reproducible builds with the `uv.lock` file.
- Virtual Environment: Automatic virtual environment management.
- Pulumi State: Automatic state management with collaboration features.
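
To make the helper-function idea concrete, here is a minimal sketch of what an IAM role helper might look like. The function name, naming scheme, and policy handling are illustrative assumptions; the actual helpers live in `pulumi/eks.py`.

```python
import json

import pulumi_aws as aws


def create_service_role(name: str, service: str, policy_arns: list[str]) -> aws.iam.Role:
    """Hypothetical helper: create a role assumable by an AWS service and attach managed policies."""
    role = aws.iam.Role(
        f"{name}-role",
        assume_role_policy=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }],
        }),
    )
    # Attach each managed policy as its own resource so Pulumi can track it individually.
    for index, arn in enumerate(policy_arns):
        aws.iam.RolePolicyAttachment(f"{name}-policy-{index}", role=role.name, policy_arn=arn)
    return role


# Example usage with the managed policies an EKS node role typically needs.
node_role = create_service_role(
    "eks-node",
    "ec2.amazonaws.com",
    [
        "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
        "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
        "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
    ],
)
```
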

Day-to-day commands for inspecting and managing a deployed stack:

```bash
# View stack outputs
task outputs env=staging

# Check stack status
task status env=staging

# View stack information
task info env=staging

# Validate configuration
task validate env=staging

# Update dependencies and create lock file
task pulumi:sync

# View and manage configuration
task config:list env=staging
task config:set env=staging key=node-disk-size value=200
```
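
The values shown by `task outputs` are Pulumi stack exports. A minimal sketch, assuming export names such as `cluster_name` and `cluster_endpoint` (the real exports are defined in `pulumi/eks.py`):

```python
import pulumi
import pulumi_aws as aws

# Placeholder cluster for illustration; the role ARN and subnet IDs are not real.
cluster = aws.eks.Cluster(
    "example-cluster",
    role_arn="arn:aws:iam::123456789012:role/example-cluster-role",
    vpc_config=aws.eks.ClusterVpcConfigArgs(subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"]),
)

# These exports are what `pulumi stack output` (and therefore `task outputs`) displays.
pulumi.export("cluster_name", cluster.name)
pulumi.export("cluster_endpoint", cluster.endpoint)
```
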
This project uses Pulumi's native configuration system:

```bash
# List all configuration for a stack
task config:list env=staging

# Set configuration values
task config:set env=staging key=vpc-cidr value="172.16.0.0/16"
task config:set env=production key=production-instance-type value="m5.xlarge"

# Set secrets (encrypted)
task config:set-secret env=staging key=admin-user-arn value="arn:aws:iam::123:user/admin"

# Get specific configuration value
task config:get env=staging key=node-disk-size
```
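
Secrets set with `task config:set-secret` stay encrypted in the stack file and are read in code as Pulumi secret outputs. A minimal sketch, assuming the `admin-user-arn` key shown above:

```python
import pulumi

config = pulumi.Config()

# require_secret returns an Output that Pulumi keeps encrypted in state
# and masks in console output.
admin_user_arn = config.require_secret("admin-user-arn")

pulumi.export("admin_user_arn", admin_user_arn)  # exported value remains marked as a secret
```
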

This project uses `uv` for fast and reliable Python package management:

```bash
# Install dependencies (done automatically by task setup)
task pulumi:deps

# Update and lock dependencies
task pulumi:sync
```

View all available tasks:

```bash
task --list
```

## Notes

- Kubernetes Version: The `cluster_version` variable in `variables.tf` is set to a default (e.g., `1.32`). Always check the official AWS EKS documentation for the latest supported Kubernetes versions and update accordingly.
- VPC and Subnets: This configuration creates a new VPC specifically for the EKS cluster using the `terraform-aws-modules/vpc/aws` module. This module handles the creation of public and private subnets across the specified availability zones and NAT gateways (configurable, e.g., one per AZ or a single NAT gateway), and ensures all resources are tagged appropriately for EKS compatibility (e.g., `kubernetes.io/cluster/<cluster_name>=shared`).
- IAM Permissions: The AWS credentials used to run Terraform need sufficient permissions to create and manage EKS clusters, IAM roles, EC2 instances, security groups, and other related resources.
- EKS Access Entries: Cluster access for IAM principals is managed via EKS Access Entries, the successor to the `aws-auth` ConfigMap method. This configuration uses `API_AND_CONFIG_MAP` mode to support both modern access entries and node group bootstrapping (see the sketch after these notes).
- Worker Node Labeling: Managed node groups are automatically labeled with `node-role.kubernetes.io/worker = "worker"`. This standard label helps `kubectl` display node roles correctly and can be used for scheduling workloads.
- ALB Configuration: Two modes are available: a default AWS domain for testing (subdomain routing without SSL) or a custom domain for production (subdomain routing with SSL and Route 53).
- State Management:
  - This project uses an AWS S3 remote backend with S3-native state locking (via `use_lockfile = true` in `backend.tf`).
  - The S3 bucket is managed separately in `terraform-backend/` to avoid circular dependencies.
  - The bucket includes versioning and lifecycle policies (retains the 2 newest noncurrent versions; older versions are deleted after 90 days).
  - Workspaces are used to manage multiple environments (staging/production) with isolated state files.
- Cost: Running an EKS cluster and associated resources will incur costs on your AWS bill. Make sure to destroy resources when they are no longer needed if you are experimenting.
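
For reference, here is a minimal sketch of how an access entry might be declared in a Pulumi program. The resource names, cluster name, principal ARN, and Kubernetes group are placeholders and not taken from `pulumi/eks.py`.

```python
import pulumi_aws as aws

# Hypothetical access entry granting an IAM principal access to the cluster
# without editing the aws-auth ConfigMap; the group mapping would then be bound
# to Kubernetes RBAC (or an EKS access policy association) separately.
admin_access = aws.eks.AccessEntry(
    "admin-access",
    cluster_name="example-cluster",
    principal_arn="arn:aws:iam::123456789012:user/admin",
    kubernetes_groups=["cluster-admins"],
)
```
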