Table of Contents
- Setup AWS credentials
- Install tools
- Quick start
- Build multi-node cluster
- Manage individual platform resources
- Technical notes
This is a practical implementation of [CoreOS cluster architectures](https://coreos.com/os/docs/latest/cluster-architectures.html) built on AWS. The cluster follows the CoreOS production cluster model, which contains an autoscaling etcd cluster and an autoscaling worker cluster for hosted containers. You can optionally add an admiral cluster for shared services such as CI, a private docker registry, logging, and monitoring.
The entire infrastructure is managed by Terraform.
For other types of Unix clusters, see the similar repo aws-linux-cluster.
Setup AWS credentials
Go to AWS Console.
- Sign up for an AWS account if you don't already have one. The default EC2 instances created by this tool are covered by the AWS Free Tier (https://aws.amazon.com/free/) service.
- Create a group
- Create a user coreos-cluster and download the user credentials.
- Add the user to the group.
Instructions for installing the tools on macOS:

Install Terraform:

    $ brew update
    $ brew install terraform

or

    $ mkdir -p ~/bin/terraform
    $ cd ~/bin/terraform
    $ curl -L -O https://dl.bintray.com/mitchellh/terraform/terraform_0.6.0_darwin_amd64.zip
    $ unzip terraform_0.6.0_darwin_amd64.zip

Install jq:

    $ brew install jq

Install AWS CLI:

    $ brew install awscli

or

    $ sudo easy_install pip
    $ sudo pip install --upgrade awscli

For other platforms, follow the tool links and instructions on tool sites.
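
Before moving on, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch, not part of this repo; the function name `missing_tools` is hypothetical, and the tool list simply mirrors the install steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): report which of the
# given commands are missing from PATH.
# Returns 0 when all are present, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Tools this README asks you to install.
missing_tools terraform jq aws || echo "install the missing tools before running make"
```

The check only looks for the commands by name; it does not verify versions.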
Clone the repo:

    $ git clone https://github.com/xuwang/aws-terraform.git
    $ cd aws-terraform

Run Vagrant Ubuntu box with Terraform installed (optional)

If you use Vagrant, instead of installing the tools on your host machine, you can use the provided Vagrantfile for an Ubuntu box with all the necessary tools installed:

    $ vagrant up
    $ vagrant ssh
    $ cd aws-terraform

Configure the AWS profile with:

    $ aws configure --profile coreos-cluster

Use the downloaded AWS user credentials when prompted.

The above command creates a coreos-cluster profile section in the ~/.aws/config and ~/.aws/credentials files. The build process below automatically configures the Terraform AWS provider credentials using this profile.
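
To confirm that the profile was written, you can look for its section headers in the two files. This is a sketch under the assumption that the files are in the standard AWS CLI locations; the helper name `has_section` is hypothetical. The AWS CLI names the section `[coreos-cluster]` in the credentials file and `[profile coreos-cluster]` in the config file.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the INI-style file $1 contains a
# section header line that is exactly "[$2]".
has_section() {
    grep -qxF "[$2]" "$1"
}

profile=coreos-cluster
# The credentials file names the section after the profile;
# the config file prefixes the name with "profile ".
if has_section "$HOME/.aws/credentials" "$profile" 2>/dev/null &&
   has_section "$HOME/.aws/config" "profile $profile" 2>/dev/null; then
    echo "profile $profile is configured"
else
    echo "profile $profile not found; run: aws configure --profile $profile"
fi
```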
The default build creates a cluster with one etcd node and one worker node in a VPC, with application buckets for data and the necessary IAM roles, policies, keypairs, and keys. The instance type for the nodes is t2.micro. You can review the configuration and make changes if needed. See Customization for details.

    $ make
    ... build steps info ...
    ... at last, shows the worker's ip:
    worker public ips: 220.127.116.11
    ...

To see the list of resources created:

    $ make show
    ...
    module.etcd.aws_autoscaling_group.etcd:
      id = etcd
      availability_zones.# = 3
      availability_zones.2050015877 = us-west-2c
      availability_zones.221770259 = us-west-2b
      availability_zones.2487133097 = us-west-2a
      default_cooldown = 300
      desired_capacity = 1
      force_delete = true
      health_check_grace_period = 0
      health_check_type = EC2
      launch_configuration = terraform-4wjntqyn7rbfld5qa4qj6s3tie
      load_balancers.# = 0
      max_size = 9
      min_size = 1
      name = etcd
      tag.# = 1
    ....

Log in to the worker node:

    $ ssh -A email@example.com
    CoreOS beta (723.3.0)
    firstname.lastname@example.org ~ $ fleetctl list-machines
    MACHINE     IP          METADATA
    289a6ba7... 10.0.1.141  env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
    320bd4ac... 10.0.5.50   env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker

Destroy all resources
$ make destroy_all
This will destroy ALL resources created by this project.
$ vagrant destroy
The default values for the VPC, EC2 instance profile, policies, keys, autoscaling group, launch configurations, etc. can be overridden in the resources/terraform/module-.tf files.
The AWS profile and cluster name are defined at the top of the Makefile:

    AWS_PROFILE := coreos-cluster
    CLUSTER_NAME := coreos-cluster

These can also be customized to match your AWS profile and cluster name.
Build multi-node cluster
The numbers of etcd nodes and worker nodes are defined in resources/terraform/module-etcd.tf and resources/terraform/module-worker.tf, respectively.

Change cluster_desired_capacity in these files to build a multi-node etcd/worker cluster. For example, to change it to 3:

    cluster_desired_capacity = 3

Note: the etcd minimum size, maximum size, and cluster_desired_capacity should all be the same odd number, e.g. 3, 5, 9.
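
The reason for the odd number is quorum: an etcd cluster of n members stays available only while a majority is healthy, so it tolerates (n - 1) / 2 failures, and an even-sized cluster tolerates no more failures than the odd size just below it. The sketch below illustrates the rule; the function names are hypothetical and nothing here reads the Terraform files.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the sizing rule in the note above.

# Succeeds only when min, max and desired are equal and odd.
valid_etcd_sizes() {
    min=$1; max=$2; desired=$3
    [ "$min" -eq "$max" ] && [ "$max" -eq "$desired" ] && [ $((desired % 2)) -eq 1 ]
}

# Number of members that can fail while a majority (quorum) survives.
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 4 5; do
    echo "size=$n tolerates $(fault_tolerance "$n") failure(s)"
done
valid_etcd_sizes 3 3 3 && echo "3/3/3 is valid"
valid_etcd_sizes 3 5 4 || echo "3/5/4 is not valid"
```

Note that a size of 4 tolerates the same single failure as a size of 3, which is why even sizes add cost without adding resilience.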
You should also change the aws_instance_type to a larger type (for example, medium or large) if heavy docker containers are to be hosted on the nodes:

    image_type = "t2.medium"
    root_volume_size = 12
    docker_volume_size = 120

Build the cluster:

    $ make all
    ... build steps info ...
    ... at last, shows the worker's ip:
    worker public ips: 18.104.22.168 22.214.171.124 126.96.36.199
    ...

Log in to a worker node:

    $ ssh -A email@example.com
    CoreOS beta (723.3.0)
    firstname.lastname@example.org ~ $ etcdctl cluster-health
    cluster is healthy
    member 34d5239c565aa4f6 is healthy
    member 5d6f4a5f10a44465 is healthy
    member ab930e93b1d5946c is healthy
    core@ip-10-0-1-92 ~ $ etcdctl member list
    34d5239c565aa4f6: name=i-65e333ac peerURLs=http://10.0.1.92:2380 clientURLs=http://10.0.1.92:2379
    5d6f4a5f10a44465: name=i-cd40d405 peerURLs=http://10.0.1.185:2380 clientURLs=http://10.0.1.185:2379
    ab930e93b1d5946c: name=i-ecfa0d1a peerURLs=http://10.0.1.45:2380 clientURLs=http://10.0.1.45:2379
    email@example.com ~ $ fleetctl list-machines
    MACHINE     IP          METADATA
    0d16eb52... 10.0.1.92   env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
    d320718e... 10.0.1.185  env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
    f0bea88e... 10.0.1.45   env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
    0cb636ac... 10.0.5.4    env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker
    4acc8d6e... 10.0.5.112  env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker
    fa9f4ea7... 10.0.5.140  env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker

Manage individual platform resources
You can create individual resources; the automation scripts create any resources they depend on automatically.

    $ make help
    Usage: make (<resource> | destroy_<resource> | plan_<resource> | refresh_<resource> | show | graph )
    Available resources: vpc s3 route53 iam etcd worker
    For example: make worker # to show what resources are planned for worker

Currently defined resources:

|Resource|Description|
|---|---|
|vpc|VPC, gateway, and subnets|
|iam|Setup a deployment user and deployment keys|
|route53|Setup public and private hosted zones on Route53 DNS service|
|elb|Setup application ELBs|
|etcd|Setup ETCD2 cluster|
|worker|Setup application docker hosting cluster|
|admiral|Central service cluster (Jenkins, fleet-ui, monitoring, logging, etc)|
|dockerhub|Private docker registry cluster|
|cloudtrail|Setup AWS CloudTrail|

To build the cluster step by step:

    $ make init
    $ make vpc
    $ make etcd
    $ make worker

Each make command creates a build/ directory, copies all Terraform files to it, and executes the corresponding Terraform command to build the resource on AWS.
Make commands can be re-run. If a resource already exists, the command just refreshes the Terraform state.
To destroy a resource:
$ make destroy_<resource>
Technical notes

- The etcd cluster runs in an autoscaling group. It should be set to a fixed, odd size (1, 3, 5, ...), with cluster_desired_capacity = min_size = max_size.
- Cluster discovery is managed with the dockerage/etcd-aws-cluster image. The etcd cluster forms by self-discovery through its autoscaling group, and the etcd initial cluster is then uploaded automatically to s3://AWS-ACCOUNT-CLUSTER-NAME-cloudinit/etcd/initial-cluster in the S3 bucket. Worker nodes join the cluster by downloading the etcd initial-cluster file from the S3 bucket during their bootstrap.
- AWS resources are defined in the resources and modules directories. The build process copies all resource files from resources to a build directory. The Terraform actions are performed under build, which is ignored in .gitignore. The original Terraform files in the repo are kept intact.
- Makefiles and shell scripts give us more flexibility for tasks that Terraform leaves uncovered, and provide streamlined build automation.
- All nodes use a common bootstrap shell script as user-data, which downloads the initial-cluster file and the node-specific cloud-config.yaml to configure the node. If the cloud-config changes, there is no need to rebuild an instance; just reboot it to pick up the change.
- The CoreOS AMI is generated on the fly to keep it up to date. The default channel can be changed in the Makefile.
- Terraform's auto-generated launch configuration names and the create_before_destroy (CBD) feature are used to allow a launch configuration update on a live autoscaling group. However, running EC2 instances in the autoscaling group have to be recycled outside of Terraform management to pick up the new launch configuration.
- For a production system, the security groups defined in the etcd, worker, and admiral modules should be carefully reviewed and tightened.
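
To make the discovery note more concrete, the sketch below builds a value in etcd's standard initial-cluster format (`name=peerURL` pairs joined by commas) from the member names and IPs shown in the `etcdctl member list` output earlier in this README. It is an illustration only: the helper `initial_cluster` is hypothetical, and the exact contents of the file written by dockerage/etcd-aws-cluster are an assumption, not something this README specifies.

```shell
#!/bin/sh
# Hypothetical helper: build an etcd initial-cluster value from
# "name=ip" arguments, using peer port 2380 as in the member list
# output shown earlier in this README.
initial_cluster() {
    result=""
    for member in "$@"; do
        name=${member%%=*}   # text before the first "="
        ip=${member#*=}      # text after the first "="
        result="${result:+$result,}$name=http://$ip:2380"
    done
    echo "$result"
}

initial_cluster i-65e333ac=10.0.1.92 i-cd40d405=10.0.1.185 i-ecfa0d1a=10.0.1.45
```

A worker that downloads such a value can pass it to etcd when it starts in proxy mode, so it never needs to know the members in advance.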
Once the tools are installed, run:

    $ aws configure
    AWS Access Key ID [None]: <YOUR ACCESS KEY>
    AWS Secret Access Key [None]: <YOUR SECRET KEY>
    Default region name [None]: <YOUR PREFERRED REGION>
    Default output format [None]:

When prompted for the access and secret keys, enter the ones you saved earlier. Set the default region to your preferred one (based on where you are located); the output format can be left as the default. Here is a list of AWS regions: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions
You will now need to create a file in this directory called terraform.tfvars with contents like this:

    access_key = "YOUR ACCESS KEY"
    secret_key = "YOUR SECRET KEY"
    allowed_network = "YOUR NETWORK CIDR"

Choose a region that has at least 3 availability zones, as otherwise the script will fail.
For a complete list, take a look here: http://www.stelligent.com/cloud/list-all-the-availability-zones/
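
You can also check the zone count yourself before building. The sketch below counts zone names from standard input, so it runs without AWS access; the function names are hypothetical. The commented `aws ec2 describe-availability-zones` call shows one way to feed it real data, assuming the AWS CLI is configured.

```shell
#!/bin/sh
# Hypothetical helpers: count whitespace-separated zone names on stdin.
count_zones() {
    wc -w | tr -d ' '
}

# Succeeds when the zone list on stdin has at least 3 entries.
has_enough_zones() {
    [ "$(count_zones)" -ge 3 ]
}

# With real data, feed the function from the AWS CLI, for example:
#   aws ec2 describe-availability-zones --region us-west-2 \
#       --query 'AvailabilityZones[].ZoneName' --output text | has_enough_zones
if echo "us-west-2a us-west-2b us-west-2c" | has_enough_zones; then
    echo "region has enough availability zones"
fi
```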

    # For get-ami.sh
    COREOS_UPDATE_CHANNE=stable
    AWS_REGION=us-west-2
    VM_TYPE=hvm

- Allow the user to specify the number of AZs in their region so that the script doesn't fail. Users should be able to set up in a region that currently has only 2 AZs.
- Bastion host: create default users that can be used to get into the environment.