📙 Disclaimer: This project is being deprecated in favor of the Mesosphere Universal Installer.

Install Mesosphere DC/OS on AWS

Prerequisites

Terraform 0.11.x
AWS SSH Keys
AWS IAM Keys
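The Terraform version requirement can be checked before you begin. The following is a minimal sketch, assuming that the first line printed by terraform version has the form "Terraform v0.11.x"; the function name is_supported_tf_version is hypothetical and not part of this repository.

```shell
# Succeed only when the given version line reports a 0.11.x release.
# (is_supported_tf_version is an illustrative helper, not part of this repo.)
is_supported_tf_version() {
  case "$1" in
    "Terraform v0.11."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the installed binary, if there is one on the PATH.
if command -v terraform >/dev/null 2>&1; then
  if is_supported_tf_version "$(terraform version | head -n 1)"; then
    echo "Terraform 0.11.x detected"
  else
    echo "Warning: this project expects Terraform 0.11.x" >&2
  fi
fi
```

The check is a plain prefix match, so it accepts any 0.11 patch release and rejects everything else.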

Getting Started

Create installer directory
Initialize Terraform
Configure AWS keys
Configure and deploy DC/OS

Create Installer Directory

Create a directory where Terraform will download and place your infrastructure files.

mkdir dcos-installer
cd dcos-installer

Initialize Terraform

Run the command below to initialize Terraform from this repository. You do not need to git clone this repo; Terraform fetches it for you.

terraform init -from-module github.com/dcos/terraform-dcos/aws
cp desired_cluster_profile.tfvars.example desired_cluster_profile.tfvars

Configure AWS SSH Keys

You can either upload your existing SSH keys or use an SSH key already created on AWS.

Upload existing key: To upload your own key not stored on AWS, read how to import your own key
Create new key: To create a new key via AWS, read how to create a key pair

When complete, retrieve the key pair name and ensure that it matches the ssh_key_name in your desired_cluster_profile.tfvars.

Note: Values in desired_cluster_profile.tfvars always take precedence over the defaults in variables.tf, so it is the best place for any variable changes that are specific to your cluster.
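To confirm which key pair name your profile currently points at, you can read the value back out of the tfvars file. This is a rough sketch that only handles simple one-line name = "value" assignments; the helper name tfvars_get is hypothetical.

```shell
# Print the value of a simple `name = "value"` line in a tfvars file.
# (tfvars_get is an illustrative helper; it ignores heredoc values.)
tfvars_get() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\"\(.*\)\"[[:space:]]*\$/\1/p" "$2" | tail -n 1
}

# Show the configured key pair name when the profile exists.
if [ -f desired_cluster_profile.tfvars ]; then
  echo "ssh_key_name: $(tfvars_get ssh_key_name desired_cluster_profile.tfvars)"
fi
```

Compare the printed name with the key pair name shown in your AWS account before deploying.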

Once your key is available, add it to your SSH agent with ssh-add.

ssh-add ~/.ssh/path_to_your_key.pem

Note: When using an SSH agent, it is best to add the command above to your ~/.bash_profile so that your keys are reloaded automatically the next time you open a terminal.
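If you do put the command in a shell profile, it helps to guard it so a missing key or agent does not produce errors on every new terminal. A minimal sketch, assuming the placeholder key path from above; add_key_if_present is a hypothetical helper name.

```shell
# Add a key to the running SSH agent, skipping quietly when the key file
# or the agent is missing. (Illustrative helper, not part of this repo.)
add_key_if_present() {
  key_path="$1"
  if [ ! -f "$key_path" ]; then
    echo "skipped: $key_path not found"
    return 0
  fi
  if [ -z "${SSH_AUTH_SOCK:-}" ]; then
    echo "skipped: no SSH agent running"
    return 0
  fi
  ssh-add "$key_path"
}

# Placeholder path; replace with the location of your own key.
add_key_if_present "$HOME/.ssh/path_to_your_key.pem"
```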

Configure AWS IAM Keys

You will need your AWS aws_access_key_id and aws_secret_access_key. If you don't have them yet, you can get them from the AWS access keys documentation.

Once you have them, store them in your home directory. The default location is $HOME/.aws/credentials on Linux and macOS, or "%USERPROFILE%\.aws\credentials" for Windows users.

Here is an example of the output when you're done:

$ cat ~/.aws/credentials
[default]
aws_access_key_id = ACHEHS71DG712w7EXAMPLE
aws_secret_access_key = /R8SHF+SHFJaerSKE83awf4ASyrF83sa471DHSEXAMPLE

Note: [default] is the name of the aws_profile. You may select a different profile to use in Terraform by adding it to your desired_cluster_profile.tfvars as aws_profile = "<INSERT_CREDENTIAL_PROFILE_NAME_HERE>".
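Before pointing aws_profile at a non-default profile, you may want to confirm that the profile section actually exists in the credentials file. The sketch below assumes the standard INI layout shown above and the AWS_SHARED_CREDENTIALS_FILE override honored by the AWS CLI; aws_profile_exists is a hypothetical helper.

```shell
# Succeed when the credentials file contains a "[profile]" section header.
# (aws_profile_exists is an illustrative helper, not an AWS command.)
aws_profile_exists() {
  grep -qxF "[$1]" "$2"
}

CRED_FILE="${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}"
PROFILE="default"

if [ -f "$CRED_FILE" ]; then
  if aws_profile_exists "$PROFILE" "$CRED_FILE"; then
    echo "profile [$PROFILE] found"
  else
    echo "profile [$PROFILE] missing from $CRED_FILE" >&2
  fi
fi
```

The match is an exact, whole-line comparison, so a header with trailing whitespace would be reported as missing.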

Configure And Deploy DC/OS

Deploying with Custom Configuration

The default variables are tracked in the variables.tf file. Because this file can be overwritten when you run terraform get --update to fetch new releases of DC/OS, it is best to set your custom Terraform and DC/OS flags in desired_cluster_profile.tfvars. That way, a single file tracks everything you need to manage the lifecycle of your cluster.

Supported Operating Systems

Here is the list of operating systems supported.

Supported DC/OS Versions

Here is the list of DC/OS versions supported.

Note: The master DC/OS version is not meant for production use. It is only for CI/CD testing.

To apply the configuration file, run the command below.

terraform apply -var-file desired_cluster_profile.tfvars
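If you prefer to review the changes before they are made, one option is to save a plan and then apply exactly that plan. The sketch below is an assumption about workflow rather than something this project prescribes; the run wrapper, the DRY_RUN switch, and the cluster.tfplan file name are all hypothetical. By default it only prints the commands.

```shell
# Echo each command; execute it only when DRY_RUN is set to 0.
# (run and DRY_RUN are illustrative, not part of Terraform or this repo.)
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

VAR_FILE="desired_cluster_profile.tfvars"

# Save the proposed changes to a plan file, then apply that saved plan.
run terraform plan -var-file "$VAR_FILE" -out=cluster.tfplan
run terraform apply cluster.tfplan
```

Set DRY_RUN=0 in your environment once the printed commands look right.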

Advanced YAML Configuration

We have designed this project to be flexible. The example variables below all work and allow very deep customization from a single tfvars file.

For advanced users with stringent requirements, here are DC/OS flag examples you can simply paste in desired_cluster_profile.tfvars.

$ cat desired_cluster_profile.tfvars
dcos_version = "1.11.1"
os = "centos_7.3"
num_of_masters = "3"
num_of_private_agents = "2"
num_of_public_agents = "1"
ssh_key_name = "default"
dcos_cluster_name = "DC/OS Cluster"
dcos_cluster_docker_credentials_enabled = "true"
dcos_cluster_docker_credentials_write_to_etc = "true"
dcos_cluster_docker_credentials_dcos_owned = "false"
dcos_cluster_docker_registry_url = "https://index.docker.io"
dcos_use_proxy = "yes"
dcos_http_proxy = "example.com"
dcos_https_proxy = "example.com"
dcos_no_proxy = <<EOF
# YAML
 - "internal.net"
 - "169.254.169.254"
EOF
dcos_overlay_network = <<EOF
# YAML
    vtep_subnet: 44.128.0.0/20
    vtep_mac_oui: 70:B3:D5:00:00:00
    overlays:
      - name: dcos
        subnet: 12.0.0.0/8
        prefix: 26
EOF
dcos_rexray_config = <<EOF
# YAML
  rexray:
    loglevel: warn
    modules:
      default-admin:
        host: tcp://127.0.0.1:61003
    storageDrivers:
    - ec2
    volume:
      unmount:
        ignoreusedcount: true
EOF
dcos_cluster_docker_credentials = <<EOF
# YAML
  auths:
    'https://index.docker.io/v1/':
      auth: Ze9ja2VyY3licmljSmVFOEJrcTY2eTV1WHhnSkVuVndjVEE=
EOF

Note: The YAML comment is required for the DC/OS specific YAML settings.
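Because a missing marker is easy to overlook, a quick check of the tfvars file can help. This is a minimal sketch that assumes every multi-line value uses the <<EOF delimiter as in the example above, and that the marker is the first line of the block; check_yaml_markers is a hypothetical helper.

```shell
# Report every <<EOF block whose first line is not the "# YAML" marker.
# Exits non-zero when at least one block is missing it.
# (check_yaml_markers is an illustrative helper, not part of this repo.)
check_yaml_markers() {
  awk '
    /<<EOF[ \t]*$/ { expect = 1; name = $1; next }
    expect {
      if ($0 !~ /^[ \t]*# YAML[ \t]*$/) {
        print "missing # YAML marker: " name
        bad = 1
      }
      expect = 0
    }
    END { exit bad }
  ' "$1"
}

if [ -f desired_cluster_profile.tfvars ]; then
  check_yaml_markers desired_cluster_profile.tfvars ||
    echo "add the # YAML line to the settings listed above" >&2
fi
```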

Upgrading DC/OS

You can upgrade your DC/OS cluster with a single command. This Terraform script has been built to perform both installs and upgrades since the inception of this project. The upgrade procedures below also give you finer control over how masters and agents upgrade at a given time, including the ability to change the parallelism of master or agent upgrades.

DC/OS Upgrades

Rolling Upgrade

Supported upgrade by dcos.io

Prerequisite:

Update your Terraform scripts to gain access to the latest DC/OS version with the command below. Make sure you meet the current upgrade version conditions listed at https://docs.mesosphere.com/1.11/installing/oss/upgrading/#supported-upgrade-paths.

terraform get --update
# change dcos_version = "<desired_version>" in desired_cluster_profile.tfvars
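The dcos_version edit can also be scripted, which is handy when the upgrade runs from automation. A rough sketch, assuming dcos_version is written as a single quoted assignment on its own line; set_dcos_version is a hypothetical helper and the version number in the usage comment is only an example.

```shell
# Rewrite the dcos_version assignment in a tfvars file in place.
# (set_dcos_version is an illustrative helper, not part of this repo.)
set_dcos_version() {
  tmp_file="$(mktemp)"
  sed "s/^\([[:space:]]*dcos_version[[:space:]]*=[[:space:]]*\)\"[^\"]*\"/\1\"$2\"/" "$1" > "$tmp_file" &&
    cat "$tmp_file" > "$1"   # overwrite contents, keeping the file's permissions
  rm -f "$tmp_file"
}

# Example usage (the version shown is a placeholder):
# set_dcos_version desired_cluster_profile.tfvars "1.11.2"
```

Lines other than the dcos_version assignment are copied through unchanged.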

Masters Sequentially, Agents Parallel:

terraform apply -var-file desired_cluster_profile.tfvars -var state=upgrade -target null_resource.bootstrap -target null_resource.master -parallelism=1
terraform apply -var-file desired_cluster_profile.tfvars -var state=upgrade

All Roles Simultaneously

Not supported by dcos.io, but it works without dcos_skip_checks enabled.

terraform apply -var-file desired_cluster_profile.tfvars -var state=upgrade

Maintenance

If you would like to add or remove private or public agents from your cluster, tell Terraform your desired state and it will make sure the cluster gets there.

Adding Agents

# update num_of_private_agents = "5" in desired_cluster_profile.tfvars
terraform apply -var-file desired_cluster_profile.tfvars

Removing Agents

# update num_of_private_agents = "2" in desired_cluster_profile.tfvars
terraform apply -var-file desired_cluster_profile.tfvars

Important: Always remember to save your desired state in your desired_cluster_profile.tfvars.

Redeploy an Existing Master

If you want to redeploy a problematic master (e.g. storage filled up, unresponsive, etc.), you can tell Terraform to redeploy it during the next cycle.

Note: This only applies to DC/OS clusters that have set their dcos_master_discovery to master_http_loadbalancer and not static.

Master Node

Taint master node:

terraform taint aws_instance.master.0 # The number represents the master in the list

Redeploy master node:

terraform apply -var-file desired_cluster_profile.tfvars

Redeploy an Existing Agent

If you want to redeploy a problematic agent (e.g. storage filled up, unresponsive, etc.), you can tell Terraform to redeploy it during the next cycle.

Private Agents

Taint private agent:

terraform taint aws_instance.agent.0 # The number represents the agent in the list

Redeploy agent:

terraform apply -var-file desired_cluster_profile.tfvars

Public Agents

Taint public agent:

terraform taint aws_instance.public-agent.0 # The number represents the agent in the list

Redeploy agent:

terraform apply -var-file desired_cluster_profile.tfvars
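The three taint commands above differ only in the resource address. The following sketch collects the mapping in one place so that a role and list index produce the right address; taint_address and the role names master, private, and public are hypothetical, while the addresses themselves are the ones used above.

```shell
# Map a node role and list index to the resource address used by terraform taint.
# (taint_address is an illustrative helper, not part of this repo.)
taint_address() {
  case "$1" in
    master)  echo "aws_instance.master.$2" ;;
    private) echo "aws_instance.agent.$2" ;;
    public)  echo "aws_instance.public-agent.$2" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Print the command for the first private agent instead of running it.
echo "terraform taint $(taint_address private 0)"
```

After tainting, the usual terraform apply with your var file redeploys the node.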

Experimental

Adding GPU Private Agents

Note: Best used with DC/OS 1.9 and above

Mesos has supported GPU agents since version 1.0, so you can experiment with them immediately by removing the .disabled suffix from dcos-gpu-agents.tf.disabled. Once you do that, run terraform apply and the agents will be deployed, configured, and automatically joined to your Mesos cluster. The default value of num_of_gpu_agents is 1. To remove the GPU agents, add the .disabled suffix back and they will be removed on the next apply.

Add GPU Private Agents

mv dcos-gpu-agents.tf.disabled dcos-gpu-agents.tf
terraform get
# add num_of_gpu_agents = "3" in desired_cluster_profile.tfvars
terraform apply -var-file desired_cluster_profile.tfvars

Remove GPU Private Agents

mv dcos-gpu-agents.tf dcos-gpu-agents.tf.disabled
# remove num_of_gpu_agents = "3" in desired_cluster_profile.tfvars
terraform apply -var-file desired_cluster_profile.tfvars
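Since enabling and disabling GPU agents is just a file rename, the two directions can be wrapped in a small helper that is safe to run twice. A minimal sketch, assuming the file lives in the directory you pass in; set_gpu_agents is a hypothetical name.

```shell
# Enable or disable GPU agents by renaming dcos-gpu-agents.tf in the given
# directory. Does nothing when the file is already in the requested state.
# (set_gpu_agents is an illustrative helper, not part of this repo.)
set_gpu_agents() {
  dir="$2"
  case "$1" in
    enable)
      if [ -f "$dir/dcos-gpu-agents.tf.disabled" ]; then
        mv "$dir/dcos-gpu-agents.tf.disabled" "$dir/dcos-gpu-agents.tf"
      fi ;;
    disable)
      if [ -f "$dir/dcos-gpu-agents.tf" ]; then
        mv "$dir/dcos-gpu-agents.tf" "$dir/dcos-gpu-agents.tf.disabled"
      fi ;;
    *) echo "usage: set_gpu_agents enable|disable DIR" >&2; return 1 ;;
  esac
}

# Example usage from the installer directory:
# set_gpu_agents enable .
```

The rename only changes what Terraform sees; you still need to run terraform apply with your var file afterwards.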

Destroy Cluster

You can shut down and destroy all resources in your environment by running the command below:

terraform destroy -var-file desired_cluster_profile.tfvars

Roadmap

Support for AWS
Support for CoreOS
Support for Public Agents
Support for expanding Private Agents
Support for expanding Public Agents
Support for specific versions of CoreOS
Support for Centos
Secondary support for specific versions of Centos
Support for RHEL
Secondary support for specific versions of RHEL
Multi AZ Support
