Pre-build AWS ami with Packer to minimise EC2 bootstrapping time #260

32 changes: 30 additions & 2 deletions amlb/runners/aws.py
@@ -124,7 +124,7 @@ def __init__(self, framework_name, benchmark_name, constraint_name, region=None)
         self.region = (region if region
                        else rconfig().aws.region if rconfig().aws['region']
                        else boto3.session.Session().region_name)
-        self.ami = rconfig().aws.ec2.regions[self.region].ami
+        self.ami = rconfig().aws.ec2.regions[self.region].ami if not rconfig().aws.use_packer_ami else rconfig().aws.ec2.regions[self.region].packer_ami
         self.cloudwatch = None
         self.ec2 = None
         self.iam = None
@@ -148,7 +148,10 @@ def _validate(self):

     def _validate2(self):
         if self.ami is None:
-            raise ValueError("Region {} not supported by AMI yet.".format(self.region))
+            if rconfig().aws.use_packer_ami:
+                raise ValueError("Region {} has no pre-build packer AMI configured.".format(self.region))
+            else:
+                raise ValueError("Region {} not supported by AMI yet.".format(self.region))

     def setup(self, mode):
         if mode == SetupMode.skip:
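
The selection and validation logic in the two hunks above can be read as the following standalone sketch. It is not the code from `aws.py`: the helper names `select_ami` and `validate_ami` and the plain `regions` dictionary are hypothetical stand-ins for `rconfig().aws.ec2.regions`.

```python
def select_ami(region_config, use_packer_ami):
    """Return the AMI id for a region, preferring the Packer-built image when requested."""
    key = "packer_ami" if use_packer_ami else "ami"
    return region_config.get(key)


def validate_ami(ami, region, use_packer_ami):
    """Raise ValueError with a message that says which AMI entry is missing."""
    if ami is None:
        if use_packer_ami:
            raise ValueError("Region {} has no pre-built Packer AMI configured.".format(region))
        raise ValueError("Region {} not supported by AMI yet.".format(region))


# Hypothetical stand-in for the aws.ec2.regions namespace of config.yaml.
regions = {
    "eu-central-1": {"ami": "ami-0bdf93799014acdc4", "packer_ami": None},
}

ami = select_ami(regions["eu-central-1"], use_packer_ami=False)
validate_ami(ami, "eu-central-1", use_packer_ami=False)
```

With `use_packer_ami=True` and an empty `packer_ami` entry, `validate_ami` raises the Packer-specific message, which is the case the second hunk adds.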
@@ -1050,6 +1053,31 @@ def _ec2_startup_script(self, instance_key, script_params="", timeout_secs=-1):
""" if rconfig().aws.use_docker else """
#cloud-config

runcmd:
- apt-get -y remove unattended-upgrades
- systemctl stop apt-daily.timer
- systemctl disable apt-daily.timer
- systemctl disable apt-daily.service
- systemctl daemon-reload
- cd /repo
- alias PY='/repo/venv/bin/python3 -W ignore'
- aws s3 cp '{s3_input}' /s3bucket/input --recursive
- aws s3 cp '{s3_user}' /s3bucket/user --recursive
- PY {script} {params} -i /s3bucket/input -o /s3bucket/output -u /s3bucket/user -s only --session=
- PY {script} {params} -i /s3bucket/input -o /s3bucket/output -u /s3bucket/user -Xrun_mode=aws -Xproject_repository={repo}#{branch} {extra_params}
- aws s3 cp /s3bucket/output '{s3_output}' --recursive

final_message: "AutoML benchmark {ikey} completed after $UPTIME s"

power_state:
delay: "+1"
mode: poweroff
message: "I'm losing power"
timeout: {timeout}
condition: True
""" if rconfig().aws.use_packer_ami else """
#cloud-config

package_update: true
package_upgrade: false
packages:
53 changes: 53 additions & 0 deletions aws_ami/README.md
@@ -0,0 +1,53 @@
# Auto-ML AWS AMI

This directory contains the instructions and configuration files to build a custom automl Amazon Machine Image (AMI) for AWS.

Note: the current configuration only works in the eu-central-1 region of AWS,
because the base AMI is hard-coded to an AMI that is only available in this region.


## Prerequisites

### Install Packer

[Packer](https://learn.hashicorp.com/packer) is the command-line tool this module uses to build a custom AWS AMI.
To build the image, the `packer` command must be installed on your local machine.
For installation instructions, see [how to install packer](https://learn.hashicorp.com/tutorials/packer/getting-started-install).

### AWS credentials setup
Before building a new automl AMI, configure AWS credentials that allow programmatic access:
- configure AWS credentials
- create an AWS profile (e.g. automl)
- set the profile name, using one of two options:
  - update the profile name in the packer config file (i.e. ami-automl.pkr.hcl)
  - set the profile environment variable (e.g. `export AWS_PROFILE=automl`)
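
As a quick sanity check before building, the profile lookup can be sketched in Python with the standard library only. This is an illustration, not part of the module: `resolve_profile` and `profile_has_credentials` are hypothetical helpers, and the sample text only mimics the INI layout of a shared credentials file with placeholder values.

```python
import configparser
import os


def resolve_profile(env=None):
    """Return the profile name from AWS_PROFILE, falling back to 'default'."""
    env = os.environ if env is None else env
    return env.get("AWS_PROFILE") or "default"


def profile_has_credentials(credentials_text, profile):
    """Check that INI-formatted credentials text defines an access key for the profile."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    return parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id")


# Sample content in the shape of a shared credentials file (placeholder values).
sample = """
[automl]
aws_access_key_id = EXAMPLEKEYID
aws_secret_access_key = EXAMPLESECRET
"""

profile = resolve_profile({"AWS_PROFILE": "automl"})
print(profile, profile_has_credentials(sample, profile))
```

In practice the text would be read from your own credentials file. The check only confirms that the profile section exists, not that the keys are valid.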


## Validate and Build

Validate the packer config file before building the automl AMI.
To validate, run the following command.

```sh
packer validate ./config/ami-automl.pkr.hcl
```

If validation succeeds, run the following command to build the AMI.

```sh
packer build ./config/ami-automl.pkr.hcl
```
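
If you prefer to script both steps, the sequence above can be wrapped as follows. This is a minimal sketch: `validate_then_build` is a hypothetical helper, and the `run` parameter exists only so the sequence can be exercised without Packer installed. The example call records the commands instead of executing them.

```python
import subprocess


def validate_then_build(template, run=subprocess.run):
    """Run `packer validate` and, only if it succeeds, `packer build`."""
    # check=True raises CalledProcessError on a non-zero exit code,
    # so the build is never started for an invalid template.
    run(["packer", "validate", template], check=True)
    run(["packer", "build", template], check=True)


# Dry run: record the commands instead of invoking packer.
calls = []
validate_then_build(
    "./config/ami-automl.pkr.hcl",
    run=lambda cmd, check: calls.append(cmd),
)
print(calls)
```

Calling `validate_then_build("./config/ami-automl.pkr.hcl")` without the `run` argument executes the real commands through `subprocess.run`.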

## AMI steps
Build steps:
- install: curl, wget, unzip, git
- install: software-properties-common
- add repository ppa:deadsnakes/ppa
- update packages
- install python3 (python3, python3-venv, python3-dev, python3-pip)
- install awscli, wheel
- create directories
- git clone the stable branch of automlbenchmark
- create python environment
- install python packages

51 changes: 51 additions & 0 deletions aws_ami/config/ami-automl.pkr.hcl
@@ -0,0 +1,51 @@
variable "source_ami" {
type = string
description = "Ubuntu Server 18.04 LTS (HVM), EBS General Purpose (SSD) VolumeType"
default = "ami-0bdf93799014acdc4"
Collaborator

This defaults to an AMI available only in eu-central-1. It would be nice to automatically default to the AMI associated with the selected region in config.yaml.
Until then, I suggest:

Suggested change
default = "ami-0bdf93799014acdc4"
default = "<use one ami defined in config.yaml, namespace aws.ec2.regions>"
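
One possible interim approach, sketched under assumptions: look up the base AMI for the target region and pass it to Packer through its `-var` command-line option, which overrides the `source_ami` variable declared in this template. The helper `packer_build_command` and the `regions` dictionary are hypothetical. In practice the dictionary would come from the `aws.ec2.regions` namespace of config.yaml, and the region Packer builds in would have to match, which this sketch does not handle.

```python
def packer_build_command(regions, region, template):
    """Build a `packer build` command line that overrides the source_ami variable."""
    ami = regions.get(region, {}).get("ami")
    if ami is None:
        raise ValueError("Region {} has no base AMI configured.".format(region))
    return ["packer", "build", "-var", "source_ami={}".format(ami), template]


# Hypothetical stand-in for aws.ec2.regions, using AMI ids listed in config.yaml.
regions = {
    "us-east-1": {"ami": "ami-0ac019f4fcb7cb7e6"},
    "eu-central-1": {"ami": "ami-0bdf93799014acdc4"},
}

cmd = packer_build_command(regions, "us-east-1", "./config/ami-automl.pkr.hcl")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run` as is. Keeping it as a list avoids shell quoting issues around the `-var` value.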

# Optional: define a filter to automatically pick the latest version of an ami as source ami.
// source_ami_filter {
// filters = {
// name = "ubuntu/images/*ubuntu-xenial-16.04-amd64-server-*"
// root-device-type = "ebs"
// virtualization-type = "hvm"
// }
// most_recent = true
// owners = ["099720109477"]
// }
}

locals { timestamp = regex_replace(timestamp(), "[- TZ:]", "") }

# source blocks configure your builder plugins; your source is then used inside
# build blocks to create resources. A build block runs provisioners and
# post-processors on an instance created by the source.
source "amazon-ebs" "automl-ami" {
# the profile to use in the shared credentials file for AWS.
// profile = "default"

ami_name = "ami-automl-${local.timestamp}"
ami_description = "AMI for the AutoML benchmark project"

# uncomment following line to create a public ami, default a private ami is created
// ami_groups = ["all"]

instance_type = "t2.micro"
sebhrusen marked this conversation as resolved.
source_ami = var.source_ami

ssh_username = "ubuntu"
}

# a build block invokes sources and runs provisioning steps on them.
build {
sources = ["source.amazon-ebs.automl-ami"]

provisioner "shell" {
execute_command = "echo 'packer' | sudo -S env {{ .Vars }} {{ .Path }}"
environment_vars = [
"BRANCH=stable",
"GITREPO=https://github.com/openml/automlbenchmark",
"PYV=3"
Comment on lines +45 to +47
Collaborator

can those be turned into variables?

]
script = "./scripts/configure-ami.sh"
}
}
36 changes: 36 additions & 0 deletions aws_ami/scripts/configure-ami.sh
@@ -0,0 +1,36 @@
#!/bin/bash

echo "*** start ami configuration ***"

apt-get update

echo "*** install: curl, wget, unzip, git ***"
# Skip restart prompt of 'libssl1.1' by running following command
echo '* libraries/restart-without-asking boolean true' | debconf-set-selections
apt-get -y install curl wget unzip git
apt-get -y install software-properties-common
add-apt-repository -y ppa:deadsnakes/ppa
apt-get update

echo "*** install python${PYV} ***"
apt-get -y install python$PYV python$PYV-venv python$PYV-dev python3-pip

echo "*** install awscli ***"
pip3 install -U wheel awscli --no-cache-dir

echo "make automl directory structure"
mkdir -p /s3bucket/input
mkdir -p /s3bucket/output
mkdir -p /s3bucket/user
mkdir /repo

echo "clone repo"
cd /repo
git clone --depth 1 --single-branch --branch $BRANCH $GITREPO .

echo "create python environment"
python3 -m venv venv

echo "install python packages"
/repo/venv/bin/pip3 install -U pip
xargs -L 1 /repo/venv/bin/pip3 install --no-cache-dir < requirements.txt
8 changes: 8 additions & 0 deletions resources/config.yaml
@@ -116,6 +116,10 @@ aws:
root_key: ec2/
delete_resources: false

use_packer_ami: false # if true, the EC2 instance is started from the pre-built Packer AMI instead of the base AMI.
# Note: enter the AMI ID of your Packer-built image in the packer_ami field (i.e. ec2.regions.[region].packer_ami).
# For more information, see the aws_ami directory.

Comment on lines +119 to +122
Collaborator

I'd rather move this config under the aws.ec2 namespace

ec2:
key_name: # the name of the key pair passed to EC2 instances (if not set, user can't ssh the running instances)
security_groups: [] # the optional additional security groups to set on the instances
@@ -150,14 +154,18 @@ aws:
us-east-1:
ami: ami-0ac019f4fcb7cb7e6
description: Ubuntu Server 18.04 LTS (HVM), EBS General Purpose (SSD) VolumeType
packer_ami:
us-west-1:
ami: ami-063aa838bd7631e0b
description: Ubuntu Server 18.04 LTS (HVM), EBS General Purpose (SSD) VolumeType
packer_ami:
eu-west-1:
ami: ami-00035f41c82244dab
description: Ubuntu Server 18.04 LTS (HVM), EBS General Purpose (SSD) VolumeType
packer_ami:
eu-central-1:
ami: ami-0bdf93799014acdc4
packer_ami:
description: Ubuntu Server 18.04 LTS (HVM), EBS General Purpose (SSD) VolumeType
spot:
enabled: false # if enabled, aws mode will try to obtain a spot instance instead of on-demand.