# Introduction

This package makes it easy to start and stop AWS spot instances, and simplifies working with other AWS resources. The following cells walk through some examples.

In [2]:
from ipstartup import *
from aws2 import aws, Image, Instance, Volume, Spot, Snapshot

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:16)


time: 662 ms


# Creating an AMI

The first step is to create a base AMI for the project. This needs the id of an existing AMI and a name for the project. Here we copy one of the fastai Europe region AMIs and call the image "fastai2". Note that you may need to change the defaults:
* security=["default"]
* key="key"
* ip=0, which selects your first elastic IP. Set ip=None for a random IP.

In [4]:
Image.copy("ami-9e1a35ed", "fastai2")

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:24)
[root:INFO]:requesting spot t2.micro $0.00, eu-west-1c, memory=1, vcpu=1 (spot.py:87, time=Jul-03 23:24)
[root:INFO]:waiting for request to be fulfilled (spot.py:112, time=Jul-03 23:24)
[root:INFO]:waiting for instance running (spot.py:125, time=Jul-03 23:24)
[root:INFO]:waiting for ssh server at 34.248.84.101 (instance.py:140, time=Jul-03 23:24)
[root:INFO]:waiting for snapshot (volume.py:24, time=Jul-03 23:25)
[root:INFO]:You now have 1 fastai2 snapshots (volume.py:30, time=Jul-03 23:25)
[root:INFO]:waiting for image to be saved (snapshot.py:38, time=Jul-03 23:25)
[root:INFO]:waiting for volume available (volume.py:38, time=Jul-03 23:25)


time: 1min 27s


Now we have a base AMI in our account from which we can launch spot instances or standard instances.

# Starting and stopping a spot instance

## Simple version

In [5]:
i=Spot("fastai2")
i.jupyter()

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:28)
[root:INFO]:requesting spot t2.micro $0.00, eu-west-1c, memory=1, vcpu=1 (spot.py:87, time=Jul-03 23:28)
[root:INFO]:waiting for request to be fulfilled (spot.py:112, time=Jul-03 23:28)
[root:INFO]:waiting for instance running (spot.py:125, time=Jul-03 23:29)
[root:INFO]:waiting for ssh server at 34.248.84.101 (instance.py:140, time=Jul-03 23:29)
[root:INFO]:waiting for jupyter notebook server at 34.248.84.101:8888 (instance.py:155, time=Jul-03 23:30)


time: 2min 32s


Now you have an instance named "fastai2" with a boot volume named "fastai2". The address is already on the clipboard, so you can paste it into your browser address bar and use Jupyter notebooks on the remote server. When you have finished, stop the instance like this:

In [6]:
i.stop()

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:31)
[root:INFO]:waiting for snapshot (volume.py:24, time=Jul-03 23:31)
[root:INFO]:You now have 2 fastai2 snapshots (volume.py:30, time=Jul-03 23:31)
[root:INFO]:waiting for image to be saved (snapshot.py:38, time=Jul-03 23:31)
[root:INFO]:waiting for volume available (volume.py:38, time=Jul-03 23:31)


time: 2min 35s


Now you have a new snapshot named "fastai2" that contains the boot volume from your session. The "fastai2" image now points to this latest snapshot, and the spot instance and volume have been deleted.

Each time you stop:
* You get an additional fastai2 snapshot. Older versions can be deleted via the AWS menus if required. Snapshots are incremental, so the cost of keeping several is low and they can be useful for rollback.
* The image gets replaced. There is only one fastai2 image. This points to the latest fastai2 snapshot.

If AWS terminates your spot instance:
* The stop function will be called automatically.
* As a failsafe, if shutdown occurs without stop being called, the volume will still be present. In that case you can create a snapshot and image manually. No data is lost until you delete the volume.
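
The source does not say how the package detects a pending termination. One common approach, shown here only as a sketch, is to check the EC2 instance metadata path `spot/termination-time` from a process running on the spot instance itself. The path is a real AWS endpoint that returns 404 until a termination is scheduled. The function names and the `stop` callback are hypothetical, and instances that require IMDSv2 session tokens would need an extra token request that is not shown.

```python
import urllib.request

# Real EC2 metadata path; it answers 404 until AWS schedules a termination.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def fetch_termination_time(url=TERMINATION_URL, timeout=2):
    """Return the scheduled termination time as text, or None if there is no notice."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode().strip() or None
    except OSError:
        # URLError, HTTPError (e.g. 404) and socket timeouts are all OSError subclasses.
        return None


def stop_if_terminating(stop, fetch=fetch_termination_time):
    """Call stop() once if a termination notice is present; return the notice or None.

    `stop` is a hypothetical callback, e.g. something that saves the volume.
    """
    notice = fetch()
    if notice:
        stop()
    return notice
```

Calling `stop_if_terminating` every few seconds leaves most of the two-minute notice period for saving state.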

Naming of resources uses AWS tags:
* The name "fastai2" is stored as AWS "tag:Name"
* This is shown on AWS menus as "Name"
* Some AWS resource types also have a separate, unconnected "Name" field, which may also appear on AWS menus. That field is not used here because it is not available for every resource type and has to be unique.



## Other instance or spot methods

See the docstrings for details. For example, you can select a specific instance type or, if you can't remember the name, enter selection and sort criteria.

In [None]:
# i=Spot("fastai2", "p2.xlarge")

In [9]:
i=Spot("fastai2", "memory>=15 & vcpu>=2", sort=["percpu"])

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:35)
[root:INFO]:requesting spot m1.xlarge $0.04, eu-west-1a, memory=15, vcpu=4 (spot.py:87, time=Jul-03 23:36)
[root:INFO]:waiting for request to be fulfilled (spot.py:112, time=Jul-03 23:36)
[root:INFO]:waiting for instance running (spot.py:125, time=Jul-03 23:36)
[root:INFO]:waiting for ssh server at 34.248.84.101 (instance.py:140, time=Jul-03 23:36)


time: 1min 19s


Terminate a spot instance without saving. Spot instances have delete_on_termination=False by default, so pass delete_volume=True to remove the volume as well.

In [10]:
i.terminate(delete_volume=True)

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:37)
[root:INFO]:waiting for volume available (volume.py:38, time=Jul-03 23:37)


time: 31 s


Launch a standard (non-spot) instance.

In [12]:
i=Instance("fastai2")
i.start()

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:38)
[root:INFO]:waiting for instance running (instance.py:48, time=Jul-03 23:38)
[root:INFO]:waiting for ssh server at 54.76.67.99 (instance.py:140, time=Jul-03 23:38)


time: 1min 4s


Use Python dict syntax for tags.

In [17]:
i.set_tags(someflag="www", another="xxx")
i.tags

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:41)


{'Name': 'fastai2', 'another': 'xxx', 'someflag': 'www'}

time: 322 ms
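
For comparison, boto3 itself represents tags as a list of `{"Key": ..., "Value": ...}` dicts. The conversion to and from a plain dict can be sketched as below. The helper names are hypothetical and are not taken from this package.

```python
def tags_to_dict(tags):
    """Convert boto3's [{'Key': k, 'Value': v}, ...] into {k: v}.

    boto3 resources report `tags` as None when nothing is set, so treat that as empty.
    """
    return {tag["Key"]: tag["Value"] for tag in (tags or [])}


def dict_to_tags(**kwargs):
    """Convert keyword arguments into boto3's tag list format (values as strings)."""
    return [{"Key": key, "Value": str(value)} for key, value in kwargs.items()]


boto_tags = dict_to_tags(Name="fastai2", someflag="www", another="xxx")
plain = tags_to_dict(boto_tags)
```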


In [18]:
i.terminate()

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:41)


time: 364 ms


# Other utilities

Get resources sorted with most recent first.

In [19]:
aws.get_images()

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:42)


[ec2.Image(id='ami-50cb5129'),
 ec2.Image(id='ami-f298d98b'),
 ec2.Image(id='ami-6b080a81'),
 ec2.Image(id='ami-9c9f9476'),
 ec2.Image(id='ami-bd001457')]

time: 755 ms


...and apply a filter.

In [20]:
aws.get_snapshots(Name="fastai2")

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:42)


[ec2.Snapshot(id='snap-0138298ebef60c772'),
 ec2.Snapshot(id='snap-0fc6c225fb5c04ed6'),
 ec2.Snapshot(id='snap-07c24276a06cc16e3'),
 ec2.Snapshot(id='snap-0fbb9f6f25e92d356'),
 ec2.Snapshot(id='snap-0929cb167e88fa510')]

time: 109 ms
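
The idea of "filter by Name tag, most recent first" can be sketched in plain Python on records shaped like the entries boto3 returns from `describe_snapshots` (`SnapshotId`, `StartTime`, `Tags`). The helper names, snapshot ids and timestamps below are illustrative assumptions, not the package's implementation or real data.

```python
from datetime import datetime, timezone


def named(records, name):
    """Keep only records whose Name tag equals `name`."""
    return [
        record for record in records
        if any(tag["Key"] == "Name" and tag["Value"] == name
               for tag in record.get("Tags", []))
    ]


def most_recent_first(records, key="StartTime"):
    """Sort records so that the newest comes first."""
    return sorted(records, key=lambda record: record[key], reverse=True)


def name_tag(value):
    return [{"Key": "Name", "Value": value}]


# Hypothetical records for illustration only.
snapshots = [
    {"SnapshotId": "snap-old", "StartTime": datetime(2017, 7, 3, 23, 25, tzinfo=timezone.utc), "Tags": name_tag("fastai2")},
    {"SnapshotId": "snap-other", "StartTime": datetime(2017, 7, 3, 23, 28, tzinfo=timezone.utc), "Tags": name_tag("another")},
    {"SnapshotId": "snap-new", "StartTime": datetime(2017, 7, 3, 23, 31, tzinfo=timezone.utc), "Tags": name_tag("fastai2")},
]

latest = most_recent_first(named(snapshots, "fastai2"))[0]
```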


Get the list of instance types and spot prices.

In [21]:
df=aws.get_spotprices()
df[df.memory>30].sort_values("SpotPrice")[["memory", "InstanceType", "SpotPrice", "vcpu"]].head(10)

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:42)


| | memory | InstanceType | SpotPrice | vcpu |
|---|---|---|---|---|
| 299 | 34 | m2.2xlarge | 0.055 | 4 |
| 300 | 34 | m2.2xlarge | 0.055 | 4 |
| 301 | 34 | m2.2xlarge | 0.055 | 4 |
| 266 | 68 | m2.4xlarge | 0.11 | 8 |
| 265 | 68 | m2.4xlarge | 0.11 | 8 |
| 264 | 68 | m2.4xlarge | 0.11 | 8 |
| 181 | 32 | t2.2xlarge | 0.121 | 8 |
| 180 | 32 | t2.2xlarge | 0.121 | 8 |
| 179 | 32 | t2.2xlarge | 0.121 | 8 |
| 17 | 32 | m4.2xlarge | 0.1284 | 8 |


time: 15 s
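
Selection criteria such as "memory>=15 & vcpu>=2" with sort=["percpu"], used earlier with Spot, amount to a filter followed by a sort over this kind of table. The sketch below shows that idea in plain Python, using four rows copied from the output above. It assumes that `percpu` means the spot price divided by the vcpu count, which the source does not confirm, and `select` is a hypothetical helper rather than the package's code.

```python
# Rows copied from the spot price output above (one per instance type).
rows = [
    {"InstanceType": "m2.2xlarge", "memory": 34, "vcpu": 4, "SpotPrice": 0.055},
    {"InstanceType": "m2.4xlarge", "memory": 68, "vcpu": 8, "SpotPrice": 0.11},
    {"InstanceType": "t2.2xlarge", "memory": 32, "vcpu": 8, "SpotPrice": 0.121},
    {"InstanceType": "m4.2xlarge", "memory": 32, "vcpu": 8, "SpotPrice": 0.1284},
]


def select(rows, min_memory, min_vcpu):
    """Filter by minimum memory and vcpu, then sort by price per vcpu.

    Assumption: percpu = SpotPrice / vcpu. Ties are broken by the total price.
    """
    matches = [
        dict(row, percpu=row["SpotPrice"] / row["vcpu"])
        for row in rows
        if row["memory"] >= min_memory and row["vcpu"] >= min_vcpu
    ]
    return sorted(matches, key=lambda row: (row["percpu"], row["SpotPrice"]))


cheapest = select(rows, min_memory=15, min_vcpu=2)[0]
```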


This is the data that can be extracted from AWS for each instance type.

In [22]:
df.columns

[root:INFO]:starting (cellevents.py:36, time=Jul-03 13:06)


Index(['capacitystatus', 'clockSpeed', 'currentGeneration',
       'dedicatedEbsThroughput', 'ecu', 'enhancedNetworkingSupported', 'gpu',
       'instanceFamily', 'InstanceType', 'intelAvx2Available',
       'intelAvxAvailable', 'intelTurboAvailable', 'licenseModel', 'location',
       'locationType', 'memory', 'networkPerformance',
       'normalizationSizeFactor', 'operatingSystem', 'operation',
       'physicalProcessor', 'preInstalledSw', 'processorArchitecture',
       'processorFeatures', 'servicecode', 'servicename', 'storage', 'tenancy',
       'usagetype', 'vcpu', 'AvailabilityZone', 'SpotPrice', 'percpu',
       'per64cpu'],
      dtype='object')

time: 6 ms


Build filters and tag filters from Python keyword arguments instead of writing out the list-of-dicts format by hand.

In [22]:
aws.filt(a=4, b="hhhh")

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:42)


[{'Name': 'a', 'Values': [4]}, {'Name': 'b', 'Values': ['hhhh']}]

time: 7.84 ms


In [23]:
aws.tfilt(a=4, b="hhhh")

[root:INFO]:starting (cellevents.py:36, time=Jul-03 23:42)


[{'Name': 'tag:a', 'Values': [4]}, {'Name': 'tag:b', 'Values': ['hhhh']}]

time: 9.2 ms
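
The two outputs above match the `Filters` format used by boto3 `describe_*` calls. A minimal sketch of helpers that produce the same structure is shown below. It is written from the displayed output only, so the real `aws.filt` and `aws.tfilt` may differ, for example in how they treat values that are already lists.

```python
def filt(**kwargs):
    """Turn keyword arguments into a boto3-style Filters list."""
    return [
        {"Name": name, "Values": value if isinstance(value, list) else [value]}
        for name, value in kwargs.items()
    ]


def tfilt(**kwargs):
    """Same as filt, but each name is prefixed with 'tag:' to filter on tags."""
    return filt(**{"tag:" + name: value for name, value in kwargs.items()})


plain_filter = filt(a=4, b="hhhh")
tag_filter = tfilt(a=4, b="hhhh")
```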


# Elastic Network Adapter (ENA)

AMIs (images) are typically hardware independent, but there are exceptions such as ENA. ENA must be enabled on the image in order to launch a C5 instance, while cheaper instance types such as T2 cannot use ENA. Hence if you do simple tasks on a cheap T2 instance and save an image, you will not be able to launch a C5 from that image to do the heavy lifting.

The workaround is to stop the spot instance, start a standard instance, stop the instance, change the instance type to C5, enable ENA on the stopped instance, then create the image. Fortunately this can be scripted as below.

It may take up to an hour. In practice, most development and testing can be carried out on non-ENA instances, so you only need to create an ENA-enabled image when releasing a production version.

In [None]:
i.stop()
i = Instance("fastai2")
i.create_image_ena()
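
For readers who want to see the underlying calls, the sketch below shows part of the workaround with a boto3 EC2 client: stop the instance, wait until it is stopped, set the ENA flag, and create an image. It is not the source of `create_image_ena`, and the instance-type change is left out. `create_ena_image` is a hypothetical name, and the client is passed in so the function carries no credentials or region settings of its own. Setting the flag does not install the ENA driver, so the operating system on the volume must already include it.

```python
def create_ena_image(client, instance_id, image_name):
    """Stop an instance, flag it as ENA-enabled, and create an image from it.

    `client` is expected to behave like boto3.client("ec2").
    Returns the id of the new image.
    """
    client.stop_instances(InstanceIds=[instance_id])
    # The ENA attribute can only be changed while the instance is stopped.
    client.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    client.modify_instance_attribute(
        InstanceId=instance_id, EnaSupport={"Value": True}
    )
    response = client.create_image(InstanceId=instance_id, Name=image_name)
    return response["ImageId"]
```

With real credentials it would be called with `boto3.client("ec2")` and the id of a standard (non-spot) instance.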

# Appendix - AWS resources

These are the main AWS EC2 resource types:
* Instance - a running machine
* Spot - a lower-cost instance. It has two disadvantages. First, AWS can terminate it with just 2 minutes' notice in periods of high demand. Second, it cannot be stopped and started, only terminated.
* Volume - disk storage for an instance
* Snapshot - longer-term storage that is not attached to an instance
* Image - the AMI for launching a machine. This is linked to a snapshot of the boot drive.

Some of the issues with aws/boto3 that are partially addressed here:
* spot instances cannot be stopped, only terminated
* identifying resources by name rather than id
* filters and tags not in dict format
* cannot test isinstance(i, ec2.Instance)
* some common actions are multi-step with waiters in between
* inconsistent field names e.g. state, State, status
* waiters only wait for a limited number of retries
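
On the last point, boto3 waiters do accept a `WaiterConfig` argument with `Delay` and `MaxAttempts`, but they still give up after a fixed number of attempts. An alternative is to poll against a time limit. The helper below is a generic sketch of that idea, not the package's code; `wait_until` and its parameters are hypothetical names.

```python
import time


def wait_until(condition, timeout=600, interval=5,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns True if the condition was met and False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A typical condition would reload a resource and compare its state, for example checking that a volume has become available.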