## Overview of pdrive package

This package puts programs and data on a portable "pdrive" rather than on 
the AWS server itself. The "pdrive" can then be moved between different types of 
server, including spot instances. This saves 100% of the cost of setting up data and programs, by using free tier
servers, and 80% of the cost of GPUs, by providing persistent storage for spot
instances.

The examples here show how to set up and work with various types of server.

## Imports

In [4]:
from ipstartup import *
import aws
import server
import apps
from pdrive import Pdrive
import fabric.api as fab
from config import user, keyfile
fab.env.user = user
fab.output['everything'] = True

## Setup configuration

Set up an AWS account
* create an AWS account
* create an AWS config file and AWS credentials file
* request a limit increase to access a GPU
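
For the second bullet, the AWS CLI and boto3 read two INI-style files, by default `~/.aws/credentials` and `~/.aws/config`. The following is a minimal sketch of generating them with Python's `configparser`. The helper name `write_aws_files` is hypothetical, the key values are placeholders, and `eu-west-1` is only an assumed region. The sketch writes into a folder you pass in rather than touching a real `~/.aws`.

```python
import configparser
from pathlib import Path


def write_aws_files(folder, access_key, secret_key, region="eu-west-1"):
    """Write minimal AWS `credentials` and `config` files into `folder`."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # credentials file: one [default] profile holding the access key pair
    creds = configparser.ConfigParser()
    creds["default"] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    with open(folder / "credentials", "w") as f:
        creds.write(f)

    # config file: the default region for the same profile
    config = configparser.ConfigParser()
    config["default"] = {"region": region}
    with open(folder / "config", "w") as f:
        config.write(f)

    return folder / "credentials", folder / "config"
```

Calling `write_aws_files(Path.home() / ".aws", "<your key id>", "<your secret>")` would place the files where boto3 looks by default. Check first that you are not overwriting existing profiles.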

Adapt aws/config.py as required (or just accept the defaults)
* check file locations
* check region and AMIs if eu-west is not appropriate
* add any additional server types you might need

Set up credentials
* create a _creds.py file in your python path containing the plain text, unhashed password:
  - notebook=dict(password="dl_course")
* set up a key and security group using the scripts below

In [None]:
# create a key
try:
    key = aws.ec2.create_key_pair(KeyName="key")
    with open(keyfile, "w") as f:
        f.write(key.key_material)
except Exception as e:
    log.warning(e)
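
OpenSSH refuses to use a private key file that other users can read, so it can help to restrict the file's permissions when saving it. Here is a minimal sketch for POSIX systems. The helper name `save_private_key` is hypothetical and not part of this package.

```python
import os
import stat


def save_private_key(path, key_material):
    """Write key material to `path` and leave it readable by the owner only."""
    # A newly created file gets mode 0o600 (subject to umask), so it is
    # not readable by other users while it is being written.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(key_material)
    # Tighten to read-only for the owner (equivalent to chmod 400).
    os.chmod(path, stat.S_IRUSR)
```

In the cell above, `save_private_key(keyfile, key.key_material)` could take the place of the `open`/`write` pair. Note that a second call on the same path fails for a non-root user, because the existing file is then read-only.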

In [None]:
# create a security group
try:
    sec = aws.ec2.create_security_group(GroupName="simon",
                                        Description="wordpress, jupyter, ssh")
    sec.authorize_ingress(
        IpPermissions=[dict(IpProtocol='tcp', FromPort=80, ToPort=80),
                       dict(IpProtocol='tcp', FromPort=443, ToPort=443),
                       dict(IpProtocol='tcp', FromPort=8888, ToPort=8888),
                       dict(IpProtocol='tcp', FromPort=22, ToPort=22)])
except Exception as e:
    log.warning(e)
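
The rules in the cell above list only the protocol and ports. Each entry in boto3's `IpPermissions` can also carry an `IpRanges` list naming the source addresses the rule applies to, which makes explicit who can reach each port. The sketch below builds such entries. The helper `tcp_rule` is hypothetical, and `203.0.113.10/32` is a placeholder from the documentation address range that you would replace with your own IP.

```python
def tcp_rule(port, cidr="0.0.0.0/0"):
    """Build one IpPermissions entry opening a single TCP port to `cidr`."""
    return dict(IpProtocol="tcp", FromPort=port, ToPort=port,
                IpRanges=[dict(CidrIp=cidr)])


# http, https and jupyter open to any address; ssh limited to one placeholder address
permissions = [tcp_rule(port) for port in (80, 443, 8888)]
permissions.append(tcp_rule(22, cidr="203.0.113.10/32"))
```

The resulting list could be passed as `sec.authorize_ingress(IpPermissions=permissions)`.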

## Setup programs and data using a free instance

In [None]:
# create a server called "kate" with a free instance; and a volume called "fastai" mounted at /v1 to hold programs and data
server.create("kate", itype="free", pdrive="fastai", pdrivesize=10)

It can take several minutes to pull a large docker image or data file, and doing this from a notebook produces either excessive output or a silent wait. Carry out these steps via SSH instead, where it is easier to monitor progress.
* download data to /v1 (as required)
* docker pull simonm3/fastai (or other docker image)

In [None]:
# run the notebook. config read from /v1/.jupyter/jupyter_notebook_config.py so password can be changed.
apps.run_fastai()

Optionally, you can install some additional tools from GitHub projects.

In [None]:
# install some additional tools on /v1
fab.sudo("yum install -y -q git")
with fab.cd("/v1"):
  apps.install_github("simonm3", ["basics"])
# add basics/pathconfig.pth to pythonpath
with fab.quiet():
  fab.run("docker exec notebook python /host/basics/pathconfig.py")

You can test the fastai notebook at the "kate" IP address on port 8888. All of the setup time so far has used free instances and free storage. The next step is to terminate this instance. All data and programs will be preserved in a snapshot that we can later attach to a high performance instance such as a GPU.

In [None]:
server.terminate("kate")

## Work with the programs and data using a GPU

Create a spot GPU server called "sarah" with the same data and programs as before.

In [None]:
server.create("sarah", itype="gpu", spot=True, pdrive="fastai")

Work with the "sarah" IP address on port 8888. When you have finished working, terminate the server. Calling server.terminate("sarah") saves the pdrive as a snapshot, including all data and programs. If the instance is terminated by AWS (e.g. if outbid on a spot instance) or via the AWS console, the volume will remain but will not automatically be saved as a snapshot. In this case, manually save it as a snapshot and delete the volume. It would be possible to automate this by capturing AWS termination notices, but this is not currently included.

In [None]:
server.terminate("sarah")
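
The manual fallback and the termination notice mentioned above could be sketched roughly as follows. This is not part of the package. It assumes the instance metadata service answers unauthenticated (IMDSv1) requests, and that `volume` is a boto3 `ec2.Volume` resource that is already detached. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

# Metadata path that EC2 populates once a spot interruption is scheduled.
NOTICE_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def fetch_notice(url=NOTICE_URL, timeout=2):
    """Return the notice body, or None while no interruption is scheduled (HTTP 404)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return None
        raise


def interruption_action(body):
    """Parse a notice such as '{"action": "terminate", "time": "..."}'."""
    return None if body is None else json.loads(body)["action"]


def snapshot_and_delete(volume, description="pdrive backup"):
    """Snapshot a detached volume, wait for the snapshot to complete, then delete the volume."""
    snapshot = volume.create_snapshot(Description=description)
    snapshot.wait_until_completed()
    volume.delete()
    return snapshot
```

A small loop on the instance could call `interruption_action(fetch_notice())` every few seconds and trigger a clean shutdown when it returns a value. `snapshot_and_delete` covers the manual clean-up of a volume left behind after AWS has terminated the instance.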

## Create more servers

It is possible to create servers without a pdrive. For example, you may want to create a server with a static IP address running wordpress.
* request an Elastic IP address from AWS (this is free as long as it is attached to a running instance)
* run the script below

In [None]:
instance = server.create("sm1")

# attach to the first elastic ip address on your account
fab.env.host_string = aws.get_ips()[0]
aws.client.associate_address(InstanceId=instance.instance_id,
                             PublicIp=fab.env.host_string)
server.wait_ssh()
apps.install_docker()
apps.install_wordpress()
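
After the address is associated, the script waits until SSH is reachable before installing anything. The package's `server.wait_ssh` is not shown in this document. As a rough idea of what such a wait involves, the sketch below polls a TCP port until it accepts connections. The name `wait_for_port` and its defaults are assumptions.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300, interval=5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or unreachable: wait a little and try again.
            time.sleep(interval)
    return False
```

An open port only shows that the SSH daemon is listening. A login can still fail for a few seconds while the instance finishes booting.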

## Work with an existing pdrive

Typically you will create a server and pdrive at the same time. However, sometimes you may want to attach the pdrive to an existing instance. This is also possible with the commands below.

In [None]:
pdrive = Pdrive("fastai")
pdrive.connect("sm1")

The pdrive is now attached as /v1 to the server sm1. In this case docker is not installed or started automatically. You can work with /v1 as required. When finished, disconnect.

In [None]:
pdrive.disconnect()

## Utilities

As a bonus, a number of utilities are available, as shown below. Also, for convenience, all resources (instances, volumes, snapshots) can be referred to by name rather than by the roughly 20-character Amazon id.

In [None]:
# get a resource by name
aws.get("sm1")

In [None]:
# get all resources (instances, volumes, snapshots)
aws.get(unique=False)
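
One common way to refer to AWS resources by name is the `Name` tag, which boto3 resources expose through a `tags` list of `Key`/`Value` dicts. Whether `aws.get` works exactly this way is an assumption. The sketch below only illustrates the idea, using stand-in objects with made-up ids and hypothetical helper names.

```python
from types import SimpleNamespace


def name_of(resource):
    """Return the value of the resource's 'Name' tag, or None if it has none."""
    for tag in getattr(resource, "tags", None) or []:
        if tag["Key"] == "Name":
            return tag["Value"]
    return None


def find_by_name(resources, name):
    """Return every resource whose Name tag equals `name`."""
    return [r for r in resources if name_of(r) == name]


# Stand-ins for boto3 resources; the ids are made up for illustration.
resources = [
    SimpleNamespace(id="i-0123456789abcdef0", tags=[{"Key": "Name", "Value": "sm1"}]),
    SimpleNamespace(id="vol-0123456789abcdef0", tags=[{"Key": "Name", "Value": "fastai"}]),
    SimpleNamespace(id="snap-0123456789abcdef0", tags=None),
]
matches = find_by_name(resources, "fastai")
print([r.id for r in matches])
```

With real resources, the same helper could be applied to a collection such as `boto3.resource("ec2").instances.all()`.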

In [5]:
# show instances used
aws.get_instances()

[root:INFO]:starting (cellevents.py\30, time=15:30)


| | name | instance_id | image | type | state | ip |
|---|---|---|---|---|---|---|
| 0 | sm1 | i-0e278aadbb0395c13 | ami-c51e3eb6 | t2.micro | running | 34.248.84.101 |


time: 3.68 s


In [6]:
# show python tasks running in containers
fab.env.host_string=aws.get("sm1").public_ip_address
server.get_tasks("python")

[root:INFO]:starting (cellevents.py\30, time=15:32)


[34.248.84.101] run: docker exec meetup ps -eo args | grep python || true
[34.248.84.101] out: python3
[34.248.84.101] out: python meetup/meetup.py
[34.248.84.101] out: 

[34.248.84.101] run: docker exec wordpress_wordpress_1 ps -eo args | grep python || true
[34.248.84.101] run: docker exec wordpress_db_1 ps -eo args | grep python || true


| | container | task |
|---|---|---|
| 0 | meetup | python3 |
| 1 | meetup | python meetup/meetup.py |


time: 4.34 s


In [None]:
# run a python program in a container
apps.run_python("meetup")

In [None]:
# show all tasks running in containers
server.get_tasks()

In [None]:
# set docker location to pdrive
apps.set_docker_folder("/v1")

In [None]:
# set docker location to boot drive
apps.set_docker_folder()
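
How `apps.set_docker_folder` moves Docker's storage is not shown here. On current Docker versions, the storage location is controlled by the `data-root` key in `/etc/docker/daemon.json`, so a change like this presumably comes down to editing that file and restarting the daemon. The sketch below only produces the file contents. The helper `daemon_config` is hypothetical.

```python
import json


def daemon_config(folder=None, existing=None):
    """Return daemon.json text with `data-root` set to `folder`, or removed if None."""
    config = dict(existing or {})
    if folder is None:
        # Without data-root, Docker falls back to its default on the boot drive.
        config.pop("data-root", None)
    else:
        config["data-root"] = folder
    return json.dumps(config, indent=2)


print(daemon_config("/v1"))
```

Passing the current contents of daemon.json as `existing` preserves any other settings already in the file.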