### Amazon Web Services

If you have not already done so, set up an AWS account<sup>newaws</sup>. 

newaws: As of 2018/08/25, detailed instructions for doing this can be obtained here: https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/.


You will be using AWS to manage the hardware upon which your data science platform will run. We will leave the details of what exactly "hardware" means to AWS. This is to say that AWS may be allocating resources as a virtual machine, but for your purposes, the experience will be as if you are using a physical system across the room from you.

The most popular service offered by Amazon Web Services is the Elastic Compute Cloud (EC2), "a web service that provides secure, resizable compute capacity in the cloud"<sup>ec2</sup>. For our purposes, compute capacity means a cloud-based computer you will use to run your platform.

ec2: https://aws.amazon.com/ec2/


Readers new to AWS will be able to work through this text using the AWS Free Tier<sup>free tier</sup>. For the first 12 months following sign up, new users receive 750 Hours per month of EC2 time. This amounts to 31.25 days of availability and, provided that readers keep only one server running at a time, ensures that readers can work through this text at no cost.


free tier: https://aws.amazon.com/free/
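The arithmetic behind this claim can be checked at the command line. The following is a minimal sketch using `awk`; the variable names are illustrative only.

```shell
# Convert the monthly Free Tier allowance from hours to days.
free_tier_hours=750
days=$(awk -v h="$free_tier_hours" 'BEGIN { printf "%.2f", h / 24 }')
echo "$days days of EC2 time per month"

# The longest month has 31 days, so a single always-on server
# needs at most this many hours in any one month.
longest_month_hours=$((31 * 24))
echo "$longest_month_hours hours in a 31-day month"
```

Because 744 hours is less than the 750 hour allowance, one server can run continuously; two servers running at the same time would consume the allowance twice as fast.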


### Configure your Local System

In this first chapter, with the exception of some AWS system configuration in the browser, all of the work that you will be doing will take place at the command line. Work at the command line is done via a special type of application called a shell. A shell is a user interface that provides access to the operating system of a computer. We prefer the shell to other modes of working with computers because of its simplicity.

The most popular shell is the Bourne Again Shell or Bash. If you are using a Mac OS X system or a Linux system, you will already have Bash available to you in an application called Terminal. If you are using a Windows system, we recommend the use of Git-Bash<sup>git bash</sup>. We have no preference as to the settings used when configuring Git-Bash. The default settings are fine when installing the program. If you are using a Chromebook, we recommend the use of Termius, available from the Chrome Web Store.

git bash: https://gitforwindows.org


#### SSH Keys

All of the work that you will be doing will take place remotely. As such, there is very little configuration to be done for the local system. The one thing that you will need to do is configure a set of SSH Keys to enable secure connection to the remote system you bring online.

<include type="image" url="ch-01-ssh-keys.png"> 

###### Connecting with SSH Keys

![](../img/ch-01-ssh-keys.png)

</include>


An SSH Key is a password-less method of authenticating to a remote system using public-key cryptography. Authentication is done using a key pair consisting of a public key, which can be shared publicly, and a private key, which is known only to the user (See Figure \ref{fig:ch-01-ssh-keys.png}). One might think of the public key as the lock on your front door, accessible to anyone, and the private key as the key in your pocket so that only you are able to open your door and gain access to your home.

You will generate this key pair on your local system and then provide the public key to AWS so that it can be added to any system you wish to launch. You will keep the private key on your local system and use it whenever you wish to gain access.

### Create a New Key Pair

You will use the command line utility `ssh-keygen` to create a new key pair. To begin, open a new terminal session<sup>term</sup>, where you will examine whether or not you already have a key pair. Launching a new Bash session will put you in your home directory.

term: On a Mac or Linux system, simply open the Terminal application. On Windows, open Git-Bash. On Chromebook, open Termius.


The canonical location for storing SSH Keys is in a folder called `~/.ssh` in your home directory<sup>home</sup>. Note that the name of this directory begins with a `.`, which makes it a hidden directory. In Listing \ref{lst:ls-home}, you use `cd` to navigate to your home directory and `ls -la` to display all of the contents of your home directory in a list.

home: On every system that we will use, the $\sim$ symbol is used as an alias for your home directory. The location of the actual home directory will vary by system. Ubuntu users' home directory will be `/home/username`. Mac OS X users' home directory will be `/Users/username`. Windows/Git-Bash users' home directory will be `/c/Users/username`.
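As a quick illustration of the alias, the sketch below prints what `~` expands to on whichever system you run it. The variable name `ssh_dir` is used only for this example.

```shell
# The shell expands an unquoted ~ to the value of $HOME.
echo ~
echo "$HOME"

# ~/.ssh is therefore shorthand for "$HOME/.ssh".
ssh_dir=~/.ssh
echo "$ssh_dir"
```

The first two lines of output should match each other, and the third should be the same path with `/.ssh` appended.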


$\square$ **Note:** Occasionally in code listings, I will truncate the output. If you see `...` in the listing, this should be taken to mean that there is additional output generated that is not important for the discussion.

<include type="listing" label="ls-home">

###### List the contents of our home directory

```
$ cd ~
$ ls -la
...
drwxr-xr-x   21 joshuacook  staff    714 Jul 31  2017 .pylint.d
drwx------   11 joshuacook  staff    374 Jan 28 18:49 .ssh
drwxr-xr-x    6 joshuacook  staff    204 Feb  2 22:01 .vim
-rw-------    1 joshuacook  staff  20788 Feb 10 08:54 .viminfo
-rw-r--r--@   1 joshuacook  staff   1263 Jul 26  2017 .vimrc
drwx------@   5 joshuacook  staff    170 Aug 26 09:12 Applications
drwx------+  19 joshuacook  staff    646 Feb 11 09:32 Desktop
drwx------+   6 joshuacook  staff    204 Feb  4 12:18 Documents
...
```
</include>

My local system is running Mac OS X and has the `.ssh` folder already. As the directory already exists, in Listing \ref{lst:ls-ssh}, I list the contents of my `.ssh` directory.

<include type="listing" label="ls-ssh">
    
###### Display contents of the `.ssh` directory

```
$ ls -la .ssh
total 96
drwx------  11 joshuacook  staff    374 Jan 28 18:49 .
drwxr-xr-x+ 74 joshuacook  staff   2516 Feb 10 08:54 ..
-rw-------   1 joshuacook  staff   1679 Jan 11 16:21 id_rsa
-rw-r--r--   1 joshuacook  staff    418 Jan 11 16:21 id_rsa.pub
```

</include>

As can be seen, I already have an SSH Keypair named `id_rsa` and `id_rsa.pub`. If this is true for you, as well, you should skip the next step and not create new SSH Keys (See Listing  \ref{lst:create-new-ssh-key}).

If, when listing the home directory, you do not see a folder called `.ssh`, or if, when displaying the contents of `.ssh`, you do not see an SSH Keypair named `id_rsa` and `id_rsa.pub`, a new SSH Keypair will need to be created. In Listing \ref{lst:create-new-ssh-key}, you create a new SSH Keypair using the `ssh-keygen` command line utility.
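The same check can be expressed as a short script. This is only a sketch: the function name `has_keypair` is made up for this example, and it looks only for the default file names `id_rsa` and `id_rsa.pub`.

```shell
# Report whether a directory already holds an id_rsa key pair.
# The function name is illustrative, not a standard utility.
has_keypair() {
    dir="$1"
    [ -f "$dir/id_rsa" ] && [ -f "$dir/id_rsa.pub" ]
}

if has_keypair "$HOME/.ssh"; then
    echo "Found an existing key pair; skip key creation."
else
    echo "No key pair found; create one with ssh-keygen."
fi
```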

During the creation of the SSH Keypair, you will be prompted three times. The first prompt asks where to save the SSH Keypair, defaulting to `.ssh/id_rsa` in your home directory. In Listing \ref{lst:create-new-ssh-key}, you see that this is being done at `/Users/joshuacook/.ssh/id_rsa` on my local system, where my username is `joshuacook`. The second and third prompts will ask for a passphrase to be added to the key. For our purposes, leaving this passphrase empty will be fine. In other words, the default options are preferable and you may simply hit `<ENTER>` three times.

<include type="listing" label="create-new-ssh-key">
    
###### Create a new SSH Keypair

```
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/Users/joshuacook/.ssh/id_rsa):
Created directory '/Users/joshuacook/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/joshuacook/.ssh/id_rsa.
Your public key has been saved in /Users/joshuacook/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:MkCnhaAzjcjRHCUc/pdUsxrk7be6+gNXTrEchJyFOqs joshuacook@LOCAL
The key's randomart image is:
+---[RSA 2048]----+
| .===oo..o*o     |
|o+o=o+o o=oo     |
|*...o  +.o. +    |
| o  ...o=  =     |
|     .o+S.+.     |
|      .= ....    |
|      . o  .     |
|     E   ..      |
|       .o+o      |
+----[SHA256]-----+
```

</include>
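If you would like to see the same process without the interactive prompts, the sketch below generates a throwaway key pair in a scratch directory, so nothing in `~/.ssh` is touched. The scratch location and variable names are assumptions made for this example; the flags mirror the choices made at the prompts above (an RSA key with an empty passphrase).

```shell
# Generate a throwaway RSA key pair in a scratch directory so that
# nothing in ~/.ssh is touched. The path is an arbitrary choice.
scratch_dir=$(mktemp -d)
key_file="$scratch_dir/id_rsa"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t key type, -b key size in bits, -N passphrase (empty here),
    # -f output file, -q suppress the progress output.
    ssh-keygen -q -t rsa -b 2048 -N "" -f "$key_file"
    ls -l "$scratch_dir"
else
    echo "ssh-keygen is not installed on this system."
fi
```

The directory listing should show two files: the private key `id_rsa` and the public key `id_rsa.pub`.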

You can verify the SSH Keypair you just created by displaying the Public Key in your shell (Listing \ref{lst:cat-pub-key}). Here, you use the `cat` command, which concatenates the contents of `id_rsa.pub` to the shell output.

<include type="listing" label="cat-pub-key">
    
###### Display Public SSH Key

```
$ cat ~/.ssh/id_rsa.pub
ssh-rsa
AAAAB3NzaC1yc2EAAAADAQABAAABAQDdnHPEiq1a4OsDDY+g9luWQS8pCjBmR
64MmsrQ9MaIaE5shIcFB1Kg3pGwJpypiZjoSh9pS55S9LckNsBfn8Ff42ALLj
R8y+WlJKVk/0DvDXgGVcCc0t/uTvxVx0bRruYxLW167J89UnxnJuRZDLeY9fD
OfIzSR5eglhCWVqiOzB+OsLqR1W04Xz1oStID78UiY5msW+EFg25Hg1wepYMC
JG/Zr43ByOYPGseUrbCqFBS1KlQnzfWRfEKHZbtEe6HbWwz1UDL2NrdFXxZAI
XYYoCVtl4WXd/WjDwSjbMmtf3BqenVKZcP2DQ9/W+geIGGjvOTfUdsCHennYI
EUfEEP joshuacook@LOCAL
```
</include>
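The key is wrapped across several lines in the listing so that it fits on the page. In the file itself, the public key is a single line made up of three space-separated fields: the key type, the encoded key, and a comment. The sketch below splits a stand-in line into those fields; the key material in `sample_line` is deliberately truncated and is not a usable key.

```shell
# A stand-in for the single line stored in id_rsa.pub; the key
# material is truncated here and is not a usable key.
sample_line='ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB joshuacook@LOCAL'

# Split the line into its three space-separated fields.
key_type=$(printf '%s\n' "$sample_line" | awk '{ print $1 }')
key_body=$(printf '%s\n' "$sample_line" | awk '{ print $2 }')
key_comment=$(printf '%s\n' "$sample_line" | awk '{ print $3 }')

echo "type:    $key_type"
echo "comment: $key_comment"

# A real public key file should contain exactly one line.
printf '%s\n' "$sample_line" | wc -l
```

To inspect your own key, you could replace the stand-in with `sample_line=$(cat ~/.ssh/id_rsa.pub)`.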

### Configure your AWS Account

That is the sum of the local configuration you will need to do in order to get started. The next thing you will need to do is configure your AWS Account. To do this, you will need to configure a Key Pair corresponding to the SSH Keypair on your local system. The AWS Key Pair is slightly misnamed, as it is not in fact a pair, but rather is simply the public portion of the SSH Keypair you have on your local system.

To begin, log in to your AWS control panel and navigate to the EC2 Dashboard (Figure \ref{fig:ch-01-access-ec2-dash.png}). First, access “Services” (Figure \ref{fig:ch-01-access-ec2-dash.png}, #1) then access “EC2” (Figure \ref{fig:ch-01-access-ec2-dash.png}, #2). The Services link can be accessed from any page in the AWS website.

<include type="image" url="ch-01-access-ec2-dash.png"> 

###### Access EC2 Dashboard

![](../img/ch-01-access-ec2-dash.png)

</include>

#### Add the Public Key to AWS

Once at the EC2 control panel, access the Key Pairs pane using either link (Figure \ref{fig:ch-01-access-key-pairs.png}).

<include type="image" url="ch-01-access-key-pairs.png"> 

###### Access Key Pairs in the EC2 Dashboard

![](../img/ch-01-access-key-pairs.png)

</include>

From the Key Pairs pane, choose “Import Key Pair.” This will activate a modal that you can use to create a new key pair associated with a region on your AWS account. Make sure to give the key pair a computer-friendly name, like `from-MacBook-2018`. Paste the contents of your public key (`id_rsa.pub`) into the public key contents. Prior to clicking Import, your key should appear as in Figure \ref{fig:ch-01-import-public-key.png}. Click Import to create the new key.

<include type="image" url="ch-01-import-public-key.png"> 

###### Import a New Public Key

![](../img/ch-01-import-public-key.png)

</include>

You have created a key pair between AWS and your local system. When you create a new instance, you will instruct AWS to provision the instance with this public key and thus you will be able to access the cloud-based system from your local system using your private key.

### Launch a New EC2 Instance

To create a new instance, start from the EC2 Dashboard and click the Launch Instance button (Figure \ref{fig:ch-01-launch-instance.png}).

<include type="image" url="ch-01-launch-instance.png"> 

###### Begin the launch process for a new instance

![](../img/ch-01-launch-instance.png)

</include>

#### Step 1: Choose an Amazon Machine Image (AMI)

The launching of a new instance is a multi-step process that walks the user through all configurations necessary. The **first tab** is “Choose AMI.” An AMI is an Amazon Machine Image\footnote{http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html} and contains the software you will need to run your sandbox machine. I recommend choosing the latest stable Ubuntu Server release that is free-tier eligible. At the time of writing, this was ami-efd0428f, Ubuntu Server 16.04 LTS (HVM), SSD Volume Type (Figure \ref{fig:ch-01-latest-ubuntu.png}).

<include type="image" url="ch-01-latest-ubuntu.png"> 

###### Choose the latest stable Ubuntu Server release as AMI

![](../img/ch-01-latest-ubuntu.png)

</include>

#### Step 2: Choose Instance Type

The **second tab** is “Choose Instance Type.” In practice, I have found that the free tier, `t2.micro`  (Figure \ref{fig:ch-01-choose-type.png}), is sufficient for many applications. Furthermore, the instance type may always be changed later should the need present itself.

<include type="image" url="ch-01-choose-type.png"> 

###### Choose `t2.micro` for Instance Type

![](../img/ch-01-choose-type.png)

</include>

#### Step 3: Configure Instance Details

The **third tab**, “Configure Instance,” can be safely ignored.

#### Step 4: Add Storage

The **fourth tab** is “Add Storage.” This option is also specific to intended usage. It should be noted that Jupyter Docker images can take up more than 5GB of disk space in the local image cache. For this reason, it is recommended to raise the value from the default 8GB to 30GB. Furthermore, as noted on this tab:

<blockquote>
    Free tier eligible customers can get up to 30 GB of EBS General Purpose (SSD) or Magnetic storage.
</blockquote>

#### Step 5: Add Tags

The fifth tab, “Add Tags,” can be safely ignored.

#### Step 6: Configure Security Group

The sixth tab, “Configure Security Group,” is critical for the proper functioning of your systems. By default, this tab will be set to "Create a **new** security group". This will not work for us! Ultimately, we will be accessing our system via a web browser, which requires, at a minimum, that port 80 be open. We recommend simply using the default group, which will open our system on all ports. If greater security is required for your specific application, a more restrictive security group may be defined and used.

Select the "default" security group (Figure \ref{fig:ch-01-default-security-group.png}).

<include type="image" url="ch-01-default-security-group.png"> 

###### Choose the default security group

![](../img/ch-01-default-security-group.png)

</include>

$\square$ **Note:** You may receive a Warning stating, "Rules with source of 0.0.0.0/0 allow all IP addresses to access your instance. We recommend setting security group rules to allow access from known IP addresses only." This is expected and is okay.

#### Step 7: Review Instance Launch

Finally, click “Review and Launch.” Here, you see the specific configuration of the EC2 instance you will be creating. Verify that you are creating a `t2.micro` (Figure \ref{fig:ch-01-review-and-launch.png}, #2) running the latest free tier-eligible version of Ubuntu Server (Figure \ref{fig:ch-01-review-and-launch.png}, #1) and that it is available to all traffic (Figure \ref{fig:ch-01-review-and-launch.png}, #3), and then click the Launch button (Figure \ref{fig:ch-01-review-and-launch.png}, #4).

<include type="image" url="ch-01-review-and-launch.png"> 

###### Review and launch the new instance

![](../img/ch-01-review-and-launch.png)

</include>

#### Add an SSH Key

In a final confirmation step, you will see a modal titled “Select an existing key pair or create a new key pair.” Select the key pair you previously created. Check the box acknowledging access to that key pair and launch the instance (Figure \ref{fig:ch-01-add-key-pair.png}).

<include type="image" url="ch-01-add-key-pair.png"> 

###### Add a key pair to the instance

![](../img/ch-01-add-key-pair.png)

</include>

$\square$ **Note:** If this step is not done correctly, that is, if the correct key pair is not added to the launching instance, the instance will need to be terminated and a new instance will need to be launched. There is no way to add a key pair to a running instance.

You should see a notification that the instance is now running. Click the View Instances tab in the lower right corner to be taken to the EC2 Dashboard Instances pane, where you should see your new instance running.

#### Examining the Newly Launched Instance

Make note of the IP address of the new instance (Figure \ref{fig:ch-01-new-ip.png}).

<include type="image" url="ch-01-new-ip.png"> 

###### Note the IP address of the new instance

![](../img/ch-01-new-ip.png)

</include>

### Git and Github

As you work through this text, you will be developing a series of data science projects. Tracking software development work is typically done using version control software. One of the most popular version control tools is `git`. Additionally, it can be useful to use a version control hosting service as a remote backup for work being tracked using `git`. The remote service we will use is Github.com. In my experience, learners who are new to version control often confuse `git` and Github, so it bears repeating -- we will use `git` to track changes we make to our code and Github as a remote backup for these changes.

#### Configuring Github

We will assume that you have a Github account. With an account in place, you will need to configure an SSH connection between AWS and Github. This next part may cause some confusion. We are actually going to need a new SSH Keypair, this one associated with our AWS instance. This is because it is our AWS instance that will be connecting to Github, not our local machine (See Figure \ref{fig:ch-01-ssh-local-remote.png}).

<include type="image" url="ch-01-ssh-local-remote.png"> 

###### SSH Connections

![](../img/ch-01-ssh-local-remote.png)

</include>

#### Create a New Key Pair

In Listing \ref{lst:create-new-ssh-key-remote}, you will create a new key pair on your remote AWS instance. Before you can do so, in Listing \ref{lst:ssh-into-new-instance}, you connect to your new AWS instance using SSH. To do this, use the IP address you made note of in Figure \ref{fig:ch-01-new-ip.png}. Note that we use the username `ubuntu`, the default username for the Ubuntu 16 AMI provided by AWS.

<include type="listing" label="ssh-into-new-instance">
    
###### SSH into New Instance

```
$ ssh ubuntu@54.244.109.176
Welcome to Ubuntu 16.04.2 LTS (GNU/Linux 4.4.0-64-generic x86_64)
```

</include>

$\square$ **Note:** The first time you access your EC2 instance, you should see the following message: `The authenticity of host '54.244.109.176 (54.244.109.176)' can't be established ... Are you sure you want to continue connecting (yes/no)?` This is expected. Type `yes` and hit `<ENTER>` to accept.
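Typing the IP address each time you connect is error-prone. The OpenSSH client can read host aliases from `~/.ssh/config`, and the sketch below shows what such an entry might look like. The alias `sandbox` is a made-up name, the IP address is the one from the listing above, and the stanza is written to a scratch file rather than to your real configuration.

```shell
# Write a host alias to a scratch file; copy the stanza into
# ~/.ssh/config once the values match your own instance.
config_file=$(mktemp)

cat > "$config_file" <<'EOF'
# "sandbox" is a made-up alias; use any name you like.
Host sandbox
    HostName 54.244.109.176
    User ubuntu
    IdentityFile ~/.ssh/id_rsa
EOF

cat "$config_file"
```

With a stanza like this in `~/.ssh/config`, running `ssh sandbox` would be equivalent to `ssh ubuntu@54.244.109.176`.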

In Listing \ref{lst:create-new-ssh-key-remote}, you create a new SSH Keypair on your remote AWS instance. Again, during the creation of the SSH Keypair, you will be prompted three times. The first prompt asks where to save the SSH Keypair, defaulting to `.ssh/id_rsa` in your home directory. In Listing \ref{lst:create-new-ssh-key-remote}, you see that this is being done at `/home/ubuntu/.ssh/id_rsa`\footnote{This should be the same for everyone now, as you should be working on an AWS \texttt{t2.micro} running an Ubuntu system where the user's name is \texttt{ubuntu}}. The second and third prompts will ask for a passphrase to be added to the key. For our purposes, leaving this passphrase empty will be fine. In other words, the default options are preferable and you may simply hit `<ENTER>` three times.

<include type="listing" label="create-new-ssh-key-remote">
    
###### Create a new SSH Keypair

```
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ubuntu/.ssh/id_rsa):
Created directory '/home/ubuntu/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ubuntu/.ssh/id_rsa.
Your public key has been saved in /home/ubuntu/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:ZSpFpgSRgRqlQom8yVBG2dZo1tgkPQdrmUGgMXGDtRY
ubuntu@ip-172-31-43-19
The key's randomart image is:
+---[RSA 2048]----+
|o=XBE/*.o        |
|==+=O**O.        |
|=o++o *o. o      |
|o+ . . . +       |
|      . S        |
|       .         |
|                 |
|                 |
|                 |
+----[SHA256]-----+
```

</include>

As before, you can verify the SSH Keypair you just created by displaying the Public Key in your shell (Listing \ref{lst:cat-pub-key-remote}). Again, you use the `cat` command, which concatenates the contents of `id_rsa.pub` to the shell output.

<include type="listing" label="cat-pub-key-remote">
    
###### Display Public SSH Key

```
$ cat ~/.ssh/id_rsa.pub
ssh-rsa
AAAAB3NzaC1yc2EAAAADAQABAAABAQDQ896GUMgCMAIW79gwF3ojRjcUYCKUKc8b+q
iQlah2jtr7s0K4WRGjktOy3lCCHO+1UK/GrzY1Y4VxCKoKJDH3G9N5UzyGhlxa/2Ah
kKxzHht1knyh/mkVGqYUhuHpXfxUQAstCFrIdp3G0MDPiko2qeJcBF7JSv1lLMbIuM
XuVU/Mzq6BU+tEogScYytmLckyEe1j8RJ+e5nBURwmkgj3UAN1DzmU/lVwLlltEpmC
DlOel4yEXAw8yBwM3GwjahfiBThvBHpsc43HxWrkM8Yi/kdDnvsDZYxU4zhXZPsPab
UY/LfxEod9c6Sui5W8GtAfdi6krnqbzxrKt81Mradh ubuntu@ip-172-31-43-19
```

</include>

#### Add the Public Key to Github

Previously, you added your local SSH public key to your AWS account. Now, you will add your AWS SSH public key to your Github account\footnote{https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/}. First, access the **Settings** for your account by clicking the profile photo in the upper-right corner of any page on Github. Next, in the user settings sidebar, select **SSH and GPG keys**. On the SSH and GPG Keys page, click **New SSH key**. On the next page, give your key a descriptive title, e.g., "AWS Feb 2018", and then paste your AWS public key into the "Key" field. Finally, click **Add SSH key** and confirm your Github password, if prompted.

### Learning to read the Bash Prompt

During your work you will no doubt notice that an idle SSH connection may become disconnected and/or unresponsive. Should this happen, simply close the terminal session, launch a new one, and reconnect to the remote instance.
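One common way to reduce idle disconnects, offered here as an optional addition rather than a required step, is the OpenSSH client option `ServerAliveInterval`, which asks the client to send a periodic keepalive message to the server. The sketch below writes such a setting to a scratch file; the 60 second interval is an arbitrary choice.

```shell
# Write a keepalive setting to a scratch file; copy the stanza into
# ~/.ssh/config if you would like it to apply to your connections.
keepalive_file=$(mktemp)

cat > "$keepalive_file" <<'EOF'
# Apply to every host; 60 seconds is an arbitrary interval.
Host *
    ServerAliveInterval 60
EOF

cat "$keepalive_file"
```

The same option can be passed for a single connection on the command line, as in `ssh -o ServerAliveInterval=60 ubuntu@54.244.109.176`.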

The most important thing is that you are aware of which system your current shell session is connected to. Shell prompts are designed to relay this information to you immediately. If you are new to working with Bash, you may need to train yourself to be aware of the prompt when typing. Listing \ref{lst:default-AWS-prompt} shows the default AWS Bash prompt. The information contained is the username, `ubuntu`, and the private IP address of the AWS instance. **This is not the public address you use to connect**. What is useful about this is that we can immediately see that the user is `ubuntu`. This tells us we are connected to AWS.

<include type="listing" label="default-AWS-prompt">
    
###### The default AWS Bash prompt
```
ubuntu@ip-172-31-21-89:~$
```

</include>

Your local system will no doubt display something different (See Listing \ref{lst:other-prompt}). Again, the important thing is to take note of what is displayed by the prompt and to learn to associate that prompt with the correct system. As you become a more advanced Bash user, you may wish to personalize your prompt, but for now it is imperative that you learn to read the prompt in order to always know to which system you are connected.

<include type="listing" label="other-prompt">
    
###### A local Bash prompt
```
joshuas-macbook-pro:~$
```
</include>
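If you are ever unsure what the prompt is telling you, you can ask the system directly. The following sketch prints the current user and host name in a `user@host` form; the function name `where_am_i` is made up for this example.

```shell
# Print the current user and host in the same user@host form that
# a Bash prompt typically shows. The function name is illustrative.
where_am_i() {
    printf '%s@%s\n' "$(id -un)" "$(uname -n)"
}

where_am_i
```

On the AWS instance, this would be expected to print something like `ubuntu@ip-172-31-21-89`, while your local system will print your own username and machine name.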

### Test your SSH Connection to Github

Having added your AWS Public Key to your Github account, you should verify your SSH connection from your AWS instance. In Listing \ref{lst:verify-github-ssh}, we attempt to connect to Github via SSH. As before, we receive a message about the authenticity of the connection. Again, type `yes`, and continue. If successful, you will see a message telling you that you have successfully authenticated but that Github does not provide shell access.

<include type="listing" label="verify-github-ssh">
    
###### Verify Github SSH Key
```
ubuntu@ip-172-31-21-89:~$ ssh -T git@github.com
The authenticity of host 'github.com (IP ADDRESS)' can't be
established.
RSA key fingerprint is
16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48.
Are you sure you want to continue connecting (yes/no)? yes
Hi username! You've successfully authenticated, but GitHub does
not provide shell access.
```

</include>
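If you would like to script this check, one approach is to look for the success message in the response text. The sketch below does this against a sample response copied from the listing above, with `username` as a placeholder; the function name `github_auth_ok` is made up for this example.

```shell
# Decide whether the text returned by `ssh -T git@github.com`
# contains the success message shown in the listing above.
# The function name is illustrative.
github_auth_ok() {
    printf '%s\n' "$1" | grep -q "successfully authenticated"
}

# A sample of the expected response, with a placeholder username.
sample="Hi username! You've successfully authenticated, but GitHub does not provide shell access."

if github_auth_ok "$sample"; then
    echo "SSH key is recognized by Github."
else
    echo "Authentication failed; check the key added to Github."
fi
```

On your AWS instance, you could pass in the live response instead, for example `github_auth_ok "$(ssh -T git@github.com 2>&1)"`.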

### Docker

Having configured our SSH connections and provisioned a new AWS EC2 instance, it is time to get to the business of building your data science platform. To do this you will use the containerization platform Docker and its Docker Compose tool. While Docker is very easy to use, it can be difficult to understand for the uninitiated. In an earlier work, *Docker for Data Science*<sup>dockerfordatascience</sup>, I wrote:

dockerfordatascience: https://www.apress.com/us/book/9781484230114

> [Using Docker] we add a layer of complexity to our software, but in doing so gain the advantage of ensuring that our local development environment will be identical to any possible environment into which we would deploy the application.

It may be simpler, however, to simply think about using Docker as a way to manage a running process. Your system will be running two processes: an IPython shell and a PostgreSQL server. Were you to not use Docker, you would need to ensure that the AWS instance had all of the libraries required to run both of those processes (and keep those libraries up to date).

Instead, you will let Docker manage the processes using a container for each process. Each container will be run using a predefined image built using best practices and ready to run its respective process. The exchange is this: you will take on the cognitive burden of *understanding* what Docker is doing and Docker (and the Docker community) will take over the burden of making sure that your processes run.

#### Docker Compose

Docker Compose is a tool built for managing an application consisting of multiple containers. Using Docker Compose, it is possible to completely define an application using a simple text file. To make this conversation less abstract, let's have a look at the `docker-compose.yml` file you will use to define your first application (See Listing \ref{lst:docker-compose}).

<include type="listing" label="docker-compose">
    
###### Your Data Science Application

```
version: "3"
services:
  ipython_shell:
    image: jupyter/scipy-notebook
  database:
    image: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
```
    
</include>

That's it. This simple file completely defines a fully-functioning Data Science Application. In it, we define the two services we need: `ipython_shell` and `database`. These two services are defined using the `jupyter/scipy-notebook` and `postgres` images. When we launch the application, the images will be pulled from Docker Hub into our local image cache and then launched. The one other thing we do is create a data volume, `postgres_data`. We will use this as the data volume for our database server so that if for some reason we have to shut our system down, we do not lose our data. The data will exist on this volume independent of the services.
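To try the file out, you might write it to a project directory and ask Docker Compose to validate it. The sketch below assumes a scratch directory as the project location and only runs the validation if `docker-compose` is already installed. Note that the top-level volume name is written as a mapping key, with a trailing colon, which is the form Compose files expect.

```shell
# Write the Compose file to a scratch project directory.
project_dir=$(mktemp -d)

cat > "$project_dir/docker-compose.yml" <<'EOF'
version: "3"
services:
  ipython_shell:
    image: jupyter/scipy-notebook
  database:
    image: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
EOF

# If docker-compose is installed, ask it to validate the file.
if ! command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose not found; skipping validation."
elif (cd "$project_dir" && docker-compose config -q); then
    echo "Compose file is valid."
else
    echo "Compose file was rejected; check the indentation."
fi
```

Once Docker and Docker Compose are installed, running `docker-compose up -d` from the project directory should start both services in the background, and `docker-compose ps` should list them.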

$\square$ **Note:** Throughout this text, when discussing infrastructure, I may casually refer to containers, services, and processes. At the risk of annoying your local site reliability engineer, you may treat these terms as synonymous. Care should be taken, however, not to confuse services/containers/processes and images. An image defines a service, but a service should be thought of as a living and active thing. You may loosely compare the service-image relationship to the object-class relationship in Object-Oriented Programming. A service is a running container defined by an image, just like an object is an instance of a class that exists in memory.

#### Installing and Configuring Docker

Installing Docker on your AWS instance is a downright trivial process. It consists of running an install script that can be obtained from Docker and then adding your user to the Docker group. In Listing \ref{lst:install-docker}, we run these two commands. First, we download the install script from https://get.docker.com, then immediately pipe the script into the shell (`| sh`).

$\square$ **Note:** It is generally considered to be a significant security vulnerability to execute arbitrary code obtained from an unknown or untrusted source. For our purposes, the source (https://get.docker.com) is considered trustworthy, we are using SSL to perform the curl, and in practice this is the method I use to install Docker. Still, it may make the security minded more comfortable to `curl` the script, inspect it, and then run it.

<include type="listing" label="install-docker">
    
###### Install Docker via a Shell Script

```
$ curl -sSL https://get.docker.com/ | sh
# Executing docker install script, commit: 1d31602
+ sudo -E sh -c apt-get update -qq >/dev/null
...

Client:
 Version:   18.02.0-ce
 API version:   1.36
 Go version:    go1.9.3
 Git commit:    fc4de44
 Built: Wed Feb  7 21:16:33 2018
 OS/Arch:   linux/amd64
 Experimental:  false
 Orchestrator:  swarm

Server:
 Engine:
  Version:  18.02.0-ce
  API version:  1.36 (minimum version 1.12)
  Go version:   go1.9.3
  Git commit:   fc4de44
  Built:    Wed Feb  7 21:15:05 2018
  OS/Arch:  linux/amd64
  Experimental: false

...

```

</include>
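For readers who prefer to inspect the script before running it, the steps might look like the following sketch. The function name and the file name `get-docker.sh` are choices made for this example, and nothing is downloaded or installed until the function is called.

```shell
# Download the install script to a file instead of piping it
# straight into the shell. The function and file names are
# illustrative; nothing runs until the function is called.
review_then_install_docker() {
    script_file="get-docker.sh"
    # -f fail on HTTP errors, -sS quiet but still show errors,
    # -L follow redirects, -o write to the named file.
    curl -fsSL https://get.docker.com -o "$script_file" || return 1
    # Page through the script; quit the pager to continue.
    less "$script_file"
    # Run it only after you are satisfied with what you read.
    sh "$script_file"
}
```

On your AWS instance, you would then run `review_then_install_docker` to step through the download, the review, and the installation.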

When the script completes there is one last thing to be done. In Listing \ref{lst:add-to-docker-group}, you add the `ubuntu` user to the `docker` group. By default, the command line docker client will require sudo access in order to issue commands to the docker daemon. You can add the `ubuntu` user to the `docker` group in order to allow the `ubuntu` user to issue commands to docker without sudo.

<include type="listing" label="add-to-docker-group">
    
###### Add the Ubuntu User to the Docker Group

```
$ sudo usermod -aG docker ubuntu
```

</include>

Finally, in order to force the changes to take effect, you should disconnect from and reconnect to your remote system. You can achieve this by typing `exit` or `ctrl-d` and then reconnecting via ssh to your EC2 instance.
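After reconnecting, you can confirm that your session has picked up the new group. The sketch below checks the groups of the current session; the function name `session_in_group` is made up for this example.

```shell
# Succeed if the current session belongs to the named group.
# The function name is illustrative.
session_in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if session_in_group docker; then
    echo "You can run docker without sudo."
else
    echo "Not in the docker group yet; log out and back in."
fi
```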

#### Installing and Configuring Docker Compose

Recall that regardless of your local operating system, you are working on an AWS EC2 Instance running the Linux variant, Ubuntu. As such, `docker-compose` can be installed using the instructions provided here: https://github.com/docker/compose/releases, which are written specifically for Linux machines. As of the writing of this book, this consists of two steps.

In Listing \ref{lst:curl-docker-compose}, you use `curl` to retrieve the `docker-compose` binary from Github. As of the writing of this book, the latest version of `docker-compose` was `1.19.0`. You should retrieve the latest version from the above url.

<include type="listing" label="curl-docker-compose">
    
###### Retrieve `docker-compose` binary from Github

```
$ sudo curl -L https://github.com/docker/compose/releases/download/1.19.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
```
</include>

In Listing \ref{lst:chmod-docker-compose}, we use the `chmod`<sup>chmod</sup> utility to allow `docker-compose` to be executed (`+x`).

chmod: The unix "change mode" utility. I pronounce it "shmod".

<include type="listing" label="chmod-docker-compose">
    
###### Enable Docker Compose to be Executed

```
$ sudo chmod +x /usr/local/bin/docker-compose
```

</include>
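To see what `+x` changes, you can compare the permission string of a file before and after running `chmod`. The sketch below does this on a scratch file; the variable names are illustrative. The permission strings use the same format as the first column of the `ls -la` output shown earlier in this chapter.

```shell
# Create a scratch file and record its permission string.
demo_file=$(mktemp)
mode_before=$(ls -l "$demo_file" | cut -c1-10)

# Add the execute permission, then record the string again.
chmod +x "$demo_file"
mode_after=$(ls -l "$demo_file" | cut -c1-10)

echo "before: $mode_before"
echo "after:  $mode_after"
```

In the second string, an `x` should appear in the fourth position, indicating that the owner of the file may now execute it.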

Finally, in Listing \ref{lst:docker-compose-version}, we check the version of `docker-compose` against what we expect to have installed.

<include type="listing" label="docker-compose-version">

###### Check Docker-Compose Version

```
$ docker-compose -v
docker-compose version 1.19.0, build 9e633ef
```