<img src="https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png" style="float: left; margin: 15px;">
# AWS Intro
Week 9 | Day 2

### LEARNING OBJECTIVES
*After this lesson, you will be able to:*
- Explain what services AWS offers and which ones are relevant to data science
- Start and terminate an EC2 instance in the cloud
- Store and download data from an S3 bucket

### Topic Outline
- What is AWS?
- Getting set up with AWS
- EC2
- S3
- AWSCLI

## What is AWS?

<img src="http://i.giphy.com/3oEjHBa34dVLv0jnoc.gif">

## AWS

<img src="http://i.imgur.com/NSjFEWZ.png">
[Wikipedia](https://en.wikipedia.org/wiki/Amazon_Web_Services)

## Who uses it?

Notable clients include:
- Yelp
- Netflix
- NASA
- Pinterest
- Spotify
- The CIA

<img src="./images/outage.png">

## Why do they use it?

- Because it is far cheaper than rolling their own
- Offers them reliability and scalability

## Why did they build it?

<img src="http://i.imgur.com/vdiEwK8.png">
[Quora](https://www.quora.com/How-and-why-did-Amazon-get-into-the-cloud-computing-business)

## Additional Highlights

- Generates ~$10 billion in annual revenue
- Has over 30% market share
- Nearest competitor is MSFT with less than 10%
- Most common services are EC2, S3, RDS, EMR, Redshift

## EC2

The first service we will discover is _Elastic Compute Cloud_ or _EC2_. This service forms a central part of Amazon.com's cloud-computing platform by allowing users to rent virtual computers on which to run their own computer applications. Let's learn some terms first:

- **Instance**: a virtual machine hosted in the Amazon cloud, running the software we want.
- **Amazon Machine Image (AMI)**: a snapshot of a configured machine that we can use as a starting point to boot an instance. We can also save a running instance to a new AMI so that in the future we can boot a new machine with an identical configuration.
- **SSH Key**: the [pair of keys](https://en.wikipedia.org/wiki/Public-key_cryptography) needed to connect to an instance remotely. The private key is downloaded to our laptop, and the matching public key is automatically configured on the instance.

## How is EC2 different from using our laptop or a server?

The main conceptual shift from using a laptop to running an instance in the cloud is that we should think of computing power as ephemeral. We can request computing power when we need it, do a calculation, and release that power once we are done. Input and output are not kept on the machine but stored somewhere else in the cloud (hint: S3). In this sense, computing power is a commodity that we purchase and use in the amount and for the time that we need.

## Demo

Let's see how it works.

1) Log in to your account [here](https://aws.amazon.com/)

<img src="http://i.imgur.com/HPSwQGP.png">

Once you have signed in to the console, you should get to this page:<br>
<img src="http://i.imgur.com/owqrd0e.png">

## EC2 - Elastic Compute Cloud

We'll go ahead and follow the tutorial [here](https://aws.amazon.com/getting-started/tutorials/launch-a-virtual-machine/)

### Step 1: Launch an Amazon EC2 Instance

<img src="http://i.imgur.com/KTsbP71.png">

### Step 1 (continued): Download and configure your key


<img src="./images/key_pair.png">

Once you have downloaded the key, move it out of your Downloads folder. You can put it anywhere you want, as long as you leave it there and remember where it is! We're going to put it into our `.ssh` folder.

- in the terminal, navigate to your Downloads folder
- move the key pair to the .ssh folder:
    - `mv ~/Downloads/aws_keypair.pem ~/.ssh/aws_keypair.pem`
- change permission for the key:
    - `chmod 400 ~/.ssh/aws_keypair.pem`
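The two commands above can be wrapped into one small helper. This is only a sketch: the function name `install_key` and its optional target-directory argument are made up for illustration. The `chmod 400` step matters because `ssh` refuses to use a private key file that other users can read.

```bash
# Move a downloaded key pair into ~/.ssh (or another directory) and make it
# readable only by its owner.
# Usage: install_key <path-to-downloaded-pem> [target-dir]
# NOTE: install_key is a hypothetical helper, not part of any AWS tool.
install_key() {
    src="$1"
    dest_dir="${2:-$HOME/.ssh}"
    mkdir -p "$dest_dir" && chmod 700 "$dest_dir"
    mv "$src" "$dest_dir/" || return 1
    key="$dest_dir/$(basename "$src")"
    chmod 400 "$key"
    # Print the permission string so we can check it by eye
    ls -l "$key" | cut -c1-10
}
```

Calling `install_key ~/Downloads/aws_keypair.pem` should print `-r--------`, meaning read-only for the owner and no access for anyone else.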
    

### Step 2: Configure your Instance

<img src="http://i.imgur.com/ASAfxND.png">

Notice that we can see a lot of information about the instance, in particular:

- its DNS name and IP address
- the type of instance
- the key necessary to connect

**Check:** What is an IP address?

### Step 3: Connect to your Instance

The instructions are all in the tutorial but let's do this together.

Copy the connection string into a terminal window, but update it to reference the location of your key:

`ssh -i ~/.ssh/aws_keypair.pem ec2-user@ec2-34-207-178-209.compute-1.amazonaws.com`
<br><br>


<img src="http://i.imgur.com/acGZQUe.png">

Congratulations!! You've just connected to an instance in the cloud!! How awesome is that!!

Try launching python from the shell and do something with it.

<img src="http://i.imgur.com/39xhith.png">

### Step 4: Terminate Your Instance

Once you're done with your calculation and you no longer need the instance, you can go ahead and terminate it. NB: this will kill the instance and it will no longer be available to you. You should make sure you have saved all the data and the code you needed somewhere else.

<img src="http://i.imgur.com/z4GhG7S.png">

Unless you are using your machine to serve a live application (like a web app or an API), it's very important that you terminate your instance when you are not using it so that you don't incur unnecessary costs.

<img src="http://i.imgur.com/kA8Mxvz.png">

### Additional remarks

We've seen the simplest way to launch and terminate an instance in the cloud.

There's a lot more to it that you'll discover in time. Here are some pointers you may find useful:

- [Pricing](https://aws.amazon.com/ec2/pricing/): EC2 pricing depends on the type of instance and on the chosen region. Make sure you understand the cost of the instance you request in order to avoid surprise bills. If you're in doubt, you can use the convenient [Cost Calculator](http://calculator.s3.amazonaws.com/index.html) to estimate your costs.

<img src="http://i.imgur.com/HYS2lTe.png">

- [Spot instances](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html): spot instances are even more ephemeral than normal instances. They only live as long as the current spot price stays below the maximum price you agreed to pay. They are a great way to save money when using more powerful machines. <br><br>
- [AMIs](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html): AMIs are snapshots of our machine. They are great if we installed a lot of software on our machine and want to save that particular configuration.

<img src="http://i.imgur.com/JAeDqTO.png">

**Check:** can you give an example of when AMIs could be useful?

- [Security Groups](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html): security groups act as a firewall for our instance. They control which ports are open to the services running on our machine.
**Check:** can you give an example of a practical case?


- [Elastic IPs](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html): we can rent a fixed IP address and associate it with our instance. This way we can configure tools to always connect to the same address, independently of which machine is behind it.
**Check:** can you give a practical use case?
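To make the last two pointers concrete, here is a rough sketch using the AWS CLI (introduced later in this lesson). The function names `open_ssh` and `attach_elastic_ip` are invented for this example, and the IDs you pass in are placeholders for your own. It assumes the CLI is already installed and configured.

```bash
# Allow inbound SSH (TCP port 22) to a security group from one IP address.
# Usage: open_ssh <security-group-id> <your-public-ip>
# NOTE: open_ssh and attach_elastic_ip are hypothetical helper names.
open_ssh() {
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port 22 \
        --cidr "$2/32"
}

# Allocate an Elastic IP and attach it to an instance.
# Usage: attach_elastic_ip <instance-id>
attach_elastic_ip() {
    alloc_id=$(aws ec2 allocate-address --domain vpc \
        --query AllocationId --output text) || return 1
    aws ec2 associate-address --instance-id "$1" --allocation-id "$alloc_id"
}
```

Restricting the rule to a single address (`/32`) rather than opening the port to everyone is a deliberate choice here: only your own machine can reach the SSH port.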


## S3 - Simple Storage Service

We have learned how to start and stop an instance in the cloud. That's great, because it gives us "computing power as a service". Now let's learn how we can store data in the cloud too.

Amazon S3 (Simple Storage Service) is an online file storage service. It provides storage through web service interfaces (REST, SOAP, and BitTorrent) using an _object storage architecture_. According to Amazon, S3's design aims to provide scalability, high availability, and low latency at commodity costs.

Objects are organized into buckets (each owned by an Amazon Web Services account), and identified within each bucket by a unique, user-assigned key. Buckets and objects can be created, listed, and retrieved using either a REST-style HTTP interface or a SOAP interface. Additionally, objects can be downloaded using the HTTP GET interface and the BitTorrent protocol.
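The bucket/key structure is easiest to see in an S3 address of the form `s3://<bucket>/<key>`. The sketch below splits such an address into its two parts using plain shell string operations. The helper names `s3_bucket` and `s3_key` and the example address are made up for illustration. Note that anything that looks like a folder path is simply part of the key.

```bash
# Split an S3 URI of the form s3://<bucket>/<key> into its two parts.
# NOTE: s3_bucket and s3_key are hypothetical helpers, not AWS commands.
s3_bucket() {
    rest="${1#s3://}"      # drop the s3:// scheme
    echo "${rest%%/*}"     # keep everything before the first slash
}

s3_key() {
    rest="${1#s3://}"
    case "$rest" in
        */*) echo "${rest#*/}" ;;  # keep everything after the first slash
        *)   echo "" ;;            # bucket only, no key
    esac
}
```

For example, `s3_bucket s3://my-data/raw/2017/iris.csv` prints `my-data`, and `s3_key` on the same address prints `raw/2017/iris.csv`.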

<a name="ind-practice"></a>
## S3 - Simple Storage Service

In pairs: go ahead and follow the [tutorial for S3](http://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html).


**Check:** what's a practical case you can envision using S3 for?

<a name="demo"></a>
## AWSCLI - AWS Command Line

Wow, great! We have learned to request and access computing power and storage as a service through AWS. Wouldn't it be nice to be able to do this in a quick way from the command line? Yeah! Let's introduce AWSCLI!

[AWSCLI](https://github.com/aws/aws-cli) is a unified command line interface to Amazon Web Services. It allows us to control most AWS services from the same command line interface.

**Check:** Why is that useful? Why is that powerful? Can you give some examples?


<a name="ind-practice"></a>
## AWSCLI - AWS Command Line

Let's go ahead and follow the [tutorial for AWSCLI](https://aws.amazon.com/getting-started/tutorials/backup-to-s3-cli/)

### Steps to complete

#### Step 1: Create an AWS IAM User

In order to use the command line we will have to configure a set of access credentials on our laptop. It's very important to create a separate identity with limited permissions instead of using our root account credentials.

- Add a user
- Select "Programmatic access"
- On the next screen, select "Attach existing policies directly"
- Check the box for "AdministratorAccess". Note that this policy grants full access to the account, so outside of this exercise prefer a policy limited to the services you need
- Once you've created the user, you will see a screen that allows you to download your credentials as a CSV file. You should download these, and keep them secure!

#### Step 2: Install and Configure the AWS CLI


Installation instructions are in the [AWS CLI user guide](http://docs.aws.amazon.com/cli/latest/userguide/installing.html).

Note that one of the methods is to simply use `pip` to install the AWSCLI on your computer (EC2 instances launched from Amazon Linux AMIs have it pre-installed).

`pip install --upgrade --user awscli`

#### Step 2 (continued): Configure the AWS CLI
It's explained [here](http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-quick-configuration)

**Note:** If you already have AWSCLI configured and you would like to have multiple roles, you can do that as explained [here](http://docs.aws.amazon.com/cli/latest/userguide/cli-roles.html).
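Once the access key and secret have been entered with `aws configure`, a quick sanity check can save debugging time later. The sketch below sets a region and output format for a named profile and then asks AWS which identity the credentials belong to. The function name `setup_profile` and the profile name you pass in are assumptions made for this example.

```bash
# Set region and output format for a profile without the interactive prompts,
# then confirm which identity the stored credentials belong to.
# Usage: setup_profile <profile-name> <region>
# NOTE: setup_profile is a hypothetical helper name.
setup_profile() {
    aws configure set region "$2" --profile "$1"
    aws configure set output json --profile "$1"
    # Show where each setting comes from (config file, environment, ...)
    aws configure list --profile "$1"
    # Fails with an error if the credentials are missing or invalid
    aws sts get-caller-identity --profile "$1"
}
```

If the last command prints an account ID and user ARN, the CLI is ready to use.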

#### Step 3: Using the AWS CLI with Amazon S3

Now you can go ahead and copy files back and forth from your command line, without ever having to click on the web interface. How cool is that?


Here's a [Cheat Sheet](https://github.com/toddm92/aws/wiki/AWS-CLI-Cheat-Sheet) for the AWSCLI.

<a name="guided_practice"></a>
## EC2 from the command line (15 min)

Empowered with a well-configured AWSCLI, we can now start and stop EC2 instances from the command line! Let's use it to check spot prices and spin up an instance.

#### 1. Check prices

Let's check the price for an `m3.medium` spot instance:

```bash
# Show spot prices since the start of the current hour (UTC)
aws ec2 describe-spot-price-history \
    --start-time $(date -u +"%Y%m%dT%H0000") \
    --product-descriptions "Linux/UNIX" \
    --instance-types "m3.medium" \
    --region us-west-2 \
    --output table
```

**Note:** you may have to set the region to the same region you used when opening your account.

<img src="http://i.imgur.com/kZZsuEV.png">

#### 2. List all your buckets

`aws s3 ls`

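#### 3. Launch an instance

You're now ready to launch an instance. Note that `run-instances`, as written here, starts a regular on-demand instance; an actual spot request goes through `aws ec2 request-spot-instances` instead.

`aws ec2 run-instances --image-id ami-f4cc1de2 --count 1 --instance-type t2.micro --key-name <YOUR-KEY>`

We can get the AMI ID from the list of AMIs (this one is an Ubuntu image). AMI IDs are specific to a region, so pick one that exists in yours.

If it works, this returns a JSON description of the instance.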
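For comparison, an explicit spot request uses a different command, `aws ec2 request-spot-instances`, which takes a maximum price and a launch specification. The following is a minimal sketch, not a production recipe: the function name `request_spot` is invented, the launch specification contains only three fields, and all argument values are placeholders you would supply yourself.

```bash
# Request a one-time spot instance at a maximum hourly price.
# Usage: request_spot <ami-id> <instance-type> <key-name> <max-price>
# NOTE: request_spot is a hypothetical helper name.
request_spot() {
    # Write a minimal launch specification to a temporary JSON file
    spec=$(mktemp)
    printf '{"ImageId": "%s", "InstanceType": "%s", "KeyName": "%s"}\n' \
        "$1" "$2" "$3" > "$spec"
    aws ec2 request-spot-instances \
        --spot-price "$4" \
        --instance-count 1 \
        --type one-time \
        --launch-specification "file://$spec"
    status=$?
    rm -f "$spec"
    return $status
}
```

A call such as `request_spot <AMI-ID> m3.medium <YOUR-KEY> 0.05` would ask for one `m3.medium` at no more than $0.05 per hour. The request is only fulfilled while the spot price stays below that ceiling.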

You can check that the instance request has been opened:

<img src="http://i.imgur.com/5IA0r6g.png">

You can also check from the command line:
- `aws ec2 describe-instances`

Let's retrieve the DNS name:

```bash
aws ec2 describe-instances --output json | grep PublicDnsName | head -n 1
```
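The `grep` approach returns the first match it finds, which may belong to an instance that is no longer running. As an alternative sketch, the CLI's own `--filters` and `--query` options can narrow the output to running instances and extract just the DNS names. The function name `running_dns_names` is made up for this example.

```bash
# Print the public DNS names of all running instances, one per line.
# NOTE: running_dns_names is a hypothetical helper name.
running_dns_names() {
    aws ec2 describe-instances \
        --filters "Name=instance-state-name,Values=running" \
        --query "Reservations[].Instances[].PublicDnsName" \
        --output text | tr '\t' '\n'
}
```

With `--output text` the names come back separated by tabs, so the final `tr` puts each one on its own line.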

#### 4. Connect to the instance

```bash
# The default user is `ubuntu` on Ubuntu AMIs and `ec2-user` on Amazon Linux AMIs
ssh -i ~/.ssh/<YOUR-KEY>.pem ubuntu@<YOUR INSTANCE DNS>
```

#### 5. Terminate the instance

Let's retrieve the instance id and kill it:

```bash
aws ec2 describe-instances --output json | grep InstanceId

aws ec2 terminate-instances --instance-ids <INSTANCEID>
```
<img src="http://i.imgur.com/Aype3xS.png">
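Termination is not instantaneous, so a script that needs to be sure the instance is gone can wait for it. The sketch below uses the CLI's `wait` subcommand. The function name `terminate_and_wait` is an assumption made for this example.

```bash
# Terminate an instance and block until AWS reports it as terminated.
# Usage: terminate_and_wait <instance-id>
# NOTE: terminate_and_wait is a hypothetical helper name.
terminate_and_wait() {
    aws ec2 terminate-instances --instance-ids "$1" || return 1
    # Polls the instance state until it becomes "terminated"
    aws ec2 wait instance-terminated --instance-ids "$1" || return 1
    # Print the final state as confirmation
    aws ec2 describe-instances --instance-ids "$1" \
        --query "Reservations[].Instances[].State.Name" --output text
}
```

This is handy at the end of a batch job: the script only exits once the machine has actually stopped costing money.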

## Moving files

We can use AWSCLI to move files between our local computer and S3, or between an EC2 instance and S3.



## Copying from your local computer to S3

Let's first create a file<br>
`touch test.txt`

Now let's copy it to our S3 bucket<br>
`aws s3 cp test.txt s3://<YOUR-BUCKET-NAME>`

Check that it's there:<br>
`aws s3 ls s3://<YOUR-BUCKET-NAME>`

## Copying a file from S3 to your local computer

>How do you think we do this?

`aws s3 cp s3://<YOUR-BUCKET-NAME>/test.txt ~/`

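## Copying a file from EC2 to S3

Works the same way! The AWS CLI is pre-installed on Amazon Linux instances, so you can use it as soon as it has credentials, either from running `aws configure` on the instance or from an IAM role attached to the instance.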
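Since an instance's disk disappears when the instance is terminated, a common pattern is to push results to S3 before shutting down. Here is a small sketch of that idea using `aws s3 sync`, which copies a whole directory and skips files that are already up to date. The function name `backup_results` and the `results/<run-name>/` key layout are assumptions chosen for this example.

```bash
# Upload a results directory to S3 under a per-run prefix, then list it.
# Usage: backup_results <local-dir> <bucket> <run-name>
# NOTE: backup_results and the results/<run-name>/ layout are hypothetical.
backup_results() {
    aws s3 sync "$1" "s3://$2/results/$3/" || return 1
    aws s3 ls "s3://$2/results/$3/"
}
```

Running something like `backup_results ./output <YOUR-BUCKET-NAME> run-01` on the instance, right before terminating it, keeps the output safe in S3.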

<a name="conclusion"></a>
## Conclusion (5 min)

In this lesson we have learned about two fundamental Amazon web services: Elastic Compute Cloud and Simple Storage Service. These two services are so common because they provide on-demand computation and storage at a very affordable cost.

We have learned how to use them both from the web interface and from the command line.

### ADDITIONAL RESOURCES

- [EC2](https://aws.amazon.com/ec2/?nc2=h_m1)
- [S3](https://aws.amazon.com/s3/?nc2=h_m1)
- [Tutorials](https://aws.amazon.com/getting-started/tutorials/)
- [AWS CLI Tutorial](http://www.joyofdata.de/blog/guide-to-aws-ec2-on-cli/)