# Amazon Web Services (AWS)

### Learning Objectives

- Describe core AWS services & concepts
- Configure your laptop to use AWS
- Use an SSH key to access EC2 instances
- Launch & access EC2
- Access S3

### AWS Storage + Execution

What are the primary services that AWS offers?

Name | Full Name | Service
---|---|---
S3 | Simple Storage Service | Storage
EC2 | Elastic Compute Cloud | Execution
EBS | Elastic Block Store | Storage attached to EC2 instances

### Pop Quiz

<details>
<summary>Q: I want to store some video files on the web. Which Amazon service should I use?</summary>
A: S3
</details>

<details>
<summary>Q: I just created an iPhone app which needs to store user profiles on the web somewhere. Which Amazon service should I use?</summary>
A: S3
</details>

<details>
<summary>Q: I want to create a web application that uses JavaScript in the backend along with a MongoDB database. Which Amazon service should I use?</summary>
A: S3 + EC2 + EBS
</details>

### S3 vs. EBS

What is the difference between S3 and EBS? Why would I use one versus the other?

Feature | S3 | EBS
---|---|---
Can be accessed from | Anywhere on the web;<br/>any EC2 instance | Specific availability zone;<br/>EC2 instance attached to it
Pricing | Less expensive;<br/>Storage (3¢/GB);<br/>Use (1¢/10,000 requests) | More expensive;<br/>Storage (3¢/GB) [+ IOPS]
Latency | Higher | Lower
Throughput | Usually more | Usually less
Performance | Slightly worse | Slightly better
Max volume size | Unlimited | 16 TB
Max file size | 5 TB | 16 TB
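
To make the S3 pricing row concrete, here is a small worked example. It is only a sketch: the two constants are copied from the table above, the "per month" billing period is an assumption, and real AWS prices vary by region, storage class, and date. The function name `s3_monthly_cost` is made up for this illustration.

```python
# Prices copied from the table above. Treat them as illustrative
# constants, not as current AWS pricing.
S3_STORAGE_USD_PER_GB = 0.03      # assumed to be per GB per month
S3_USD_PER_10K_REQUESTS = 0.01


def s3_monthly_cost(stored_gb, requests):
    """Estimate a monthly S3 bill from storage size and request count."""
    storage_cost = stored_gb * S3_STORAGE_USD_PER_GB
    request_cost = (requests / 10_000) * S3_USD_PER_10K_REQUESTS
    return storage_cost + request_cost


# 100 GB stored and 1,000,000 requests:
# storage = 100 * 0.03 = 3.00, requests = 100 * 0.01 = 1.00
print(round(s3_monthly_cost(100, 1_000_000), 2))
```

With these numbers, storage dominates the bill unless the request count is very high.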

### Pop Quiz

<details>
<summary>Q: What is latency?</summary>
A: Latency is the time it takes between making a request and the start of a response.
</details>

<details>
<summary>Q: Which is better? Higher latency or lower?</summary>
A: Lower is better.
</details>

# Leveraging S3

### Buckets and Files

What is a bucket?
- A bucket is a container for files.
- Think of a bucket as a logical grouping of files like a sub-domain.
- A bucket can contain an arbitrary number of files.

How large can a file in a bucket be?
- A file in a bucket can be up to 5 TB.

### Bucket Names

What are best practices on naming buckets?
- Bucket names must be unique across all of S3.
- Bucket names must be at least 3 and no more than 63 characters long.
- Bucket names must be a series of one or more labels, separated by a single period. 
- Bucket names can contain lowercase letters, numbers, and hyphens. 
- Each label must start and end with a lowercase letter or a number.
- Bucket names must _not_ be formatted as an IP address (e.g., 192.168.5.4).

What are some examples of valid bucket names?
- `myawsbucket`
- `my.aws.bucket`
- `myawsbucket.1`

What are some examples of invalid bucket names? 
- `.myawsbucket`
- `myawsbucket.`
- `my..examplebucket`

### Pop Quiz

<details>
<summary>Q: Why are these bucket names invalid?</summary>
A: Bucket names cannot start or end with a period, and they cannot have multiple periods next to each other.
</details>
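
The naming rules listed above can be checked mechanically. The following is a minimal sketch that encodes only the rules from this section (length, labels, allowed characters, no IP-address format). `is_valid_bucket_name` is a hypothetical helper, not part of boto3, and AWS may enforce rules that are not covered here.

```python
import re

# One label: starts and ends with a lowercase letter or digit,
# with lowercase letters, digits, or hyphens in between.
_LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
# A name is one or more labels separated by single periods.
_NAME_RE = re.compile(rf"{_LABEL}(?:\.{_LABEL})*")
# Four dot-separated groups of digits look like an IP address.
_IP_RE = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")


def is_valid_bucket_name(name):
    """Return True if name satisfies the rules listed in this section."""
    if not 3 <= len(name) <= 63:
        return False
    if _IP_RE.fullmatch(name):
        return False
    return _NAME_RE.fullmatch(name) is not None


for candidate in ["myawsbucket", "my.aws.bucket", ".myawsbucket", "my..examplebucket"]:
    print(candidate, is_valid_bucket_name(candidate))
```

Checking a name locally before calling S3 gives a clearer error than a failed create request.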

# Python - AWS Integration with boto3   

`$ pip install boto3`  

http://boto3.readthedocs.io/en/latest/guide/quickstart.html

In [None]:
!pip install boto3

> Boto is the Amazon Web Services (AWS) SDK for Python, which allows Python developers to write software that makes use of Amazon services like S3 and EC2. Boto provides an easy to use, object-oriented API as well as low-level direct service access.

### Step 0: Credentials

Your AWS access keys should be in your `~/.bashrc` (Linux) or `~/.bash_profile` (macOS)
  
```
# AWS
export AWS_ACCESS_KEY_ID=AXXXXXXXXXXXXXXXXXXXXA
export AWS_SECRET_ACCESS_KEY=YXXXXXXXXXXXXXXXXXXXXXXXXXXXYY

```

If you are using the AWS CLI, then `$ aws configure` should have put them in `~/.aws/credentials`
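
Before creating a connection, it can help to confirm that one of the two locations above actually holds your keys. The sketch below uses only the standard library. `find_aws_credentials` is a hypothetical helper written for this note, and it checks only the `default` profile. boto3 does its own lookup and supports more sources than these two.

```python
import configparser
import os
from pathlib import Path


def find_aws_credentials(environ=None, credentials_path=None):
    """Return 'environment', 'credentials file', or None if no keys are found."""
    environ = os.environ if environ is None else environ
    if environ.get("AWS_ACCESS_KEY_ID") and environ.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"

    # Fall back to the file that `aws configure` writes.
    if credentials_path is None:
        credentials_path = Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(credentials_path)  # a missing file is silently skipped
    if parser.has_option("default", "aws_access_key_id"):
        return "credentials file"
    return None


print(find_aws_credentials())
```

If this prints `None`, revisit Step 0 before moving on.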

### Step 1: Create a Connection to S3

In [1]:
# Boto 3
import boto3
boto3_connection = boto3.resource('s3')

Check contents of existing buckets

In [2]:
# Boto 3
def print_s3_contents_boto3(connection):
    for bucket in connection.buckets.all():
        for key in bucket.objects.all():
            print(key.key)


In [3]:
print_s3_contents_boto3(boto3_connection)

cancer.csv
hello-remote.txt


On a new account there is most likely nothing there yet. The output above comes from an account that already has files.

### Step 2: Create a Bucket

In [4]:
import os
username = os.environ['USER'].lower()  # bucket names must be lowercase
bucket_name = username + "-terminal-from-boto3"
boto3_connection.create_bucket(Bucket=bucket_name)

s3.Bucket(name='jessicacurley-terminal-from-boto3')
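
The call above works as written when your default region is `us-east-1`. In other regions, S3 expects a `CreateBucketConfiguration` with a `LocationConstraint`. The sketch below builds the keyword arguments for either case. `create_bucket_kwargs` is a hypothetical helper, and the region names are only examples.

```python
def create_bucket_kwargs(bucket_name, region="us-east-1"):
    """Build keyword arguments for create_bucket in the given region."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a
    # LocationConstraint. Every other region has to be named explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("example-bucket", "us-west-2"))

# Usage with boto3 (not executed here, needs credentials):
# s3 = boto3.resource("s3", region_name="us-west-2")
# s3.create_bucket(**create_bucket_kwargs(bucket_name, "us-west-2"))
```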

### Step 3: Make a file

In [5]:
# make a file (could use an existing file, but make one quick from the command line)
!echo 'Hello world from boto3!' > hello-boto.txt

### Step 4: Upload the file to S3

In [None]:
s3_client = boto3.client('s3')
s3_client.upload_file('hello-boto.txt', bucket_name, 'hello-remote.txt')

In [None]:
# did it make it?
print_s3_contents_boto3(boto3_connection)

### Step 5: Download a file from S3

In [None]:
s3_client.download_file(bucket_name, 'hello-remote.txt', 'hello-back-again.txt')
with open('hello-back-again.txt') as f:
    print(f.read())

### Step 6: Demo how to upload & download a file with subfolders using a script
`$ python example_upload_download_image.py`
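
The script itself is not reproduced in these notes. As a rough idea of how "subfolders" work, S3 has no real directories: a key such as `photos/images/cat.png` is a single name that happens to contain slashes. The sketch below maps local files to such keys. `keys_for_directory` and the `photos/` prefix are made up for illustration and are not taken from `example_upload_download_image.py`.

```python
from pathlib import Path


def keys_for_directory(local_dir, prefix=""):
    """Map each file under local_dir to an S3 key mirroring its relative path."""
    root = Path(local_dir)
    mapping = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # as_posix() gives forward slashes on every operating system
            relative = path.relative_to(root).as_posix()
            mapping[str(path)] = f"{prefix}{relative}"
    return mapping


# Usage with the client from Step 4 (not executed here, needs credentials):
# for local_path, key in keys_for_directory("mydir", prefix="photos/").items():
#     s3_client.upload_file(local_path, bucket_name, key)
```

When downloading a key that contains slashes, create the local parent directory first (for example with `os.makedirs(..., exist_ok=True)`), because `download_file` writes to the exact path it is given.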

# Now Go Forth and Conquer the Individual assignment.
Notes below are for reference only.

### Optional: Control Access to a Bucket and its contents

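By default, our buckets are private, but you can set them to be public. Note that buckets created with current AWS defaults have ACLs disabled and Block Public Access turned on, so the ACL calls below fail unless those bucket settings are changed first.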

In [None]:
# let's find out...
def s3_url(bucket, key):
    return 'http://s3.amazonaws.com/{}/{}'.format(bucket, key)

print(s3_url(bucket_name, 'hello-remote.txt'))
print("I can, but the public couldn't.")

In [None]:
# Set so the public can read it, by changing the Access Control List (ACL)
bucket = boto3_connection.Bucket(bucket_name)
obj = boto3_connection.Object(bucket_name, 'hello-remote.txt')
bucket.Acl().put(ACL='public-read')
obj.Acl().put(ACL='public-read')

In [None]:
# Now let's try again
print(s3_url(bucket_name, 'hello-remote.txt'))

### More on Access Control

Q: I want to access my S3 file from a web browser without giving my access and secret keys. How can I open up access to the file to anyone?
- You can set up Access Control Lists (ACLs) at the level of the bucket or at the level of the individual objects in the bucket (folders, files).

Q: What are the different ACL policies?

ACL Policy | Meaning
---|---
`private` | No one else besides owner has any access rights.
`public-read` | Everyone has read access.
`public-read-write` | Everyone has read/write access.
`authenticated-read` | Registered Amazon S3 users have read access.

Q: What do `read` and `write` mean for buckets and files?
- Read access to a file lets you read the file.
- Read access to a bucket or folder lets you see the names of the files inside it.

#### Pop Quiz

<details>
<summary>Q: If a bucket is `private` and a file inside it is `public-read` can I view it through a web browser?</summary>
A: Yes. Access to the file is only determined by its ACL policy.
</details>

<details>
<summary>Q: If a bucket is `public-read` and a file inside it is `private` can I view the file through a web browser?</summary>
A: No, you cannot. However, if you access the URL for the bucket you will see the file listed.
</details>
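
The two quiz answers boil down to a simple rule: reading a file depends on the file's ACL, and listing a bucket depends on the bucket's ACL. The toy model below encodes just that rule. It is a simplification for study purposes: the function names are made up, and it ignores everything else that can affect access in practice, such as bucket policies and Block Public Access settings.

```python
# Canned ACLs from the table above that grant read access to everyone.
PUBLIC_READ_ACLS = {"public-read", "public-read-write"}


def public_can_read_file(file_acl):
    """Anonymous users can fetch a file only if the file's own ACL allows it."""
    return file_acl in PUBLIC_READ_ACLS


def public_can_list_bucket(bucket_acl):
    """Anonymous users can list file names only if the bucket's ACL allows it."""
    return bucket_acl in PUBLIC_READ_ACLS


# Quiz 1: private bucket, public-read file
print(public_can_read_file("public-read"), public_can_list_bucket("private"))
# Quiz 2: public-read bucket, private file
print(public_can_read_file("private"), public_can_list_bucket("public-read"))
```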

## Amazon EC2

### Regions

Q: What are *AWS Regions*?
- AWS is hosted in different geographic locations worldwide. 
- For example, there are 4 regions in the US.

Q: What are the regions in the US?

Region | Name | Location 
---|---|--- 
us-east-1 | US East | N. Virginia
us-east-2 | US East 2 | Ohio
us-west-1 | US West | N. California
us-west-2 | US West 2 | Oregon

Q: How should I choose a region?
- N. Virginia or `us-east-1` is the default region for EC2.
- Using a region other than N. Virginia requires additional configuration.
- If you are not sure choose N. Virginia.

### Availability Zones

Q: What are *AWS Availability Zones*?

- Regions are divided into isolated availability zones for fault tolerance.
- Availability zones run on physically separate hardware and infrastructure.
- They do not share hardware, generators, or cooling equipment. 
- Availability zone names such as `us-east-1a` are mapped to physical zones independently for each AWS account.


<details>
<summary>Q: Is it possible for two separate users to coordinate and land on the same availability zone?</summary>
A: Not by name alone. Because zone names are mapped per account, `us-east-1a` for one user can be a different physical zone than `us-east-1a` for another user. To coordinate, the two users would need to compare AZ IDs (for example, `use1-az1`), which refer to the same physical zone in every account.
</details>

### Connecting to EC2

Q: How can I connect to an EC2 instance?
- Log in to the AWS console.
- Navigate: EC2 > Launch Instance > Community AMIs > Search community AMIs > Look for an 'Ubuntu' and 'anaconda3' AMI.
- View the instance and get its Public DNS.
    - This should look something like `ec2-34-229-96-155.compute-1.amazonaws.com`.
- Use this command to connect to it.
    - `ssh -i ~/.ssh/keypair.pem user@domain`
    - Here is an example:
        - `ssh -i ~/.ssh/keypair.pem ubuntu@ec2-34-229-96-155.compute-1.amazonaws.com`
- Make sure you replace the Public DNS value above with the value for your instance.
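
If you want to launch the connection from Python (for example, from a notebook), the `ssh` command above can be assembled as an argument list. This is a sketch with two hypothetical helpers. The permission step is an addition not covered in the steps above: `ssh` typically refuses a private key that other users can read, so the key is restricted to owner read-only, the same effect as `chmod 400`.

```python
import os
import shlex
import stat


def restrict_key_permissions(key_path):
    """Make the key readable by its owner only (same effect as chmod 400)."""
    os.chmod(os.path.expanduser(key_path), stat.S_IRUSR)


def build_ssh_command(key_path, user, host):
    """Return the ssh command as a list suitable for subprocess.run."""
    return ["ssh", "-i", os.path.expanduser(key_path), f"{user}@{host}"]


cmd = build_ssh_command(
    "~/.ssh/keypair.pem", "ubuntu", "ec2-34-229-96-155.compute-1.amazonaws.com"
)
print(shlex.join(cmd))  # copy-paste friendly form of the command

# To actually connect (not executed here):
# restrict_key_permissions("~/.ssh/keypair.pem")
# subprocess.run(cmd)
```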

### Copying Files to EC2

Q: How can I copy files to the EC2 instance?
- To copy a file `myfile.txt` to EC2, use a command like this.
    - `scp -i ~/.ssh/keypair.pem myfile.txt user@domain:`
- To copy a directory `mydir` recursively to EC2, use a command like this. 
    - `scp -i ~/.ssh/keypair.pem -r mydir user@domain:`
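
The two `scp` commands above differ only in the `-r` flag, which a small helper can capture. This is a sketch. `build_scp_command` is a hypothetical name, and the empty `remote_path` default means "the remote user's home directory", matching the trailing `:` in the commands above.

```python
import os


def build_scp_command(key_path, source, user, host, remote_path="", recursive=False):
    """Return an scp command list that copies source to user@host:remote_path."""
    cmd = ["scp", "-i", os.path.expanduser(key_path)]
    if recursive:
        cmd.append("-r")  # needed when source is a directory
    cmd += [source, f"{user}@{host}:{remote_path}"]
    return cmd


print(build_scp_command("~/.ssh/keypair.pem", "mydir", "user", "domain", recursive=True))

# To actually copy (not executed here):
# subprocess.run(build_scp_command("~/.ssh/keypair.pem", "myfile.txt", "user", "domain"))
```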

#### Pop Quiz

<details>
<summary>Q: When you copy a file to EC2 with `scp` will this show up in S3?</summary>
A: No. The file will be stored on the disk on the EC2 instance. It will not be in S3.
</details>