# Using Amazon Web Services (AWS)

## Objectives

- Describe core AWS services & concepts
- Configure your laptop to use AWS
- Use an SSH key to access EC2 instances
- Launch & access EC2
- Access S3

## Cloud computing & AWS

For numbers on cloud computing adoption, see the RightScale State of the Cloud report: https://www.rightscale.com/lp/state-of-the-cloud

### What are the benefits of AWS and other cloud services?

- AWS provides on-demand use of computing resources in the cloud
- No need to build data centers
- Easy to create a new business
- Only pay for what you use
- Handle spikes in computational demand
- Secure, reliable, flexible, scalable, cost-effective

AWS skills are much in demand!


### AWS core services:
- Elastic Compute Cloud (EC2): computers for diverse problems
- Elastic Block Store (EBS): virtual hard disks to use with EC2
- Simple Storage Service (S3): long-term bulk storage
- DynamoDB: NoSQL database
- And much, much more


### S3 vs. EBS

What is the difference between S3 and EBS? Why would I use one versus the other?

- [S3 pricing](https://aws.amazon.com/s3/pricing/): `1¢` to `2.2¢` per GB per month
- [EBS SSD pricing](https://aws.amazon.com/ebs/pricing/): `10¢` per GB per month

Feature | S3 | EBS
---|---|---
Can be accessed from | Anywhere on the web;<br/>any EC2 instance | Specific availability zone;<br/>EC2 instance attached to it
Latency | Higher | Lower
Throughput | Usually more | Usually less
Performance | Slightly worse | Slightly better
Max volume size | Unlimited | 16 TB
Max file size | 5 TB | 16 TB
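
To make the price gap concrete, here is a back-of-the-envelope sketch in Python that uses only the per-GB monthly prices quoted above. The 500 GB figure and the function name are made up for illustration, and real bills include other charges (requests, data transfer, and so on) that are ignored here.

```python
from decimal import Decimal

# Per-GB monthly prices in cents, taken from the pricing list above.
S3_CENTS_PER_GB = (Decimal("1"), Decimal("2.2"))  # low and high end
EBS_SSD_CENTS_PER_GB = Decimal("10")


def monthly_cost_dollars(size_gb, cents_per_gb):
    """Storage-only monthly cost in dollars for size_gb gigabytes."""
    return Decimal(size_gb) * cents_per_gb / Decimal(100)


size_gb = 500  # hypothetical amount of data
s3_low = monthly_cost_dollars(size_gb, S3_CENTS_PER_GB[0])
s3_high = monthly_cost_dollars(size_gb, S3_CENTS_PER_GB[1])
ebs = monthly_cost_dollars(size_gb, EBS_SSD_CENTS_PER_GB)

print(f"S3:  ${s3_low:.2f} to ${s3_high:.2f} per month")
print(f"EBS: ${ebs:.2f} per month")
```

Decimal arithmetic is used so the cents add up exactly instead of picking up floating-point rounding noise.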

#### Pop Quiz

<details>
<summary>Q: I want to store some video files on the web. Which Amazon service should I use?</summary>
A: S3
</details>

<details>
<summary>Q: I just created an iPhone app which needs to store user profiles on the web somewhere. Which Amazon service should I use?</summary>
A: S3
</details>

<details>
<summary>Q: I want to create a web application that uses JavaScript in the backend along with a MongoDB database. Which Amazon service should I use?</summary>
A: S3 + EC2 + EBS
</details>

## Setting up AWS access

- sign in to your AWS console: https://console.aws.amazon.com

- click on your name in the upper right, then click `My Security Credentials`

- click `Get started with IAM users`, which should take you [here](https://console.aws.amazon.com/iam/)

- click the blue `Add User` button
    - type your name in the `User Name` text box
    - check the `Programmatic Access` box
    - add a custom password
    - uncheck `require password reset`
    - click `Next`
- click `Create Group`
    - in the `Group name` box, enter `admin`
    - in the search box next to "filter policies", type `AdministratorAccess`
    - check the box next to that entry and click `Create Group`
    - then click `Review` and `Next`
- **important**: you are now on a screen displaying your user credentials. Click `Download .csv` to save them: by default, this downloads `credentials.csv` to your `~/Downloads` folder. You will not be able to download these credentials again! If you close the window and lose the file, you'll have to generate a new set of credentials.
    - open `~/Downloads/credentials.csv` in a text editor to make sure the download succeeded.
- Back at the [IAM dashboard](https://console.aws.amazon.com/iam), click `Roles` and `Create Role`
    - under "Choose the service that will use this role", click `EC2`, then `Next`
    - check `AdministratorAccess`, then `Next`
    - in `Role Name`, type `yourname_role` or `admin_role`, then `Create Role` and you're done


- in your terminal, install the AWS command line tools with `pip install awscli`
- then type `aws configure`
    - paste your AWS Access Key ID and AWS Secret Access Key when prompted
    - for Default Region Name, enter `us-east-1`
    - for Default Output Format, enter `json` (or leave it as `None`; this doesn't matter for now)
- this creates a folder, `~/.aws`, containing two files: `config` and `credentials`. You can use these to manage multiple profiles. For now, we're cool.
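
To get a feel for what `aws configure` writes, the sketch below parses text laid out like `~/.aws/credentials` with Python's standard `configparser`. The key values and the second profile name `work` are placeholders invented for this example; open your own file to see the real contents.

```python
import configparser

# Sample text in the INI layout used by ~/.aws/credentials.
# Every value here is a placeholder, not a real key.
SAMPLE_CREDENTIALS = """
[default]
aws_access_key_id = EXAMPLEKEYID
aws_secret_access_key = exampleSecretAccessKey

[work]
aws_access_key_id = ANOTHEREXAMPLEKEYID
aws_secret_access_key = anotherExampleSecret
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CREDENTIALS)

# Each [section] is one named profile.
profiles = parser.sections()
print(profiles)

default_key_id = parser["default"]["aws_access_key_id"]
print(default_key_id)
```

To read your real file instead, swap `read_string` for `parser.read(os.path.expanduser("~/.aws/credentials"))`. Once you have more than one profile, the CLI lets you pick one with `--profile NAME`.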


## Amazon S3

### Buckets and Files

[Amazon S3 FAQ](https://aws.amazon.com/s3/faqs/)

What is a bucket?
- A bucket is a container for files.
- Think of a bucket as a logical grouping of files, like a [subdomain](https://en.wikipedia.org/wiki/Subdomain).
- A bucket can contain an arbitrary number of files.

How large can a file in a bucket be?
- A file in a bucket can be up to 5 TB.

### Bucket Names

[AWS bucket name guidelines](https://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html)

What are best practices on naming buckets?
- Bucket names should be DNS-compliant.
- They must be at least 3 and no more than 63 characters long.
- Bucket names can contain lowercase letters, numbers, and hyphens. 
- Bucket names must be a series of one or more labels. Adjacent labels are separated by a single period (.). Each label must start and end with a lowercase letter or a number.
- Bucket names must not be formatted as an IP address (e.g., 192.168.5.4).

What are some examples of valid bucket names?
- `myawsbucket`
- `my.aws.bucket`
- `myawsbucket.1`
- `my-aws-bucket`

What are some examples of invalid bucket names? 
- `.myawsbucket`
- `myawsbucket.`
- `my..examplebucket`
- `MyAwsBucket`
- `my_aws_bucket`
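
The naming rules above are mechanical enough to check in code. Here is a minimal sketch of a validator that covers only the rules listed in these notes; the function name is made up, and the official guidelines linked above remain the authority on any rule not mentioned here.

```python
import re

# One label: lowercase letters, digits, and hyphens,
# starting and ending with a lowercase letter or a digit.
LABEL = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")
# Four dot-separated groups of digits, like 192.168.5.4.
IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name):
    """Check a bucket name against the rules listed in these notes."""
    if not 3 <= len(name) <= 63:
        return False
    if IP_LIKE.fullmatch(name):
        return False
    # Labels are separated by single periods; an empty label means
    # a leading, trailing, or doubled period.
    return all(LABEL.fullmatch(label) for label in name.split("."))


for candidate in ["my.aws.bucket", "my-aws-bucket", "MyAwsBucket", "my..examplebucket"]:
    print(candidate, is_valid_bucket_name(candidate))
```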

### Managing buckets with the AWS console GUI

You can use the [AWS console web interface for S3](https://s3.console.aws.amazon.com/s3/home) to create & manage buckets.

- click `Create Bucket`
- give the bucket a GLOBALLY UNIQUE name
- we can leave the region as the default, `US East (N. Virginia)`
- click `Next`, note all the options we don't care about right now, and click `Next` again
- this screen is about permissions: which accounts do you want to give read / write access to this bucket? do you want the bucket to be publicly readable?
    - you do not want your bucket to be publicly writable. 
- click `Next` to review your settings, then `Create`

Clicking on the bucket name brings you to the page where you can update bucket permissions, create folders, upload files, and so on. 

##### S3 Files to URLs
- you can compose the URL using the region, bucket, and filename. 
- For the `N. Virginia` region, the general template for the URL is `http://s3.amazonaws.com/BUCKETNAME/FILENAME`.
    - [Region-specific endpoint](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html) is `http://s3-AWSREGION.amazonaws.com/BUCKETNAME/FILENAME`
- You can also find the URL by looking at the file on the AWS web console.
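
The URL templates above can be sketched as a small Python helper. The function name and its default region are assumptions for illustration; it only fills in the two templates from these notes, so double-check the result against the URL shown in the AWS web console.

```python
from urllib.parse import quote


def s3_url(bucket, filename, region="us-east-1"):
    """Compose an S3 URL from the templates in these notes."""
    if region == "us-east-1":  # N. Virginia uses the bare endpoint
        host = "s3.amazonaws.com"
    else:
        host = f"s3-{region}.amazonaws.com"
    # quote() percent-encodes characters such as spaces but keeps "/".
    return f"http://{host}/{bucket}/{quote(filename)}"


print(s3_url("my-aws-bucket", "test.txt"))
print(s3_url("my-aws-bucket", "photos/my bee.jpg", region="us-west-2"))
```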

##### Permissions
You can manage individual file permissions as well. For example:
- If a bucket is `private` and a file inside it is `public-read`, anyone can view the file through a web browser
- If a bucket is `public-read` and a file inside it is `private`, the file will not be readable by everyone. However, if anyone accesses the URL for the bucket, they will see the file listed.
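
The two cases above boil down to a simple rule of thumb: the bucket setting controls who can see the listing, and the file setting controls who can read the file. Here is a toy Python model of just that rule. It is not how S3 evaluates permissions internally, it ignores every other access mechanism, and the function name is made up.

```python
def anonymous_access(bucket_acl, file_acl):
    """What an anonymous visitor can do, per the two cases above."""
    return {
        # The bucket's setting decides whether the file shows up in a listing.
        "can_list_bucket": bucket_acl == "public-read",
        # The file's own setting decides whether its contents can be viewed.
        "can_read_file": file_acl == "public-read",
    }


print(anonymous_access("private", "public-read"))
print(anonymous_access("public-read", "private"))
```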

### Managing buckets with the AWS command line interface

I loathe the mouse, the cursor, the clicking. I would like to use the keyboard only. But how?

[AWS CLI S3 documentation](https://docs.aws.amazon.com/cli/latest/userguide/using-s3-commands.html)

- `aws s3 ls` to list your buckets
- `aws s3 ls s3://BUCKETNAME/` to list contents of a bucket
- `aws s3 ls s3://BUCKETNAME/FOLDERNAME/` to list contents of a directory in the bucket (the trailing `/` is necessary)
- `aws s3 mb s3://BUCKETNAME` to create a new bucket
- `aws s3 rb s3://BUCKETNAME` to delete a bucket (the bucket must be empty)
    - `aws s3 rb s3://BUCKETNAME --force` to delete a non-empty bucket
- `aws s3 cp LOCALFILE s3://BUCKETNAME/` to upload a local file to a bucket
- `aws s3 cp s3://BUCKETNAME/FILENAME .` to download `FILENAME` to the current directory
    - just like `cp -r` in UNIX, use the `--recursive` flag for copying directories
    - you can also use `rm` and `mv` the same way
- see the `aws s3 sync` command in the docs above for more examples of how to keep a local directory & a remote bucket directory synchronized
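
If you end up scripting these commands, one option is to build the argument list in Python and hand it to `subprocess`. This is a minimal sketch: the helper name and the `photos` directory are made up, and the code only prints the command so it is safe to run without touching your buckets.

```python
import shlex


def s3_copy_command(source, destination, recursive=False):
    """Build the argument list for `aws s3 cp`."""
    command = ["aws", "s3", "cp", source, destination]
    if recursive:
        command.append("--recursive")  # needed when copying directories
    return command


command = s3_copy_command("photos", "s3://BUCKETNAME/photos/", recursive=True)
print(shlex.join(command))
# To actually run it (requires the AWS CLI and credentials):
#     import subprocess
#     subprocess.run(command, check=True)
```

Passing a list rather than one long string means file names with spaces do not need manual quoting.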


### Managing buckets using the `boto3` library in python

[boto3 s3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-examples.html)

In [1]:
import boto3

s3 = boto3.client('s3')

##### List buckets

In [2]:
s3.list_buckets()

{'Buckets': [{'CreationDate': datetime.datetime(2017, 12, 11, 22, 44, 10, tzinfo=tzutc()),
   'Name': 'aws-logs-232319016740-us-east-1'},
  {'CreationDate': datetime.datetime(2018, 3, 29, 2, 39, 34, tzinfo=tzutc()),
   'Name': 'aws-logs-232319016740-us-west-1'},
  {'CreationDate': datetime.datetime(2017, 12, 14, 22, 6, 6, tzinfo=tzutc()),
   'Name': 'galv-wiki-data'},
  {'CreationDate': datetime.datetime(2018, 9, 18, 1, 6, 10, tzinfo=tzutc()),
   'Name': 'moses-party-bucket-zone'},
  {'CreationDate': datetime.datetime(2018, 9, 18, 1, 19, 28, tzinfo=tzutc()),
   'Name': 'moses-test-bucket-horse'},
  {'CreationDate': datetime.datetime(2018, 9, 18, 17, 9, 8, tzinfo=tzutc()),
   'Name': 'moses-unique-breadfruit'},
  {'CreationDate': datetime.datetime(2018, 9, 18, 1, 34, 2, tzinfo=tzutc()),
   'Name': 'moses79'}],
 'Owner': {'DisplayName': 'mosesmarsh',
  'ID': '9b5e4b0fcb1e508dcd31e5c87181a04d913f35587d2d9ff551ef61aaf7ce14f4'},
 'ResponseMetadata': {'HTTPHeaders': {'content-type': 'applica

In [3]:
s3.list_buckets()['Buckets']

[{'CreationDate': datetime.datetime(2017, 12, 11, 22, 44, 10, tzinfo=tzutc()),
  'Name': 'aws-logs-232319016740-us-east-1'},
 {'CreationDate': datetime.datetime(2018, 3, 29, 2, 39, 34, tzinfo=tzutc()),
  'Name': 'aws-logs-232319016740-us-west-1'},
 {'CreationDate': datetime.datetime(2017, 12, 14, 22, 6, 6, tzinfo=tzutc()),
  'Name': 'galv-wiki-data'},
 {'CreationDate': datetime.datetime(2018, 9, 18, 1, 6, 10, tzinfo=tzutc()),
  'Name': 'moses-party-bucket-zone'},
 {'CreationDate': datetime.datetime(2018, 9, 18, 1, 19, 28, tzinfo=tzutc()),
  'Name': 'moses-test-bucket-horse'},
 {'CreationDate': datetime.datetime(2018, 9, 18, 17, 9, 8, tzinfo=tzutc()),
  'Name': 'moses-unique-breadfruit'},
 {'CreationDate': datetime.datetime(2018, 9, 18, 1, 34, 2, tzinfo=tzutc()),
  'Name': 'moses79'}]

In [4]:
for b in s3.list_buckets()['Buckets']:
    print(b['Name'])

aws-logs-232319016740-us-east-1
aws-logs-232319016740-us-west-1
galv-wiki-data
moses-party-bucket-zone
moses-test-bucket-horse
moses-unique-breadfruit
moses79


##### Create a bucket

In [5]:
s3.create_bucket(Bucket='moses79')

{'Location': '/moses79',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '0',
   'date': 'Tue, 18 Sep 2018 17:15:57 GMT',
   'location': '/moses79',
   'server': 'AmazonS3',
   'x-amz-id-2': 'whVTr9PvorMadWThIot0pHiB1nufG6yzW3hrR2hl16udRyDQk86onI/0QCUVztOHzNYG1k8Z4zo=',
   'x-amz-request-id': '08C7E4401A4C43EB'},
  'HTTPStatusCode': 200,
  'HostId': 'whVTr9PvorMadWThIot0pHiB1nufG6yzW3hrR2hl16udRyDQk86onI/0QCUVztOHzNYG1k8Z4zo=',
  'RequestId': '08C7E4401A4C43EB',
  'RetryAttempts': 0}}

In [6]:
for b in s3.list_buckets()['Buckets']:
    print(b['Name'])

aws-logs-232319016740-us-east-1
aws-logs-232319016740-us-west-1
galv-wiki-data
moses-party-bucket-zone
moses-test-bucket-horse
moses-unique-breadfruit
moses79


##### List the contents of a bucket

In [7]:
response = s3.list_objects_v2(Bucket='moses-party-bucket-zone')

In [8]:
response

{'Contents': [{'ETag': '"7285c37765d363b534fd9708d33ce234"',
   'Key': 'test.txt',
   'LastModified': datetime.datetime(2018, 9, 18, 1, 25, 5, tzinfo=tzutc()),
   'Size': 20,
   'StorageClass': 'STANDARD'},
  {'ETag': '"c876003b30f30e80f1e0259da3321749"',
   'Key': 'uploaded_bee.jpg',
   'LastModified': datetime.datetime(2018, 9, 18, 1, 47, 22, tzinfo=tzutc()),
   'Size': 44582,
   'StorageClass': 'STANDARD'}],
 'IsTruncated': False,
 'KeyCount': 2,
 'MaxKeys': 1000,
 'Name': 'moses-party-bucket-zone',
 'Prefix': '',
 'ResponseMetadata': {'HTTPHeaders': {'content-type': 'application/xml',
   'date': 'Tue, 18 Sep 2018 17:16:10 GMT',
   'server': 'AmazonS3',
   'transfer-encoding': 'chunked',
   'x-amz-bucket-region': 'us-east-1',
   'x-amz-id-2': 'n1wlI8BH7HZDpTG9D5YFy/G3wFYeNbI/LqfqStgrkYWXSXXztly5LxZsQhLShHSWo2ZEn8C+4fc=',
   'x-amz-request-id': '8AF7F05CE61A409C'},
  'HTTPStatusCode': 200,
  'HostId': 'n1wlI8BH7HZDpTG9D5YFy/G3wFYeNbI/LqfqStgrkYWXSXXztly5LxZsQhLShHSWo2ZEn8C+4fc=',
  '

In [9]:
response['Contents']

[{'ETag': '"7285c37765d363b534fd9708d33ce234"',
  'Key': 'test.txt',
  'LastModified': datetime.datetime(2018, 9, 18, 1, 25, 5, tzinfo=tzutc()),
  'Size': 20,
  'StorageClass': 'STANDARD'},
 {'ETag': '"c876003b30f30e80f1e0259da3321749"',
  'Key': 'uploaded_bee.jpg',
  'LastModified': datetime.datetime(2018, 9, 18, 1, 47, 22, tzinfo=tzutc()),
  'Size': 44582,
  'StorageClass': 'STANDARD'}]

In [10]:
for obj in response['Contents']:
    print(obj['Key'])

test.txt
uploaded_bee.jpg
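
Note the `'MaxKeys': 1000` and `'IsTruncated': False` fields in the response above: a single `list_objects_v2` call returns at most 1000 keys. For bigger buckets you have to keep asking. Here is a minimal sketch of that loop; the helper name is made up, and it assumes `client` behaves like the `s3` client created earlier.

```python
def list_all_keys(client, bucket):
    """Collect every key in a bucket, following continuation tokens."""
    keys = []
    kwargs = {"Bucket": bucket}
    while True:
        response = client.list_objects_v2(**kwargs)
        # An empty bucket has no 'Contents' entry at all.
        for obj in response.get("Contents", []):
            keys.append(obj["Key"])
        if not response.get("IsTruncated"):
            return keys
        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]


# Usage with the client from above:
#     list_all_keys(s3, 'moses-party-bucket-zone')
```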


##### Upload a file

In [11]:
for obj in s3.list_objects_v2(Bucket='moses-party-bucket-zone')['Contents']:
    print(obj['Key'])

test.txt
uploaded_bee.jpg


In [13]:
remote_file_name = 'uploaded_bee.jpg'
local_file_name = 'carpenter_bee.jpg'
bucket_name = 'moses-party-bucket-zone'

s3.upload_file(Filename=local_file_name, 
               Bucket=bucket_name, 
               Key=remote_file_name)

In [14]:
for obj in s3.list_objects_v2(Bucket='moses-party-bucket-zone')['Contents']:
    print(obj['Key'])

test.txt
uploaded_bee.jpg


##### Read contents of a file

In [15]:
response = s3.get_object(Bucket='moses-party-bucket-zone',
                         Key='test.txt')

In [16]:
response

{'AcceptRanges': 'bytes',
 'Body': <botocore.response.StreamingBody at 0x7fcd2c72e828>,
 'ContentLength': 20,
 'ContentType': 'text/plain',
 'ETag': '"7285c37765d363b534fd9708d33ce234"',
 'LastModified': datetime.datetime(2018, 9, 18, 1, 25, 5, tzinfo=tzutc()),
 'Metadata': {},
 'ResponseMetadata': {'HTTPHeaders': {'accept-ranges': 'bytes',
   'content-length': '20',
   'content-type': 'text/plain',
   'date': 'Tue, 18 Sep 2018 17:16:48 GMT',
   'etag': '"7285c37765d363b534fd9708d33ce234"',
   'last-modified': 'Tue, 18 Sep 2018 01:25:05 GMT',
   'server': 'AmazonS3',
   'x-amz-id-2': '/85cXRRCxyJ2sScWr8r9t1oUPB/U+Ckm8hnFPJx/ZBVDMcQMAX933eJRU812FMXCwRL0jkl0oe8=',
   'x-amz-request-id': '3DF65B4EAC7E074F'},
  'HTTPStatusCode': 200,
  'HostId': '/85cXRRCxyJ2sScWr8r9t1oUPB/U+Ckm8hnFPJx/ZBVDMcQMAX933eJRU812FMXCwRL0jkl0oe8=',
  'RequestId': '3DF65B4EAC7E074F',
  'RetryAttempts': 0}}

In [17]:
response['Body']

<botocore.response.StreamingBody at 0x7fcd2c72e828>

In [18]:
response['Body'].read()

b'not a poem anymore\n\n'
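
The `b'...'` prefix means `read()` handed back bytes, not a string. Assuming the file holds UTF-8 text, you can decode it as sketched below. `io.BytesIO` is used here as a stand-in for `response['Body']` so the example runs without AWS access. Like a `StreamingBody`, it is used up by `read()`, so keep the result in a variable.

```python
import io

# Stand-in for response['Body'], holding the same bytes shown above.
body = io.BytesIO(b'not a poem anymore\n\n')

raw = body.read()            # bytes
text = raw.decode('utf-8')   # str
second_read = body.read()    # the stream is already exhausted

print(repr(text))
print(second_read)
```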

##### Download a file

In [19]:
s3.download_file(Bucket=bucket_name,
                 Key=remote_file_name,
                 Filename="downloaded-" + local_file_name)

In [20]:
!ls

assets		   downloaded-carpenter_bee.jpg
aws_lecture.ipynb  high_performance_python_lecture.ipynb
carpenter_bee.jpg


## Amazon EC2

### [Regions](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html)

Q: What are *AWS Regions*?
- AWS is hosted in different geographic locations worldwide. 
- For example, there are 4 regions in the US.

Region | Name | Location 
---|---|--- 
us-east-1 | US East | N. Virginia
us-east-2 | US East 2 | Ohio
us-west-1 | US West | N. California
us-west-2 | US West 2 | Oregon

Q: How should I choose a region?
- N. Virginia or `us-east-1` is the default region for EC2.
- Using a region other than N. Virginia requires additional configuration.
- If you are not sure, choose N. Virginia.

### Availability Zones

Q: What are *AWS Availability Zones*?

- Regions are divided into isolated availability zones for fault tolerance.
- Availability zones run on physically separate hardware and infrastructure.
- They do not share hardware, generators, or cooling equipment. 
- Availability zones are assigned automatically to your EC2 instances based on your user ID.

<img src='assets/aws_regions.png'>

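- Availability zone names are mapped separately for each account, so the zone one account sees as `us-east-1a` may be a different physical zone from another account's `us-east-1a`. This makes it hard for two AWS users to coordinate on the same availability zone by name.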

### Starting an EC2 instance
- Log in to the AWS console, click `Services`, then, under "Compute", click `EC2`
- In the upper right corner of the page, make sure your region is `N. Virginia`
- Click the blue `Launch Instance` button
- scroll down to the first entry that says `ubuntu` and click `select`
- by default, the `t2.micro` instance (free tier) is selected. leave it, then click `Next: Configure instance details`
- under `IAM role`, select the role you created earlier (`yourname_role` or `admin_role`), then click `Next`
- here we can add EBS (think of it as more hard drive space). The default disk size is 8 GB. Change it to 20 (which still qualifies for the free tier).
- click `Next` until you are at the `Configure Security Group` screen. Make sure there is an entry with `Type: SSH`, `Protocol: TCP`, `Port Range: 22`, and `Source: Anywhere`. (If not, create one with `Add Rule`)
- Now click `Review and Launch`, then `Launch`
- This brings up a window asking you to select a secure key pair to access this instance. Select `create a new key pair`, give it a name (for example, `my_first_key`), then click `Download Key Pair`
    - this is your only chance to download it. if you lose this file, you'll have to generate a new one, and you'll lose access to any EC2 instances that need the old key pair. 
- Click `launch instance`

##### is there an AWS CLI way to start EC2 instances?
yes, good luck: https://docs.aws.amazon.com/cli/latest/userguide/cli-using-ec2.html

### Accessing your EC2 instance
We are going to set up convenient SSH access to your cloud computer
- move the `pem` file you downloaded to your `~/.ssh` folder (if it doesn't exist, create it)
    - example: `mv ~/Downloads/my_first_key.pem ~/.ssh`
- SSH requires that your key file be accessible only to you, so change the permissions with:
    - `chmod 400 ~/.ssh/my_first_key.pem`
- back on the EC2 dashboard of the AWS web console, click on your instance
- copy its public DNS (which looks something like `ec2-52-90-35-125.compute-1.amazonaws.com`)
- in the `~/.ssh` folder, create a file named `config` (you can create a blank file by typing `touch config`) and enter the following text:

```
Host my_first_ec2
    Hostname PUBLICDNS
    User ubuntu
    IdentityFile ~/.ssh/my_first_key.pem

```

- now we can open a terminal on the remote computer with 
    - `ssh my_first_ec2`  
- you now have a terminal open to enter shell commands on a remote computer! wow! the following commands are to be run on the remote computer, NOT on a local terminal
- let's install the AWS CLI on this system:
    - `sudo apt update`
    - `sudo apt upgrade`
    - `sudo apt install awscli`
- since we gave this instance the right IAM role, you don't need to copy over your AWS credentials. Try it!
    - `aws s3 ls`

### Copying Files to EC2
- To copy a file `myfile.txt` to EC2, use a command like this.
    - `scp myfile.txt my_first_ec2:`
- To copy a file from the EC2 instance to the current directory on the local machine, try
    - `scp my_first_ec2:path/to/remote_file .`
- To copy a directory `mydir` to EC2, use a command like this. 
    - `scp -r mydir my_first_ec2:`