# ACS AWS Training
### Nathan Miles
06/01/2020

## Day 1 Outline
- What the heck even is AWS?
- AWS Infrastructure
- Interacting with AWS
- Overview of a few AWS services
    - Identity Access Management, IAM
    - Simple Storage Service, S3
    - Elastic Compute Cloud, EC2
    - Lambda

- Getting set up
    - Configuring the AWS credentials file
- Working with the console
- Setting up ssh keys


## First things first
![Change-My-Mind.jpg](attachment:Change-My-Mind.jpg)

## Web Applications and Web APIs
- Web Applications use a browser-based interface
- Web APIs use a programmatic interface
![tips_webapp.png](attachment:tips_webapp.png)

## AWS Infrastructure
- AWS operates across the world in distinct areas called AWS Regions
- AWS Regions are clusters of data centers
- A group of data centers within an AWS Region is called an Availability Zone (AZ)
    
![aws_regions_2020.png](attachment:aws_regions_2020.png)

## Interacting with AWS
- The AWS Management Console
    - Web application
    - Provides browser-based access to all AWS services
- The AWS CLI
    - Command-line tool built on the AWS web APIs
    - Provides programmatic access to all AWS services
    - Install via pip
        - `pip install awscli`
- `Boto3`
    - AWS Software Development Kit (SDK) for Python
    - Provides programmatic access to AWS via Python
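
As a rough sketch of what programmatic access looks like, the helper below lists bucket names from any object that exposes Boto3's `list_buckets()` call. The function name `list_bucket_names` is hypothetical, and a real run assumes Boto3 is installed and credentials are already configured.

```python
def list_bucket_names(s3_client):
    """Return the names of all S3 buckets visible to the given client.

    `s3_client` is expected to behave like `boto3.client("s3")`, whose
    `list_buckets()` call returns a dict containing a "Buckets" list.
    """
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response.get("Buckets", [])]
```

With Boto3 installed, `list_bucket_names(boto3.client("s3"))` would issue the same request as `aws s3api list-buckets` does from the AWS CLI.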

## IAM
- Web service designed to manage access to AWS services
- Authentication 
    - Who do we allow to have access to AWS?
    - IAM User
    
- Permissions
    - What services do we allow them to use?
    - IAM Policies
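
To make "IAM Policies" a little more concrete, here is a minimal sketch of a policy document that grants read-only access to a single S3 bucket. The bucket name is a placeholder invented for illustration; the `Version`, `Statement`, `Effect`, `Action`, and `Resource` keys are the standard IAM JSON policy elements.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-collab-bucket"

read_only_policy = {
    "Version": "2012-10-17",  # current IAM policy language version
    "Statement": [
        {
            "Effect": "Allow",
            # Listing applies to the bucket itself ...
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}"],
        },
        {
            "Effect": "Allow",
            # ... while reads apply to the objects inside it.
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        },
    ],
}

# Policies are attached to users, groups, or roles as JSON text.
policy_json = json.dumps(read_only_policy, indent=2)
print(policy_json)
```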



## IAM
- IAM User
    - An individual account that is associated with one person
    - Each account has a set of policies attached that control the person's ability to utilize services on AWS 
    - Access keys are used to authenticate the IAM user when using the programmatic interface

- IAM Group
    - A collection of IAM Users who all require the same permissions
    - IAM Users can belong to multiple IAM Groups

- IAM Role
    - Similar to an IAM User, but not tied to a single person
    - Each IAM Role has a unique set of policies attached that control the role's ability to utilize services on AWS
    - Trusted entities (e.g. IAM Users) can _assume_ IAM Roles
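
Which entities are trusted to assume a role is itself written down as a policy document, usually called a trust policy. The sketch below builds one for a single IAM User; the helper name `build_trust_policy`, the account ID, and the user name are all placeholders chosen for illustration.

```python
import json


def build_trust_policy(account_id, user_name):
    """Build a trust policy that lets one IAM User assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted entity, identified by its ARN.
                "Principal": {
                    "AWS": f"arn:aws:iam::{account_id}:user/{user_name}"
                },
                # Assuming a role is an STS action.
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder account ID and user name, not real identifiers.
trust_policy = build_trust_policy("123456789012", "example-user")
print(json.dumps(trust_policy, indent=2))
```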
 

## S3
- Web service designed to provide people with the ability to store virtually any amount of data in the cloud
- Terminology
    - An S3 **bucket** is the container where the data is stored
    - An S3 **object** is the data itself
    
- Several large datasets are already being hosted in the cloud to facilitate data analysis
    - <a href="https://registry.opendata.aws/">AWS Public Datasets </a>
    - The entire public dataset for HST is currently available on AWS
    - The S3 bucket lives in the AWS Region `us-east-1` (N. Virginia)
    
- An S3 bucket mimics the structure of a directory
    - bucket_name/directory1/data
    - bucket_name/directory2/subdirectory1/data
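
The directory structure is only mimicked: each object is stored under a single key, and the slashes are simply part of that key. A minimal sketch of separating the bucket from the key, using a hypothetical helper `split_s3_uri` and the example path above written as an `s3://` URI:

```python
def split_s3_uri(uri):
    """Split 's3://bucket/key/with/slashes' into (bucket, key)."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Everything up to the first slash is the bucket; the rest is the key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return bucket, key


bucket, key = split_s3_uri("s3://bucket_name/directory2/subdirectory1/data")
print(bucket)  # bucket_name
print(key)     # directory2/subdirectory1/data
```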

## S3 
- Access can be granted on a per-user basis
    - Keep the results within the collaboration!
    
- Can even host static websites
    - Code documentation
    - Personal website
      

## S3 Pricing
- Pay for the amount of data stored in the bucket on a monthly basis
    
- Pay for any requests submitted to your bucket

- Pay for the amount of data retrieved

- Pay for the bandwidth used in all data transfers **out of S3** via the internet 
    - <span style="color:red">Free</span> to transfer from S3 to EC2 (and vice-versa) if they are in the <span style="color:red">same region</span>

- Always free to transfer data in
<h4><a href="https://aws.amazon.com/s3/pricing/">S3 Pricing Details</a></h4>


## S3 Pricing Example
- Suppose you run some analysis in the cloud and generate 100 GB worth of data that you store in S3
    - Storage rate is \\$0.023/(GB-month)
- Assume data is stored for 1 month  
    - Storage fee = (100 GB * 1 month) * \\$0.023/(GB-month) = \\$2.30
- You have 10 collaborators who all download the results
    - 1 GB per month is free
    - Next 10 TB per month is \\$0.09/GB
    - Download fee = (10 * 99 GB) * \\$0.09/GB = \\$89.10
- Total S3 cost = \\$91.40 for the month
    - Note that this cost assumes your collaborators download the 100 GB dataset again every month
    - Hence the next month could be as low as \\$2.30 if no one needs to download the data
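
A minimal sketch of this arithmetic, assuming the rates quoted on this slide (actual prices vary by region and change over time). It follows the slide in subtracting 1 free GB from each collaborator's download; `s3_monthly_cost` and its parameters are names introduced here for illustration.

```python
STORAGE_RATE = 0.023   # USD per GB-month, rate quoted on the slide
TRANSFER_RATE = 0.09   # USD per GB transferred out, rate quoted on the slide
FREE_GB = 1            # free GB subtracted per download, as on the slide


def s3_monthly_cost(stored_gb, n_downloads):
    """Return (storage_fee, download_fee) in USD for one month."""
    storage_fee = stored_gb * STORAGE_RATE
    # Each download of the full dataset is billed minus the free allowance.
    billable_gb = max(stored_gb - FREE_GB, 0) * n_downloads
    download_fee = billable_gb * TRANSFER_RATE
    return storage_fee, download_fee


storage_fee, download_fee = s3_monthly_cost(stored_gb=100, n_downloads=10)
print(f"storage:  ${storage_fee:.2f}")
print(f"download: ${download_fee:.2f}")
print(f"total:    ${storage_fee + download_fee:.2f}")
```

Calling `s3_monthly_cost(100, 0)` gives the storage-only month in which nobody downloads the data.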

    

    
   

## EC2
- Web service designed to provide people with the ability to deploy servers in the cloud
    - Servers are referred to as "instances" in AWS lingo
    
- The operating system is specified by Amazon Machine Images (AMI)
    - Numerous preconfigured ones (Linux, Ubuntu)
    - Can also create your own that has all your favorite software installed
        - e.g. astroconda

- Variety of instance types
    - General purpose
    - Compute optimized
    - Memory optimized
    - Accelerated computing
    - Storage optimized


<h4><a href="https://aws.amazon.com/ec2/pricing/">EC2 Pricing Details</a></h4>
 

## EC2
- EC2 instances are accessed via `ssh`
    - `ssh` access is controlled through Security Groups
    - Server can be available to the public
        - Anyone with the private `ssh` key can then log in to the server
    - Can also restrict `ssh` access to only your IP 
- For commonly used instances, you can define launch templates
    - Instance type, storage, security groups, key pair name
- <span style="color:red"> All data transfers are free between S3 buckets and EC2 instances when they are in the same region </span>
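
Restricting `ssh` access to your own IP comes down to a Security Group ingress rule for TCP port 22 with a single-address CIDR block. The sketch below only builds that rule as a dictionary, in the shape Boto3 accepts for the `IpPermissions` argument of `authorize_security_group_ingress`; the helper name `ssh_ingress_rule` and the IP address are illustrative assumptions, and IPv4 is assumed for simplicity.

```python
import ipaddress


def ssh_ingress_rule(my_ip):
    """Build an ingress rule allowing SSH (TCP port 22) from one IPv4 address."""
    # Raises a ValueError subclass if the address is not valid IPv4.
    address = ipaddress.IPv4Address(my_ip)
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        # A /32 block matches exactly one address.
        "IpRanges": [{"CidrIp": f"{address}/32", "Description": "SSH from my IP"}],
    }


# 203.0.113.7 is from a range reserved for documentation, not a real host.
rule = ssh_ingress_rule("203.0.113.7")
print(rule)
```

Opening the server to the public would instead use the CIDR block `0.0.0.0/0`, which matches every IPv4 address.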


## EC2 Pricing
- Pay for the compute resources used
    - Billed in 1-second increments with a minimum of 60 seconds
- Pay for any data transfers from EC2 to the internet or across regions.
    - Same fees apply as those for S3 transfer rates
- Pay for the Elastic Block Store (EBS) volume used
    - An EBS volume is essentially extra disk space for your server
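
The per-second billing rule with its 60-second minimum can be sketched as follows. The function name and the hourly rate are hypothetical, and rounding partial seconds up is an assumption made for this example.

```python
import math


def ec2_compute_cost(runtime_seconds, hourly_rate):
    """Cost in USD under per-second billing with a 60-second minimum."""
    # Assumption: partial seconds round up; short runs are billed as 60 s.
    billed_seconds = max(60, math.ceil(runtime_seconds))
    return hourly_rate * billed_seconds / 3600


# Hypothetical hourly rate, chosen only to make the numbers easy to read.
RATE = 3.60  # USD per hour, i.e. $0.001 per second

print(ec2_compute_cost(10, RATE))    # billed as 60 s
print(ec2_compute_cost(90.5, RATE))  # billed as 91 s
```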

## EC2 Pricing Example
- Instance type: c5.9xlarge
    - Compute optimized
    - 36 vCPUs, 72 GiB of RAM
    - Cost: \\$1.53 per hour
- EBS volume: 100 GB
    - Cost: \\$0.1/(GB-month) -> \\$0.00013/(GB-hr)
- The computation takes 4 hours to complete
    - Instance cost: \\$1.53/hr * 4 hr = \\$6.12
    - EBS cost: \\$0.00013/(GB-hr) * (100 GB * 4hr) = \\$0.05
- Total cost: \\$6.17
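
The same calculation as a short sketch, using the rates quoted on this slide (on-demand prices vary by region and over time). `ec2_job_cost` is a name introduced here for illustration, and data transfer fees are left out, as in the example.

```python
INSTANCE_RATE = 1.53        # USD per hour for c5.9xlarge, from the slide
EBS_RATE_HOURLY = 0.00013   # USD per GB-hour, rounded value from the slide


def ec2_job_cost(hours, ebs_gb):
    """Return (instance_cost, ebs_cost) in USD for a single job."""
    instance_cost = INSTANCE_RATE * hours
    ebs_cost = EBS_RATE_HOURLY * ebs_gb * hours
    return instance_cost, ebs_cost


instance_cost, ebs_cost = ec2_job_cost(hours=4, ebs_gb=100)
print(f"instance: ${instance_cost:.2f}")
print(f"EBS:      ${ebs_cost:.2f}")
print(f"total:    ${instance_cost + ebs_cost:.2f}")
```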

## EC2 Pricing by Instance
![ec2_instance_pricing_by_instance.png](attachment:ec2_instance_pricing_by_instance.png)

## EC2 Pricing By Instance Type
![ec2_instance_pricing_by_instance_type.png](attachment:ec2_instance_pricing_by_instance_type.png)