# Amazon S3
Amazon S3 is a fully managed, object-based storage service that is highly available, highly durable, very cost-effective, and widely accessible.

It is promoted as having unlimited storage.<br>
There is a minimum file size of 0 bytes.<br>
There is a maximum file size of 5 terabytes.

S3 is an object storage service, meaning uploads do not conform to a data structure hierarchy as they would in a file system.<br>
Instead, objects exist in a flat address space and each one is referenced by a unique URL.
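
As a rough illustration of "referenced by a unique URL", the sketch below builds a URL from a bucket name, a region, and an object key. The virtual-hosted-style layout (`bucket.s3.region.amazonaws.com`) is an assumption not stated in these notes, and the bucket, region, and key values are hypothetical.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style URL for an object (assumed URL layout)."""
    # Percent-encode the key, but keep "/" so that prefixes stay readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Hypothetical bucket, region, and key, for illustration only.
print(object_url("example-bucket", "eu-west-1", "photos/2024/cat.jpg"))
```

Note that the whole key, slashes included, is a single identifier; nothing in the URL implies a real directory tree.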

By contrast, file storage keeps data as separate files within a series of directories, forming a data structure hierarchy like the files on a computer.

S3 is a regional service, so when uploading data you are required to specify the region in which the data will be stored.

Once the region is specified, Amazon S3 stores and duplicates the upload multiple times across multiple availability zones within that region to increase durability and availability.

Objects stored in S3 have a durability of 99.999 999 999% (the eleven 9s of durability), so the likelihood of losing data is extremely low and this is due to the duplication of data in different availability zones.<br>
The availability of S3 data objects is dependent on the storage class used, ranging from 99.5% to 99.99%.

The difference between availability and durability:
* Availability - AWS ensures that the uptime of Amazon S3 is between 99.5% and 99.99% depending on the storage class, to enable access to the stored data.
* Durability - The percentage refers to the probability of maintaining your data without it being lost through corruption, degradation of data, or other unknown potentially damaging effects.
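
To make the two percentages more concrete, the sketch below turns them into rough yearly figures. It assumes that both percentages apply over a full year of 8,760 hours and that the figures are design targets rather than guarantees. The function names are hypothetical.

```python
from decimal import Decimal

HOURS_PER_YEAR = Decimal(365 * 24)  # 8760, ignoring leap years


def max_downtime_hours(availability_pct: str) -> Decimal:
    """Hours per year the service may be unreachable at a given availability."""
    return HOURS_PER_YEAR * (Decimal(100) - Decimal(availability_pct)) / Decimal(100)


def expected_annual_losses(object_count: int, durability_pct: str) -> Decimal:
    """Expected number of objects lost per year at a given durability."""
    return Decimal(object_count) * (Decimal(100) - Decimal(durability_pct)) / Decimal(100)


for pct in ("99.99", "99.9", "99.5"):
    print(pct, "->", max_downtime_hours(pct), "hours of downtime per year")

print(expected_annual_losses(10_000_000, "99.999999999"), "objects lost per year")
```

Under these assumptions, availability is about how long the data may be unreachable, while durability is about how often an object would be lost altogether.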

When uploading objects to Amazon S3, a specific structure is used to locate your data in the flat address space.

Storing objects in S3:
1. Define and create a bucket. Consider a bucket as a container for the data.
    * Because buckets exist in a flat address space, **buckets must have a globally unique name**.
2. Once the bucket has been created, upload data within it.
    * By default, accounts can have up to 100 buckets. This can be increased on request.
    * Uploaded objects are given a unique object key identifier.
* A folder structure can be used for organisation purposes. It will, however, have no effect on the underlying structure of the flat address space.
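
The idea that folders are only an organisational view over a flat key space can be sketched with a plain dictionary. This is a simplified model, not the S3 API: `list_prefix` is a hypothetical helper that groups keys by a delimiter, and the keys are made up.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Return (objects, sub_prefixes) found directly under `prefix`."""
    objects, sub_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the key looks like it sits in a "folder".
            sub_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(sub_prefixes)


# The bucket is one flat mapping from full key to data; there are no directories.
bucket = {
    "readme.txt": b"...",
    "photos/index.html": b"...",
    "photos/2023/dog.jpg": b"...",
    "photos/2024/cat.jpg": b"...",
}

print(list_prefix(bucket))             # top-level view
print(list_prefix(bucket, "photos/"))  # view "inside" the photos folder
```

The "folders" appear only when the keys are grouped at listing time; the stored data remains a flat set of keys.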

# Storage Classes
Amazon S3 offers the below storage classes:
* S3 Standard,
* S3 Intelligent Tiering,
* S3 Standard Infrequent Access,
* S3 One Zone Infrequent Access,
* S3 Glacier, and 
* S3 Glacier Deep Archive.

## S3 Standard
This is a general-purpose storage class:
* High throughput, low latency
* Data is accessed frequently
* High durability across multiple availability zones
* 99.99% availability
* Supports Secure Sockets Layer (SSL) for encrypting data in transit.
    * There are also encryption options for when data is at rest.
* Objects in the S3 Standard storage class can be moved to other classes.

## S3 Intelligent Tiering
Best for objects where the frequency of access is unknown/unpredictable.<br>
Same as S3 Standard with these amendments:
* 99.9% availability 

## S3 Standard Infrequent Access
Best for objects that are infrequently accessed.<br>
Same as S3 Standard with these amendments:
* 99.9% availability
* Cheaper than S3 Standard

## S3 One Zone Infrequent Access
Best for objects that are infrequently accessed.<br>
Same as S3 Standard with these amendments:
* 99.5% availability
* Durability is over a single zone
    * If the Availability Zone is unavailable, the data stored will be inaccessible or even lost.
* 20% cheaper than S3 Standard Infrequent Access

## S3 Glacier
Archival data class:
* Accessed separately from the S3 service
* Interacts with S3 lifecycle rules
* Fraction of the cost of S3 storage
* Data cannot be instantly accessed
    * May take hours to gain accessibility
* High durability across multiple availability zones
* 99.9% availability
* Uses vaults and archives, not buckets or folders.
    * A Glacier vault acts as a container for Glacier archives.
    * Vaults are regional
    * Data is stored as an archive
        * Unlimited archives within a Glacier vault
* Retrieval options include:
    1. Expedited - urgent requirement to access data 
        * Less than 250 megabytes
        * Data is available between 1 and 5 minutes
        * Most expensive retrieval option
    2. Standard
        * Any size data to be retrieved
        * Data is available between 3 and 5 hours
        * Second most expensive retrieval option
    3. Bulk
        * Used to retrieve petabytes of data
        * Data is available between 5 and 12 hours
        * Cheapest retrieval option
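
The three retrieval options above amount to a trade-off between cost, size, and waiting time. The sketch below picks the cheapest option whose worst-case wait fits a deadline, using only the figures listed above. The function name and the "worst case" reading of each time range are assumptions.

```python
# (name, worst-case wait in hours, size limit in MB or None), cheapest first.
TIERS = [
    ("Bulk", 12.0, None),
    ("Standard", 5.0, None),
    ("Expedited", 5 / 60, 250),
]


def pick_retrieval_tier(size_mb: float, deadline_hours: float):
    """Return the cheapest tier that meets the deadline, or None if none does."""
    for name, worst_case_hours, max_mb in TIERS:
        if max_mb is not None and size_mb >= max_mb:
            continue  # Expedited is limited to less than 250 MB
        if worst_case_hours <= deadline_hours:
            return name
    return None


print(pick_retrieval_tier(size_mb=1000, deadline_hours=24))
print(pick_retrieval_tier(size_mb=100, deadline_hours=0.5))
```

A large archive needed within minutes has no matching option in this model, which is the practical meaning of "data cannot be instantly accessed".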

## S3 Glacier Deep Archive
* Cheapest storage option
* Data retrieval in 12 hours or less.
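
Taken together, the storage classes described above can be read as a rough decision rule. The sketch below is one possible reading, not an official selection guide. The `access` labels and the `reproducible` flag are hypothetical, and the returned strings are the storage class identifiers as I understand the S3 API to name them.

```python
def choose_storage_class(access: str, reproducible: bool = False) -> str:
    """Map an access pattern to a storage class.

    access: 'frequent', 'unknown', 'infrequent', 'archive' or 'deep-archive'.
    reproducible: True if the data could be recreated after a zone failure.
    """
    if access == "frequent":
        return "STANDARD"
    if access == "unknown":
        return "INTELLIGENT_TIERING"
    if access == "infrequent":
        # One Zone IA is cheaper, but a single zone failure can lose the data,
        # so it is only chosen here for data that can be recreated.
        return "ONEZONE_IA" if reproducible else "STANDARD_IA"
    if access == "archive":
        return "GLACIER"
    if access == "deep-archive":
        return "DEEP_ARCHIVE"
    raise ValueError(f"unknown access pattern: {access!r}")


print(choose_storage_class("infrequent", reproducible=True))
```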

# EC2 Instance Storage
## Instance store volumes
EC2 instance store volumes are physical drives that reside on the same host that provides the EC2 instance itself.<br>
They act as local disk drives, allowing data to be stored locally on the instance.

Instance store volumes provide ephemeral (temporary) storage.<br>
They are therefore not recommended for critical or valuable data.<br>
If the instance is stopped/terminated, the data will be lost.<br>
If the instance is rebooted, the data will remain intact.

### Benefits of using instance store volumes
No additional cost for storage - it is included in the price of the instance.

They offer very high input/output speed.

I3 instance family:
* 3.3 million random read input/output operations per second (IOPS)
* 1.4 million write IOPS

Instance store volumes are ideal as a cache or buffer for rapidly changing data that does not need to be retained.

They are often used within a load balancing group, where data is replicated and pooled across the fleet.

## Instance store volumes
Instance store volumes are not available for all instances.

The capacity of instance store volumes increases with the size of the EC2 instance.

Instance store volumes have the same security mechanisms provided by EC2.

### Anti-patterns
Instance store volumes should not be used for:
* Data that needs to remain persistent
* Data that needs to be accessed and shared by multiple entities.

# Overview of Elastic Block Storage (EBS)
EBS provides persistent and durable block level storage.

EBS volumes offer far more flexibility for managing data than instance store volumes.

EBS volumes can be attached to EC2 instances, primarily for rapidly changing data that might require a specific Input/Output Operations Per Second (IOPS) rate.

EBS volumes are independent of the EC2 instance.

They are logically attached to the instance instead of directly attached like instance store volumes.

From a connectivity perspective, each EBS volume can only be attached to a single EC2 instance at any time.<br>
However, multiple EBS volumes can be attached to the same EC2 instance.

Because EBS persists data independently of the instance, it does not matter if instances are intentionally or unintentionally stopped, restarted, or terminated.<br>
The data will remain intact when configured to do so.
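
The persistence rules for instance store volumes and EBS volumes can be summarised in a small model. This is a sketch of the behaviour described in these notes, not an AWS API. The `delete_on_termination` flag stands in for the "when configured to do so" condition, and its default here is an arbitrary choice.

```python
def data_survives(storage: str, event: str, delete_on_termination: bool = False) -> bool:
    """storage: 'instance-store' or 'ebs'; event: 'reboot', 'stop' or 'terminate'."""
    if event not in {"reboot", "stop", "terminate"}:
        raise ValueError(f"unknown event: {event!r}")
    if storage == "instance-store":
        # Ephemeral storage only survives a reboot.
        return event == "reboot"
    if storage == "ebs":
        # EBS data persists unless the volume is set to be deleted on termination.
        return event != "terminate" or not delete_on_termination
    raise ValueError(f"unknown storage type: {storage!r}")


for event in ("reboot", "stop", "terminate"):
    print(event, data_survives("instance-store", event), data_survives("ebs", event))
```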

EBS offers the ability to take snapshots - backups of the entire volume at a point in time.

Snapshots are incremental: each one copies only the data that has changed since the previous snapshot was taken.

Snapshots can be used to create new EBS volumes.<br>
This is useful if an EBS volume is lost or destroyed, as it can be reconstructed from an existing snapshot.
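
The way snapshots copy only changed data, yet can still rebuild a full volume, can be sketched with dictionaries of blocks. This is a toy model and not how EBS is implemented: block numbers, contents, and function names are all hypothetical, and deleted blocks are not handled.

```python
def restore(chain):
    """Rebuild a volume by replaying snapshots from oldest to newest."""
    state = {}
    for snapshot in chain:
        state.update(snapshot)
    return state


def take_snapshot(volume, previous_chain):
    """Store only the blocks that differ from what earlier snapshots captured."""
    baseline = restore(previous_chain)
    return {block: data for block, data in volume.items() if baseline.get(block) != data}


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
first = take_snapshot(volume, [])        # first snapshot holds every block

volume[1] = b"data-v2"                   # one block changes
second = take_snapshot(volume, [first])  # second snapshot holds that block only

print(second)
print(restore([first, second]) == volume)
```

The second snapshot is small, but replaying the chain still yields the complete volume, which is what allows a lost volume to be recreated.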

To add additional flexibility and resilience, it is possible to copy a snapshot from one region to another.

Every write to an EBS volume is replicated multiple times within the same availability zone of your region to help prevent the complete loss of data.

The EBS volume is only available in a single availability zone.<br>
Should the availability zone fail, you will lose access to your EBS volume.<br>
Should this occur, the volume can be recreated from a previous snapshot and can be attached to another instance in a different availability zone.

## EBS volume types
There are two families of EBS volume: Solid State Drives (SSD) and Hard Disk Drives (HDD).

This allows optimisation of storage to fit usage requirements from a cost to performance perspective.

SSDs are suited for work with smaller blocks or as boot volumes for EC2 instances:
* General Purpose SSD (gp2)
    * Balances price and performance for a wide variety of workloads
    * Use case:
        * Recommended for most workloads
        * System boot volumes
        * Virtual desktops
        * Low-latency interactive apps
        * Development and test environments
* Provisioned IOPS SSD (io1)
    * Highest-performance SSD volume for mission-critical low-latency or high-throughput workloads
    * Use case:
        * Critical business applications that require sustained IOPS performance, or more than 16,000 IOPS or 250 MiB/s of throughput per volume
        * Large database workloads
            * MongoDB,
            * Cassandra,
            * Microsoft SQL Server,
            * MySQL,
            * PostgreSQL,
            * Oracle

HDDs are suited for workloads that require higher throughput/large blocks of data (e.g. big data, logging information).
* Throughput Optimised HDD (st1)
    * Low-cost HDD volume designed for frequently accessed, throughput-intensive workloads
    * Use case:
        * Streaming workloads requiring consistent, fast throughput at a low price
        * Big data
        * Data warehouses
        * Log processing
        * Cannot be a boot volume

* Cold HDD (sc1)
    * Lowest cost HDD volume designed for less frequently accessed workloads
    * Use case:
        * Throughput-oriented storage for large volumes of data that is infrequently accessed
        * Scenarios where the lowest storage cost is important
        * Cannot be a boot volume
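
The four volume types above can be condensed into a simple selection rule. The sketch below is one interpretation of the use cases listed, not an official sizing guide. The parameter names are hypothetical, and the 250 MiB/s throughput threshold is left out to keep the example short.

```python
def choose_volume_type(boot: bool = False, iops: int = 0,
                       large_sequential: bool = False,
                       infrequent: bool = False) -> str:
    """Pick an EBS volume type from coarse workload characteristics."""
    if iops > 16_000:
        return "io1"  # sustained IOPS beyond the threshold quoted above
    if large_sequential and not boot:
        # HDD-backed types suit large, throughput-heavy blocks,
        # but neither can be used as a boot volume.
        return "sc1" if infrequent else "st1"
    return "gp2"      # recommended default for most workloads


print(choose_volume_type(boot=True))
print(choose_volume_type(large_sequential=True, infrequent=True))
```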

## EBS Security
Volume encryption can be enabled with a simple checkbox.

The encryption process uses the AES-256 encryption algorithm and works by interacting with another AWS service, the Key Management Service (KMS).

KMS uses Customer Master Keys (CMKs), enabling the encryption of data across a range of AWS services.

Any snapshot taken from an encrypted volume will also be encrypted, as will any volume created from that encrypted snapshot.

## Creating EBS volumes
* During the creation of a new instance, attaching the volume at the time of launch
* From within the EC2 dashboard of the AWS Management Console, as a standalone volume ready to be attached to an instance when required.

### Storage configurations:
* From an existing snapshot or as a new volume
* Size
* Volume type
* Upon termination of the instance, retain or delete the volume
* Data encryption
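
The configuration choices above could be captured as a plain data structure. The sketch below assumes the field names of the block device mapping used by the EC2 launch API as I understand it (`DeviceName`, `Ebs`, `VolumeSize`, and so on); treat them as an assumption to check against the API reference. The helper name, device name, and snapshot ID are hypothetical.

```python
from typing import Optional


def ebs_block_device(device_name: str, size_gib: int, volume_type: str = "gp2",
                     delete_on_termination: bool = True, encrypted: bool = False,
                     snapshot_id: Optional[str] = None) -> dict:
    """Describe one EBS volume to attach at launch (assumed field names)."""
    ebs = {
        "VolumeSize": size_gib,                        # size
        "VolumeType": volume_type,                     # volume type
        "DeleteOnTermination": delete_on_termination,  # retain or delete on termination
        "Encrypted": encrypted,                        # data encryption
    }
    if snapshot_id is not None:
        ebs["SnapshotId"] = snapshot_id                # create from an existing snapshot
    return {"DeviceName": device_name, "Ebs": ebs}


# Hypothetical values, for illustration only.
print(ebs_block_device("/dev/sdf", 100, delete_on_termination=False, encrypted=True))
```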

### Resizing options
* Use AWS CLI
* Create a snapshot of the existing volume, then create a new volume from that snapshot with an increased capacity

## When not to use EBS
For temporary storage

Multi-instance storage - EBS volumes can only be accessed by one instance at a time

For very high durability and availability of data storage - use Amazon S3 or Elastic File System (EFS)

# What is the Amazon Elastic File System (EFS)?
## How does EFS fit within the AWS storage ecosystem?
### Amazon Simple Storage Service (S3)
An object storage solution, where objects are stored as a single entity, not as small chunks/blocks.<br>
Videos, images, static webpages are best stored as objects, where the files are written to storage once and are accessed multiple times.<br>
Not optimal for heavy read and write access at the same time.<br>
&emsp;Netflix uses S3 for data streaming service - upload large movie files once, stream to multiple users, many times.

### Amazon Elastic Block Store (EBS)
Files are stored as small chunks/blocks of data - a change to a file updates only the affected blocks rather than the whole file.<br>
Great for low latency access and when fast, concurrent read/write operations are needed.<br>
EBS provides persistent block storage volumes for use with a single EC2 instance.<br>
If an EC2 instance using EBS is stopped/terminated, the data on the EBS volume remains intact.<br>
EBS should be used like a computer hard drive - for storing operating system files, applications, and other files obtained for use with the EC2 instance.
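
The difference between object and block storage when a file changes can be shown with some simple arithmetic. The sketch below is a simplified model: the 4 KiB block size is illustrative rather than an EBS specification, and the function names are hypothetical.

```python
BLOCK_SIZE = 4096  # illustrative block size in bytes


def object_update_cost(object_size: int) -> int:
    """Object storage: the object is a single entity, so all of it is rewritten."""
    return object_size


def block_update_cost(offset: int, length: int) -> int:
    """Block storage: only the blocks overlapping the change are rewritten."""
    first_block = offset // BLOCK_SIZE
    last_block = (offset + length - 1) // BLOCK_SIZE
    return (last_block - first_block + 1) * BLOCK_SIZE


file_size = 1024 * 1024  # a 1 MiB file
print(object_update_cost(file_size))             # bytes rewritten for any change
print(block_update_cost(offset=5000, length=10))  # bytes rewritten for a 10-byte edit
```

In this model a 10-byte edit rewrites the whole 1 MiB object but only a single block, which is why block storage suits frequent read/write access and object storage suits write-once, read-many data.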