# Storage

EC2 Storage Options

* Block
    * Raw device you can mount, create file systems on, etc.
    * Instance store and EBS
* File Share
    * Network file shares, centralized storage
    * EFS, FSx
* Object
    * S3 cloud storage, files stored as objects
    
    
## Block Storage

### Instance Store

* Storage of host computer assigned to EC2 instance
* Temporary storage
    * Data persists only for the lifetime of the instance
    * Persists on reboot
    * Data lost when underlying hardware fails, or instance stops or terminates
* Highest performance
* Storage included as part of instance pricing

Instance Store HA

* Replicate data among multiple instances
* Use OS-level backup tools

Types

* SSD Backed
    * Preferred for random I/O workloads
    * Performance measured in IOPS
* Magnetic/HDD 
    * Preferred for sequential I/O
    * Performance measured in throughput (MiB/s)
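
The two metrics are linked: throughput is IOPS multiplied by the size of each I/O. The sketch below is only an illustration of that arithmetic; the function name and the sample IOPS and I/O sizes are made up for the example, not measured values.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput (MiB/s) = IOPS x I/O size, with the size converted from KiB to MiB."""
    return iops * io_size_kib / 1024


# Small random I/O: many operations, little data moved per operation.
random_io = throughput_mib_per_s(iops=10_000, io_size_kib=4)

# Large sequential I/O: few operations, a lot of data moved per operation.
sequential_io = throughput_mib_per_s(iops=500, io_size_kib=1024)

print(f"random 4 KiB I/O:     {random_io:.1f} MiB/s at 10,000 IOPS")
print(f"sequential 1 MiB I/O: {sequential_io:.1f} MiB/s at 500 IOPS")
```

This is why random workloads are sized by IOPS and sequential workloads by MiB/s: the sequential case moves far more data with far fewer operations.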

Storage Optimized Instances

* SSD and HDD based options
* D2, H1 - Magnetic
    * High throughput access to large datasets
    * Massively parallel processing data warehouse (something like Redshift)
    * MapReduce/Hadoop Distributed Compute
    * Log or data processing
* I3 - SSD
    * High frequency OLTP
    * NoSQL databases
    * Relational Databases
    * Data warehousing
    * Cache for in-memory databases like Redis
    * Distributed File Systems
    
### Elastic Block Store (EBS)    

EBS

* Managed block storage service
* Storage volume is outside of host computer
* Accessed over network
* Used as a block storage device

Benefits

* Stop and start EC2 instances without losing data
* Persist EBS volumes for terminated instances
* Detach and attach volume to a different instance in the same AZ
* Built-in snapshot capability for incremental backup to S3
* Create AMI from snapshot to launch new EC2 instances
   
   
EBS Features

* Created at the AZ level
* Highly available and durable
* Built-in snapshot capability for incremental backup to S3
* Create volumes from snapshot (any AZ in the region)
* Copy snapshots to another region

Uses

* Enterprise apps
* Relational and NoSQL databases
* Big data analytics
* Media workflows

Snapshots

* Point in time 
    * Volume needs to be in a consistent state (flush application caches and pause traffic)
* Incremental backup
* Snapshot is async
* New feature - multi-volume consistent snapshot
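
To make "incremental" concrete, here is a toy model in which each snapshot stores only the blocks written since the previous one, and a restore layers the snapshots from oldest to newest. `ToyVolume` and `restore` are hypothetical names for illustration; this is not the EBS API and does not describe how EBS implements snapshots internally.

```python
from typing import Dict, List, Set

Blocks = Dict[int, bytes]  # block index -> block contents


class ToyVolume:
    """Toy block device that remembers which blocks changed since the last snapshot."""

    def __init__(self) -> None:
        self.blocks: Blocks = {}
        self.dirty: Set[int] = set()

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.dirty.add(index)

    def snapshot(self) -> Blocks:
        """Return only the blocks written since the previous snapshot."""
        delta = {i: self.blocks[i] for i in self.dirty}
        self.dirty.clear()
        return delta


def restore(snapshots: List[Blocks]) -> Blocks:
    """Rebuild the volume by layering snapshots from oldest to newest."""
    restored: Blocks = {}
    for delta in snapshots:
        restored.update(delta)
    return restored


vol = ToyVolume()
vol.write(0, b"boot")
vol.write(1, b"data-v1")
snap1 = vol.snapshot()  # first snapshot: every block written so far

vol.write(1, b"data-v2")
snap2 = vol.snapshot()  # second snapshot: only block 1 changed

print(len(snap1), len(snap2))
print(restore([snap1, snap2]) == vol.blocks)
```

The second snapshot is smaller than the first because block 0 did not change, which is the storage saving an incremental backup provides.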

Multivolume Snapshot Options

* Stop system and issue snapshot for each volume and restart
* Take backup at application level
* Use the new multi-volume consistent snapshot feature

Can mix and match volumes of different types

Different EBS Volume Types

* SSD
    * General purpose (gp2)
        * Balances price and performance
        * 3 IOPS per GiB
        * Burst IOPS 3000
        * Max IOPS 16000
        * Can use RAID 0
            * Stripes blocks across two (or more) EBS volumes
            * Each volume is still replicated by EBS for HA and durability
            * RAID 0 across two volumes doubles IOPS, throughput, and size
            * Requires additional software components you must manage
        * RAID 1
            * Mirrors the blocks
            * Replication adds little benefit, as EBS already stores redundant copies
        * Uses
            * Boot volumes
            * Small to medium databases
            * Dev and test environments
    * Provisioned IOPS (io1)
        * Highest performance volumes for latency-sensitive transactional workloads
        * Max IOPS/Volume - 64,000
        * Max ratio of provisioned IOPS to volume size (GiB) - 50:1
        * Max throughput - 1000 MiB/s
        * Size - 4 GiB - 16 TiB
        * Uses
            * Boot volumes
            * Critical business applications
            * Large databases - Cassandra, MongoDB, SQL Server, Oracle, PostgreSQL, MySQL
* HDD
    * Throughput optimized (st1)
        * Low-cost volume designed for frequently accessed, throughput-intensive workloads
        * Max IOPS/Volume - 500
        * Max throughput - 500 MiB/s
        * Size - 500 GiB - 16 TiB
        * Uses
            * Big data
            * Data warehouse
            * Log processing
    * Cold (sc1)
        * Low-cost volume designed for infrequently accessed workloads
        * Max IOPS/Volume - 250
        * Max throughput - 250 MiB/s
        * Size - 500 GiB - 16 TiB
        * Uses
            * Inexpensive storage
            * Infrequently accessed sequential workloads
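
The SSD limits listed above reduce to simple arithmetic on volume size. The following is a minimal sketch of that arithmetic, using the figures from these notes (3 IOPS per GiB up to 16,000 for General purpose; a 50:1 ratio up to 64,000 for Provisioned IOPS). The function names are hypothetical, and the 100 IOPS floor for small General purpose volumes is an assumption that does not appear in the notes.

```python
def general_purpose_baseline_iops(size_gib: int) -> int:
    """General purpose SSD baseline: 3 IOPS per GiB, capped at 16,000.

    Assumption: a floor of 100 IOPS for very small volumes (not in the notes).
    """
    return max(100, min(16_000, 3 * size_gib))


def provisioned_iops_ceiling(size_gib: int) -> int:
    """Provisioned IOPS ceiling: 50 IOPS per GiB, capped at 64,000 per volume."""
    return min(64_000, 50 * size_gib)


def raid0_iops(per_volume_iops: int, volumes: int = 2) -> int:
    """RAID 0 stripes I/O across its members, so their IOPS add up."""
    return per_volume_iops * volumes


for size in (20, 500, 10_000):
    print(f"{size:>6} GiB general purpose -> {general_purpose_baseline_iops(size):>6} baseline IOPS")

for size in (100, 2_000):
    print(f"{size:>6} GiB provisioned     -> up to {provisioned_iops_ceiling(size):>6} IOPS")

# Two striped 500 GiB general purpose volumes double the single-volume baseline.
print("RAID 0 of two 500 GiB volumes:", raid0_iops(general_purpose_baseline_iops(500)))
```

A volume whose baseline is under 3,000 IOPS can still burst to 3,000 for short periods, which the baseline function deliberately leaves out.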
            
## Elastic File System

Centralized repository for files

Services

* Elastic File System - file share for Linux EC2 instances on AWS
* FSx for Windows - file share for Windows EC2 instances on AWS
* FSx for Lustre - file share optimized for high performance computing on Linux EC2 instances; can access S3 as a file share (like Storage Gateway)

Features

* Fully managed, automatically grow and shrink
* High durability and availability - replicated across multiple AZs in a region
* Can be used from on premises via Direct Connect or VPN

EFS

* NFS file system and traditional file permissions model
* Standard and infrequent access storage tiers
* Lifecycle management moves files to the infrequent access tier to lower storage costs
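
The idea behind an age-based lifecycle rule can be sketched as below: a file that has not been accessed for a set number of days is assigned to the cheaper tier. The function name, tier labels, file names, and the 30-day threshold are all assumptions for illustration; this is not the EFS API.

```python
from datetime import datetime, timedelta


def storage_tier(last_access: datetime, now: datetime, ia_after_days: int = 30) -> str:
    """Return the tier a file would sit in under an age-based lifecycle rule."""
    idle = now - last_access
    if idle >= timedelta(days=ia_after_days):
        return "infrequent-access"
    return "standard"


now = datetime(2024, 1, 31)
last_access_times = {
    "report.csv": datetime(2024, 1, 30),   # read yesterday
    "archive.tar": datetime(2023, 11, 1),  # untouched for about three months
}

tiers = {name: storage_tier(ts, now) for name, ts in last_access_times.items()}
print(tiers)
```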

FSx for Windows

* File share for Windows
* NTFS file system using the SMB protocol
* Integrated with Active Directory

FSx for Lustre

* High performance file share for Linux systems
* Optimized for high performance computing and fast processing
* Two modes
    * Standalone file share
    * Link to an S3 bucket and access S3 as a file share
    
