# Storage

There are 3 main storage types in AWS.

* `block`: `EBS` (persistent) and instance store (ephemeral)
* `file`: Amazon `EFS` (Elastic File System)
* `object`: Amazon `S3` and `Glacier`

Which type of storage to use depends on your `data dimensions`.

* Volume, Variety, Velocity
  * volume is the size of the data
  * variety is the heterogeneity of the data types (text, videos, pictures, etc.)
  * velocity is how fast data flows through the system
* Temperature
  * `hot` data is actively used
  * `warm` data is actively used, but less often than hot data
  * `cold` data is used only occasionally
  * `frozen` data is not actively used
* Value
  * `transient` data has a short lifespan
  * `reproducible` data is derived from other data
  * `authoritative` data is the ground truth
  * `critical` data must be kept

## CIA model

AWS uses the `CIA` (Confidentiality, Integrity, Availability) model of information security. Confidentiality is achieved through permissions and encryption, which keep data private. Integrity centers on the accuracy of data: what is read back should match what was written. Availability refers to the storage service being reachable when data needs to be stored or retrieved.
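As a minimal sketch of the integrity idea, the snippet below records a SHA-256 digest when data is written and compares it when the data is read back. The function names and sample payloads are illustrative assumptions, not part of any AWS API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """True if data still hashes to the digest recorded at write time."""
    return sha256_hex(data) == expected_digest


original = b"quarterly-report-v1"
recorded = sha256_hex(original)  # stored alongside the data when it is written

intact = is_intact(original, recorded)                 # unchanged data
tampered = is_intact(b"quarterly-report-v2", recorded)  # altered data
```

Here `intact` is `True` and `tampered` is `False`, since any change to the bytes changes the digest.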

## Block storage

### EBS

Here are some things to consider about EBS.

* Network-attachment: EBS volumes are not physically attached to an EC2 instance; instead, they are network-attached
* Resizing: EBS volumes can be resized; this capability is referred to as `elastic volumes`
* SSD vs HDD: SSD-backed volumes are optimized for IOPS (input/output operations per second), while HDD-backed volumes are optimized for throughput
* Snapshots: EBS volumes can be snapshotted, and snapshots are efficient because they are incremental
* EBS-optimized instances: EBS-optimized instances keep traffic to and from EBS volumes separate from other network traffic
* Encryption: Data stored on EBS volumes can be encrypted using [AES-256](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) and the AWS `KMS` (Key Management Service)
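The IOPS versus throughput distinction in the list above comes down to simple arithmetic: throughput is the I/O rate multiplied by the size of each I/O. The sketch below uses made-up workload numbers purely for illustration; they are not EBS limits.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput implied by an I/O rate and a fixed I/O size."""
    return iops * io_size_kib / 1024


# Many small random I/Os (for example, a database-like workload).
small_random = throughput_mib_per_s(iops=3000, io_size_kib=16)

# Few large sequential I/Os (for example, a log or streaming workload).
large_sequential = throughput_mib_per_s(iops=250, io_size_kib=1024)
```

The first workload needs twelve times the IOPS yet moves about 46.9 MiB/s, while the second moves 250 MiB/s, which is why small random access favors IOPS-optimized volumes and large sequential access favors throughput-optimized ones.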

<div class="alert alert-info">
    
**Note:** When EBS volumes are restored from snapshots, the blocks must be pulled down from S3 and written to the volume, a process known as `initialization`, which may take significant time.
    
</div>
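To illustrate why incremental snapshots are efficient, here is a toy model in which a volume is a mapping from block index to block contents. This is only a sketch under that assumption: the names are hypothetical, deleted blocks are not handled, and it does not describe how EBS implements snapshots internally.

```python
from typing import Dict, List

Volume = Dict[int, bytes]  # block index -> block contents (toy model)


def incremental_snapshot(volume: Volume, previous: Volume) -> Volume:
    """Return only the blocks that differ from the previous state."""
    return {i: b for i, b in volume.items() if previous.get(i) != b}


def restore(snapshots: List[Volume]) -> Volume:
    """Rebuild a volume by replaying snapshots from oldest to newest."""
    volume: Volume = {}
    for snap in snapshots:
        volume.update(snap)
    return volume


volume = {0: b"boot", 1: b"data-a", 2: b"data-b"}
snap1 = incremental_snapshot(volume, previous={})  # first snapshot: every block
state_at_snap1 = dict(volume)

volume[1] = b"data-a2"  # a single block changes
snap2 = incremental_snapshot(volume, previous=state_at_snap1)

restored = restore([snap1, snap2])
```

The second snapshot stores one block instead of three, and replaying both snapshots reproduces the current volume.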

### Instance storage

Here are some things to consider about instance store volumes.

* Direct-attachment: instance store volumes are directly attached to an EC2 instance
* Availability: not all EC2 instance types have instance store available

## Object storage

### S3

The fundamental concept of S3 is the `bucket`. A bucket may look like a folder, but it does not behave like one. Each account has a bucket quota (100 by default at the time of writing, and it can be raised on request), and buckets cannot be nested (no buckets within buckets). Bucket names matter because they are global: a name must be unique across all AWS accounts, and following DNS naming conventions is encouraged.
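A rough sketch of what DNS-style bucket naming means in practice is shown below. The helper name is hypothetical, and it checks only a subset of the documented rules (length of 3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit; no adjacent dots; not shaped like an IPv4 address). Reserved prefixes and suffixes are not covered.

```python
import re

# 3 to 63 characters, lowercase letters, digits, dots, hyphens,
# starting and ending with a letter or digit.
_ALLOWED = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Four dot-separated groups of digits, i.e. shaped like an IPv4 address.
_IPV4_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Check a subset of the S3 bucket naming rules (illustrative only)."""
    if not _ALLOWED.fullmatch(name):
        return False
    if ".." in name:
        return False
    if _IPV4_LIKE.fullmatch(name):
        return False
    return True


ok = looks_like_valid_bucket_name("my-app-logs.example.com")
bad_case = looks_like_valid_bucket_name("My_Bucket")
bad_ip = looks_like_valid_bucket_name("192.168.0.1")
```

Passing this check does not mean a name is available, since another account may already own it.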

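`Objects` are the items (files) placed in a bucket. A bucket can be set to `version` all of its objects; once versioning has been enabled it cannot be disabled, only suspended. Objects are limited to 5 TB in size, and anything larger must be chunked into multiple objects. A single upload request is limited to 5 GB, so larger objects are uploaded in parts using multipart upload.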
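The chunking arithmetic can be sketched as follows. The constants reflect commonly documented multipart upload limits (parts of 5 MiB to 5 GiB, at most 10,000 parts, 5 TiB per object) and should be checked against current AWS quotas; `plan_parts` and its 8 MiB default are hypothetical choices for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# Assumed limits; verify against current S3 documentation.
MIN_PART_SIZE = 5 * MIB
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000
MAX_OBJECT_SIZE = 5 * TIB


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def plan_parts(object_size: int, part_size: int = 8 * MIB):
    """Return (part_size, part_count) for uploading an object in parts."""
    if not 0 < object_size <= MAX_OBJECT_SIZE:
        raise ValueError("object size is outside the assumed 5 TiB limit")
    # Grow the part size if the requested one would need too many parts.
    part_size = max(part_size, MIN_PART_SIZE, ceil_div(object_size, MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("part size exceeds the assumed 5 GiB limit")
    return part_size, ceil_div(object_size, part_size)


size_1g, count_1g = plan_parts(1 * GIB)  # fits comfortably in 8 MiB parts
size_5t, count_5t = plan_parts(5 * TIB)  # part size must grow to stay under 10,000 parts
```

For a 1 GiB object the default 8 MiB part size yields 128 parts; for a 5 TiB object the part size is raised so that the count stays within the part limit.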

## Definitions

* `KMS` (Key Management Service): AWS service to manage encryption keys
* `data in transit`: data that is moving (on the network or between two locations)
* `data at rest`: data that is stored
* `SSE` (Server-Side Encryption): encryption that happens on the server
* `envelope encryption`: a process of encrypting data with a key (the data key), and then encrypting that data key with another key, called the `key-encrypting key`
* `CSE` (Client-Side Encryption): encryption that happens on the client
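Envelope encryption, as defined above, can be sketched with two layers of keys. The example below is a structural illustration only: `toy_cipher` is a deliberately insecure XOR stand-in for a real cipher such as AES, and every name in it is hypothetical. In AWS the key-encrypting key would be held by KMS rather than generated locally.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    NOT secure: a stand-in for a real cipher such as AES. Applying it twice
    with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_encrypt(plaintext: bytes, kek: bytes):
    """Encrypt data with a fresh data key, then wrap that key with the KEK."""
    data_key = secrets.token_bytes(32)
    ciphertext = toy_cipher(data_key, plaintext)
    wrapped_key = toy_cipher(kek, data_key)
    return ciphertext, wrapped_key  # the plaintext data key is discarded


def envelope_decrypt(ciphertext: bytes, wrapped_key: bytes, kek: bytes) -> bytes:
    """Unwrap the data key with the KEK, then decrypt the data."""
    data_key = toy_cipher(kek, wrapped_key)
    return toy_cipher(data_key, ciphertext)


kek = secrets.token_bytes(32)  # key-encrypting key (assumed to live in a key service)
ciphertext, wrapped_key = envelope_encrypt(b"secret report", kek)
recovered = envelope_decrypt(ciphertext, wrapped_key, kek)
```

Only the ciphertext and the wrapped data key are stored together, so reading the data requires access to the key-encrypting key.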