# AWS Cloud Practitioner Essentials

## Module 5: Storage and Databases

### Instance Stores
Block-level storage volumes behave like physical hard drives.

An **instance store** provides ***temporary*** block-level storage for an Amazon EC2 instance. An instance store is disk storage that is **physically attached to the host computer** for an EC2 instance, and therefore has the same lifespan as the instance. When the instance is terminated, you lose any data in the instance store. 

**Amazon EC2 instances are virtual servers.** If you start an instance from a stopped state, the instance might start on ***another host***, where the previously used instance store volume does not exist. Therefore, AWS recommends instance stores for use cases that involve temporary data that you do not need in the long term.

### Amazon Elastic Block Store (Amazon EBS)
**Amazon Elastic Block Store (Amazon EBS)** is a service that provides block-level storage volumes that you can use with Amazon EC2 instances. If you stop or terminate an Amazon EC2 instance, all the data on the attached EBS volume **remains available**. To create an EBS volume, you define the configuration (such as volume size and type) and provision it. After you create an EBS volume, you can attach it to an Amazon EC2 instance.

Because EBS volumes are for data that needs to persist, it’s important to back up the data. You can do this by creating Amazon EBS snapshots. An **EBS snapshot** is an incremental backup. This means that the first backup taken of a volume copies all the data. For subsequent backups, **only the blocks of data that have changed since the most recent snapshot are saved**. Incremental backups are different from full backups, in which all the data in a storage volume is copied each time a backup occurs. A full backup includes data that has not changed since the most recent backup.
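To make the incremental idea concrete, here is a minimal toy sketch in plain Python. It is not the EBS API and does not reflect how AWS implements snapshots internally. The names `take_snapshot` and `block_fingerprints`, and the four-block "volume", are made up for illustration. The first snapshot saves every block, and the second saves only the block that changed.

```python
import hashlib


def block_fingerprints(volume):
    """Return one SHA-256 hex digest per block of the toy volume."""
    return [hashlib.sha256(block).hexdigest() for block in volume]


def take_snapshot(volume, previous_fingerprints=None):
    """Return (saved_blocks, fingerprints) for one toy snapshot.

    saved_blocks maps block index -> block contents. It holds every block
    for the first snapshot, and only the changed blocks afterwards.
    """
    fingerprints = block_fingerprints(volume)
    saved_blocks = {}
    for index, block in enumerate(volume):
        unchanged = (
            previous_fingerprints is not None
            and index < len(previous_fingerprints)
            and previous_fingerprints[index] == fingerprints[index]
        )
        if not unchanged:
            saved_blocks[index] = block
    return saved_blocks, fingerprints


# Hypothetical volume made of four small blocks.
volume = [b"boot", b"logs", b"data", b"temp"]
first_saved, first_fp = take_snapshot(volume)

volume[2] = b"DATA"  # modify a single block
second_saved, second_fp = take_snapshot(volume, first_fp)

print(sorted(first_saved))   # [0, 1, 2, 3]
print(sorted(second_saved))  # [2]
```

A full backup, by contrast, would save all four blocks both times.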

### Amazon Simple Storage Service (Amazon S3)
In **object storage**, each object consists of ***data, metadata, and a key***. The data might be an image, video, text document, or any other type of file. Metadata contains information about what the data is, how it is used, the object size, and so on. An object’s key is its unique identifier.

***Recall that when you modify a file in block storage, only the pieces that are changed are updated. When a file in object storage is modified, the entire object is updated.***

**Amazon Simple Storage Service (Amazon S3)** is a service that provides **object-level storage**. Amazon S3 stores data as objects in buckets.

You can upload any type of file to Amazon S3, such as images, videos, text files, and so on. For example, you might use Amazon S3 to store backup files, media files for a website, or archived documents. Amazon S3 offers unlimited storage space. The maximum file size for an object in Amazon S3 is 5 TB.

When you upload a file to Amazon S3, you can set permissions to control visibility and access to it. You can also use the Amazon S3 versioning feature to track changes to your objects over time.

When selecting an Amazon S3 storage class, consider these two factors:
- How often you plan to retrieve your data
- How available you need your data to be

#### S3 Standard
- Designed for frequently accessed data
- Stores data in a minimum of three Availability Zones

S3 Standard provides **high availability** for objects. This makes it a good choice for a wide range of use cases, such as websites, content distribution, and data analytics. S3 Standard **has a higher cost than other storage classes intended for infrequently accessed data and archival storage**.

#### S3 Standard-Infrequent Access (S3 Standard-IA)
- Ideal for infrequently accessed data
- Similar to S3 Standard but has a lower storage price and higher retrieval price

S3 Standard-IA is ideal for data that is infrequently accessed but requires high availability when needed. Both S3 Standard and S3 Standard-IA store data in **a minimum of three Availability Zones**. S3 Standard-IA provides the same level of availability as S3 Standard but with a lower storage price and ***a higher retrieval price***.

#### S3 One Zone-Infrequent Access (S3 One Zone-IA)
- Stores data in a single Availability Zone
- Has a lower storage price than S3 Standard-IA

Compared to S3 Standard and S3 Standard-IA, which store data in a minimum of three Availability Zones, S3 One Zone-IA stores data in **a single Availability Zone**. This makes it a good storage class to consider if the following conditions apply:
- You want to save costs on storage.
- You can easily reproduce your data in the event of an Availability Zone failure.

#### S3 Intelligent-Tiering
- Ideal for data with unknown or changing access patterns
- Requires **a small monthly monitoring and automation fee per object**

In the S3 Intelligent-Tiering storage class, Amazon S3 **monitors objects’ access patterns**. If you haven’t accessed an object for 30 consecutive days, Amazon S3 automatically moves it to the infrequent access tier, S3 Standard-IA. If you access an object in the infrequent access tier, Amazon S3 automatically moves it to the frequent access tier, S3 Standard.
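The tier movement described above can be sketched as a tiny state machine. This is a toy model in plain Python, not an AWS API. The class `TieredObject` and its method `end_of_day` are hypothetical names, and the only rules modeled are the two stated in these notes (30 consecutive idle days moves an object down, any access moves it back up).

```python
FREQUENT = "frequent access"
INFREQUENT = "infrequent access"
IDLE_DAYS_BEFORE_MOVE = 30  # threshold taken from the notes above


class TieredObject:
    """Toy model of one object whose access pattern is being monitored."""

    def __init__(self):
        self.tier = FREQUENT
        self.idle_days = 0

    def end_of_day(self, accessed):
        """Record one day of monitoring and return the resulting tier."""
        if accessed:
            # Any access resets the idle counter and restores the frequent tier.
            self.idle_days = 0
            self.tier = FREQUENT
        else:
            self.idle_days += 1
            if self.idle_days >= IDLE_DAYS_BEFORE_MOVE:
                self.tier = INFREQUENT
        return self.tier


obj = TieredObject()
for _ in range(29):
    obj.end_of_day(accessed=False)
print(obj.tier)  # frequent access (only 29 idle days so far)

obj.end_of_day(accessed=False)
print(obj.tier)  # infrequent access (30 consecutive idle days)

obj.end_of_day(accessed=True)
print(obj.tier)  # frequent access (moved back after being accessed)
```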

#### S3 Glacier
- Low-cost storage designed for **data archiving**
- Able to retrieve objects within a few minutes to hours

S3 Glacier is a low-cost storage class that is ideal for data archiving. For example, you might use this storage class to store archived customer records or older photos and video files.

#### S3 Glacier Deep Archive
- Lowest-cost object storage class ideal for archiving
- Able to retrieve objects within 12 hours

When deciding between Amazon S3 Glacier and Amazon S3 Glacier Deep Archive, consider how quickly you need to retrieve archived objects. You can retrieve objects stored in the S3 Glacier storage class within a few minutes to a few hours. By comparison, you can retrieve objects stored in the S3 Glacier Deep Archive storage class **within 12 hours**.
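As a study aid, the storage class descriptions above can be condensed into a rough decision helper. This is only a sketch of the rules of thumb in these notes, not official AWS guidance and not an AWS API. The function name `suggest_storage_class` and its parameters are hypothetical, and real decisions also depend on pricing and other details not covered here.

```python
def suggest_storage_class(access, reproducible=False, retrieval_within_hours=None):
    """Map a simplified access description to a storage class name.

    access: "frequent", "infrequent", "unknown", or "archive"
    reproducible: True if the data can easily be recreated after an
        Availability Zone failure
    retrieval_within_hours: for archives, how soon objects must be retrievable
    """
    if access == "frequent":
        return "S3 Standard"
    if access == "unknown":
        return "S3 Intelligent-Tiering"
    if access == "infrequent":
        # A single Availability Zone is cheaper but only suits reproducible data.
        return "S3 One Zone-IA" if reproducible else "S3 Standard-IA"
    if access == "archive":
        # The notes give "within 12 hours" for Deep Archive retrieval.
        if retrieval_within_hours is not None and retrieval_within_hours < 12:
            return "S3 Glacier"
        return "S3 Glacier Deep Archive"
    raise ValueError(f"unknown access pattern: {access!r}")


print(suggest_storage_class("frequent"))                           # S3 Standard
print(suggest_storage_class("infrequent", reproducible=True))      # S3 One Zone-IA
print(suggest_storage_class("archive", retrieval_within_hours=3))  # S3 Glacier
print(suggest_storage_class("archive"))                            # S3 Glacier Deep Archive
```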

### EBS or Amazon S3

S3 is already **web enabled**. Every object already has a URL, and you can control who is allowed to see or manage it. It's **regionally distributed** and offers 11 nines of durability, so no need to worry about backup strategies. S3 is your backup strategy. Plus, the **cost savings are substantial compared with running the same storage load on EBS**. It also has the additional advantage of being **serverless**: no Amazon EC2 instances are needed. ***(EBS needs to be attached to an EC2 instance).***

Object storage treats any file as a complete, discrete object. Now this is great for documents, images, and video files that get uploaded and consumed as entire objects, but every time there's a change to the object, you must re-upload the entire file. There are no delta updates. Block storage breaks those files down into small component parts, or blocks. This means that when you make an edit to one scene in a film (for example, an 80-gigabyte file) and save that change, the engine only updates the blocks where those bits live. If you're making a bunch of **micro edits**, then EBS, Elastic Block Store, is the perfect fit. If you were using S3, every time you saved the changes, the system would have to upload all 80 gigabytes, the whole thing, every time.
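The difference can be put into numbers with a small back-of-the-envelope sketch. This is plain Python arithmetic, not an AWS API. The 4 MB block size, the 50 MB edit, and the function names are assumptions chosen only for illustration. The 80 GB file size comes from the example above.

```python
GB = 10**9
MB = 10**6


def object_store_bytes_written(file_size):
    """Object storage: the whole object is uploaded again on every save."""
    return file_size


def block_store_bytes_written(edit_offset, edit_size, block_size):
    """Block storage: only the blocks overlapping the edited range are rewritten."""
    if edit_size <= 0:
        return 0
    first_block = edit_offset // block_size
    last_block = (edit_offset + edit_size - 1) // block_size
    return (last_block - first_block + 1) * block_size


film_size = 80 * GB   # file size from the example in the notes
edit_offset = 10 * GB  # hypothetical position of the edited scene
edit_size = 50 * MB    # hypothetical size of the edit
block_size = 4 * MB    # hypothetical block size

print(object_store_bytes_written(film_size) // MB)                             # 80000
print(block_store_bytes_written(edit_offset, edit_size, block_size) // MB)     # 52
```

Under these assumptions, one save rewrites about 52 MB on block storage and 80,000 MB on object storage.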

**This means that if you are using complete objects or making only occasional changes, S3 is victorious. If you are doing complex read, write, change functions, then, absolutely, EBS is your knockout winner.**

### Amazon Elastic File System (Amazon EFS)

In **file storage**, multiple clients (such as users, applications, servers, and so on) can access data that is stored in shared file folders. In this approach, a storage server uses block storage with a local file system to organize files. Clients access data through file paths. Compared to block storage and object storage, file storage is ideal for use cases in which **a large number of services and resources need to access the same data at the same time**.

**Amazon Elastic File System (Amazon EFS)** is a **scalable file system** used with AWS Cloud services and on-premises resources. As you add and remove files, Amazon EFS grows and shrinks automatically. It can scale on demand to petabytes without disrupting applications. 

#### Comparing Amazon EBS and Amazon EFS
- An Amazon EBS volume stores data in a **single** Availability Zone. To attach an Amazon EC2 instance to an EBS volume, both the Amazon EC2 instance and the EBS volume must reside within the same Availability Zone.
- Amazon EFS is a regional service. It stores data in and across **multiple** Availability Zones. The duplicate storage enables you to access data concurrently from all the Availability Zones in the Region where a file system is located. Additionally, on-premises servers can access Amazon EFS using AWS Direct Connect.

### Amazon Relational Database Service

**Amazon Relational Database Service (Amazon RDS)** is a service that enables you to run relational databases in the AWS Cloud.

Amazon RDS is a managed service that automates tasks such as hardware provisioning, database setup, patching, and backups. With these capabilities, you can spend less time completing administrative tasks and more time using data to innovate your applications. You can integrate Amazon RDS with other services to fulfill your business and operational needs, such as using AWS Lambda to query your database from a serverless application.

Amazon RDS provides a number of different security options. Many Amazon RDS database engines offer **encryption at rest (protecting data while it is stored)** and **encryption in transit (protecting data while it is being sent and received)**.

#### Amazon RDS database engines

Amazon RDS is available on six database engines, which optimize for memory, performance, or input/output (I/O). Supported database engines include:

- Amazon Aurora
- PostgreSQL
- MySQL
- MariaDB
- Oracle Database
- Microsoft SQL Server

#### Amazon Aurora

Amazon Aurora is an enterprise-class relational database. It is compatible with MySQL and PostgreSQL relational databases. It is up to five times faster than standard MySQL databases and up to three times faster than standard PostgreSQL databases.

Amazon Aurora helps to reduce your database costs by reducing unnecessary input/output (I/O) operations, while ensuring that your database resources remain reliable and available. 

Consider Amazon Aurora if your workloads require high availability. It replicates **six copies** of your data across three Availability Zones and continuously backs up your data to Amazon S3.

### Amazon DynamoDB

In a **nonrelational database**, you create tables. A table is a place where you can store and query data. Nonrelational databases are sometimes referred to as “NoSQL databases” because they use structures other than rows and columns to organize data. One type of structural approach for nonrelational databases is key-value pairs. With key-value pairs, data is organized into items (keys), and items have attributes (values). You can think of attributes as being different features of your data. In a key-value database, you can add or remove attributes from items in the table at any time. Additionally, not every item in the table has to have the same attributes. 
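A plain Python dictionary is enough to illustrate the key-value idea. This is a toy in-memory table, not the DynamoDB API. The helper names `put_item`, `set_attribute`, and `remove_attribute`, and the sample users, are made up for this sketch. Note that the two items end up with different sets of attributes.

```python
# Toy table: item key -> dictionary of attributes.
table = {}


def put_item(item_key, **attributes):
    """Store an item under its key with whatever attributes are given."""
    table[item_key] = dict(attributes)


def set_attribute(item_key, name, value):
    """Add or overwrite a single attribute on an existing item."""
    table[item_key][name] = value


def remove_attribute(item_key, name):
    """Remove an attribute from an item if it is present."""
    table[item_key].pop(name, None)


put_item("user-1", name="Ana", email="ana@example.com")
put_item("user-2", name="Ben", city="Lisbon", age=41)

set_attribute("user-1", "city", "Porto")  # add an attribute later
remove_attribute("user-2", "age")         # remove one at any time

print(sorted(table["user-1"]))  # ['city', 'email', 'name']
print(sorted(table["user-2"]))  # ['city', 'name']
```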

**Amazon DynamoDB** is a key-value database service. It delivers single-digit millisecond performance at any scale.
- DynamoDB is **serverless**, which means that you do not have to provision, patch, or manage servers. You also do not have to install, maintain, or operate software.
- As the size of your database shrinks or grows, DynamoDB **automatically scales** to adjust for changes in capacity while maintaining consistent performance. This makes it a suitable choice for use cases that require high performance while scaling.

### Amazon Redshift

Amazon Redshift is a data warehousing service (Datawarehouse-as-a-Service) that you can use for big data analytics. It offers the ability to collect data from many sources and helps you to understand relationships and trends across your data.

### AWS Database Migration Service (AWS DMS)

AWS Database Migration Service (AWS DMS) enables you to migrate relational databases, nonrelational databases, and other types of data stores.

With AWS DMS, you move data between a source database and a target database. The source and target databases can be of the same type or different types. During the migration, your source database remains operational, reducing downtime for any applications that rely on the database. 

For example, suppose that you have a MySQL database that is stored on premises, in an Amazon EC2 instance, or in Amazon RDS. Consider the MySQL database to be your source database. Using AWS DMS, you could migrate your data to a target database, such as an Amazon Aurora database.

#### Other use cases for AWS DMS
- **Development and test database migration**: Enabling developers to test applications against production data without affecting production users
- **Database consolidation**: Combining several databases into a single database
- **Continuous replication**: Sending ongoing copies of your data to other target sources instead of doing a one-time migration

### Additional database services
#### Amazon DocumentDB
Amazon DocumentDB is a document database service that supports MongoDB workloads. (MongoDB is a document database program.)

#### Amazon Neptune
Amazon Neptune is a **graph database service**. 

You can use Amazon Neptune to build and run applications that work with highly connected datasets, such as recommendation engines, fraud detection, knowledge graphs, and social network data.

#### Amazon Quantum Ledger Database (Amazon QLDB)
Amazon Quantum Ledger Database (Amazon QLDB) is a ledger database service. 

You can use Amazon QLDB to review a complete history of all the changes that have been made to your application data.

#### Amazon Managed Blockchain
Amazon Managed Blockchain is a service that you can use to create and manage blockchain networks with open-source frameworks. 

Blockchain is a distributed ledger system that lets multiple parties run transactions and share data without a central authority.

#### Amazon ElastiCache
Amazon ElastiCache is a service that adds **caching layers** on top of your databases to help **improve the read times** of common requests. 

It supports two types of data stores: Redis and Memcached.
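The idea of a caching layer in front of a database can be sketched in a few lines. This is a generic in-memory illustration in plain Python, not the ElastiCache, Redis, or Memcached API. The class `CachedReader` and the sample `database` dictionary are hypothetical. Repeated reads of the same key are served from the cache instead of going back to the database.

```python
class CachedReader:
    """Toy caching layer that sits in front of a slower data source."""

    def __init__(self, load_from_database):
        self._load = load_from_database
        self._cache = {}
        self.database_reads = 0  # counts how often the database is hit

    def get(self, key):
        """Return the value for key, reading the database only on a cache miss."""
        if key in self._cache:
            return self._cache[key]
        self.database_reads += 1
        value = self._load(key)
        self._cache[key] = value
        return value


# Hypothetical "database" holding one record.
database = {"product-42": "Blue kettle"}
reader = CachedReader(database.__getitem__)

print(reader.get("product-42"))  # Blue kettle (read from the database)
print(reader.get("product-42"))  # Blue kettle (served from the cache)
print(reader.database_reads)     # 1
```

This sketch never expires or invalidates cached entries. A real caching layer has to handle stale data, which is outside the scope of these notes.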

#### Amazon DynamoDB Accelerator
Amazon DynamoDB Accelerator (DAX) is an **in-memory cache** for DynamoDB. 

It helps **improve response times** from single-digit milliseconds to microseconds.