# Designing for Resilience

## AWS Well-Architected

AWS Well-Architected helps you build secure, high-performing, resilient, and efficient infrastructure for a variety of applications and workloads. The framework is built around six pillars (operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability) and provides a consistent approach to evaluating architectures and implementing scalable designs. The AWS Well-Architected site provides lens-specific whitepapers, labs, guidance, and pillar-specific documentation so you can focus on specific pillars of the framework.

## What is resilience?

Resilience is the ability of a workload to recover from infrastructure or service disruptions. The recovery has two dimensions: how long it takes to get the system back online for your users and how long it takes to restore the data you had.

As discussed earlier, Amazon EC2 is resilient by design, because the AWS global infrastructure is built around the concept of AWS Regions and Availability Zones. Regions provide multiple physically separated and isolated Availability Zones, which are connected through low-latency, high-throughput, and highly redundant networking. 

In addition to the AWS global infrastructure, Amazon EC2 offers the following features to support your data resilience:

- Copying AMIs and EBS snapshots across Regions

- Automating EBS-backed AMIs and EBS snapshots using Amazon Data Lifecycle Manager

- Using Amazon EC2 Auto Scaling to maintain the health and availability of your EC2 instances

- Using Elastic Load Balancing to distribute incoming traffic across multiple instances

## Instance settings that impact resilience

When you build an EC2 instance, you can adjust several settings to improve resilience and recoverability. These settings are configured on the instance or the AMI and can shorten the time it takes for the instance to return to a productive state.

### Termination setting
This setting, controlled by the DeleteOnTermination attribute of each volume, determines whether Amazon EBS volumes are preserved or deleted when the instance terminates. It has no effect on instance store volumes, which are ephemeral and are automatically removed when the instance terminates.



The default value of the DeleteOnTermination attribute depends on whether the volume is the root volume of the instance or a non-root volume attached to the instance.



Root volume

By default, the root volume is deleted when an instance is terminated. To preserve it, set the DeleteOnTermination attribute to false. Either the creator of an AMI or the person who launches an instance can set this attribute, and whichever of them changes it, the new value overrides the original AMI default. For this reason, it's important to verify the setting when you launch an instance from an AMI.



Non-root volume

By default, non-root volumes are preserved when an instance is terminated. After the instance terminates, you can take a snapshot of the preserved volume or attach it to another instance. You are charged for unattached volumes, so delete the volumes that you no longer need to avoid unnecessary cost.
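As a rough illustration of changing this attribute on an existing instance, the sketch below builds the arguments for the EC2 ModifyInstanceAttribute operation, which boto3 exposes as `modify_instance_attribute`. The function name, instance ID, and device name are hypothetical placeholders chosen for this example. The AWS call itself is left as a comment so that the snippet runs without credentials.

```python
from typing import Any, Dict


def keep_volume_on_termination(instance_id: str, device_name: str) -> Dict[str, Any]:
    """Build arguments that set DeleteOnTermination to false for one attached volume."""
    return {
        "InstanceId": instance_id,
        "BlockDeviceMappings": [
            {"DeviceName": device_name, "Ebs": {"DeleteOnTermination": False}}
        ],
    }


# Hypothetical instance ID and device name, for illustration only.
params = keep_volume_on_termination("i-0123456789abcdef0", "/dev/xvda")
print(params)

# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   boto3.client("ec2").modify_instance_attribute(**params)
```

The same `Ebs` / `DeleteOnTermination` structure appears in the block device mappings you supply at launch, so a similar dictionary could be reused there.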

### Shutdown behavior
To avoid terminating an instance accidentally, you can set the instance shutdown behavior to stop instead of terminate. This setting applies when a shutdown is initiated from within the instance's operating system. With the stop behavior, the instance stops and the EBS volumes attached to it are preserved. This is one small step you can take to guard against accidental termination and data loss.
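A minimal sketch of this setting, assuming you manage the instance with boto3: the shutdown behavior is stored in the InstanceInitiatedShutdownBehavior attribute, which accepts the values `stop` and `terminate`. The helper name and the instance ID below are hypothetical, and the actual AWS call is shown only as a comment.

```python
from typing import Any, Dict

# The two values that the InstanceInitiatedShutdownBehavior attribute accepts.
VALID_BEHAVIORS = ("stop", "terminate")


def shutdown_behavior_params(instance_id: str, behavior: str = "stop") -> Dict[str, Any]:
    """Build arguments that set what happens on an OS-initiated shutdown."""
    if behavior not in VALID_BEHAVIORS:
        raise ValueError(f"behavior must be one of {VALID_BEHAVIORS}, got {behavior!r}")
    return {
        "InstanceId": instance_id,
        "InstanceInitiatedShutdownBehavior": {"Value": behavior},
    }


# Hypothetical instance ID, for illustration only.
params = shutdown_behavior_params("i-0123456789abcdef0")
print(params)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("ec2").modify_instance_attribute(**params)
```

Rejecting unknown values locally is a design choice for this sketch; it surfaces a typo before any request is sent.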

### Modify termination policy
Amazon EC2 Auto Scaling helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. 

Amazon EC2 Auto Scaling uses termination policies to determine which instances it terminates first during scale-in events. A termination policy defines the criteria that Amazon EC2 Auto Scaling uses when choosing which instances to terminate.



Your Auto Scaling groups use a default termination policy, but you can optionally choose or create your own termination policies with your own termination criteria. This lets you ensure that your instances are terminated based on your specific application needs.
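To make the idea concrete, the sketch below builds the arguments for assigning built-in termination policies to a group through boto3's `update_auto_scaling_group`. The group name and helper name are hypothetical. The sketch validates only the built-in policy names, so custom policies are out of scope here. Policies are applied in the order listed.

```python
from typing import Any, Dict, List

# Built-in termination policy names accepted by Amazon EC2 Auto Scaling.
BUILT_IN_POLICIES = frozenset({
    "Default",
    "AllocationStrategy",
    "ClosestToNextInstanceHour",
    "NewestInstance",
    "OldestInstance",
    "OldestLaunchConfiguration",
    "OldestLaunchTemplate",
})


def termination_policy_params(group_name: str, policies: List[str]) -> Dict[str, Any]:
    """Build arguments that assign an ordered list of built-in termination policies."""
    if not policies:
        raise ValueError("at least one termination policy is required")
    unknown = [p for p in policies if p not in BUILT_IN_POLICIES]
    if unknown:
        raise ValueError(f"unknown termination policies: {unknown}")
    return {"AutoScalingGroupName": group_name, "TerminationPolicies": list(policies)}


# Hypothetical group name: prefer removing the oldest instances first,
# then fall back to the default policy.
params = termination_policy_params("web-asg", ["OldestInstance", "Default"])
print(params)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("autoscaling").update_auto_scaling_group(**params)
```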
 
### Enable termination protection
By default, you can terminate your instance using the Amazon EC2 console, command line interface (CLI), or API. To prevent your instance from being accidentally terminated using Amazon EC2, you can turn on termination protection for the instance. The DisableApiTermination attribute controls whether the instance can be terminated using the console, CLI, or API. By default, termination protection is turned off for your instance. You can set the value of this attribute when you launch the instance, while the instance is running, or while the instance is stopped (for Amazon EBS-backed instances).
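A short sketch of toggling this attribute, assuming boto3 is your client library: setting DisableApiTermination to true turns termination protection on, and setting it to false turns it off. The helper name and instance ID are hypothetical, and the AWS call appears only as a comment.

```python
from typing import Any, Dict


def termination_protection_params(instance_id: str, enabled: bool = True) -> Dict[str, Any]:
    """Build arguments that turn termination protection on or off.

    DisableApiTermination=True means the instance cannot be terminated
    through the console, CLI, or API until the attribute is set back to False.
    """
    return {"InstanceId": instance_id, "DisableApiTermination": {"Value": enabled}}


# Hypothetical instance ID, for illustration only.
protect = termination_protection_params("i-0123456789abcdef0")
unprotect = termination_protection_params("i-0123456789abcdef0", enabled=False)
print(protect)
print(unprotect)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("ec2").modify_instance_attribute(**protect)
```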

### Simplified automatic recovery
Instances that support simplified automatic recovery are configured by default to recover automatically when they fail. The default configuration applies to new instances that you launch and to existing instances that you previously launched. Simplified automatic recovery is initiated in response to system status check failures.
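If you need to check or change this behavior programmatically, one possible approach is the EC2 ModifyInstanceMaintenanceOptions operation, whose AutoRecovery parameter takes `default` or `disabled`. The sketch below only builds the arguments. The helper name and instance ID are hypothetical, and you should confirm the operation against the current API reference before relying on it.

```python
from typing import Any, Dict

# Values accepted by the AutoRecovery parameter.
VALID_AUTO_RECOVERY = ("default", "disabled")


def auto_recovery_params(instance_id: str, mode: str = "default") -> Dict[str, Any]:
    """Build arguments that keep or turn off simplified automatic recovery."""
    if mode not in VALID_AUTO_RECOVERY:
        raise ValueError(f"mode must be one of {VALID_AUTO_RECOVERY}, got {mode!r}")
    return {"InstanceId": instance_id, "AutoRecovery": mode}


# Hypothetical instance ID, for illustration only.
params = auto_recovery_params("i-0123456789abcdef0")
print(params)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("ec2").modify_instance_maintenance_options(**params)
```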

The settings that you choose when you build your instance are only part of building a resilient and available environment. Making sure that EBS volumes are not accidentally deleted on termination is a good start, but it cannot be the only place where you plan for the day that something fails. Along with planning a backup and recovery strategy, you also need to ensure that the environment is up, running, and usable by your customers.

Now that John has a confident grasp of the states of an EC2 instance and how these states affect the data volumes, he's going to map out the best ways to design the environment for availability while keeping resilience at the forefront of his plans. Let's take a look at John's research into designing for availability.