# **Infrastructure as Code Tools** #

Infrastructure was traditionally provisioned using a combination of scripts and manual processes. This often resulted in the creation of new environments that were not repeatable, reliable, or consistent. Additionally, out-of-date scripts and processes became deployment showstoppers.

The solution is to treat infrastructure deployment the way that software developers treat their code: in a programmatic, descriptive, and replicable way. This is referred to as infrastructure as code (IaC), and it is one of the most powerful and far-reaching aspects of cloud computing.


## **Infrastructure as code** ##

IaC refers to managing and provisioning infrastructure through machine-readable definition files instead of physical hardware configuration or interactive configuration tools.

With the advent of virtual machines in the late 1990s, it became possible to do much of this. However, you still had to purchase and deploy hardware with on-premises systems, and then install operating systems and application software.

With the cloud, you don't need to do any of that.

Consider the following scenarios, each of which requires you to create or duplicate an environment quickly and consistently:

- You have an idea for cost optimization of one of your workflows and want to test the idea.
- New serverless options are available, and you would like to test them to see if they would be feasible for your workflows.
- You want to duplicate an operational system for use in another AWS account.
- You want to create a catalog of services and architectures that can be deployed by authorized users.
- You need a disaster recovery system that mimics your production system.

There are multiple proven and reliable IaC tools, including AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), and HashiCorp Terraform, that can be used to deploy workflows employing AWS services.

The flagship IaC product in AWS is AWS CloudFormation.

## **AWS CloudFormation** ##


Using AWS CloudFormation is very straightforward. You construct a template that defines what you want to build, and then upload the template. AWS deploys the system defined by the template.

### **Using templates** ###

A template in AWS CloudFormation refers to a JSON or YAML file that defines the AWS infrastructure resources and properties required to deploy your application. The template acts as a blueprint for your infrastructure.

A CloudFormation template is analogous to a cooking recipe as shown in the following table.
![image-2.png](attachment:image-2.png)


Here is an example of a simple CloudFormation template to deploy a single Amazon Elastic Compute Cloud (Amazon EC2) instance. 
![image.png](attachment:image.png)
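
For readers working from text, the following is a minimal sketch of what a single-instance template can look like. It is not a transcription of the image above. The logical name `MyEC2Instance`, the description, and the `ami-EXAMPLE` image ID are placeholders chosen for illustration, and you would substitute a real AMI ID for your Region. The sketch builds the template as a Python dictionary and serializes it to JSON, one of the two formats CloudFormation accepts.

```python
import json

# Illustrative template only: the logical ID and AMI ID are placeholders.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Single EC2 instance (illustrative sketch)",
    "Resources": {
        "MyEC2Instance": {  # hypothetical logical ID
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-EXAMPLE",  # placeholder, not a real AMI ID
                "InstanceType": "t2.micro",
            },
        }
    },
}

# Serialize the dictionary to the JSON text you would upload as a template.
template_body = json.dumps(template, indent=2)
print(template_body)
```

The `Resources` section is the only required section of a template. Each entry maps a logical name to a resource type and its properties.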

## **Using CloudFormation for repeatable deployments** ##

A data engineering team is setting up a basic networking standard that all of their applications will follow. Until now, they configured the required AWS resources manually each time, which was time-consuming and error-prone.

To solve this, they use AWS CloudFormation to do the following:

- Define a template that creates the necessary resources, such as an Amazon S3 bucket, an Amazon CloudFront distribution, and Amazon Route 53 records.
- Specify parameters in the template for values that vary between deployments, such as VPCs, subnets, and security groups.
- Launch a new stack from the same template, with updated parameter values, whenever a new network setup is needed.
- Let CloudFormation provision, configure, and link all the resources, which saves significant time compared to the manual deployment process.
- Make future changes by updating the template file and then updating each stack from it, rather than modifying each environment by hand. This creates a consistent, repeatable deployment process.
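
The "one template, many parameter sets" pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the team's actual template: the parameter names `VpcId` and `EnvironmentName`, the `AppSecurityGroup` resource, the `stack_parameters` helper, and the example VPC IDs are all hypothetical. The helper returns parameters in the `ParameterKey`/`ParameterValue` list shape used when creating a stack through the CloudFormation API. It does not call AWS.

```python
# One shared template with parameters for the values that differ per deployment.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "VpcId": {"Type": "AWS::EC2::VPC::Id"},
        "EnvironmentName": {"Type": "String"},
    },
    "Resources": {
        "AppSecurityGroup": {  # hypothetical resource for illustration
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Baseline security group",
                "VpcId": {"Ref": "VpcId"},
                "Tags": [
                    {"Key": "Environment", "Value": {"Ref": "EnvironmentName"}}
                ],
            },
        }
    },
}


def stack_parameters(values):
    """Build the parameter list for one stack, checking nothing is missing."""
    missing = set(template["Parameters"]) - set(values)
    if missing:
        raise ValueError(f"missing parameter values: {sorted(missing)}")
    return [
        {"ParameterKey": key, "ParameterValue": values[key]}
        for key in sorted(template["Parameters"])
    ]


# Placeholder values: each environment reuses the same template unchanged.
environments = {
    "dev": {"VpcId": "vpc-EXAMPLE1", "EnvironmentName": "dev"},
    "prod": {"VpcId": "vpc-EXAMPLE2", "EnvironmentName": "prod"},
}

for name, values in environments.items():
    print(name, stack_parameters(values))
```

Only the parameter values change between environments, so a fix made once in the template applies to every stack the next time that stack is updated.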


CloudFormation can also detect configuration drift, which occurs when a resource is changed outside of CloudFormation. Drift detection shows the team where the actual state of the infrastructure has diverged from the desired state defined by the template, so they can bring the two back into alignment. The team can now rapidly launch network stacks and scale to support hundreds of users with far less operational overhead. This helps them focus on their core business and expand their offerings.
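
Conceptually, drift is the difference between the properties a template declares and the properties a resource actually has. The following sketch illustrates that comparison only. It is not how CloudFormation implements drift detection, and the `find_drift` function and the sample property values are hypothetical.

```python
def find_drift(desired, actual):
    """Return properties whose actual value differs from the desired value."""
    drift = {}
    for key in sorted(set(desired) | set(actual)):
        if desired.get(key) != actual.get(key):
            drift[key] = {"expected": desired.get(key), "actual": actual.get(key)}
    return drift


# Desired state, as declared in a template (illustrative values).
desired = {"InstanceType": "t2.micro", "Monitoring": False}

# Actual state, after someone resized the instance by hand.
actual = {"InstanceType": "t2.large", "Monitoring": False}

print(find_drift(desired, actual))
```

An empty result means the resource still matches its template. Any entry identifies a property to correct, either by reverting the manual change or by updating the template to match.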

## **AWS CDK** ## 


AWS Cloud Development Kit (AWS CDK) is an open-source software development framework that you can use to define and deploy AWS resources using programming languages such as TypeScript, Python, Java, and C#. Instead of writing declarative templates by hand, you write imperative code to define your infrastructure.

AWS CDK internally uses CloudFormation to provision and manage AWS resources. When you synthesize your AWS CDK code, AWS CDK generates CloudFormation templates from it, and then deploys those templates using CloudFormation.
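
The idea of synthesizing a template from code can be illustrated with a toy model. The sketch below is deliberately not the AWS CDK API: `ToyStack`, `ToyBucket`, and their methods are hypothetical stand-ins written only to show how ordinary code (here, a loop) can generate CloudFormation resource definitions. The real framework provides its own construct classes and tooling.

```python
import json


class ToyBucket:
    """Toy stand-in for a construct; renders one AWS::S3::Bucket resource."""

    def __init__(self, logical_id, versioned=False):
        self.logical_id = logical_id
        self.versioned = versioned

    def to_resource(self):
        resource = {"Type": "AWS::S3::Bucket"}
        if self.versioned:
            resource["Properties"] = {
                "VersioningConfiguration": {"Status": "Enabled"}
            }
        return resource


class ToyStack:
    """Toy stand-in for a stack; collects constructs and emits a template."""

    def __init__(self):
        self.constructs = []

    def add(self, construct):
        self.constructs.append(construct)
        return construct

    def synth(self):
        # "Synthesis": turn the in-memory constructs into a template dictionary.
        return {
            "AWSTemplateFormatVersion": "2010-09-09",
            "Resources": {c.logical_id: c.to_resource() for c in self.constructs},
        }


stack = ToyStack()
# A loop replaces two hand-written, nearly identical template entries.
for zone in ("Raw", "Curated"):
    stack.add(ToyBucket(f"{zone}DataBucket", versioned=True))

template = stack.synth()
print(json.dumps(template, indent=2))
```

The output is an ordinary CloudFormation template, which is why everything described earlier about stacks and templates still applies to infrastructure defined with AWS CDK.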

AWS CDK aims to provide a higher-level, more developer-friendly abstraction on top of CloudFormation. You can define infrastructure using familiar programming languages and constructs, while still using the power and capabilities of CloudFormation.

The choice between using CloudFormation directly or AWS CDK often depends on factors such as team preferences, existing skills and workflows, and the complexity of the infrastructure being managed.

## **Summary** ##

By using IaC tools, your data engineering team can build, deploy, and manage data infrastructure in a scalable, reliable, and efficient manner. The team can define their infrastructure (such as data lakes, data warehouses, or data pipelines) as code and provision resources in an automated and repeatable way. IaC tools such as AWS CloudFormation and AWS CDK encourage a modular, reusable approach to defining infrastructure. Your team can share and reuse common patterns across multiple data engineering projects.