This repository serves as a supporting workspace for the proof-of-concept (PoC) implementation related to the Master Thesis "Quantitative Cost Assessment of IaC Testing". While the actual PoC is encapsulated within a submodule, this repository contains the essential guides and tooling required to set up and run the PoC.
This separation is done to facilitate streamlined testing, as the test pipeline can directly check out only the executable code - see Submodule thesis-tf for more information.
- Devcontainer: Optional setup for a standardized development environment supporting the PoC.
- Test Pipeline: Instructions for executing the test pipeline with the provided Jenkinsfile.
- Submodule thesis-tf: The core elements of the PoC including Terraform configuration, Jenkinsfile, and test implementations.
- Data Collection: The approach for collecting test metrics and cost data within the test pipeline.
- Cleanup: Guidelines for using `cloud-nuke` to clean up AWS resources post-testing.
- Account Creation: Steps for setting up necessary accounts for using the PoC.
This repository includes a devcontainer definition, providing a consistent and isolated development environment, especially beneficial during the research and development stages of this project. If your focus is on the test pipeline, you may skip ahead to the Test Pipeline section.
DevContainers are essentially containerized development environments, allowing for the encapsulation of tools and libraries required for a project. This approach significantly simplifies the setup process and ensures a uniform development experience. While this guide is tailored for use with VS Code, those opting for GitHub Codespaces should consult the GitHub Codespaces documentation to familiarize themselves with its specifics.
- Docker Desktop or a compatible Docker alternative
- VS Code
- VS Code Dev Containers Extension
For more details, consult the VS Code Guide.
To get started, clone this repository and its submodule using the following command:
```shell
git clone --recurse-submodules https://github.com/fex01/thesis-ws.git
```
Once you have cloned the repository, open the project folder in VS Code. At this stage, you will likely encounter a prompt suggesting to 'Reopen in Container'. Please ignore this prompt for now, as there are initial configuration steps that need to be completed first. These steps are crucial to ensure that the devcontainer is properly set up before proceeding with its execution.
For guidance on obtaining the necessary credentials, please refer to the Account Creation section of this README.
- AWS Credentials: These credentials are essential for deploying infrastructure to the cloud provider AWS, allowing for the management and operation of cloud resources. You have two options for configuring AWS credentials within the DevContainer:
  - .env File: Create a `.env` file based on the .env_template available in the project root directory. Populate `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` with your actual AWS credentials.
  - AWS Host Configuration Integration: Use the host machine's AWS configuration by following the instructions under AWS Host Configuration Integration.
- InfraCost Credentials: These credentials are required to generate cost breakdowns for the resources configured as part of your infrastructure, facilitating effective cost management and optimization. To configure InfraCost:
  - Create a `.env` file based on the .env_template available in the project root directory. Populate `INFRACOST_API_KEY` with your actual InfraCost API key.
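For orientation, a populated `.env` might look like the following sketch, assuming the `.env_template` uses plain KEY=value lines (all values are placeholders, not real credentials):

```
# .env -- based on .env_template in the project root; values are placeholders
AWS_ACCESS_KEY_ID=<your-access-key-id>
AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
INFRACOST_API_KEY=<your-infracost-api-key>
```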
Refer to `./.devcontainer/devcontainer.json` for information on installed tools and how to individualize your setup.
- Verify that Docker is running: `docker --version`
- Open the `thesis-ws` repository in VS Code.
- Use the Command Palette (F1) and select `Dev Containers: Reopen in Container`.
- The initial container build might take up to 15 minutes. Subsequent launches with an existing container will open within seconds.
Now you are all set to explore the PoC within the devcontainer.
The submodule includes a `Jenkinsfile` defining the test pipeline.
This pipeline can be executed on your own Jenkins installation or through our provided dockerized Jenkins setup.
In either case, certain prerequisites like credentials and job configurations are necessary.
For a comprehensive guide on these prerequisites and on setting up a dockerized Jenkins installation, please consult Jenkins Setup.
The submodule is an independent repository that houses the core elements of the PoC. It includes the IaC configuration written in Terraform, a Jenkinsfile for defining the test pipeline, and the actual test implementations. This separation allows for streamlined testing, as the test pipeline can directly check out only the executable code.
For more information on the specifics of the submodule, please refer to its README.
The data collection process for the "Quantitative Cost Assessment of IaC Testing" PoC is methodically embedded within the test pipeline. Each execution of the pipeline is designed to capture detailed metrics regarding test runtimes and associated costs.
The pipeline is configured to execute tests individually or in groups, gathering data for each run. Individual test executions are performed using:
```groovy
// /terraform/Jenkinsfile
...
steps {
    sh """scripts/run_test.sh \\
        --build-number ${BUILD_NUMBER} \\
        --defect-category ${DEFECT_CATEGORY} \\
        --test-approach ${TEST_APPROACH} \\
        --test-command '${TEST_COMMAND}' \\
        --csv-file ${CSV_FILE}"""
}
...
```
For grouped tests, the following command is used:
```groovy
// /terraform/Jenkinsfile
...
steps {
    sh """scripts/run_grouped_tests.sh \\
        --build-number ${BUILD_NUMBER} \\
        --test-folder ${TEST_FOLDER} \\
        --test-tool '${TEST_TOOL}' \\
        --test-command '${TEST_COMMAND}' \\
        --csv-file ${CSV_FILE}"""
}
...
```
Each build within the test pipeline generates a new Infracost breakdown report for the resources scheduled for deployment. Subsequently, the costs for each test are calculated and added to the measured data:
```groovy
// /terraform/Jenkinsfile
...
steps {
    sh """scripts/extend_measurements_with_costs.py \\
        --infracost-json ${INFRACOST_JSON} \\
        --measurements-csv ${CSV_FILE}"""
}
...
```
This process ensures that the resulting dataset includes not only the performance metrics of the tests but also a financial dimension, reflecting the cost implications of the deployed infrastructure.
All scripts related to the test runs are located in the `/terraform/scripts/` directory within the repository.
The collected metrics for each test include the build number, defect category, test case, test approach, test tool, and runtime in seconds. Furthermore, the cost associated with each test is calculated and appended to the data. This calculation accounts for the billing modalities of different resources and is based on the test runtime and an Infracost breakdown of resource prices. The cost calculation script can be found at `calculate_costs.py`.
The CSV file containing the measurements, along with the corresponding Infracost breakdown report, is saved as an artifact for each test pipeline build. These artifacts are retrievable via the Jenkins Job Portal, allowing for in-depth analysis of the collected data.
To assist with the aggregation and analysis of data across multiple builds, a helper script is available. This script is designed to collect and merge data, which can be executed on the local Jenkins server or from the Docker host when employing our dockerized Jenkins setup.
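The aggregation step can be sketched as follows, assuming each build archives a measurement CSV with an identical header row. The function name, glob pattern, and column layout here are illustrative assumptions, not the actual helper script:

```python
# Hypothetical sketch of the cross-build aggregation helper; the file-name
# pattern and column layout are assumptions made for illustration.
import csv
import glob

def merge_measurements(pattern: str, out_path: str) -> int:
    """Concatenate measurement CSVs matching *pattern* into *out_path*,
    keeping a single header row. Returns the number of data rows merged."""
    header, rows = None, []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader)
            if header is None:
                header = file_header  # take the header from the first file
            rows.extend(reader)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Every data row from the per-build artifacts is kept; only the repeated header lines are dropped, leaving one table ready for analysis.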
The repository includes the raw data that is the foundation of the analysis presented in the thesis. In addition, the specific Infracost breakdown report used to calculate the costs for these measurements is also provided. This dataset is openly shared to promote transparency and to enable peer validation of our research findings. The availability of the raw data along with the cost calculations ensures full traceability, allowing others to understand and verify the data-driven conclusions that support our thesis.
By making this information publicly accessible, we underscore our commitment to an exhaustive and transparent approach to data collection, bolstering the reliability and reproducibility of our work on the costs associated with IaC testing.
During the course of extended testing, or in the event of partial failures of `terraform destroy`, there may be residual AWS resources that are not properly removed. To address this, we have used `cloud-nuke` throughout the implementation phase.
The `cloud-nuke` tool is incorporated in our devcontainer and can be utilized as follows:

- `cloud-nuke aws`: This command will attempt to remove all AWS resources. Exercise caution while using this command, especially in production environments. It is imperative to be fully aware of the ramifications of this action.
- `--config terraform/cloud-nuke.yaml`: Utilizing this flag will incorporate the exemption rules as explained in the Cloud-Nuke Exemption Configuration subsection. This safeguards essential resources, like the manually created initial IAM user and any preexisting AWS roles, from unintended deletion.
- `--force`: This flag enables non-interactive execution, bypassing the confirmation dialog.
- `AWS_PROFILE=thesis cloud-nuke aws`: To specify a particular AWS profile, precede the `cloud-nuke` command with the `AWS_PROFILE` variable set to the desired profile name.
Note that `cloud-nuke` is a powerful tool that should be used with full understanding and caution.
In addition to manual invocation, our test pipeline includes a dedicated `post` action that is designed to automate the cleanup process. You have the option to enable or disable this step by setting the pipeline parameter `nuke` (default: `false`).
When enabled, this stage will run `cloud-nuke` with the exemption configuration file located at `terraform/cloud-nuke.yaml`.
The `cloud-nuke` exemption configuration file, located at `terraform/cloud-nuke.yaml`, is instrumental for safeguarding essential resources during both automated pipeline cleanup and manual `cloud-nuke` executions. Specifically, the account credentials used for deployment should be included in this exemption list. In our particular setup, these credentials are associated with an account named `admin`. If your deployment uses a different account name, it is imperative to update this configuration file accordingly to prevent unintentional deletions.
Whether you are running the "Nuke" stage in the test pipeline or executing `cloud-nuke` manually, this exemption configuration ensures a more secure cleanup process, minimizing the risk of accidental resource removal.
If you do not currently have an AWS account, the following is a concise outline for setting up an initial account and user, distilled from AWS's more comprehensive Getting Started Guide.
- Initial Setup: Create an AWS account, which automatically establishes a root user.
  - Security Measures: It is advisable to enable Multi-Factor Authentication (MFA) for the root user immediately for basic security. Navigate to `IAM -> Add MFA`.
- IAM User Creation: For operational tasks, avoid using the root user. Instead, create a secondary IAM user with administrative privileges.
  - Navigate to `IAM -> Users -> Create user` and proceed to attach the `AdministratorAccess` policy.
- API Credentials: Subsequent to user creation, generate the necessary API credentials.
  - Click on the username and select `Create access key`. Note that although temporary credentials are recommended for heightened security, permanent credentials are deemed sufficient for the purpose of this guide. Ensure that you store the `Access key` and `Secret access key` in a secure location.
- Local CLI Configuration (Optional): Configure the local AWS CLI with the `aws configure` command, if required for your environment outside of the DevContainer. This step involves entering the `Access key ID` and `Secret access key` from earlier, and selecting your preferred AWS region (e.g., `eu-west-3`). It results in the creation of the `~/.aws/credentials` and `~/.aws/config` files. Note that this step is not necessary if the credentials are solely for use within the DevContainer.
Should you prefer to use your host machine's AWS configuration, the `.devcontainer/devcontainer.json` file can be modified to achieve this. Under the `mounts` section, amend the corresponding AWS entry as illustrated below:
```jsonc
{
  ...
  // Configure mounts from host to container.
  "mounts": [
    ...
    // To use the host's AWS configuration, uncomment the following line:
    // "source=${localEnv:HOME}/.aws,target=/home/vscode/.aws,type=bind"
  ]
}
```
Infracost, the chosen cost estimation tool for our test pipeline, requires an API key for operation. This key is necessary both for the DevContainer and within the test pipeline.
To obtain your Infracost API key, follow these steps:
- Via CLI: Typically, you can use the command `infracost auth login` to get an API key. However, if this method is unsuccessful, proceed to the next step.
- Sign Up Online: Register at the Infracost Dashboard.
- Retrieve API Key: Once signed up, navigate to Org Settings on the dashboard to find your API key.
For more detailed information on Infracost and its setup, refer to the Infracost Quick-Start Guide.