kwame-mintah/terraform-aws-machine-learning-pipeline

Terraform AWS Machine Learning Pipeline

The main purpose of this repository is to create the resources needed for Machine Learning within AWS and to gain a better understanding of Machine Learning Operations (MLOps). Notes can be found in my notes-md repository; the data used can be found in ml-data-copy-to-aws-s3.

Additionally, the Serverless Framework is used to deploy various AWS Lambda functions; this can be found in aws-automlops-serverless-deployment.

Development

Dependencies

Prerequisites

  1. Have an AWS account and associated credentials.

Usage

  1. Navigate to the environment you would like to deploy.

  2. Initialize the configuration with:

    aws-vault exec <profile> --no-session terragrunt init

  3. Plan your changes with:

    aws-vault exec <profile> --no-session terragrunt plan

  4. If you're happy with the changes, apply them with:

    aws-vault exec <profile> --no-session terragrunt apply

Note

Note that terragrunt will create an S3 bucket and a DynamoDB table for storing the remote state if they do not already exist. Ensure the account deploying the resources has the appropriate permissions to create or connect to these resources.
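
As a rough illustration of where that bucket and table come from, a Terragrunt remote_state block along the following lines would cause them to be created on first init. This is a sketch, not this repository's actual configuration: the bucket name, table name, region and state key shown here are placeholder assumptions.

```hcl
# terragrunt.hcl (illustrative sketch; all values below are placeholders)
remote_state {
  backend = "s3"

  # Have Terragrunt write the backend block into the working directory.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-terraform-state" # hypothetical bucket name
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-west-2"               # hypothetical region
    encrypt        = true
    dynamodb_table = "example-terraform-locks" # hypothetical lock table name
  }
}
```

With a block like this, the deploying identity needs permission to create (or at least read and write) the named S3 bucket and DynamoDB table.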

GitHub Action (CI/CD)

An IAM user will need to be created within the AWS account; it will be used by the GitHub workflows (.github/workflows) that deploy resources using terragrunt-action. The following repository Actions secrets and variables need to be set:

| Secret | Description |
|--------|-------------|
| AWS_REGION | The AWS Region, can also use AWS_DEFAULT_REGION. |
| AWS_ACCESS_KEY_ID | The AWS access key. |
| AWS_SECRET_ACCESS_KEY | The AWS secret key. |

Pre-Commit hooks

Git hook scripts are very helpful for identifying simple issues before pushing any changes. Hooks run automatically on every commit and point out issues in the code, e.g. trailing whitespace.

To help with the maintenance of these hooks, pre-commit is used, along with pre-commit-hooks.

Please follow these instructions to install pre-commit locally, and ensure that you have run pre-commit install to install the hooks for this project.

Additionally, once installed, the hooks can be updated to the latest available version with pre-commit autoupdate.

Documentation Generation

Code formatting and documentation for variables and outputs are generated using pre-commit-terraform hooks, which in turn use terraform-docs to insert or update the documentation. The following markers have been added to the README.md:

<!-- {BEGINNING|END} OF PRE-COMMIT-TERRAFORM DOCS HOOK --->

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.5.4 |
| aws | ~> 5.37.0 |
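
Expressed as Terraform configuration, the version constraints in the table above would typically look something like the following. The hashicorp/aws source address is an assumption based on the provider name; the constraints themselves are taken from the table.

```hcl
terraform {
  # Minimum Terraform CLI version, from the Requirements table.
  required_version = ">= 1.5.4"

  required_providers {
    aws = {
      source  = "hashicorp/aws" # assumed registry source address
      version = "~> 5.37.0"     # allows 5.37.x patch releases only
    }
  }
}
```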

Providers

| Name | Version |
|------|---------|
| aws | 5.37.0 |

Modules

| Name | Source | Version |
|------|--------|---------|
| automl_data | ./modules/s3_bucket | n/a |
| github_action | ./modules/github_action | n/a |
| lambda_data_preprocessing_ecr | ./modules/ecr | n/a |
| lambda_model_deployment_ecr | ./modules/ecr | n/a |
| lambda_model_evaluation_ecr | ./modules/ecr | n/a |
| lambda_model_training_ecr | ./modules/ecr | n/a |
| ml_data | ./modules/s3_bucket | n/a |
| model_eval_queue | ./modules/sqs | n/a |
| model_monitoring | ./modules/s3_bucket | n/a |
| model_output | ./modules/s3_bucket | n/a |
| sagemaker | ./modules/sagemaker | n/a |
| serverless_deployment | ./modules/s3_bucket | n/a |
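
Several entries in this table reuse the same local module under different names. A minimal sketch of that pattern is shown below. The module names and source paths come from the table, but the arguments (name, tags) are hypothetical, since the modules' actual input variables are not listed here.

```hcl
# Two instances of the same local module, one per bucket.
module "ml_data" {
  source = "./modules/s3_bucket"

  # Hypothetical inputs; check the module's variables.tf for the real ones.
  name = "${var.env_prefix}-${var.project_name}-ml-data"
  tags = var.tags
}

module "model_output" {
  source = "./modules/s3_bucket"

  # Hypothetical inputs, as above.
  name = "${var.env_prefix}-${var.project_name}-model-output"
  tags = var.tags
}
```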

Resources

| Name | Type |
|------|------|
| aws_default_security_group.default_security_group | resource |
| aws_vpc.application_vpc | resource |
| aws_availability_zones.available_zones | data source |
| aws_caller_identity.current_caller_identity | data source |
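
For orientation, the two managed resources above could be declared roughly as follows. The resource types and names are taken from the table; the argument values are assumptions, including the choice to leave the default security group without rules.

```hcl
# VPC sized by the application_vpc_ipv4_cidr_block input.
resource "aws_vpc" "application_vpc" {
  cidr_block = var.application_vpc_ipv4_cidr_block
  tags       = var.tags
}

# Adopting the VPC's default security group without declaring any
# ingress or egress blocks removes all of its rules (assumed intent).
resource "aws_default_security_group" "default_security_group" {
  vpc_id = aws_vpc.application_vpc.id
  tags   = var.tags
}
```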

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| application_vpc_ipv4_cidr_block | The IPv4 CIDR block for the VPC. | string | n/a | yes |
| aws_region | The AWS region. | string | n/a | yes |
| env_prefix | The prefix added to resources in the environment. | string | n/a | yes |
| project_name | The name of the project. | string | n/a | yes |
| tags | Tags to be added to resources created. | map(string) | {} | no |
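
When deploying with terragrunt, these inputs are usually supplied through an inputs block in the environment's terragrunt.hcl. The sketch below uses the variable names from the table; every value is a placeholder assumption rather than a setting from this repository.

```hcl
# terragrunt.hcl for a single environment (illustrative values only)
inputs = {
  application_vpc_ipv4_cidr_block = "10.0.0.0/16" # placeholder CIDR
  aws_region                      = "eu-west-2"   # placeholder region
  env_prefix                      = "dev"         # placeholder prefix
  project_name                    = "ml-pipeline" # placeholder name

  # Optional; defaults to {} when omitted.
  tags = {
    Environment = "dev"
    ManagedBy   = "terraform"
  }
}
```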

Outputs

| Name | Description |
|------|-------------|
| availability_zones | List of the Availability Zone names available to the account. |
| current_caller_identity | AWS Account ID number of the account that owns or contains the calling entity. |
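
These outputs line up with the two data sources listed under Resources. One plausible wiring is sketched below; the exact attributes exposed by the repository's outputs are an assumption inferred from the descriptions.

```hcl
data "aws_availability_zones" "available_zones" {
  state = "available" # assumed filter; the real data source may omit it
}

data "aws_caller_identity" "current_caller_identity" {}

output "availability_zones" {
  description = "List of the Availability Zone names available to the account."
  value       = data.aws_availability_zones.available_zones.names
}

output "current_caller_identity" {
  description = "AWS Account ID number of the account that owns or contains the calling entity."
  value       = data.aws_caller_identity.current_caller_identity.account_id
}
```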

About

Deploy Machine Learning resources within AWS via Terraform for MLOps.
