
signal-ai/terraform-aws-metaflow

 
 


Metaflow Terraform module

Terraform module that provisions AWS resources to run Metaflow in production.

This module consists of submodules that can be used separately as well:

[Figure: modules diagram]

You can use either this high-level module or the submodules individually. See each module's corresponding README.md for more details.

A complete example that uses this module and also sets up the VPC and other non-Metaflow-specific infrastructure is available in this repo.
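For orientation, a minimal sketch of calling the high-level module with only the inputs marked as required in the Inputs table might look like the following. The `source` address, the variable names, and the tag values are assumptions made for illustration; check them against your own setup before use.

```hcl
# Hypothetical variables that carry networking details from an existing VPC.
variable "vpc_id" {
  type = string
}

variable "vpc_cidr_blocks" {
  type = list(string)
}

variable "subnet_ids" {
  type        = list(string)
  description = "Two subnets in different availability zones"
}

module "metaflow" {
  # Assumed source address, derived from the repository name.
  source = "github.com/signal-ai/terraform-aws-metaflow"

  # Required inputs (no default in the Inputs table).
  enable_step_functions = false
  vpc_id                = var.vpc_id
  vpc_cidr_blocks       = var.vpc_cidr_blocks
  subnet1_id            = var.subnet_ids[0]
  subnet2_id            = var.subnet_ids[1]

  # Illustrative tag values only.
  tags = {
    project = "metaflow"
  }
}
```

All other inputs fall back to their defaults, so this creates an EC2-backed Batch compute environment with resources prefixed `metaflow`.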

Development

To format documentation:

pipx install pre-commit
pre-commit install --install-hooks
pre-commit run --all-files

Modules

| Name | Source | Version |
|------|--------|---------|
| metaflow-common | ./modules/common | n/a |
| metaflow-computation | ./modules/computation | n/a |
| metaflow-datastore | ./modules/datastore | n/a |
| metaflow-metadata-service | ./modules/metadata-service | n/a |
| metaflow-step-functions | ./modules/step-functions | n/a |
| metaflow-ui | ./modules/ui | n/a |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| access_list_cidr_blocks | List of CIDRs we want to grant access to our Metaflow Metadata Service. Usually this is our VPN's CIDR blocks. | list(string) | [] | no |
| api_basic_auth | Enable basic auth for API Gateway? (requires key export) | bool | true | no |
| batch_type | AWS Batch Compute Type ('ec2', 'fargate', 'spot') | string | "ec2" | no |
| compute_environment_ami_id | The AMI ID to use for Batch Compute Environment EC2 instances. If not specified, defaults to the latest ECS optimised AMI. | string | null | no |
| compute_environment_desired_vcpus | Desired starting VCPUs [0-16] for EC2 Batch Compute Environment (ignored for Fargate) | number | 8 | no |
| compute_environment_egress_cidr_blocks | CIDR blocks to which egress is allowed from the Batch Compute environment's security group | list(string) | ["0.0.0.0/0"] | no |
| compute_environment_instance_types | The instance types for the compute environment | list(string) | ["c4.large", "c4.xlarge", "c4.2xlarge", "c4.4xlarge", "c4.8xlarge"] | no |
| compute_environment_max_vcpus | Maximum VCPUs for Batch Compute Environment [16-96] | number | 64 | no |
| compute_environment_min_vcpus | Minimum VCPUs [0-16] for EC2 Batch Compute Environment (ignored for Fargate) | number | 8 | no |
| compute_environment_spot_bid_percentage | The maximum percentage of on-demand EC2 instance price to bid for spot instances when using the 'spot' AWS Batch Compute Type. | number | 100 | no |
| compute_environment_user_data_base64 | Base64-encoded user data to use for Batch Compute Environment EC2 instances. | string | null | no |
| db_instance_type | RDS instance type to launch for PostgreSQL database. | string | "db.t2.small" | no |
| ecs_cluster_id | The ID of an existing ECS cluster to run services on. If no cluster ID is specified, a new cluster will be created. | string | null | no |
| enable_custom_batch_container_registry | Provisions infrastructure for custom Amazon ECR container registry if enabled | bool | false | no |
| enable_step_functions | Provisions infrastructure for step functions if enabled | bool | n/a | yes |
| extra_ui_backend_env_vars | Additional environment variables for UI backend container | map(string) | {} | no |
| extra_ui_static_env_vars | Additional environment variables for UI static app | map(string) | {} | no |
| iam_partition | IAM Partition (Select aws-us-gov for AWS GovCloud, otherwise leave as is) | string | "aws" | no |
| metadata_service_container_image | Container image for metadata service | string | "" | no |
| postgres_engine_version | Postgres engine version to use for Metaflow database. | string | "11" | no |
| resource_prefix | String prefix for all resources | string | "metaflow" | no |
| resource_suffix | String suffix for all resources | string | "" | no |
| subnet1_id | First subnet used for availability zone redundancy | string | n/a | yes |
| subnet2_id | Second subnet used for availability zone redundancy | string | n/a | yes |
| tags | AWS tags | map(string) | n/a | yes |
| ui_alb_internal | Defines whether the ALB for the UI is internal | bool | false | no |
| ui_allow_list | List of CIDRs we want to grant access to our Metaflow UI Service. Usually this is our VPN's CIDR blocks. | list(string) | [] | no |
| ui_certificate_arn | SSL certificate for UI. If no certificate ARN is provided, HTTP will be used. | string | null | no |
| ui_static_container_image | Container image for the UI frontend app | string | "" | no |
| vpc_cidr_blocks | The VPC CIDR blocks to add to the access list of our Metadata Service API so that all internal communication is allowed | list(string) | n/a | yes |
| vpc_id | The ID of the single VPC we stood up for all Metaflow resources to exist in. | string | n/a | yes |
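As an illustration of how the optional inputs combine, the sketch below overrides the compute defaults to run AWS Batch on spot instances and restricts access to the Metadata Service and UI. The `source` address, all IDs, CIDR ranges, and tag values are hypothetical placeholders, not values taken from this repository.

```hcl
module "metaflow" {
  # Assumed source address, derived from the repository name.
  source = "github.com/signal-ai/terraform-aws-metaflow"

  # Required inputs, filled with placeholder values.
  enable_step_functions = true
  vpc_id                = "vpc-0123456789abcdef0"
  vpc_cidr_blocks       = ["10.0.0.0/16"]
  subnet1_id            = "subnet-0123456789abcdef0"
  subnet2_id            = "subnet-0fedcba9876543210"
  tags                  = { team = "data-science" }

  # Run Batch on spot instances, bidding up to 80% of the on-demand price.
  batch_type                              = "spot"
  compute_environment_spot_bid_percentage = 80
  compute_environment_instance_types      = ["c4.xlarge", "c4.2xlarge"]

  # Scale to zero when idle; values stay within the documented ranges.
  compute_environment_min_vcpus     = 0
  compute_environment_desired_vcpus = 0
  compute_environment_max_vcpus     = 32

  # Limit Metadata Service and UI access to a (hypothetical) VPN range.
  access_list_cidr_blocks = ["10.8.0.0/16"]
  ui_allow_list           = ["10.8.0.0/16"]
  ui_alb_internal         = true
}
```

Setting the minimum and desired vCPUs to 0 trades start-up latency for lower idle cost; the defaults of 8 keep capacity warm instead.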

Outputs

| Name | Description |
|------|-------------|
| METAFLOW_BATCH_JOB_QUEUE | AWS Batch Job Queue ARN for Metaflow |
| METAFLOW_DATASTORE_SYSROOT_S3 | Amazon S3 URL for Metaflow DataStore |
| METAFLOW_DATATOOLS_S3ROOT | Amazon S3 URL for Metaflow DataTools |
| METAFLOW_ECS_S3_ACCESS_IAM_ROLE | Role for AWS Batch to Access Amazon S3 |
| METAFLOW_EVENTS_SFN_ACCESS_IAM_ROLE | IAM role for Amazon EventBridge to access AWS Step Functions. |
| METAFLOW_SERVICE_INTERNAL_URL | URL for Metadata Service (Accessible in VPC) |
| METAFLOW_SERVICE_URL | URL for Metadata Service (Accessible in VPC) |
| METAFLOW_SFN_DYNAMO_DB_TABLE | AWS DynamoDB table name for tracking AWS Step Functions execution metadata. |
| METAFLOW_SFN_IAM_ROLE | IAM role for AWS Step Functions to access AWS resources (AWS Batch, AWS DynamoDB). |
| api_gateway_rest_api_id_key_id | API Gateway Key ID for Metadata Service. Fetch Key from AWS Console [METAFLOW_SERVICE_AUTH_KEY] |
| batch_compute_environment_security_group_id | The ID of the security group attached to the Batch Compute environment. |
| datastore_s3_bucket_kms_key_arn | The ARN of the KMS key used to encrypt the Metaflow datastore S3 bucket |
| metadata_svc_ecs_task_role_arn | n/a |
| metaflow_api_gateway_rest_api_id | The ID of the API Gateway REST API we'll use to accept MetaData service requests to forward to the Fargate API instance |
| metaflow_batch_container_image | The ECR repo containing the metaflow batch image |
| metaflow_profile_json | Metaflow profile JSON object that can be used to communicate with this Metaflow Stack. Store this in ~/.metaflow/config_[stack-name] and select with $ export METAFLOW_PROFILE=[stack-name]. |
| metaflow_s3_bucket_arn | The ARN of the bucket we'll be using as blob storage |
| metaflow_s3_bucket_name | The name of the bucket we'll be using as blob storage |
| migration_function_arn | ARN of DB Migration Function |
| ui_alb_arn | UI ALB ARN |
| ui_alb_dns_name | UI ALB DNS name |
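To make these values visible from a root configuration, they can be re-exported with `output` blocks. The fragment below is a sketch that assumes the module was instantiated as `module "metaflow"`; the output names on the left-hand side are arbitrary choices for illustration.

```hcl
# Re-export selected module outputs from the root configuration.
# Assumes a `module "metaflow"` block exists elsewhere in the same configuration.

output "metaflow_profile_json" {
  description = "Metaflow profile to store under ~/.metaflow and select via METAFLOW_PROFILE"
  value       = module.metaflow.metaflow_profile_json
}

output "metaflow_datastore_s3_root" {
  description = "Amazon S3 URL for the Metaflow DataStore"
  value       = module.metaflow.METAFLOW_DATASTORE_SYSROOT_S3
}

output "metaflow_ui_dns_name" {
  description = "DNS name of the ALB in front of the Metaflow UI"
  value       = module.metaflow.ui_alb_dns_name
}
```

After `terraform apply`, running `terraform output` lists these values.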
