Datastore

Stores Metaflow state, acting as Metaflow's remote Datastore. The stored data includes, but is not limited to:

  • for each flow
    • for each version
      • conda environments
      • dependencies
      • artifacts
      • input
      • output

No duplicate data is stored thanks to automatic deduplication built into Metaflow.

To read more, see the Metaflow docs

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| db_engine | n/a | string | "postgres" | no |
| db_engine_version | n/a | string | "11" | no |
| db_instance_type | RDS instance type to launch for the PostgreSQL database. | string | "db.t2.small" | no |
| db_name | Name of the PostgreSQL database for the Metaflow service. | string | "metaflow" | no |
| db_username | PostgreSQL username; defaults to 'metaflow'. | string | "metaflow" | no |
| force_destroy_s3_bucket | Empty the S3 bucket before destroying it via `terraform destroy`. | bool | false | no |
| metadata_service_security_group_id | The security group ID used by the metadata service. This security group is granted access to the database. | string | n/a | yes |
| metaflow_vpc_id | ID of the Metaflow VPC this SageMaker notebook instance is to be deployed in | string | n/a | yes |
| resource_prefix | Prefix given to all AWS resources to differentiate between applications | string | n/a | yes |
| resource_suffix | Suffix given to all AWS resources to differentiate between environment and workspace | string | n/a | yes |
| standard_tags | The standard tags to apply to every AWS resource. | map(string) | n/a | yes |
| subnet1_id | First subnet used for availability zone redundancy | string | n/a | yes |
| subnet2_id | Second subnet used for availability zone redundancy | string | n/a | yes |
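
The inputs above could be wired together roughly as follows. This is a minimal sketch, not the module's documented invocation: the `source` path, the module label `metaflow_datastore`, and every ID, prefix, and tag value are placeholders assumed for illustration. Only the argument names come from the table.

```hcl
# Hypothetical usage sketch. Only the argument names are taken from the
# Inputs table; the source path and all values are placeholders.
module "metaflow_datastore" {
  source = "./modules/datastore" # assumed path; point this at the module's real location

  # Required naming inputs
  resource_prefix = "metaflow"
  resource_suffix = "dev"

  # Required networking inputs (placeholder IDs)
  metaflow_vpc_id                    = "vpc-0123456789abcdef0"
  subnet1_id                         = "subnet-0123456789abcdef0"
  subnet2_id                         = "subnet-0fedcba9876543210"
  metadata_service_security_group_id = "sg-0123456789abcdef0"

  # Required tags applied to every AWS resource
  standard_tags = {
    application = "metaflow"
    environment = "dev"
  }

  # Optional inputs, shown here with their documented defaults
  db_instance_type        = "db.t2.small"
  db_name                 = "metaflow"
  db_username             = "metaflow"
  force_destroy_s3_bucket = false
}
```

Leaving `force_destroy_s3_bucket` at `false` means `terraform destroy` will not empty the bucket first, which is usually the safer choice for anything other than a throwaway environment.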

Outputs

| Name | Description |
|------|-------------|
| METAFLOW_DATASTORE_SYSROOT_S3 | Amazon S3 URL for Metaflow DataStore |
| METAFLOW_DATATOOLS_S3ROOT | Amazon S3 URL for Metaflow DataTools |
| database_name | The database name |
| database_password | The database password |
| database_username | The database username |
| datastore_s3_bucket_kms_key_arn | The ARN of the KMS key used to encrypt the Metaflow datastore S3 bucket |
| rds_master_instance_endpoint | The database connection endpoint in address:port format |
| s3_bucket_arn | The ARN of the bucket we'll be using as blob storage |
| s3_bucket_name | The name of the bucket we'll be using as blob storage |
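
As a rough illustration of consuming these outputs, a calling configuration might re-export a few of them. The sketch assumes the module was instantiated under the hypothetical label `metaflow_datastore`; the root-level output names are likewise made up for the example.

```hcl
# Hypothetical sketch: re-export selected module outputs from the root
# configuration. "metaflow_datastore" is an assumed module label.
output "metaflow_datastore_sysroot_s3" {
  description = "S3 URL to use as the Metaflow datastore root"
  value       = module.metaflow_datastore.METAFLOW_DATASTORE_SYSROOT_S3
}

output "metaflow_rds_endpoint" {
  description = "Database endpoint in address:port format"
  value       = module.metaflow_datastore.rds_master_instance_endpoint
}

output "metaflow_database_password" {
  description = "Database password; marked sensitive so it is redacted in CLI output"
  value       = module.metaflow_datastore.database_password
  sensitive   = true
}
```

Marking the password output as `sensitive` keeps it out of plan and apply logs, although the value is still stored in the Terraform state.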