AWS Step Functions Integration #204

Closed

@savingoyal (Collaborator) commented May 20, 2020

Please see #211 instead.

Integrates Metaflow with AWS Step Functions.

Introduces a new command step-functions:

  • python my_flow.py step-functions create compiles the user-defined Metaflow flow and exports it as an AWS Step Functions state machine, giving Metaflow users a path to move their flows into production on AWS.
    • An additional flow-level decorator, @schedule, lets users optionally schedule flow executions through Amazon EventBridge.
    • All existing Metaflow functionality is available within AWS Step Functions through this integration: containerized job execution on AWS Batch through @batch, dependency management via @conda, retry and error handling through @retry, @catch, and @timeout, as well as parameters, branches, and foreaches.
    • Additionally, introduces the notion of a production token to safeguard against unintended production deployments to AWS Step Functions.
  • python my_flow.py step-functions trigger triggers a deployed workflow on AWS Step Functions.
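To make "compiling a flow into a state machine" more concrete, here is a minimal sketch of an Amazon States Language definition that runs steps one after another as AWS Batch jobs. It is not the output of this PR's compiler: the helper `linear_state_machine`, the step names, and the job queue and job definition values are hypothetical placeholders, and a real flow with branches or foreaches would need more state types.

```python
import json


def linear_state_machine(step_names, job_queue, job_definition):
    """Build an Amazon States Language dict that runs steps in order.

    Hypothetical helper for illustration only; not Metaflow's compiler.
    """
    if not step_names:
        raise ValueError("at least one step is required")
    states = {}
    for i, name in enumerate(step_names):
        state = {
            "Type": "Task",
            # Service integration that submits an AWS Batch job and waits
            # for it to finish before moving on.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": name,
                "JobQueue": job_queue,
                "JobDefinition": job_definition,
            },
        }
        if i + 1 < len(step_names):
            state["Next"] = step_names[i + 1]  # chain to the following step
        else:
            state["End"] = True  # last step terminates the execution
        states[name] = state
    return {"StartAt": step_names[0], "States": states}


definition = linear_state_machine(
    ["start", "train", "end"],
    job_queue="my-job-queue",            # placeholder value
    job_definition="my-job-definition",  # placeholder value
)
print(json.dumps(definition, indent=2))
```

Each Metaflow step maps to one Task state here, which is the simplest possible mapping; it says nothing about how parameters, retries, or the production token are handled.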

Requirements:

  • METAFLOW_SFN_IAM_ROLE : IAM role that allows AWS Step Functions to interact with AWS Batch and Amazon DynamoDB
  • METAFLOW_EVENTS_SFN_ACCESS_IAM_ROLE : IAM role that allows Amazon EventBridge to send notifications to AWS Step Functions
  • METAFLOW_SFN_DYNAMO_DB_TABLE : Amazon DynamoDB table that allows AWS Step Functions to keep track of the state needed for foreaches
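A quick pre-flight check for the three settings above might look like the following sketch. It assumes the values are supplied as environment variables; the helper `missing_settings` is hypothetical and not part of this PR.

```python
import os

# Names taken from the requirements list above.
REQUIRED_VARS = (
    "METAFLOW_SFN_IAM_ROLE",
    "METAFLOW_EVENTS_SFN_ACCESS_IAM_ROLE",
    "METAFLOW_SFN_DYNAMO_DB_TABLE",
)


def missing_settings(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper; assumes configuration comes from the environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All Step Functions settings are present.")
```

Running this before `step-functions create` only confirms that the values are set; it does not verify that the roles or the table exist or carry the right permissions.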

Notes:

  • A CloudFormation template that provisions all of the above should be available soon.
  • The Metaflow service currently doesn't support string identifiers for runs and tasks, so this PR fails when DEFAULT_METADATA is set to service. A subsequent update to the service will fix this. For testing, please use DEFAULT_METADATA=local.

Instructions and detailed docs are on their way.

Links:

@romain-intel (Contributor) left a comment

Initial review of changed files. I haven't looked at the new files yet (that will take a bit more time :)).

metaflow/cli.py (outdated)
metaflow/metadata/service.py (outdated)
metaflow/metadata/service.py (outdated)
metaflow/metadata/service.py (outdated)
metaflow/plugins/__init__.py (outdated)
metaflow/plugins/aws/batch/batch.py (outdated)
metaflow/plugins/aws/batch/batch_cli.py (outdated)
metaflow/plugins/aws/batch/batch_cli.py (outdated)
metaflow/plugins/aws/batch/batch_client.py (outdated)
metaflow/plugins/aws/batch/batch_decorator.py (outdated)

@romain-intel (Contributor) left a comment

Leaving comments.

"that currently doesn't support AWS Step Functions. ")
obj.echo("For more information on how to upgrade your "
"service to a compatible version (>= 2.0), visit:")
obj.echo(" [url]", fg='green')

Just putting a TODO here :). So easy to forget.

)

def list_executions(self, state_machine_arn, states):
if len(states) > 0:

Nit, check for uniqueness.
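One way to address the nit, assuming `states` is a list of status strings that may contain repeats, is to drop duplicates while keeping the caller's order before building the query. The helper name `unique_states` is hypothetical and the sketch is independent of the rest of `list_executions`.

```python
def unique_states(states):
    """Drop duplicate entries while keeping first-seen order."""
    # dict preserves insertion order, so this dedupes without reordering.
    return list(dict.fromkeys(states))


print(unique_states(["RUNNING", "FAILED", "RUNNING"]))  # → ['RUNNING', 'FAILED']
```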

@savingoyal savingoyal deleted the aws-sfn branch June 8, 2020 20:42
Labels: enhancement (New feature or request)