[Infra updates] Add a workflow to run the model on PRs and workflow_dispatch #23

Closed
Tracked by #175
jeancochrane opened this issue Oct 20, 2023 · 0 comments · Fixed by #26

Building on #22, we need a GitHub Actions workflow that can run the model on PRs and on workflow_dispatch.

There are two ways we could define a job to run the model. Try option 1 first, and fall back to option 2 if CML doesn't work as advertised.

Option 1: Use CML self-hosted runners

  • Define a job, launch-runner, to start an AWS EC2 spot instance using cml runner
    • Set sensible defaults for the instance options, but allow them to be overridden via workflow inputs
  • Define a job, run-model, to run the model on the EC2 instance created by CML
    • Set the runs-on key for the job to point at the runner
      • This will cause steps defined in the job to run on the remote runner
    • Run the model using dvc pull and dvc repro
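
As a rough illustration of Option 1, the commands behind the two jobs might look like the sketch below. The environment variable names (RUNNER_INSTANCE_TYPE, RUNNER_REGION, RUNNER_LABEL) and their default values are hypothetical stand-ins for workflow inputs, not part of this issue. The sketch uses the `cml runner launch` form of the command and assumes AWS credentials and a repository token are already available in the job's environment.

```shell
# launch-runner job: start a spot EC2 instance that registers itself as a
# self-hosted runner. Variable names and defaults are hypothetical; a workflow
# would export its inputs under these names to override the defaults.
launch_runner() {
  local instance_type="${RUNNER_INSTANCE_TYPE:-m5.xlarge}"
  local region="${RUNNER_REGION:-us-east}"
  local label="${RUNNER_LABEL:-cml-runner}"

  # --cloud-spot requests a spot instance; --single makes the runner exit
  # after it has picked up one job.
  cml runner launch \
    --cloud=aws \
    --cloud-region="${region}" \
    --cloud-type="${instance_type}" \
    --cloud-spot \
    --labels="${label}" \
    --single
}

# run-model job: executed on the remote runner once runs-on matches the label.
# dvc repro only runs if dvc pull succeeds.
run_model() {
  dvc pull && dvc repro
}
```

The run-model job would then set its runs-on key to the same label passed via --labels, so that its steps are routed to the instance started by launch-runner.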

Option 2: Write custom code to run model jobs on AWS Batch

  • Run Terraform to make sure an AWS Batch job queue and job definition exist for the PR
    • The job definition should define the code that will be used to run the model itself, e.g. dvc pull and dvc repro
  • Use the AWS CLI to submit a job to the Batch queue
  • Use the AWS CLI to poll the job status until it has a terminal status (SUCCEEDED or FAILED)
    • Once the job has at least a RUNNING status, use the logStreamName field from the job's container details to print a link to its logs
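
The submit-and-poll steps of Option 2 could be sketched roughly as follows. The function names, the positional arguments, and the 30-second default polling interval are assumptions made for illustration. For brevity the sketch prints the log stream name rather than building a full CloudWatch console link, and it does not cover the Terraform step.

```shell
# Submit a job to an existing Batch queue and print the new job's ID.
# The job name, queue, and job definition are caller-supplied (hypothetical) values.
submit_job() {
  local name="$1" queue="$2" definition="$3"
  aws batch submit-job \
    --job-name "$name" \
    --job-queue "$queue" \
    --job-definition "$definition" \
    --query 'jobId' --output text
}

# Poll a job until it reaches a terminal status.
# Returns 0 on SUCCEEDED, 1 on FAILED, 2 if the status lookup itself fails.
wait_for_job() {
  local job_id="$1" interval="${2:-30}"
  local status="" stream="" log_printed=""

  while true; do
    status="$(aws batch describe-jobs --jobs "$job_id" \
      --query 'jobs[0].status' --output text)" || return 2
    echo "Job ${job_id}: ${status}"

    case "$status" in
      SUCCEEDED) return 0 ;;
      FAILED) return 1 ;;
      RUNNING)
        # Print the log stream name once, the first time RUNNING is observed.
        if [ -z "$log_printed" ]; then
          stream="$(aws batch describe-jobs --jobs "$job_id" \
            --query 'jobs[0].container.logStreamName' --output text)"
          echo "Log stream: ${stream}"
          log_printed=1
        fi
        ;;
    esac

    sleep "$interval"
  done
}
```

A workflow step might call `job_id="$(submit_job ...)"` followed by `wait_for_job "$job_id"`, letting the non-zero return code fail the step when the job fails. A real implementation would probably also want an upper bound on the number of polls.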

Depends on #22.
