Skip to content

Testing cml-cloud-run #116

@elleobrien

Description

@elleobrien

I'm working with the following .yaml:

name: train-my-model

on: [push]

jobs:
  deploy-cloud-runner:
    runs-on: [ubuntu-latest]
    container: docker://dvcorg/cml-cloud-runner

    steps:
      - name: deploy
        env:
          repo_token: ${{ secrets.REPO_TOKEN }} 
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          echo "Deploying..."
          MACHINE="CML-$(openssl rand -hex 12)"
          docker-machine create \
              --driver amazonec2 \
              --amazonec2-instance-type t2.micro \
              --amazonec2-region us-east-2 \
              --amazonec2-zone a \
              --amazonec2-vpc-id vpc-76f1f01e \
              --amazonec2-ssh-user ubuntu \
              $MACHINE
          eval "$(docker-machine env --shell sh $MACHINE)"
          ( 
          docker-machine ssh $MACHINE "sudo mkdir -p /docker_machine && sudo chmod 777 /docker_machine" && \
          docker-machine scp -r -q ~/.docker/machine/ $MACHINE:/docker_machine && \
          docker run --name runner -d \
            -v /docker_machine/machine:/root/.docker/machine \
            -e RUNNER_IDLE_TIMEOUT=120 \
            -e DOCKER_MACHINE=${MACHINE} \
            -e RUNNER_LABELS=cml \
            -e repo_token=$repo_token \
            -e RUNNER_REPO=https://github.com/iterative/cml_base_case \
           dvcorg/cml-cloud-runner && \
          sleep 20 && echo "Deployed $MACHINE"
          ) || (echo y | docker-machine rm $MACHINE && exit 1)
  train:
    needs: deploy-cloud-runner
    runs-on: [self-hosted,cml]
    
    steps:
      - uses: actions/checkout@v2

      - name: cml_run
        env:
          repo_token: ${{ secrets.GITHUB_TOKEN }} 
        run: |
          pip install -r requirements.txt
          python train.py
        
          cat metrics.txt >> report.md
          cml-publish confusion_matrix.png --md >> report.md
          cml-send-comment report.md

Note that the job train depends on deploy-cloud-runner, not deploy as in #108 (deploy threw an error). The first job, deploy-cloud-runner is running fine, but train is throwing an inscrutable error:
Screen Shot 2020-06-10 at 3 25 18 PM
Screen Shot 2020-06-10 at 3 25 26 PM

Project repo is here: https://github.com/andronovhopf/test_cloud

Any ideas?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions