
Administering compute workers using Fabric

Eric Carmichael edited this page Jun 6, 2018 · 11 revisions

This section is for server administrators setting up compute workers (machines that process submissions) for Codalab competitions. Competitions run on queues of submissions, and these workers listen to those queues. We'll use Fabric to log into each machine and point the worker at the right queue.

Setup

You'll need Python 2.7 installed to run Fabric (a server administration tool).

Clone the repo and install requirements:

$ git clone git@github.com:codalab/codalab-competitions.git
$ cd codalab-competitions/codalab

# Preferably do this in a Python virtual environment!
$ pip install fabric -r requirements/common.txt

To use Fabric, we must pass it a list of servers. You can define the servers in the codalab/server_config.yaml file, which sits directly next to codalab/fabfile.py.

For example:

my_workers:
  hosts:
    - ubuntu@1.2.3.4
    - ubuntu@3.4.5.6
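Before handing a group to Fabric, it can help to sanity-check its host entries. The sketch below is only an illustration and is not part of the fabfile: `hosts_for_group` and `HOST_PATTERN` are hypothetical names, the config is written as a Python dict with the same shape as the YAML above (in practice you would load the file with a YAML parser), and the check assumes every entry uses the `user@host` form shown in the example, which is stricter than what Fabric itself accepts.

```python
import re

# Hypothetical pattern: a login name, "@", then a hostname or IP address.
HOST_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+$")


def hosts_for_group(config, group):
    """Return the host list for `group`, raising if an entry looks malformed."""
    try:
        hosts = config[group]["hosts"]
    except KeyError:
        raise KeyError("no 'hosts' list for group %r" % group)
    bad = [h for h in hosts if not HOST_PATTERN.match(h)]
    if bad:
        raise ValueError("expected user@host entries, got: %s" % ", ".join(bad))
    return list(hosts)


# Same structure as the YAML example, written out as a dict for illustration.
config = {"my_workers": {"hosts": ["ubuntu@1.2.3.4", "ubuntu@3.4.5.6"]}}
print(hosts_for_group(config, "my_workers"))
```

A typo such as a missing `ubuntu@` prefix then fails early with a readable message instead of surfacing later as an SSH login error.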

And you could start up fresh workers like so:

# Initialize all of the workers with Docker -- the "True" here is for SSL
$ fab hosts:my_workers compute_worker_init:pyamqp://secrets@competitions.codalab.org/c33dd8e9,True 

# Run the worker to listen for jobs
$ fab hosts:my_workers compute_worker_run

Then you can check their status:

$ fab hosts:my_workers compute_worker_status
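If you manage several groups, the command lines above can be generated from a script rather than typed by hand. This is a minimal sketch: `fab_command` is a hypothetical helper, not something the repository provides, and it only builds the `task:arg1,arg2` form used in the examples above without escaping special characters in the arguments.

```python
def fab_command(group, task, *args):
    """Build the argument list for one `fab` call against a host group."""
    spec = task
    if args:
        # Task arguments follow the task name after a colon, separated by commas.
        spec = "{0}:{1}".format(task, ",".join(str(a) for a in args))
    return ["fab", "hosts:{0}".format(group), spec]


# Placeholder queue URL copied from the example above; substitute your own.
queue_url = "pyamqp://secrets@competitions.codalab.org/c33dd8e9"

commands = [
    fab_command("my_workers", "compute_worker_init", queue_url, True),
    fab_command("my_workers", "compute_worker_run"),
    fab_command("my_workers", "compute_worker_status"),
]

# Dry run: print each command instead of executing it.
for cmd in commands:
    print(" ".join(cmd))
```

To actually run a command, pass the list to `subprocess.check_call` from the codalab directory, where fabfile.py lives.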

Commands

| Command | Arguments | Description |
| --- | --- | --- |
| compute_worker_init | queue_url[,ssl_flag] | Initializes compute worker by installing docker and the compute worker image, pointing at the given queue_url and using SSL if the flag is set |
| compute_worker_update | | Updates compute workers to latest docker image |
| compute_worker_update_docker | | Updates base docker installation version to latest |
| compute_worker_docker_restart | | Restarts docker |
| compute_worker_kill | | Kills compute worker |
| compute_worker_restart | | Restarts compute worker |
| compute_worker_run | | Runs the actual compute worker |
| compute_worker_status | | Prints out docker ps for each worker |
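Maintenance often means running more than one of these commands in a row. The sketch below chains tasks against a group and stops at the first failure. `run_tasks` and `KNOWN_TASKS` are hypothetical names, the task list simply mirrors the commands documented here, and the update-then-status sequence in the usage comment is an assumed workflow, not one prescribed by the fabfile.

```python
import subprocess

# Task names documented on this page; anything else is rejected as a typo.
KNOWN_TASKS = {
    "compute_worker_init",
    "compute_worker_update",
    "compute_worker_update_docker",
    "compute_worker_docker_restart",
    "compute_worker_kill",
    "compute_worker_restart",
    "compute_worker_run",
    "compute_worker_status",
}


def run_tasks(group, tasks, runner=subprocess.call):
    """Run each task in order; return the tasks that finished with exit code 0."""
    completed = []
    for task in tasks:
        name = task.split(":", 1)[0]  # drop any ":arg1,arg2" suffix
        if name not in KNOWN_TASKS:
            raise ValueError("unknown task: %s" % name)
        exit_code = runner(["fab", "hosts:{0}".format(group), task])
        if exit_code != 0:
            break  # stop at the first failing task
        completed.append(task)
    return completed


# Example (assumed workflow): pull the latest image, then check the containers.
# run_tasks("my_workers", ["compute_worker_update", "compute_worker_status"])
```

The `runner` parameter defaults to `subprocess.call`, so the helper can be exercised with a stand-in function before it is pointed at real servers.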