Support additional Docker configuration options. #375

Closed
slnovak opened this Issue Jan 12, 2016 · 9 comments

Contributor

slnovak commented Jan 12, 2016

Hi,

TL;DR

Cromwell should allow for the configuration of Docker resource / environment flags at run-time.


I have a use-case where I'd like to run Cromwell jobs in a cluster environment via Docker Swarm. Since Swarm doesn't require any additional configuration outside of standard docker run commands, it's trivial to distribute Cromwell jobs across Swarm nodes.

However, Swarm provides a series of filters and constraints that control how the scheduler distributes containers to nodes. For example, I might be interested in limiting the execution of a Cromwell job to a specific region / datacenter. This requires you to specify filters in the docker run command with the environment flag, -e. For example, to run a container on Swarm nodes that run in the us-east region:

› docker run -d --name my_image -e constraint:region==us-east* my_container

Obviously, this configuration should not be managed in the WDL document. Instead, it would be great for the Cromwell command-line tool and REST API to support additional runtime options for specifying Docker environment variables. For example:

› cromwell run --docker-env "constraint:region==us-east*" my_workflow.wdl -

Hint: Docker supports daemon labels. In the above case, the workflow would
execute on a Swarm node whose Docker daemon was started with:

    docker daemon --label region=us-east
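
For readers unfamiliar with Swarm constraints, here is a rough Python sketch of how a `key==pattern` / `key!=pattern` expression could be checked against a node's daemon labels. The function `node_matches` is hypothetical and is not Swarm's actual scheduler code; it assumes shell-style glob patterns only and ignores anything else Swarm may support, and its treatment of nodes that lack the label is an assumption.

```python
from fnmatch import fnmatchcase


def node_matches(constraint, labels):
    """Return True if a node with the given labels satisfies the constraint.

    `constraint` looks like "constraint:region==us-east*" (hypothetical helper,
    a simplification of Swarm's constraint filter).
    """
    marker = "constraint:"
    expr = constraint[len(marker):] if constraint.startswith(marker) else constraint
    for op in ("==", "!="):
        key, found, pattern = expr.partition(op)
        if found:
            # Assumption: a node without the label is treated as an empty value.
            matched = fnmatchcase(labels.get(key, ""), pattern)
            return matched if op == "==" else not matched
    raise ValueError("expected key==pattern or key!=pattern: " + constraint)


east = {"region": "us-east"}
west = {"region": "eu-west"}
print(node_matches("constraint:region==us-east*", east))  # node is selected
print(node_matches("constraint:region!=us-east*", east))  # node is excluded
```

With `==` the container is placed only on matching nodes; with `!=` matching nodes are avoided.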

As for the API, the POST action to /api/workflows/:version would allow for multiple Docker env strings.

The other feature I would like to request is translating memory and cpu configuration options (at the task level) to Docker via --memory and --cpuset-cpus docker run flags, respectively. These options are currently only used for the JES backend, but it seems as though they can also be used for the Local backend if Docker is specified.

So, to summarize:

  1. Allow Docker -e flags to be specified for all tasks in a given workflow.
  2. Allow task memory and cpu options in a WDL document to be translated to --memory and --cpuset-cpus in the docker run command.

Please let me know if there's anything I can do to help this move forward.

Cheers! 🍻

mcovarr was assigned by geoffjentry Jan 12, 2016

Member

geoffjentry commented Jan 12, 2016

@slnovak Thanks for bringing this up. As an aside, we're looking at ways to bring our issue/ticket tracking from our internal server over to GitHub. It'll probably be a little while, but once it's in it should make things a lot less opaque for folks such as yourself.

@mcovarr can you look through this and create ticket(s) as appropriate?

Contributor

slnovak commented Jan 13, 2016

Thanks @geoffjentry and @mcovarr. Keep me in the loop if this is something that can be put into one of the upcoming sprints; otherwise I might see if I can find some time to submit a PR to implement this.

Member

geoffjentry commented Jan 13, 2016

@slnovak It'll depend on how soon upcoming is for you :) Over the next few weeks there are a couple of broad products sitting on top of Cromwell going live so I expect the bug tickets to be coming in fast and furious. We've been discussing our direction after the dust settles there and one of the priorities will be community requests. We'd certainly welcome a PR but if you have the time to wait a month or so it should hopefully be something we could get to.

Contributor

mcovarr commented Jan 13, 2016

Thanks @slnovak! Just FYI I've internally ticketed your two points separately.

Contributor

scottfrazer commented Jan 20, 2016

👍

Contributor

mcovarr commented Oct 14, 2016

I believe the config backend system in Cromwell 0.21+ now allows for everything that's asked for here.

Contributor

cjllanwarne commented Oct 17, 2016

@mcovarr if that is so, could we document the how-to somewhere? Or at least comment it into this issue thread? In the meantime, I'm adding this to the user driven rotation backlog.

mcovarr added a commit to broadinstitute/wdl that referenced this issue Jan 9, 2017

prefix() proposal. Closes broadinstitute/cromwell#375 c967f98
Contributor

mcovarr commented Jan 10, 2017 edited

Much like the sample config presented in #472, the config backend system should allow for passing memory and cpu options something like this:

  default = "Local"
  providers {
    Local {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        runtime-attributes = """
        String? docker
        Integer? docker_memory
        Integer? docker_cpu
        """
        submit-docker = """docker run --rm ${ "--memory " + docker_memory } ${ "--cpuset-cpus " + docker_cpu } -v ${cwd}:${docker_cwd} -i ${docker} /bin/bash ${docker_cwd}/execution/script"""
        .
        .
        .
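
To illustrate what the optional placeholders in a submit-docker line like the one above are meant to do, here is a small Python sketch. It assumes that a placeholder such as `${ "--memory " + docker_memory }` renders as an empty string when the optional attribute is unset, which is what the config appears to rely on. The helper names `optional_flag` and `render_submit_docker` are hypothetical and are not part of Cromwell; the sketch uses Docker's `--cpuset-cpus` spelling.

```python
def optional_flag(flag, value):
    """Mimic ${ "--flag " + value }: empty when the optional value is unset."""
    return "" if value is None else flag + " " + str(value)


def render_submit_docker(docker, cwd, docker_cwd, docker_memory=None, docker_cpu=None):
    """Render a command line shaped like the submit-docker template (illustrative only)."""
    parts = [
        "docker run --rm",
        optional_flag("--memory", docker_memory),
        optional_flag("--cpuset-cpus", docker_cpu),
        "-v {}:{}".format(cwd, docker_cwd),
        "-i {}".format(docker),
        "/bin/bash {}/execution/script".format(docker_cwd),
    ]
    # Unlike a raw template, empty parts are dropped so no double spaces remain.
    return " ".join(p for p in parts if p)


# Paths and values below are made up for the example.
with_limits = render_submit_docker(
    "ubuntu:latest", "/tmp/run", "/work", docker_memory=2147483648, docker_cpu=1
)
without_limits = render_submit_docker("ubuntu:latest", "/tmp/run", "/work")
print(with_limits)
print(without_limits)
```

Two caveats when choosing attribute values: Docker reads a bare integer passed to `--memory` as bytes, and `--cpuset-cpus` takes CPU ids (for example `0-1`) rather than a CPU count, so an integer `docker_cpu` pins the container to that CPU index.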

But environment variables are a bit trickier. There doesn't currently seem to be a nice way in WDL of composing a string like -e key1=value1 -e key2=value2. I've started working up a proposal over here for one way WDL might support this; feel free to chime in there! 😄

mcovarr added a commit to broadinstitute/wdl that referenced this issue Jan 13, 2017

prefix() proposal. Closes broadinstitute/cromwell#375 302b16d

mcovarr added a commit to broadinstitute/wdl that referenced this issue Jan 17, 2017

Merge pull request #84 from broadinstitute/mlc_join
prefix() proposal. Closes broadinstitute/cromwell#375
944ebea

mcovarr added a commit to broadinstitute/wdl4s that referenced this issue Jan 18, 2017

Merge pull request #66 from broadinstitute/mlc_prefix 3d4a2a0
Contributor

mcovarr commented Jan 19, 2017

Given a Local config like:

runtime-attributes = """
String? docker
String? docker_user
String? docker_env
"""

submit-docker = """docker run ${docker_env} --rm ${"--user " + docker_user} -v ${cwd}:${docker_cwd} -i ${docker} /bin/bash ${docker_cwd}/execution/script"""

Docker environment variables could be passed with a WDL like:

task build_env {
  Array[String] kvs = ["k1=v1", "k2=v2", "k3=v3"]
  Array[String] prefixed = prefix("-e ", kvs)
  command {
    echo "${sep=' ' prefixed}"
  }
  output {
    String out = read_string(stdout())
  }
}

task docker_task {
  String docker_env

  command {
    echo $k1
    echo $k2
    echo $k3
  }

  runtime {
    docker: "ubuntu:latest"
    docker_env: "${docker_env}"
  }

  output {
    Array[String] out = read_lines(stdout())
  }
}

workflow w {
  call build_env
  call docker_task { input: docker_env = build_env.out }
  output {
    Array[String] out = docker_task.out
  }
}

Having to use a separate task to stringify an array of String kv environment pairs is a little clunky, but it looks like the way the ${sep=' '...} expansion works currently requires this to be done in the command section.
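
As a plain-Python illustration of what the build_env task computes, the sketch below emulates `prefix()` followed by the `${sep=' ' ...}` expansion. The Python `prefix` function here only mirrors the behaviour described in this thread and is not the WDL implementation; the final command string is a shortened, made-up stand-in for the submit-docker line.

```python
def prefix(pre, items):
    """Emulate WDL's prefix(): prepend `pre` to every element of `items`."""
    return [pre + item for item in items]


kvs = ["k1=v1", "k2=v2", "k3=v3"]
prefixed = prefix("-e ", kvs)

# Equivalent of ${sep=' ' prefixed} in the command section.
docker_env = " ".join(prefixed)

# Shortened stand-in for the submit-docker template, for illustration only.
command = "docker run {} --rm -i {} /bin/bash script".format(docker_env, "ubuntu:latest")
print(docker_env)
print(command)
```

The resulting string is what docker_task receives as its docker_env runtime attribute, so the three variables are visible inside the container.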
