Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow String? for 'docker' RuntimeAttribute #1832

Open
cjllanwarne opened this issue Jan 10, 2017 · 27 comments
Open

Allow String? for 'docker' RuntimeAttribute #1832

cjllanwarne opened this issue Jan 10, 2017 · 27 comments

Comments

@cjllanwarne
Copy link
Contributor

In other words, we should validate docker as an optional string so that we can do things like this:

task foo {
  String? dockerName
  command { ... }
  runtime {
    docker: dockerName
  }
  output { ... }
}

The expected behaviour here is:

  • If dockerName is set, use docker and use the specified image name
  • If dockerName is not set, do not use docker.
@kcibul
Copy link
Contributor

kcibul commented Jan 10, 2017

Each backend could decide how to interpret the optional nature. For example the JES backend could decide that it's required and fail. The SFS backend could decide it's optional and just run without docker.

Specifically, the SFS backend (SGE) should implement this optional docker behavior

@kcibul kcibul added this to the Q1 - User Driven Development (& Bug Fixes) milestone Jan 10, 2017
@geoffjentry
Copy link
Contributor

While useful this is IMO more of the complecting of RAs that make me think they need a redo.

@kcibul
Copy link
Contributor

kcibul commented Jan 10, 2017

you said that in the original thread... that we said "let's do something tactical to solve the problem now" while the complecting continues ;)

@LeeTL1220
Copy link

@kcibul regarding issue #1804 .... Would my wdl and json look as follows?

workflow yo {
 String msg
 String? docker_image

 call task1 {
  input:
    msg=msg,
    docker_image=docker_image
 }
}

# Run a message in an arbitrary docker container (e.g. "broadinstitute/eval-gatk-protected:crsp_validation_latest")
task task1 {
 String msg
 String ? docker_image
 
 command {
  echo ${msg}
 } 
 
 runtime {
  docker: "${docker_image}"
  memory: "1GB"
 }
}

When I want a docker image:

{
 "yo.msg": "foo"
 "yo.docker_image": "broadinstitute/eval-gatk-protected:crsp_validation_latest"
}

No docker image:

{
 "yo.msg": "foo"
}

@LeeTL1220
Copy link

LeeTL1220 commented Jan 10, 2017

@kcibul why not support empty string as null, for docker, in the backend code? Are you worried about discerning an error where the user wants docker, but accidentally specifies empty string?

If my example wdl/json above is correct, I'd rather not have to delete json entries, since this is a bit more difficult to script around, but I can be convinced.

@geoffjentry
Copy link
Contributor

@LeeTL1220 let's not :)

@geoffjentry
Copy link
Contributor

@kcibul something something squeaky wheel

@LeeTL1220
Copy link

@geoffjentry I'm confused... Regardless, if it is a disproportionate amount of work, then I am convinced.

Can someone confirm that my wdl/json is correct?

@geoffjentry
Copy link
Contributor

@LeeTL1220 I'd rather not introduce magic values and billion dollar mistakes into wdl

@LeeTL1220
Copy link

@geoffjentry Geez... such drama... fine then...

@kcibul
Copy link
Contributor

kcibul commented Jan 10, 2017

why do our tickets turn in to long discussions ;) @LeeTL1220 and @geoffjentry let's talk after strat board, decide something and move on

@LeeTL1220
Copy link

@kcibul No need. I'll +1 this if someone can confirm that the above wdl/json (that I posted) is more-or-less correct.

@cjllanwarne
Copy link
Contributor Author

@LeeTL1220 Yes, that's the options file you would use. Perhaps alternatively using a javascript null to signify an active choice of no docker:

{
 "yo.msg": "foo"
 "yo.docker_image": null
}

@LeeTL1220
Copy link

LeeTL1220 commented Jan 10, 2017 via email

@LeeTL1220
Copy link

LeeTL1220 commented Jan 10, 2017 via email

@katevoss
Copy link

After discussing with @LeeTL1220 and redteam this is lower priority, as when using SGE the docker stanza is ignored and this is an acceptable workaround.

@katevoss katevoss added the Low label Mar 27, 2017
@cjllanwarne
Copy link
Contributor Author

@katevoss
Copy link

katevoss commented Sep 14, 2017

Adding this to the Runtime Attributes improvement spec

As a workflow runner on SGE, I want be able to make the Docker attribute to be optional, so that I can avoid rewriting my WDL when I don't use Docker.

  • Effort: @geoffjentry What do you think?
  • Risk: Small?
  • Business value: Small to Medium

@geoffjentry
Copy link
Contributor

@cjllanwarne If this is a path you want to pursue you should put on your commodore cap and do it on the other side of the fence :)

@mcovarr
Copy link
Contributor

mcovarr commented Nov 16, 2017

Commodore WDL

@LeeTL1220
Copy link

@ruchim We are getting a lot of questions about being able to run local/SGE without docker at the Spain workshop.... Not just Spanish people, but from several countries.

@LeeTL1220
Copy link

LeeTL1220 commented Sep 20, 2018

@ruchim I should clarify. There are a lot of people at this workshop (if not all of them) that cannot use the cloud, due to regulations, clinical requirements, etc. Or they need to be able to run WDL on their local on-prem compute cluster for testing on small cohorts, etc. This is a common configuration that prohibits docker. We really do not want these users to be forced to change the WDL that we (DSP methods) write and test. In order to stay backend-agnostic, can we implement a null option for docker as described in this issue (and #1804 )?

@geoffjentry
Copy link
Contributor

Or they could use something like nodocker or singularity

@LeeTL1220
Copy link

Singularity would require changes to the cromwell configuration file, correct? Since the docker command would change. Probably not hard, but would need documentation.

@kshakir
Copy link
Contributor

kshakir commented Sep 20, 2018

This is a common configuration that prohibits docker.

"Local" and SGE/HPC are two separate issues.

SGE (and all HPC) backends can already run without docker. When setting up the backend, just don't add a submit-docker config variable nor a docker runtime attribute to the backend's configuration. Docs for new local/HPC backends are documented under the title "SGE".

Separately, there is the issue that cromwell is pre-loaded with a default "Local" backend. This "Local" backend is docker enabled. The simplest workaround today is to create another backend "Local-NoDocker" or similar. Or if one wants to just change the existing "Local" backend they can use a config like:

include required(classpath("application"))
backend.providers.Local.config.runtime-attributes=""
backend.providers.Local.config.submit-docker=null

@eweitz
Copy link
Member

eweitz commented Mar 4, 2019

I encountered a need for this improvement while developing WDLs intended for FireCloud. Cromwell support for optional Docker runtimes would enable me to write FireCloud WDLs with quicker development iterations. It would also enable faster and more resilient automated testing (i.e., unit testing) of such WDLs.

My current approach to developing WDLs intended for FireCloud is to add Cromwell and my WDL (foo.wdl) into the Docker image that contains my workflow dependencies (e.g. Python and R code).
Then, from my local machine, I execute in my Docker container an equivalent version of my FireCloud WDL with a command like docker exec $containerId java -jar cromwell-36.1.jar run foo_test.wdl --inputs test_inputs.json --options options.json.

Without a way to override the runtime attribute (or ignore its docker key) in foo.wdl, I resort to commenting out the attribute and copying the content to foo_test.wdl. This enables fast development and unit testing, but requires manually syncing foo.wdl and foo_test.wdl. That, of course, has poor maintainability -- my approach is a kludge.

Adam (@aednichols) and I investigated better ways to do this, but found none. See Slack for more details about my issue. In summary, as an engineer using Cromwell to develop and test FireCloud WDLs, support for optional Docker runtimes as proposed here strikes me as the best option for my use case.

@aednichols
Copy link
Collaborator

Another point that Eric and I came across is that a "Docker in Docker" solution - i.e. installing Docker inside the Docker container where he's running Cromwell - is not good either because it necessitates pushing and re-pulling the Docker image he's iterating on, which makes for annoyingly long cycle times and can't work with bad or no Internet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants