Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: docker file #328

Merged
merged 7 commits into from
Sep 19, 2020
Merged

ENH: docker file #328

merged 7 commits into from
Sep 19, 2020

Conversation

cookpa
Copy link
Member

@cookpa cookpa commented Sep 13, 2020

First pass at a docker solution for ANTsR

@dorianps
Copy link
Collaborator

dorianps commented Sep 14, 2020

@cookpa , I see you use R CMD rather than devtools. That might be ok, but with devtools maybe the suggested packages listed in DESCRIPTION can be installed automatically. You use installDepndencies.R , which will require a separate updating of that file each time DESCRIPTION changes.

What size are you getting with this recipe? Maybe some of the tools needed to install are not needed in the final environment (starting with GIT). Staging might help with it, i.e., copying just the R packages folder in the second stage (my ANTs example here).

Last thing, you seem to set ITK threads hardcoded. You may want to consider to set them dynamically upon container run (I added a line to .bachrc and a line for ANTsR here).

@dorianps
Copy link
Collaborator

Also one thing, I would also print something on the screen on the ANTsR version on R startup, to make the users always aware of what they are using.

@muschellij2
Copy link
Collaborator

muschellij2 commented Sep 14, 2020 via email

@dorianps
Copy link
Collaborator

I don't think showing the software version is a problem. It is quite common to inform the user even without asking, e.g.:

image

image

@cookpa
Copy link
Member Author

cookpa commented Sep 14, 2020

Thanks for the feedback.

I have an alternative implementation that uses devtools. I went with R CMD Install in this version because that way the docker build will incorporate the code that the user has checked out. It just seemed like a more normal way to do things. But I'm new at this so maybe that's not a valid reason.

In case I'm not being clear, I was trying to enable the workflow:

  1. git clone ANTsR
  2. git checkout some version
  3. (optional) make some modifications
  4. docker build .

With R CMD Install, I copy the checked-out code to the container. But if I use devtools, the local source code becomes kind of irrelevant, which seemed like not the right default.

Regarding container size, it is huge, 2.2 Gb when installed (the compressed download from dockerhub will be smaller). The devtools version is about 300 Mb smaller, but could probably be made smaller still.

I take your point about having to maintain dependencies in two different places, that is kind of undesirable.

Another thing regarding versioning: with R CMD Install, I had control over the source and so could log what versions of ITKR, ANTsRCore and ANTsR got installed. With devtools, I could use the ref option to install a specific tag of ANTsR but it wasn't clear to me how to figure out what versions of ITKR and ANTsRCore came with it.

@cookpa
Copy link
Member Author

cookpa commented Sep 14, 2020

I don't know about printing things to the screen, I don't know what I would print except the output of packageVersion("ANTsR"), which users can ask for any time they want it.

@dorianps
Copy link
Collaborator

It's probably ok any way you do it. But I think devtools allows to install separately every package (ITK, ANTsRCore, ANTsR), pulling any version you like from each, installing the dependencies at various depths (below is the help file) and stopping the default cascade of upgrades with something like dependencies=FALSE, upgrade=FALSE (don't remember exactly the combination).

dependencies	
Which dependencies do you want to check? Can be a character vector (selecting from "Depends", "Imports", "LinkingTo", "Suggests", or "Enhances"), or a logical vector.

I thought this dockerfile is for the routine build of an official docker container to send to DockerHub, not for users to decide which ANTsR should be built. I think, users can go back in time with the older containers available in DockerHub instead of building an older version from scratch.

@dorianps
Copy link
Collaborator

@cookpa Startup messages are just cosmetic, I show the ANTsR version with a routine in .Rprofile https://github.com/dorianps/docker/blob/master/antsr/Dockerfile.antsr#L50-L57

@muschellij2
Copy link
Collaborator

muschellij2 commented Sep 14, 2020 via email

@cookpa
Copy link
Member Author

cookpa commented Sep 14, 2020

Some really helpful stuff in your Docker code @dorianps , sessioninfo::package_info is much better than what I was doing.

@stnava and @ntustison , do you have any preferences on using an R CMD Install vs devtools::install_github route for the container build?

@ntustison
Copy link
Member

Thanks @cookpa for this. I'll have to defer to you and @dorianps, @muschellij2 . I don't have a preference.

@cookpa
Copy link
Member Author

cookpa commented Sep 14, 2020

OK thanks. I could just include both, and make devtools the default (ie, the one called Dockerfile in the top-level dir). Developers can use the R CMD version, which offers a bit more control over the code.

@dorianps
Copy link
Collaborator

@cookpa , I guess you could go with two solutions, but it will need double maintenance. I don't fully know why would devs use docker to test old versions. To me docker is more to make life easier to users, while devs use local machines to do those occasional tests.

@cookpa cookpa marked this pull request as draft September 15, 2020 06:13
@dorianps
Copy link
Collaborator

@cookpa , I am thinking that installDependencies.R can be turned into a mini script that parses the DESCRIPTION file to find the list of dependencies. This way we don't keep a separate list in /docker/. You can give it a try yourself, but if that's complicated I can try after the pull is merged.

@cookpa
Copy link
Member Author

cookpa commented Sep 15, 2020

That's a good idea @dorianps . I think you were on the right track with devtools, and I'm trying to use it directly. I think it addresses all of the issues:

  1. Package versioning for the dependencies can be found with sessioninfo. This was my main reason for not using devtools, to allow full control and transparency of package versions. But sessioninfo gives all the details including the git revision of the stuff built from Github.

  2. User control over source. After thinking more, I think this is not all that important, as developers can point install_github to a fork if they need to test modifications to the code, or just use their own Dockerfiles.

  3. The devtools build route keeps the DESCRIPTION file as the sole reference for dependencies, which is nice. And the user can pass a build arg to set dependencies = NA, which installs a smaller set of packages.

I was having some trouble building with install_github using dependencies = TRUE. I think I have it figured out now. I had to change some options and remove the package "wmtsa" from the DESCRIPTION file because it has been removed from CRAN.

One other thing: for the ITK threads, I again encoded my bias of not using all the available cores, as this tends to cause problems on clusters. The user can multi-thread by passing the appropriate environment variable to the container or setting it within their R script.

Open to other feedback or ideas from anyone, if this all sounds good then I hope to be ready to merge in a day or so.

@muschellij2
Copy link
Collaborator

muschellij2 commented Sep 15, 2020 via email

@cookpa
Copy link
Member Author

cookpa commented Sep 15, 2020

Welp

Downloading GitHub repo cran/wmtsa@HEAD
splus2R (NA -> 1.2-2) [CRAN]
Skipping 1 packages not available: ifultools
Installing 2 packages: ifultools, splus2R
Installing packages into '/usr/local/lib/R/site-library'
(as 'lib' is unspecified)
Error: Failed to install 'ANTsR' from GitHub:
  Failed to install 'wmtsa' from GitHub:
  (converted from warning) package 'ifultools' is not available (for R version 4.0.2)
Execution halted

I might have to drop the devtools idea. Too many rounds of this.

@muschellij2
Copy link
Collaborator

muschellij2 commented Sep 15, 2020 via email

@dorianps
Copy link
Collaborator

If the package is not in cran install.packages() would fail too, right? That's not a problem created by devtools I guess.

@cookpa
Copy link
Member Author

cookpa commented Sep 16, 2020 via email

@muschellij2
Copy link
Collaborator

muschellij2 commented Sep 16, 2020 via email

@dorianps dorianps marked this pull request as ready for review September 16, 2020 14:11
@cookpa
Copy link
Member Author

cookpa commented Sep 16, 2020

cran/ifultools in DESCRIPTION did not work, maybe if I install it explicitly beforehand...

@cookpa
Copy link
Member Author

cookpa commented Sep 17, 2020

Now I'm getting dinged by tcltk warning about not having DISPLAY. Maybe if I set R_REMOTES_NO_ERRORS_FROM_WARNINGS="true" it will work.

@dorianps
Copy link
Collaborator

dorianps commented Sep 17, 2020 via email

@cookpa
Copy link
Member Author

cookpa commented Sep 17, 2020 via email

@cookpa
Copy link
Member Author

cookpa commented Sep 17, 2020

It works! Many thanks to @muschellij2 for the pointer to R_REMOTES_NO_ERRORS_FROM_WARNINGS

@cookpa
Copy link
Member Author

cookpa commented Sep 17, 2020

I think this is ready now. It's a big container with all the depends, we could shrink it by tidying up the dependency list in DESCRIPTION and also by smarter use of build layers. But I think this is a working first pass.

@cookpa
Copy link
Member Author

cookpa commented Sep 19, 2020

Thanks everyone I will merge now

@cookpa cookpa merged commit 51547dd into ANTsX:master Sep 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants