Mapping out the component dockerfiles and their dependencies? #1

Closed
cboettig opened this issue Sep 5, 2014 · 13 comments

@cboettig
Member

cboettig commented Sep 5, 2014

Wondering if we shouldn't do a bit of brainstorming about how best to nest the different Dockerfiles before we get started. It seems like we have an interest in building two trees: an Ubuntu-based one and a Debian-based one.

From the Dockerfiles we already have, the current structure of a given tree looks something like this:

- ubuntu 
  - add-r 
    - add-r-devel
    - add-r-devel-san
    - rstudio
      - ropensci
      - swc
- debian 
  - ...

This isn't a bad structure, but particularly out near the twigs there's some redundancy and room for shuffling. For instance, ropensci has a lot of the compiler/tex dependencies found in add-r-devel (actually it currently has everything in r-devel). swc could use the larger ropensci as a base (e.g. to share the compile tools, or so that the knit2pdf button on rstudio will actually work).

Offering too many images might be overwhelming to users, but smaller more modular dockerfiles might also be easier to maintain. Lots to think about I guess.
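
To make the nesting concrete, here is a minimal Python sketch of the Ubuntu tree listed above as a child-to-parent map. The helper names `lineage` and `build_order` are made up for illustration and are not part of any Docker tooling; the `RESHUFFLED` map only encodes the idea of basing swc on ropensci, not a decision.

```python
# Parent of each image in the Ubuntu tree sketched above; None marks the root.
PARENT = {
    "ubuntu": None,
    "add-r": "ubuntu",
    "add-r-devel": "add-r",
    "add-r-devel-san": "add-r",
    "rstudio": "add-r",
    "ropensci": "rstudio",
    "swc": "rstudio",
}


def lineage(image, parent=PARENT):
    """Return the chain of base images from the root down to `image`."""
    chain = []
    while image is not None:
        chain.append(image)
        image = parent[image]
    return chain[::-1]


def build_order(parent=PARENT):
    """Order images so that every base is built before the images on top of it."""
    # Sorting by depth is enough for a tree: a parent is always shallower
    # than its children. The name breaks ties so the order is stable.
    return sorted(parent, key=lambda name: (len(lineage(name, parent)), name))


# The reshuffle floated above: base swc on ropensci instead of rstudio.
RESHUFFLED = {**PARENT, "swc": "ropensci"}
```

Comparing `lineage("swc")` with `lineage("swc", RESHUFFLED)` shows which extra layers swc would inherit after such a reshuffle.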

@eddelbuettel
Member

Agreed on Debian as well as Ubuntu. Another open question is how to signal "super- and sub-set", i.e. a container re-used by others, including the alternatives when we have two.

@paulstaab

Nice to see that you are thinking about ways to combine R & docker. I'd like to join the discussion, if you don't mind.

What about having many small & modular images and adding a way to indicate which ones are supposed to be used by end-users ('exporting' images would also be very R-like...). I'm thinking about something like having two docker-repos point to this github repo, one which builds intermediate images, and one which builds end-user images. From the docker side we have something like

- user/r-docker:r
- user/r-docker:r-devel
- user/r-docker:rstudio
[...]

and

- user/r-docker-internal:add-r
- user/r-docker-internal:add-latex
- user/r-docker-internal:add-r-devel
[...]

From the GitHub side, images can still be arranged according to their actual dependency structure.
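
A rough Python sketch of that split, assuming the tags listed above are the complete set. The repo names are the placeholders from the lists, and `image_ref` is a hypothetical helper, not a Docker API.

```python
# Tags meant for end-users ("exported") versus intermediate build steps,
# taken from the two lists above.
EXPORTED = {"r", "r-devel", "rstudio"}
INTERNAL = {"add-r", "add-latex", "add-r-devel"}


def image_ref(tag, user="user"):
    """Return the repo:tag reference under which an image would be published."""
    if tag in EXPORTED:
        return f"{user}/r-docker:{tag}"
    if tag in INTERNAL:
        return f"{user}/r-docker-internal:{tag}"
    raise KeyError(f"unknown image tag: {tag}")
```

Users would only ever need to look at the `r-docker` repo, while maintainers could still compose images from the internal one.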

@eddelbuettel
Member

Thanks for joining the discussion. So you are suggesting to (explicitly) build images based on sub-images? I think I only ever started from a (single) image which I then mod'ed. This could work, I guess. Particular docs or examples?

@paulstaab

Not sure if I understand your question correctly. Are you asking if a particular image can be based on multiple images (e.g. combine an image containing debian with R and an image containing debian with latex to get an image with debian, R and latex)? If so, then no, I don't think that is possible (at least not yet; would be awesome though...).

I was just addressing the point that having many images might confuse users (completely independent of how images are built). By having multiple docker-repos, we could indicate which images are intended to be used by users, and which ones are just there to simplify maintenance. Or which ones are designated for R developers, and which ones for R users and so on...

@eddelbuettel
Member

Let's address this one by one.

Your chart already showed six different 'leaf nodes' aka images. But now you say _having many images might confuse users_. And I agree with that.

Where does that leave your proposed idea then? Should we rather concentrate on, say, a handful (at the most) "useful" images: r-devel, r-studio, ... and then let others, or at least other repos, refine this further, say with domain-specific apps?

@paulstaab

Sorry, I should have made it clearer that I was thinking about ways we could have many images without confusing users. The reason for that was that the first post mentioned that many small and modular dockerfiles might be easier to maintain than a few with a lot of redundancy.

If we stick to a small number of images however, I agree that maintenance should not be a problem and a modular approach is not necessary.

@cboettig
Member Author

Okay, following the basic strategy outlined at the top of this issue, I've added Dockerfiles in both Ubuntu and Debian flavors for: r, rstudio, and r-dev (each building on the other).

  • Maybe we want just 2 Dockerfiles instead of 3 (e.g. wrap r into the rstudio dockerfile)?
  • Doing these sequentially means that r-dev has rstudio on it, maybe that's not ideal? On the other hand, the r-dev tex libraries etc are much larger than rstudio anyway, so r-dev will never be a small image so maybe it's fine having rstudio available on r-dev. After all, a user would need this image to use the tex features of rmarkdown anyhow.
  • The name r-dev is meant to imply R development generally; it does not include the devel pre-release of R from the svn. Maybe it should, following the example of @eddelbuettel's add-r-devel-san Dockerfiles?

@eddelbuettel
Copy link
Member

Great, thanks for pushing the cart!

Personally, I find 'r-dev' too confusingly close to 'r-devel'. Maybe we should just define 'r' to be 'r-dev' and maybe (if needed) have a more bare-bones 'r-base'?

@cboettig
Member Author

Agreed.

That has the nice side-effect of focusing people on the most complete image, which is probably most useful to beginners even if it has more software than they usually use, since it avoids having to mess around with installation.

I've renamed:

  • r-dev -> r
  • r -> r-base

(in both debian and ubuntu flavors) Thoughts?

@eddelbuettel
Member

That's better. Unless someone finds 'r-base' preferable to 'r'. But we can clarify in a README.md ...

@cboettig
Member Author

Yup.

So we don't have an image that provides the r-devel pre-release version.
Should I add that to the r image, aliased as Rdevel?

When you get a chance, I'd appreciate your input on what should go into the
r image. The current selection has the popular packages from the
hadleyverse (particularly those with external library dependencies, where a
simple install.packages() would thus fail), but overall is perhaps
arbitrary.

On Thu, Sep 18, 2014 at 3:08 PM, Dirk Eddelbuettel <notifications@github.com> wrote:

> That's better. Unless someone finds 'r-base' preferable to 'r'. But we can clarify in a README.md ...

Carl Boettiger
UC Santa Cruz
http://carlboettiger.info/

@cboettig
Member Author

We've decided to go with three base images:

  • r
  • rstudio
  • r-devel

(in both debian and ubuntu flavors) and then some "use case" ones merging all three plus specific add-ons, e.g.

  • ropensci: r + r-devel + rstudio + hadley + ropensci packages

Use cases will be targeted to what we feel the community is most likely to build upon, and won't cover all combinations. Package selection will emphasize packages with a non-trivial install (e.g. external library dependencies, long compile times etc). We're open to proposals for particular use cases through this repo's issue tracker.

Finally, we'll develop a wiki for tutorials on how to install additional packages on ubuntu/debian systems (taking advantage of debian-r.debian.net and marutter repos). This will probably emphasize writing Dockerfiles rather than interactive installation (hey, it's easier than writing a .travis file, and it's better from a reproducible-research standpoint if individual researchers can generate images specific to their needs or the needs of their community).
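
As a sketch of what such a tutorial might boil down to, the Python helper below renders a small Dockerfile that installs binary R packages with apt-get on top of an existing image. The function name `render_dockerfile`, the base image `user/r-docker:r`, and the package choice are all illustrative assumptions, not the contents of this repo.

```python
def render_dockerfile(base, apt_packages):
    """Return Dockerfile text that installs Debian/Ubuntu packages with apt-get."""
    if not apt_packages:
        raise ValueError("need at least one package to install")
    lines = [
        f"FROM {base}",
        "RUN apt-get update \\",
        "    && apt-get install -y --no-install-recommends \\",
    ]
    # One package per line, sorted, so later diffs of the Dockerfile stay readable.
    lines.extend(f"        {pkg} \\" for pkg in sorted(apt_packages))
    # Drop the apt package lists so they do not bloat the image layer.
    lines.append("    && rm -rf /var/lib/apt/lists/*")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical base image and example r-cran-* binary packages.
    print(render_dockerfile("user/r-docker:r", ["r-cran-xml", "r-cran-rcurl"]))
```

Writing the result to a file named `Dockerfile` and running `docker build` in that directory would produce an image specific to that package set.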

@cboettig
Member Author

cboettig commented Oct 7, 2014

Currently implemented as outlined above. The plan is now to deprecate the ubuntu images.

3 participants