linking containers to provide TeX? #78

Closed
cboettig opened this issue Nov 6, 2014 · 6 comments
Comments

@cboettig
Member

cboettig commented Nov 6, 2014

Been thinking about this for a while, but @benmarwick 's examples with --volumes-from convinced me to give this a try.

While there's an obvious level of convenience in having something like LaTeX bundled into the hadleyverse container so that users can build nice pdfs, it often feels not very docker-esque to me to just throw the kitchen sink into a container. At the risk of some added complexity, we can provide LaTeX from a dedicated TeX container to a container that doesn't have it built in, like rocker/rstudio. Check this out:

docker run --name tex -v /usr/local/texlive texlive
docker run -dP --volumes-from tex \
  -e PATH=$PATH:/usr/local/texlive/2014/bin/x86_64-linux/ \
    rocker/rstudio 

We can now log into RStudio, create a new Rnw file and presto, RStudio discovers the tex compilers and builds us a pdf. This does make our Docker execution lines a bit long, but that's what fig is for. (Or a good ole Makefile).

The above example uses a version I built locally on a debian:testing container, for a whopping 4 GB container; of course the LaTeX in hadleyverse is stripped down to about a 1GB install. Looks like one can also just use this existing docker image from the hub: https://registry.hub.docker.com/u/leodido/texlive/dockerfile/ (even though it's built on centos?)

Note this requires that we build texlive in a way that isolates it to its own path (e.g. /usr/local/texlive). The default installation with apt-get spreads files across separate locations that overlap with existing directories (like /usr/bin), which makes linking clumsy or impossible: we would need separate paths for all the components, since shared libraries aren't found under the bin path, and we cannot link such a volume to another container without destroying everything in its /usr/bin, which is clearly not a good idea. Instead, if we use the standard texlive install script from https://www.tug.org/texlive/, everything is installed into /usr/local/texlive, which is much more portable, as illustrated above. Not quite sure if it's actually a good idea to build containers this way or not.
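To illustrate why a self-contained prefix is easy to plug in, here is a minimal sketch that runs without Docker. It assumes a throwaway directory standing in for the texlive volume and a hypothetical stub named texstub standing in for a TeX binary; only the directory layout is taken from the commands above.

```shell
#!/bin/sh
# Stand-in for the TeX Live volume: a throwaway prefix with one stub binary.
# The layout mirrors /usr/local/texlive/2014/bin/x86_64-linux, but nothing
# here is real TeX; "texstub" is a hypothetical placeholder.
prefix=$(mktemp -d)
bindir="$prefix/texlive/2014/bin/x86_64-linux"
mkdir -p "$bindir"
printf '#!/bin/sh\necho "stub TeX binary"\n' > "$bindir/texstub"
chmod +x "$bindir/texstub"

# Appending this one directory to PATH is all the consumer has to do;
# nothing under /usr/bin is touched or hidden.
PATH="$PATH:$bindir"
found=$(command -v texstub)
echo "$found"
```

Because the whole install lives under one directory, a single PATH entry is enough for lookup. An apt-get install has no equivalent single directory to hand over.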

@eddelbuettel
Member

That is a really tough one.

I am biased in the Debian way: if you need it, install it via apt-get and it works. A suggestion to completely rework that ("hey, let's just do our own texlive") sounds a little nutty to me.

But then your point about fig and combining containers is really excellent. I don't understand yet why the paths have to be different. Can't we join with binaries from another path that are built from the same vintage (say debian/testing), so that the core dynamic libraries are the same? I may be missing a Docker detail here.

Barring that, I still like our idea of somewhat narrow containers onto which users can add/install new components and then 're-commit'. Once an addition becomes very popular ("grassroots votes", somehow magically aggregated...) we add it. A bit more timid as a proposal.

@cboettig
Member Author

cboettig commented Nov 6, 2014

Cool, I'm totally in favor of the grass-roots approach. We already install a basic tex setup in hadleyverse and I don't see any reason to remove it, since it works for most common uses without any of this linking magic anyway. Currently if a user wanted tex, I'd tell them to use hadleyverse but I'm just thinking we might document an alternative route that links the tex libraries from a different container. Somehow that feels more docker-esque to me (separate software, separate containers)

a couple quick thoughts:

  • Yeah, wouldn't necessarily need us to make the texlive container, one could just tell users to link an existing texlive container.
  • I'm mostly thinking about this from the perspective of a user accessing the container exclusively through RStudio, which means no root permissions (we could add the user to sudoers but we don't at the moment, thinking security-wise for a server product). Such a user cannot apt-get texlive.
  • apt-get install texlive-full takes over 30 minutes to run on my desktop, so it's not a convenient fix.
  • Re joining binaries / paths, you could be right, but I don't see how to do it. I tried creating a container with apt-get install texlive, which puts pdflatex in /usr/bin. I link that binary alone with --volumes-from /usr/bin/pdflatex. The rstudio container is happy that it sees pdflatex in the path, but it fails to run because the shared libraries that texlive installed aren't linked (likewise, all the other binaries, like latex, aren't linked yet either). You can't just link a path that already exists on the RStudio container, because the contents of that path that are already on the container get clobbered. Make sense? Can you tell apt-get to do this kind of more 'isolated' install?
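One way to see which shared libraries are missing after linking a lone binary is to ask ldd inside the consuming container. This is only a sketch: missing_libs is a hypothetical helper name, and /bin/sh is used as the target purely so the snippet runs anywhere.

```shell
#!/bin/sh
# Print the shared libraries a binary needs that the dynamic linker cannot
# resolve in the current filesystem. Empty output means nothing was reported
# as missing.
missing_libs() {
  ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Inside the RStudio container one would point this at the linked-in binary,
# e.g. missing_libs /usr/bin/pdflatex. /bin/sh is used here only because it
# exists on any system.
missing_libs /bin/sh
```

Any line containing "not found" names a library that would also have to be made visible to the container, which is the clumsy part described in the last bullet.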

Anyway, so more of a proposal about documenting how to link than about hosting something new.

@eddelbuettel
Member

Maybe we are missing something about how fig or CoreOS combine containers? Seems like it ought to be possible to run binaries from another container, but maybe not.

You can tell dpkg / apt to install (ie "expand the ar archive") somewhere else. But then you own the problem of getting those paths covered via $PATH, the dynamic linker etc pp.
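A rough sketch of what "owning the problem" looks like, assuming a package has been unpacked under /opt/tex (a placeholder path). The helper relocated_env is hypothetical, and the multiarch library directory is an assumption that fits Debian on amd64. TeX would also need its data files located, which this does not cover.

```shell
#!/bin/sh
# Hypothetical helper: given the directory a .deb was unpacked into, print
# the PATH and LD_LIBRARY_PATH settings a consumer would need.
# Assumption: Debian amd64 layout (usr/bin, usr/lib/x86_64-linux-gnu).
relocated_env() {
  root=$1
  printf 'PATH=%s/usr/bin:$PATH\n' "$root"
  printf 'LD_LIBRARY_PATH=%s/usr/lib/x86_64-linux-gnu:%s/usr/lib\n' "$root" "$root"
}

# Unpacking itself would look like this (not run here; the .deb file name
# is a placeholder):
#   dpkg-deb -x some-texlive-package.deb /opt/tex
relocated_env /opt/tex
```

Every relocated package adds entries like these, so the bookkeeping grows quickly compared with the single-prefix texlive install.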

texlive-full is overkill. I need to install a fair amount of texlive to build the R packages, and "everything" (unpackaging the chroot, installing everything R needs to build, building, testing, ...) took 14 minutes last week, and my machine is modest. Maybe we "just" need to redefine the problem to not require 1 gb of texlive for everyone? Those who need more esoteric components may have to add them. Or is that too draconian?

@cboettig
Member Author

cboettig commented Nov 6, 2014

whoops, my numbers were out of date. that layer in hadleyverse is now down to 302 MB (and includes a few other things than texlive bits). The texlive layer in r-devel is 1.236 GB, though it installs other things in that layer as well. (note that hadleyverse doesn't inherit this since it doesn't build on r-devel). So anyway, not proposing to tweak the existing containers, where what we have works.

If a user did want some extensive texlive suite though, it might still be reasonable to suggest to them they can just 'plug it in' from another container. Potentially this is useful more generally too? (e.g. someone wants to call python libraries on another container, etc). Mostly I'm trying to wrap my head around if there's a more natural way to combine containers providing different functionality than just stacking on top of the previous container with a "FROM" command.

Good question about CoreOS, I have no idea. My impression was that it was aimed at servers running independent services, I suspect it doesn't support the idea of shared libraries between containers the way a normal linux os would, but that's my idle speculation.

@eddelbuettel
Member

Yep, I also do not know right now whether they do more than just combining "services on ports". Naively, one "should" be able to union-fs join several containers just as aufs layers get joined inside one.

@cboettig
Member Author

Okay, don't think this is something we're going to implement, though anyone can already use --volumes-from with an existing LaTeX container to do this if they like without us doing anything, e.g.

docker run --name tex -v /usr/local/texlive leodido/texlive true
docker run -dP --volumes-from tex \
 -e PATH=$PATH:/usr/local/texlive/2014/bin/x86_64-linux/ \
 rocker/rstudio
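One caveat worth checking with commands of this shape: an unquoted $PATH is expanded by the host shell, so the container receives the host's PATH plus the texlive directory. The sketch below builds the -e argument from an assumed container PATH instead. The value shown is the usual default for Debian-based images and is an assumption, not something confirmed for rocker/rstudio.

```shell
#!/bin/sh
# Assumed default PATH of a Debian-based image; check the real value with
#   docker run --rm rocker/rstudio printenv PATH
container_path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
tex_bin=/usr/local/texlive/2014/bin/x86_64-linux

# Build the -e argument from the container's PATH rather than the host's.
path_arg="PATH=$container_path:$tex_bin"

# Print the command instead of running it, so the sketch works without Docker.
echo "docker run -dP --volumes-from tex -e $path_arg rocker/rstudio"
```

If host and container happen to share the same PATH layout, the original form behaves the same way; the explicit value just removes the dependence on the host.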
