Builds on DockerHub #2
These are my opinions as a Docker and IPython user; I am not an official member of your project.

Re: (3) Does Docker tagging help us with anything? As a user, it would help if I could rely on certain numbered versions being unchanging, i.e. designated releases, so I can use them to reliably build functionality that will continue to function for processes or customers that need a frozen environment that always works. :latest should be for Jupyter devs, testers, and bleeding-edgers who can afford a malfunction or can investigate it.

As to the myriad versions of everything else, the choices seem to be: (a) aggregation into one container, (b) joining various containers, (c) a customizable build script left in the container.

(a) Aggregate into one container. This is good for end-user ease of use, until users decide they want something different from the stock arrangement. One issue is that this style tends to produce huge containers that take forever to download. Some of the Spark/Hadoop Docker containers released by others already suffer from too-big and too-many-layers. One solution would be to use a suitable base image that everyone using Docker already has, or should have, but as I write this, the ubuntu:latest and debian:latest images have python3 and no python2, and centos:latest has python2 and no python3... not to mention the other lesser-used components.

(b) Split across containers. Here an environment would be built from several Docker containers linked to the jupyter container, perhaps as Docker volumes. This is often done for TCP/IP linking of a database running in one container with an app, like a web front end, running in another; the mysql/maria containers provide examples of that pattern, but it doesn't seem to be the primary problem faced here. Instead, the problem here seems to be "what lang/environment is this jupyter for? Can I adjust that without downloading the entirety of jupyter again?" A -v volume option exists that allows mounting one container's filesystem inside another container, but this configuration would seem to be harder on developers and end users, though perhaps more flexible for creating various configurations. I'm unaware of a way for a bunch of containers to each dump executables into /usr/bin of a single container; in the absence of that functionality, some clever planning and setting of PATH, PYTHONPATH, etc. would need to be done.

(c) Customizable build script left in the container. The idea is to leave a script in the container that uses root privilege on the container, along with apt, yum, pip, and similar tools, to customize the environment from the base environment into something that works, and then run the resulting environment as an ordinary user.

As a potential end user, (a) and (c) currently sound best to me.
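Option (c) might look something like the following sketch: a script baked into the image that installs extras with root privilege and then drops to an ordinary user. The script name, the package arguments, and the "notebook" user are all hypothetical illustrations, not taken from any actual image:

```shell
# Hypothetical customize.sh that an image could ship; invoked as the container's
# entry point with the desired extra packages as arguments.
cat > customize.sh <<'EOF'
#!/bin/sh
# Runs as root inside the container: install extra packages with apt,
# then continue as an ordinary (non-root) user. All names are placeholders.
set -e
apt-get update
apt-get install -y --no-install-recommends "$@"
exec su notebook -c "jupyter notebook --ip=0.0.0.0"
EOF
chmod +x customize.sh
```

Inside a container this could be invoked as, say, `docker run -it some/image /customize.sh python-dev r-base`; again, this is a sketch of the idea, not a real image's interface.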
I believe we're on track for (c), but please check this assumption. The initial Docker image definitions in this repo install
Most of the issues we had with automated builds were how permissions on Docker Hub worked, and we weren't getting deterministic builds. I'm a bigger fan of automated trusted builds (as much as they can be trusted), triggered via the normal GitHub -> Docker Hub webhook.
I'd like to give the automated builds another shot. I started by getting minimal-notebook working properly under my personal namespace on Docker Hub. It built without a hitch. I'll try scipy-notebook too (hacked to point to parente/*) and ensure the automation properly rebuilds scipy-notebook when a new build of minimal-notebook finishes. If all that works, it would be good to get the builds going under jupyter/*.
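The "hacked to point to parente/*" step might look like this sketch; the directory layout and Dockerfile contents are illustrative, only the namespaces come from the thread:

```shell
# Simulate a stack directory whose Dockerfile builds FROM the official base image.
mkdir -p scipy-notebook
printf 'FROM jupyter/minimal-notebook\n' > scipy-notebook/Dockerfile

# Temporarily repoint the FROM line at the personal namespace for testing:
sed -i 's|^FROM jupyter/|FROM parente/|' scipy-notebook/Dockerfile

cat scipy-notebook/Dockerfile   # FROM parente/minimal-notebook
```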
I think we should give the automated builds a shot again and only fall back on a bespoke solution if necessary. Under what org should we build the images on Docker Hub? Should we create a new one?
I think they can just be under jupyter. @rgbkrk?
I think they can be under jupyter. |
Works for me. @rgbkrk can you grant me permissions in the jupyter org to set it up?
I set up the automated builds for "latest" versions of the stacks we have. Everything went smoothly except r-notebook, which has the new behavior (on Docker Hub and locally for me) of hanging while conda solves package specs. I'll look into debugging it locally.
The r-notebook problem was related to r-devtools. Bumping it to 1.8 and adjusting for the newer R 3.2 release and incompatible packages solved the problem. It's now on Docker Hub. (I'm still not clear why conda was struggling with the older versions, but moving on...) Along the way I noticed it was installing a second copy of IPy. PR #8 should fix it. @rgbkrk The Docker Hub webhooks for this git repo don't seem to be enabled. I don't have admin permissions on the repo to check. (And I'm a bit lost on where I would configure a Docker Hub organization to enable them for an organization on GitHub.) Did you or someone else manage to set them up for ipython/docker-images originally?
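The version pin could be expressed as a conda spec file along these lines. This is only a sketch: the file name and the --file invocation are assumptions, and the thread says only that r-devtools was bumped to 1.8 alongside the R 3.2 release:

```shell
# Pin the packages conda was struggling to solve (versions from the thread above;
# the spec-file layout itself is an illustrative assumption).
cat > r-packages.txt <<'EOF'
r-base=3.2*
r-devtools=1.8*
EOF

# During the image build this would be consumed with:
#   conda install --yes --file r-packages.txt
grep -c '=' r-packages.txt   # prints 2 (two pinned packages)
```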
Let me see what I can do there too.
Want to try setting up hooks now? You now have admin access on this repo instead of just write.
I'll give it a shot later today. Tnx.
I think it's now enabled, but we'll have to wait for the next git push / PR to find out. If it doesn't work, it's honestly not too bad at the moment, with the few stacks we have, to manually go trigger the Docker Hub builds; better control, too. As it stands, any merge to any stack folder is going to trigger all of them to rebuild, and there's nothing we can do about it as long as all the stacks are in one repo. At any rate, all the "latest" images are now pushed to Docker Hub using notebook 3.2.1. I plan to now create a 3.2.1 branch, update all the descendants of minimal-notebook to build FROM jupyter/minimal-notebook:3.2.1, and get those tagged builds going too. If that all works, we can move on to issue #6 once Conda has a 4.0 build.
Created branch 3.2.x (following the pattern of jupyter/notebook branches). Set up builds for that branch on Docker Hub. Manually triggered the build of jupyter/minimal-notebook:3.2. (Notice there's no patch number, since the conda command allows the patch to vary.) Updated all Dockerfiles in the branch to build FROM that tagged 3.2 image. Pushed that minor change to the 3.2.x branch, and all images started rebuilding automatically on Docker Hub. They all finished successfully.
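The repointing step described above might be sketched like this; the directory layout and file contents are illustrative, and only the jupyter/minimal-notebook:3.2 tag comes from the comment:

```shell
# Simulate a stacks checkout on the 3.2.x branch with one descendant image.
mkdir -p stacks/scipy-notebook
printf 'FROM jupyter/minimal-notebook:latest\n' > stacks/scipy-notebook/Dockerfile

# Repoint every descendant Dockerfile at the tagged 3.2 base image:
for f in stacks/*/Dockerfile; do
  sed -i 's|^FROM jupyter/minimal-notebook.*|FROM jupyter/minimal-notebook:3.2|' "$f"
done

cat stacks/scipy-notebook/Dockerfile   # FROM jupyter/minimal-notebook:3.2
```

Pushing a change like this to the branch is what lets the Docker Hub branch builds pick up the tagged base automatically.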
(1) and (2) from the original description are done. (3) is done enough (tags reflect main process version). I opened issue #12 for the finer points of how to capture versions of installed libraries in a stable manner.
I'd like to get stack builds cranking on DockerHub as soon as there's a few more images worth building. (Note to self: hurry up with more images. :) To do so, though, we need to answer a few questions.

Take the minimal-notebook and r-notebook stacks as examples. Do they appear as jupyter/minimal-notebook and jupyter/r-notebook on DockerHub? The naming is consistent with everything else in the Jupyter project, but will users tell jupyter/minimal-notebook apart from jupyter/notebook (setup for dev / test) or jupyter/minimal-demo (base image for tmpnb.org)? Do we put them in a new Docker repository like docker-stacks/minimal-notebook, docker-stacks/r-notebook, etc.? Can we even use a different Docker repository name, or is it linked to the GitHub user name?