Construction and Deployment of a Docker image #5
This launches a huge download (at least the first time) of several GB. I guess this is related to a one-time download of the I also get an error with Choosing a proper port the log gives me a link that should point me to a Jupyter server, but these links don't work in Chrome. Perhaps port 8888 is hard-coded into the docker image, so I'm out of luck if my port 8888 is in use? |
Also, the log indicates that this image is for a different architecture than mine. My Mac is an M1; probably this was built for Intel? It still runs, but I'm not sure if this is an issue -- do the docker images depend on an architecture? |
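On the architecture question: Docker images are published per platform (for example linux/amd64 or linux/arm64), and an image built for Intel generally runs on an M1 only through emulation. A minimal sketch, not taken from this PR, that maps the host's machine type to Docker's architecture naming so it can be compared with what the log reports. The helper name docker_arch and its alias table are hypothetical.

```python
import platform

# Hypothetical alias table: platform.machine() values -> Docker architecture names.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def docker_arch(machine=None):
    """Return Docker's name for a CPU architecture, or None if it is not in the table."""
    name = (machine or platform.machine()).lower()
    return _ARCH_ALIASES.get(name)


if __name__ == "__main__":
    # On an M1 Mac this is expected to report arm64; an Intel build would be amd64.
    print("host architecture (Docker naming):", docker_arch())
```

If the two names differ, the image still starts (as observed above) but is likely to run more slowly than a native build.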
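Overall, I think this can wait until we actually have several people who want an "official" docker image. By simply running pip install -r requirements.txt, this really does no more isolation of code than if I were to do:
    conda create -n my_islp_env python=3.11 -y
    conda activate my_islp_env
    pip install -r https://raw.githubusercontent.com/intro-stat-learning/ISLP_labs/v2.1/requirements.txt
|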
Yes, docker images are very much an Ubuntu thing. That's a huge advantage, as you can use them on Windows, Mac or Ubuntu. I am using a Mac with an M1, too |
Yes, loading the image the first time is a huge operation if you don't have the scipy-notebook layers in cache... |
Got it, you made a tag v0.0.1... https://github.com/tschm/ISLP_labs/releases/tag/v0.0.1
Was just not sure where v0.0.1 came from since the intro-stat-learning repo doesn't have that tag.
Could also change it to a manual dispatch or something else I suppose.
…________________________________
From: Thomas Schmelzer ***@***.***>
Sent: Sunday, August 20, 2023 11:25 PM
To: intro-stat-learning/ISLP_labs ***@***.***>
Cc: Jonathan Taylor ***@***.***>; Comment ***@***.***>
Subject: Re: [intro-stat-learning/ISLP_labs] Construction and Deployment of a Docker image (PR #5)
@tschm commented on this pull request.
________________________________
On .github/workflows/docker.yml:
The constructed docker image is tagged. The workflow is only executed when
    on:
      release:
        types: [published]
Hence the release tag is picked up and used to tag the image. If you just make a simple commit, no new docker image is constructed.
At the same time, the image :latest is updated.
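The tagging rule described in this comment can be sketched as a small function. This is only an illustration of the rule, not the workflow's actual implementation; the function name image_tags is hypothetical, and the repository name and tag are the ones mentioned elsewhere in this thread.

```python
def image_tags(repository, release_tag):
    """Return the image references pushed for one published release.

    A published release yields two references: one carrying the release
    tag itself and one moving ':latest' to the same image.
    """
    if not release_tag:
        # Plain commits carry no release tag, so nothing is built for them.
        raise ValueError("a release tag is required")
    return [f"{repository}:{release_tag}", f"{repository}:latest"]


if __name__ == "__main__":
    for ref in image_tags("tschm/islp_labs", "v0.0.1"):
        print(ref)
```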
|
For the port, |
I would not use conda or recommend it :-) Where do you get jupyterlab from? |
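Well, conda is a community standard (even if it has flaws). I typically just use it to create a minimal environment, then pip for everything else. Could use mamba instead. Both are much lighter weight than docker.
Fair enough about jupyterlab. This is generally enough:
    pip install jupyterlab
|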
Yep, |
I would recommend keeping and documenting both options. The results of your pip install will not be invariant, as you can't control the dependencies of your dependencies. Also, some versions you point to may disappear. Once you bake them into an image, they are there for eternity. You may not need this level of robustness though. |
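To make the reproducibility point concrete, here is a minimal sketch that lists requirement lines not pinned to an exact version. The function name unpinned and the sample requirement lines are hypothetical; they are not the contents of this repository's requirements.txt.

```python
import re

# Matches an exact pin such as "numpy==1.24.2", optionally with extras.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^=\s]+")


def unpinned(requirements_text):
    """Return requirement lines that are not pinned to an exact version with '=='."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options such as -r
            continue
        if not _PINNED.match(line):
            loose.append(line)
    return loose


if __name__ == "__main__":
    sample = "numpy==1.24.2\npandas>=1.5\n# a comment\nscikit-learn\n"  # illustrative only
    print(unpinned(sample))
```

Even when this returns an empty list, the dependencies of those packages are only fixed if they are listed and pinned too, which is the gap that baking everything into an image closes.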
I wish the community standard would be to set up a virtual environment in the first place, as you do. To me it seems people just pip install into their central Python env |
OK, choosing -p 10000:8888 works for me. So, this just opens essentially the same thing as this: https://mybinder.org/v2/gh/intro-stat-learning/ISLP_labs/v2.1
So, on the whole, this is "effectively" capturing the docker image that binder builds.
It has more packages due to the FROM docker.io/jupyter/scipy-notebook line. This could lead to conflicts if requirements.txt is not current with that image... Using binder doesn't make that assumption.
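The -p 10000:8888 mapping publishes the container's port 8888 on host port 10000, so a busy host port 8888 is not a blocker. A minimal sketch, assuming a local Docker installation, that asks the operating system for a free port and builds the matching command line. The helper names are hypothetical and the image name is the one pushed to Docker Hub in this thread.

```python
import socket


def free_port():
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]


def docker_run_args(image, host_port, container_port=8888):
    """Build a 'docker run' argument list publishing container_port on host_port."""
    return ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", image]


if __name__ == "__main__":
    port = free_port()
    print(" ".join(docker_run_args("tschm/islp_labs:v0.0.1", port)))
    # The server inside the container still logs links on port 8888;
    # in the browser, the host port printed above has to be used instead.
```

The port could be taken by another process between the check and the docker run call, so this is a convenience rather than a guarantee.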
|
You can also create an even bigger image that has both R and Python installed. See jupyter-stack documentation |
But it seems a little heavy-handed to say the solution is to use docker instead of teaching them to manage a virtual environment....
From: Thomas Schmelzer ***@***.***>
Sent: Sunday, August 20, 2023 11:43 PM
Overall, I think this can wait until we actually have several people who want an "official" docker image.
By simply running pip install -r requirements.txt, this really does no more isolation of code than if I were to do:
    conda create -n my_islp_env python=3.11 -y
    conda activate my_islp_env
    pip install -r https://raw.githubusercontent.com/intro-stat-learning/ISLP_labs/v2.1/requirements.txt
I would not use conda or recommend it :-) Where do you get jupyterlab from?
Well, conda is a community standard (even if it has flaws). I typically just use it to create a minimal environment, then pip for everything else. Could use mamba instead. Both are much lighter weight than docker.
Fair enough about jupyterlab. This is generally enough:
    pip install jupyterlab
I wish the community standard would be to set up a virtual environment in the first place, as you do. To me it seems people just pip install into their central Python env
|
docker/Dockerfile
Outdated
@@ -0,0 +1,10 @@
FROM docker.io/jupyter/scipy-notebook:lab-4.0.4
This implicitly adds more requirements to requirements.txt that could clash.
Yes, see
Everything in jupyter/minimal-notebook and its ancestor images
altair, beautifulsoup4, bokeh, bottleneck, cloudpickle, conda-forge::blas=*=openblas, cython, dask, dill, h5py, jupyterlab-git, matplotlib-base, numba, numexpr, openpyxl, pandas, patsy, protobuf, pytables, scikit-image, scikit-learn, scipy, seaborn, sqlalchemy, statsmodels, sympy, widgetsnbextension, xlrd packages
ipympl and ipywidgets for interactive visualizations and plots in Python notebooks
Facets for visualizing machine learning datasets
You could replace the scipy-notebook with the minimal-notebook: a smaller image and no unwanted packages.
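The clash risk raised in this review thread can be checked mechanically. A minimal sketch, assuming the base image's package list is available as plain names (as in the list quoted above), that reports which requirements are already present in the base image. The function names and the sample requirement lines are hypothetical.

```python
import re


def package_name(requirement):
    """Extract a normalized package name from a line such as 'Pandas>=1.5'."""
    name = re.split(r"[\s\[<>=!~;]", requirement.strip(), maxsplit=1)[0]
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", name).lower()


def overlap(requirements, preinstalled):
    """Return names found both in the requirements and in the base image's package list."""
    wanted = {package_name(r) for r in requirements if r.strip()}
    present = {package_name(p) for p in preinstalled}
    return sorted(wanted & present)


if __name__ == "__main__":
    reqs = ["pandas==2.0.3", "ISLP", "scikit_learn>=1.3"]  # illustrative lines only
    base = ["pandas", "scikit-learn", "seaborn"]  # subset of the list quoted above
    print(overlap(reqs, base))
```

Each name it reports is a package where the version installed from requirements.txt could differ from the one the base image was built with; switching to minimal-notebook shrinks that list.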
I believe it. It's basically an Ubuntu server. You can do tons.
Of course, linking to R introduces more dependencies.
|
You need to fix the version of the scipy-notebook image. I think I am using something like 4.0.4. For binder, there are ways to build the image directly on binder infrastructure and keep it in their cache. Not an expert though... I think your image might be a bit too big for binder. It takes ages to construct it from your requirements |
The virtual environment thing is not that easy. It exposes you to all sorts of OS dependency problems. |
Sigh. Binder is not something we "support". It's a service that people can try. It has limited resources, and has its way of managing them. And yes, a fresh build takes some time.
Docker images are cached on binder, and if you read the documentation, it indicates that repos that get a lot of traffic eventually have quicker startup times.
My comment was that making this docker image available is going to give users the same experience as launching binder, but it can be faster.
|
I think the order is wrong :-) You should build the image and binder should capture it :-) Binder is somewhat tricky about being pointed to docker images. |
I have updated the underlying image, see https://hub.docker.com/r/tschm/islp_labs/tags. The resulting image is now smaller but still close to 3 GB... let's check the files copied into the image |
I have tried to address the somewhat large size of the resulting images. However, it seems that's a direct consequence of installing the NVidia packages. I did an analysis with SLIM.ai and the constructed Python environment takes several GBs. I kept the Dockerfile somewhat standard and readable. When I build the image locally it tells me it has like 2.1 GB. Doing the roundtrip via Dockerhub the same image after a pull is now 6 GB? Weird... |
Change to manual dispatch, where images will get stored
You have the merge power. I am not sure you do yourself a favor with the manual release of the docker image. The pushed image will have no strong link to a tag then (if I understand the manual workflow correctly)... |
Manual dispatch works fine: Tried to get it to work on push to |
You need to control who can introduce tags. You need your own Dockerhub account...