Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added onbuild container for datascience #152

Closed
wants to merge 9 commits into from
Closed

Conversation

mwaaas
Copy link

@mwaaas mwaaas commented Mar 11, 2016

I have added a container that has "ONBUILD" commands that:

  • installs requirements file
  • adds code to container

@parente
Copy link
Member

parente commented Mar 11, 2016

Very cool thought. Thanks for this.

I wonder if we should put this in examples since it's not a stack we'd build and push to Docker Hub, but one a user would build him/herself. We can expand the readme to say that this really works for any of the stacks. The user just has to switch the FROM line.

What do you think?

@mwaaas
Copy link
Author

mwaaas commented Mar 11, 2016

I think its a good a idea to put it on examples, but I still think it would be good it its build and pushed in dockerhub.

For instance the way python-onbuild (https://hub.docker.com/_/python/) works, if you pull the onbuild image you don't have to write commands to copy your source code and install the requirements it would be done automatically on build

Likewise I would like to just import a juypter datascience obuild container and during build it automatically adds my source code and installs my requirements

What do you think?

@jakirkham
Copy link
Member

Looks cool. Like the idea. Probably could consider one using a conda environment file, as well.

Sorry for this tedious detail, but would you mind adding terminal newlines to the ends of files.

@@ -0,0 +1,8 @@
from jupyter/datascience-notebook
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be capitalized (i.e. FROM).

@jakirkham
Copy link
Member

If we are going to build and deploy this along with other images, it will need to be moved up a directory and renamed to something unique. That or we will need to make some tweaks to the Makefile. I am leaning towards the former.

@jakirkham
Copy link
Member

Might be worth including a trivial example of how this is used.

@mwaaas
Copy link
Author

mwaaas commented Mar 11, 2016

sure let me do that

@parente
Copy link
Member

parente commented Mar 11, 2016

Likewise I would like to just import a juypter datascience obuild container and during build it automatically adds my source code and installs my requirements

Ah, yes, I forgot how ONBUILD works. Thanks for correcting me. Disregard my "move to examples" comment.

@jakirkham it will need to be moved up a directory and renamed to something unique. That or we will need to make some tweaks to the Makefile

If this passes muster over time, we might end up having one of these per stack (datascience-notebook-onbuild, all-spark-notebook-onbuild, ...). Do you think each should be a top level folder? An alternative would be:

root/
  datascience-notebook/
    Dockerfile
    Dockerfile.onbuild
  ...

We could adapt the make process to test for the *.onbuild files (probably).

@@ -0,0 +1,7 @@
FROM jupyter/datascience-notebook
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the Jupyter copyright at the top of this file? You can copy/paste it from the other Dockerfiles.

@jakirkham
Copy link
Member

Thanks @mwaaas. Few minor comments.

@jakirkham
Copy link
Member

@parente, I like your idea of having them in the same directory with extensions. Should we do that with this one or do you have some other thoughts about how it should be placed?

Another thought occurs to me that having one with that uses conda and an environment.yml might be nice or this could be made to be dual purpose (preferring pip and then conda or the other way around). Do either of you have thoughts on this, @parente @mwaaas?

@parente
Copy link
Member

parente commented Mar 11, 2016

Considering we're pretty much based on conda, both sounds good if it's possible to write the command so that if one or the other does not exist, the build doesn't fail. (Last I tried, the ADD/COPY require the files to be there and error if they're not.)

Let's start with this one in the same folder with the naming if we're all agreed. For now, let's just do the datascience-notebook/Dockerfile.onbuild as submitted. Over time, if the pattern proves useful, we can contribute them to the other stacks in the same manner.

@jakirkham
Copy link
Member

Considering we're pretty much based on conda, both sounds good if it's possible to write the command so that if one or the other does not exist, the build doesn't fail. (Last I tried, the ADD/COPY require the files to be there and error if they're not.)

So, I was suggesting that as we are adding the source directory that we just do that. Then it doesn't matter which files exists in it as far as the ADD/COPY command is concerned. Only the RUN command will need some tweaking depending on which case we prefer and such.

Let's start with this one in the same folder with the naming if we're all agreed. For now, let's just do the datascience-notebook/Dockerfile.onbuild as submitted. Over time, if the pattern proves useful, we can contribute them to the other stacks in the same manner.

Sounds good.

Another thought. Currently, we are installing this into the Python 3 environment. Should there be some mechanism for selecting Python 2 or both?

@mwaaas
Copy link
Author

mwaaas commented Mar 12, 2016

@parente @jakirkham
How does it look now ?


ONBUILD COPY . /home/jovyan/work

ONBUILD RUN [ -e requirements.txt ] && pip install --no-cache-dir -r requirements.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should put in parentheses and add an || true afterward to ensure the exit code is always 0. So it would be...

( [ -e requirements.txt ] && pip install --no-cache-dir -r requirements.txt ) || true

@jakirkham
Copy link
Member

@parente, any thoughts on the Python 2/3 issue?


ONBUILD RUN [ -e requirements.txt ] && pip install --no-cache-dir -r requirements.txt

ONBUILD RUN [ -e environment.yml ] && conda install --file environment.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a development install line somewhere (e.g. pip install -e /home/jovyan/work)?

@parente
Copy link
Member

parente commented Mar 14, 2016

any thoughts on the Python 2/3 issue?

We also have to think about R which is in the datascience-notebook image and managed by conda. We don't necessarily have to implement support for installing to the R environment yet but we do have to leave room in our naming conventions for it later.

How about:

  • requirements.txt and environment.yml install to both the python2 and 3 environments
  • requirements-py2.txt and environment-py2.yml installs to the python2 environment only
  • requirements-py3.txt and environment-py3.yml installs to the python3 environment only
  • environment-r.yml installs to the R conda environment

with all this doc'ed in the README.

@mwaaas
Copy link
Author

mwaaas commented Mar 16, 2016

@parente I agree with the naming convention to leave space for all environments

quick question if I want to install to python2 do I need to activate the environment

@parente
Copy link
Member

parente commented Mar 17, 2016

quick question if I want to install to python2 do I need to activate the environment

Yes. I think there's an example in the Dockerfile of this already. Something to the effect of:

RUN bash -c ". activate python2 && whatever"

@Carreau
Copy link
Member

Carreau commented Apr 8, 2016

what's the status of this PR ?

@parente
Copy link
Member

parente commented Apr 9, 2016

Hi @mwaaas. Do you still plan to implement the support for the various requirement.txt files? If not, would you mind if we open a new PR starting from your branch with the extra commits implementing it and documenting it in the README? You'll still get credit for the work you did because your commits will be retained.

@mwaaas
Copy link
Author

mwaaas commented Apr 9, 2016

@parente Hi sorry I had taken time to reply, will work on it this weekend

@parente
Copy link
Member

parente commented Apr 9, 2016

@mwaaas No rush or problem at all! Take your time.

@parente
Copy link
Member

parente commented Jun 10, 2017

Hi @mwaaas. This PR has been open for a long time. Sorry about that. Are you still interested in getting this merged?

I think the most common pattern people are using is to define a Dockerfile that installs whatever they need starting FROM one of the images here. If that suits your needs, I think it's better if we close this out rather than having to maintain and document more options for building custom images.

@parente parente closed this Jun 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants