Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dockerfile #83

Merged
merged 8 commits into from
Nov 8, 2016
Merged

Add dockerfile #83

merged 8 commits into from
Nov 8, 2016

Conversation

dsakuma
Copy link
Contributor

@dsakuma dsakuma commented Oct 11, 2016

This PR adds a Dockerfile to the project and update the documentation explaining how to use it.

The docker-compose was already created in version 2 syntax (as suggested by @gwmoura)

And I also fixed some typos in documentation.

p.s.: There is a problem when reading csv in the current notebooks. But I think it is an issue for another pull request. I'm working on that.

@dsakuma dsakuma mentioned this pull request Oct 11, 2016
@cuducos
Copy link
Collaborator

cuducos commented Oct 11, 2016

Many thanks for that, Daniel.

I'd suggest minor changes before merging: as far as I could check, your PR don't cover all the steps from the setup script. The config.ini part seems to be missing. Maybe adding COPY config.ini.example and RUN cp config.ini.example config.ini to the Dockerfile is enough.

And what is the error you refereed to when reading the CSV, can you share it?


To achieve this goal, unprecedented, we invite everyone to train the intelligence, collect information, cross databases, validate hyphotheses and apply Machine Learning with models competing against each other and getting combined in ensembles with higher precision than any previous option.
To achieve this goal, unprecedented, we invite everyone to train the intelligence, collect information, cross databases, validate hypotheses and apply Machine Learning with models competing against each other and getting combined in ensembles with higher precision than any previous option.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix typo of a typo fix: hypothesis (not hypotheses) hahaha… (and the same for line 11).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xD

@dsakuma
Copy link
Contributor Author

dsakuma commented Oct 13, 2016

@cuducos I've added in the documentation the instruction to copy the config.ini. I did it because if we add the COPY command in Dockerfile, we will need to rebuild the image every time we change the config.ini. What do you think?

About the problem when reading the csv file, that is the error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfd in position 0: invalid start byte. I will try to check it tomorrow

And thanks for fixing my typo... hahaha

@cuducos
Copy link
Collaborator

cuducos commented Oct 16, 2016

Many thanks again, @dsakuma — copying the config file seems good.

About the UnicodeDecodeError it's is a typical error of trying to run a Python 3 script in a Python 2 environment (surely that's not the only cause, but just my guess). Can you check whether we have Python 3 or 2 running the script in the container?

@gwmoura
Copy link

gwmoura commented Oct 16, 2016

@cuducos @dsakuma jupyter/datascience-notebook:latest image is based on image debian@sha256:32a225e412babcd54c0ea777846183c61003d125278882873fb2bc97f9057c51 - https://github.com/jupyter/docker-stacks/blob/master/base-notebook/Dockerfile

probabily the python version is 2.7. @cuducos the project uses Python 3.5 or 2.7?

@gwmoura
Copy link

gwmoura commented Oct 16, 2016

guys, jupyter/datascience-notebook:latest has env for python 3 and python 2. @dsakuma try run your tests calling python3, the command python, by default executes python2

@cuducos
Copy link
Collaborator

cuducos commented Oct 16, 2016

@gwmoura the project is based on Pyhon 3, maybe that's not that clear, but this is what the python-3 in the conda create stands for.

@dsakuma using python3 is an option, as @gwmoura suggested, but IMHO using conda create and activate (as in Environment) before running any script is the proper way to do it — this strategy will ensure we're always using Anaconda's Python.

@dsakuma dsakuma force-pushed the add_dockerfile branch 2 times, most recently from 8d0cc63 to 6e9d606 Compare October 22, 2016 11:09
@cuducos
Copy link
Collaborator

cuducos commented Oct 23, 2016

Well… I'm not with my laptop this weekend so I can't help but ask: has activating the environment fixed the UnicodeDecodeError?

@dsakuma
Copy link
Contributor Author

dsakuma commented Oct 24, 2016

@cuducos, @gwmoura, sorry I was very busy this week and I could not work on this PR before...

But I finally figure out what was going on with the UnicodeDecodeError. It was because 'xz' compression being used in the datasets, that pandas was not able to read becuase it was outdated. The PR #88 added the pandas version requirement and fixed my problem.

I've added RUN conda update --yes conda to update to the last minor version of python 3.4. It was not necessary to create a serenata_de_amor environment. This container uses by default the python 3. So we can have a smaller Dockerfile :)

We can check the python version in bash by running:
docker-compose run --rm jupyter python --version

And in the notebook by running:
import sys
sys.version

Now I was able to run the entire notebook (translate and descriptive analysis notebooks). What do you think?

@cuducos
Copy link
Collaborator

cuducos commented Oct 24, 2016

Yay! That works

$  docker-compose run --rm jupyter python -V
Python 3.5.2 :: Continuum Analytics, Inc. 🎉 

(Ok, I confess, I added the emoji myself…)

The only thing I think we whould fix before merging is the instructions. It points users to localhost:8888 but in my case I had to figure out the IP the container received to be able to access the Jupyter Notebook server in my browser.

Any cleaver way to do it? I just checked docker-machine env as I'm in macOS, but we could try to find something that would work for all platforms. Jupyter Notebook output is not helpful here:

...
[I 20:50:35.764 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8888/
[I 20:50:35.765 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

@gwmoura
Copy link

gwmoura commented Oct 25, 2016

@cuducos, I think in this case the problem can be resolved configuring the Docker for mac ou for windows, Docker Compose or Docker Engine can not resolve this. Because Docker containers runs in virtual machines on Windows and Mac.

@cuducos
Copy link
Collaborator

cuducos commented Oct 25, 2016

I think in this case the problem can be resolved configuring the Docker for mac ou for windows

I know, but that's not the point, @gwmoura — the priority is offering beginners an easy way to get started (even if we have to offer instructions per platform) and I don't feel good in just saying get you Docker configured properly, n00b hahaha…

Is docker inspect <container> of any help in giving us the IP (not on my computer to test right now, sorry)?

@gwmoura
Copy link

gwmoura commented Oct 25, 2016

@cuducos I sorry, understand now! 😄
I gonna find instructions for a easy way to get started.

docker inspect can help find a IP ($ docker inspect [container_name] | grep IPAddress) but not is a better way, because the returned IP is a internal IP from VM (if running on Windows or Mac), the better option is run $ docker-machine ip [machine alias]

@felipeclopes
Copy link

felipeclopes commented Oct 25, 2016

@cuducos and @gwmoura using "native" Docker in Mac I was able to access the notebooks using http://127.0.0.1:8888 similar to the instructions. I'm not sure why I don't have an alias to localhost in my Mac, is there a chance it was your problem too @cuducos?

The result of a docker ps in my computer is the following:

CONTAINER ID        IMAGE                                                                                            COMMAND                  CREATED             STATUS              PORTS                                      NAMES
88cad32a0e76        serenatadeamor_jupyter                                                                           "tini -- start-notebo"   10 hours ago        Up 10 hours         0.0.0.0:8888->8888/tcp                     serenatadeamor_jupyter_1

Maybe you could send the information from yours @cuducos, in order for us to take a look.

@dsakuma
Copy link
Contributor Author

dsakuma commented Oct 26, 2016

Good news, @felipeclopes! But I'm curious why localhost does not work in your Mac...

I can change the instructions to point to 127.0.0.1:8888. It will probably work in all platforms. I will test in other platforms tomorrow.

Is this the best solution?

@dsakuma
Copy link
Contributor Author

dsakuma commented Oct 26, 2016

I was testing, and I think it will be necessary create 2 instructions, one for docker native and another for docker toolbox. I will write something today.

@cuducos
Copy link
Collaborator

cuducos commented Oct 26, 2016

I think it will be necessary create 2 instructions, one for docker native and another for docker toolbox

I see no problem in having instructions for Linux and for macOS/Windows users separately.

Unfortunately neither localhost:8888 nor 127.0.0.1:8888 works here. So far just the internal IP works (now 192.168.99.101:8888).

@cuducos
Copy link
Collaborator

cuducos commented Oct 26, 2016

BTW if anyone wants to explore this issue using my computer (the problematic one) we can pair on that. Maybe that's useful as I'm a total n00b in Docker.

@felipeclopes
Copy link

@dsakuma and @cuducos I think the run instructions should apply only to the environment which the instructions were provided by the installation guide, in this case the installation guide is not specific if you should install Toolbox or Native in Mac.

My opinion is that we should focus only in the Native solution, since it is the newest and the long-term product for Docker on Mac.

@cuducos do you have the native already or are you using toolbox, with docker-machine and other tools?

@cuducos
Copy link
Collaborator

cuducos commented Oct 26, 2016

@felipeclopes many thanks for this message. As I said, I'm a n00b in Docker. And being a n00b I wasn't aware that there is a native solution running on macOS.

I keep Docker updated through brew, but I guess they are not updated. Does this help?

~/serenata-de-amor(branch:add_dockerfile) » docker --version
Docker version 1.12.2, build bb80604
~/serenata-de-amor(branch: add_dockerfile) » docker-compose --version
docker-compose version 1.8.1, build 878cff1
~/serenata-de-amor(branch: add_dockerfile) » docker-machine --version
docker-machine version 0.8.2, build e18a919

Anyway, if there's a cross-platform (Linux/macOS/Win) native solution, and if this native solution makes the Jupyter server accessible through localhost:8888 I think we're done here — good to merge. Can you all confirm if this is the case?

If it's OK for @dsakuma I'll edit the instructions after merging because for me it's is quite awkward the fact that Docker instructions ended up more verbose than the regular ones. I guess this there's nothing wrong in the instructions themselves (so editing after merging), but the thing is I feel we're being super careful with Docker instructions (explaining each step) and super careless with regular instructions (run this, then that, then this other thing here, no question asked). So I might put my dirty hands on instructions as a whole to find a better balance.

@dsakuma
Copy link
Contributor Author

dsakuma commented Oct 26, 2016

Yes, @cuducos! I'm totally ok with that. I wrote each step too detailed... It will be great a more concise instruction!

@cuducos cuducos merged commit ee6f9f9 into okfn-brasil:master Nov 8, 2016
@cuducos
Copy link
Collaborator

cuducos commented Nov 8, 2016

Many thanks @dsakuma @gwmoura @felipeclopes It's official now! (Fix #80)

Irio added a commit that referenced this pull request Feb 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants