Add Kaggle notebook Dockerfile #1109

Merged
merged 1 commit into from
Jul 15, 2018

@pdmack pdmack commented Jun 30, 2018

Related to #1057 - pytorch notebook image
Related to #258 - Kaggle image
PTAL
/cc @jlewi
/cc @ankushagarwal
/cc @johnugeorge

Launched for me but obviously not exhaustively tested. Created a contrib dir to give it that YMMV vibe.

Notes (IMHO):

  • py3 only
  • we should just manually build/push to gcr.io once per release, no other support
  • definitely should not be in any argo or release workflows
  • waaaay too big an image; there may be some slimming that could be done with my mods, but almost certainly it would remain north of 20 GB

pdmack commented Jun 30, 2018

/retest

jlewi commented Jul 2, 2018

This is great.

Do you have a prebuilt image that I can push to our public registry?

You should be able to push to gcr.io/kubeflow-dev

/lgtm
/approve

pdmack commented Jul 2, 2018

@jlewi I'll try to push something later today.

pdmack commented Jul 2, 2018

It's up there at kubeflow-dev but getting a SIGABRT running MNIST hello world in d.k.o. Stay tuned.

pdmack commented Jul 2, 2018

Heh, guess the only place I can test this is on my i7-4870HQ:

>>> import tensorflow as tf
2018-07-02 19:36:50.944006: F tensorflow/core/platform/cpu_feature_guard.cc:36] The TensorFlow library was compiled to use AVX2 instructions, but these aren't available on your machine.
Aborted (core dumped)
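
The abort above is TensorFlow's CPU feature guard rejecting a host without AVX2. A minimal sketch of a pre-flight check that could run before `import tensorflow`, assuming a Linux x86 host that lists CPU features on a `flags` line in /proc/cpuinfo. The helper names `cpu_flags` and `supports_avx2` are hypothetical and not part of TensorFlow or this image:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supports_avx2(cpuinfo_path="/proc/cpuinfo"):
    """Best-effort AVX2 check; returns None when the file cannot be read."""
    try:
        with open(cpuinfo_path) as f:
            return "avx2" in cpu_flags(f.read())
    except OSError:
        # Assumption: non-Linux hosts have no /proc/cpuinfo, so report "unknown".
        return None


if __name__ == "__main__":
    print("AVX2 supported:", supports_avx2())
```

A result of `False` would suggest this image's TensorFlow build will abort on that node; `None` only means the check could not be performed.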

pdmack commented Jul 3, 2018

/retest

@jlewi, MNIST hello world works on my 2015 MBP (AVX2). Weighs in at a svelte 27 GB!

gcr.io/kubeflow-dev/kubeflow-kaggle-notebook:latest

pdmack commented Jul 3, 2018

/test all

jlewi commented Jul 5, 2018

GPU test failures look unrelated.
#1126

jlewi commented Jul 5, 2018

/test all

pdmack commented Jul 5, 2018

/test all

jlewi commented Jul 5, 2018

#1128 for the latest test flake.

/test all

jlewi commented Jul 5, 2018

Latest failure looks like a timeout waiting for the minikube workflow, but that ultimately ran successfully.

/test all

pdmack commented Jul 5, 2018

This is fun

jlewi commented Jul 6, 2018

Another timeout but looks like workflows succeeded. Let me bump the timeout.

jlewi added a commit to jlewi/testing that referenced this pull request Jul 6, 2018
* In kubeflow/kubeflow#1109 we are seeing lots of timeouts waiting for the
  test workflows to complete.

* Bump the timeout to 1 hour.

pdmack commented Jul 6, 2018

minikube is the bane of our e2e existence

k8s-ci-robot pushed a commit to kubeflow/testing that referenced this pull request Jul 6, 2018
* In kubeflow/kubeflow#1109 we are seeing lots of timeouts waiting for the
  test workflows to complete.

* Bump the timeout to 1 hour.

pdmack commented Jul 11, 2018

@jlewi I think we now have an MVP at gcr.io/kubeflow-dev/kubeflow-kaggle-notebook:latest

PTAL and cancel hold if you feel it's ready

Name: tensorflow
Version: 1.9.0rc0

Built for AVX2

pdmack commented Jul 11, 2018

/retest

@pdmack pdmack force-pushed the kaggle-nb-image branch 2 times, most recently from 9e9e4af to 75bba1c Compare July 12, 2018 01:20

jlewi commented Jul 12, 2018

Retagged it gcr.io/kubeflow-images-public/kaggle-notebook:v20180711
Testing it now.

pdmack commented Jul 12, 2018

Be gentle

jlewi commented Jul 12, 2018

When I tried gcr.io/kubeflow-images-public/kaggle-notebook:v20180711 I got the error
gcr.io/kubeflow-images-public/kaggle-notebook@sha256:30061303799b3f7927e3b6b684b228360d7a8370e2c797972519be407c4d1a75

checking if /home/jovyan volume needs init...
...creating /home/jovyan/work
...load initial content into /home/jovyan...
...done
Execute the command
/usr/local/bin/start.sh: line 62: exec: jupyterhub-singleuser: not found
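
The log above shows start.sh failing because `jupyterhub-singleuser` is not on the PATH inside the image. A minimal sketch of a smoke test that could be run in the container before publishing, using only the standard library. The helper name `missing_commands` is hypothetical, and the list of required commands is an assumption drawn from this one error:

```python
import shutil


def missing_commands(commands):
    """Return the commands from `commands` that are not found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    # jupyterhub-singleuser is the command start.sh failed to exec in the log above.
    missing = missing_commands(["jupyterhub-singleuser"])
    if missing:
        print("missing from PATH:", ", ".join(missing))
    else:
        print("all required commands found")
```

Running this inside the built image would flag the missing entrypoint before a notebook spawn fails.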

pdmack commented Jul 12, 2018

I took out too much in that last cut. Stay tuned.

pdmack commented Jul 13, 2018

Update: Just trying to wrangle the addition of the jupyterhub package. It is surprisingly more disruptive than expected.

pdmack commented Jul 13, 2018

@jlewi please retry gcr.io/kubeflow-dev/kubeflow-kaggle-notebook:latest
feel a little more confident in this one
but adding jupyterhub 0.8.1 triggers some incompats

tensorflow 1.9.0rc0 has requirement setuptools<=39.1.0, but you'll have setuptools 40.0.0 which is incompatible.
kmodes 0.9 has requirement scikit-learn<0.20.0,>=0.19.0, but you'll have scikit-learn 0.20.dev0 which is incompatible.
kmeans-smote 0.1.0 has requirement imbalanced-learn<0.4,>=0.3.1, but you'll have imbalanced-learn 0.4.0.dev0 which is incompatible.
kmeans-smote 0.1.0 has requirement scikit-learn<0.20,>=0.19.0, but you'll have scikit-learn 0.20.dev0 which is incompatible.
hep-ml 0.5.0 has requirement theano==0.8.2, but you'll have theano 1.0.2 which is incompatible.
fastai 0.7.0 has requirement torch<0.4, but you'll have torch 0.4.0 which is incompatible.

Actually I'm assuming that, but maybe this happens in the original also.
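
One way to test that assumption would be to capture the conflict lines from both the original Kaggle image and this one and compare them. A minimal sketch that parses warnings in the format quoted above into structured records. The regex is inferred from these six lines only and may not match other pip versions, and `parse_conflicts` is a hypothetical helper name:

```python
import re

# Pattern inferred from the warning lines quoted above (an assumption, not a pip API).
CONFLICT_RE = re.compile(
    r"^(?P<package>\S+) (?P<version>\S+) has requirement (?P<requirement>.+?), "
    r"but you'll have (?P<installed_name>\S+) (?P<installed_version>\S+) "
    r"which is incompatible\.$"
)


def parse_conflicts(text):
    """Return one dict per dependency-conflict line found in `text`."""
    conflicts = []
    for line in text.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


if __name__ == "__main__":
    sample = (
        "fastai 0.7.0 has requirement torch<0.4, "
        "but you'll have torch 0.4.0 which is incompatible."
    )
    for conflict in parse_conflicts(sample):
        print(conflict["package"], "needs", conflict["requirement"])
```

Diffing the parsed records from the two images would show which conflicts, if any, were introduced by adding jupyterhub.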

jlewi commented Jul 13, 2018 via email

jlewi commented Jul 15, 2018

Pushed to:
gcr.io/kubeflow-images-public/kaggle-notebook:v20180713

Trying it now.

jlewi commented Jul 15, 2018

Was able to successfully launch a notebook after changing the home directory:

ks param set kubeflow-core jupyterNotebookPVCMount /home/jovyan/work

I loaded up this notebook. The imports worked. I was missing the data for the notebook so it didn't work.

/lgtm
/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jlewi commented Jul 15, 2018

Thank you so much @pdmack for all the hard work making this work.

pdmack commented Jul 15, 2018

PTL

pdmack commented Jul 15, 2018

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit 3b832cb into kubeflow:master Jul 15, 2018
@pdmack pdmack deleted the kaggle-nb-image branch July 30, 2018 15:43
pdmack referenced this pull request in Kaggle/docker-python Aug 28, 2018
This enables use cases such as this one: kubeflow/website#180
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
yanniszark pushed a commit to arrikto/kubeflow that referenced this pull request Feb 15, 2021
* Modify prow config

* Add examples to prow
surajkota pushed a commit to surajkota/kubeflow that referenced this pull request Jun 13, 2022