[discussion] How can we play well with Kaggle? #258
Comments
It already seems to install Jupyter, so it would need to install the
`jupyterhub` python package as well, and that *should* work.
However, it seems to be putting config in $HOME, which often has a PVC
mounted over it, so things might be weird if that happens.
On Fri, Feb 16, 2018 at 1:37 PM, Jeremy Lewi wrote:
Kaggle is an amazing resource.
Can we make it easy for people to use datasets and code they find on
Kaggle?
The docker container for Kaggle kernels is publicly available
https://hub.docker.com/r/kaggle/python
https://github.com/kaggle/docker-python
and provides a vast array of libraries.
What would it take to turn that into an image we could launch via
JupyterHub?
/cc @yuvipanda @aronchick
--
Yuvi Panda T
http://yuvi.in/blog
|
If it's helpful we could add JupyterHub to our Docker builds |
Looks like the DockerHub image is very outdated |
@pdmack created a Docker image in #258, which I retagged as gcr.io/kubeflow-images-public/kaggle-notebook:v20180629. I retagged it using Google Container Builder; trying to use gcloud container add tag choked. I was able to launch a Jupyter image successfully using JupyterHub on Kubeflow. I randomly tried this notebook. It failed when trying to import matplotlib, with matplotlib not found. I wonder if this is an issue with the home directory / conda install in the Kaggle image. It looks like Kaggle does a pip install of matplotlib here. I guess the next step would be to check whether matplotlib is present in the Kaggle image we are using as a base image. /cc @pdmack |
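For the "is matplotlib actually present in the image" check, a minimal sketch that could be run inside the container (for example from a notebook cell) is shown here. It only uses the standard library; the helper name `find_modules` and the list of modules probed are illustrative assumptions, not something taken from the Kaggle image.

```python
import importlib.util


def find_modules(names):
    """Map each top-level module name to where it would be imported from.

    The value is None when the module cannot be found on sys.path.
    Namespace packages have no single origin file, so they are reported
    with a placeholder string instead.
    """
    locations = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        if spec is None:
            locations[name] = None
        else:
            locations[name] = spec.origin or "<namespace package>"
    return locations


if __name__ == "__main__":
    # Hypothetical list of libraries to probe; adjust to the notebook's imports.
    for name, origin in find_modules(["matplotlib", "numpy", "pandas"]).items():
        print(f"{name}: {origin or 'NOT FOUND'}")
```

If a module is reported under the home directory rather than the conda prefix, that would be consistent with the suspicion that a volume mounted over the home directory can hide it.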
Looks like you were yellow carded |
I published @pdmack's latest image to gcr.io/kubeflow-images-public/kaggle-notebook:v20180713. I did a smoke test loading this notebook. The imports worked, but I didn't have the data, so I wasn't able to run the notebook. I think this is ready for people to try. Note: for this to work, people need to change the default for the PVC mount, e.g.
because I think the Kaggle image is installing some things in the home directory that would be overwritten. But we might want to verify that. |
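To make the concern concrete: a volume mounted at a directory hides whatever the image placed at or below that directory. The sketch below is pure path logic, assuming a hypothetical list of install locations; none of these paths are confirmed contents of the Kaggle image, and it does not settle which mount point actually works.

```python
from pathlib import PurePosixPath


def shadowed_by_mount(mount_point, installed_paths):
    """Return the paths that sit at or below mount_point.

    Files baked into the image at these paths would be hidden once a
    volume is mounted at mount_point.
    """
    mount = PurePosixPath(mount_point)
    hidden = []
    for raw in installed_paths:
        path = PurePosixPath(raw)
        if path == mount or mount in path.parents:
            hidden.append(raw)
    return hidden


if __name__ == "__main__":
    # Hypothetical install locations, for illustration only.
    image_paths = [
        "/home/jovyan/.local/lib/python3.6/site-packages/matplotlib",
        "/home/jovyan/.jupyter/jupyter_notebook_config.py",
        "/opt/conda/lib/python3.6/site-packages/numpy",
    ]
    for mount in ("/home/jovyan", "/home/jovyan/work"):
        print(mount, "hides", shadowed_by_mount(mount, image_paths))
```

Feeding it the real file list from the image (for example, the output of a `find` under the home directory) would show whether a given mount point hides anything the notebooks depend on.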
Hmmm, not sure I'm following but maybe I'll inquire on the Tuesday call |
Ah, never mind |
I thought you'd considered that and said it was infeasible because it would lead to extra layers in Docker that double the image size? |
No, it's changing ownership that creates the monster layer. But saving a few files from |
@jlewi just poking this again. How did you discover that the PVC needed to be at |
I may have drawn an incorrect conclusion from a previous experiment. When I used
With PV mounted at /home/jovyan When I tried it with PV mounted at /home/jovyan/work I got the error
It's worth trying again with the latest image that we know works and using a PV mounted at /home/jovyan. I would be very happy to be proven wrong and will apologize in advance for the confusion. |
Yeah, I think that was one of my misfires in those Dockerfile iterations |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |