JupyterHub fails to load image properly, but starts a notebook anyway #226
When you say no libraries are installed you mean python libraries? |
@bkungfoo ping? Any more info? |
Jupyter starts running, but tf-gpu is not properly installed. Here is what I get when I create a notebook and run "import tensorflow as tf":

ImportError Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in ()
/opt/conda/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
/opt/conda/lib/python3.6/imp.py in load_module(name, file, filename, details)
/opt/conda/lib/python3.6/imp.py in load_dynamic(name, path, file)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

ImportError Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/tensorflow/__init__.py in ()
/opt/conda/lib/python3.6/site-packages/tensorflow/python/__init__.py in ()
/opt/conda/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in ()
ImportError: Traceback (most recent call last):
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/install_sources#common_installation_problems for some common reasons and solutions. Include the entire stack trace |
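The traceback above comes down to the dynamic loader not finding libcuda.so.1 inside the container. A quick way to check that directly from a notebook cell, without importing TensorFlow, is sketched below. The helper name `cuda_driver_available` is made up for illustration; it only tests whether the library can be opened, not whether a GPU is actually usable.

```python
import ctypes


def cuda_driver_available(libname: str = "libcuda.so.1") -> bool:
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for illustration: it checks loadability only,
    which is the step that fails in the traceback above.
    """
    try:
        ctypes.CDLL(libname)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    print("libcuda.so.1 loadable:", cuda_driver_available())
```

If this prints False, the NVIDIA driver libraries are probably not visible in the container, which points at the node or pod configuration rather than at the notebook image.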
This usually means GPUs aren't properly configured. I'm assuming you are running on GKE?
|
It's an interesting question: is there something we can do in K6w to
enforce running something when you run on cloud <X>? E.g., if we detect you're
running on GCP, you need to install/run x,y,z; if we detect you're running
on Azure, you need to install/run t,u,v; if we can't detect, we run
nothing.
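A rough sketch of that dispatch idea follows, assuming the cloud provider has already been detected somehow (detection itself is left out). The table `SETUP_STEPS`, the function `setup_steps_for`, and the step names x, y, z, t, u, v are placeholders taken from the comment, not real Kubeflow components.

```python
from typing import Dict, List, Optional

# Placeholder step names from the comment above; hypothetical, not real installers.
SETUP_STEPS: Dict[str, List[str]] = {
    "gcp": ["x", "y", "z"],
    "azure": ["t", "u", "v"],
}


def setup_steps_for(provider: Optional[str]) -> List[str]:
    """Return the setup steps for a detected provider, or nothing if unknown."""
    if provider is None:
        # Detection failed: run nothing.
        return []
    # Unknown providers also fall through to an empty list.
    return list(SETUP_STEPS.get(provider.lower(), []))


if __name__ == "__main__":
    for name in ("gcp", "azure", None):
        print(name, "->", setup_steps_for(name))
```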
…On Mon, Feb 12, 2018 at 3:04 PM Jeremy Lewi ***@***.***> wrote:
This usually means GPUs aren't properly configured.
I'm assuming you are running on GKE?
1. Did you follow the GKE instructions
<https://cloud.google.com/kubernetes-engine/docs/concepts/gpus#installing_drivers>
to install the NVIDIA drivers via daemonset?
2. When you spawned the Jupyter server via JupyterHub did you specify
GPUs in the resource requirements?
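For the second point, a container only gets a GPU if its resource limits ask for one. The sketch below builds such a resources fragment as a plain dict. It assumes the `nvidia.com/gpu` resource name used by the NVIDIA device plugin; older clusters may expose a different name, and the helper `gpu_resources` is hypothetical.

```python
import json


def gpu_resources(gpu_count: int = 1, resource_name: str = "nvidia.com/gpu") -> dict:
    """Build a container `resources` fragment that requests GPUs.

    Assumption: the cluster exposes GPUs under `resource_name`.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return {"limits": {resource_name: gpu_count}}


if __name__ == "__main__":
    # Print the fragment as it would appear under a container spec.
    print(json.dumps({"resources": gpu_resources(1)}, indent=2))
```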
|
This problem is likely due to not following the instructions here to deploy an nvidia driver daemon on the GKE cluster. Closing the issue. |
I've encountered this several times during deployment, both on minikube and GKE. When starting JupyterHub, spawning a server with a valid image (e.g. gcr.io/kubeflow/tensorflow-notebook-gpu:8fbc341245695e482848ac3c2034a99f7c1e5763) sometimes creates a container without any libraries installed.
kubectl logs tf-hub-0 -n $NAMESPACE shows the following error:
[W 2018-02-08 23:51:44.573 JupyterHub configurable:168] Config option singleuser_image_spec not recognized by KubeFormSpawner. Did you mean one of: singleuser_image_pull_policy, singleuser_image_pull_secrets, singleuser_node_selector?