Update README to reflect install experience #617
user_guide.md
Outdated
@@ -9,7 +9,7 @@ This guide will walk you through the basics of deploying and interacting with Ku
## Requirements
* Kubernetes >= 1.8 [see here](https://github.com/kubeflow/tf-operator#requirements)
* ksonnet version [0.9.2](https://ksonnet.io/#get-started) or later. (See [below](#why-kubeflow-uses-ksonnet) for an explanation of why we use ksonnet)
* An existing 2-node (at least) kubernate cluster. Nodes need to have storage >= 20 GB due to the ML libraries and third party packages being bundled in Kubeflow Docker images
Why do you need 2 nodes? I've tested Kubeflow on a 1-node minikube multiple times with success. Granted, you need a lot of disk storage, mostly because of the Jupyter notebook image.
@inc0 after looking at the logs, my wording should be specific to CPUs rather than nodes. I used an AWS t2.medium node to go through the Kubeflow user guide.
If the node had had more than 2 CPUs, spawning would have been fine. t2.medium is documented as having 2 vCPUs, but that just wasn't enough to spawn another pod.
Please let me know whether I should update the wording or can safely close the PR, assuming these are common prerequisites in the Kubernetes/Jupyter world.
Name:           jupyter-mimi
Namespace:      kubeflow
Node:           <none>
Labels:         app=jupyterhub
                component=singleuser-server
                heritage=jupyterhub
                hub.jupyter.org/username=mimi
Annotations:    <none>
Status:         Pending
IP:
Containers:
  notebook:
    Image:  gcr.io/kubeflow-images-staging/tensorflow-1.7.0-notebook-gpu:v20180403-1f854c44
    Port:   8888/TCP
    Args:
      start-singleuser.sh
      --ip="0.0.0.0"
      --port=8888
      --allow-root
    Requests:
      cpu:     1
      memory:  1G
    Environment:
      JUPYTERHUB_API_TOKEN:           X
      JPY_API_TOKEN:                  X
      JUPYTERHUB_CLIENT_ID:           user-mimi
      JUPYTERHUB_HOST:
      JUPYTERHUB_OAUTH_CALLBACK_URL:  /user/mimi/oauth_callback
      JUPYTERHUB_USER:                mimi
      JUPYTERHUB_API_URL:             http://tf-hub-0:8081/hub/api
      JUPYTERHUB_BASE_URL:            /
      JUPYTERHUB_SERVICE_PREFIX:      /user/mimi/
      MEM_GUARANTEE:                  1G
      CPU_GUARANTEE:                  1.0
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from no-api-access-please (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  no-api-access-please:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  1m (x37 over 11m)  default-scheduler  No nodes are available that match all of the predicates: Insufficient cpu (2), PodToleratesNodeTaints (1).
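To illustrate why a 2-vCPU node can still report `Insufficient cpu` for a pod that requests `cpu: 1`, here is a minimal sketch of the arithmetic involved. The scheduler compares the new request against what is left after the requests of pods already on the node, not against the raw CPU count. The helper names and the example numbers below are hypothetical and are not taken from the cluster in the log above.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "1", "0.5" or "250m" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def fits(request: str, allocatable: str, already_requested: str) -> bool:
    """Return True if a new CPU request fits in what is left on the node."""
    free = cpu_to_millicores(allocatable) - cpu_to_millicores(already_requested)
    return cpu_to_millicores(request) <= free


if __name__ == "__main__":
    # Hypothetical: other pods on the node already request 1200m in total.
    print(fits("1", "2", "1200m"))  # 2 allocatable CPUs: only 800m left
    print(fits("1", "4", "1200m"))  # 4 allocatable CPUs: 2800m left
```

With these assumed numbers the first check fails and the second succeeds, which matches the observation that a node with more than 2 CPUs would have been able to spawn the notebook pod.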
It would be good to specify minimum requirements for the number of CPUs.
@ykevinc Can you sync and make the requested changes please?
6de576e to 8983912
Thank you. Sorry the review process was so lengthy. /lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here
/ok-to-test
Hi,
I learned about Kubeflow recently and gave it a run, and found that the `kubectl port-forward tf-hub-0` process runs into issues. This updates the user guide to work around them. I also want to raise them here in case they can be solved without updating the user guide (for example, by pre-checking the cluster size before deploying pods, or by making the Kubeflow images smaller).
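As a rough idea of what a cluster-size pre-check could look like, the sketch below filters a list of nodes by free CPU and storage before anything is deployed. The `Node` type, the `schedulable_nodes` helper, and the sample nodes are all hypothetical. The default thresholds are assumptions taken from this thread (the 1-CPU notebook request and the 20 GB storage note), and a real check would read these values from the cluster instead of hard-coding them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical summary of one node's spare capacity."""
    name: str
    free_cpu_millicores: int
    free_storage_gb: float


def schedulable_nodes(
    nodes: List[Node],
    cpu_millicores: int = 1000,   # assumed: the notebook pod requests 1 CPU
    storage_gb: float = 20.0,     # assumed: the 20 GB figure from the guide
) -> List[str]:
    """Return the names of nodes that satisfy both thresholds."""
    return [
        n.name
        for n in nodes
        if n.free_cpu_millicores >= cpu_millicores
        and n.free_storage_gb >= storage_gb
    ]


if __name__ == "__main__":
    cluster = [
        Node("small", free_cpu_millicores=800, free_storage_gb=30.0),
        Node("large", free_cpu_millicores=2800, free_storage_gb=50.0),
    ]
    ok = schedulable_nodes(cluster)
    if not ok:
        print("No node has enough spare CPU and storage for a notebook pod.")
    else:
        print("Nodes that can host a notebook pod:", ok)
```

An empty result would be the signal to warn the user up front, rather than leaving the notebook pod stuck in `Pending`.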