"getting-started-gke" installer fails #1201

Closed
mmatiaschek opened this issue Jul 13, 2018 · 8 comments · Fixed by kubeflow/website#92

Comments

@mmatiaschek

I hit these issues on Google Cloud Shell, so I tried a fresh install on Bash on Ubuntu on Windows (Ubuntu 16.04) and got the same problems.

  1. Install and initialize gcloud
  2. Install ksonnet v0.11.0 from GitHub
  3. Install Kubeflow

The first issue with the Kubeflow documentation/installer was that the tmp folder was not found. My workaround was to run mv /tmp/kubeflow-0.2.1/ /tmp/kubeflow-v0.2.1 and start again.

+ mv /tmp/kubeflow-v0.2.1 /mnt/c/Users/mmatiaschek/cmder/kubeflow_repo
mv: cannot stat '/tmp/kubeflow-v0.2.1': No such file or directory
$ mv /tmp/kubeflow-0.2.1/ /tmp/kubeflow-v0.2.1
$ curl https://raw.githubusercontent.com/kubeflow/kubeflow/v${KUBEFLOW_VERSION}/scripts/gke/deploy.sh | bash
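The rename workaround above can be sketched as a small helper. This is only an illustration, not part of deploy.sh: the function name `normalize_kubeflow_tmp_dir` and its optional base-directory argument are hypothetical, and it assumes the only mismatch is the missing `v` prefix shown in the mv error.

```shell
# Hypothetical helper (not part of deploy.sh): rename <base>/kubeflow-<version>
# to <base>/kubeflow-v<version> when only the un-prefixed directory exists.
normalize_kubeflow_tmp_dir() {
  local version="$1"
  local base="${2:-/tmp}"
  local unpacked="${base}/kubeflow-${version}"
  local expected="${base}/kubeflow-v${version}"

  # The directory the installer looks for is already there: nothing to do.
  if [ -d "${expected}" ]; then
    return 0
  fi
  # Only the un-prefixed directory exists: apply the workaround from the issue.
  if [ -d "${unpacked}" ]; then
    mv "${unpacked}" "${expected}"
    return 0
  fi
  echo "neither ${unpacked} nor ${expected} exists" >&2
  return 1
}
```

It could be called as `normalize_kubeflow_tmp_dir "${KUBEFLOW_VERSION}"` before re-running the installer.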

I could not recover from the next error. Any help would be greatly appreciated!

kubeflow exists
+ gcloud deployment-manager --project=*********** deployments update kubeflow --config=cluster-kubeflow.yaml
The fingerprint of the deployment is ***************==
Waiting for update [operation-1531486959504-570e118330a81-c859d6b5-78eef300]...
....failed.
ERROR: (gcloud.deployment-manager.deployments.update) Error in Operation [operation-1531486959504-570e118330a81-c859d6b5-78eef300]: errors:
- code: CONDITION_NOT_MET
  location: /deployments/kubeflow/resources/kubeflow-gpu-pool-v1->$.properties
  message: '"": domain: validation; keyword: properties; message: required property(ies)
    not found; missing: ["nodePoolId"]; required: ["clusterId","nodePoolId","zone"]'

@mmatiaschek changed the title from 'Installer fails' to '"getting-started-gke" installer fails' on Jul 13, 2018
@mmatiaschek
Author

It seems like kubectl 1.11.0 is buggy and responsible for the failed deployment; 1.10.5 seems to work.
A fix for the tmp issue would still be nice.
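
A quick way to compare a client version against the one reported as problematic here might look like the sketch below. `version_lt` is a hypothetical helper that relies on GNU `sort -V`; the `installed` value is a placeholder, and how the real client version is obtained is left out. The thread does not confirm that 1.11.0 is actually the cause.

```shell
# version_lt A B: succeeds when version A sorts strictly before version B.
# Assumes GNU sort with -V (version sort) is available.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# 1.11.0 is the version reported as problematic above; 1.10.5 reportedly worked.
suspect="1.11.0"
installed="1.10.5"   # placeholder: substitute the real client version

if version_lt "${installed}" "${suspect}"; then
  echo "kubectl ${installed} is older than ${suspect}"
else
  echo "kubectl ${installed} is ${suspect} or newer; 1.10.5 reportedly worked"
fi
```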

@ankushagarwal
Contributor

/assign me

@k8s-ci-robot
Contributor

@ankushagarwal: GitHub didn't allow me to assign the following users: me.

Note that only kubeflow members and repo collaborators can be assigned.

In response to this:

/assign me

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ankushagarwal
Contributor

/assign ankushagarwal

@ankushagarwal
Contributor

@mmatiaschek: Do you know what the problem with ksonnet was? I haven't had issues with 1.11.0.

Regarding the gcloud deployment-manager error: can you first delete the deployment and its config directory, then try again?

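# Delete the existing deployment (PROJECT must already be set).
DEPLOYMENT_NAME=kubeflow
gcloud deployment-manager deployments delete --quiet \
  "${DEPLOYMENT_NAME}" --project="${PROJECT}"

# Remove the deployment's config directory.
KUBEFLOW_DM_DIR="${DEPLOYMENT_NAME}_deployment_manager_configs"
rm -rf "${KUBEFLOW_DM_DIR}"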
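
The cleanup steps above run `rm -rf` on a path built from a variable, so a guarded wrapper may be safer. The following is a minimal sketch, not part of the Kubeflow scripts: the function name `cleanup_deployment` and the `DRY_RUN` switch are hypothetical, while the gcloud command and directory name are taken from the commands above.

```shell
# Hypothetical wrapper around the two cleanup steps above.
# Usage: cleanup_deployment <deployment-name> <project-id>
# Set DRY_RUN=1 to print the commands instead of running them.
cleanup_deployment() {
  # Abort if either argument is missing or empty, so rm -rf never sees
  # a path built from an empty name.
  local name="${1:?deployment name required}"
  local project="${2:?project id required}"
  local config_dir="${name}_deployment_manager_configs"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gcloud deployment-manager deployments delete --quiet ${name} --project=${project}"
    echo "rm -rf ${config_dir}"
    return 0
  fi

  gcloud deployment-manager deployments delete --quiet \
    "${name}" --project="${project}"
  rm -rf "${config_dir}"
}
```

Running `DRY_RUN=1 cleanup_deployment kubeflow "${PROJECT}"` first shows what would be deleted before anything is removed.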

@jlewi
Contributor

jlewi commented Jul 16, 2018

@mmatiaschek The tmp issue should be fixed in 0.2.2-rc.0

@jlewi
Contributor

jlewi commented Jul 16, 2018

0.2.2-rc.0 has been promoted to 0.2.2.

jlewi added a commit to jlewi/website that referenced this issue Jul 16, 2018
* 0.2.2 includes some fixes to deploy.sh.

Fix kubeflow/kubeflow#1201
k8s-ci-robot pushed a commit to kubeflow/website that referenced this issue Jul 16, 2018
* Bump the Kubeflow version from 0.2.1 to 0.2.2.

* 0.2.2 includes some fixes to deploy.sh.

Fix kubeflow/kubeflow#1201

* Update the user_guide.

* Fix the user_guide.md
@mmatiaschek
Author

Thank you, it works. I still have some problems, though, so I will open another issue.
