[gcp] bootstrapper fails to create CloudEndpoints resource #954

Closed
jlewi opened this issue Jun 8, 2018 · 6 comments · Fixed by #976
Comments

@jlewi
Contributor

jlewi commented Jun 8, 2018

When I deploy on GKE I'm seeing the following error and no cloudendpoints resource is created.

{"filename":"app/server.go:243","level":"info","msg":"Using existing namespace: kubeflow","time":"2018-06-08T05:01:00Z"}
{"filename":"app/server.go:402","level":"info","msg":"Cluster version: v1.9.6-gke.1","time":"2018-06-08T05:01:00Z"}
{"filename":"app/server.go:217","level":"info","msg":"Storage class: standard","time":"2018-06-08T05:01:00Z"}
{"filename":"app/server.go:231","level":"info","msg":"StorageClass standard is default true","time":"2018-06-08T05:01:00Z"}
{"filename":"app/server.go:417","level":"info","msg":"Using K8s host https://10.7.240.1:443","time":"2018-06-08T05:01:00Z"}
{"filename":"env/create.go:78","level":"info","msg":"Creating environment \"default\" with namespace \"kubeflow\", pointing to cluster at address \"https://10.7.240.1:443\"","time":"2018-06-08T05:01:00Z"}
{"filename":"lib/lib.go:129","level":"info","msg":"Generating ksonnet-lib data at path '/opt/bootstrap/default/lib/v1.7.0'","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:441","level":"info","msg":"Successfully initialized the app /opt/bootstrap/default.","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:296","level":"info","msg":"Installing package kubeflow/core","time":"2018-06-08T05:01:01Z"}
{"filename":"registry/cache.go:83","level":"info","msg":"Retrieved 33 files","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:296","level":"info","msg":"Installing package kubeflow/tf-serving","time":"2018-06-08T05:01:01Z"}
{"filename":"registry/cache.go:83","level":"info","msg":"Retrieved 5 files","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:296","level":"info","msg":"Installing package kubeflow/tf-job","time":"2018-06-08T05:01:01Z"}
{"filename":"registry/cache.go:83","level":"info","msg":"Retrieved 6 files","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:263","level":"info","msg":"Creating Component: kubeflow-core ...","time":"2018-06-08T05:01:01Z"}
{"filename":"component/create.go:91","level":"info","msg":"Writing component at '/opt/bootstrap/default/components/kubeflow-core.jsonnet'","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:263","level":"info","msg":"Creating Component: cloud-endpoints ...","time":"2018-06-08T05:01:01Z"}
{"filename":"component/create.go:91","level":"info","msg":"Writing component at '/opt/bootstrap/default/components/cloud-endpoints.jsonnet'","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:263","level":"info","msg":"Creating Component: cert-manager ...","time":"2018-06-08T05:01:01Z"}
{"filename":"component/create.go:91","level":"info","msg":"Writing component at '/opt/bootstrap/default/components/cert-manager.jsonnet'","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:263","level":"info","msg":"Creating Component: iap-ingress ...","time":"2018-06-08T05:01:01Z"}
{"filename":"component/create.go:91","level":"info","msg":"Writing component at '/opt/bootstrap/default/components/iap-ingress.jsonnet'","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:511","level":"info","msg":"App root /opt/bootstrap/default","time":"2018-06-08T05:01:01Z"}
Initialized app /opt/bootstrap/default
{"filename":"app/server.go:520","level":"info","msg":"Apply kubeflow Components...","time":"2018-06-08T05:01:01Z"}
{"filename":"app/server.go:529","level":"info","msg":"stderr \u003e\u003e\u003e exit status 1: unable to recognize \"STDIN\": no matches for kind \"Issuer\" in version \"certmanager.k8s.io/v1alpha1\"\nunable to recognize \"STDIN\": no matches for kind \"LambdaController\" in version \"metacontroller.k8s.io/v1alpha1\"\nunable to recognize \"STDIN\": no matches for kind \"Certificate\" in version \"certmanager.k8s.io/v1alpha1\"\nunable to recognize \"STDIN\": no matches for kind \"CloudEndpoint\" in version \"ctl.isla.solutions/v1\"\n","time":"2018-06-08T05:01:28Z"}
{"filename":"bootstrap/main.go:48","level":"error","msg":"Bootstrapper failed with error: exit status 1\n","time":"2018-06-08T05:01:28Z"}
{"filename":"bootstrap/main.go:49","level":"info","msg":"Keeping pod alive so user can ssh in and check error status.","time":"2018-06-08T05:01:28Z"}

I wonder if this is an issue with the ordering of resource creation.
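To see at a glance which kinds the cluster did not recognize, the stderr message in the log above can be parsed with a short script. This is only a sketch: the regular expression is written against the message format shown in this log, and the function name `missing_kinds` is made up for illustration.

```python
import re

# Matches kubectl's "no matches for kind" message as it appears in the log above.
MISSING_KIND = re.compile(r'no matches for kind "([^"]+)" in version "([^"]+)"')


def missing_kinds(stderr: str) -> list[tuple[str, str]]:
    """Return (kind, apiVersion) pairs that kubectl could not recognize."""
    return MISSING_KIND.findall(stderr)


# The stderr text from the bootstrapper log, with JSON escapes already decoded.
stderr = (
    'unable to recognize "STDIN": no matches for kind "Issuer" '
    'in version "certmanager.k8s.io/v1alpha1"\n'
    'unable to recognize "STDIN": no matches for kind "LambdaController" '
    'in version "metacontroller.k8s.io/v1alpha1"\n'
    'unable to recognize "STDIN": no matches for kind "Certificate" '
    'in version "certmanager.k8s.io/v1alpha1"\n'
    'unable to recognize "STDIN": no matches for kind "CloudEndpoint" '
    'in version "ctl.isla.solutions/v1"\n'
)

for kind, version in missing_kinds(stderr):
    print(f"{version}: {kind}")
```

All four kinds are custom resources, which is consistent with the custom resource definitions not being registered yet when the objects were applied.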

/kind bug
/priority p1

@jlewi
Contributor Author

jlewi commented Jun 8, 2018

I exec'd into the bootstrapper and reran

ks show default > /tmp/manifests
kubectl apply -f /tmp/manifests 

The cloudendpoints resource was created. So I think it is an ordering issue.
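One way to test the ordering theory would be to reorder the rendered manifests so that CustomResourceDefinition objects come before everything else. The sketch below assumes the manifests are already parsed into dicts; the object names are illustrative and not taken from the real output of ks show. Ordering alone may not be enough, since the API server can need a moment to register a new CRD before it accepts objects of that kind.

```python
def crds_first(manifests):
    """Stable sort that moves CustomResourceDefinition objects to the front.

    sorted() is stable and False sorts before True, so the relative order
    inside each group is preserved.
    """
    return sorted(manifests, key=lambda m: m.get("kind") != "CustomResourceDefinition")


# Hypothetical manifests, in the order a renderer might emit them.
manifests = [
    {"kind": "CloudEndpoint", "apiVersion": "ctl.isla.solutions/v1",
     "metadata": {"name": "example-endpoint"}},
    {"kind": "Deployment", "apiVersion": "apps/v1",
     "metadata": {"name": "example-controller"}},
    {"kind": "CustomResourceDefinition",
     "apiVersion": "apiextensions.k8s.io/v1beta1",
     "metadata": {"name": "cloudendpoints.ctl.isla.solutions"}},
]

ordered = crds_first(manifests)
print([m["kind"] for m in ordered])
```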

/assign @kunmingg

@kunmingg
Contributor

kunmingg commented Jun 8, 2018

OK, it seems that ks show | kubectl apply is a bit hacky. I will change it to use ks apply directly.

@jlewi
Contributor Author

jlewi commented Jun 8, 2018

Using ks directly might be better, but I'm not sure that will solve the problem.

Maybe you just need to explicitly order the components?

e.g.

ks show default -c metacontroller | kubectl apply -f -
ks show default -c cloudendpoints | kubectl apply -f -
...

@kunmingg
Contributor

kunmingg commented Jun 8, 2018

Control the order from the YAML config?
Define the dependencies for each component and apply the components in DAG order.
Make it extensible.
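A minimal sketch of the DAG idea, using Python's standard graphlib module (Python 3.9+). The component names come from the bootstrapper log, but the dependency edges are purely hypothetical and would have to come from the config.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: component -> components it depends on.
# These edges are assumptions for illustration, not the real dependencies.
deps = {
    "kubeflow-core": [],
    "cert-manager": [],
    "cloud-endpoints": ["kubeflow-core"],
    "iap-ingress": ["cert-manager", "cloud-endpoints"],
}


def apply_order(deps):
    """Return the components ordered so each one follows its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


print(apply_order(deps))
```

Adding a component then only means adding one entry to the map, which keeps the scheme extensible.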

@jlewi
Contributor Author

jlewi commented Jun 10, 2018

@kunmingg Do we need to introduce new variables? Could we just create the components in the order specified in the YAML file?
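For comparison, the simpler option could be sketched as below: take the component list in the order it appears in the YAML file and build one ks apply command per component. The list is hard-coded here as if it had already been loaded from the config, the environment name default is taken from the log, and the helper name `apply_commands` is hypothetical. The commands are only built, not executed.

```python
def apply_commands(components, env="default"):
    """Build one `ks apply <env> -c <component>` command per component,
    preserving the order in which the components were listed."""
    return [["ks", "apply", env, "-c", name] for name in components]


# Components in the order the bootstrapper log shows them being created;
# assumed here to match the order in the YAML config.
components = ["kubeflow-core", "cloud-endpoints", "cert-manager", "iap-ingress"]

for cmd in apply_commands(components):
    print(" ".join(cmd))
```

Each command list could be handed to subprocess.run(cmd, check=True) so that a failure stops the sequence at the component that caused it.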

@kunmingg
Contributor

@jlewi
I saw the same error even when creating one component at a time, so the fix is to use ks apply instead of kubectl apply; ks apply takes care of the creation order for us.

yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021
* add a gauge metric for current experiments

Signed-off-by: yeya24 <yb532204897@gmail.com>

* fmt & fix test

Signed-off-by: yeya24 <yb532204897@gmail.com>
surajkota pushed a commit to surajkota/kubeflow that referenced this issue Jun 13, 2022
…c0 (kubeflow#954)

* image gcr.io/kubeflow-images-public/notebook-controller:vmaster-gf39279c0
* Image built from kubeflow/kubeflow@f39279c0