kfctl apply fails to create k8s resources; tries to deploy to 127.0.0.1 #2791

Closed
jlewi opened this issue Mar 24, 2019 · 7 comments · Fixed by #2813

Comments

@jlewi
Contributor

jlewi commented Mar 24, 2019

When I first run kfctl apply -v all, it fails with the following error:

ERRO[0054] (Will retry) Component metacontroller apply failed; Error: Attempting to deploy to environment 'default' at '127.0.0.1', but cannot locate a server at that address  filename="ksonnet/ksonnet.go:191"

If I rerun kfctl apply -v all, it works.

So it looks like kfctl cannot reach the cluster the first time we run kfctl apply and it calls ksonnet apply.

I'm using version:
v0.4.0-rc.1-264-g20a80512

/cc @gabrielwen @kkasravi

jlewi added this to New in 0.5.0 via automation Mar 24, 2019
@kkasravi
Contributor

I believe this was fixed by #2763 (and #2719)

@jlewi
Contributor Author

jlewi commented Mar 24, 2019

@kkasravi Doesn't look like it. I'm using v0.4.0-rc.1-264-g20a80512 (commit 20a80512), which should include those fixes.

@gabrielwen @kkasravi How does the ksonnet KfApp get the credentials and the connection info for the cluster it talks to?

My conjecture is that when we first call kfctl apply, we initialize the ksonnet KfApp with a dummy address like 127.0.0.1 because the cluster doesn't exist yet. On rerunning, the cluster exists, so the KfApp is configured with the correct cluster.

Are we relying on the KUBECONFIG file and gcloud container clusters get-credentials?

@gabrielwen
Contributor

  • Credentials come from either .config/gcloud/application_default.json or GOOGLE_APPLICATION_CREDENTIALS.
  • We don't actually use gcloud container clusters get-credentials to get cluster info, but ksonnet still reads KUBECONFIG anyway. This might be the root cause of the problem.
  • I've created a function that could get rid of the KUBECONFIG file: link
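The credential lookup described in the first bullet can be sketched roughly as follows. This is only an illustration, not kfctl's code: the function name credentialsPath is hypothetical, the "environment variable first, file second" order follows the usual Google application default credentials behavior rather than anything stated in this thread, and the sketch assumes gcloud's usual file name application_default_credentials.json, which is slightly longer than the name written in the comment above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// credentialsEnvVar is the environment variable named in the comment above.
const credentialsEnvVar = "GOOGLE_APPLICATION_CREDENTIALS"

// wellKnownFile is an assumption: the file name gcloud normally writes
// under ~/.config/gcloud for application default credentials.
const wellKnownFile = "application_default_credentials.json"

// credentialsPath (hypothetical helper) returns the credentials file to use:
// the value of the environment variable if it is set, otherwise the
// well-known file under the given home directory.
func credentialsPath(getenv func(string) string, homeDir string) string {
	if p := getenv(credentialsEnvVar); p != "" {
		return p
	}
	return filepath.Join(homeDir, ".config", "gcloud", wellKnownFile)
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	fmt.Println("credentials file:", credentialsPath(os.Getenv, home))
}
```

Passing getenv as a parameter keeps the lookup easy to exercise without touching the real environment.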

@kkasravi
Contributor

@jlewi what were your kfctl init and kfctl generate args?

we don't really use gcloud container clusters get-credentials to get cluster info but Ksonnet still uses KUBECONFIG anyway. This might be the root cause of the problem.

Ksonnet shouldn't use KUBECONFIG; it should use the config from the platform if the platform has set it, either by expanding the KfDefSpec to include credentials or through a function pointer. This is the general pattern, so that:

  • adding new platforms will be transparent to ksonnet (or kustomize).
  • not specifying a platform won't break the default behavior in ksonnet.
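A minimal sketch of the "function pointer" variant of this pattern, under stated assumptions: every type and function name here (clusterConfig, configGetter, packageManager, resolveConfig) is hypothetical and only stands in for whatever kfctl actually defines. The point is that the package manager asks the platform for connection info when one is set, and falls back to its default source otherwise.

```go
package main

import "fmt"

// clusterConfig is a hypothetical, minimal stand-in for cluster connection info.
type clusterConfig struct {
	Server string
	Token  string
}

// configGetter is the "function pointer" a platform could hand over.
type configGetter func() (*clusterConfig, error)

// packageManager is a hypothetical stand-in for the ksonnet (or kustomize) KfApp.
type packageManager struct {
	platformConfig configGetter // nil when no platform is specified
	defaultConfig  configGetter // default behavior, e.g. reading KUBECONFIG
}

// resolveConfig prefers the platform's config and only falls back to the
// default source when the platform has not provided one.
func (p *packageManager) resolveConfig() (*clusterConfig, error) {
	if p.platformConfig != nil {
		return p.platformConfig()
	}
	return p.defaultConfig()
}

func main() {
	fromKubeconfig := func() (*clusterConfig, error) {
		return &clusterConfig{Server: "127.0.0.1"}, nil
	}
	fromPlatform := func() (*clusterConfig, error) {
		// Placeholder address from the documentation range, not a real cluster.
		return &clusterConfig{Server: "https://203.0.113.10", Token: "placeholder"}, nil
	}

	noPlatform := &packageManager{defaultConfig: fromKubeconfig}
	withPlatform := &packageManager{platformConfig: fromPlatform, defaultConfig: fromKubeconfig}

	for _, pm := range []*packageManager{noPlatform, withPlatform} {
		cfg, err := pm.resolveConfig()
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("deploying to", cfg.Server)
	}
}
```

With this shape, a new platform only has to supply its own getter, and the package manager code does not change.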

I can pick this up

@kkasravi
Contributor

/assign @kkasravi

jlewi moved this from New to kfctl Go binary is ready for users in 0.5.0 Mar 26, 2019
@jlewi
Contributor Author

jlewi commented Mar 26, 2019

I looked at a recent successful postsubmit run.
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/kubeflow_kubeflow/kubeflow-postsubmit/1294

The logs are attached below. They show that the error occurs in that run as well.

kubeflow-postsubmit-kfctl-go-cefb791-1294-5f69-1259696250.log.txt

@kkasravi
Contributor

The error is due to the following sequence:

  • During generate, we set the ksonnet serverURL to 127.0.0.1.
  • In gcp.apply, we use gcloud to add a new entry for the deployment cluster to .kube/config.
  • In ksonnet.apply, we call ksonnet envSet with the new serverURL, but without an override flag and without reloading the env.

0.5.0 automation moved this from kfctl Go binary is ready for users to Done Mar 27, 2019

3 participants