Error 'the server has asked for the client to provide credentials' #677

Closed
ShengjieLuo opened this issue Jul 21, 2016 · 21 comments

@ShengjieLuo
Contributor

Hi. I deployed Pachyderm on another server. The installation succeeded, with these versions:

COMPONENT           VERSION
pachctl             1.1.0
pachd               1.1.0

However, when I run the fruit stand pipeline, every job fails:

ID                                 OUTPUT                                     STARTED             DURATION             STATE
d6ddca2dfd72e6a0da7053ba5151b4cb   filter3/1dd4428dcc4d40359e7bcc3cdb594f3b   8 seconds ago       Less than a second   failure
2610bae2936923f0ce850c04f2cedad3   filter2/d58ac2a1231d4db4bb2487554bf36273   25 minutes ago      Less than a second   failure
dfe6acbbcd241d55a394a95077df5d1e   filter/a4934ebe280c4e2cae2e6cfb4b1c4c04    2 hours ago         Less than a second   failure
e9a26e0594f6bd00bacefa33c1b9850a   filter/78a4cea8361c41f3a9a2c8e2f9679bb0    2 hours ago         Less than a second   failure

I tried several times, but every job failed.

Here is the pipeline information:

NAME                INPUT               OUTPUT              STATE
filter              data                filter              running
sum3                filter2             sum3                running
filter2             data                filter2             running
sum2                filter2             sum2                running
filter3             data                filter3             running
sum                 filter              sum                 running

I checked the job logs for the problem:

pachctl get-logs d6ddca2dfd72e6a0da7053ba5151b4cb
the server has asked for the client to provide credentials (get pods)

I can delete the repo for this pipeline, but I can't delete the pipeline itself:

pachctl delete-pipeline filter3
error from DeletePipeline: the server has asked for the client to provide credentials (get pods)

I searched the source code for this message and found it here:

src/server/vendor/k8s.io/kubernetes/pkg/api/errors/errors.go 
case http.StatusUnauthorized:
        reason = unversioned.StatusReasonUnauthorized
        message = "the server has asked for the client to provide credentials"

Unfortunately, I have hit a lot of problems recently, so I end up asking for help every day.

@jdoliner
Member

@ShengjieLuo what version of Kubernetes are you using? I just had some issues getting Pachyderm to work on 1.3 and had to downgrade to 1.2. Also, could you grab logs from the server? Running make logs from within our repo will grab them for you.

@ShengjieLuo
Contributor Author

@jdoliner I am using the latest stable Kubernetes, 1.3.2, and that is where this problem occurs. I had not tried Kubernetes 1.2.2 before. The official Kubernetes guides have all been updated for 1.3, so I could not find documented instructions for deploying 1.2.2. I tried two methods to deploy it anyway, and both failed.

@ShengjieLuo
Contributor Author

Method 1: I used Docker to deploy Kubernetes.

export K8S_VERSION=v1.2.2
export ARCH=amd64
docker run -d \
    --volume=/:/rootfs:ro \
    --volume=/sys:/sys:rw \
    --volume=/var/lib/docker/:/var/lib/docker:rw \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
    --volume=/var/run:/var/run:rw \
    --net=host \
    --pid=host \
    --privileged \
    gcr.io/google_containers/hyperkube-${ARCH}:${K8S_VERSION} \
    /hyperkube kubelet \
        --containerized \
        --hostname-override=127.0.0.1 \
        --api-servers=http://localhost:8080 \
        --config=/etc/kubernetes/manifests \
        --cluster-dns=10.0.0.10 \
        --cluster-domain=cluster.local \
        --allow-privileged --v=2

After the deployment finished, kubectl did not work:

 kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.2", GitCommit:"528f879e7d3790ea4287687ef0ab3f2a01cc2718", GitTreeState:"clean"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?

I think port 8080 of the API server is blocked, but I am not sure.

@ShengjieLuo
Contributor Author

Method 2: I used make launch-kube from pachyderm/Makefile:

launch-kube: check-kubectl
        etc/kube/start-kube-docker.sh

Here is the script:

#!/bin/sh

set -Ee

docker run \
    -d \
    --volume=/:/rootfs:ro \
    --volume=/sys:/sys:ro \
    --volume=/dev:/dev \
    --volume=/var/lib/docker/:/var/lib/docker:rw \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
    --volume=/var/run:/var/run:rw \
    --net=host \
    --pid=host \
    --privileged=true \
    gcr.io/google_containers/hyperkube:v1.2.2 \
    /hyperkube kubelet \
        --containerized \
        --hostname-override="127.0.0.1" \
        --address="0.0.0.0" \
        --api-servers=http://localhost:8080 \
        --config=/etc/kubernetes/manifests \
        --allow-privileged=true
#until kubectl version 2>/dev/null >/dev/null; do sleep 5; done

Unfortunately, the same problem occurs:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.2", GitCommit:"528f879e7d3790ea4287687ef0ab3f2a01cc2718", GitTreeState:"clean"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?

@ShengjieLuo
Contributor Author

With Kubernetes 1.3.0 there is no such problem, and I can use Kubernetes successfully.

@jdoliner
Member

I think right now it's got to be 1.2 due to the pending bugs. Are you proxying port 8080? What does kubectl version get you? You might need to downgrade it. What's in ~/.kube/config?
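
A minimal sketch for checking whether the kubectl client is a 1.2.x release. It assumes the kubectl version output format shown earlier in this thread (a Client Version line containing GitVersion:"..."); the function names client_git_version and is_1_2 are hypothetical helpers, not part of kubectl.

```shell
# client_git_version: reads `kubectl version` output on stdin and prints
# the GitVersion from the "Client Version" line, e.g. v1.2.2.
# Assumes the version.Info{...} format quoted in this thread.
client_git_version() {
    sed -n 's/^Client Version:.*GitVersion:"\([^"]*\)".*/\1/p'
}

# is_1_2: succeeds if the version string passed as $1 is a v1.2.x release.
is_1_2() {
    case "$1" in
        v1.2.*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage on a live host (assumes kubectl is installed):
#   v=$(kubectl version 2>/dev/null | client_git_version)
#   is_1_2 "$v" && echo "client $v is 1.2.x" || echo "client $v is not 1.2.x"
```

This only inspects the client line, so it still works when the server is unreachable and kubectl prints the connection-refused message instead of a server version.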

@ShengjieLuo
Contributor Author

See ~/.kube/config here

apiVersion: v1
clusters:
- cluster:
    server: http://localhost:8080
  name: test-doc
contexts:
- context:
    cluster: test-doc
    user: ""
  name: test-doc
current-context: test-doc
kind: Config
preferences: {}
users: []

@jdoliner
Member

Is the port proxied, or is Docker running on localhost?

@ShengjieLuo
Contributor Author

ShengjieLuo commented Jul 22, 2016

Here is my network setup.
The server's IP: 172.16.6.55
The server uses http_proxy: http://<proxy_name>:914/
The server also uses https_proxy: https://<proxy_name>:914/
I connect to the server over ssh.
Here is the netstat output:

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:7001          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:4001          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:2380          0.0.0.0:*               LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
tcp6       0      0 :::4194                 :::*                    LISTEN
tcp6       0      0 :::10250                :::*                    LISTEN
tcp6       0      0 :::10251                :::*                    LISTEN
tcp6       0      0 :::10255                :::*                    LISTEN

Port 8080 is not bound.
Compare a successfully deployed Kubernetes 1.3.0:

Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:7001          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:4001          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:2380          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:8080          0.0.0.0:*               LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
tcp6       0      0 :::30650                :::*                    LISTEN
tcp6       0      0 :::4194                 :::*                    LISTEN
tcp6       0      0 :::10250                :::*                    LISTEN
tcp6       0      0 :::10251                :::*                    LISTEN
tcp6       0      0 :::6443                 :::*                    LISTEN
tcp6       0      0 :::10252                :::*                    LISTEN
tcp6       0      0 :::10255                :::*                    LISTEN
tcp6       0      0 :::32080                :::*                    LISTEN
tcp6       0      0 :::32081                :::*                    LISTEN
tcp6       0      0 :::32082                :::*                    LISTEN 

Ports 8080 and 30650 are bound. Do I need to disconnect from the proxy server when deploying Kubernetes?
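
The comparison above can be automated with a small check. This is a sketch only: port_listening is a hypothetical helper name, and it assumes netstat output in the column layout shown above (local address in the fourth column, LISTEN in the last).

```shell
# port_listening PORT: reads netstat-style output on stdin and succeeds
# (exit status 0) if some socket in LISTEN state is bound to PORT.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            # Column 4 is the local address, e.g. 127.0.0.1:8080 or :::6443;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a live host (assumes netstat is installed):
#   netstat -tln | port_listening 8080 && echo "8080 bound" || echo "8080 not bound"
```

Taking the text after the last colon handles both the IPv4 (127.0.0.1:8080) and IPv6 (:::30650) rows in the listings.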

@ShengjieLuo
Contributor Author

Here is the docker container list as well:

CONTAINER ID        IMAGE                                             COMMAND                  CREATED             STATUS              PORTS               NAMES
a458370bc041        gcr.io/google_containers/hyperkube-amd64:v1.2.2   "/hyperkube scheduler"   About an hour ago   Up About an hour                        k8s_scheduler.fc12fcbe_k8s-master-127.0.0.1_default_4c6ab43ac4ee970e1f563d76ab3d3ec9_4aef43e8
c27171089e72        gcr.io/google_containers/etcd:2.2.1               "/usr/local/bin/etcd "   About an hour ago   Up About an hour                        k8s_etcd.7e452b0b_k8s-etcd-127.0.0.1_default_1df6a8b4d6e129d5ed8840e370203c11_59ad41dc
e84737652501        gcr.io/google_containers/hyperkube-amd64:v1.2.2   "/hyperkube proxy --m"   About an hour ago   Up About an hour                        k8s_kube-proxy.9a9f4853_k8s-proxy-127.0.0.1_default_5e5303a9d49035e9fad52bfc4c88edc8_1f8959d0
a5d601e99e00        gcr.io/google_containers/pause:2.0                "/pause"                 About an hour ago   Up About an hour                        k8s_POD.6059dfa2_k8s-etcd-127.0.0.1_default_1df6a8b4d6e129d5ed8840e370203c11_e66abc6a
3be82a84ada9        gcr.io/google_containers/pause:2.0                "/pause"                 About an hour ago   Up About an hour                        k8s_POD.6059dfa2_k8s-master-127.0.0.1_default_4c6ab43ac4ee970e1f563d76ab3d3ec9_9157d23f
6193e7999d3c        gcr.io/google_containers/pause:2.0                "/pause"                 About an hour ago   Up About an hour                        k8s_POD.6059dfa2_k8s-proxy-127.0.0.1_default_5e5303a9d49035e9fad52bfc4c88edc8_4ba5f669
05d1fa8c86c2        gcr.io/google_containers/hyperkube:v1.2.2         "/hyperkube kubelet -"   About an hour ago   Up About an hour                        admiring_hamilton

@ShengjieLuo
Contributor Author

I have worked on deploying Kubernetes v1.2.2 for a whole day without solving this problem. The hyperkube container cannot connect to 127.0.0.1:8080, and neither method works.

I used nmap localhost to check which ports are open.
On the unsuccessful 1.2.2 deployment, the output is:

 nmap localhost

Starting Nmap 6.40 ( http://nmap.org ) at 2016-07-22 16:38 CST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000023s latency).
Other addresses for localhost (not scanned): 127.0.0.1
Not shown: 997 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
4001/tcp open  newoak
7001/tcp open  afs3-callback

Nmap done: 1 IP address (1 host up) scanned in 2.06 seconds

On a successfully deployed Kubernetes 1.3.0, it is:

 nmap localhost

Starting Nmap 6.40 ( http://nmap.org ) at 2016-07-22 16:39 CST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000023s latency).
Other addresses for localhost (not scanned): 127.0.0.1
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
4001/tcp open  newoak
7001/tcp open  afs3-callback
8080/tcp open  http-proxy

Nmap done: 1 IP address (1 host up) scanned in 2.08 seconds

Here 8080 is bound, and nmap reports it as http-proxy.
So I think the main problem in v1.2.2 is how to proxy port 8080 so that it works. Although I used similar methods to deploy v1.2.2 and v1.3.0, it seems the proxy problem was solved by the Kubernetes upgrade, and I now have to face it because of the older version.

I searched for a solution and found this guide:
http://kubernetes.io/docs/getting-started-guides/docker/

If you are behind a proxy, you need to pass the proxy setup to curl in the containers to pull the certificates. Create a .curlrc under /root folder (because the containers are running as root) with the following line: proxy = <your_proxy_server>:

However, I added the proxy to /root/.curlrc on the server, and it had no effect. I also tried docker run -p and kubelet -proxy=<proxy>, with no improvement.

@ShengjieLuo
Contributor Author

The Pachyderm troubleshooting guide mentions this:

If you see the following:

$ kubectl get all
The connection to the server localhost:8080 was refused - did you specify the right host or port?

You probably have not enabled port forwarding. You can start port forwarding by running something like:
ssh <HOST> -fTNL 8080:localhost:8080 -L 30650:localhost:30650
Where <HOST> is one of the names of your docker hosts. You can see a list by running:
docker-machine ls

But I deployed Kubernetes on a local server instead of a docker-machine host. If I run docker-machine ls, the server reports it as an invalid command.

@ShengjieLuo
Contributor Author

ShengjieLuo commented Jul 22, 2016

Note:
Maybe we should look at this from a different angle. I suspect the real cause is that k8s_apiserver is not running!

docker ps -a | grep api
a096a57c6395        gcr.io/google_containers/hyperkube-amd64:v1.2.2   "/hyperkube apiserver"   55 seconds ago       Exited (255) 53 seconds ago                           k8s_apiserver.78ec1de_k8s-master-127.0.0.1_default_4c6ab43ac4ee970e1f563d76ab3d3ec9_bf5d73be
05bb8901a245        gcr.io/google_containers/hyperkube-amd64:v1.2.2   "/hyperkube apiserver"   2 minutes ago        Exited (255) 2 minutes ago                            k8s_apiserver.78ec1de_k8s-master-127.0.0.1_default_4c6ab43ac4ee970e1f563d76ab3d3ec9_8b14199a
cc4aeb5414e2        gcr.io/google_containers/hyperkube-amd64:v1.2.2   "/hyperkube apiserver"   3 minutes ago        Exited (255) 3 minutes ago                            k8s_apiserver.78ec1de_k8s-master-127.0.0.1_default_4c6ab43ac4ee970e1f563d76ab3d3ec9_85583fc9

The apiserver exits about once a minute.
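
A minimal sketch for picking out the crashed apiserver containers so their logs can be read. It assumes docker ps -a output in the layout shown above; exited_apiservers is a hypothetical helper name, not a Docker command.

```shell
# exited_apiservers: reads `docker ps -a` output on stdin and prints the
# container ID (first column) of every apiserver container that has exited.
exited_apiservers() {
    awk '/hyperkube apiserver/ && /Exited [(]/ { print $1 }'
}

# Usage on a live host (assumes the Docker CLI is available):
#   id=$(docker ps -a | exited_apiservers | head -n 1)
#   [ -n "$id" ] && docker logs --tail 20 "$id"
```

Because docker ps -a lists the newest containers first, head -n 1 selects the most recent crash.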

@jdoliner
Member

That seems like a likely cause of the problem. What do logs from the apiserver say?

@ShengjieLuo
Contributor Author

Hi @jdoliner, the API server is down in this deployment.
Here are the apiserver logs:

  sv: batch/v1
  mv: extensions/__internal
I0725 09:01:28.421642       1 genericapiserver.go:82] Adding storage destination for group batch
W0725 09:01:28.421682       1 server.go:383] No RSA key provided, service account token authentication disabled
F0725 09:01:28.421704       1 server.go:410] Invalid Authentication Config: open /srv/kubernetes/basic_auth.csv: no such file or directory

I haven't found a solution for this problem yet. BTW, what do you mean by 'proxying port 8080'? Sorry, I didn't catch it.
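
In logs like the ones above, the first character of each line is the severity (I, W, or F in the lines shown), so the line that killed the process can be filtered out directly. A minimal sketch, with fatal_lines as a hypothetical helper name:

```shell
# fatal_lines: reads component logs on stdin and prints only the lines
# whose severity prefix is F (fatal), e.g. "F0725 09:01:28.421704 ...".
# Assumes the "<severity><mmdd> <time> ..." prefix seen in the logs above.
fatal_lines() {
    grep -E '^F[0-9]{4} '
}

# Usage on a live host (assumes the Docker CLI; <container_id> is a placeholder):
#   docker logs <container_id> 2>&1 | fatal_lines
```

Applied to the log above, this leaves only the basic_auth.csv line.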

@jdoliner
Member

Proxying the port is something you have to do if you're using docker-machine, but you're not, so that shouldn't be an issue. I'm not familiar with the Kubernetes issue, unfortunately. I'd try the k8s Slack channel; they're normally quite helpful.

@ShengjieLuo
Contributor Author

@jdoliner I have tested the deployment of Kubernetes and Pachyderm on four servers with different versions.
My conclusions are:
1. For Kubernetes >= v1.3.0 (including v1.3.0, v1.3.2, and v1.3.3), deploying Pachyderm works. We can create and delete repos, but pipeline jobs fail with the message "the server has asked for the client to provide credentials (get pods)". A new issue will be posted soon for the Pachyderm team to address this.

2. For Kubernetes < v1.3.0 (including v1.2.0 and v1.2.2), the official guide http://kubernetes.io/docs/getting-started-guides/docker/ cannot be followed to deploy Kubernetes, because the apiserver does not start. The relevant lines from the apiserver container log are below. I have joined the Kubernetes team on Slack to ask for help.

I0725 09:01:28.421642       1 genericapiserver.go:82] Adding storage destination for group batch
W0725 09:01:28.421682       1 server.go:383] No RSA key provided, service account token authentication disabled
F0725 09:01:28.421704       1 server.go:410] Invalid Authentication Config: open /srv/kubernetes/basic_auth.csv: no such file or directory

I tried each version several times on my servers to rule out operator error, so I regard this as a reproducible problem rather than an accidental one. If the Pachyderm team has time, please re-verify the deployment process from the beginning and make sure it works with a suitable version. Thanks a lot; I really appreciate your help with this.

@ShengjieLuo
Contributor Author

@jdoliner I contacted the Kubernetes team on Slack, and they helped me check the logs. The 1.2.2 failure happens because the setup pod is not triggered. However, checking remotely, they could not find the root cause. They also suggested switching from Ubuntu 14.04 to another OS. Would you mind verifying the deployment process on a local Ubuntu 14.04 machine behind a proxy server with Kubernetes 1.2.2? Thanks a lot if you can.

@jdoliner
Member

@ShengjieLuo I can confirm that I'm currently running Kubernetes 1.2.2 on Ubuntu 14.04 without any issues. I'm using a GCE VM via docker-machine and deploying kubernetes using docker. I haven't encountered any issues with it.

@ShengjieLuo
Contributor Author

ShengjieLuo commented Aug 1, 2016

@jdoliner Thank you for your patience. Since my local servers cannot be used for the deployment, I tried AWS EC2 instead. Ubuntu 14.04 + Kubernetes 1.2.2 + Pachyderm 1.1.0 works on the AWS servers, and the examples run.

As for the cause, I think the proxy port is not open for some application, since the only difference between EC2 and the local server is the network configuration. This is just speculation, though; I am not sure about it.

I will keep working with Pachyderm. Its idea of running each job as a Docker container is really exciting and different from Hadoop or Spark.

If anyone meets a similar problem in the future, they can refer to this issue. You can close it now. Thanks for your help.

@jdoliner
Member

jdoliner commented Aug 1, 2016

Glad that worked.
Closing.

@jdoliner jdoliner closed this as completed Aug 1, 2016