failed to load system roots and no roots provided - TLS error #36

Closed
cdenneen opened this Issue Feb 14, 2018 · 15 comments

cdenneen commented Feb 14, 2018

{"generation.metadata":0,"level":"error","msg":"error warming credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: x509: failed to load system roots and no roots provided","pod.iam.role":"arn:aws:iam::###########:role/chrisiamtest1","pod.name":"aws-cli3","pod.namespace":"default","pod.status.ip":"100.112.74.130","pod.status.phase":"Running","resource.version":"4849725","time":"2018-02-14T17:59:54Z"}

{"generation.metadata":0,"level":"error","msg":"error warming credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: x509: failed to load system roots and no roots provided","pod.iam.role":"chrisiamtest1","pod.name":"aws-cli","pod.namespace":"default","pod.status.ip":"100.112.74.129","pod.status.phase":"Running","resource.version":"4849455","time":"2018-02-14T17:59:54Z"}

{"level":"error","msg":"error requesting credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: x509: failed to load system roots and no roots provided","pod.iam.role":"arn:aws:iam::############:role/chrisiamtest1","time":"2018-02-14T18:00:54Z"}

I've tried with just using the role name and the full ARN in the pod deployment.
Can someone help me understand what this error means?
Is there documentation on how to specify the base-arn or is autodetect the best solution?

cdenneen commented Feb 14, 2018

I've logged into agent/server and the TLS secrets are in /etc/kiam/tls as they are supposed to be: https://github.com/uswitch/kiam/blob/master/docs/TLS.md

Contributor

pingles commented Feb 16, 2018

@cdenneen The published docker image doesn't have any trusted CAs included so it's failing to interact with the AWS APIs. The container expects these to be in /etc/ssl/certs.

We use CoreOS Container Linux and so mount the CAs from the host into the container (so updates to the host propagate to the containers).

The relevant bits in the deployment manifests are:
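A rough sketch of what such a mount looks like, assuming the `ssl-certs` volume name and the original host path that appear in the deploy/server.yaml diff further down this thread. The container name `kiam` and the `readOnly` flag are assumptions, and the correct `hostPath` depends on the host distribution.

```yaml
# Sketch only: pod-spec fragment for the kiam server, not a complete manifest.
# Assumptions: container name "kiam"; the host keeps its CA certificates
# under /usr/share/ca-certificates (adjust for your distribution).
spec:
  template:
    spec:
      volumes:
        - name: ssl-certs
          hostPath:
            path: /usr/share/ca-certificates  # host-specific location of trusted CAs
      containers:
        - name: kiam                          # assumed container name
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs       # where the container expects trusted CAs
              readOnly: true                  # assumption: the container only reads them
```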

Hope that helps. I'm going to close this for now but if something else happens please comment and we'll try to help.

@pingles pingles closed this Feb 16, 2018

cdenneen commented Feb 16, 2018

@pingles Thanks for sending me down the correct path. The kops hosts have a /usr/share/ca-certificates directory with a bunch of Mozilla certificates, but it doesn't seem to be what is needed here, so an update to the manifest is necessary. Maybe include ca-certificates in the Docker image, to avoid different underlying hosts having CA certificates in different places, or create a ConfigMap for an easier, documented update depending on which k8s install you have?

For anyone who finds this issue and needs help, here is what I ended up needing to do to get this working with kops.

updates to server.yaml

diff --git a/deploy/server.yaml b/deploy/server.yaml
index 9cd82ad..6d08710 100644
--- a/deploy/server.yaml
+++ b/deploy/server.yaml
@@ -17,10 +17,14 @@ spec:
       serviceAccountName: kiam-server
       nodeSelector:
         kubernetes.io/role: master
+      tolerations:
+      - key: "node-role.kubernetes.io/master"
+        effect: "NoSchedule"
+        operator: "Exists"
       volumes:
         - name: ssl-certs
           hostPath:
-            path: /usr/share/ca-certificates
+            path: /etc/ssl/certs
         - name: tls
           secret:
             secretName: kiam-server-tls

trust-policy.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::{{Account ID}}:role/masters.k8s.clusterdomain.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Contributor

pingles commented Feb 16, 2018

> Maybe include ca-certificates in the docker image to avoid different underlying hosts having ca-certificates in the wrong places or create config-map for easier/documented update depending upon the k8s install you might have?

I'd rather continue to mount from the host. I know this is somewhat host/cluster specific but it means that updates can be processed as the nodes are upgraded, rather than relying on each container having to

Contributor

pingles commented Feb 16, 2018

Remember that /etc/ssl/certs is the location the Go HTTP client uses to look up CA certificates so that it can talk to AWS. It isn't related to the agent accessing the server.
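If the host's certificates cannot be mounted at /etc/ssl/certs, one possible alternative is to mount them elsewhere and point Go at that location. This is an untested sketch, not something kiam documents. Go's crypto/x509 package reads the `SSL_CERT_FILE` and `SSL_CERT_DIR` environment variables on Linux to override the default certificate locations. The mount point `/host-ca-certs` and the container name below are hypothetical.

```yaml
# Sketch only: container fragment; names and paths are illustrative.
containers:
  - name: kiam                       # assumed container name
    env:
      - name: SSL_CERT_DIR           # read by Go's crypto/x509 on Unix systems other than macOS
        value: /host-ca-certs
    volumeMounts:
      - name: ssl-certs              # the hostPath volume holding the host's CA certificates
        mountPath: /host-ca-certs    # hypothetical mount point
        readOnly: true
```

Either way, this only affects how the process validates the AWS endpoints; the agent-to-server TLS material under /etc/kiam/tls is configured separately.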

@ewbankkit ewbankkit referenced this issue Apr 24, 2018

Closed

Helm charts #53

luispollo commented May 4, 2018

@cdenneen Your comments and patches saved the day for me. Thank you very much!

josselin-c commented Oct 10, 2018

For the record, on a Kubernetes 1.11 cluster created via kops, I had to use /usr/share/ca-certificates/mozilla instead of /usr/share/ca-certificates.

edgar-humberto commented Nov 13, 2018

I am getting that error as well.

error warming credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: x509: failed to load system roots and no roots provided

Here is the full error.

{"generation.metadata":0,"level":"error","msg":"error warming credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: x509: failed to load system roots and no roots provided","pod.iam.role":"kiam_sample","pod.name":"nodeapp-54b89ff69d-zk65q","pod.namespace":"iam-example","pod.status.ip":"","pod.status.phase":"Pending","resource.version":"2695","time":"2018-11-13T07:09:33Z"}

However, when I exec into the pod with

kubectl exec -it -n kube-system kiam-server-88f4j -- /bin/sh

I could see the certificate being mounted, and the contents of the files were as expected.

screenshot 2018-11-12 23 26 15

This was odd because, on the host, I could not find the certificates for the life of me in either of:

  • /etc/ssl/certs
  • /usr/share/ca-certificates

Clearly I am missing something here. Can someone help me understand what I am doing wrong?

Contributor

pingles commented Nov 13, 2018

@edgar-humberto It'll depend on your host distribution. What are you using?

edgar-humberto commented Nov 13, 2018

@pingles I am using Kops to build the k8s cluster. The nodes are running Debian Jessie.

BTW I noticed that it is recommended that the server and the agent be on different nodes. Not sure if this is affecting this, but I am mentioning it in case this is meaningful.

edgar-humberto commented Nov 14, 2018

Am I correct in thinking that I need to have the server and the agent on different hosts?

Member

Joseph-Irving commented Nov 14, 2018

@edgar-humberto Yes, the agents and server have to be on different hosts.

This is because the agent writes iptables rules so that any AWS metadata call on a host goes via the agent process, which talks to the kiam server.
So if the agent is on the same host as the server, then when the server tries to get AWS creds, the request goes via the agent and gets forwarded to a kiam server. You end up with a cyclic dependency, and the server never gets the creds it needs to work.
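One way to keep the two apart on a kops cluster is to pin each workload with a `nodeSelector`. This is a sketch under assumptions: the server selector is the one from deploy/server.yaml shown in the diff earlier in this thread, while the agent selector assumes worker nodes carry the `kubernetes.io/role: node` label, which is worth verifying with `kubectl get nodes --show-labels` before relying on it.

```yaml
# Sketch only: two pod-spec fragments, not complete manifests.
# kiam server: scheduled on masters (selector as in deploy/server.yaml).
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/role: master
---
# kiam agent: scheduled on workers only, so it never shares a host with the server.
# Assumption: worker nodes are labelled kubernetes.io/role=node.
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/role: node
```

With this split, the agent's iptables rules are only installed on worker nodes, so the server's own metadata requests are not intercepted.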

edgar-humberto commented Nov 14, 2018

@Joseph-Irving thanks for the reply. I will create a new instance group in Kops. I hope this solves my issue.

edgar-humberto commented Nov 14, 2018

Yup, that fixed it for me. I ran the server and agent on different instance groups. I also had to change the agent's interface configuration to cbr0.

          args:
            - agent
            - --iptables
            - --host-interface=cbr0

The default interface in the example is cali+, which prevented my pods from actually communicating with the kiam-agent.

Thank you for the help and for kiam, saved me a ton of work.

edgar-humberto commented Nov 16, 2018

OK, so I have this same issue popping up again when I switch my kops topology to private and set the networking to weave.

Is there something more that I need to do to get kiam to work with weave, other than setting --host-interface=weave?

Am I missing some other setting?
