
Configure Flannel networking task fails on Ubuntu 20.04 #115

Closed
NiftyMist opened this issue Sep 28, 2021 · 6 comments

@NiftyMist

I'm following along in the Ansible for Kubernetes book to stand up a 5-node cluster. The cluster is running Ubuntu 20.04 across the board. Node 1 (master) completes this task just fine; however, all 4 worker nodes fail on it with the following error:

Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

The only override vars I'm using are as follows:

# Kubernetes configuration.
kubernetes_version: '1.20'
kubernetes_allow_pods_on_master: false
kubernetes_apiserver_advertise_address: '10.0.0.10'
kubernetes_kubelet_extra_args: '--node-ip={{ ansible_host }}'

I will add any further info here as I continue to troubleshoot this issue.
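In case it helps with troubleshooting, the CA that a node's kubeconfig actually trusts can be decoded and inspected like this (just a sketch of a manual check, not something the role does):

# Decode the CA cert embedded in the kubeconfig and print its subject and
# fingerprint, so it can be compared against the one on the master.
grep certificate-authority-data /root/.kube/config \
  | awk '{print $2}' | base64 -d \
  | openssl x509 -noout -subject -fingerprint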

@NiftyMist
Author

I have verified that all nodes have the same certificate-authority-data in their /root/.kube/config.
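A quick way to spot-check that across every node at once (a sketch, using the same inventory file I use elsewhere in this thread):

# Hash the certificate-authority-data line on each node; the hashes should
# all match if every node trusts the same cluster CA.
ansible all -i inventory/hosts.yml -b -m shell \
  -a "grep certificate-authority-data /root/.kube/config | sha256sum"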

@NiftyMist
Author

A shot in the dark, but I added a full package update and reboot to see if that solved the issue. It was unsuccessful.

---
- hosts: kube
  become: true
  handlers:
    - name: reboot
      reboot:

  pre_tasks:
    # adding to see if updating all packages will resolve the issue of 
    # Configure Flannel networking task failing on worker nodes.
    - name: update all packages # noqa 403
      apt:
        name: '*'
        state: latest
        update_cache: true
      notify: reboot

    # ensure handlers are flushed before moving on to geerlingguy's roles.
    - name: flush handlers
      meta: flush_handlers

  # Geerlingguy's roles per Ansible for Kubernetes page 77 (2021Sep30).
  roles:
    - geerlingguy.security
    - geerlingguy.docker
    - geerlingguy.swap
    - geerlingguy.kubernetes

@NiftyMist
Author

I'm thinking maybe this issue should be filed against https://github.com/geerlingguy/ansible-for-kubernetes instead?

@NiftyMist
Author

I ran an ansible ad-hoc command to get all of the /etc/kubernetes/admin.conf files from my nodes so I could inspect them all on my local machine:

ansible -m fetch -a "src=/etc/kubernetes/admin.conf dest=/tmp/fetch" -i inventory/hosts.yml all -b
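The fetch module drops each host's copy under its own directory in /tmp/fetch, so comparing them is roughly (a sketch of the diff step):

# Compare each worker's fetched admin.conf against node01's copy.
for i in 2 3 4 5; do
  diff /tmp/fetch/node01/etc/kubernetes/admin.conf \
       /tmp/fetch/node0$i/etc/kubernetes/admin.conf
done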

I did a diff across all the files and saw that I was mistaken: the certificate-authority-data was different across the board. As a quick test, I copied the certificate-authority-data from my master node's admin.conf into my second node's admin.conf (both on my local machine) and pushed the modified file back out to node02. I SSHed to node02 and switched to the root user. Then just a kubectl get nodes and boom, no more certificate errors. However, I now got an error about not being a logged-in user:

root@node05:~# kubectl get nodes
error: You must be logged in to the server (Unauthorized)

@NiftyMist
Author

Replaced the /etc/kubernetes/admin.conf on all worker nodes with the exact same admin.conf I fetched to my local machine from node01, using a quick script:

#!/bin/bash
# Push node01's admin.conf out to each worker node.
for i in 2 3 4 5; do
  ansible -m copy \
    -a "src=/tmp/fetch/node01/etc/kubernetes/admin.conf dest=/etc/kubernetes/admin.conf" \
    -i inventory/hosts.yml all -b --limit node0$i
done

Then I ran the playbook again. It completed successfully, but I still only see node01 when I check from any of the nodes:

root@node01:~# kubectl get nodes
NAME                STATUS   ROLES    AGE   VERSION
node01.test.local   Ready    <none>   45h   v1.20.11
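To check that from every node in one go rather than sshing around, something like this works (a sketch using the same ad-hoc pattern as before):

# Run kubectl on every node against its local admin.conf; each node's output
# shows which cluster that kubeconfig actually points at.
ansible all -i inventory/hosts.yml -b -m shell \
  -a "kubectl get nodes --kubeconfig /etc/kubernetes/admin.conf"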

@NiftyMist
Author

NiftyMist commented Sep 30, 2021

I completely missed the kubernetes_role variable in the inventory on page 74 of Ansible for Kubernetes. I deleted and redeployed my nodes in my test environment and modified my inventory like so:

all:
  children:
    kube:
      children:
        kubemaster:
        kubeworker:
    kubemaster:
      hosts:
        node01:
    kubeworker:
      hosts:
        node0[2:5]:

inventory/group_vars/kubemaster.yml

---
# Kubernetes master configuration.
kubernetes_role: master

inventory/group_vars/kubeworker.yml

---
# Kubernetes worker configuration.
kubernetes_role: node
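A quick sanity check that the variable resolves the way I expect (a sketch; the debug module just prints the value each host ends up with):

# node01 should report "master"; node02-node05 should report "node".
ansible kube -i inventory/hosts.yml -m debug -a "var=kubernetes_role"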

I reran the playbook, logged back into node01, and could see all of the worker nodes in the cluster! 🎉

root@node01:~# kubectl get nodes
NAME                STATUS   ROLES                  AGE   VERSION
node01.test.local   Ready    control-plane,master   60s   v1.20.11
node02.test.local   Ready    <none>                 32s   v1.20.11
node03.test.local   Ready    <none>                 33s   v1.20.11
node04.test.local   Ready    <none>                 33s   v1.20.11
node05.test.local   Ready    <none>                 31s   v1.20.11

Sorry for the confusion and for opening a ticket unnecessarily. Thanks for all the work you do!
