This repository has been archived by the owner on Dec 8, 2023. It is now read-only.

Improve documentation on k3os install for High Availability with Embedded DB #666

Closed
cdemi opened this issue Feb 25, 2021 · 13 comments
Labels
kind/feature A new feature

Comments

@cdemi

cdemi commented Feb 25, 2021

I am experimenting with k3s in my homelab. I have managed to set up 1 server and 2 agents using the k3os install wizard. I have seen that in v0.19.5-rc.1, a multi-server or High Availability with Embedded DB configuration is supported.

I have been figuring out how to do this using k3os. I have spun up a new VM and it looks like the k3os install does not yet support this configuration, which is understandable since it's still in RC.

Is there any way I can achieve a multi-server install using k3os? Or maybe promote my current agents to masters as well? If so, can you guide me on how to achieve this? Basically, for my setup I plan to have all nodes act as both server and master.

@cdemi cdemi added the kind/feature A new feature label Feb 25, 2021
@zimme
Contributor

zimme commented Mar 2, 2021

First HA server node config

k3os:
  token: <token>
  k3s_args:
    - server
    - --cluster-init

Other HA server nodes config

k3os:
  token: <token>
  k3s_args:
    - server
    - --server
    - https://<first-server-url>:6443

This should be the bare minimum for embedded etcd HA using the latest k3os RC.

Here's some info on the k3s args used in the example.

https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/

I tried this a while back and got it up and running; however, I wrote this post from memory, so double-check the args, etc.
Use at your own risk, yada yada.

@cdeadspine

cdeadspine commented Apr 6, 2021

Last confirmed with v0.19.5-rc.1
"Other HA server nodes config"

k3os:
  token: <token>
  server_url: https://192.168.10.43:6443

I think you have to use a hard-coded IP or some kind of external DNS name, because mDNS is not standard?

Also you should have 3 or more servers running. A 2-node cluster can't tolerate losing either node, 1 of 3 running is a failure condition, and 2 of 3 running is OK.
(This is how etcd quorum works: a strict majority of members must be up.)
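The quorum rule above can be sketched with plain arithmetic (this is just etcd's majority math, not a k3os API):

```python
# etcd stays available only while a strict majority (quorum) of its
# voting members is up; the rest is what the cluster can afford to lose.
def quorum(n: int) -> int:
    """Smallest majority of an n-member cluster."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """How many members can fail while the cluster keeps quorum."""
    return n - quorum(n)

for n in (1, 2, 3, 4, 5):
    print(f"{n} members: quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Note that 2 members tolerate 0 failures, which is why going from 1 to 2 servers buys no fault tolerance; 3 is the smallest useful HA size.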

Also, I don't remember why now, but I definitely had to set:

k3os:
  dns_nameservers:
  - something
  ntp_servers:
  - something

@binayakd

Can confirm the following configs are working for me with v0.21.1-k3s1r0.

First HA server node:

k3os:
  token: <token>
  
  k3s_args:
    - server
    - "--cluster-init"
    
  ntp_servers:
    # ntp servers of your choice
    - 0.us.pool.ntp.org
    - 1.us.pool.ntp.org

  dns_nameservers:
    # dns servers of your choice
    - 8.8.8.8
    - 1.1.1.1

Rest of the HA nodes:

k3os:
  token: <token>
  
  k3s_args:
    - server
    - "--server"
    - "https://<first HA server node IP>:6443"
    
  ntp_servers:
    # ntp servers of your choice
    - 0.us.pool.ntp.org
    - 1.us.pool.ntp.org

  dns_nameservers:
    # dns servers of your choice
    - 8.8.8.8
    - 1.1.1.1

The k3os.server_url property is not needed for any of the nodes.
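Both configs above share the same `<token>` value. k3s treats the token as an opaque shared secret, so one way to generate one (a sketch, not an official k3os tool) is:

```python
# Generate a random shared secret to use as the cluster token.
# k3s treats the token as an opaque string; 32 hex chars is plenty.
import secrets

token = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
print(token)
```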

@dweomer dweomer closed this as completed Sep 1, 2021
@gprakosa

Hi,

Just curious: if the first node is unreachable, how can we contact the cluster API?

Thanks.

@cdemi
Author

cdemi commented Jan 21, 2022

If you are running in HA, you can contact any master node for the API
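Concretely (a sketch; the cluster name and file path are just the k3s defaults, adjust as needed), switching which node kubectl talks to is a matter of editing the server field in your kubeconfig:

```yaml
# ~/.kube/config (fragment)
clusters:
- cluster:
    certificate-authority-data: <unchanged>
    server: https://<any-server-node-ip>:6443   # any reachable server node
  name: default
```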

@gprakosa

Oh I see, just to make sure. Let's say I have:

1st node: 192.168.1.1/24
2nd node: 192.168.1.2/24

I can call both of them on https://192.168.1.1:6443 or https://192.168.1.2:6443 anytime?

@cdemi
Author

cdemi commented Jan 21, 2022 via email

@gprakosa

Great, thanks!

I'm thinking about registering both IPs under the same DNS name, but a load balancer looks more elegant.

@cdemi
Author

cdemi commented Jan 21, 2022

If you go down the DNS route, your clients may still send requests to nodes that are down. Another option is a floating IP address configuration (like keepalived), where the virtual IP address moves from one node to another if a node goes down.
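A minimal keepalived sketch for the floating-IP approach described above (the interface name, router ID, and VIP below are assumptions for a homelab, not values from k3os):

```
# /etc/keepalived/keepalived.conf on the primary node
vrrp_instance K8S_API {
    state MASTER            # use BACKUP with a lower priority on the peer
    interface eth0          # adjust to your NIC name
    virtual_router_id 51    # must match on all keepalived peers
    priority 100            # highest priority holds the VIP
    advert_int 1
    virtual_ipaddress {
        192.168.1.100       # the floating IP clients point at
    }
}
```

The peer runs the same config with state BACKUP and a lower priority; if the MASTER stops advertising, the VIP fails over to the peer automatically.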

@gprakosa

I turned off my first node, then got "Error from server: etcdserver: request timed out".

What have I missed?

@cdemi
Author

cdemi commented Jan 21, 2022

Did you update your kubeconfig? Or is it still pointing to the first node?

@gprakosa

Oops, my bad...

I spun up the 3rd node, turned off the 1st node again, and then it worked. Perfect, thanks!

Now the rest is to figure out the best setup (load balancer/keepalived) on k3OS so I can access the cluster seamlessly with a single kubeconfig. I'll update later.

@cdemi
Author

cdemi commented Jan 21, 2022

If you want a truly HA environment, you should have at least 2 load balancers with keepalived, with the load balancers pointing to the Kubernetes server nodes.
