About multi-node support implementation options #11

olljanat · 2023-08-19T09:05:52Z

First of all, very interesting project. I see a lot of potential in this 🚀

I understand that the main use case for k2d is single-node deployments at the edge, but as far as I can see, it would be very easy to add at least some level of multi-node support to it.

Possibilities that come to mind are:

Swarm mode support

Basically, instead of converting deployments directly to containers, k2d would convert them to Swarm services and Swarm would handle the rest.
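
To make the idea concrete, here is a rough sketch of how the core fields of a Deployment (name, replica count, image) might map onto a `docker service create` call. The helper name `deployment_to_service_cmd` is hypothetical and this is not how k2d works today; it only prints the command instead of running it, and it ignores everything else a real Deployment can carry (ports, volumes, probes and so on).

```shell
#!/bin/sh
# Sketch only: deployment_to_service_cmd is a hypothetical helper, not k2d code.
# Prints the `docker service create` command that roughly corresponds to a
# Deployment with the given name, replica count and image.
deployment_to_service_cmd() {
  name="$1"
  replicas="$2"
  image="$3"
  echo "docker service create --name ${name} --replicas ${replicas} ${image}"
}

# Example: a 3-replica nginx Deployment called "web".
deployment_to_service_cmd web 3 nginx:1.25
```

Once the service exists, Swarm rather than k2d would decide which nodes the replicas land on.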

Pros:

Simple for existing Swarm users.

Cons:

Might be tricky to implement because of all the corner cases.
Brings Swarm's bugs and limitations into k2d too.

Utilizing Swarm overlay networks and DNS only

This actually already works if you first deploy a stack like this to Swarm:

version: "3.8"
services:
  pause:
    image: k8s.gcr.io/pause:3.9
    networks:
    - k2d_net
    deploy:
      mode: global
networks:
  k2d_net:
    name: k2d_net
    driver: overlay
    attachable: true

Each Swarm node would then still run a standalone k2d, but the containers deployed with it can find each other by DNS name and communicate inside that overlay network.

Supporting this case would be very simple; basically you just need to add the following logic:

If Swarm mode is enabled and the node is a Swarm manager -> create the k2d_net network with the overlay driver if it does not exist.
If Swarm mode is enabled and the node is a Swarm worker -> deploy containers without checking whether the k2d_net network exists.

It would then also be very simple to add namespace support, because instead of k2d_net you would create an overlay network named k2d_<namespace name>.
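
The logic above could be sketched roughly as follows. All function names here (`k2d_network_name`, `k2d_network_action`, `k2d_prepare_network`) are hypothetical and not part of k2d, and falling back to `k2d_net` when no namespace is given is an assumption. The Swarm state is read from `docker info`, using the `Swarm.LocalNodeState` and `Swarm.ControlAvailable` fields.

```shell
#!/bin/sh
# Sketch only: all function names are hypothetical, not part of k2d.

# Overlay network name for a namespace: k2d_<namespace name>.
# Assumption: with no namespace given, fall back to k2d_net.
k2d_network_name() {
  if [ -n "${1:-}" ]; then
    echo "k2d_$1"
  else
    echo "k2d_net"
  fi
}

# Decide what to do from the node's Swarm state.
#   $1: Swarm.LocalNodeState from `docker info` ("active" when in Swarm mode)
#   $2: Swarm.ControlAvailable from `docker info` ("true" on managers)
k2d_network_action() {
  if [ "$1" != "active" ]; then
    echo "standalone"        # Swarm mode off: keep the current behaviour
  elif [ "$2" = "true" ]; then
    echo "create-overlay"    # manager: create the overlay network if missing
  else
    echo "skip-check"        # worker: deploy without checking the network
  fi
}

# Apply the decision on the local node (needs a running Docker daemon).
k2d_prepare_network() {
  net="$(k2d_network_name "${1:-}")"
  state="$(docker info --format '{{.Swarm.LocalNodeState}}')"
  manager="$(docker info --format '{{.Swarm.ControlAvailable}}')"
  if [ "$(k2d_network_action "$state" "$manager")" = "create-overlay" ]; then
    docker network inspect "$net" >/dev/null 2>&1 ||
      docker network create --driver overlay --attachable "$net"
  fi
}
```

Calling `k2d_prepare_network prod` on a manager would create `k2d_prod` if it is missing; on a worker or a standalone node it does nothing. Keeping the decision in `k2d_network_action` separate from the Docker calls makes it checkable without a daemon.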

Pros:

Very simple to implement.
Does not introduce Swarm scheduler bugs.
Together with GitOps tools, it makes it possible to build a low-cost multi-device/multi-cluster highly available solution for applications which need better availability than a single device can provide.

Cons:

Users need to use a separate solution for multi-cluster configuration management.
Docker overlay network and internal DNS bugs/limitations are still there.

Bridge networks and custom service discovery

This is also already possible. Basically, the user needs to create a custom bridge network k2d_net without outgoing NAT and add static routes between the nodes like this:

# Node1
docker network create \
  --driver bridge \
  --subnet 192.168.101.0/24 \
  --gateway 192.168.101.1 \
  -o com.docker.network.bridge.enable_ip_masquerade=false \
  k2d_net
sudo route add -net 192.168.102.0/24 gw <NODE 2 IP>

# Node2 
docker network create \
  --driver bridge \
  --subnet 192.168.102.0/24 \
  --gateway 192.168.102.1 \
  -o com.docker.network.bridge.enable_ip_masquerade=false \
  k2d_net
sudo route add -net 192.168.101.0/24 gw <NODE 1 IP>

Then deploy k2d normally to each node, and the containers deployed with it can communicate using IP addresses.
For actual service discovery, however, something like https://github.com/kevinjqiu/coredns-dockerdiscovery is needed on top of this.

Pros:

Works already with k2d.
Very stable networking because no overlay networks are used (very similar to K8s + Calico without overlay).

Cons:

Users need to use a separate solution for multi-cluster configuration management.
Places a lot of requirements on infrastructure configuration (which might be a good or a bad thing depending on the use case).
No complete service discovery solution is available yet (or at least more testing is needed on how to do it).

EDIT: It looks like Tailscale, together with its split DNS, can solve both connectivity between nodes (even over the internet) and service discovery between nodes.

ncresswell · 2023-08-20T22:42:46Z

Thanks for this... noting, and will look into it for the BETA

deviantony · 2023-08-21T09:51:17Z

Thanks for the feedback @olljanat

One advantage of having direct Swarm support is that we don't have to re-implement any form of orchestration, such as container scheduling on nodes. The main benefit of using a multi-node cluster is to let the orchestrator handle container re-scheduling when something goes wrong on a node.

This is missing from options 2 and 3, which only implement cross-node communications.

I think it's acceptable to have Swarm limitations as long as users are aware of them, similar to the limitations we currently have on Docker standalone.

However, we need to carefully consider this issue. Initially, we should focus on enhancing our support for Docker standalone and later think about creating an interface with a specific implementation for Swarm.

olljanat · 2023-08-21T17:42:12Z

> Thanks for this... noting, and will look into it for the BETA

I'm happy I can help. I see a lot of potential in k2d, but also challenging corner cases, which is why you need to be careful when deciding which use cases will be supported.

> One advantage of having direct Swarm support is that we don't have to re-implement any form of orchestration, such as container scheduling on nodes.

To be honest, direct Swarm support would be my dream come true. Then all those K8s GitOps tools would be available for Swarm users too 🎉

> However, we need to carefully consider this issue. Initially, we should focus on enhancing our support for Docker standalone and later think about creating an interface with a specific implementation for Swarm.

This was the answer I expected to get, and one of the reasons I thought about those other options.

One thing to consider, of course, is how critical Podman support is, because even single-node environments can run Swarm mode. That would also provide better secrets handling; on top of it, only something similar to Bitnami Sealed Secrets would be needed and we would have a very nice GitOps solution. In addition, because of the latest developments in Docker, many K8s CSI plugins would be available for these too (see https://github.com/olljanat/csi-plugins-for-docker-swarm ).

This was referenced Aug 21, 2023

ContainerD Support #18

Open

Thank you for the initiative olljanat/csi-plugins-for-docker-swarm#1

Open

olljanat changed the title ~~About multi-node support implentation options~~ About multi-node support implementation options Aug 25, 2023
