# Set Up a K8s Cluster with Calico

The objective of this notebook is to set up a K8s cluster with the Calico CNI (Container Network Interface) plugin on the [Fabric Testbed](https://portal.fabric-testbed.net/), using Ubuntu 22.04 as the base OS.

It draws on the following tutorials:
- [ChatGPT guide](https://docs.google.com/document/d/14d6HMI5jW8NLFe0K4Yx1_bO0DV44ynKMJYQe71ikX7c)
- [Calico quickstart](https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart)
- [A video tutorial](https://youtu.be/k3iexxiYPI8)

## Preamble: get a Fabric slice

Our slice contains three nodes:
1. One control plane node, `cpnode`, which runs the [K8s control plane components](https://kubernetes.io/docs/concepts/overview/components/#control-plane-components).
2. Two worker nodes, `wknode1` and `wknode2`, which run the [K8s node components](https://kubernetes.io/docs/concepts/overview/components/#node-components).

### Define the node properties

We configure an L2 network on Fabric so that we can assign the IPv4 addresses manually.

In [1]:
# Define the network of the slice
FABRIC_NIC_STR = 'NIC_Basic'  # do not update
FABRIC_SUBNET_STR = "192.168.0.0/24"  # usable node IPs are 192.168.0.1-192.168.0.254
FABRIC_L2NET_STR = 'site_bridge_net'  # do not update

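The `cpnode` needs more resources than the workers, in particular extra storage, so we set its cores, RAM, and disk explicitly. The worker nodes omit these keys and fall back to the defaults.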

In [2]:
# Define the nodes of the slice
node_config = {
    'cpnode': {
        'ip':'192.168.0.1',
        'cores': 8,
        'ram': 24,
        'disk': 100 },
    'wknode1': {
        'ip':'192.168.0.2',
    },
    'wknode2': {
        'ip':'192.168.0.3',
    },
}
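
Because the addresses are assigned by hand, a quick sanity check before building the slice can catch typos. The following is a minimal sketch, not part of the original workflow: the helper name `check_node_ips` is hypothetical, and the configuration is repeated in a reduced form so that the block runs on its own.

```python
from ipaddress import IPv4Address, IPv4Network

def check_node_ips(node_config, subnet_str):
    """Return a list of problems found in the static IP assignments."""
    subnet = IPv4Network(subnet_str)
    usable = set(subnet.hosts())  # excludes the network and broadcast addresses
    problems, seen = [], {}
    for name, attr in node_config.items():
        ip = IPv4Address(attr['ip'])
        if ip not in usable:
            problems.append(f"{name}: {ip} is not a usable host address in {subnet}")
        if ip in seen:
            problems.append(f"{name}: {ip} is already assigned to {seen[ip]}")
        seen.setdefault(ip, name)
    return problems

# Same addresses as in the cells above, repeated so the sketch is self-contained
example_config = {
    'cpnode': {'ip': '192.168.0.1'},
    'wknode1': {'ip': '192.168.0.2'},
    'wknode2': {'ip': '192.168.0.3'},
}
print(check_node_ips(example_config, "192.168.0.0/24"))  # → []
```

An empty list means every address is unique and falls inside the subnet.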

### Fabric headers and helper functions

In [3]:
from datetime import datetime
from datetime import timezone
from datetime import timedelta

from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager

from ipaddress import ip_address, IPv4Address, IPv6Address, IPv4Network, IPv6Network
import ipaddress

import json

fablib = fablib_manager()
fablib.show_config()

0,1
Orchestrator,orchestrator.fabric-testbed.net
Credential Manager,cm.fabric-testbed.net
Core API,uis.fabric-testbed.net
Token File,/home/fabric/.tokens.json
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
Bastion Host,bastion.fabric-testbed.net
Bastion Username,xmei_0000124604
Bastion Private Key File,/home/fabric/work/fabric_config/fabric-bastion-key
Slice Public Key File,/home/fabric/work/fabric_config/slice_key.pub
Slice Private Key File,/home/fabric/work/fabric_config/slice_key


In [4]:
FABRIC_SITE_OVERRIDE = "UCSD"
FABRIC_SLICENAME_PREFIX = 'k8s_calico_'
FABRIC_OS_STR = 'default_ubuntu_22'

# Define the Fabric slice name with user_id as the suffix
user_info = fablib.get_user_info()
slice_name = FABRIC_SLICENAME_PREFIX + FABRIC_SITE_OVERRIDE + "_" + user_info['bastion_login']

# Write selected site into node attributes
for n in node_config:
    node_config[n]['site'] = FABRIC_SITE_OVERRIDE
    node_config[n]['image'] = FABRIC_OS_STR

Build the Fabric slice.

In [5]:
slice = fablib.new_slice(name=slice_name)

# Create the network
net1 = slice.add_l2network(name=FABRIC_L2NET_STR, subnet=IPv4Network(FABRIC_SUBNET_STR))

In [6]:
# Create nodes using subnet address assignment
skip_keys = ['ip']

nodes = dict()
for node_name, node_attr in node_config.items():
    print(f"{node_name=}, {node_attr['ip']}")
    nodes[node_name] = slice.add_node(
        name=node_name,
        **{x: node_attr[x] for x in node_attr if x not in skip_keys}
    )
    nic_interface = nodes[node_name].add_component(
        model=FABRIC_NIC_STR,
        name='_'.join([node_name, FABRIC_NIC_STR, 'nic'])
    ).get_interfaces()[0]
    net1.add_interface(nic_interface)
    nic_interface.set_mode('config')
    nic_interface.set_ip_addr(node_attr['ip'])

print(f'Creating a slice named "{slice_name}" with nodes in {FABRIC_SITE_OVERRIDE}')

node_name='cpnode', 192.168.0.1
node_name='wknode1', 192.168.0.2
node_name='wknode2', 192.168.0.3
Creating a slice named "k8s_calico_UCSD_xmei_0000124604" with nodes in UCSD


In [7]:
slice.submit()


Retry: 11, Time: 286 sec


0,1
ID,cfca4285-0a65-42b8-8eaa-c3b49e8c5876
Name,k8s_calico_UCSD_xmei_0000124604
Lease Expiration (UTC),2025-02-01 18:58:19 +0000
Lease Start (UTC),2025-01-31 18:58:19 +0000
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
State,StableOK


ID,Name,Cores,RAM,Disk,Image,Image Type,Host,Site,Username,Management IP,State,Error,SSH Command,Public SSH Key File,Private SSH Key File
9bb6372a-3033-40f7-934d-b392596a2cc3,cpnode,8,32,100,default_ubuntu_22,qcow2,ucsd-w1.fabric-testbed.net,UCSD,ubuntu,132.249.252.178,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@132.249.252.178,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
996643ee-8c87-476d-9c32-df55fc14132c,wknode1,2,8,10,default_ubuntu_22,qcow2,ucsd-w1.fabric-testbed.net,UCSD,ubuntu,132.249.252.141,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@132.249.252.141,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
7a5e1138-bdce-4add-901d-fc6fb7116c65,wknode2,2,8,10,default_ubuntu_22,qcow2,ucsd-w1.fabric-testbed.net,UCSD,ubuntu,132.249.252.162,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@132.249.252.162,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key


ID,Name,Layer,Type,Site,Subnet,Gateway,State,Error
09d94b49-4aa2-4a31-b1ce-bd0a19e257e5,site_bridge_net,L2,L2Bridge,UCSD,192.168.0.0/24,,Active,


Name,Short Name,Node,Network,Bandwidth,Mode,VLAN,MAC,Physical Device,Device,IP Address,Numa Node,Switch Port
cpnode-cpnode_NIC_Basic_nic-p1,p1,cpnode,site_bridge_net,100,config,,02:DC:BE:A5:82:1D,enp7s0,enp7s0,192.168.0.1,6,HundredGigE0/0/0/5
wknode1-wknode1_NIC_Basic_nic-p1,p1,wknode1,site_bridge_net,100,config,,06:13:70:7C:6E:FE,enp7s0,enp7s0,192.168.0.2,6,HundredGigE0/0/0/5
wknode2-wknode2_NIC_Basic_nic-p1,p1,wknode2,site_bridge_net,100,config,,0A:19:42:43:96:03,enp7s0,enp7s0,192.168.0.3,6,HundredGigE0/0/0/5



Time to print interfaces 286 seconds


'cfca4285-0a65-42b8-8eaa-c3b49e8c5876'

Retrieve the details of the existing slice.

In [8]:
slice = fablib.get_slice(name=slice_name)
slice.show()

nets = slice.list_networks()
nodes = slice.list_nodes()

cpnode = slice.get_node(name="cpnode")    
wknode1 = slice.get_node(name="wknode1")
wknode2 = slice.get_node(name="wknode2")

# Get node IP addresses
cpnode_addr = cpnode.get_interface(network_name=FABRIC_L2NET_STR).get_ip_addr()
wknode1_addr = wknode1.get_interface(network_name=FABRIC_L2NET_STR).get_ip_addr()
wknode2_addr = wknode2.get_interface(network_name=FABRIC_L2NET_STR).get_ip_addr()

wknode1_iface = wknode1.get_interface(network_name=FABRIC_L2NET_STR)
wknode2_iface = wknode2.get_interface(network_name=FABRIC_L2NET_STR)

print(f"{cpnode_addr = } \n{wknode1_addr = } \n{wknode2_addr = }")

0,1
ID,cfca4285-0a65-42b8-8eaa-c3b49e8c5876
Name,k8s_calico_UCSD_xmei_0000124604
Lease Expiration (UTC),2025-02-01 18:58:19 +0000
Lease Start (UTC),2025-01-31 18:58:19 +0000
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
State,StableOK


ID,Name,Layer,Type,Site,Subnet,Gateway,State,Error
09d94b49-4aa2-4a31-b1ce-bd0a19e257e5,site_bridge_net,L2,L2Bridge,UCSD,192.168.0.0/24,,Active,


ID,Name,Cores,RAM,Disk,Image,Image Type,Host,Site,Username,Management IP,State,Error,SSH Command,Public SSH Key File,Private SSH Key File
9bb6372a-3033-40f7-934d-b392596a2cc3,cpnode,8,32,100,default_ubuntu_22,qcow2,ucsd-w1.fabric-testbed.net,UCSD,ubuntu,132.249.252.178,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@132.249.252.178,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
996643ee-8c87-476d-9c32-df55fc14132c,wknode1,2,8,10,default_ubuntu_22,qcow2,ucsd-w1.fabric-testbed.net,UCSD,ubuntu,132.249.252.141,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@132.249.252.141,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
7a5e1138-bdce-4add-901d-fc6fb7116c65,wknode2,2,8,10,default_ubuntu_22,qcow2,ucsd-w1.fabric-testbed.net,UCSD,ubuntu,132.249.252.162,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@132.249.252.162,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key


cpnode_addr = IPv4Address('192.168.0.1') 
wknode1_addr = IPv4Address('192.168.0.2') 
wknode2_addr = IPv4Address('192.168.0.3')
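
The slice output above shows a lease that expires one day after it starts. If the cluster needs to live longer, the lease can be extended. The sketch below only builds the end-date string in the same format as the `Lease Expiration (UTC)` field; the helper name `lease_end_date` is hypothetical, and the commented-out `slice.renew(end_date)` call is an assumption about the fablib API that should be checked against the installed version.

```python
from datetime import datetime, timedelta, timezone

def lease_end_date(days, now=None):
    """Format a lease end date `days` from `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return (now + timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S %z")

end_date = lease_end_date(days=6)
print(end_date)

# On a live slice (assumption: Slice.renew accepts this string format):
# slice.renew(end_date)
```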


In [None]:
# Some helper functions
def execute_single_node(node, commands):
    """Run each command on one node and report anything written to stderr."""
    for command in commands:
        print(f'\tExecuting "{command}" on node {node.get_name()}')
        #stdout, stderr = node.execute(command, quiet=True, output_file=node.get_name() + '_install.log')
        stdout, stderr = node.execute(command)
        # Check stderr after every command, not only after the last one
        if stderr:
            print(f'Error encountered with "{command}": {stderr}')

def execute_commands(node, commands):
    if isinstance(node, list):
        for n in node:
            execute_single_node(n, commands)
    else:
        execute_single_node(node, commands)
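
The helpers above accept either a single node or a list of nodes. As a rough illustration of that calling pattern, the sketch below runs the same commands on two stand-in nodes and collects the errors instead of printing them. `FakeNode` and `collect_errors` are hypothetical names introduced for this example; `FakeNode` only mimics the `get_name()` and `execute()` calls used in this notebook so that the block runs without a slice.

```python
class FakeNode:
    """Hypothetical stand-in for a fablib node; it only records commands."""
    def __init__(self, name):
        self.name = name
        self.history = []

    def get_name(self):
        return self.name

    def execute(self, command):
        self.history.append(command)
        # Pretend that any command containing "fail" writes to stderr
        return ("", "simulated error" if "fail" in command else "")

def collect_errors(nodes, commands):
    """Run every command on every node; return (node, command, stderr) tuples."""
    if not isinstance(nodes, list):
        nodes = [nodes]
    errors = []
    for node in nodes:
        for command in commands:
            stdout, stderr = node.execute(command)
            if stderr:
                errors.append((node.get_name(), command, stderr))
    return errors

workers = [FakeNode("wknode1"), FakeNode("wknode2")]
errors = collect_errors(workers, ["sudo apt-get update", "fail-on-purpose"])
print(errors)
```

Note that a non-empty stderr does not always indicate a failure, since many tools write warnings there, so the collected entries are best treated as hints to inspect.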

## Start the control plane

In [None]:
try:
    file_attributes = cpnode.upload_file(local_file_path="config_control_plane.sh", remote_file_path="config_control_plane.sh")
    
    stdout, stderr = cpnode.execute(f"chmod +x config_control_plane.sh && ./config_control_plane.sh {FABRIC_SUBNET_STR} {cpnode_addr}")

except Exception as e:
    print(f"Exception: {e}")
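
Once the control plane is running, the worker nodes need a `kubeadm join` command containing the API server address, a token, and a CA certificate hash. Rather than copying it by hand, it can be requested from the control plane with `kubeadm token create --print-join-command`. The following is a minimal sketch under stated assumptions: `fetch_join_command` and `FakeControlPlane` are hypothetical names, the stand-in returns placeholder values instead of a real token, and the sketch assumes `kubeadm` is installed and initialized on the control plane by `config_control_plane.sh`.

```python
def fetch_join_command(cp):
    """Ask the control plane for a fresh `kubeadm join` command line."""
    stdout, stderr = cp.execute("sudo kubeadm token create --print-join-command")
    for line in stdout.splitlines():
        line = line.strip()
        if line.startswith("kubeadm join"):
            return "sudo " + line
    raise RuntimeError(f"No join command found in output: {stdout!r} {stderr!r}")

class FakeControlPlane:
    """Hypothetical stand-in so the sketch runs without a slice."""
    def execute(self, command):
        # Placeholder values only; a real cluster prints its own token and hash
        return ("kubeadm join 192.168.0.1:6443 --token <token> "
                "--discovery-token-ca-cert-hash sha256:<hash>\n", "")

print(fetch_join_command(FakeControlPlane()))
```

On a live slice, `fetch_join_command(cpnode)` would take the place of the stand-in call.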
 

## Start the worker nodes

In [None]:
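# Join command printed by `kubeadm init` on the control plane.
# The address, token, and hash are specific to one cluster; replace them with the values from your own run.
join_cmd = "sudo kubeadm join 10.146.4.2:6443 --token l255bp.wqi5i6br0jg7f4z2 --discovery-token-ca-cert-hash sha256:069646097e377bcd9e6d66ee7779a0d3e0930d95d1e6a79f73f22f70b1098b5e"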

try:
    # Loop through the worker nodes only; the control plane (cpnode) is configured separately
    for i in ["wknode1", "wknode2"]:
        node = globals()[i]
        print(f"\nConfiguring {i}...")
        
        # Upload and execute config script
        try:
            print(f"Uploading and executing config_worker_node.sh on {i}...")   
            file_attributes = node.upload_file(
                local_file_path="config_worker_node.sh", 
                remote_file_path="config_worker_node.sh"
            )
            # Use an f-string so Python substitutes join_cmd before the command is sent to the node
            stdout, stderr = node.execute(f"chmod +x config_worker_node.sh && ./config_worker_node.sh {join_cmd}")
            print(f"Config output for {i}:", stdout)
        except Exception as e:
            print(f"Failed to configure {i}: {e}")
            continue  # Skip to next node if configuration fails

except Exception as e:
    print(f"Main exception: {e}")