# BGP Reconvergence Experiment

A BGP experiment consists of turning off an active control interface (that is, not an interface on a compute node or an interface on a leaf attached to a compute subnet). The failure triggers BGP UPDATE messages that withdraw the affected routes, and the resulting behavior is collected via:

1. Log files that show how the FRR BGP implementation handles the updates.
2. Packet captures that collect the BGP UPDATE messages that the BGP implementation uses to make modifications.

The steps this notebook takes are as follows:

1. <span style="color: #de4815"><b>Store test infrastructure information</b></span>

2. <span style="color: #de4815"><b>Clear all existing bgpd (FRR BGP-4 daemon) logs</b></span>
---
```bash
sudo truncate -s 0 /location/of/frr/bgpd/log # Delete the bgpd UPDATE log entries
sudo rm /location/of/scripts/captures        # Delete the bgpd traffic entries
sudo rm /location/of/scripts/logs            # Delete additional bgpd-related logs
```
---

3. <span style="color: #de4815"><b>Bring the interface down.</b></span>

This can be accomplished in two ways: either by already knowing the interface name (ethX) or by querying FABRIC to determine the interface name.

---
```bash
sudo ip link set dev ethX down # X = interface number (ex: X = 1, eth1)
```
---

4. <span style="color: #de4815"><b>Collect the logs</b></span>
---
```bash
# Route withdraw
2024/04/13 19:57:01.336 BGP: [PAPP6-VDAWM] 172.16.8.1(S-1-1) rcvd UPDATE about 192.168.2.0/24 IPv4 unicast -- withdrawn

# Route announcement?
2024/04/13 19:56:58.639 BGP: [Z38CW-7NYWG] group_announce_route_walkcb: afi=IPv4, safi=unicast, p=192.168.3.0/24
```
---
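
The two sample entries in step 4 suggest how withdrawn routes could be pulled out of a bgpd log once it has been collected. The following is a minimal sketch, not part of the experiment scripts: the function name `parse_withdrawals` and the regular expression are assumptions modeled only on the sample withdraw line above, and other FRR versions or logging settings may format entries differently.

```python
import re
from datetime import datetime

# Pattern modeled on the sample "rcvd UPDATE ... -- withdrawn" entry above.
# This is an assumption about the log format, not an FRR-defined grammar.
WITHDRAW_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) BGP: \[(?P<code>[^\]]+)\] "
    r"(?P<peer>\S+?)(?:\((?P<peer_name>[^)]*)\))? rcvd UPDATE about "
    r"(?P<prefix>\S+) .*-- withdrawn$"
)


def parse_withdrawals(lines):
    """Return (timestamp, peer, prefix) tuples for each withdrawn-route entry."""
    results = []
    for line in lines:
        match = WITHDRAW_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match["ts"], "%Y/%m/%d %H:%M:%S.%f")
            results.append((ts, match["peer"], match["prefix"]))
    return results


# The two sample log lines from step 4; only the first is a withdrawal.
sample = [
    "2024/04/13 19:57:01.336 BGP: [PAPP6-VDAWM] 172.16.8.1(S-1-1) rcvd UPDATE about 192.168.2.0/24 IPv4 unicast -- withdrawn",
    "2024/04/13 19:56:58.639 BGP: [Z38CW-7NYWG] group_announce_route_walkcb: afi=IPv4, safi=unicast, p=192.168.3.0/24",
]

for ts, peer, prefix in parse_withdrawals(sample):
    print(ts.isoformat(), peer, prefix)
```

Lines that do not match the pattern are skipped silently, so the announcement-style entry is ignored here rather than raising an error.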

## Infrastructure Information

In [None]:
# Slice information
SLICE_NAME = "1pod_test_bgp"
NETWORK_NODE_PREFIXES = "T,S,L"
COMPUTE_NODE_PREFIXES = "C"

# Failure point
NODE_TO_FAIL_INTF = "L-1"
INTF_IS_ETH = False
INTF_NAME = None
NEIGHBOR_LOST = "T-1"

# Local directory location (where to download remote logs)
LOG_DIR_PATH = "../logs/first_test_logs"

In [None]:
import os
import time

# Remote log locations
LOG_FRR_NAME = "/var/log/frr/bgpd.log"
LOG_CAP_NAME = "~/bgp_scripts/bgp_update_only.pcap"
LOG_OVERHEAD_NAME = "~/bgp_scripts/overhead.txt"
# TRAFFIC_RESULTS
# INTF_DOWN_TIME


# Create the log directory and each subdirectory if they do not already exist.
# exist_ok=True also covers the case where LOG_DIR_PATH exists but a subdirectory is missing.
subdirs = ["captures", "overhead", "convergence"]
for subdir in subdirs:
    os.makedirs(os.path.join(LOG_DIR_PATH, subdir), exist_ok=True)

In [None]:
from FabUtils import FabOrchestrator

try:
    manager = FabOrchestrator(SLICE_NAME)
    
except Exception as e:
    print(f"Exception: {e}")

## Run Experiment

In [None]:
# Start data collection
startLoggingCmd = "bash ~/bgp_scripts/bgp_data_collection.sh"
manager.executeCommandsParallel(startLoggingCmd, prefixList=NETWORK_NODE_PREFIXES)
print("BGP data collection started.")

In [None]:
print("Giving the nodes time to get configured...")
time.sleep(10)

In [None]:
# Take the interface down
if INTF_IS_ETH:
    # The interface name (ethX) is already known and set in INTF_NAME
    intfName = INTF_NAME
else:
    fabricIntf = manager.slice.get_interface(f"{NODE_TO_FAIL_INTF}-intf-{NEIGHBOR_LOST}-p1")
    intfName = fabricIntf.get_device_name()

failIntfCmd = f"sudo ip link set dev {intfName} down"

# Run this command only on node NODE_TO_FAIL_INTF 
manager.executeCommandsParallel(failIntfCmd, prefixList=NODE_TO_FAIL_INTF)

In [None]:
print("Giving the nodes time to reconverge...")
time.sleep(10)

In [None]:
stopLoggingCmd = "tmux kill-session -t bgp"
manager.executeCommandsParallel(stopLoggingCmd, prefixList=NETWORK_NODE_PREFIXES)
print("BGP data collection stopped.")

## Collect Logs

In [None]:
# Download BGP message capture file
overheadCaptureFileLocation = "/home/rocky/bgp_scripts/bgp_update_only.pcap"
manager.downloadFilesParallel(os.path.join(LOG_DIR_PATH, "captures", "{name}_update.pcap"), overheadCaptureFileLocation, prefixList=NETWORK_NODE_PREFIXES, addNodeName=True)

# Download BGP traffic overhead analysis file
overheadLogFileLocation = "/home/rocky/bgp_scripts/overhead.log"
manager.downloadFilesParallel(os.path.join(LOG_DIR_PATH, "overhead", "{name}_overhead.log"), overheadLogFileLocation, prefixList=NETWORK_NODE_PREFIXES, addNodeName=True)
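
With the logs stored locally, one way to estimate reconvergence time is to measure the gap between the interface failure and the last route withdrawal recorded by bgpd. The sketch below is illustrative only and rests on several assumptions: the bgpd log has been copied locally (this notebook does not download it), its timestamps follow the `YYYY/MM/DD HH:MM:SS.mmm BGP:` form of the sample entries shown earlier, and the failure time was recorded on the same clock as the log. The function name `estimate_reconvergence` and the failure timestamp are hypothetical.

```python
import re
from datetime import datetime

# Timestamp prefix assumed from the sample bgpd entries in this notebook.
TS_RE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) BGP: ")
TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f"


def estimate_reconvergence(log_lines, failure_time):
    """Seconds from failure_time to the last 'withdrawn' entry at or after it.

    Returns None when no such entry is found.
    """
    last = None
    for line in log_lines:
        if "withdrawn" not in line:
            continue
        match = TS_RE.match(line.strip())
        if not match:
            continue
        ts = datetime.strptime(match.group(1), TS_FORMAT)
        if ts >= failure_time and (last is None or ts > last):
            last = ts
    if last is None:
        return None
    return (last - failure_time).total_seconds()


# Hypothetical failure time; in practice it would be recorded when the
# "ip link set ... down" command is issued.
failure = datetime(2024, 4, 13, 19, 57, 0)
log_lines = [
    "2024/04/13 19:57:01.336 BGP: [PAPP6-VDAWM] 172.16.8.1(S-1-1) rcvd UPDATE about 192.168.2.0/24 IPv4 unicast -- withdrawn",
]
print(estimate_reconvergence(log_lines, failure))
```

Treating the last withdrawal as the end of reconvergence is a simplification. Announcements of alternate paths or best-path changes may continue after it, so the result is better read as a lower bound.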

## Cleanup

In [None]:
# Bring the interface back up.
restoreIntfCmd = f"sudo ip link set dev {intfName} up"

# Run this command only on node NODE_TO_FAIL_INTF 
manager.executeCommandsParallel(restoreIntfCmd, prefixList=NODE_TO_FAIL_INTF)