# TCP Congestion Control

In this notebook you will: 

-   Reserve resources for this experiment
-   Configure your reserved resources
-   Access your reserved resources over SSH
-   Retrieve files saved on FABRIC resources
-   Visualize the data retrieved
-   Extend your FABRIC reservation (in case you need more time) or delete it (in case you finish early)

### Exercise: Reserve resources

In this exercise, we will reserve resources on FABRIC: two hosts on two different network segments, connected by a router.

In [1]:
from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
fablib = fablib_manager() 
conf = fablib.show_config()

Field,Value
Credential Manager,cm.fabric-testbed.net
Orchestrator,orchestrator.fabric-testbed.net
Token File,/home/fabric/.tokens.json
Project ID,40ba6253-c257-4d9c-8658-7406bb437eb5
Bastion Username,mim7995_0000088864
Bastion Private Key File,/home/fabric/work/fabric_config/fabric_bastion_key
Bastion Host,bastion.fabric-testbed.net
Bastion Private Key Passphrase,
Slice Public Key File,/home/fabric/work/fabric_config/slice_key.pub
Slice Private Key File,/home/fabric/work/fabric_config/slice_key


In [2]:
!chmod 600 {fablib.get_bastion_key_filename()}
!chmod 600 {fablib.get_default_slice_private_key_file()}

In [3]:
import os
slice_name="CongestionAvoidance_" + os.getenv('NB_USER')

In [4]:
try:
    slice = fablib.get_slice(slice_name)
    print("You already have a slice by this name!")
    print("If you previously reserved resources, skip to the 'log in to resources' section.")
except Exception:
    print("You don't have a slice named %s yet." % slice_name)
    print("Continue to the next step to make one.")
    slice = fablib.new_slice(name=slice_name)

You don't have a slice named CongestionAvoidance_mim7995 yet.
Continue to the next step to make one.


Next, we’ll select a random FABRIC site for our experiment. We’ll make sure to get one that has sufficient capacity for the experiment we’re going to run.

Once we find a suitable site, we’ll print details about available resources at this site.

In [5]:
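# Requirements: 3 nodes x 2 cores each, and 4 NIC ports (2 links x 2 endpoints)
exp_requires = {'core': 3*2, 'nic': 4}
# Pick random sites until one has more than 20% headroom over both requirements
while True:
    site_name = fablib.get_random_site()
    if ( (fablib.resources.get_core_available(site_name) > 1.2*exp_requires['core']) and
        (fablib.resources.get_component_available(site_name, 'SharedNIC-ConnectX-6') > 1.2*exp_requires['nic']) ):
        break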

fablib.show_site(site_name)

Field,Value
Name,CERN
State,Active
Address,"Meyrin site, Auvergne-Rhone-Alpes, Metropolitan France, 01280, France"
Location,"(46.23783255, 6.048546179253215)"
Hosts,6
CPUs,12
Cores Available,120
Cores Capacity,768
Cores Allocated,648
RAM Available,1644


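The site-selection rule can be tried out without touching FABRIC. The following is a minimal sketch, assuming a hypothetical snapshot of per-site availability (the site names and numbers in `available` are made up for illustration) and a 20% multiplicative margin on both cores and NIC ports. The helper `has_headroom` is not part of fablib.

```python
# Hypothetical snapshot of per-site availability; real values come from FABRIC.
available = {
    "SITE_A": {"core": 120, "nic": 3},
    "SITE_B": {"core": 6, "nic": 40},
    "SITE_C": {"core": 64, "nic": 12},
}

# Same requirement as in the experiment: 3 nodes x 2 cores, and 4 NIC ports.
exp_requires = {"core": 3 * 2, "nic": 4}


def has_headroom(site, requires, margin=1.2):
    """Return True if every resource exceeds its requirement by the margin."""
    return all(site[key] > margin * needed for key, needed in requires.items())


# Keep only the sites that satisfy both the core and the NIC requirement.
suitable = sorted(
    name for name, res in available.items() if has_headroom(res, exp_requires)
)
print(suitable)
```

Only `SITE_C` passes here: `SITE_A` has plenty of cores but too few NIC ports, and `SITE_B` has the opposite problem. A site must satisfy every requirement at once.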

In [6]:
# this cell sets up the hosts and router
node_names = ["juliet", "router", "romeo"]
for n in node_names:
    slice.add_node(name=n, site=site_name, cores=2, ram=4, disk=10, image='default_ubuntu_20')

In [7]:
# this cell sets up the network links
nets = [
    {"name": "net0",   "nodes": ["juliet", "router"]},
    {"name": "net1",  "nodes": ["router", "romeo"]}
]
for n in nets:
    ifaces = [slice.get_node(node).add_component(model="NIC_Basic", name=n["name"]).get_interfaces()[0] for node in n['nodes'] ]
    slice.add_l2network(name=n["name"], type='L2Bridge', interfaces=ifaces)
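Later cells refer to interfaces by names such as `juliet-net0-p1`. As a minimal sketch, assuming the `<node>-<component>-p1` naming pattern that appears in the interface table of this notebook's output (inferred from that output, not a documented guarantee), we can derive the expected names from the same `nets` list. The helper `expected_interface_names` is hypothetical.

```python
# Same link description as in the cell above.
nets = [
    {"name": "net0", "nodes": ["juliet", "router"]},
    {"name": "net1", "nodes": ["router", "romeo"]},
]


def expected_interface_names(nets, port="p1"):
    """List '<node>-<network>-<port>' for every node attached to every network."""
    return [f"{node}-{net['name']}-{port}" for net in nets for node in net["nodes"]]


names = expected_interface_names(nets)
print(names)
```

The router appears twice because it has one NIC on each network segment.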

The following cell submits our request to the FABRIC site. The output of this cell will update automatically as the status of our request changes.

While it is being prepared, the “State” of the slice will appear as “Configuring”.
When it is ready, the “State” of the slice will change to “StableOK”.

In [8]:
slice.submit()


Retry: 8, Time: 294 sec


Field,Value
ID,f3ec234a-a8b8-44c1-a07b-c9231897f7d3
Name,CongestionAvoidance_mim7995
Lease Expiration (UTC),2023-07-19 11:58:03 +0000
Lease Start (UTC),2023-07-18 11:58:03 +0000
Project ID,40ba6253-c257-4d9c-8658-7406bb437eb5
State,StableOK


ID,Name,Cores,RAM,Disk,Image,Image Type,Host,Site,Username,Management IP,State,Error,SSH Command,Public SSH Key File,Private SSH Key File
3c05620a-6b16-4160-95ee-353d070b82ad,juliet,2,4,10,default_ubuntu_20,qcow2,cern-w2.fabric-testbed.net,CERN,ubuntu,2001:400:a100:3090:f816:3eff:fe9d:63ae,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@2001:400:a100:3090:f816:3eff:fe9d:63ae,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
b39283dc-78d7-48c0-9865-737ea8e31650,romeo,2,4,10,default_ubuntu_20,qcow2,cern-w2.fabric-testbed.net,CERN,ubuntu,2001:400:a100:3090:f816:3eff:fe16:2c2e,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@2001:400:a100:3090:f816:3eff:fe16:2c2e,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
d2b5c64a-fce8-4a57-b3c0-9f538a6c4d22,router,2,4,10,default_ubuntu_20,qcow2,cern-w5.fabric-testbed.net,CERN,ubuntu,2001:400:a100:3090:f816:3eff:fea7:b0c9,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@2001:400:a100:3090:f816:3eff:fea7:b0c9,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key


ID,Name,Layer,Type,Site,Subnet,Gateway,State,Error
729edd1a-aef6-410f-a9e0-dc7f42e9a687,net0,L2,L2Bridge,CERN,,,Active,
3bdd57ff-e299-4949-854b-4ce5f560bd6f,net1,L2,L2Bridge,CERN,,,Active,


Name,Short Name,Node,Network,Bandwidth,Mode,VLAN,MAC,Physical Device,Device,IP Address,Numa Node
juliet-net0-p1,p1,juliet,net0,100,config,,0E:D9:C9:B3:C3:76,ens7,ens7,,4
router-net0-p1,p1,router,net0,100,config,,16:4C:CA:94:B9:B2,ens8,ens8,,4
router-net1-p1,p1,router,net1,100,config,,0E:E6:8D:DD:E7:2C,ens7,ens7,,4
romeo-net1-p1,p1,romeo,net1,100,config,,12:2A:3F:58:CA:55,ens7,ens7,,4



Time to print interfaces 302 seconds


'f3ec234a-a8b8-44c1-a07b-c9231897f7d3'

In [9]:
slice.wait_ssh(progress=True)

Waiting for slice . Slice state: StableOK
Waiting for ssh in slice . ssh successful


True

### Exercise: Configure resources

Next, we need to configure our resources - assign IP addresses to network interfaces, enable forwarding on the router, and install any necessary software.

First, we’ll configure IP addresses and add the IP addresses and hostnames to the host files:

In [10]:
from ipaddress import ip_address, IPv4Address, IPv4Network

if_conf = {
    "romeo-net1-p1":   {"addr": "10.10.1.100", "subnet": "10.10.1.0/24", "hostname": "romeo"},
    "router-net1-p1":  {"addr": "10.10.1.1", "subnet": "10.10.1.0/24", "hostname": "router"},
    "router-net0-p1":  {"addr": "10.10.2.1", "subnet": "10.10.2.0/24", "hostname": "router"},
    "juliet-net0-p1":  {"addr": "10.10.2.100", "subnet": "10.10.2.0/24", "hostname": "juliet"}
}

for iface in slice.get_interfaces():
    if_name = iface.get_name()
    iface.ip_addr_add(addr=if_conf[if_name]['addr'], subnet=IPv4Network(if_conf[if_name]['subnet']))

slice.get_node(name='romeo').execute(f"echo '{if_conf['juliet-net0-p1']['addr']}\t{if_conf['juliet-net0-p1']['hostname']}' | sudo tee -a /etc/hosts > /dev/null")
slice.get_node(name='juliet').execute(f"echo '{if_conf['romeo-net1-p1']['addr']}\t{if_conf['romeo-net1-p1']['hostname']}' | sudo tee -a /etc/hosts > /dev/null")
slice.get_node(name='router').execute(f"echo '{if_conf['romeo-net1-p1']['addr']}\t{if_conf['romeo-net1-p1']['hostname']}' | sudo tee -a /etc/hosts > /dev/null")
slice.get_node(name='router').execute(f"echo '{if_conf['juliet-net0-p1']['addr']}\t{if_conf['juliet-net0-p1']['hostname']}' | sudo tee -a /etc/hosts > /dev/null")

('', '')
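Before pushing addresses to the nodes, it can help to sanity-check the address plan locally. The following is a minimal sketch using only the standard `ipaddress` module. The helper `check_address_plan` is hypothetical and checks three things: each address lies inside its subnet, it is not the network or broadcast address, and no address is used twice.

```python
from ipaddress import IPv4Address, IPv4Network

# Same address plan as in the cell above.
if_conf = {
    "romeo-net1-p1":  {"addr": "10.10.1.100", "subnet": "10.10.1.0/24", "hostname": "romeo"},
    "router-net1-p1": {"addr": "10.10.1.1",   "subnet": "10.10.1.0/24", "hostname": "router"},
    "router-net0-p1": {"addr": "10.10.2.1",   "subnet": "10.10.2.0/24", "hostname": "router"},
    "juliet-net0-p1": {"addr": "10.10.2.100", "subnet": "10.10.2.0/24", "hostname": "juliet"},
}


def check_address_plan(conf):
    """Return a list of human-readable problems found in an address plan."""
    problems = []
    seen = {}  # address -> interface name that first used it
    for if_name, entry in conf.items():
        addr = IPv4Address(entry["addr"])
        subnet = IPv4Network(entry["subnet"])
        if addr not in subnet:
            problems.append(f"{if_name}: {addr} is outside {subnet}")
        elif addr in (subnet.network_address, subnet.broadcast_address):
            problems.append(f"{if_name}: {addr} is reserved in {subnet}")
        if addr in seen:
            problems.append(f"{if_name}: {addr} already used by {seen[addr]}")
        seen[addr] = if_name
    return problems


problems = check_address_plan(if_conf)
print(problems or "address plan looks consistent")
```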

Let’s make sure that all of the network interfaces are brought up:

In [11]:
for iface in slice.get_interfaces():
    iface.ip_link_up()

And, we’ll enable IP forwarding on the router:

In [12]:
for n in ['router']:
    slice.get_node(name=n).execute("sudo sysctl -w net.ipv4.ip_forward=1")

net.ipv4.ip_forward = 1


Then, we’ll add routes so that romeo knows how to reach juliet, and vice versa.

In [13]:
rt_conf = [
    {"name": "romeo",   "addr": "10.10.2.0/24", "gw": "10.10.1.1"},
    {"name": "juliet",  "addr": "10.10.1.0/24", "gw": "10.10.2.1"}
]
for rt in rt_conf:
    slice.get_node(name=rt['name']).ip_route_add(subnet=IPv4Network(rt['addr']), gateway=rt['gw'])
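A static route only works if its gateway is reachable on a network the host is directly attached to. The following is a minimal sketch of that check, assuming the address plan used earlier in this notebook. The `attached` mapping and the helper `route_is_sane` are hypothetical and exist only for this illustration.

```python
from ipaddress import IPv4Address, IPv4Network

# Directly attached subnet of each host, taken from the address plan.
attached = {
    "romeo": IPv4Network("10.10.1.0/24"),
    "juliet": IPv4Network("10.10.2.0/24"),
}

# Same routes as in the cell above.
rt_conf = [
    {"name": "romeo",  "addr": "10.10.2.0/24", "gw": "10.10.1.1"},
    {"name": "juliet", "addr": "10.10.1.0/24", "gw": "10.10.2.1"},
]


def route_is_sane(route, attached):
    """Check that the gateway is on the host's own subnet and that the
    destination is not a network the host is already attached to."""
    local = attached[route["name"]]
    gateway = IPv4Address(route["gw"])
    destination = IPv4Network(route["addr"])
    return gateway in local and not destination.overlaps(local)


results = {rt["name"]: route_is_sane(rt, attached) for rt in rt_conf}
print(results)
```

Both routes pass: romeo reaches juliet's subnet through the router's address on net1, and juliet reaches romeo's subnet through the router's address on net0.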


Finally, we’ll install some software. For this experiment, we will need to install the net-tools package (which provides the ifconfig command).

In [14]:
for n in ['romeo', 'router', 'juliet']:
    slice.get_node(name=n).execute("sudo apt update; sudo apt -y install net-tools", quiet=True)

### Exercise: Log in to resources
Now, we are finally ready to log in to our resources over SSH! Run the following cell and observe the table output: you will see an SSH command for each of the nodes in your topology.

In [15]:
import pandas as pd
pd.set_option('display.max_colwidth', None)
ssh_str = 'ssh -i ' + slice.get_slice_private_key_file() + \
    ' -J ' + fablib.get_bastion_username() + '@' + fablib.get_bastion_public_addr() + \
    ' -F /home/fabric/work/fabric_config/ssh_config '
slice_info = [{'Name': n.get_name(), 'SSH command': ssh_str + n.get_username() + '@' + str(n.get_management_ip())} for n in slice.get_nodes()]
pd.DataFrame(slice_info).set_index('Name')

Name,SSH command
juliet,ssh -i /home/fabric/work/fabric_config/slice_key -J mim7995_0000088864@bastion.fabric-testbed.net -F /home/fabric/work/fabric_config/ssh_config ubuntu@2001:400:a100:3090:f816:3eff:fe9d:63ae
router,ssh -i /home/fabric/work/fabric_config/slice_key -J mim7995_0000088864@bastion.fabric-testbed.net -F /home/fabric/work/fabric_config/ssh_config ubuntu@2001:400:a100:3090:f816:3eff:fea7:b0c9
romeo,ssh -i /home/fabric/work/fabric_config/slice_key -J mim7995_0000088864@bastion.fabric-testbed.net -F /home/fabric/work/fabric_config/ssh_config ubuntu@2001:400:a100:3090:f816:3eff:fe16:2c2e
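Each SSH command has the same shape: a private key (`-i`), a jump through the bastion (`-J`), an SSH config file (`-F`), and the target `user@address`. The following is a minimal sketch that assembles such a command from an argument list with the standard `shlex` module instead of string concatenation. The helper `build_ssh_command`, the bastion username `example_user`, and the IPv6 address from the documentation range are placeholders, not values from a real slice.

```python
import shlex


def build_ssh_command(key_file, bastion_user, bastion_host, ssh_config,
                      username, management_ip):
    """Assemble a single-line SSH command that jumps through a bastion host."""
    args = [
        "ssh",
        "-i", key_file,                          # slice private key
        "-J", f"{bastion_user}@{bastion_host}",  # jump host
        "-F", ssh_config,                        # SSH client configuration
        f"{username}@{management_ip}",           # target node
    ]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


cmd = build_ssh_command(
    key_file="/home/fabric/work/fabric_config/slice_key",
    bastion_user="example_user",       # placeholder bastion username
    bastion_host="bastion.fabric-testbed.net",
    ssh_config="/home/fabric/work/fabric_config/ssh_config",
    username="ubuntu",
    management_ip="2001:db8::1",       # placeholder IPv6 address
)
print(cmd)
```

Building the command from a list keeps it on a single line and avoids mistakes with missing spaces between options.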


Now, you can open an SSH session on any of the nodes as follows:

-   In Jupyter, from the menu bar, use File \> New \> Terminal to open a new terminal.
-   Copy an SSH command from the table, and paste it into the terminal. (Note that each SSH command is a single line, even if the display wraps the text to a second line! When you copy and paste it, paste it all together.)

You can repeat this process (open several terminals) to start a session on each host and the router. Each terminal session will have a tab in the Jupyter environment, so that you can easily switch between them.

Now you can continue to perform the TCP congestion control experiment on these host sessions.

### Exercise: Extend the slice's end time

In [16]:
# Check the current end time of your slice in the output of the following cell:
slice.show()

Field,Value
ID,f3ec234a-a8b8-44c1-a07b-c9231897f7d3
Name,CongestionAvoidance_mim7995
Lease Expiration (UTC),2023-07-19 11:58:03 +0000
Lease Start (UTC),2023-07-18 11:58:03 +0000
Project ID,40ba6253-c257-4d9c-8658-7406bb437eb5
State,StableOK



In [17]:
from datetime import datetime
from datetime import timezone
from datetime import timedelta

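# Set the end date to 3 days from now. Requests longer than two weeks are rejected
# unless the project has the Slice.NoLimitLifetime tag; the error recorded below
# came from a run that used timedelta(days=15).
end_date = (datetime.now(timezone.utc) + timedelta(days=3)).strftime("%Y-%m-%d %H:%M:%S %z")
slice.renew(end_date)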

Exception: Failed to renew slice: Status.FAILURE, (500)
Reason: INTERNAL SERVER ERROR
HTTP response headers: HTTPHeaderDict({'Server': 'nginx/1.21.6', 'Date': 'Tue, 18 Jul 2023 12:04:52 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '338', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'DNT, User-Agent, X-Requested-With, If-Modified-Since, Cache-Control, Content-Type, Range, Authorization', 'Access-Control-Allow-Methods': 'GET, POST, PUT, PATCH, DELETE, OPTIONS', 'Access-Control-Allow-Origin': '*', 'Access-Control-Expose-Headers': 'Content-Length, Content-Range, X-Error', 'X-Error': 'PDP Authorization check failed - Policy Violation: Your project is lacking Slice.NoLimitLifetime tag so you cannot renew resource lifetime by longer than two weeks.'})
HTTP response body: b'{\n    "errors": [\n        {\n            "details": "PDP Authorization check failed - Policy Violation: Your project is lacking Slice.NoLimitLifetime tag so you cannot renew resource lifetime by longer than two weeks.",\n            "message": "Internal Server Error"\n        }\n    ],\n    "size": 1,\n    "status": 500,\n    "type": "error"\n}'
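The error above says that a project without the `Slice.NoLimitLifetime` tag cannot extend a slice by more than two weeks. The following is a minimal sketch that caps the requested extension before formatting the end date. It assumes the limit is 14 days counted from the current time, which the error message suggests but does not spell out. The helper `renewal_end_date` and the constant `MAX_EXTENSION` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption drawn from the policy error message: at most two weeks.
MAX_EXTENSION = timedelta(days=14)


def renewal_end_date(requested_days, now=None):
    """Return an end-date string, capping the extension at MAX_EXTENSION."""
    now = now or datetime.now(timezone.utc)
    extension = min(timedelta(days=requested_days), MAX_EXTENSION)
    return (now + extension).strftime("%Y-%m-%d %H:%M:%S %z")


# A fixed reference time makes the example reproducible.
fixed_now = datetime(2023, 7, 18, 12, 0, 0, tzinfo=timezone.utc)
print(renewal_end_date(3, now=fixed_now))   # 2023-07-21 12:00:00 +0000
print(renewal_end_date(15, now=fixed_now))  # capped: 2023-08-01 12:00:00 +0000
```

The resulting string has the same format as the `end_date` passed to `slice.renew` in the cell above.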


In [None]:
# Confirm the new end time of your slice in the output of the following cell:
slice.show()

### Delete your slice resources

If you finished your experiment early, you should delete your slice! The following cell deletes all the resources in your slice, freeing them for other experimenters.

In [None]:
slice.delete()

In [None]:
slice.show()