# TCP Control Tutorial
<i>Adapted for use with FABRIC from [TCP congestion control](https://witestlab.poly.edu/blog/tcp-congestion-control-basics/)</i>
        
<b> Prerequisites  </b>
    
* You need to have your FABRIC bastion host key pair set up to do this tutorial. If you have not already set this up, follow steps 1-3 at https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/.
* You are comfortable using ssh and executing basic commands in a UNIX shell. [Tips about how to log in to hosts.](https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/)

<img src="./figures/tcpfigure.svg">

<i>Figure source: [Wikimedia Commons](https://commons.wikimedia.org/wiki/File:TCP_Slow-Start_and_Congestion_Avoidance.svg)</i>
    
TCP is one of the most widely used internet protocols and carries more traffic than any other transport-layer protocol. Because of this, TCP implements a congestion control algorithm to prevent the slowdowns caused by over-utilization of the network.

Additive-Increase/Multiplicative-Decrease (AIMD) is the primary mechanism for adjusting the rate of a TCP flow. A sender transmitting TCP packets on a network reacts to events by increasing its sending rate (when the network is under-utilized) or decreasing it (when the network is over-utilized).
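
To make the AIMD rule concrete, here is a minimal sketch in Python. It is a toy model, not how the kernel implements TCP: the function names, the additive step of 1, the multiplicative factor of 0.5, and the fixed link capacity are all assumptions chosen for illustration.

```python
def aimd_step(rate, congested, increase=1.0, decrease=0.5):
    """Return the next sending rate under AIMD (illustrative parameters)."""
    if congested:
        # Multiplicative decrease: cut the rate by a constant factor.
        return rate * decrease
    # Additive increase: probe for spare capacity one unit at a time.
    return rate + increase


def simulate_aimd(capacity, rounds, start=1.0):
    """Record the rate at each round; congestion is signalled above capacity."""
    rate = start
    history = []
    for _ in range(rounds):
        history.append(rate)
        rate = aimd_step(rate, congested=rate > capacity)
    return history


trace = simulate_aimd(capacity=10.0, rounds=20)
print(trace)
```

The printed trace climbs in steps of one until it overshoots the assumed capacity, halves, and then climbs again, which is the sawtooth shape commonly associated with AIMD.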
    
A Congestion Window (CWND) is used to control the sending rate. TCP maintains a CWND for each connection, which limits the number of unacknowledged packets in transit on the network. If the number of unacknowledged packets equals the CWND, sending stops until more acknowledgements are received. The sending rate is not controlled by the CWND alone, because TCP uses flow control as well as congestion control.
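
The interaction between the CWND and flow control can be sketched as follows. This is a simplified model that counts whole segments; `sendable_segments` and the name `rwnd` (the window advertised by the receiver for flow control) are illustrative names, not part of any library.

```python
def sendable_segments(cwnd, rwnd, in_flight):
    """How many more segments the sender may transmit right now.

    cwnd      -- congestion window, in segments
    rwnd      -- receiver's advertised window, in segments (flow control)
    in_flight -- segments sent but not yet acknowledged
    """
    # The effective window is the smaller of the two limits.
    window = min(cwnd, rwnd)
    # Never return a negative count if the window has just shrunk.
    return max(0, window - in_flight)


print(sendable_segments(cwnd=10, rwnd=20, in_flight=10))  # window is full
print(sendable_segments(cwnd=10, rwnd=20, in_flight=4))   # room for more
print(sendable_segments(cwnd=10, rwnd=5, in_flight=3))    # receiver is the limit
```

In the first call the unacknowledged segments equal the CWND, so nothing more can be sent until an ACK arrives; in the third call the receiver's window, not the CWND, is the binding limit.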
    
Slow start is in effect at the beginning of a connection. During this phase the CWND is increased by the number of segments acknowledged each time an ACK is received, which results in exponential growth. This continues until a loss event occurs or the slow start threshold is reached. At the end of slow start, congestion avoidance begins, where the CWND grows linearly.
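
The two growth phases can be sketched per round-trip time (RTT). This is an idealized model with no losses: it assumes every segment in the window is acknowledged once per RTT, so the CWND doubles during slow start and grows by one segment per RTT afterwards. Capping the window at the threshold when doubling would overshoot is a simplification made here for readability.

```python
def cwnd_per_rtt(rtts, ssthresh, initial_cwnd=1):
    """Return the CWND (in segments) at the start of each RTT, without loss."""
    cwnd = initial_cwnd
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd < ssthresh:
            # Slow start: one extra segment per ACK doubles the window each RTT.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: roughly one extra segment per RTT.
            cwnd += 1
    return trace


print(cwnd_per_rtt(rtts=8, ssthresh=16))
```

With a threshold of 16 segments, the window doubles for the first four RTTs and then increases by one segment per RTT.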
    
Fast recovery (in TCP Reno) occurs when congestion is detected through duplicate ACKs. The slow start threshold is set to half of the current CWND, and the CWND is then reduced to that threshold instead of falling all the way back to slow start.
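
As a rough sketch of how a Reno-style sender reacts to the two kinds of loss signal, consider the function below. It is deliberately simplified: it ignores the temporary window inflation that happens while fast recovery is in progress, works in whole segments, and uses a floor of two segments for the threshold. The function name and event labels are hypothetical.

```python
def reno_on_loss(cwnd, event):
    """Return (new_cwnd, new_ssthresh) after a loss signal, in segments."""
    # In both cases the threshold drops to half the current window.
    ssthresh = max(cwnd // 2, 2)
    if event == "triple_dup_ack":
        # Fast recovery: continue from the threshold in congestion avoidance.
        new_cwnd = ssthresh
    elif event == "timeout":
        # Retransmission timeout: restart from slow start.
        new_cwnd = 1
    else:
        raise ValueError(f"unknown loss event: {event}")
    return new_cwnd, ssthresh


print(reno_on_loss(20, "triple_dup_ack"))
print(reno_on_loss(20, "timeout"))
```

The comparison shows why fast recovery matters: after duplicate ACKs the window is only halved, whereas after a timeout the sender has to rebuild it from one segment.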

## 1. Design the Experiment
In this section, be sure to follow any instructions marked **"Do this"** in addition to running the code blocks.
### 1.1 Reserve Resources

#### Import the Fabric API

In [None]:
import json
import traceback

from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager

# Create the FABlib manager and print the active configuration
fablib = fablib_manager()

fablib.show_config()

#### Create slice

In [None]:
try:
    #Create Slice
    slice = fablib.new_slice(name="tcpControl")
    
    #Router
    router = slice.add_node(name="router", site="MAX")
    router.set_capacities(cores=4, ram=16, disk=50)
    router.set_image("default_ubuntu_20")
    rPort1 = router.add_component(model='NIC_Basic', name="rPort1").get_interfaces()[0] 
    rPort2 = router.add_component(model='NIC_Basic', name="rPort2").get_interfaces()[0] 
    
    #Host 1
    host1 = slice.add_node(name="host1", site="MAX")
    host1.set_capacities(cores=4, ram=16, disk=50)
    host1.set_image("default_ubuntu_20")
    h1Port = host1.add_component(model='NIC_Basic', name="h1Port").get_interfaces()[0] 
    
    #Host 2
    host2 = slice.add_node(name="host2", site="MAX")
    host2.set_capacities(cores=4, ram=16, disk=50)
    host2.set_image("default_ubuntu_20")
    h2Port = host2.add_component(model='NIC_Basic', name="h2Port").get_interfaces()[0] 
    
    lan1 = slice.add_l2network(name="Lan1", interfaces=[rPort1, h1Port])
    lan2 = slice.add_l2network(name="Lan2", interfaces=[rPort2, h2Port])
    
    #Submit Slice Request
    slice.submit()
except Exception as e:
    print(f"Slice Failed: {e}")

In [None]:
from ipaddress import ip_address, IPv4Address, IPv6Address, IPv4Network, IPv6Network

try:    
    host1 = slice.get_node(name="host1") 
    host2 = slice.get_node(name="host2")
    router = slice.get_node(name="router")
    
    subnet1 = IPv4Network("10.1.1.0/24")
    subnet2 = IPv4Network("11.1.1.0/24")
    
    host1_iface = host1.get_interface(network_name="Lan1")
    host1_iface.ip_addr_add(addr="10.1.1.1", subnet=subnet1)
    
    router_iface = router.get_interface(network_name="Lan1")
    router_iface.ip_addr_add(addr="10.1.1.2", subnet=subnet1)
    
    router_iface2 = router.get_interface(network_name="Lan2")
    router_iface2.ip_addr_add(addr="11.1.1.2", subnet=subnet2)
    
    host2_iface2 = host2.get_interface(network_name="Lan2")
    host2_iface2.ip_addr_add(addr="11.1.1.1", subnet=subnet2) 
    
    host1.execute("sudo ip route add 11.1.1.0/24 via 10.1.1.2")
    router.execute("sudo sysctl -w net.ipv4.ip_forward=1")
    host2.execute("sudo ip route add 10.1.1.0/24 via 11.1.1.2")
except Exception as e:
    print(f"Exception: {e}")

### 1.2 Set up experiment

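We will install the ```iperf3``` network testing tool on the nodes in the slice (the loop below covers the router as well as both end hosts).

On host2 we will also install moreutils and data visualization tools, upload the analysis scripts, and set ```net.ipv4.tcp_no_metrics_save=1``` so that TCP does not reuse cached metrics from earlier connections.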

In [None]:
slice = fablib.get_slice("tcpControl")
for node in slice.get_nodes():
    node.execute("sudo apt-get update;sudo apt-get -y install iperf3;")  
host2 = slice.get_node(name="host2")
host2.upload_file("./scripts/ss-data-analysis.R","ss-data-analysis.R")
host2.upload_file("./scripts/ss-output.sh","ss-output.sh")
host2.execute("sudo apt-get -y install moreutils r-base-core r-cran-ggplot2 r-cran-littler;sudo sysctl -w net.ipv4.tcp_no_metrics_save=1;chmod +x ss-output.sh ")

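Here we configure the router with a 1 Mbps bottleneck and a 0.1 MB buffer in both directions. The commands assume that the router's two experiment interfaces are named ```ens7``` and ```ens8```; if ```ip addr``` on the router shows different names, adjust the commands accordingly.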

In [None]:
router = slice.get_node(name="router")
router.execute("\
sudo tc qdisc del dev ens7 root;\
sudo tc qdisc add dev ens7 root handle 1: htb default 3;\
sudo tc class add dev ens7 parent 1: classid 1:3 htb rate 1Mbit;\
sudo tc qdisc add dev ens7 parent 1:3 handle 3: bfifo limit 0.1MB;\
sudo tc qdisc del dev ens8 root;\
sudo tc qdisc add dev ens8 root handle 1: htb default 3;\
sudo tc class add dev ens8 parent 1: classid 1:3 htb rate 1Mbit;\
sudo tc qdisc add dev ens8 parent 1:3 handle 3: bfifo limit 0.1MB;\
")

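The router cell above repeats the same four ```tc``` commands for each interface. The sketch below shows one way to generate them for any interface name and to estimate how long a full buffer takes to drain at the bottleneck rate. Both helper names are hypothetical, and the delay figure assumes that 0.1 MB is about 100,000 bytes and 1 Mbit/s is 1,000,000 bits per second; the exact values depend on how ```tc``` interprets those units.

```python
def bottleneck_commands(iface, rate="1Mbit", limit="0.1MB"):
    """Build the tc commands from the cell above for one interface."""
    return [
        f"sudo tc qdisc del dev {iface} root",
        f"sudo tc qdisc add dev {iface} root handle 1: htb default 3",
        f"sudo tc class add dev {iface} parent 1: classid 1:3 htb rate {rate}",
        f"sudo tc qdisc add dev {iface} parent 1:3 handle 3: bfifo limit {limit}",
    ]


def max_queueing_delay(buffer_bytes, rate_bits_per_s):
    """Seconds needed to drain a full buffer at the bottleneck rate."""
    return buffer_bytes * 8 / rate_bits_per_s


# Join the commands for both interfaces into a single shell string.
script = ";".join(bottleneck_commands("ens7") + bottleneck_commands("ens8"))
print(script)

# Assumed sizes: 100,000-byte buffer, 1,000,000 bit/s link.
delay = max_queueing_delay(100_000, 1_000_000)
print(f"worst-case queueing delay: {delay:.2f} s")
```

The resulting string could be passed to ```router.execute(...)``` in place of the hand-written one. Under the stated assumptions a full buffer adds a little under a second of delay, which is worth keeping in mind when reading RTT values later in the experiment.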

## 2. Experiment
### 2.1 Using ```ss``` to observe TCP socket statistics
1. Open up a terminal for host1 and two terminals for host2
2. On host1 open an iperf server by running:
<br>```iperf3 -s  -1 ```

3. On host2 send data to the iperf server (on host1 10.1.1.1) for 60 seconds using TCP Reno by running:
<br>```iperf3 -c 10.1.1.1 -t 60 -C reno```

4. While the connection is running switch to the second host2 terminal and run:
<br>```ss -ein dst 10.1.1.1```
<br>```ss``` arguments:
    * ```-e``` shows detailed socket information
    * ```-i``` shows internal TCP information (known only to the sender)
    * ```-n``` shows numeric IP addresses (instead of resolved names)
    * ```dst 10.1.1.1``` shows only sockets with destination 10.1.1.1 (host1)

You can learn more about ```ss``` [here](https://linux.die.net/man/8/ss).

The output shows that two connections are created. One of the connections carries iperf control information, while the other carries the data.

The output also contains:
    * Current CWND (in units of MSS)
    * Slow start threshold (ssthresh), which only appears once the flow has entered the congestion avoidance phase
    * Number of retransmitted segments, which only appears once there has been a retransmission
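
If you want to pull these fields out of the ```ss``` output programmatically, a regular-expression sketch like the one below may help. The sample line is made up for illustration (real output contains many more fields and varies between ```ss``` versions), and ```parse_ss_info``` is a hypothetical helper, not the parser used by ```ss-output.sh```.

```python
import re


def parse_ss_info(line):
    """Extract cwnd, ssthresh and retransmission counters if present."""
    stats = {}
    cwnd = re.search(r"\bcwnd:(\d+)", line)
    if cwnd:
        stats["cwnd"] = int(cwnd.group(1))
    ssthresh = re.search(r"\bssthresh:(\d+)", line)
    if ssthresh:
        stats["ssthresh"] = int(ssthresh.group(1))
    # retrans is printed as <currently unacknowledged>/<total>
    retrans = re.search(r"\bretrans:(\d+)/(\d+)", line)
    if retrans:
        stats["retrans_unacked"] = int(retrans.group(1))
        stats["retrans_total"] = int(retrans.group(2))
    return stats


# Illustrative line only; not captured from a real run.
sample = "reno wscale:7,7 rto:204 rtt:3.2/1.5 mss:1448 cwnd:42 ssthresh:21 retrans:0/3"
print(parse_ss_info(sample))
```

Fields that are absent from a line, such as ssthresh before the first loss, are simply left out of the returned dictionary.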

### 2.2 Generating traffic
1. On host1 open an iperf server by running:
<br>```iperf3 -s  -1 ```
2. In one of the host2 terminals run (this is the `ss` script):
<br>```sudo bash ss-output.sh 10.1.1.1```

3. On the other host2 terminal run:
<br>```iperf3 -c 10.1.1.1 -P 3 -t 60 -C reno```

4. After ```iperf3``` finishes on host1, stop the ```ss``` script on host2 (and the ```iperf3``` client, if it is still running) by pressing Ctrl+C once.
5. Run `ls` on host2. Two files will be present:
    * `sender-ss.txt` is the raw output
    * `sender-ss.csv` is the parsed output, with columns containing:
        - Timestamp
        - TCP sender (IP:Port)
        - Number of unacknowledged retransmissions
        - Cumulative number of retransmissions
        - Current CWND
        - Current slow start threshold
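
As a quick sanity check before plotting, you could count how often each flow reduced its CWND. The sketch below assumes that the CSV has no header row and that its columns appear in the order listed above; verify both against your own `sender-ss.csv` before relying on it. The rows in `sample_csv` are invented for illustration, and `count_cwnd_drops` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict


def count_cwnd_drops(csv_text):
    """Count, per sender, how many samples show a smaller CWND than the last."""
    last_cwnd = {}
    drops = defaultdict(int)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 6:
            continue  # skip blank or truncated lines
        sender = row[1]
        try:
            cwnd = int(row[4])
        except ValueError:
            continue  # skip a header or a malformed row
        if sender in last_cwnd and cwnd < last_cwnd[sender]:
            drops[sender] += 1
        last_cwnd[sender] = cwnd
    return dict(drops)


# Invented sample: timestamp, sender, unacked retrans, total retrans, cwnd, ssthresh
sample_csv = (
    "0.1,11.1.1.1:40000,0,0,10,\n"
    "0.2,11.1.1.1:40000,0,0,20,\n"
    "0.3,11.1.1.1:40000,1,1,10,10\n"
    "0.4,11.1.1.1:40000,0,1,11,10\n"
    "0.2,11.1.1.1:40002,0,0,12,\n"
)
print(count_cwnd_drops(sample_csv))
```

To run it on real data, read the file contents with `open("sender-ss.csv").read()` and pass the resulting string to the function.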

### 2.3 Visualization
Run the code block below to visualize the TCP traffic from 2.2:

In [None]:
slice = fablib.get_slice("tcpControl")
host2 = slice.get_node(name="host2")
host2.execute("Rscript ss-data-analysis.R  ")
host2.download_file("sender-ss.svg","sender-ss.svg")

See the output [here](./sender-ss.svg)

## 3. Cleanup Resources
### 3.1 Delete Slice

In [None]:
try:
    slice = fablib.get_slice("tcpControl")
    slice.delete()
except Exception as e:
    print(f"Fail: {e}")