# Lab 2: Reliable Data Transfer
    
In this lab, you will:
1. Review Python code for an implementation of Go-Back-N to perform reliable data transfer on top of UDP
2. Run the Go-Back-N program to transfer a file over a wide-area network using the Fabric testbed
3. Experimentally validate the relationship between window size and throughput discussed in lecture

    
<b>Prerequisites</b>
    
* You need to have your FABRIC bastion host key pair set up to do this tutorial. If you have not already set this up, follow steps 1-3 at https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/.
* You should be comfortable using ssh and executing basic commands using a UNIX shell. [Tips about how to login to hosts.](https://learn.fabric-testbed.net/knowledge-base/logging-into-fabric-vms/)

Note that this is the second step in this assignment. If you have not already created your slice, open the slice creation notebook [here](./CreateSlice.ipynb) first.

## 1. Set up the Experiment


### 1.1  Retrieve Slice
Import the slice you created in the [Create Slice Notebook](./CreateSlice.ipynb).


In [None]:
import json
import traceback

from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager

# Create the fablib manager and print the active configuration
fablib = fablib_manager()
fablib.show_config()

In [None]:
slice_name = "Lab02_RDT"
slice = fablib.get_slice(slice_name)
slice.list_nodes()

### 1.2 Upload files
Upload test programs to each node.

In [None]:
for node in slice.get_nodes():
    
    node.upload_file("testprogs/GBN_Client.py","GBN_Client.py")
    node.upload_file("testprogs/GBN_Server.py","GBN_Server.py")
    node.upload_file("testprogs/util.py","util.py") 
    
    node.upload_file("testprogs/test_file_10KB.txt","test_file_10KB.txt")
    node.upload_file("testprogs/test_file_100KB.txt","test_file_100KB.txt")
    node.upload_file("testprogs/test_file_1MB.txt","test_file_1MB.txt")


## 2. Run Experiment

### 2.1 Getting Started

1. SSH into each node:
    - For each node:
        - From the output of running the second cell under 1.1 above, copy the provided "SSH Command". This should be something like `ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@205.172.170.122`
        - Open a terminal window in the JupyterHub and enter the copied command
        
        
2. Find the IP address for each node:
    - In the terminal of each node, enter the command `ip addr`.
    - You should see an output similar to the following (network addresses will differ):
    
        ```
        1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
            link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
            inet 127.0.0.1/8 scope host lo
               valid_lft forever preferred_lft forever
            inet6 ::1/128 scope host 
               valid_lft forever preferred_lft forever
        2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc fq_codel state UP group default qlen 1000
            link/ether fa:16:3e:6b:8d:1c brd ff:ff:ff:ff:ff:ff
            inet 10.40.6.228/23 brd 10.40.7.255 scope global dynamic enp3s0
               valid_lft 73414sec preferred_lft 73414sec
            inet6 2620:0:c80:1003:f816:3eff:fe6b:8d1c/64 scope global dynamic mngtmpaddr noprefixroute 
               valid_lft 86397sec preferred_lft 14397sec
            inet6 fe80::f816:3eff:fe6b:8d1c/64 scope link 
               valid_lft forever preferred_lft forever
        3: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
            link/ether 06:70:14:5e:a2:01 brd ff:ff:ff:ff:ff:ff
            inet 10.132.129.2/24 scope global enp7s0
               valid_lft forever preferred_lft forever
            inet6 fe80::470:14ff:fe5e:a201/64 scope link 
               valid_lft forever preferred_lft forever
        ```
    - Look at the entry for interface `enp7s0`. We will use its IPv4 address, which is the one following the word `inet`. In the example above, this would be `10.132.129.2`.
    - **Record the IP addresses of your client and server nodes in Lab2.docx**


## Installing Packages

The test programs use the Python package `scapy` to make constructing packet headers easier. You’ll need to install this on both of your Fabric nodes. On each node, do the following:

- Update the Ubuntu package manager with the command: `sudo apt update`
- Install the Python package manager pip with the command: `sudo apt install -y python3-pip`
- Install the scapy package with: `pip3 install scapy`

## Finding RTT

You can use the built-in `ping` program to measure the round-trip time (RTT) between two nodes.

- On your Client Node, run the command: `ping <ServerNode_IP_Address>`, where `<ServerNode_IP_Address>` is replaced with the IP address of your Server Node found in “Getting Started” above
- Let it run for at least 10 pings and then use CTRL-C to kill the program
- The final line of output should look something like:
```
rtt min/avg/max/mdev = 56.041/56.279/57.148/0.403 ms
```
Here, the average RTT is 56.279 ms (the second value in the output line).

**Enter the average RTT between your nodes in Lab2.docx**
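
If you would rather extract the average programmatically than read it off by eye, the summary line can be parsed as in the sketch below. The helper name `parse_avg_rtt_ms` is hypothetical (it is not part of the lab's test programs), and the sketch assumes the summary line has the same `min/avg/max/mdev` layout as the sample above.

```python
import re


def parse_avg_rtt_ms(ping_summary_line):
    """Return the average RTT (in ms) from a ping 'rtt min/avg/max/mdev' line."""
    # Capture the four slash-separated numbers that follow the '=' sign.
    match = re.search(
        r"=\s*([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)\s*ms", ping_summary_line
    )
    if match is None:
        raise ValueError("not a ping rtt summary line")
    # The second number is the average.
    return float(match.group(2))


# Sample line copied from the example output above
sample = "rtt min/avg/max/mdev = 56.041/56.279/57.148/0.403 ms"
print(parse_avg_rtt_ms(sample))  # prints 56.279
```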

## Go-Back-N File Transfer Programs

- Review the code for the `GBN_Client.py` and `GBN_Server.py` programs (we will do this together in class). You can view them on GitHub, download them and open them in your preferred text editor, or print them to your terminal on a remote Fabric node with the command `cat <filename>` (e.g. `cat GBN_Server.py`).

- Run the GBN_Server.py program on your Server Node using the command:
```
python3 GBN_Server.py -f copy_100KB.txt
```

- Run the GBN_Client.py program on your Client Node using the command:
```
python3 GBN_Client.py -a <ServerNode_IP_Address> -f test_file_100KB.txt
```
where `<ServerNode_IP_Address>` is replaced with the IP address of your Server Node found in “Getting Started” above.

The Client will transfer the 100 Kilobyte file `test_file_100KB.txt` to the server, which will save it as `copy_100KB.txt`.

**Copy and paste the output from your server and client into Lab2.docx**

- To verify that the file was copied correctly, you should run the process in reverse to copy it back to your Client Node and compare it to the original file.
    - Run the GBN_Server.py program on your **Client Node** using the command: `python3 GBN_Server.py -f returned_100KB.txt`
    - Run the GBN_Client.py program on your **Server Node** using the command: `python3 GBN_Client.py -a <ClientNode_IP_Address> -f copy_100KB.txt`
    - Compare the resulting `returned_100KB.txt` to the original `test_file_100KB.txt` with the command: `diff test_file_100KB.txt returned_100KB.txt`
    If the `diff` command gives no output, then the files are identical (and everything worked correctly). If the files differ, the diff command will print out the differences between the files (if this happens, something went wrong – talk to the instructor to check your setup). Note that you can also visually inspect a file using the command `cat <filename>` (e.g. `cat copy_100KB.txt`) to print it to the terminal.
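
As an alternative to `diff`, the same check can be sketched in Python using the standard library's `filecmp` module. The helper name `files_identical` is hypothetical, and the sketch assumes both files are in the current directory on your Client Node.

```python
import filecmp
import os


def files_identical(path_a, path_b):
    """Return True if the two files have exactly the same contents."""
    # shallow=False forces a comparison of file contents, not just metadata.
    return filecmp.cmp(path_a, path_b, shallow=False)


original, returned = "test_file_100KB.txt", "returned_100KB.txt"
# Only compare when both files are present, so the sketch runs anywhere.
if os.path.exists(original) and os.path.exists(returned):
    print("identical" if files_identical(original, returned) else "files differ")
```

Unlike `diff`, this only reports whether the files match; it does not show where they differ.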

## Window Size and Throughput

By default, the GBN_Client.py program sends packets of 1000 bytes each and uses a window of only 1 packet (so it is actually equivalent to the “Stop-and-Wait” protocol we discussed).

- Based on the packet size and window size information above, calculate the expected throughput (in Mbps).

Recall that, because the sender is limited to one window of unacknowledged packets at any time and receiving an ACK takes one RTT, the expected throughput is approximately: **(window\_size × packet\_size) / RTT**

Be careful with units: you’ll likely want to convert the packet size from bytes to bits and the RTT from milliseconds to seconds, then convert the result from bits/sec to Mbps.
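
The unit conversions can be sketched as follows. The function name `expected_throughput_mbps` is hypothetical, and the RTT used here is the sample value from the `ping` output earlier, not your own measurement; substitute the RTT you recorded.

```python
def expected_throughput_mbps(window_packets, packet_bytes, rtt_ms):
    """Approximate throughput of a window-limited sender, in Mbps."""
    bits_per_window = window_packets * packet_bytes * 8  # bytes -> bits
    rtt_seconds = rtt_ms / 1000.0                        # ms -> s
    bits_per_second = bits_per_window / rtt_seconds      # one window per RTT
    return bits_per_second / 1e6                         # bits/s -> Mbps


# Example values only: 1 packet of 1000 bytes and the sample RTT of 56.279 ms.
example = expected_throughput_mbps(window_packets=1, packet_bytes=1000, rtt_ms=56.279)
print(f"{example:.3f} Mbps")  # prints 0.142 Mbps
```

Because throughput scales linearly with window size in this approximation, doubling the window doubles the expected rate.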

**Enter your expected throughput for a window size of 1 packet in Lab2.docx**

Does your measured result from the 100 KB transfer above (run with the default window size) match the expected result that you calculated?

**Comment in Lab2.docx**

- Calculate the window size needed to achieve a rate of 1 Mbps, based on a packet size of 1000 bytes and the RTT between your nodes.

**Enter your answer in Lab2.docx**
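
One way to approach this is to rearrange the throughput approximation for the window size and round up to a whole number of packets. The sketch below does that; the function name `window_for_rate` is hypothetical, and the RTT is again the sample value from the `ping` output earlier rather than your own.

```python
import math


def window_for_rate(target_mbps, packet_bytes, rtt_ms):
    """Smallest whole-packet window whose expected throughput reaches the target."""
    target_bps = target_mbps * 1e6                       # Mbps -> bits/s
    bits_in_flight = target_bps * (rtt_ms / 1000.0)      # bits needed per RTT
    return math.ceil(bits_in_flight / (packet_bytes * 8))  # round up to packets


# Example values only: 1 Mbps target, 1000-byte packets, sample RTT of 56.279 ms.
example_window = window_for_rate(target_mbps=1.0, packet_bytes=1000, rtt_ms=56.279)
print(example_window)  # prints 8
```

Rounding up matters: a window of 7 packets would fall slightly short of 1 Mbps at this RTT.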

- You can change the window size used by the GBN_Client with the `-w` command-line option. Re-run the GBN Client and Server, using the window size you calculated in the previous step, to transfer the 1 Megabyte test file.

On Server Node:
```
python3 GBN_Server.py -f copy_1MB.txt
```

On Client Node:
```
python3 GBN_Client.py -a <ServerNode_IP_Address> -f test_file_1MB.txt -w <Calculated_Window_Size>
```
**Copy and paste your output into Lab2.docx. Do the results match your expectation?**

## Loss Emulation

Since we can’t predict whether we will actually encounter packet loss during our experiments, here we will artificially inject loss to examine its effects on our programs.

- To create artificial loss, run the following command on each of your Fabric Nodes:
```
sudo tc qdisc add dev enp7s0 root netem loss 1%
```

Don’t worry about the details of this command – its effect is to randomly drop 1% of the packets leaving each node.

- Run the GBN_Server.py program on your Server Node using the command:
```
python3 GBN_Server.py -f copy_1MB.txt
```

- Run the GBN_Client.py program on your Client Node using the command:
```
python3 GBN_Client.py -a <ServerNode_IP_Address> -f test_file_1MB.txt -w <Calculated_Window_Size>
```

where `<ServerNode_IP_Address>` is replaced with the IP address of your Server Node found in the “Getting Started” section above. **Copy and paste the output from your server and client into Lab2.docx**

How many timeouts do you observe? How many would you expect (based on loss rate and total number of 1000-byte packets needed to transfer a 1 MB file)? **Answer in Lab2.docx**
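
A rough first estimate can be sketched as below. The function name `expected_data_packet_losses` is hypothetical, and the example assumes the 1 MB file is exactly 1,000,000 bytes (check the real size with `ls -l`). It counts only drops of first-transmission data packets; lost ACKs, lost retransmissions, and the fact that one Go-Back-N timeout can cover several losses are all ignored, so treat the result as a starting point for your own reasoning rather than the answer.

```python
import math


def expected_data_packet_losses(file_bytes, packet_bytes, loss_rate):
    """Return (number of data packets, expected number of them dropped)."""
    packets = math.ceil(file_bytes / packet_bytes)  # last packet may be partial
    return packets, packets * loss_rate             # expectation = N * p


# Assumed values: a 1,000,000-byte file, 1000-byte packets, 1% loss.
packets, expected_losses = expected_data_packet_losses(1_000_000, 1000, 0.01)
print(packets, expected_losses)
```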

- Remove the emulated loss by running the following command on both nodes:
```
sudo tc qdisc del dev enp7s0 root
```

## Termination

Notice that the GBN_Server waits for 2 seconds before exiting. Review the GBN_Server.py code to see where this happens.

- Why do you think this timeout is needed? **Answer in Lab2.docx**
- To illustrate why the timeout is used, try the following to create a scenario with high loss (25%) and a short timeout (100ms):

On each node:
```
sudo tc qdisc add dev enp7s0 root netem loss 25%
```

On Server Node:
```
python3 GBN_Server.py -f copy_10KB.txt --final-timeout 0.1
```

On Client Node:
```
python3 GBN_Client.py -f test_file_10KB.txt -w <Window_for_1Mbps> -a <ServerNode_IP_Address>
```

If you run this scenario several times (re-run the server and client, without changing the loss settings), you will likely encounter the scenario where the sender (client) does not terminate but instead keeps trying to retransmit the last packet (in this case, you should `CTRL+C` the client to kill the program). What is the specific event that leads to this outcome? **Answer in Lab2.docx**

- Remove the emulated loss by running the following command on both nodes:
```
sudo tc qdisc del dev enp7s0 root
```

## Bonus: Improve the programs

You can earn bonus points for implementing a new feature to improve the program. The amount of the bonus will depend on how interesting the improvement is.

- A simple idea (5 points):
    - Instead of requiring the server to specify the name to save the file under, have the client send a special first packet that gives the file name (this will likely require adding a new packet type in util.py)
- More involved (15 points):
    - Change the Go-Back-N implementation to buffer out-of-order packets instead of discarding them, so the sender does not need to re-send the entire window on loss (similar to TCP)


## Cleanup Resources

Once you have completed the steps above, delete your slice to free up resources for other users. Note: if you stopped the notebook between running the first 3 code cells and getting to this point, re-run the first 2 code cells (but not the third) to retrieve the slice before running the following cell.

In [None]:
try:
    slice.delete()
except Exception as e:
    print(f"Fail: {e}")