# Lab 2: Internet Routing

**Credits:** This lab is an edited version of "A peek into Internet Routing" by
Fraida Fund. See that lab for additional information and resources:
[https://witestlab.poly.edu/blog/a-peek-into-internet-routing/](https://witestlab.poly.edu/blog/a-peek-into-internet-routing/)

## Set up

### Import required packages
This lab requires the following packages, so we import them first.

In [None]:
from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
import pandas as pd

### Slice Initialization
You reserved the required resources in pre-lab. We will set up the relevant objects to work with those resources.

In [None]:
fablib = fablib_manager() 
conf = fablib.show_config()

In [None]:
slice_name="cs2520-lab2-bgp_" + fablib.get_bastion_username()

In [None]:
try:
    slice = fablib.get_slice(slice_name)
    print("Checked: A slice with the name '%s' already exists." % slice_name)
except Exception:
    # Most likely cause: no slice with this name exists yet.
    print("Error: You don't have a slice named %s yet. Please go through 'pre-lab2.ipynb' before proceeding." % slice_name)

### Log into resources
In `pre-lab2.ipynb`, you ran the cells under "Log into resources" to get the SSH commands that you need to use to access the hosts on FABRIC. We will reuse the same commands. As before, the code in the following cell should give us the relevant SSH commands:

In [None]:
pd.set_option('display.max_colwidth', None)
slice_info = [{'Name': n.get_name(), 'SSH command': n.get_ssh_command()} for n in slice.get_nodes()]
pd.DataFrame(slice_info).set_index('Name')

In the pre-lab notebook, we reserved a slice with four nodes at different locations. We are going to check where each node is located: 

In [None]:
slice.list_nodes()

**Record your selected sites here.**

1. Perlman: 
2. Floyd: 
3. Borg: 
4. Kahn: 

## Collect traceroute data

### Choose two destinations

We will explore the internet routes from each of your nodes to different internet destinations.

First, select *2 different publicly reachable internet destinations* (e.g. pitt.edu, ethz.ch, google.com, amazon.com, ...). Note: we will discuss and compare results, so this is most interesting if you don't all pick the same destinations (be creative!).

For each destination, find the IP address for that destination using the `dig` command. For example:

```
dig +short pitt.edu
```

**Run `dig` for each of your 2 destinations on each of your 4 nodes, and record the results. Do you get the same result each time? If not, can you explain why?**

Dig Results: Destination 1 (specify destination here)

- Node 1:
- Node 2:
- Node 3:
- Node 4:

Dig Results: Destination 2 (specify destination here)

- Node 1:
- Node 2:
- Node 3:
- Node 4:
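
To compare the `dig` answers across nodes, it can help to collect them in one place. The following is a minimal sketch, not part of the original lab. It assumes each node object offers `get_name()` (used earlier in this notebook) and an `execute()` method that returns a `(stdout, stderr)` pair, as fablib nodes are expected to. `collect_dig`, `all_nodes_agree`, and `FakeNode` are hypothetical names, and the addresses in the demo are made up.

```python
def collect_dig(nodes, destination):
    """Run `dig +short` on every node; return {node name: sorted list of answers}."""
    results = {}
    for node in nodes:
        # Assumption: execute() returns a (stdout, stderr) tuple.
        stdout, _stderr = node.execute(f"dig +short {destination}")
        answers = [line.strip() for line in stdout.splitlines() if line.strip()]
        results[node.get_name()] = sorted(answers)
    return results


def all_nodes_agree(results):
    """Return True if every node received exactly the same set of answers."""
    distinct = {tuple(answers) for answers in results.values()}
    return len(distinct) <= 1


class FakeNode:
    """Stand-in for a real node so the sketch runs without a slice."""

    def __init__(self, name, canned_output):
        self._name = name
        self._canned_output = canned_output

    def get_name(self):
        return self._name

    def execute(self, command):
        return self._canned_output, ""


# Made-up answers from the documentation address range 192.0.2.0/24.
demo = collect_dig(
    [FakeNode("node-a", "192.0.2.10\n"), FakeNode("node-b", "192.0.2.20\n")],
    "example.com",
)
print(demo)
print("All nodes agree:", all_nodes_agree(demo))
```

With a real slice, the call would look like `collect_dig(slice.get_nodes(), "pitt.edu")`, provided the assumption about `execute()` holds.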

### Run `mtr` traceroutes

In a terminal with an SSH session to one of your FABRIC nodes, run (replacing `pitt.edu` with one of your chosen destinations):

```
mtr --aslookup --show-ips -w pitt.edu
```

In this command:

* `--aslookup` tells `mtr` to try to look up the ASN associated with each router
  along the path (based on its IP address)
* `--show-ips` tells `mtr` to show the IP address of each router along the
  path, in addition to its hostname
* `-w` tells `mtr` to show its final report in wide format, including all
  available details
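
Since the same command is repeated for every node and destination, one option is to build it programmatically. The sketch below only assembles the command string from the flags described above. `build_mtr_command` is a hypothetical helper, and the optional `-c` (report cycles) argument is an extra that the lab's command does not use.

```python
import shlex


def build_mtr_command(destination, cycles=None):
    """Return the mtr invocation used in this lab as a list of arguments."""
    command = ["mtr", "--aslookup", "--show-ips", "-w"]
    if cycles is not None:
        # Optional extra: number of probes sent to each hop in report mode.
        command += ["-c", str(cycles)]
    command.append(destination)
    return command


for destination in ("pitt.edu", "ethz.ch"):
    print(shlex.join(build_mtr_command(destination)))
```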

Here is a sample output from a node in Florida to pitt.edu:

![](./images/mtr-FIU-to-pitt.jpg)

**Run an `mtr` traceroute for each of your 2 destinations on each of your 4 nodes, and record the results**.

mtr Results: Destination 1 (specify destination here)

- Node 1:


- Node 2:


- Node 3:


- Node 4:


mtr Results: Destination 2 (specify destination here)

- Node 1:


- Node 2:


- Node 3:


- Node 4:

## Understanding the Output

### Traceroute basics

Traceroute (or `mtr`) works by sending probes with specific time-to-live (TTL)
values. Every router that forwards a packet decrements its TTL; when the TTL
reaches 0, the router drops the packet and returns an "ICMP TTL Exceeded"
response to the source. Therefore, by sending probes with increasing TTL
values, we can map the path between two hosts.

Specifically, when we send a packet with TTL 1, the first router on the path
will decrement the TTL, see that it has reached 0, drop it, and send an ICMP
TTL Exceeded to the source. The source can then note the source address of the
ICMP TTL Exceeded to find the first router on the path. It can also measure
the time between sending the probe and getting a response to measure the
roundtrip time to that router. It can then repeat this with TTL 2 to find the
second router, and so on.
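
The probing procedure just described can be illustrated with a small simulation. This is a sketch only: no packets are sent, the path is hypothetical (a private address followed by documentation addresses), and `send_probe` and `trace` are names invented for this example.

```python
def send_probe(path, ttl):
    """Simulate one probe along `path` (a list of routers ending in the destination)."""
    for router in path[:-1]:
        ttl -= 1  # each router decrements the TTL before forwarding
        if ttl == 0:
            # The router drops the probe and reports back to the source.
            return router, "ICMP TTL Exceeded"
    return path[-1], "reply from destination"


def trace(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    discovered = []
    for ttl in range(1, max_ttl + 1):
        address, kind = send_probe(path, ttl)
        discovered.append((ttl, address, kind))
        if address == path[-1]:
            break
    return discovered


# Hypothetical path: three routers, then the destination host.
path = ["10.0.0.1", "192.0.2.1", "198.51.100.1", "203.0.113.9"]
for ttl, address, kind in trace(path):
    print(ttl, address, kind)
```

Each TTL value reveals exactly one hop, which is why the hop number in a traceroute equals the TTL of the probe that discovered it.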

To get more information about each router on the path, we can also:

* use a [reverse DNS lookup](https://en.wikipedia.org/wiki/Reverse_DNS_lookup)
  to get the hostname associated with the router's IP address
* look the router IP address up in the [Internet Routing
  Registry](https://www.radb.net/) to get its AS number
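
As a small illustration of the reverse DNS step, the sketch below builds the name that a reverse lookup would query for a given IPv4 address, using Python's standard `ipaddress` module. It does not contact any DNS server. `reverse_dns_name` is a hypothetical helper name, and the address is the hop 7 router from the sample output.

```python
import ipaddress


def reverse_dns_name(ip):
    """Return the PTR record name that a reverse DNS lookup queries for `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_dns_name("170.39.8.33"))  # 33.8.39.170.in-addr.arpa
```

To perform the lookup itself from Python, `socket.gethostbyaddr(ip)` is one option, but it needs network access and raises an error when no record exists.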

### Interpreting `mtr` output

Each line in the output above corresponds to one router on the path from the source to the destination.

Let's look at the first line:

![](./images/mtr-line1.jpg)

Here, we see information about the first hop in our path:

* the *hop number* is 1
* the *AS number* is unknown (shown as AS???)
* the router's *hostname* shows as "_gateway" (a name the system typically assigns to the default gateway, i.e. the first router out of the node's private network)
* the router's IP address is 10.20.4.1
* the loss rate (percent of probes that did not receive a response) is 0.0
* the number of probes sent is 10
* the last observed RTT is 0.2 ms
* the average RTT over all 10 probes is 0.1 ms
* the best (shortest) observed RTT is 0.1 ms
* the worst observed RTT is 0.2 ms
* the standard deviation of RTTs is 0.0
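
To make the meaning of these statistics concrete, here is a sketch that derives them from a list of per-probe round-trip times. The samples are made up, `summarize_probes` is a hypothetical helper, and the population standard deviation is used as an assumption; `mtr`'s own estimator may differ slightly.

```python
import statistics


def summarize_probes(rtts_ms):
    """Summarize probes for one hop; `None` marks a probe that got no response."""
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    answered = [rtt for rtt in rtts_ms if rtt is not None]
    sent = len(rtts_ms)
    summary = {"Loss%": 100.0 * (sent - len(answered)) / sent, "Snt": sent}
    if answered:
        summary.update(
            {
                "Last": answered[-1],                      # most recent answered probe
                "Avg": statistics.fmean(answered),
                "Best": min(answered),                     # shortest RTT
                "Wrst": max(answered),                     # longest RTT
                "StDev": statistics.pstdev(answered),      # assumption: population std dev
            }
        )
    return summary


# Made-up samples in milliseconds: five probes, one of them lost.
samples = [0.1, 0.2, None, 0.1, 0.3]
print(summarize_probes(samples))
```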

Similarly, if we look at line 7 of the output:

![](./images/mtr-line7.jpg)

we see information about the 7th hop in our path:

* the *hop number* is 7
* the *AS number* is 20080 (this AS is AMPATH (America's Path), which was developed in 2000 as a high-performance exchange point in Miami, Florida, United States; AMPATH supports peering and network research between U.S. and international research and education networks)
* the router's *hostname* is 'et-0-0-1-67.rt05.bb.ampath.net'
* the router's IP address is 170.39.8.33
* the loss rate (percent of probes that did not receive a response) is 0.0
* the number of probes sent is 10
* the last observed RTT is 0.8 ms
* the average RTT over all 10 probes is 1.4 ms
* the best (shortest) observed RTT is 0.8 ms
* the worst observed RTT is 6.2 ms
* the standard deviation of RTTs is 1.7
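
If you save the report as text, a line like this can be split into fields programmatically. The sketch below is a rough illustration: the exact column layout of the report is an assumption (it can vary between `mtr` versions), `parse_mtr_line` is a hypothetical helper, and the sample line is reconstructed from the hop 7 values listed above rather than copied from real output.

```python
import re

# Assumed layout: hop number, AS, hostname, optional "(ip)", then seven numeric columns.
LINE_RE = re.compile(
    r"^\s*(?P<hop>\d+)\.\S*\s+"
    r"(?P<asn>AS\S+)\s+"
    r"(?P<host>\S+)"
    r"(?:\s+\((?P<ip>[^)]+)\))?\s+"
    r"(?P<loss>[\d.]+)%\s+(?P<snt>\d+)\s+"
    r"(?P<last>[\d.]+)\s+(?P<avg>[\d.]+)\s+"
    r"(?P<best>[\d.]+)\s+(?P<wrst>[\d.]+)\s+(?P<stdev>[\d.]+)\s*$"
)


def parse_mtr_line(line):
    """Return the fields of one report line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["hop"] = int(fields["hop"])
    fields["snt"] = int(fields["snt"])
    for key in ("loss", "last", "avg", "best", "wrst", "stdev"):
        fields[key] = float(fields[key])
    return fields


# Reconstructed from the hop 7 values above; the spacing is an assumption.
sample = "  7. AS20080  et-0-0-1-67.rt05.bb.ampath.net (170.39.8.33)  0.0%  10  0.8  1.4  0.8  6.2  1.7"
print(parse_mtr_line(sample))
```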

### Limitations

Note that, as you saw with hop 1, you may not get all of this information for every hop.
If the AS number is listed as `???`, this just means that `mtr` could not find a record of
the ASN for that IP address (which could also be because the IP address is a private address). `mtr` uses the Internet Routing Registry, and you
can verify that there is no record for an IP address by going to the website:
[https://www.radb.net](https://www.radb.net).
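
One quick way to check whether an address is private (and therefore cannot have a public ASN record) is Python's standard `ipaddress` module. This is a small sketch; `is_private_address` is a hypothetical helper name, and the two addresses are the hop 1 and hop 7 routers from the sample output.

```python
import ipaddress


def is_private_address(ip):
    """Return True if `ip` lies in a range reserved for private or special use."""
    return ipaddress.ip_address(ip).is_private


for ip in ("10.20.4.1", "170.39.8.33"):
    print(ip, "private" if is_private_address(ip) else "public")
```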

As another example, see hop 6:

![](./images/mtr-line6.jpg)

Hop 6 has an AS number and an IP address but no hostname.

In some cases, you can infer the AS from the hostname, or the other way around. In this case, we have the AS number AS62, which belongs to the data center company CyrusOne
([AS62](https://bgp.he.net/AS62)).

Some lines may have no information, like the following at hop 3:

![](./images/mtr-line3.jpg)


This just means the router did not send back an ICMP response (often because it
is configured not to send ICMP TTL exceeded messages).

Normally, the final line of the output will be the destination host, but this
may also be missing if the destination is configured not to respond to the
probes (for example, if it ignores ICMP echo requests).

## Analyze routes

Select one traceroute output (for one node and one destination) to start.

### Identify autonomous systems

For each router on the path, find the organization that manages its AS
using the `whois` command. For example (replace `AS4130` with one of your
ASes):

```
whois AS4130
```

**Record the ASName and OrgName for each router in your selected traceroute.
(note that this may not be possible for every router)**
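
If you save the `whois` output to a string, the two fields can be pulled out with a few lines of Python. The sketch below is an illustration only: `parse_whois_fields` is a hypothetical helper, the sample response is made up (AS64496 is a documentation ASN), and field names differ between registries, so some responses may use other labels.

```python
def parse_whois_fields(text, wanted=("ASName", "OrgName")):
    """Return the first value found for each wanted `Key: value` field."""
    found = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        key = key.strip()
        if separator and key in wanted and key not in found:
            found[key] = value.strip()
    return found


# Made-up response for illustration; real output contains many more fields.
sample = """\
ASNumber:       64496
ASName:         EXAMPLE-AS
OrgName:        Example Organization
"""
print(parse_whois_fields(sample))
```

On a machine with `whois` installed, the text could come from `subprocess.run(["whois", "AS4130"], capture_output=True, text=True).stdout`.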

Next, select one or more ASes from your selected path, and look each one up in the
Hurricane Electric BGP toolkit at: [https://bgp.he.net/](https://bgp.he.net/).
Put the ASN into the search box and click "Search".

The "AS Info" tab shows basic information about the AS, the "Prefixes v4" tab
shows IPv4 address blocks that this AS originates, and the "Peers v4" tab shows
other ASes that this AS peers with.
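
To relate the prefixes listed there to the router addresses in your traceroute, you can test whether an address falls inside a prefix. The sketch below uses Python's standard `ipaddress` module. The prefixes are documentation ranges chosen for illustration, not prefixes of any real AS, and `matching_prefixes` is a hypothetical helper.

```python
import ipaddress


def matching_prefixes(ip, prefixes):
    """Return the prefixes that contain `ip`, most specific (longest) first."""
    address = ipaddress.ip_address(ip)
    networks = [ipaddress.ip_network(prefix) for prefix in prefixes]
    hits = [network for network in networks if address in network]
    return sorted(hits, key=lambda network: network.prefixlen, reverse=True)


# Documentation ranges used as stand-ins for an AS's originated prefixes.
prefixes = ["198.51.100.0/24", "198.51.100.128/25", "203.0.113.0/24"]
for network in matching_prefixes("198.51.100.200", prefixes):
    print(network, "covers", network.num_addresses, "addresses")
```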

**How many prefixes does your chosen AS originate? List a few of them. Does
your chosen AS have signed ROAs for its prefixes? What are some of the other
ASes it peers with? Record your responses here.**


...

### Analyze geographic path

For each router on the path, list any location details you can infer from the router hostnames.

Routers are often named using airport codes or city name abbreviations to
indicate where they are located.

For example, the router at hop 16 in my sample
traceroute `vl713.fq-core-2.gw.pitt.edu (136.142.2.170)` is
probably located in Pittsburgh, while the router at hop 12
`fourhundredge-0-0-0-1.4079.core1.phil.net.internet2.edu (163.253.1.137)` is probably located
in Philadelphia.

For more examples, see the following section:

#### Inferring geographical location
One of the most common naming conventions for routers is the use of geographical location in the router hostname! You'll often see airport codes or abbreviated city names within a router hostname. Once you geolocate the routers along the path, you can identify the geographical route a packet takes en route to its destination.

Here are some examples of router hostnames including geographical locations in the US:

![](./images/geo-locations-table1.jpg)
![](./images/geo-locations-table2.jpg)


Of course, routers may be located in any city, not only those in the table above - here is an example of a route that goes through London (UK), Paris (France), Geneva (Switzerland, a.k.a. CH), Milan (Italy), and Athens (Greece):

```
12 ae6.mx1.lon2.uk.geant.net (62.40.98.37) 87.687 ms
13 ae5.mx1.par.fr.geant.net (62.40.98.179) 94.053 ms
14 ae5.mx1.gen.ch.geant.net (62.40.98.182) 101.531 ms
15 ae6.mx1.mil2.it.geant.net (62.40.98.81) 108.380 ms
16 ae3.mx2.ath.gr.geant.net (62.40.98.151) 130.760 ms
```
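
The same kind of inference can be roughly automated with a lookup table. In the sketch below, the table contains only the codes that appear in this lab's examples, `guess_city` is a hypothetical helper, and the matching rule (strip trailing digits from each dot-separated label) is a simplifying assumption that will miss many real naming schemes.

```python
import re

# Codes taken from the examples in this lab; extend as needed.
CITY_CODES = {
    "lon": "London",
    "par": "Paris",
    "gen": "Geneva",
    "mil": "Milan",
    "ath": "Athens",
    "phil": "Philadelphia",
}


def guess_city(hostname, codes=CITY_CODES):
    """Return the first city whose code appears as a label in `hostname`, else None."""
    for label in hostname.lower().split("."):
        code = re.sub(r"\d+$", "", label)  # "lon2" -> "lon"
        if code in codes:
            return codes[code]
    return None


for hostname in (
    "ae6.mx1.lon2.uk.geant.net",
    "ae5.mx1.gen.ch.geant.net",
    "fourhundredge-0-0-0-1.4079.core1.phil.net.internet2.edu",
):
    print(hostname, "->", guess_city(hostname))
```

Short codes can collide with unrelated labels, so treat any match as a hint to confirm against the RTT values, not as a definitive location.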

Sometimes, the location may be specified even more exactly than just a city - for example, **de-cix-new-york.as13335.net** is located at the _DE-CIX exchange point_ in New York, and **equinix.sjc.datapipe.net** is at the Equinix exchange point in San Jose. (Routers located at an IXP often have the ASN in their name - this makes it easier to debug peering problems, when you can see at a glance what AS the router belongs to.)
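
When a hostname embeds the ASN in this way, it can be extracted with a regular expression. This is a minimal sketch under the assumption that the ASN appears as its own label or hyphen-separated part of the form `as<digits>`; `asn_from_hostname` is a hypothetical helper, and other operators may use different conventions.

```python
import re

# Matches "as" followed by digits, delimited by dots, hyphens, or the string ends.
ASN_LABEL = re.compile(r"(?:^|[.-])as(\d+)(?=$|[.-])", re.IGNORECASE)


def asn_from_hostname(hostname):
    """Return the ASN embedded in `hostname` as an int, or None if there is none."""
    match = ASN_LABEL.search(hostname)
    return int(match.group(1)) if match else None


print(asn_from_hostname("de-cix-new-york.as13335.net"))  # 13335
print(asn_from_hostname("equinix.sjc.datapipe.net"))     # None
```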

**Record any geographic or other details you can learn about your chosen path.**