# Explore ROV deployment using BGP2GO and BGPStream

Link: https://cseweb.ucsd.edu/classes/wi23/cse291-e/projects/cse291wi23_pa7.pdf

# 1. General Questions:

 * What is a BGP collector?
 
    A BGP collector is a system that collects BGP updates from multiple Autonomous Systems (ASes) and stores them in a database for analysis. BGP collectors are typically operated by research organizations, network operators, or other entities that need to analyze BGP routing data for various purposes, such as monitoring network performance, detecting routing anomalies, or studying Internet topology. BGP collectors can be public or private, and they typically use peering relationships with ASes to obtain BGP updates.
* How does a BGP collector obtain BGP updates?

    A BGP collector obtains BGP updates through peering relationships with ASes. Peering is a process by which two ASes agree to exchange BGP updates with each other. BGP collectors typically establish peering relationships with multiple ASes to obtain a diverse set of BGP updates. The peering relationships can be bilateral or multilateral, and they can be established through various means, such as direct connections, Internet Exchange Points (IXPs), or Route Servers. Once a BGP collector has established peering relationships with ASes, it can receive BGP updates from them and store them in a database for analysis.
* What information is included in a BGP update?

    A BGP update typically includes the following information:

    - Prefix: The IP address range being advertised or withdrawn.
    - AS path: The sequence of ASes that the announcement has traversed, from the origin AS to the collector's peer. Each AS prepends its own number, so the path is written with the peer's AS first and the origin AS last.
    - Next hop: The IP address of the next hop router that should be used to reach the advertised prefix.
    - Origin: How the route entered BGP at the origin AS. The value is IGP, EGP, or INCOMPLETE. This attribute is distinct from the origin AS, which is the last AS in the AS path.
    - Local preference: A value an AS uses internally to rank routes to the same prefix. It is exchanged only between iBGP peers.
    - MED (Multi-Exit Discriminator): A hint to a neighboring AS about which entry point to prefer when several links connect the two ASes.
    - Community: A tag used by the AS to group routes and apply policies to them.

    BGP updates can also carry other attributes, such as Aggregator, Atomic Aggregate, and Originator ID.
* How is BGP data stored and analyzed?

    BGP data is typically stored in a database for analysis. The database can be a relational database, a NoSQL database
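
The attributes listed above for a BGP update can be modeled as a small record type. This is only an illustrative sketch: the class and field names are hypothetical, the prefix and AS numbers come from ranges reserved for documentation, and it is not the layout used by any particular collector or by BGPStream.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Origin(Enum):
    """ORIGIN attribute codes as defined in RFC 4271."""
    IGP = 0
    EGP = 1
    INCOMPLETE = 2


@dataclass
class BgpAnnouncement:
    """Hypothetical in-memory view of one announced prefix."""
    prefix: str                        # e.g. "192.0.2.0/24"
    as_path: List[int]                 # collector's peer first, origin AS last
    next_hop: str
    origin: Origin = Origin.IGP
    local_pref: Optional[int] = None   # only present on iBGP sessions
    med: Optional[int] = None
    communities: List[Tuple[int, int]] = field(default_factory=list)

    @property
    def origin_as(self) -> int:
        """The AS that originated the prefix (rightmost AS in the path)."""
        return self.as_path[-1]


example = BgpAnnouncement(
    prefix="192.0.2.0/24",
    as_path=[64496, 64497, 64498],
    next_hop="198.51.100.1",
    communities=[(64496, 100)],
)
print(example.origin_as)
```

The `origin_as` property is the value that route origin validation compares against a ROA, which is why it is kept separate from the `origin` attribute.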


# 2. BGP routing data

**Question 2.1**:
 
URL structure for MRT update files. According to the text, the URL format for MRT update files is as follows:

http://routeviews.org/route-views3/bgpdata/YYYY.MM/UPDATES/updates.YYYYMMDD.HHMM.bz2

In this format, the variable elements that can be changed to point to other files are:

- YYYY: The year (e.g., 2023)
- MM: The month (e.g., 01 for January)
- YYYYMMDD: The date (e.g., 20230102 for January 2, 2023)
- HHMM: The time in 24-hour format (e.g., 0230 for 02:30)

By modifying these variable elements in the URL, you can point to other MRT update files.
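
The URL pattern above can be generated from a timestamp. The following is a minimal sketch: the function names are hypothetical, and the 15-minute rounding assumes that update files are published on 15-minute boundaries, as stated in Question 2.3.

```python
from datetime import datetime

# Base path taken from the example URL; route-views3 is the collector name.
BASE = "http://routeviews.org/route-views3/bgpdata"


def floor_to_interval(ts: datetime, minutes: int = 15) -> datetime:
    """Round a timestamp down to the start of its dump interval."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


def update_url(ts: datetime) -> str:
    """Build the MRT update file URL for the interval containing ts."""
    ts = floor_to_interval(ts)
    return f"{BASE}/{ts:%Y.%m}/UPDATES/updates.{ts:%Y%m%d.%H%M}.bz2"


print(update_url(datetime(2023, 1, 2, 2, 37, 12)))
```

For any time between 02:30:00 and 02:44:59 on January 2, 2023, this yields the `updates.20230102.0230.bz2` file.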


**Question 2.2**: 

The two popular route collector projects are Routeviews and RIPE RIS. Both operate multiple collectors, but the source text does not state how many collectors each project runs.

**Question 2.3**:

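The dump intervals are documented on the collector projects' websites. The intervals quoted here, RIBs every two hours and updates every 15 minutes, are those of Routeviews; the `0230` in the update file name is consistent with a 15-minute schedule. RIPE RIS uses different intervals: RIBs are dumped every 8 hours, while updates are dumped every 5 minutes.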

**Question 2.4**:

 The 15-minute interval for update files involves the following trade-offs.
 
 * Pros: 
    - Visibility (user-side): New update files appear every 15 minutes, so users can follow changes in the routing table in near real time, which is useful for analysis and monitoring.
    - Granularity: Each file covers a short window, so users can fetch only the data around a specific event or anomaly.
 
 * Cons: 
    - Disk space (provider-side): Short intervals produce many files (96 per day per collector), which adds per-file overhead and archive management work for the provider.
    - Processing time: Every dump has to be written, compressed, and published, and users must fetch and parse more files to cover the same period.
    - Accuracy: RIB dumps are snapshots, so changes between two dumps are visible only by replaying the update files. Update files are published only after their window closes, so the newest data can be up to 15 minutes old.

 Overall, the 15-minute update interval provides a more granular view of changes in the routing table, at the cost of more storage and processing work for the provider. The choice of intervals should be based on the specific needs of the users and providers, and may vary depending on the context and use case.
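
The storage and latency trade-off can be made concrete with a little arithmetic. The sketch below assumes the intervals quoted in Question 2.3 (updates every 15 minutes, RIBs every two hours); the function name is hypothetical.

```python
from datetime import timedelta

DAY = timedelta(days=1)


def files_per_day(interval: timedelta) -> int:
    """Number of dump files one collector produces per day."""
    return DAY // interval


update_interval = timedelta(minutes=15)
rib_interval = timedelta(hours=2)

print(files_per_day(update_interval))  # 96 update files per day
print(files_per_day(rib_interval))     # 12 RIB dumps per day

# An update observed right after a window opens waits almost one full
# interval before its file is published.
print(update_interval)                 # worst-case publication delay
```

Halving the update interval would double the file count while halving the worst-case delay, which is the core of the trade-off discussed above.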


# 3. Parsing single MRT files

**Annotated HTML version of bgpreader’s output of the MRT file**:

http://nids.caida.org:45000/cgi-bin/bgpreader.sh?http://routeviews.org/route-views3/bgpdata/2023.01/UPDATES/updates.20230102.0230.bz2,100,1299

**Question 3.1**: How much time elapsed from line 80 to line 120?

```python
# Timestamps (Unix time, in seconds) taken from lines 80 and 120 of the output
timestamp80 = 1672626601.368799
timestamp120 = 1672626601.524185
```

To calculate the time elapsed, we subtract the timestamp of line 80 from the timestamp of line 120:

1672626601.524185 - 1672626601.368799 = 0.155386 seconds (about 155 ms)

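```python
import datetime

# fromtimestamp() without a tz argument converts to the machine's local time zone
datetime80 = datetime.datetime.fromtimestamp(timestamp80)
datetime120 = datetime.datetime.fromtimestamp(timestamp120)

# The format string keeps whole seconds only, so the fractional part is dropped
time_str80 = datetime80.strftime("%Y-%m-%d %H:%M:%S")
time_str120 = datetime120.strftime("%Y-%m-%d %H:%M:%S")

print(time_str80)
print(time_str120)
```

Output:

```
2023-01-02 11:30:01
2023-01-02 11:30:01
```

Both lines are identical because the two timestamps fall within the same second. The times are shown in the local time zone of the machine that ran the notebook (UTC+9 here); the same instant is 02:30:01 UTC, which matches the `0230` in the file name.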
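
Because both timestamps fall within the same second, formatting them to whole seconds hides the difference. A minimal sketch that keeps microsecond precision and prints UTC regardless of the machine's time zone is shown below; the helper name `to_utc` is hypothetical, and it parses the timestamps as decimal strings to avoid floating-point rounding.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal


def to_utc(ts: str) -> datetime:
    """Convert a 'seconds.microseconds' Unix timestamp string to an aware UTC datetime."""
    value = Decimal(ts)
    seconds = int(value)
    micros = int((value - seconds) * 1_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)


line80 = to_utc("1672626601.368799")
line120 = to_utc("1672626601.524185")
elapsed = line120 - line80

print(line80.isoformat())       # 2023-01-02T02:30:01.368799+00:00
print(line120.isoformat())      # 2023-01-02T02:30:01.524185+00:00
print(elapsed.total_seconds())  # 0.155386
```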
