# **BGPStream**

https://bgpstream.caida.org/docs/tutorials/pybgpstream


https://www.caida.org/catalog/papers/2016_bgpstream/supplemental/


https://github.com/CAIDA/pybgpstream/tree/master/examples

https://bgpstream.caida.org/docs/api/pybgpstream/pybgpstream.html


In [None]:
!sudo apt-get update
!sudo apt-get install -y curl apt-transport-https ssl-cert ca-certificates gnupg lsb-release
!curl -1sLf 'https://dl.cloudsmith.io/public/wand/libwandio/cfg/setup/bash.deb.sh' | sudo -E bash
!echo "deb https://pkg.caida.org/os/$(lsb_release -si|awk '{print tolower($0)}') $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/caida.list
!sudo wget -O /etc/apt/trusted.gpg.d/caida.gpg https://pkg.caida.org/os/ubuntu/keyring.gpg
!sudo apt-get update && sudo apt-get install -y bgpstream

In [None]:
!pip install pybgpstream
!python3 -m pip install pybgpkit-parser
!python3 -m pip install pybgpkit

In [3]:
from google.colab import drive
drive.mount('/content/drive')
%cd drive/MyDrive/Thesis Code/tools/bgpstream

Mounted at /content/drive
/content/drive/MyDrive/Thesis Code/tools/bgpstream


In [4]:
import pybgpstream
from collections import defaultdict
from collections import Counter
import networkx as nx
from itertools import groupby
import pickle
from bgpstream_aux import *

**Simple Example - fetch and print BGPStream**

In [None]:
# collectors=["rrc00"]
record_type="ribs"
from_time="2015-08-01 07:50:00"
until_time="2015-08-01 08:00:00"
filter=  'ipversion 6 and peer 25152 and path "_4554_"'
get_bgp_stream(None, record_type, from_time, until_time, filter=filter)

In [27]:
# create and configure the stream
stream = pybgpstream.BGPStream(
   from_time="2017-07-07 00:00:00", until_time="2017-07-07 00:10:00 UTC",
   collectors=["route-views.sg", "route-views.eqix"],
   record_type="updates",
   filter="peer 11666 and prefix more 210.180.0.0/16"
)

# add any additional (or dynamic) filters
# e.g. from peer AS 11666 regarding the more-specifics of 210.180.0.0/16:
# stream.parse_filter_string("peer 11666 and prefix more 210.180.0.0/16")
# or using the old filter interface:
# stream.add_filter("peer-asn", "11666")
# stream.add_filter("prefix-more", "210.180.0.0/16")

# read elems
for elem in stream:
   # record fields can be accessed directly from elem
   # e.g. elem.time
   # or via elem.record
   # e.g. elem.record.time
   print(elem)

# alternatively, records and elems can be read in nested loops
# (use one style or the other: the loop above already consumes the stream):
for rec in stream.records():
   # do something with rec (e.g., choose to continue based on timestamp)
   print("Received %s record at time %d from collector %s" % (rec.type, rec.time, rec.collector))
   for elem in rec:
      # do something with rec and/or elem
      print("  Elem Type: %s" % elem.type)

update|A|1499385779.000000|routeviews|route-views.eqix|None|None|11666|206.126.236.24|210.180.224.0/19|206.126.236.24|11666 3356 3786|3356:575 11666:1000 3356:3 3356:22 11666:1002 3356:2003 3786:0 3356:86 3356:666|None|None
update|A|1499385779.000000|routeviews|route-views.eqix|None|None|11666|206.126.236.24|210.180.0.0/19|206.126.236.24|11666 3356 3786|3356:575 11666:1000 3356:3 3356:22 11666:1002 3356:2003 3786:0 3356:86 3356:666|None|None
update|A|1499385788.000000|routeviews|route-views.eqix|None|None|11666|206.126.236.24|210.180.64.0/19|206.126.236.24|11666 6939 4766 4766|11666:2000 11666:2001|None|None
update|A|1499385833.000000|routeviews|route-views.eqix|None|None|11666|206.126.236.24|210.180.96.0/19|206.126.236.24|11666 9318|11666:2008 11666:2000|None|None
update|A|1499385851.000000|routeviews|route-views.eqix|None|None|11666|206.126.236.24|210.180.32.0/20|206.126.236.24|11666 3356 3786 4663 4663 4663 4663 4663|3356:575 3356:2011 11666:1000 3356:3 3356:22 11666:1002 3786:11 33
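Each printed line is a pipe-delimited rendering of one elem. As a minimal sketch, such a line can be split offline to pull out individual fields. The field positions below are assumptions inferred from the sample output above (for instance, 11666 in position 8 matches the `peer 11666` filter), and `parse_elem_line` is a hypothetical helper, not part of pybgpstream. It is meant for announcement lines only; in live code the same values are better read from the elem object itself, e.g. `elem.fields`.

```python
def parse_elem_line(line):
    """Split one printed elem line into a dict.

    Field positions are assumed from the sample output, not from a spec.
    """
    parts = line.strip().split("|")
    as_path = parts[11].split()
    return {
        "record_type": parts[0],
        "elem_type": parts[1],
        "time": float(parts[2]),
        "project": parts[3],
        "collector": parts[4],
        "peer_asn": int(parts[7]),
        "peer_address": parts[8],
        "prefix": parts[9],
        "next_hop": parts[10],
        "as_path": as_path,
        # The origin AS is the last hop of the AS path.
        "origin_asn": as_path[-1] if as_path else None,
        "communities": parts[12].split(),
    }


sample = (
    "update|A|1499385833.000000|routeviews|route-views.eqix|None|None|11666|"
    "206.126.236.24|210.180.96.0/19|206.126.236.24|11666 9318|"
    "11666:2008 11666:2000|None|None"
)
info = parse_elem_line(sample)
print(info["prefix"], info["origin_asn"], info["communities"])
```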

**BGPStream Filters**

https://github.com/CAIDA/libbgpstream/blob/master/FILTERING


In [11]:
def build_bgpstream_filter(**filters):
    """
    Build and return a BGPStream filter query string for querying BGP data.

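    This helper is intended to be easy for language models (LLMs) to call when
    constructing filters dynamically. It accepts keyword arguments where each key
    names a filter field (e.g., prefix, project, collector, record_type, time_start,
    time_end, as_path) and each value gives the criteria.

    Note: tokens are joined with single spaces and keys are emitted verbatim. The
    BGPStream filter examples join terms with `and` and use term names such as
    `aspath` and `type`, so check the output against the FILTERING rules before
    passing it to a stream.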

    The method supports:
      - **Simple key-value pairs:** e.g., prefix="1.1.1.0/24"
      - **Multiple filter values:** If a value is given as a list or tuple (e.g., prefix=["1.1.1.0/24", "2.2.0.0/16"]),
        they will be joined with commas.
      - **Negation:** To exclude specific criteria, prefix the key with "not_". In the filter string,
        a negation operator ('!') is prefixed to the value. For lists, each value will be negated.
        For example, not_origin_asn="65000" yields "origin_asn !65000".

    Examples:

      # 1. Basic filter:
      build_bgpstream_filter(prefix="1.1.1.0/24", project="routeviews")
      -> "prefix 1.1.1.0/24 project routeviews"

      # 2. Complex filter with multiple values and a time range:
      build_bgpstream_filter(prefix=["1.1.1.0/24", "2.2.0.0/16"],
                             project="routeviews",
                             collector=["route-views.sg", "route-views.oregon"],
                             record_type=["announcements", "updates"],
                             time_start="1651276800",
                             time_end="1651363200")
      -> "prefix 1.1.1.0/24,2.2.0.0/16 project routeviews collector route-views.sg,route-views.oregon record_type announcements,updates time_start 1651276800 time_end 1651363200"

      # 3. Complex filter with negations and patterns:
      build_bgpstream_filter(prefix="1.1.1.0/24",
                             not_project="ris",    # Exclude the "ris" project.
                             as_path="^64500",      # Filter AS paths starting with "64500" (could be used as a regex).
                             not_origin_asn=["65000", "65001"])  # Exclude these specific ASNs.
      -> "prefix 1.1.1.0/24 project !ris as_path ^64500 origin_asn !65000,!65001"

    Parameters:
      **filters : Arbitrary keyword arguments representing BGPStream filter criteria.

    Returns:
      A string representing the combined BGPStream filter query.
    """
    tokens = []
    for key, value in filters.items():
        if value is None:
            # Skip any filter keys with a None value.
            continue

        # Check if the key indicates a negation filter.
        negation = False
        if key.startswith("not_"):
            negation = True
            key = key[4:]  # Remove the 'not_' prefix to get the actual filter key.

        # Process the filter value:
        #   - If it is a list/tuple, join the items with commas,
        #     applying the negation operator '!' to each element if needed.
        #   - Otherwise, simply convert the value to a string and apply negation if required.
        if isinstance(value, (list, tuple)):
            processed_values = []
            for item in value:
                item_str = str(item)
                if negation:
                    item_str = f"!{item_str}"
                processed_values.append(item_str)
            value_str = ",".join(processed_values)
        else:
            value_str = str(value)
            if negation:
                value_str = f"!{value_str}"

        # Form the filter token as "key value_str".
        token = f"{key} {value_str}"
        tokens.append(token)

    # Join all tokens with a space to form the complete filter query.
    filter_query = " ".join(tokens)
    return filter_query
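The filter examples in the BGPStream filtering rules quoted later in this notebook join terms with the keyword `and` (for example `'collector rrc00 and type updates'`). A minimal sketch of a joiner that follows that convention is shown below. `join_filter_terms` is a hypothetical helper: it only concatenates already-formed `term value` strings and does not validate term names.

```python
def join_filter_terms(*terms):
    """Join non-empty 'term value' strings with ' and ' (hypothetical helper)."""
    cleaned = [t.strip() for t in terms if t and t.strip()]
    return " and ".join(cleaned)


example = join_filter_terms("collector rrc00", "type updates")
print(example)  # collector rrc00 and type updates
```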

In [12]:
# Example 1: Basic filter with just prefix and project.
filter_example1 = build_bgpstream_filter(prefix="1.1.1.0/24", project="routeviews")
print("Example 1 - Basic Filter:")
print(filter_example1)
print()

Example 1 - Basic Filter:
prefix 1.1.1.0/24 project routeviews



In [None]:
get_bgp_stream(collectors, record_type, from_time, until_time, filter_example1)

In [None]:
# Example 2: Complex filter using multiple prefixes, collectors, record types, and a time window.
filter_example2 = build_bgpstream_filter(
    prefix=["1.1.1.0/24", "2.2.0.0/16"],
    project="routeviews",
    collector=["route-views.sg", "route-views.oregon"],
    record_type=["announcements", "updates"],
    time_start="1651276800",
    time_end="1651363200"
)
print("Example 2 - Complex Filter with Multiple Values:")
print(filter_example2)
print()

In [None]:
get_bgp_stream(collectors, record_type, from_time, until_time, filter_example2)

In [14]:
# Example 3: Advanced filter utilizing negation and regex patterns.
# In this filter:
#   - We select the prefix "1.1.1.0/24",
#   - Exclude data from the "ris" project,
#   - Include only AS paths that start with "64500" (using a regex pattern),
#   - Exclude origin ASNs "65000" and "65001".
filter_example3 = build_bgpstream_filter(
    prefix="1.1.1.0/24",
    not_project="ris",         # Negation: exclude the 'ris' project.
    as_path="^64500",           # AS path pattern (regex) filter.
    not_origin_asn=["65000", "65001"]  # Negation: exclude these ASNs.
)
print("Example 3 - Complex Filter with Negations and Patterns:")
print(filter_example3)
print()

Example 3 - Complex Filter with Negations and Patterns:
prefix 1.1.1.0/24 project !ris as_path ^64500 origin_asn !65000,!65001



In [None]:
get_bgp_stream(collectors, record_type, from_time, until_time, filter_example3)

**AS Graph Analysis**
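
As a possible starting point for this section, the sketch below builds an undirected AS adjacency map from AS-path strings. It is an assumption about what the analysis might look like, not code from `bgpstream_aux`: `build_as_adjacency` is a hypothetical helper, `groupby` collapses AS-path prepending (repeated consecutive ASNs), and a plain dict of sets stands in for a `networkx` graph so the example runs without extra dependencies.

```python
from collections import defaultdict
from itertools import groupby


def build_as_adjacency(as_paths):
    """Return {asn: set of neighbouring asns} built from AS-path strings."""
    adjacency = defaultdict(set)
    for path in as_paths:
        # Collapse prepending, e.g. "4766 4766" becomes a single hop.
        hops = [asn for asn, _ in groupby(path.split())]
        # Each pair of consecutive hops is one AS-level link.
        for left, right in zip(hops, hops[1:]):
            adjacency[left].add(right)
            adjacency[right].add(left)
    return adjacency


paths = ["11666 3356 3786", "11666 6939 4766 4766", "11666 9318"]
graph = build_as_adjacency(paths)
print(sorted(graph["11666"]))
```

The AS paths used here are taken from the sample update output earlier in the notebook; in practice they would come from `elem.fields` while iterating a stream.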

**BGP Communities**

In [6]:
from_time="2015-08-01 07:50:00"
until_time="2015-08-01 08:10:00"
collectors=["rrc06"]
record_type="ribs"
filter="peer 25152 and prefix more 185.84.166.0/23 and community *:3400"
communities = get_bgp_communities_info(from_time, until_time, collectors, filter=filter)

In [None]:
for ct in communities:
    print(ct)
    print(communities[ct])
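
The filter above uses the wildcard community `*:3400`. As an illustration of what that pattern means, the sketch below matches `asn:value` strings against a pattern in which either side may be `*`. `community_matches` is a hypothetical helper written for this explanation; it is not how BGPStream or `get_bgp_communities_info` evaluates the filter.

```python
def community_matches(community, pattern):
    """Return True if an 'asn:value' community matches a pattern with optional '*' parts."""
    c_asn, c_val = community.split(":")
    p_asn, p_val = pattern.split(":")
    # Each side matches when the pattern part is '*' or equals the community part.
    return p_asn in ("*", c_asn) and p_val in ("*", c_val)


print(community_matches("25152:3400", "*:3400"))
print(community_matches("3356:575", "*:3400"))
```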

**BGPStream Filtering Rules**

The list of accepted terms and their meaning are:
  1. type - restrict the stream to only records of a certain type, e.g. updates.
  2. elemtype - restrict the stream to only elements of a certain type. Possible element types are "ribs", "announcements", "withdrawals", and "peerstates".
  3. peer - restrict the stream to only those elements from a particular peer ASN
  4. prefix exact - restrict the stream to only elements with a prefix that *exactly* matches the given prefix
  5. prefix more - restrict the stream to only elements with a prefix that either matches or is more specific than the given prefix
  6. prefix less - restrict the stream to only elements with a prefix that either matches or is less specific than the given prefix
  7. prefix any - restrict the stream to only elements with a prefix that is either more or less specific than the given prefix (exact matches also are included).
  8. ipversion (ipv) - restrict the stream to only elements with prefixes belonging to the given IP address family, i.e. IPv4 or IPv6.
  9. community (comm) - restrict the stream to only elements that have a community that matches the provided string. The string is formatted as 'asn:value' but '*' may be used as a wildcard, e.g. '*:300' will match all elements with a community value of 300, regardless of the ASN.
  10. aspath (path) - restrict the stream to only elements with an AS path that matches the provided regular expression. The regular expression should use the standard Cisco format, i.e. '^' represents the start of the AS path, '$' represents the end of the path, and '_' represents the link between two consecutive peers. Standard regex operators can also be used, e.g. *, ?, + and [].
  For example, the expression '^681_1444_' will match any AS path that begins with AS681 followed by AS1444.
  Placing a '!' in front of the regular expression negates the result, i.e. the element will only be streamed if the path does NOT match the regular expression. For example, '!^681_' will stream all paths that do not begin with AS681.
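
The four prefix modes (exact, more, less, any) can be illustrated with Python's standard `ipaddress` module. This is a sketch of the semantics described in the list, not BGPStream's own implementation; `prefix_matches` is a hypothetical helper.

```python
import ipaddress


def prefix_matches(elem_prefix, filter_prefix, mode="more"):
    """Check an elem prefix against a filter prefix under one of the four modes."""
    elem = ipaddress.ip_network(elem_prefix, strict=False)
    target = ipaddress.ip_network(filter_prefix, strict=False)
    if elem.version != target.version:
        return False  # IPv4 and IPv6 prefixes never match each other
    is_more = elem.subnet_of(target)  # equal to or more specific than the filter
    is_less = target.subnet_of(elem)  # equal to or less specific than the filter
    if mode == "exact":
        return elem == target
    if mode == "more":
        return is_more
    if mode == "less":
        return is_less
    if mode == "any":
        return is_more or is_less
    raise ValueError(f"unknown mode: {mode}")


print(prefix_matches("210.180.224.0/19", "210.180.0.0/16", "more"))
print(prefix_matches("210.180.224.0/19", "210.180.0.0/16", "less"))
```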

Examples

1. Filter only updates from the rrc00 collector:
'collector rrc00 and type updates'

2. Filter the prefix '1.2.3.0/22' or any more specific prefixes from either the rrc06 or the route-views.jinx collectors:
  'prefix more 1.2.3.0/22 and collector rrc06 and collector route-views.jinx'

3. Filter IPv6 records that have a peer asn of 25152 and include the ASN 4554 in the AS path:
  'ipversion 6 and peer 25152 and path "_4554_"'
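
To make the Cisco-style AS-path patterns concrete, the sketch below translates `_` into a Python regex alternative that matches the start of the path, the end of the path, or the whitespace between two ASNs, and applies the optional leading `!` as negation. This only illustrates the semantics described above; `aspath_matches` is a hypothetical helper and is not how libbgpstream evaluates the filter.

```python
import re


def aspath_matches(as_path, pattern):
    """Evaluate a Cisco-style AS-path pattern against a space-separated AS path."""
    negate = pattern.startswith("!")
    if negate:
        pattern = pattern[1:]
    # '_' stands for a boundary: start of path, end of path, or a separator.
    regex = pattern.replace("_", r"(?:^|\s|$)")
    matched = re.search(regex, as_path) is not None
    return matched != negate


print(aspath_matches("25152 4554 1234", "_4554_"))
print(aspath_matches("25152 14554 1234", "_4554_"))
```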

**Tests**

In [8]:
collectors=["route-views.sg"]
record_type="ribs"
from_time="2015-08-01 08:00:00"
until_time="2015-08-01 08:00:01"

stream = pybgpstream.BGPStream(
    collectors=collectors,
    record_type=record_type,
    from_time=from_time,
    until_time=until_time
    )

In [None]:
for rec in stream.records():
    for elem in rec:
        # print the elem type, the peer that reported it, and the remaining fields
        print("\t", elem.type, elem.peer_address, elem.peer_asn, elem.fields)

**Next hop (IP address) with which ASN xxx announces the prefix yyy**

**ROA research using BGPStream - return the dates on which the path was ROA valid or invalid**

**Traffic Engineering Detection**

**Return all paths that do not include a specific AS or set of ASes (implement with the AS graph or filters)**
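
One way this could be approached on the client side, once AS paths have been collected, is a plain membership test per path. The sketch below is an assumption about a possible implementation, not existing code from this project; `paths_excluding` is a hypothetical helper and the sample paths are taken from the update output earlier in the notebook.

```python
def paths_excluding(as_paths, excluded_asns):
    """Return the AS paths that contain none of the excluded ASNs."""
    excluded = {str(asn) for asn in excluded_asns}
    # Compare whole ASN tokens so that 356 does not match 3356.
    return [path for path in as_paths if excluded.isdisjoint(path.split())]


sample_paths = ["11666 3356 3786", "11666 6939 4766 4766", "11666 9318"]
kept = paths_excluding(sample_paths, [3356, 4766])
print(kept)
```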