Welcome to the repository of as2org+. Here you can find the source code and the instructions to install and use as2org+.
as2org+ is a product of the research published in the homonymous paper "as2org+ : Enriching AS-to-Organization Mappings with PeeringDB" appearing in the Proceedings of the Passive and Active Measurement Conference (PAM) 2023, March 2023, Virtual Event.
The paper is available at https://www.aqualab.cs.northwestern.edu/wp-content/uploads/2023/02/AArturi-PAM23.pdf
We highly recommend using a Python virtual environment to run as2org+. This repository includes a requirements.txt file
listing all Python packages needed to run as2org+, plus the additional packages needed to run the example.
To create the virtual environment and install these packages, run the following commands.
$ python3 -m venv .as2orgplus
$ source .as2orgplus/bin/activate
$ pip3 install -r requirements.txt
Then install the as2orgplus library:
$ python setup.py install
In the following lines, we use a 2021 PeeringDB snapshot as an example to illustrate all the steps necessary to run as2org+. Please select snapshots whose dates align with the period for which you would like to obtain AS-to-Organization mappings.
as2org+ requires three different data inputs:
- A PeeringDB snapshot
- An AS2Org file
- An AS relationships file
Copy and paste these commands to download one example of each of these three datasets:
$ wget https://publicdata.caida.org/datasets/peeringdb/2021/06/peeringdb_2_dump_2021_06_01.json
$ wget https://publicdata.caida.org/datasets/as-organizations/20210401.as-org2info.txt.gz
$ wget https://publicdata.caida.org/datasets/as-relationships/serial-1/20210601.as-rel.txt.bz2
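For orientation, the AS relationships file uses CAIDA's serial-1 text format: lines starting with `#` are comments, and each data line reads `<AS1>|<AS2>|<rel>`, where `-1` means AS1 is a provider of AS2 and `0` means the two ASes are peers. The following is a minimal sketch for inspecting such a file. It is not part of as2org+, and `parse_as_rel` and `load_as_rel` are hypothetical helper names introduced only for illustration.

```python
import bz2
from typing import Iterable, List, Tuple


def parse_as_rel(lines: Iterable[str]) -> List[Tuple[int, int, int]]:
    """Parse serial-1 lines into (as1, as2, relationship) tuples."""
    relationships = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header comments.
        if not line or line.startswith("#"):
            continue
        # Only the first three '|'-separated columns are used here.
        as1, as2, rel = line.split("|")[:3]
        relationships.append((int(as1), int(as2), int(rel)))
    return relationships


def load_as_rel(path: str) -> List[Tuple[int, int, int]]:
    """Read a bz2-compressed as-rel file (hypothetical helper)."""
    with bz2.open(path, "rt") as handle:
        return parse_as_rel(handle)


if __name__ == "__main__":
    # Made-up sample lines; ASNs are from the documentation range.
    sample = ["# example header", "64500|64501|-1", "64501|64502|0"]
    for as1, as2, rel in parse_as_rel(sample):
        kind = "provider-customer" if rel == -1 else "peer-peer"
        print(as1, as2, kind)
```

With the file downloaded above, `load_as_rel("20210601.as-rel.txt.bz2")` would return the full list of relationship tuples.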
CAIDA concatenates two lists (organizations and ASes) in each as-org2info file, and these need to be separated before running as2org+. The following script splits an as-org2info file into two separate files.
$ python as2org_file_splitter.py -f 20210401.as-org2info.txt.gz
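To illustrate the idea behind the split, here is a minimal sketch that groups the data lines of an as-org2info file by the `# format:` header that precedes them. It assumes the two sections are introduced by headers beginning with `# format:org_id` and `# format:aut`, as in CAIDA's published files. It is not the repository's `as2org_file_splitter.py`, whose output file names and format are not reproduced here, and `split_as_org2info` is a hypothetical name.

```python
from typing import Dict, Iterable, List


def split_as_org2info(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group data lines under the most recent '# format:' header."""
    sections: Dict[str, List[str]] = {"org": [], "as": []}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Normalize '# format: org_id|...' and '# format:org_id|...'.
            header = line.lstrip("# ").replace(" ", "")
            if header.startswith("format:org_id"):
                current = "org"
            elif header.startswith("format:aut"):
                current = "as"
            continue
        # Keep non-empty data lines once a section header has been seen.
        if line and current is not None:
            sections[current].append(line)
    return sections


if __name__ == "__main__":
    # Made-up sample rows that follow the assumed column layout.
    sample = [
        "# format:org_id|changed|org_name|country|source",
        "ORG-EX1|20210401|Example Org|US|ARIN",
        "# format:aut|changed|aut_name|org_id|opaque_id|source",
        "64500|20210401|EXAMPLE-AS|ORG-EX1||ARIN",
    ]
    for name, rows in split_as_org2info(sample).items():
        print(name, len(rows))
```

For a real file, the lines could be read with `gzip.open("20210401.as-org2info.txt.gz", "rt")` and each section written to its own file.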
as2org+ can be configured to extract embedded clusters from the aka, notes, and org fields, or from any combination of the three (e.g., aka and notes). Please take a look at the documentation for more details. In this example, we show how to run it using only the data available in the aka field.
$ python as2orgplus.py -f aka \
    -s peeringdb_2_dump_2021_06_01.json \
    -a 20210601.as-rel.txt.bz2 \
    -w 2021-04-01_as2info.csv.gz \
    -o aka_20210601.json
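To get a feel for the aka data that the command above draws on, the snapshot can be inspected directly. The sketch below assumes that the PeeringDB dump is a JSON object whose `net` entry holds a `data` list of network records, each with `asn` and `aka` keys. This layout is an assumption about the dump, not something as2org+ documents here, and `nets_with_aka` is a hypothetical helper name.

```python
import json
from typing import Dict


def nets_with_aka(dump: dict) -> Dict[int, str]:
    """Map ASN -> 'aka' text for network records with a non-empty 'aka'."""
    result = {}
    for net in dump.get("net", {}).get("data", []):
        # 'aka' may be missing, None, or whitespace only.
        aka = (net.get("aka") or "").strip()
        if aka:
            result[net["asn"]] = aka
    return result


def load_dump(path: str) -> dict:
    """Load a PeeringDB JSON dump from disk (hypothetical helper)."""
    with open(path) as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Made-up records that follow the assumed layout.
    sample = {"net": {"data": [
        {"asn": 64500, "aka": "Example Networks"},
        {"asn": 64501, "aka": ""},
    ]}}
    print(nets_with_aka(sample))
```

With the snapshot downloaded earlier, `nets_with_aka(load_dump("peeringdb_2_dump_2021_06_01.json"))` would list the networks that have something in their aka field.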
$ python as2orgplus.py --help
usage: as2orgplus.py [-h] -f FIELDS [FIELDS ...] -s SNAPSHOT -a ASREL
                     [-c2pth [C2P_THRESHOLD]] [-w [WHOIS]] [-o [OUTPUT]]
                     [-r [{simple,complex,both}]]

options:
  -h, --help            show this help message and exit
  -f FIELDS [FIELDS ...], --fields FIELDS [FIELDS ...]
                        Enter the fields of PDB to use, e.g: notes, aka, org
                        or any combination of the previous ones such as -f org
                        aka
  -s SNAPSHOT, --snapshot SNAPSHOT
                        Specify a PDB snapshot
  -a ASREL, --asrel ASREL
                        Specify a AS-REL snapshot
  -c2pth [C2P_THRESHOLD], --c2p_threshold [C2P_THRESHOLD]
                        Specify the output file
  -w [WHOIS], --whois [WHOIS]
                        Specify the AS2Org file
  -o [OUTPUT], --output [OUTPUT]
                        Specify the output file
  -r [{simple,complex,both}], --regex [{simple,complex,both}]
                        Enter a regex complexity in field notes to use, e.g:
                        notes, simple, complex, both
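When running as2org+ for several field combinations, it can help to assemble the command line programmatically. The sketch below uses only the `-f`, `-s`, `-a`, `-w`, and `-o` flags listed in the help output. `build_command` is a hypothetical helper, and the output file naming simply extends the `aka_20210601.json` pattern from the example as an assumption.

```python
import itertools
import shlex
from typing import List, Sequence


def build_command(fields: Sequence[str], snapshot: str, asrel: str,
                  whois: str, output: str) -> List[str]:
    """Build the argv list for one as2orgplus.py run."""
    return [
        "python", "as2orgplus.py",
        "-f", *fields,      # one or more PeeringDB fields
        "-s", snapshot,     # PeeringDB snapshot
        "-a", asrel,        # AS relationships file
        "-w", whois,        # AS2Org file
        "-o", output,       # output file
    ]


if __name__ == "__main__":
    all_fields = ("aka", "notes", "org")
    # Every non-empty combination of the three fields.
    for size in range(1, len(all_fields) + 1):
        for fields in itertools.combinations(all_fields, size):
            cmd = build_command(
                fields,
                "peeringdb_2_dump_2021_06_01.json",
                "20210601.as-rel.txt.bz2",
                "2021-04-01_as2info.csv.gz",
                "_".join(fields) + "_20210601.json",  # assumed naming
            )
            print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` to execute the run from Python.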
If you use as2org+, please cite it as:
@inproceedings{as2orgplus:PAM23,
  title = {as2org+ : Enriching AS-to-Organization Mappings with PeeringDB},
  author = {Augusto Arturi and Esteban Carisimo and Fabián E. Bustamante},
  url = {https://www.aqualab.cs.northwestern.edu/wp-content/uploads/2023/02/AArturi-PAM23.pdf},
  year = {2023},
  date = {2023-03-21},
  booktitle = {Proc. of the Passive and Active Measurement Conference (PAM)},
  keywords = {},
}
.
├── LICENCE
├── README.md
├── as2org_file_splitter.py
├── as2orgplus
│   ├── __init__.py
│   ├── aka.py
│   ├── clustering.py
│   ├── features.py
│   ├── filters.py
│   ├── helpers.py
│   ├── notes.py
│   └── org.py
├── as2orgplus.py
├── requirements.txt
└── setup.py
1 directory, 14 files