recipe: How to identify root causes in customer cone changes #509

Open
bhuffaker opened this issue Oct 11, 2022 · 3 comments · May be fixed by #675
bhuffaker commented Oct 11, 2022

Overview

We would like to develop a recipe, likely consisting of multiple scripts, that will allow users to identify what caused a customer cone change between two dates for a target AS.

We will likely need to increase the set of files that are provided publicly. You will be working on beamer.caida.org, so you have access to the full set of files. Keep a list of the files you need and we will make sure they are public when the recipe is released.

What we want to identify is the root causes of the changes in a single AS's customer cone. Find the set of ASes that were removed or added. Find the set of paths that contained the target AS and the added or removed ASes.

Identify common factors among the changes:

  • the path containing the customer was found in stable, but not in paths
    • was it not stable?
    • was it cleaned?
  • the path containing the customer was found in all-paths, but not in stable
    • why was it not copied to stable?
  • the lost/added ASes share the same set of collectors or peers
  • the lost/added ASes share the same AS across their paths
  • a relationship changed, which caused the paths to change
    • once we have the above working, we will move on to identifying the reason
      the relationship changed.

These should be presented in a table sorted by the number of ASes affected. This is just the starting point. We will likely discover more root causes as we work.

type           | description          | ASes lost | ASes gained
lost monitor   | monitor A was lost   | -34       | -
lost collector | collector B was lost | -7        | -
AS lost        | AS A disappeared     | -2        | +4
AS gained      | AS B appeared        | -         | +4

Process

Here is a high-level overview of the process we will follow.

  1. script that identifies the set of paths from .paths.bz2 and .paths.bz that contain the target AS and the ASes that were lost or gained between .ppdc-ases.txt.bz2 and .ppdc-ases.txt.bz2 (one file per date), and creates <AS_target>-paths.txt
    • target paths: .... AS_target .... AS_lost/AS_gained ...
  2. script that finds the original paths in <date*>all-paths.bz2 and annotates each path with the collector and peer that observed it
    • you will need to use 20220901-path_info.jsonl to find the paths that were changed in the cleaning process
    • the output format should be JSONL; only include original/tags if the path was changed
    {
         "path":"4739|7545|45430|45458|38820",
         "peers":[ {"collector":"ripe/rrc00","peer":"161.129.152.2"} ],
         "tags":[{"link":"7545|45430","index":2,"type":"IX"}],
         "original":"4739|7545|24115|45430|45458|38820"
    }
  3. script that creates two trees rooted at the collectors
    • Lost Tree uses the paths from date1 for ASes that were lost between date1 and date2
    • Gained Tree uses the paths from date2 for ASes that were gained between date1 and date2
    • For each tree, count the number of ASes below each node in the graph
  4. script that identifies the root causes, i.e. the most important nodes that contributed to the gained/lost ASes
    • this is going to be tricky, because we need to identify a threshold for when to attribute a change to a child rather than a parent
    • in some cases this will be easy, e.g. if the node's change is split across multiple parents; in others it will be harder to identify
  5. script that produces a table/report of the root causes
c - Z - A - B - C - D
c - Z - A - B - E - F
c - T - A - G

      / T (1) \       / G (1)
c (3) - Z (2) - A (3) - B (2) - C (1) - D (1)
                              \ E (1) - F (1)
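The counting in the example above can be sketched in a few lines of Python. This is a minimal sketch, not the recipe's implementation: it assumes each path is written collector-first and ends at the lost or gained AS, and it counts the distinct path endpoints reachable at or below each node. The function name `count_below` is hypothetical. Because A is reached through both Z and T, the structure is really a graph with shared nodes, so the sketch stores a set of children per node.

```python
from collections import defaultdict

def count_below(paths):
    """Return (children, reach) for paths written collector-first.

    children[node] is the set of next hops seen directly after node.
    reach[node] is the set of path endpoints (assumed to be the lost or
    gained ASes) found at or below node.
    """
    children = defaultdict(set)
    reach = defaultdict(set)
    for path in paths:
        endpoint = path[-1]
        for i, node in enumerate(path):
            reach[node].add(endpoint)
            if i + 1 < len(path):
                children[node].add(path[i + 1])
    return children, reach

# The three example paths from the diagram above.
paths = [
    ["c", "Z", "A", "B", "C", "D"],
    ["c", "Z", "A", "B", "E", "F"],
    ["c", "T", "A", "G"],
]
children, reach = count_below(paths)
counts = {node: len(ends) for node, ends in reach.items()}
for node in ["c", "Z", "T", "A", "B", "C", "D", "E", "F", "G"]:
    print(f"{node} ({counts[node]})")
```

On the example this gives the same numbers as the diagram, for instance 3 for c and A and 2 for Z and B.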

Background

AS Rank combines multiple datasets, but this script will be examining the BGP paths that are used to infer the AS Relationships and Customer Cones. Reading material:

Reference code:

The AS Rank data pipeline is as follows:

  1. Download the BGP RIB
    /data/external/as-rank-ribs/20220901/{ripe,routeviews}

  2. combine all the paths
    /data/external/as-rank-ribs/20220901/20220901.all-paths.bz2
    ripe/rrc00|4 13830|3356|1299|9583|45769 1.186.170.0/24 i 161.129.152.2

    • org: ripe
    • collector: rrc00
    • frequency seen across 5 days: 4
    • asn path: 13830|3356|1299|9583|45769
    • prefix: 1.186.170.0/24
    • code: i
    • peer: 161.129.152.2
    • path: rrc00 - 161.129.152.2 - 13830|3356|1299|9583|45769 - 1.186.170.0/24
  3. find the stable paths
    /data/external/as-rank-ribs/20220901/20220901stable.paths.bz2
    47787|174|12389|42742

  4. create the set of "clean" paths
    /data/external/as-rank-ribs/20220901/20220901.paths.bz2
    47787|174|12389|42742

  5. infer the set of AS Relationships
    /data/external/as-rank-ribs/20220901/20220901.as-rel.txt.bz2

    • 3356|2|-1 : 3356 is a provider of 2 : 3356 < 2
    • 2|3356|1 : 2 is a customer of 3356 : 2 > 3356
    • 2|3356|0 : 2 is a peer of 3356 : 2 - 3356
    • example annotated path: 4 > 5 - 10 < 99 < 23
  6. infer AS customer cones using as-rel on paths
    /data/external/as-rank-ribs/20220901.ppdc-ases.txt.bz2
    3356 1 2 3 4 6 7 9 11 3356
    • this means 3356's customer cone includes 1 2 3 4 6 7 9 11 3356
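Given the ppdc-ases line format shown above, the lost and gained sets for a target AS are a set difference between two dates. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the date2 line is made up for illustration, and skipping lines that start with `#` is an assumption about the file rather than something stated here.

```python
def parse_cone_line(line):
    """Parse one ppdc-ases line: the first field is the AS, the rest is its cone."""
    fields = line.split()
    return fields[0], set(fields[1:])

def load_cone(lines, target):
    """Return the customer cone of target, or an empty set if it is not listed."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment convention
            continue
        asn, cone = parse_cone_line(line)
        if asn == target:
            return cone
    return set()

def cone_change(lines_date1, lines_date2, target):
    """Return (lost, gained) ASes for target between the two dates."""
    cone1 = load_cone(lines_date1, target)
    cone2 = load_cone(lines_date2, target)
    return cone1 - cone2, cone2 - cone1

date1 = ["3356 1 2 3 4 6 7 9 11 3356"]   # example line from this issue
date2 = ["3356 1 2 3 4 6 9 11 12 3356"]  # hypothetical second date
lost, gained = cone_change(date1, date2, "3356")
print("lost:", sorted(lost), "gained:", sorted(gained))
```

For the real files, the `lines` arguments could come from `bz2.open(path, "rt")`, which yields text lines from a .bz2 file.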

You can find a set of annotated paths here:

  • /data/external/topology-asdata/as-rank/20220901/20220901-path_info.jsonl
    • this contains the set of "changed" paths and why they were changed.
    • these files are created from the as-relationship-serial-1 files and used to populate AS rank.
      {"original":"34288|4637|1221|17819|138592","path":"34288|4637|1221|138592","tags":[{"link":"1221|138592","index":3,"type":"IX"}]}
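One way to combine the all-paths lines with the path_info records into the JSONL output described under Process is sketched below. It is only a sketch: the function names are hypothetical, it assumes every all-paths line has exactly the five whitespace-separated fields shown in the example, and it assumes at most one path_info record per cleaned path (a later record would overwrite an earlier one).

```python
import json
from collections import defaultdict

def parse_all_paths_line(line):
    """Split 'org/collector|freq path prefix code peer' into a dict."""
    source, path, prefix, code, peer = line.split()
    collector, freq = source.split("|")
    return {"collector": collector, "freq": int(freq), "path": path,
            "prefix": prefix, "code": code, "peer": peer}

def index_observers(all_paths_lines):
    """Map each original path to the collector/peer pairs that observed it."""
    observers = defaultdict(list)
    for line in all_paths_lines:
        rec = parse_all_paths_line(line)
        entry = {"collector": rec["collector"], "peer": rec["peer"]}
        if entry not in observers[rec["path"]]:
            observers[rec["path"]].append(entry)
    return observers

def index_path_info(path_info_lines):
    """Map each cleaned path to its path_info record."""
    return {info["path"]: info for info in map(json.loads, path_info_lines)}

def annotate(clean_path, observers, changed):
    """Build one JSONL record; original/tags appear only for changed paths."""
    info = changed.get(clean_path)
    original = info["original"] if info else clean_path
    record = {"path": clean_path, "peers": observers.get(original, [])}
    if info:
        record["tags"] = info["tags"]
        record["original"] = info["original"]
    return json.dumps(record)

# Hypothetical pairing: the all-paths line format from this issue, carrying
# the original path of the path_info example.
all_paths = ["ripe/rrc00|4 34288|4637|1221|17819|138592 1.186.170.0/24 i 161.129.152.2"]
path_info = ['{"original":"34288|4637|1221|17819|138592",'
             '"path":"34288|4637|1221|138592",'
             '"tags":[{"link":"1221|138592","index":3,"type":"IX"}]}']
print(annotate("34288|4637|1221|138592",
               index_observers(all_paths), index_path_info(path_info)))
```

Looking up observers by the original path matters because all-paths holds the paths as seen by the collectors, before cleaning.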

/data/external/topology-asdata/as-rank/20220901/
To make sure you have access to the full set of files. We will

Inferring customer cone

A > B - C < E
             C < E  c.cone_add(e)
A > B > C < E
                    E
A > B < C < E
             C < E c.cone_add(e)

Rules for cleaning paths

cleaning

  • remove IX from AS paths

    A IX B -> A B
  • remove adjacent duplicate ASes

    A A C D -> A C D

rejecting: the whole path is thrown out

  • loops: an AS is repeated with a different AS in between the first and second occurrence

    A B C D C -> rejected

  • poisoned paths: If there is a non clique member between two clique members

    A_clique B_not_clique C_clique
    A B C1 C2 C3 D -> not rejected
    A B C1 C2 D C3 E -> rejected, because of C2 D C3
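The cleaning and rejection rules above can be expressed as a single function. This is a sketch of the rules as written here, not the AS Rank code: `clean_path`, `ix_ases`, and `clique` are hypothetical names, the order of operations (IX removal before duplicate collapsing) is an assumption, and "between two clique members" is read as "anywhere between the first and last clique member on the path".

```python
def clean_path(path, ix_ases, clique):
    """Return the cleaned path, or None when the whole path is rejected."""
    # Cleaning: drop IX ASes, then collapse adjacent duplicates.
    cleaned = []
    for asn in path:
        if asn in ix_ases:
            continue
        if cleaned and cleaned[-1] == asn:
            continue
        cleaned.append(asn)
    # Rejecting loops: with adjacent duplicates gone, any remaining repeat
    # has a different AS between its occurrences.
    if len(set(cleaned)) != len(cleaned):
        return None
    # Rejecting poisoned paths: clique members must be contiguous, otherwise
    # a non-clique AS sits between two clique members.
    clique_idx = [i for i, asn in enumerate(cleaned) if asn in clique]
    if clique_idx and clique_idx[-1] - clique_idx[0] + 1 != len(clique_idx):
        return None
    return cleaned

clique = {"C1", "C2", "C3"}
print(clean_path(["A", "IX", "B"], {"IX"}, clique))
print(clean_path(["A", "A", "C", "D"], set(), clique))
print(clean_path(["A", "B", "C", "D", "C"], set(), clique))
print(clean_path(["A", "B", "C1", "C2", "C3", "D"], set(), clique))
print(clean_path(["A", "B", "C1", "C2", "D", "C3", "E"], set(), clique))
```

The five calls correspond to the five examples in this section; the third and fifth return None.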
    
@richmass1 richmass1 changed the title recipe: How to identiy root causes in custome one changes recipe: How to identiy root causes in customer cone changes Oct 12, 2022
@richmass1 richmass1 changed the title recipe: How to identiy root causes in customer cone changes recipe: How to identify root causes in customer cone changes Oct 12, 2022
@bhuffaker

num_paths : number_asn

  • how many lost / gained
  • most asns seen by how many paths
  • how many paths
  1. count lost
    paths from paths.date1 that contain the target and ASNs in asns.date1 but not in asns.date2
  2. count path freq
    number of paths seeing an ASN | number of ASNs with that number of paths
    num paths | num asns lost | all asns
    10 | 324 # 324 asns were seen on 10 paths
    4 | 3456 # 3456 asns were seen on 4 paths
    1 | 23423423
  3. print out paths given dest
    .print-paths.py -r 2021.as-rel.gz -p paths-lost-as.2021.txt 87 90
    424 > 100 - 98 < 87
    424 > 100 - 98 < 2 < 87
    424 > 101 - 99 < 90
  4. filter down to customer cone
  5. print betweenness of transit ASNs
    https://en.wikipedia.org/wiki/Betweenness on just lost ASNs, and all paths
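The path frequency count in step 2 is a histogram of a histogram: first count how many paths see each lost ASN, then count how many ASNs share each path count. A minimal sketch, with hypothetical function names and with paths given as pipe-separated strings as in the .paths files:

```python
from collections import Counter

def path_frequency(paths, asns):
    """Map each ASN in asns to the number of paths that contain it."""
    freq = Counter()
    for path in paths:
        for asn in set(path.split("|")) & asns:
            freq[asn] += 1
    return freq

def frequency_histogram(freq):
    """Map 'number of paths' to 'number of ASNs seen on that many paths'."""
    return Counter(freq.values())

# The three example paths from step 3, without the relationship symbols.
paths = ["424|100|98|87", "424|100|98|2|87", "424|101|99|90"]
lost = {"87", "90"}
hist = frequency_histogram(path_frequency(paths, lost))
for num_paths, num_asns in sorted(hist.items(), reverse=True):
    print(f"{num_paths} | {num_asns}")
```

Running the same two functions with the set of all ASNs instead of `lost` would fill the "all asns" column.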

@bhuffaker

  • T: target
  • G1: gained
  • G2: gained
  • 1 > 2 : 1 is customer of 2
  • 1 < 2 : 1 is provider of 2
  • 1 - 2 : 1 is peer of 2
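Paths in this notation can be produced from an as-rel file and a plain AS path. The sketch below assumes the as-rel codes described in the Background section of this issue (-1 provider, 1 customer, 0 peer); the function names and the sample relationship lines are hypothetical, and links with no known relationship are marked with `?`.

```python
def load_relationships(lines):
    """Build {(a, b): symbol} so that 'a symbol b' reads left to right."""
    symbols = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment convention
            continue
        a, b, rel = line.split("|")[:3]
        if rel == "-1":    # a is a provider of b: a < b
            symbols[(a, b)], symbols[(b, a)] = "<", ">"
        elif rel == "1":   # a is a customer of b: a > b
            symbols[(a, b)], symbols[(b, a)] = ">", "<"
        elif rel == "0":   # a and b are peers: a - b
            symbols[(a, b)] = symbols[(b, a)] = "-"
    return symbols

def annotate_path(path, symbols):
    """Render ['424', '100', ...] as '424 > 100 ...'; '?' marks unknown links."""
    out = [path[0]]
    for left, right in zip(path, path[1:]):
        out += [symbols.get((left, right), "?"), right]
    return " ".join(out)

rels = ["100|424|-1", "100|98|0", "98|87|-1"]  # hypothetical as-rel lines
print(annotate_path(["424", "100", "98", "87"], load_relationships(rels)))
# prints: 424 > 100 - 98 < 87
```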

paths:

1 > 2 > T - 3 < G1
1 > 8 < T < 4 < 5
1 > 8 < T < 4 < G1
8 < T < 4 < G1
8 < T < 4 < G1
14 > 15 - T < 4 < 2 < G2
35 - 45 < T < 4 < 3 < G3
  1. strip paths that do not contain a gained AS

    1 > 2 > T - 3 < G1
    1 > 8 < T < 4 < G1
    8 < T < 4 < G1
    8 < T < 4 < G1
    14 > 15 - T < 4 < 2 < G2
    35 - 45 < T < 4 < 3 < G3
    
  2. strip the path up to just after the peer or customer link

    3 - G1
    T < 4 < G1
    T < 4 < G1
    T < 4 < G1
    T < 4 < G1
    T < 4 < 2 < G2
    45 < T < 4 < 3 < G3
    
  3. strip up to target

    4 < G1
    4 < G1
    4 < G1
    4 < G1
    4 < 2 < G2
    4 < 3 < G3
    
  4. count all the gained ASes below each ASN toward its gained set

asn | ASes gained from one of its paths
2   | G2
3   | G3
4   | G1, G2, G3
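The four steps above can be sketched as one pass over the annotated paths. This is a simplified reading, not the author's script: the function name `gained_below` is hypothetical, the gained AS is assumed to be the last hop of the path, and a path is kept only when every link from the target downward is provider-to-customer (`<`), which is why the path that reaches G1 over the peer link `T - 3` does not credit AS 3.

```python
from collections import defaultdict

def gained_below(annotated_paths, target, gained):
    """Map each AS below the target to the gained ASes reached through it."""
    below = defaultdict(set)
    for text in annotated_paths:
        tokens = text.split()
        ases, links = tokens[0::2], tokens[1::2]
        # Step 1: keep paths that contain the target and end at a gained AS.
        if ases[-1] not in gained or target not in ases:
            continue
        # Steps 2-3: keep only the part after the target, and only if every
        # link from the target downward is provider-to-customer.
        t = ases.index(target)
        if any(link != "<" for link in links[t:]):
            continue
        # Step 4: credit the gained AS to each AS between the target and it.
        for asn in ases[t + 1:-1]:
            below[asn].add(ases[-1])
    return below

paths = [
    "1 > 2 > T - 3 < G1",
    "1 > 8 < T < 4 < 5",
    "1 > 8 < T < 4 < G1",
    "8 < T < 4 < G1",
    "8 < T < 4 < G1",
    "14 > 15 - T < 4 < 2 < G2",
    "35 - 45 < T < 4 < 3 < G3",
]
result = gained_below(paths, "T", {"G1", "G2", "G3"})
for asn in sorted(result):
    print(asn, "|", ", ".join(sorted(result[asn])))
```

On the example paths this reproduces the table: 2 gets G2, 3 gets G3, and 4 gets G1, G2, G3.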

@bhuffaker

  • Put your code under CAIDA's org

    • create a new branch in catalog-data 509-recipe-cone-chanages
    • copy everything in your repository into sources/how-to-identify-root-causes-in-customer-cone-changes
  • You need to update the README with the inputs you used, i.e. specifically which all-paths, rel, customer cone, etc. files

  • Sort ASNs by the degree to which they are present in the gained or lost paths over what is expected from their frequency in all-paths

    asn | relationship to target asn | percentage of all paths | percentage of (gain/lost) paths
    954 | provider                   | 10%                     | 15%
    45  | upstream                   | 50%                     | 76%
    • upstream : AS is before target, but not directly connected
    • provider : AS is before and connected
    • customer : AS is after and connected
    • downstream : AS is after target, but not directly connected
  • add your own personal analysis and theories based on what you find
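The sorting described above can be sketched as follows. Everything here is an assumption made for illustration: the function names are hypothetical, paths are lists of ASNs, every gained/lost path is assumed to contain the target, the relationship label is taken from position in the first path that contains the AS (following the four definitions above), and "over what is expected" is measured as the difference between the two shares.

```python
from collections import Counter

def position(path, asn, target):
    """Classify asn relative to target by where it sits in the path."""
    i, t = path.index(asn), path.index(target)
    if i < t:
        return "provider" if t - i == 1 else "upstream"
    return "customer" if i - t == 1 else "downstream"

def presence(paths):
    """Return {asn: fraction of paths that contain asn}."""
    seen = Counter(asn for path in paths for asn in set(path))
    return {asn: n / len(paths) for asn, n in seen.items()}

def rank_asns(all_paths, changed_paths, target):
    """Rows of (asn, relationship, share of all paths, share of changed paths)."""
    base, changed = presence(all_paths), presence(changed_paths)
    rows = []
    for asn, share in changed.items():
        if asn == target:
            continue
        example = next(p for p in changed_paths if asn in p)
        rows.append((asn, position(example, asn, target),
                     base.get(asn, 0.0), share))
    # Largest excess over the all-paths share first.
    rows.sort(key=lambda row: row[3] - row[2], reverse=True)
    return rows

all_paths = [["1", "2", "T", "4", "G1"], ["1", "8", "T", "4", "5"],
             ["9", "8", "T", "6", "7"], ["3", "2", "T", "4", "G2"]]
gained_paths = [all_paths[0], all_paths[3]]
for asn, rel, base_share, changed_share in rank_asns(all_paths, gained_paths, "T"):
    print(f"{asn} | {rel} | {base_share:.0%} | {changed_share:.0%}")
```

A ratio of the two shares would be another reasonable sort key; the difference is used here only because it stays defined when an AS never appears in all-paths.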

@jes089 jes089 linked a pull request Jan 30, 2024 that will close this issue