ritik-malik/routeviews_BGP_V4.0

routeviews_BGP_V4.0

This repository contains the final work done for the BTech Project (BTP):
Mapping The Maze: The Study of Internet Shutdowns across the world

Aim: Finding a relationship between geopolitical events and internet shutdowns across the world.

Conclusion: BGP data can be used as a parameter to detect internet shutdowns on a macroscopic scale.

The final BTP report can be found in the BTP Report directory

Successful case studies include:

  1. Iran
  2. Uganda
  3. Myanmar
  4. US

Unsuccessful case study:

  1. India

The results can be found in the Results directory.


Earlier phases of the project and the old pipeline can be found here:

New & improved pipeline from routeviews_BGP_V3.0
Beta version

Check plan.txt for logic and working

Major upgrades in the new pipeline

  • Dates are flexible: no longer hardcoded to one month, any 30 days can be used
  • MongoDB support removed, replaced by Python dictionaries (much faster!)
  • Execution time is 7 hours, compared to about 1.5 days previously
  • Efficient storage: dicts are stored as binaries using pickle
  • More intuitive input for scripts
  • Each script has a small man page inside for debugging
  • No need to scrape prefixes this time, the ribs themselves are used
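
The move from a database to pickled dictionaries can be sketched roughly as below. This is only an illustration, not the pipeline's actual code: the dictionary layout (prefix -> list of counts), the helper names save_dict and load_dict, and the file name are all assumptions made for the example.

```python
import pickle
import tempfile
from pathlib import Path


def save_dict(data: dict, path: Path) -> None:
    """Serialize a dictionary to a binary pickle file."""
    with path.open("wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_dict(path: Path) -> dict:
    """Load a dictionary back from a binary pickle file."""
    with path.open("rb") as fh:
        return pickle.load(fh)


# Hypothetical layout: prefix -> how often it was seen on each day.
prefix_counts = {
    "203.0.113.0/24": [4, 4, 3, 0],
    "198.51.100.0/24": [4, 4, 4, 4],
}

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = Path(tmp) / "prefix_counts.pkl"  # made-up file name
    save_dict(prefix_counts, pickle_path)
    restored = load_dict(pickle_path)

print(restored == prefix_counts)
```

Pickle files should only be loaded if they were written by your own run, since unpickling untrusted data can execute arbitrary code.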

Pipeline flow

pipeline.sh

  • Input YYYY MM & DD until 30 days are covered (new UI)
  • Input 4 timestamps (0200 0800 1400 2000 recommended for better coverage)
  • Input ISP_ASN folder name + LIMIT for graphs
  • Perform sanity check for all files
  • Display all the inputs (vars array) + show warning
  • Confirmation check before proceeding
  • Call master.sh in background & exit
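
The input checks in the steps above could look something like the following. pipeline.sh itself is a shell script, so this Python version is only a sketch of the idea: the function names, the exact rules (8-digit dates, 4-digit HHMM timestamps, a numeric LIMIT) and the sample values are assumptions, and the year in the example is an arbitrary placeholder.

```python
from datetime import date, datetime, timedelta


def is_valid(value: str, width: int, fmt: str) -> bool:
    """True if value is exactly `width` digits and parses with `fmt`."""
    if len(value) != width or not value.isdigit():
        return False
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


def check_inputs(dates, timestamps, limit) -> list:
    """Return a list of problems; an empty list means the inputs look sane."""
    problems = []
    if len(dates) != 30:
        problems.append(f"expected 30 dates, got {len(dates)}")
    problems += [f"bad date: {d!r}" for d in dates
                 if not is_valid(d, 8, "%Y%m%d")]
    if len(timestamps) != 4:
        problems.append(f"expected 4 timestamps, got {len(timestamps)}")
    problems += [f"bad timestamp: {t!r}" for t in timestamps
                 if not is_valid(t, 4, "%H%M")]
    if not limit.isdigit():
        problems.append(f"LIMIT must be a number, got {limit!r}")
    return problems


# Placeholder inputs: 30 consecutive days, the recommended timestamps.
start = date(2021, 1, 14)  # year chosen arbitrarily for illustration
dates = [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(30)]
timestamps = ["0200", "0800", "1400", "2000"]

print(check_inputs(dates, timestamps, "20"))
print(check_inputs(dates[:29] + ["20210230"],
                   ["0200", "0800", "1400", "2460"], "x"))
```

The first call prints an empty list; the second reports the impossible date, the impossible timestamp and the non-numeric LIMIT.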

Structure of the vars array:

${vars[0]} = 1st date [start] YYYYMMDD
${vars[1]} = 2nd date
${vars[2]} = 3rd date
.
.
${vars[29]} = 30th date [end]

${vars[30]} = timestamp_1 TTTT
${vars[31]} = timestamp_2
${vars[32]} = timestamp_3
${vars[33]} = timestamp_4

${vars[34]} = LIMIT XX

This array is passed to master.sh
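
For readers more comfortable with Python than shell arrays, the same 35-element layout can be mirrored as a flat list. This is a sketch only: pack_vars and unpack_vars are hypothetical helpers, the year and LIMIT value are placeholders, and the final pairing of every date with every timestamp is an assumption about what happens downstream.

```python
from datetime import date, timedelta
from itertools import product

N_DATES = 30       # indices 0..29
N_TIMESTAMPS = 4   # indices 30..33; index 34 holds LIMIT


def pack_vars(dates, timestamps, limit):
    """Flatten the inputs into the 35-element layout described above."""
    if len(dates) != N_DATES or len(timestamps) != N_TIMESTAMPS:
        raise ValueError("need exactly 30 dates and 4 timestamps")
    return list(dates) + list(timestamps) + [str(limit)]


def unpack_vars(flat):
    """Split the flat array back into (dates, timestamps, limit)."""
    if len(flat) != N_DATES + N_TIMESTAMPS + 1:
        raise ValueError(f"expected 35 elements, got {len(flat)}")
    dates = flat[:N_DATES]
    timestamps = flat[N_DATES:N_DATES + N_TIMESTAMPS]
    return dates, timestamps, int(flat[-1])


# Placeholder inputs: the year is arbitrary and LIMIT=20 is made up.
start = date(2021, 1, 14)
dates = [(start + timedelta(days=i)).strftime("%Y%m%d")
         for i in range(N_DATES)]
timestamps = ["0200", "0800", "1400", "2000"]

flat = pack_vars(dates, timestamps, 20)
print(flat[0], flat[29], flat[30], flat[34])

# Assumption: each date is combined with each timestamp later on.
snapshots = list(product(dates, timestamps))
print(len(snapshots))
```

With 30 dates and 4 timestamps, the pairing yields 120 (date, timestamp) combinations.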

What's next?

To get a better insight into the actual approach & hypothesis:

  • Read plan.txt
  • Each script has a little doc inside
    Start from pipeline.sh, it will lead to all other scripts

Some interesting stats

Results from the new pipeline for ribs from 14th Jan to 12th Feb (for India):
(These are important to analyse, as they are the results from the 1st run)

Some good stuff :

  • New pipeline now runs in 7 hours, compared to 40 hours previously!
  • The hypothesis was right: we got more prefixes overall from ribs than from CIDR
    CIDR prefixes -> 19884
    Ribs prefixes -> 20720
    (836 more prefixes)
  • Storage space and RAM are no longer an issue, the new pipeline is quite optimized
    Storage < 500 MB, RAM < 6 GB
  • We got more prefixes with dips > 20% in ribs
    Old pipeline : 356
    New pipeline : 1205
    And that's insane!
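
To make the "dips > 20%" count concrete, here is one possible way to flag such prefixes. The exact definition used by the pipeline is in plan.txt and the scripts, not in this README, so everything here is an assumption: the per-prefix series, the use of the median as baseline, and the names max_dip and prefixes_with_dips are purely illustrative.

```python
from statistics import median

DIP_THRESHOLD = 0.20  # "dips > 20%"


def max_dip(series):
    """Largest fractional drop below the series median (0.0 if none)."""
    baseline = median(series)
    if baseline <= 0:
        return 0.0
    return max(0.0, (baseline - min(series)) / baseline)


def prefixes_with_dips(visibility, threshold=DIP_THRESHOLD):
    """Prefixes whose deepest dip exceeds the threshold, sorted."""
    return sorted(p for p, s in visibility.items()
                  if max_dip(s) > threshold)


# Made-up visibility counts per prefix over six observations.
visibility = {
    "203.0.113.0/24": [40, 40, 39, 12, 40, 40],   # sharp dip (70%)
    "198.51.100.0/24": [40, 40, 38, 39, 40, 40],  # small wobble (5%)
    "192.0.2.0/24": [10, 10, 10, 10, 10, 10],     # flat
}

print(prefixes_with_dips(visibility))
```

Only the first prefix is reported here, since its lowest value sits 70% below its median.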

Some bad stuff :

  • There seems to be very little correlation, and what little there is could be just coincidence
    Only a very small fraction of graphs dip in the right spot, on the days of shutdown
    (this is only for India)
    (we got perfect correlation for Iran & Myanmar)
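
The "fraction of graphs falling in the right spot" can be expressed as a simple overlap check. This is a hedged sketch with made-up numbers, not the analysis from the report: days are plain indices 0..29 inside the 30-day window, and the function name and sample data are hypothetical.

```python
def fraction_on_shutdown_days(dip_days, shutdown_days):
    """Share of prefixes with at least one dip on a known shutdown day."""
    if not dip_days:
        return 0.0
    hits = sum(1 for days in dip_days.values()
               if shutdown_days & set(days))
    return hits / len(dip_days)


# Made-up example: day indices within the 30-day window.
shutdown_days = {12, 13}
dip_days = {
    "203.0.113.0/24": [12],
    "198.51.100.0/24": [20],
    "192.0.2.0/24": [6, 26],
    "192.0.2.128/25": [4],
}

print(fraction_on_shutdown_days(dip_days, shutdown_days))
```

In this toy data only one of four prefixes dips on a shutdown day, giving 0.25; a low value like this would match the weak correlation described above.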

A major concern :
We still don't get it...
If these dips in the graphs are not caused by shutdowns, then what are they caused by?
We didn't see the same pattern anywhere else!

Update:

This project is almost over now, and might not be maintained further.
