Skip to content

Database

jfuruness edited this page Nov 29, 2020 · 3 revisions

This file needs editing, these docs are minorly outdated.

Database Submodule

Status: Development

Database Description

This module contains all the functionality to interact with a database. First we have the config class in the config.py. This class keeps track of all the different database sections and their login information. By default pytest always uses the test database class, and if section is not set then bgp is used. Note that this class will install the database for any section that is passed in and does not exist.

Then there is the Postgres class in postgres.py. This class contains some useful features for controlling postgres. It contains the functionality to install postgres and perform modifications for a nice fast database (Note that these modifications make the database corruptable upon crashing, but don't care about that since we can reproduce the data easily). We can also "unhinge" the database, which basically means turning off all safety features and writing to disk for some really fast queries. This also contains the functionality to restart the postgres database. Note that a lot of these system modifications will affect all running instances of postgres databases.

Then there is the database class. This class deals specifically with connecting to the database and performing some minor functions. You can use it as a context manager to be able to access the database and execute commands. By default the realDictCursor is used so that all results returned are lists of dictionaries.

Lastly there is the generic table class. This class is super convenient to write functions with. It inherits the database class. Is it mainly to be used through inheritance. When a class inherits this class, it gains access to a large swath of functionality. It can connect to the database, and whenever it does create tables if they do not exist. It can clear tables upon connection. It has built in clear_table and other functions such as get_count, etc. Check out some great examples of how this is used in the ROAs Parser Submodule in tables.py, and in the usage examples below.

Database Enhancements

Rather than repeat documentation, please view the Postgres class for the modify database function. In addition, see the unhinge_db function, for when the database is unhinged. These are extensive lists of sql queries and the reasons why we use them in order to improve database performance.

Note that we NEVER alter the config file for the database. We only ever use the ALTER SYSTEM command so that we can always have the default config to be able to go back to.

Database Usage

Postgres Functions

Erasing all installed databases:

from lib_bgp_data import Postgres
Postgres().erase_all()

Unhinging database:

from lib_bgp_data import Postgres
Postgres().unhinge_db()
# Do stuff here
Postgres().rehinge_db()

Restarting postgres:

from lib_bgp_data import Postgres
Postgres().restart_postgres()

Database Functions

Connecting to the database

from lib_bgp_data import Database
with Database() as db:
    db.execute("my sql query")
# Database is closed here

Executing multiple sql queries at once:

from lib_bgp_data import Database
sql_queries = ["sql quer 1", "sql query 2"]
with Database() as db:
    db.multiprocess_execute(sql_queries)
# Database is closed here

Generic Table Functions

Lets say you have a table called roas (example in roas [arser.

from lib_bgp_data import Generic_Table
class ROAs_Table(Generic_Table):
    """Announcements table class"""

    __slots__ = []

    name = "roas"

    def _create_tables(self):
        """ Creates tables if they do not exist"""

        sql = """CREATE UNLOGGED TABLE IF NOT EXISTS roas (
              asn bigint,
              prefix cidr,
              max_length integer
              ) ;"""
        self.execute(sql)

Lets get all rows from the Roas_Table:

# Note that when it is initialized, create tables is called!
with ROAs_Table() as db:
    # Returns a list of dictionaries
    rows = db.get_all()
# connection is closed here

Lets clear the table upon initializing, in case it has things in it

# This will drop the table (if exists) and recreate it
with ROAs_Table(clear=True) as db:
    # Do stuff here
# Connection closed here

Other convenience funcs:

  • get_count: returns total count
  • get_all: Returns all rows as a list of dicts
  • clear_table: Drops table if exists
  • copy_table: Takes as input a path string and copies table to that path
  • filter_by_IPV_family: Filter table by IPV6 or IPV4
  • columns: Returns the columns of that table

Again please note: upon connection, it creates the tables. If clear is passed, it will clear them. After the context manager is over it will close the database.

Database Design Choices

  • lib_bgp_data
  • Database Submodule
    • RealDictCursor is used as the default cursor factory because it is more OO and using a dictionary is very intuitive.
    • Unlogged tables are used for speed
    • Most safety measures for corruption and logging are disabled to get a speedup since our database is so heavily used with such massive amounts of data
    • Lots of convenience functions so that code would not be duplicated across submodules

Database Installation

See: Installation Instructions