Python scripts to analyze an authentication dataset.

The authentication dataset from LANL (Los Alamos National Laboratory) provides a valuable benchmark for researchers in cybersecurity and/or graphs/networks.

We recommend using the Anaconda Python 3 distribution.

Getting Started

We begin by generating two files from the original dataset:

  • time_secs_binary_f32.dat - a binary file containing just the time (secs) data (32-bit values)
  • auth_graph_adjlist.dat - an ASCII file containing the global graph (as an adjacency list)
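For orientation, here is a minimal sketch of how these two files could be read back using only the standard library. It rests on assumptions not stated above: that the "f32" in the file name means raw 32-bit floats in native byte order with no header, and that each line of the adjacency list has the form `node neighbor1 neighbor2 ...` with optional `#` comments. The function names `read_time_file` and `read_adjlist` are hypothetical and are not part of the repository.

```python
from array import array
from collections import defaultdict
import os


def read_time_file(path):
    """Read raw 32-bit floats (assumed: native byte order, no header)."""
    times = array("f")
    n_values = os.path.getsize(path) // times.itemsize
    with open(path, "rb") as fh:
        times.fromfile(fh, n_values)
    return times


def read_adjlist(path):
    """Parse lines of the assumed form 'node neighbor1 neighbor2 ...'."""
    graph = defaultdict(set)
    with open(path, "r", encoding="ascii") as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()  # drop comments and blank lines
            if not line:
                continue
            node, *neighbors = line.split()
            graph[node].update(neighbors)
            for nbr in neighbors:  # assumption: treat the graph as undirected
                graph[nbr].add(node)
    return graph
```

With the Anaconda distribution, `numpy.fromfile(path, dtype=numpy.float32)` would be a natural alternative for the time file under the same format assumption.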

You have two choices for obtaining these two files. The first is to run the script that generates both of them from the original dataset; this took about 8 hours on a laptop. The second is to download the prebuilt graph file and generate only the time file, which takes just a few minutes.
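To illustrate what generating both files in one pass might involve, here is a rough sketch. It is not the repository's script. It assumes the original dataset is a comma-separated text file whose first field is the time in seconds and whose fields at (zero-based) positions 3 and 4 are the source and destination computers; the function name `build_files` and its parameters are hypothetical. Times are written in chunks so that the full time column never has to sit in memory.

```python
import csv
from array import array
from collections import defaultdict


def build_files(auth_csv, time_path, graph_path,
                src_col=3, dst_col=4, chunk=1_000_000):
    """Stream the event file once, writing times and collecting edges."""
    adjacency = defaultdict(set)
    buffer = array("f")  # 32-bit floats, native byte order
    n_events = 0
    with open(auth_csv, newline="") as src, open(time_path, "wb") as out:
        for row in csv.reader(src):
            buffer.append(float(row[0]))       # assumed: column 0 is time
            adjacency[row[src_col]].add(row[dst_col])
            n_events += 1
            if len(buffer) >= chunk:           # flush to keep memory bounded
                buffer.tofile(out)
                del buffer[:]
        buffer.tofile(out)                     # write any remaining values
    with open(graph_path, "w", encoding="ascii") as out:
        for node in sorted(adjacency):
            out.write(" ".join([node, *sorted(adjacency[node])]) + "\n")
    return n_events, len(adjacency)
```

The returned pair is the number of events read and the number of distinct source nodes. Nodes that only ever appear as destinations get no line of their own, which matches the usual adjacency-list convention.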

Sample scripts

$ ipython --matplotlib

Python 3.4.2 |Anaconda 2.1.0 
Using matplotlib backend: MacOSX

In [1]: %run create_time_file

In [2]: %run histo_time

[Figure: histogram of time events, shown in an interactive matplotlib window with pan, zoom, and rubberband buttons]
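As a rough idea of the computation behind such a histogram, the sketch below counts events per fixed-width time bin. It is not the contents of histo_time. The function name `time_histogram` and the one-hour default bin width are assumptions chosen for illustration, and times are assumed to be non-negative seconds.

```python
from collections import Counter


def time_histogram(times, bin_width=3600.0):
    """Return a list where entry i counts events in [i*w, (i+1)*w)."""
    counts = Counter(int(t // bin_width) for t in times)
    n_bins = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n_bins)]


if __name__ == "__main__":
    # Five synthetic event times (seconds), binned by hour.
    print(time_histogram([0, 10, 3599, 3600, 7300]))
```

The resulting list could then be passed to something like `matplotlib.pyplot.bar(range(len(counts)), counts)` to get an interactive plot.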

In [3]: %run readG_draw

After some time, the full graph will be plotted (below, for what it's worth). You can then interactively pan and zoom in on regions of interest.

[Figure: global, static authN graph]

In [4]: %run readG_hub_subgraph

[Figure: a hub as a subgraph]
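One plausible way to extract a hub as a subgraph is sketched below. It assumes a "hub" means the node with the most neighbors, and that the graph is held as a dict mapping each node to a set of neighbors. Both are assumptions made for illustration; readG_hub_subgraph may define and compute this differently. The function name `hub_subgraph` is hypothetical.

```python
def hub_subgraph(graph):
    """Return (hub, subgraph induced on the hub and its neighbors).

    graph: dict mapping node -> set of neighbor nodes.
    """
    # Assumption: the hub is the node with the largest neighbor set.
    hub = max(graph, key=lambda node: len(graph[node]))
    keep = {hub} | set(graph[hub])
    # Keep only edges whose endpoints are both in the kept node set.
    sub = {node: graph.get(node, set()) & keep for node in keep}
    return hub, sub


if __name__ == "__main__":
    g = {
        "A": {"B", "C", "D"},
        "B": {"A", "C"},
        "C": {"A", "B"},
        "D": {"A"},
        "E": {"F"},
        "F": {"E"},
    }
    hub, sub = hub_subgraph(g)
    print(hub, sorted(sub))
```

Restricting the plot to such a subgraph keeps the drawing small enough to read, in contrast to the full global graph.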