An MLsploit module to integrate Barnum.
This is an MLsploit module for using Barnum.


See MLsploit's documentation for how to import a module:

python createmodule barnum


This module contains three functions.


The Mimicus function takes PDF files and modifies them using Mimicus, a system designed to evade static PDF malware classifiers.

To use this function, upload your PDF files via the MLsploit web UI and pass them to a pipeline containing the Mimicus function. The PDF files will be overwritten with their modified versions.


The Barnum-Train function takes traces produced by Barnum's Tracer and trains a model on them using Barnum's Learner. At a high level, Barnum is a system that detects document malware by analyzing control flow traces for anomalous behavior.

To upload a trace into MLsploit, you must perform the following steps:

  1. Capture a trace using Barnum Tracer.

  2. Preprocess the trace using the preprocessing tool provided by Barnum Learner (see its README).

  3. Zip each trace directory, one per sample.
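
Step 3 above can be scripted. The following is a minimal sketch, assuming all preprocessed trace directories sit side by side under one parent directory; the function name `zip_trace_dirs` and both path arguments are hypothetical and not part of Barnum or MLsploit.

```python
import shutil
from pathlib import Path


def zip_trace_dirs(traces_root, out_dir):
    """Create one ZIP per trace directory found directly under traces_root.

    Each archive keeps the trace directory itself as the top-level entry.
    out_dir should live outside traces_root, otherwise it would be treated
    as a trace directory on a later run.
    """
    traces_root = Path(traces_root)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    archives = []
    for trace_dir in sorted(p for p in traces_root.iterdir() if p.is_dir()):
        # base_dir makes every archive member start with "<trace_dir.name>/".
        archive = shutil.make_archive(
            str(out_dir / trace_dir.name),
            "zip",
            root_dir=str(traces_root),
            base_dir=trace_dir.name,
        )
        archives.append(Path(archive))
    return archives
```

Calling `zip_trace_dirs("traces", "trace_zips")` would write one `<directory name>.zip` per sample into `trace_zips`, ready to upload.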

Here's an example of what one trace ZIP should look like:

$ unzip -l
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2019-01-03 16:56   44a7851c6f204ed156203b178f1d59a7c0e02b26a447fef7be648c5366aa4312/
  1177960  2019-01-03 16:55   44a7851c6f204ed156203b178f1d59a7c0e02b26a447fef7be648c5366aa4312/trace_parsed.gz
      861  2019-01-03 16:53   44a7851c6f204ed156203b178f1d59a7c0e02b26a447fef7be648c5366aa4312/mapping.txt.gz
       85  2019-01-03 16:53   44a7851c6f204ed156203b178f1d59a7c0e02b26a447fef7be648c5366aa4312/info.txt
---------                     -------
  1178906                     4 files

It is okay if your trace contains additional files not shown above, but it must contain a trace_parsed.gz and info.txt at a minimum.
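
Before uploading, it can help to confirm that a ZIP meets this minimum. The following is a rough sketch using only the Python standard library; the helper name `missing_trace_files` is hypothetical, and as a simplification it matches the required names anywhere in the archive rather than checking the exact directory layout.

```python
import zipfile
from pathlib import PurePosixPath

# Minimum contents named in this README.
REQUIRED_TRACE_FILES = {"trace_parsed.gz", "info.txt"}


def missing_trace_files(zip_path):
    """Return the set of required file names that the trace ZIP lacks."""
    with zipfile.ZipFile(zip_path) as zf:
        # Skip directory entries and compare on the file name only.
        present = {
            PurePosixPath(name).name
            for name in zf.namelist()
            if not name.endswith("/")
        }
    return REQUIRED_TRACE_FILES - present
```

An empty result means both required files were found; extra files in the archive are ignored.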

Once your traces are uploaded via the MLsploit web UI, create a pipeline containing this function and pass it the traces.

This function produces a file that contains the trained model.


The third function takes an already trained model and evaluates it on a set of traces.

It works similarly to the Barnum-Train function, except that it should be passed a model file along with the traces. Currently, the model file must have the name the function expects and must contain the files lstm.h5, lstm.json and svm.

For users who want to train a model outside of MLsploit (without using the Barnum-Train function) and upload it, here's an example of what a model ZIP should look like:

$ unzip -l
  Length      Date    Time    Name
---------  ---------- -----   ----
   996720  2019-07-03 17:02   lstm.h5
     4989  2019-07-03 17:02   lstm.json
    24229  2019-07-03 17:40   svm
---------                     -------
  1025938                     3 files
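
A model trained outside of MLsploit can be packaged into this layout with a few lines of Python. This is a sketch under assumptions: the helper `package_model` and its arguments are hypothetical, and the output file name is left to the caller because it has to match the name the function expects.

```python
import zipfile
from pathlib import Path

# Files the model ZIP must contain, per this README.
MODEL_FILES = ("lstm.h5", "lstm.json", "svm")


def package_model(model_dir, zip_path):
    """Write the three model files to the top level of a new ZIP archive."""
    model_dir = Path(model_dir)
    missing = [name for name in MODEL_FILES if not (model_dir / name).is_file()]
    if missing:
        raise FileNotFoundError("missing model files: " + ", ".join(missing))

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in MODEL_FILES:
            # arcname drops the directory so each file sits at the archive root.
            zf.write(model_dir / name, arcname=name)
    return Path(zip_path)
```

Running `unzip -l` on the result should then list exactly the three files, with no leading directory.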


To test whether this module is correctly installed in MLsploit, use the test input files provided in the examples directory. Upload the directory's contents via the MLsploit web UI.

Note that this data is only for verifying that the module works. Do not expect the provided model to yield high accuracy.

