Installing bulk_extractor

rmac75 edited this page Dec 28, 2015 · 94 revisions

Overview

bulk_extractor can be used on Windows, Linux, and Macintosh OS X platforms.

This page contains instructions for downloading, building and installing bulk_extractor on Linux and OS X, and for downloading and installing the bulk_extractor binary on Windows. If you would like to build your own Windows binary, a Linux system must be used; see Cross-compiling for Windows below.

For additional information on bulk_extractor, see the Forensics Wiki entry: http://www.forensicswiki.org/wiki/Bulk_extractor

Windows Users

  1. Download the latest bulk_extractor Windows installer from here.
  2. Install bulk_extractor by running the downloaded Windows installer.
Note: Temporarily turn off your virus checker if it refuses to download and/or install bulk_extractor.

Linux and OS X Users

Downloading the Latest Release

  1. Download the latest bulk_extractor .tar.gz file from here.
  2. Follow the instructions below to build bulk_extractor.

Preparing the Build Environment

Install the build environment, then download, build, and install the latest version of bulk_extractor from the command line.

Before compiling bulk_extractor for your platform, you may need to install packages that bulk_extractor requires to compile cleanly. See the instructions below for installing the packages specific to your system. One of the following two regular expression libraries is required:

  • TRE (preferred, because experiments indicate that it is about 10X faster)
  • libgnurx (libgnurx-static)

The following packages are optional but are needed to provide specific capabilities:
  • Install LIBEWF to read E01 files.
  • Install AFFLIB to read AFF files. Note that the AFF format is becoming deprecated.
  • Install SQLITE to be able to output report data into a SQL database.

If you want to build bulk_extractor from the current development tree, then also read the Developers section below.

Fedora Linux

First, please install the build environment with the following commands:

 sudo yum update
 sudo yum groupinstall development-tools
 sudo yum install flex zlib-devel
 sudo yum install libxml2-devel openssl-devel tre-devel
 sudo yum install gcc-c++

Note: the following specific packages may be installed instead of the development-tools group:

  git
  gcc
  gcc-c++
  autoconf
  automake
  libtool
  openssl-devel
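
Before building, it can help to confirm that the command-line tools from the list above are actually on your PATH. The following is a minimal sketch, not part of bulk_extractor; the function name check_tools is hypothetical, and the command names you pass to it (which can differ from package names on some distributions) are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_tools git gcc g++ autoconf automake flex prints one line per missing command. Libraries such as openssl-devel are not commands, so this check does not cover them.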

Please install optional packages, as desired:

  sudo yum install libewf-devel afflib-devel sqlite-devel

If the Bulk Extractor Viewer (BEViewer) is required, also install a Java JDK Version 6 or newer.

CentOS / RHEL Users

Install the build environment:

 sudo yum update
 sudo yum groupinstall "Development Tools"
 sudo yum install flex zlib-devel expat-devel
 sudo yum install libxml2-devel openssl-devel tre-devel

If you want to run bulk_extractor on images in E01 (EWF) format, you must install libewf. Please download, build, and install libewf as described at https://github.com/libyal/libewf/wiki/Building. Basically, you will download libewf from here, then extract it:

 tar xfv libewf-experimental-<version>.tar.gz

then move to the new directory:

 cd libewf-<version>

then build and install it:

 ./configure
 make
 sudo make install

This will put the library and tools in /usr/local. If you want to change this to, for example, /usr, add --prefix=/usr when running ./configure:

    ./configure --prefix=/usr

On Linux, make sure libewf.so is in the library cache. Normally it suffices to run:

    sudo ldconfig

Alternatively, add /usr/local/lib (or wherever you installed libewf) to your LD_LIBRARY_PATH shell variable, for example:

    export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
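
When LD_LIBRARY_PATH is initially empty, the export above leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. A minimal sketch that avoids this and also skips directories that are already listed might look like the following; the function name prepend_lib_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH unless it is
# already listed, without leaving an empty entry when the variable is unset.
prepend_lib_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}
```

Calling prepend_lib_path /usr/local/lib in your shell startup file has the same effect as the export line. On glibc-based systems, ldconfig -p | grep libewf is one way to check whether the library is already in the cache.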

Installing afflib

We have decided that AFF support is deprecated so we're no longer trying to provide instructions for building afflib and using it with bulk_extractor.

Configuring bulk_extractor:

In the instructions for building bulk_extractor below, you are instructed to run ./configure before building hashdb and bulk_extractor.

To install in your local space without sudo, use these options:

 ./configure --prefix=$HOME/local/ --exec-prefix=$HOME/local CPPFLAGS=-I$HOME/local/include/ LDFLAGS=-L$HOME/local/lib

Debian and Ubuntu

On Debian (wheezy) and Ubuntu 12.04, this was sufficient:

  $ sudo apt-get -y install gcc g++ flex libewf-dev libssl-dev

Mac OS X Users

The install process for Mac users is similar to that for Fedora users. We recommend using MacPorts:

 sudo port install flex autoconf automake pkgconfig

The following ports might be helpful, but development versions might be required. The -devel ports might not be available for OS X, but you can try to install these ports anyhow (as they will be updated eventually):

 sudo port install libewf openssl tre libxml2

At present, the libewf port is too old to provide the support needed to process E01 files, and libewf-devel is not available in ports for OS X. Therefore, please download and install libewf as described for CentOS / RHEL Users, above.

How to Build Bulk_Extractor From Source

After you prepare your build environment, download, build and install a bulk_extractor release version using the following command-line instructions.

Download and unzip the latest .tar.gz bulk_extractor distribution tarball available here. For example:

   wget http://digitalcorpora.org/downloads/bulk_extractor/bulk_extractor-1.5.5.tar.gz
   gunzip bulk_extractor-1.5.5.tar.gz
   tar -xvf bulk_extractor-1.5.5.tar
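
The separate gunzip and tar steps can also be collapsed into a single tar call. The sketch below is one way to do that with a basic existence check; the function name extract_tarball is hypothetical, and it assumes a tar that supports the -z and -C options (both GNU tar and BSD tar do).

```shell
#!/bin/sh
# Hypothetical helper: unpack a gzip-compressed tarball in one step.
# Usage: extract_tarball <file.tar.gz> [destination directory]
extract_tarball() {
    tarball=$1
    dest=${2:-.}
    if [ ! -f "$tarball" ]; then
        echo "no such file: $tarball" >&2
        return 1
    fi
    # -z decompresses while extracting; -C selects the target directory.
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}
```

For example, extract_tarball bulk_extractor-1.5.5.tar.gz unpacks the release into the current directory.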

The hashdb scanner is now intended to be included in bulk_extractor. However, hashdb must be built first for it to be included in the bulk_extractor binary. Download the latest .tar.gz hashdb distribution tarball available here. For example:

   cd bulk_extractor-1.5.5
   wget http://digitalcorpora.org/downloads/hashdb/hashdb-x.x.x.tar.gz
   gunzip hashdb-x.x.x.tar.gz
   tar -xvf hashdb-x.x.x.tar

To install bulk_extractor globally using sudo, please follow these steps:

   # Build hashdb first, in the hashdb directory:
   ./configure
   make
   sudo make install

   # Then build and install bulk_extractor with hashdb included,
   # in the bulk_extractor directory:
   ./configure
   make
   sudo make install

To build and install in your local space without sudo, please follow these instructions instead:

 # In the hashdb directory:
 ./configure --prefix=$HOME/local/ --exec-prefix=$HOME/local CPPFLAGS=-I$HOME/local/include/ LDFLAGS=-L$HOME/local/lib
 make
 make install

 # In the bulk_extractor directory:
 ./configure --prefix=$HOME/local/ --exec-prefix=$HOME/local CPPFLAGS=-I$HOME/local/include/ LDFLAGS=-L$HOME/local/lib
 make
 make install
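
Both configure invocations above pass the same prefix-derived flags, so it can be convenient to derive them from a single value. The following is a minimal sketch under that assumption; the function name configure_local is hypothetical, and it simply runs the ./configure script found in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: run ./configure in the current directory with the
# install prefix, include path, and library path all derived from one value.
# Usage: configure_local <prefix>
configure_local() {
    prefix=$1
    ./configure --prefix="$prefix" --exec-prefix="$prefix" \
        CPPFLAGS="-I$prefix/include" LDFLAGS="-L$prefix/lib"
}
```

You would call configure_local "$HOME/local" once in the hashdb directory and once in the bulk_extractor directory, each followed by make and make install. Quoting the prefix keeps the call working if the path contains spaces.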

Developers

The bulk_extractor source code is available on GitHub. First, please install the git source code management system. For example, on CentOS, RHEL, and Fedora, use yum:

 sudo yum install git

Download bulk_extractor and its submodules, then run bootstrap.sh:

 git clone --recursive https://github.com/simsong/bulk_extractor.git
 cd bulk_extractor
 sh bootstrap.sh

The bootstrap script generates the configure script from the configure.ac file. Then you run ./configure to create the makefile.

 ./configure

To include hashdb, follow the build instructions on the hashdb GitHub page: https://github.com/simsong/hashdb/wiki/Installing-hashdb

Finally, compile bulk_extractor and install the binary.

 make && make install

(Note: To install globally in a system directory, you can use sudo make install.)
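
On multi-core machines the compile step can often be sped up with make's -j option. The sketch below only works out a job count; the function name cpu_count is hypothetical, and whether the bulk_extractor makefiles build cleanly in parallel is an assumption you should verify.

```shell
#!/bin/sh
# Hypothetical helper: print the number of online CPUs, or 1 if it cannot
# be determined on this system.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;   # not a plain number, fall back to one job
    esac
    echo "$n"
}
```

A parallel build would then be make -j"$(cpu_count)" && make install. If the parallel build fails, fall back to plain make.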

If you are developing with github, after a checkout, you may wish to do this:

 make gitfixup   # brings every submodule to master
 CXXFLAGS="-fsanitize=address" ./configure   # runs with ASan (requires clang & libasan to be installed)

- Run bulk_extractor with -E for each of the scanners, one by one, with ASan to find scanner-specific bugs. Currently there seems to be a bug in email in the histogram generation process and in scan_hex.
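
The one-scanner-at-a-time runs can be scripted with a small loop. This is a minimal sketch, not a tool shipped with bulk_extractor: the function name run_each_scanner is hypothetical, the scanner names are whatever you pass in, and it assumes the output root directory exists while the per-scanner output directories do not.

```shell
#!/bin/sh
# Hypothetical helper: run a bulk_extractor binary once per scanner,
# enabling only that scanner with -E and writing to its own output directory.
# Usage: run_each_scanner <binary> <image> <output root> <scanner>...
run_each_scanner() {
    be_bin=$1 image=$2 out_root=$3
    shift 3
    for scanner in "$@"; do
        echo "== scanner: $scanner"
        # Keep going after a failure so every scanner gets exercised.
        "$be_bin" -E "$scanner" -o "$out_root/$scanner" "$image" \
            || echo "FAILED: $scanner"
    done
}
```

With an ASan-enabled build, a call such as run_each_scanner ./src/bulk_extractor disk.img asan_out email would exercise the email scanner; the binary path, image name, and output directory here are placeholders.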

To keep bulk_extractor and its submodules current with the latest code on GitHub, type:

 cd <bulk_extractor directory>
 make pull

To change your repository to make it use a new master branch of a submodule:

 cd <submodule directory>
 git pull origin master
 cd <bulk_extractor directory>
 git add <submodule directory>
 git commit   # records the new submodule revision in bulk_extractor
 git push

Compiling Notes

1. bulk_extractor builds with the GNU Autotools.

2. We recommend compiling bulk_extractor with -O3, and that is the default. You can disable all optimization flags by specifying the configure option --with-noopt.

3. Building with a different glibc: when creating bulk_extractor.so, it may be necessary to build with an older glibc. We're not sure how to do it, but one of these links may help:

4. The following directories will NOT be installed with the commands provided:

    python/   - bulk_extractor python tools. These tools are experimental.
                Copy them where you wish and run them directly.

    plugins/  - This is for C/C++ developers only. You can develop your own
                bulk_extractor plugins, which will then be run at run-time
                if the .so or .dll files are in the same directory as the
                bulk_extractor executable.

Cross-compiling for Windows

The Windows configuration of bulk_extractor can be cross-compiled on a Fedora 20 or newer system using MinGW. A script is provided in the src_win directory for configuring a Fedora virtual machine to cross-compile to Windows. Some users have also reported success compiling on Ubuntu, but it is harder.

If you downloaded bulk_extractor using git (rather than downloading the .tar.gz file), run bootstrap.sh:

 sh bootstrap.sh

If you have previously run configure for a native build, please clean up:

 make distclean

Install MinGW and the libraries required for cross-compilation. This will take some time and will require the root password:

 cd src_win
 ./CONFIGURE_F20.bash

To include hashdb, follow the build instructions on the hashdb GitHub page: https://github.com/simsong/hashdb/wiki/Download

Finally, cross-compile bulk_extractor to build the Windows installer:

 make

Please install the generated bulk_extractor Windows installer .exe file onto your Windows system.