Convert PST files to RCF822 *.eml files and generate electronic discovery load files.
C++ C Ruby
Switch branches/tags
Nothing to show
Clone or download
Pull request Compare This branch is 2 commits ahead of aranetic:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
pstsdk @ 3551107
spec
test_data
.gitignore
.gitmodules
CMakeLists.txt
COPYING-AGPL.txt
Gemfile
Gemfile.lock
README.md
document.cpp
document.h
document_spec.cpp
edrm-fields.txt
edrm.cpp
edrm.h
edrm_spec.cpp
main.cpp
md5.c
md5.h
rfc822.cpp
rfc822.h
rfc822_spec.cpp
spike.cpp
utilities.cpp
utilities.h
utilities_spec.cpp
xml_context.cpp
xml_context.h
xml_context_spec.cpp

README.md

process-pst: Command-line program for turning PSTs into email+loadfiles

Copyright (c) 2010 Aranetic LLC.

process-pst is a command-line tool that does basic, first-pass processing for PST files.

process-pst custodian1.pst custodian1

This will create the directory custodian1 (if it doesn't already exist), and an EDRM XML loadfile custodian1/edrm-loadfile.xml. Any emails found in custodian1 will be converted to RFC822 format and stored in custodian1, and any attachments will be extracted.

We are also interested in supporting simple text extraction and other loadfile formats, including Concordance- and Summation-compatible loadfiles. Your patches are extremely welcome!

Note that process-pst is distributed under a "share and share alike" license: If you distribute copies of process-pst, you must make the source code available under the terms of the GNU Affero General Public License, version 3. And if you modify process-pst and allow other people to interact with it over a network, you have certain obligations to provide them with the modified source code. You may have certain other obligations. See the full text of the license for details. You can think of this as the vendor version of pro bono work on behalf of the larger legal community.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License (the "License") as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but it is provided on an "AS-IS" basis and WITHOUT ANY WARRANTY; without even the implied warranties of MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NONINFRINGEMENT. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Downloading the source code

git clone http://github.com/aranetic/process-pst.git
cd process-pst
git submodule update --init

Compiling

process-pst uses CMake to manage the build process.

Mac

First, use MacPorts to install Boost 1.42, GCC 4.4, CMake 2.8 and iconv:

sudo port install boost @1.42.0
sudo port install gcc44 cmake libiconv

To run the unit tests, you will also want to install Ruby, rubygems, and bundler.

On 10.6, gcc44 (as of 13 Jul 2010) currently inappropriately builds/installs libgcc_s, see https://trac.macports.org/attachment/ticket/25042/email1.eml. Move these out of the way:

sudo mkdir /opt/local/lib/gcc44/hidden
sudo mv /opt/local/lib/gcc44/libgcc_s* /opt/local/lib/gcc44/hidden/

Then, install the necessary Ruby gems and build using CMake:

bundle install
CC=gcc-mp-4.4 CXX=g++-mp-4.4 cmake .
make

cmake may also find the system libiconv instead of the MacPorts one, and fail to link. If so, replace the ICONV_LIBRARY line in CMakeCache.txt with one pointing at MacPorts, as follows:

ICONV_LIBRARY:FILEPATH=/opt/local/lib/libiconv.dylib

and rerun the cmake to regenerate the build files, and make again.

Linux

These instructions have been tested on a pristine Ubuntu 10.04 system created using Amazon's EC2 service and the 32-bit ami-2d4aa444.

First, set up your system with the necessary compilers, libraries and gems:

sudo apt-get install cmake g++-4.4 ruby ruby-dev build-essential \
  libxml2-dev libxslt-dev git-core
wget http://production.cf.rubygems.org/rubygems/rubygems-1.3.7.tgz
tar xzf rubygems-1.3.7.tgz
(cd rubygems-1.3.7 && sudo ruby setup.rb)
sudo gem1.8 install bundler -v 0.9.26

Boost must be installed manually, using GCC 4.4 as describe here:

wget http://downloads.sourceforge.net/project/boost/boost/1.42.0/boost_1_42_0.tar.gz?use_mirror=voxel
tar xzf boost_1_42_0.tar.gz
cd boost_1_42_0
echo "using gcc : 4.4 : /usr/bin/g++-4.4 ; " >> tools/build/v2/user-config.jam
./bootstrap.sh --prefix=/opt/boost --without-libraries=python
./bjam --toolset=gcc-4.4
sudo ./bjam install
cd ..

Next, check out process-pst:

git clone git://github.com/aranetic/process-pst.git
cd process-pst
git submodule update --init

Then, install the necessary Ruby gems and build using CMake:

bundle install
CC=gcc-4.4 CXX=g++-4.4 cmake -D BOOST_ROOT=/opt/boost .
make

Running the unit tests

You can run the unit tests using CMake:

CTEST_OUTPUT_ON_FAILURE=1 make test

All the tests should pass.