Tabula is a tool for liberating data tables trapped inside PDF files

Tabula helps you liberate data tables trapped inside PDF files.

© 2012-2013 Manuel Aristarán. Available under the MIT License.

Why Tabula?

If you’ve ever tried to do anything with data provided to you in PDFs, you know how painful it is: you can’t easily copy and paste rows of data out of PDF files. Tabula lets you extract that data in CSV format through a simple web interface:

{TODO: screenshot / screencast here}

Caveat: Tabula only works on text-based PDFs, not scanned documents.

Running Tabula

Running a packaged version


Running Tabula from source (for developers)

  1. Download JRuby. You can install it from its website, or with tools like rvm or rbenv.

  2. Download Tabula and install the Ruby dependencies. (Note: if using rvm or rbenv, ensure that JRuby is the active Ruby.)

    git clone git://
    cd tabula
    gem install bundler
    bundle install
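
Because the note above depends on JRuby being the active interpreter, a quick check before `bundle install` can catch a wrong Ruby early. The following is a minimal sketch: the function name `is_jruby_version` is hypothetical, and it relies only on the fact that JRuby's `ruby -v` output begins with the word `jruby`.

```shell
# Hypothetical helper: succeeds when the given `ruby -v` output comes from JRuby.
is_jruby_version() {
  case "$1" in
    jruby\ *) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the ruby currently on PATH before installing dependencies.
if command -v ruby >/dev/null 2>&1 && is_jruby_version "$(ruby -v)"; then
  echo "JRuby is active"
else
  echo "JRuby is not the active ruby; switch with rvm or rbenv first" >&2
fi
```

If the check reports that JRuby is not active, select it with your version manager and rerun the check before installing the gems.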

Then, start the development server:

bundle exec rackup

The site instance should now be viewable in your browser.

You can set a couple of options when running the server this way:

TABULA_DATA_DIR="/tmp/tabula" \
bundle exec rackup
  • TABULA_DATA_DIR controls where Tabula stores uploaded data. By default, data is stored in the OS-dependent application data directory for the current user (for example, C:\Users\foo\AppData\Roaming\Tabula on Windows, ~/Library/Application Support/Tabula on Mac, ~/.tabula on Linux/UNIX).
  • TABULA_DEBUG prints extra status data while PDF files are being processed (false by default).
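
To see which directory a given environment would end up using, the documented defaults can be mirrored in a small shell function. This is only a sketch: `tabula_data_dir` is a hypothetical helper, not part of Tabula, and the Windows branch assumes a POSIX-style shell (such as Git Bash or Cygwin) where `APPDATA` is set.

```shell
# Hypothetical helper: prints the data directory implied by the rules above.
tabula_data_dir() {
  # An explicit TABULA_DATA_DIR always wins.
  if [ -n "${TABULA_DATA_DIR:-}" ]; then
    printf '%s\n' "$TABULA_DATA_DIR"
    return 0
  fi
  # Otherwise fall back to the per-OS default listed in this README.
  case "$(uname -s)" in
    Darwin)
      printf '%s\n' "$HOME/Library/Application Support/Tabula" ;;
    CYGWIN*|MINGW*|MSYS*)
      # Assumption: APPDATA points at the user's Roaming directory.
      printf '%s\n' "${APPDATA:-$HOME/AppData/Roaming}/Tabula" ;;
    *)
      printf '%s\n' "$HOME/.tabula" ;;
  esac
}

echo "Tabula data directory: $(tabula_data_dir)"
```

Running the snippet with and without `TABULA_DATA_DIR` set is a quick way to confirm where uploads should appear.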

Alternatively, running the server as a JAR file

Testing in this manner will be closer to testing the "packaged application" version of the app.

bundle exec rake war
java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -jar build/tabula.jar
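
The `java` command above fails with a fairly opaque error if the build step was skipped. A small wrapper can check for the JAR first. This is a sketch, not part of the repository: `run_tabula_jar` is a hypothetical name, and the JVM flags are copied unchanged from the command above.

```shell
# Hypothetical wrapper: launch the built JAR, or explain what is missing.
run_tabula_jar() {
  jar="${1:-build/tabula.jar}"
  if [ ! -f "$jar" ]; then
    echo "missing $jar; run 'bundle exec rake war' first" >&2
    return 1
  fi
  java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -jar "$jar"
}

# Usage (from the root directory of the repo):
#   run_tabula_jar
```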

Building a packaged application version

After performing the above steps ("Running Tabula from source"), you can compile Tabula into a standalone application:

Mac OS X

rake macosx

This will result in a portable archive (inside the build directory) for Mac OS X users.


Windows

You can build .exe files for the Windows target on any platform.

Download a 3.1.X (beta) copy of Launch4J.

Unzip it into the Tabula repo so that "launch4j" (with subdirectories "bin", etc.) is in the repository root.
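
The build expects Launch4J at a specific location, so it can help to verify the layout before running it. A minimal sketch, assuming only the directory structure described above ("launch4j" with a "bin" subdirectory in the repository root); `check_launch4j` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when launch4j/bin exists under the given repo root.
check_launch4j() {
  repo_root="${1:-.}"
  if [ -d "$repo_root/launch4j/bin" ]; then
    return 0
  fi
  echo "launch4j/bin not found under $repo_root; unzip Launch4J into the repository root" >&2
  return 1
}

# Usage (from the root directory of the repo):
#   check_launch4j && rake windows
```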


rake windows

This will result in a portable archive (inside the build directory) for Windows users.

If you have issues, you can try building manually. (These commands are for OS X/Linux and may need to be adjusted for Windows users.)

# (from the root directory of the repo)
rake war
cd launch4j
ant -f ../build.xml windows

A "tabula.exe" file will be generated in "build/windows". To run, the exe file needs "tabula.jar" (contained in "build") in the same directory. You can create a .zip archive by doing:

# (from the root directory of the repo)
cd build/windows
mkdir tabula
cp tabula.exe ./tabula/
cp ../tabula.jar ./tabula/
zip -r9 tabula.zip tabula  # the archive name here is illustrative
rm -fr tabula


Interested in helping out? See for ideas.
