Latest release

1.1.0 Release (1.1.0.16092117)

@mtigas released this Sep 22, 2016 · 12 commits to master since this release

tabula 1.1.0 / tabula-java 0.9.1

We're proud to announce the first official release of Tabula 1.1! This version contains a rewrite of our processing backend which should provide a significant performance increase. The backend rewrite also improves support for RTL languages and fixes many other bugs.

This page contains technical release notes; please visit the Tabula official homepage for an overview of Tabula and quick download links:
http://tabula.technology/


If you have any issues with this version of Tabula, please let us know!

New features / bugfixes

  • Tabula now extracts tables up to 7x faster than previous versions.
  • Table auto-detection has also been improved. (#456, tabulapdf/tabula-java#56)
  • If there’s an error during file upload & initial processing, warn the user. (#433)
  • Allow running the jar distribution in "headless" mode. Users of the jar distribution will need to manually open their web browser to the Tabula page (normally http://127.0.0.1:8080/ ).
  • Improved support for RTL languages like Hebrew and Arabic. (tabulapdf/tabula-java#66)
  • Upgraded to jruby-9.1.5.0, improving encoding support.
  • Lots and lots of other improvements — the extraction and processing backend has been completely rewritten!
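
Because the jar distribution no longer opens a browser for you, it can help to wait until the Tabula page responds before opening it by hand. The following is a minimal sketch, not part of Tabula itself: the helper name wait_for_tabula is hypothetical, the jar filename in the usage comment is an assumption, and it assumes curl is installed and that Tabula is listening on the address given above.

```shell
# Poll a URL until it answers, then print it so it can be opened manually.
# $1 = URL (defaults to the address given in these notes)
# $2 = number of attempts, $3 = delay between attempts in seconds
wait_for_tabula() {
    url=${1:-http://127.0.0.1:8080/}
    tries=${2:-30}
    delay=${3:-1}
    while [ "$tries" -gt 0 ]; do
        # curl exits with 0 as soon as the server returns any HTTP response
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "Tabula is up: open $url in your browser"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    echo "No response from $url" >&2
    return 1
}

# Example (the jar filename is an assumption):
# java -jar tabula.jar &
# wait_for_tabula
```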

Known Issues & Caveats

  • Note: If you are using the .jar version (for Linux/etc), you now need to open your browser to the Tabula page (http://127.0.0.1:8080/) manually.
  • Caveat: Tabula only works on text-based PDFs (ones where you can select text). Scanned documents do not work, and we do not recommend OCR for large files unless you have a data cleaning plan, since even state-of-the-art OCR software can have significant error rates.
  • _OS X Gatekeeper_: If you’re running Mac OS X 10.8 or later and get a message that says "Tabula can't be opened because it is from an unidentified developer", please let us know. (It shouldn't be happening anymore.) See this GateKeeper page for more assistance; your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."

Downloads

Windows & Linux users need to have Java installed to use Tabula. You can download Java here. The Mac version contains an integrated copy of Java.

Verification:

SHA-256

$ shasum -a256 tabula*.zip
4df6dce00f3bf7393684cb832c4c3cf3e2539edb6d62d3a0042330254d593826  tabula-jar-1.1.0c.zip
70ec8a524e881ed66d6048776ed0ceb16a0c8b68d06e1295a39698e836274b04  tabula-mac-1.1.0c.zip
243270c5918229415223794da56dffd8ba102683d195579a7c9f6aa578a2765a  tabula-win-1.1.0c.zip
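
Rather than comparing digests by eye, the check can be scripted. This is a minimal sketch under stated assumptions: the function names sha256_of and verify_sha256 are hypothetical, and it assumes either sha256sum or shasum is installed.

```shell
# Print the SHA-256 digest of a file.
# Uses sha256sum when available, otherwise falls back to shasum -a 256.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Compare a file's digest against a published value.
# $1 = file, $2 = expected lowercase hex digest
verify_sha256() {
    actual=$(sha256_of "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1" >&2
        return 1
    fi
}

# Example, using the digest listed above for the jar distribution:
# verify_sha256 tabula-jar-1.1.0c.zip 4df6dce00f3bf7393684cb832c4c3cf3e2539edb6d62d3a0042330254d593826
```

A non-zero exit status means the download does not match the published digest and should not be used.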

PGP

You can also verify that you are downloading an authentic, unmodified version of Tabula using PGP. Files are signed with Mike Tigas’ key (0xA993E7156E0E9923), which is available here, on Keybase, or on most key servers.

Download the .zip of the Tabula version you want and also download the corresponding .zip.asc, then use the gpg --verify command, such as:

$ gpg --verify tabula-jar-1.1.0c.zip.asc

You will want the output to contain Good signature from "Mike Tigas <...>" somewhere in it.
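
To script this check instead of reading the output, one option is to parse gpg's machine-readable status lines. The sketch below makes several assumptions: the helper name good_signature_from is hypothetical, the signing key has already been imported, and the signature was made directly by the key ID quoted above rather than by a subkey (in which case gpg would report the subkey's ID instead).

```shell
# Succeed only if gpg reports a good signature from the expected key.
# Reads gpg status lines (as produced by --status-fd) on standard input.
# $1 = expected long key ID, e.g. A993E7156E0E9923
good_signature_from() {
    # -F matches the text literally, so the brackets need no escaping
    grep -qF "[GNUPG:] GOODSIG $1 "
}

# Example:
# gpg --status-fd 1 --verify tabula-jar-1.1.0c.zip.asc tabula-jar-1.1.0c.zip 2>/dev/null \
#     | good_signature_from A993E7156E0E9923 && echo "signature OK"
```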


Pre-release

1.1.0-beta.1

@mtigas released this Mar 31, 2016 · 41 commits to master since this release

tabula 1.1.0-beta.1 / tabula-java 0.9.0

We're proud to announce the first public beta preview of Tabula 1.1! This version contains several bugfixes and a rewrite of our processing backend which should provide a significant performance increase.

This is a beta release: because Tabula utilizes a new processing backend, users may encounter bugs that were not present in previous versions of Tabula. If you have any issues with this beta version of Tabula, please let us know!

New features / bugfixes

  • Tabula now extracts tables up to 7x faster than previous versions.
  • Table auto-detection has also been improved. (#456, tabulapdf/tabula-java#56)
  • If there’s an error during file upload & initial processing, warn the user. (#433)
  • Lots and lots of other improvements — the extraction and processing backend has been completely rewritten!

Known Issues & Caveats

  • Bug: This version of Tabula always shows that there is an update, even for the same version: "New version! Tabula 1.1.0-beta.1 is available (you have 1.1.0-beta1)".
  • Bug: Although the backend for the Tabula app has been rewritten using tabula-java instead of tabula-extractor, the "script" export option still generates commands to call the older tabula-extractor. To fix this, you may follow the download and example instructions for tabula-java, and replace the tabula command with java -jar tabula-0.9.0-jar-with-dependencies.jar in the generated script. (#484)
  • Caveat: Tabula only works on text-based PDFs (ones where you can select text). Scanned documents do not work, and we do not recommend OCR for large files unless you have a data cleaning plan, since even state-of-the-art OCR software can have significant error rates.
  • _OS X Gatekeeper_: If you’re running Mac OS X 10.8 or later and get a message that says "Tabula can't be opened because it is from an unidentified developer", please let us know. (It shouldn't be happening anymore.) See this GateKeeper page for more assistance; your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."
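
The command substitution described in the "script" export bug above can be applied with sed. This is a minimal sketch, not an official fix: the helper name use_tabula_java is hypothetical, and it assumes each exported command starts with the word tabula followed by a space. Option names may also differ between tabula-extractor and tabula-java, so review the result before running it.

```shell
# Rewrite an exported script so that lines invoking the older `tabula`
# command call the tabula-java jar instead.
# Reads the exported script on stdin and writes the rewritten script to stdout.
# Assumption: each exported command begins with "tabula " at the start of a line.
use_tabula_java() {
    sed 's/^tabula /java -jar tabula-0.9.0-jar-with-dependencies.jar /'
}

# Example (file names are placeholders):
# use_tabula_java < exported.sh > exported-java.sh
```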

Downloads

Windows & Linux users need to have Java installed to use Tabula. You can download Java here. The Mac version contains an integrated copy of Java.

Verification:

SHA-1

$ shasum -a1 tabula*.zip
647602da17d365f107260afd8a8b5b3e1d687b98  tabula-jar-1.1.0-beta.1a.zip
054e5c2eb68149f3384b56cc615539e4050de479  tabula-mac-1.1.0-beta.1a.zip
bbdca4686b3818e9ec0be5d36e7bc48e944a77d6  tabula-win-1.1.0-beta.1a.zip

PGP

You can also verify that you are downloading an authentic, unmodified version of Tabula using PGP. Files are signed with Mike Tigas’ key (0xA993E7156E0E9923), which is available here, on Keybase, or on most key servers.

Download the .zip of the Tabula version you want and also download the corresponding .zip.asc, then use the gpg --verify command, such as:

$ gpg --verify tabula-jar-1.1.0-beta.1.zip.asc

You will want the output to contain Good signature from "Mike Tigas <...>" somewhere in it.


1.0.1

@mtigas released this Sep 15, 2015 · 109 commits to master since this release

tabula 1.0.1 / tabula-extractor 0.8.0

This is a minor update that fixes several major issues with our 1.0 release.

New features / bugfixes

  • "Extract" button now works in Chrome and Safari. (#350 #368 #375 #383 #385 #386)
  • Fix some auto-detect behavior in documents with many pages. (#355 #356 #358 #360)
  • Port number changed back to 8080 due to some situations where Tabula did not actually run on the desired port.
  • Fix some behavior when loading documents imported with old versions of Tabula.
  • jruby updated to 1.7.22 (#381)
  • Mac version now uses Java 1.8.0_60
  • tabula-extractor updated to 0.8.0 (#382)

Known Issues & Caveats

  • Caveat: Tabula only works on text-based PDFs (ones where you can select text). Scanned documents do not work, and we do not recommend OCR for large files unless you have a data cleaning plan, since even state-of-the-art OCR software can have significant error rates.
  • _OS X Gatekeeper_: If you’re running Mac OS X 10.8 or later and get a message that says "Tabula can't be opened because it is from an unidentified developer", please let us know. (It shouldn't be happening anymore.) See this GateKeeper page for more assistance; your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."

Downloads

Verification:

SHA-1

$ shasum -a1 tabula*.zip
3e1a9879ad2d414c90917a97ed2c48171897d710  tabula-jar-1.0.1.zip
674eea7018411bdecf9c32052de7305a90950b01  tabula-mac-1.0.1.zip
2e72609435412e93a81ec7c1ab3db1219806cd0c  tabula-win-1.0.1.zip

PGP

You can also verify that you are downloading an authentic, unmodified version of Tabula using PGP. Files are signed with Mike Tigas’ key (0xA993E7156E0E9923), which is available here, on Keybase, or on most key servers.

Download the .zip of the Tabula version you want and also download the corresponding .zip.asc, then use the gpg --verify command, such as:

$ gpg --verify tabula-jar-1.0.1.zip.asc


1.0.0

@mtigas released this Aug 6, 2015 · 109 commits to master since this release

tabula 1.0.0 / tabula-extractor 0.7.6

We're proud to announce the release of Tabula 1.0!

This release features an overhaul of the Tabula user interface, designed by Jason Das (@floodfish). The new interface improves page selection and streamlines a typical user’s workflow.

Special thanks to the Knight Foundation for supporting this work with their Prototype Fund, and thanks to all who helped us squash bugs as we beta tested this redesign over the summer.

New features / bugfixes

  • New user interface!
  • Tabula now runs on port 34555 by default, preventing conflict with some software running on port 8080. To access Tabula, you now go to http://127.0.0.1:34555/ instead of http://127.0.0.1:8080/. (#322 #331)
  • Fixed issue where "copy to clipboard" output was returning bad (unquoted) data. (#315)
  • [Mac] Tabula now comes bundled with Java (Java 8u51); fixes "Legacy Java Environment (SE 6) Is Required" issue on newer versions of OS X. (#237)
  • [Mac] Fixed "…can't be opened because it is from an unidentified developer" message in some cases, due to using the wrong codesigning identity. (#327)

Known Issues & Caveats

  • "Export" download feature is currently broken in Chrome. #350
  • Some intermittent errors, possibly due to autodetect. (#355 #356 #358 #360) If you encounter an error while extracting, please report it to us and tell us the steps you took before the error occurred.
  • Caveat: Tabula only works on text-based PDFs (ones where you can select text). Scanned documents do not work, and we do not recommend OCR for large files unless you have a data cleaning plan, since even state-of-the-art OCR software can have significant error rates.
  • _OS X Gatekeeper_: If you’re running Mac OS X 10.8 or later and get a message that says "Tabula can't be opened because it is from an unidentified developer", please let us know. (It shouldn't be happening anymore.) See this GateKeeper page for more assistance; your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."

Downloads

Verification:

SHA-1

$ shasum -a1 tabula*.zip
432927b1f9e52e407912b0b2692c579144ea2b89  tabula-jar-1.0.0.zip
32ff0912e5c5db9d687d7b06c42b7163867334d7  tabula-mac-1.0.0.zip
187ab31958ff8df09afa5f36e9be4f02255ba75d  tabula-win-1.0.0.zip

PGP

You can also verify that you are downloading an authentic, unmodified version of Tabula using PGP. (As an example, you can read more about what this means on the Tor Project’s page about how users can verify their downloads.) Files are signed with Mike Tigas’ key (0xA993E7156E0E9923), which is available here, on Keybase, or on most key servers.

Download the .zip of the Tabula version you want and also download the corresponding .zip.asc, then use the gpg --verify command, such as:

$ gpg --verify tabula-jar-1.0.0.zip.asc


0.9.7

@mtigas released this Jan 31, 2015 · 293 commits to master since this release

tabula 0.9.7 / tabula-extractor 0.7.6


New features / bugfixes

  • Fix Internal Server Error issue when selecting empty areas #207 #234 #235 #238 (tabula-extractor 0.7.6)
  • Downgraded to jRuby 1.7.15, due to encoding error causing Internal Server Error in Windows #203

Checksums:

MD5

MD5 (tabula-jar-0.9.7.zip) = 4cdec4df69e832f7785bea938e3a702c
MD5 (tabula-win-0.9.7.zip) = b3bdd152a70a12a9fca217956601d5de
MD5 (tabula-mac-0.9.7.zip) = 2daa05889afdb68f0b3bafd16b8b9f01
MD5 (tabula-mac-0.9.7-large-experimental.zip) = 0bde53edfb5fb93cd526582cbce8eb2d

SHA-1

ac10d31de01dff7edd2ef09e75cf6cbb5e997442  tabula-jar-0.9.7.zip
878b1969bbade75427ae39a5bc054a43ca308785  tabula-win-0.9.7.zip
97a53ec892b6d9ba623e1977bc7a6a7ff951e93b  tabula-mac-0.9.7.zip
8588719dbd36bea98ff8aac09c99457a16657083  tabula-mac-0.9.7-large-experimental.zip

_OS X Gatekeeper_: If you’re running Mac OS X 10.8 or later and get a "Tabula can't be opened because it is from an unidentified developer" message, please let us know. Generally, you can bypass this by right-clicking or control-clicking on the app and then pressing "Open". See this GateKeeper page for more assistance — your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."

_OS X & Java SE 6 runtime_: If you receive an error that says "you need to install the legacy Java SE 6 runtime" and are using OS X 10.9 Mavericks or OS X 10.10 Yosemite, please try downloading tabula-mac-0.9.7-large-experimental.zip and report whether it worked or not by commenting on this thread.


0.9.6

@mtigas released this Sep 29, 2014 · 304 commits to master since this release

tabula 0.9.6 / tabula-extractor 0.7.5

Note: Tabula 0.9.6 for Windows was temporarily unavailable due to an issue; Windows users can now download and install this version.

New features / bugfixes

  • Tabula doesn't trigger the Gatekeeper "unidentified developer" prompt anymore!
  • Certain encoding issues under Windows (incompatible character encodings: UTF-8 and CP850) are fixed. Ticket: #102; related: #190 #197
  • Improved detection of tables with no bounding frame. Ticket: tabulapdf/tabula-extractor#69
  • An "advanced options" menu is available, including the ability to generate scripts for the command-line version of tabula-extractor.
  • Upgraded to jRuby 1.7.16, which may improve stability and performance.

Checksums:

MD5

$ md5 tabula*.zip
MD5 (tabula-jar-0.9.6.zip) = fc5ad534357887d53b36f1f7947f0808
MD5 (tabula-mac-0.9.6.zip) = 586a9e8e00b752db96737c929436ff26
MD5 (tabula-win-0.9.6a.zip) = 18a0f674744ac71a6ba67b3226ffec5c

SHA-1

$ shasum -a1 tabula*.zip
46c2d8b6c5ee595823e64d8ba9c9548831ba70fd  tabula-jar-0.9.6.zip
3fced0b4cac5f9c8c83c254568ae1328592b62fb  tabula-mac-0.9.6.zip
0d970913844690287540d1e9c56fa8503576ecd7  tabula-win-0.9.6a.zip

_Note_: If you’re running Mac OS X 10.8 or later and get a "Tabula can't be opened because it is from an unidentified developer" message, please let us know. Generally, you can bypass this by right-clicking or control-clicking on the app and then pressing "Open". See this GateKeeper page for more assistance — your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."


0.9.5

@mtigas released this May 25, 2014 · 323 commits to master since this release

tabula 0.9.5 / tabula-extractor 0.7.4

NOTE: Due to a packaging error, the previous 0.9.4 version has been replaced. Tabula 0.9.5 is functionally identical to version 0.9.4.

New features / bugfixes

  • v0.9.5.2 (Mac OS X only): Tabula doesn't trigger the Gatekeeper "unidentified developer" prompt anymore. (We got an Apple Developer account and a signing key!)
  • v0.9.5: Fix issue where systems without Git could not open Tabula. Also fixes update notification mechanism within the app.
  • v0.9.4: Fix "(NoMethodError) undefined method `lines' for []:Array" error when user selects an area with no text. Tickets #170 #168 #159. (tabula-extractor 0.7.4)
  • v0.9.4: Fix launch issue on Linux systems where Java cannot access the desktop environment's default browser. Tickets #171 #148 #147.
  • v0.9.4: Fix "flickering" issue when the text selection’s pop-up window appears. #175 #177
  • v0.9.4: Upgraded to jRuby 1.7.12 & Warbler 1.4.2, which may improve stability and performance.

Checksums:

MD5

$ md5 tabula*.zip
MD5 (tabula-jar-0.9.5.zip) = 50a2dbd22807499cb1b293efd1991d89
MD5 (tabula-mac-0.9.5.2.zip) = c2df654dde586c8665772500f3cacf40
MD5 (tabula-win-0.9.5.zip) = f94bd7c3b7be1d7605277ffbc8fbac81

SHA-1

$ shasum -a1 tabula*.zip
24f0f1215a2d97d34b526342664cea6fb657148a  tabula-jar-0.9.5.zip
fa22acf2a2be0b982b9b4e6eab8156dab30bd298  tabula-mac-0.9.5.2.zip
27254cdf07df61e75b3a963542816f8d08dd0cb0  tabula-win-0.9.5.zip

_Note_: If you’re running Mac OS X 10.8 or later and get a "Tabula can't be opened because it is from an unidentified developer" message, please let us know. Generally, you can bypass this by right-clicking or control-clicking on the app and then pressing "Open". See this GateKeeper page for more assistance — your "Allow applications downloaded from" setting should be set to "Mac App Store and identified developers" or "Anywhere."


0.9.4

@mtigas released this May 16, 2014 · 324 commits to master since this release

tabula 0.9.4 / tabula-extractor 0.7.4

NOTE: Due to a packaging error, this build has been replaced by Tabula 0.9.5, which is functionally identical to this version.
