Commits on Oct 20, 2011
  1. Sketched out an idea for how qpdf's decryption could be integrated an…

    ben committed Oct 20, 2011
    …d 'just work.' The recursive loop is ugly and this is the first Ruby code I've ever written, so I'm sure there are 100 things wrong with it.
Commits on Oct 3, 2011
  1. Merge pull request #23 from minio-sk/portable-tests

    jashkenas committed Oct 3, 2011
    Make the tests portable
  2. Merge pull request #24 from minio-sk/feature-multiple_languages

    jashkenas committed Oct 3, 2011
    Feature multiple languages
Commits on Oct 1, 2011
  1. Add test for multiple language support

    Michal Barla committed Oct 1, 2011
    This test does not require any additional tesseract language backends to be
    installed, but might fail if tesseract changes its error messages in the
    future.
  2. Allow language parameter for tesseract text extraction

    Michal Barla committed Oct 1, 2011
  3. Make the tests portable:

    kremso committed Oct 1, 2011
    This patch addresses two problems with the tests:
     - Various tests rely on Dir.glob ordering. This is not reliable; this patch
       introduces assert_directory_contains to avoid Dir.glob ordering
       inconsistencies.
     - test_ocr_extraction relies on exact text match from tesseract. However, this
       differs with each version of tesseract. This patch instead checks that all
       required txt files exist and that they have reasonable size.
  4. Support libre office as office home param for java

    Michal Barla committed Oct 1, 2011
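The two fixes described in the "Make the tests portable" commit above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the helper names `directory_contains?` and `reasonable_size?` and the 10-byte threshold are assumptions made for this example (the commit only names `assert_directory_contains`, whose real signature is not shown here).

```ruby
require 'tmpdir'

# Hypothetical helper: true when `dir` holds exactly the expected file names.
# Both sides are sorted, so the order returned by Dir.glob does not matter.
def directory_contains?(dir, expected_names)
  actual = Dir.glob(File.join(dir, '*')).map { |path| File.basename(path) }
  actual.sort == expected_names.sort
end

# Hypothetical helper: instead of comparing OCR output to an exact string,
# only check that the file exists and is not suspiciously small.
def reasonable_size?(path, min_bytes)
  File.file?(path) && File.size(path) >= min_bytes
end

# Usage: create two files in arbitrary order and check them.
Dir.mktmpdir do |dir|
  %w[page_2.txt page_1.txt].each do |name|
    File.write(File.join(dir, name), 'some extracted text')
  end
  puts directory_contains?(dir, %w[page_1.txt page_2.txt])
  puts reasonable_size?(File.join(dir, 'page_1.txt'), 10)
end
```

Sorting both lists before comparing keeps the check independent of file system ordering, and the size check tolerates wording differences between tesseract versions.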
Commits on Sep 29, 2011
  1. Merge pull request #21 from simeonwillbanks/master

    jashkenas committed Sep 29, 2011
    File --mime-type option unrecognized on CentOS
Commits on Sep 28, 2011
  1. CentOS, and most likely other distros, do not have the --mime-type op…

    simeonwillbanks committed Sep 28, 2011
    …tion for file. Here is the error: 'file: unrecognized option --mime-type'. The --mime option is more standard.
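As a rough illustration of the switch from `--mime-type` to `--mime`: `file --mime` prints the type followed by a charset (for example `application/pdf; charset=binary`), so the caller has to strip the charset part. The sketch below assumes that output shape; `parse_mime_type` and `mime_type_of` are hypothetical names for this example, not the code from the commit.

```ruby
require 'shellwords'

# Hypothetical helper: extract the bare MIME type from `file --mime` output,
# which is assumed to look like "type/subtype; charset=...".
def parse_mime_type(file_output)
  file_output.strip.split(/;\s*/).first
end

# Assumed invocation: -b (brief) omits the file name from the output.
# The path is shell-escaped before being interpolated into the command.
def mime_type_of(path)
  parse_mime_type(`file -b --mime #{Shellwords.escape(path)}`)
end

puts parse_mime_type("application/pdf; charset=binary\n")
```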
Commits on Sep 15, 2011
  1. Merge pull request #13 from simeonwillbanks/master

    jashkenas committed Sep 15, 2011
    Inspect file mime-type
Commits on Sep 14, 2011
  1. Revert "removing Docsplit's default unsharp."

    jashkenas committed Sep 14, 2011
    This reverts commit e9153b7.
Commits on Sep 13, 2011
  1. Docsplit 0.6.0

    jashkenas committed Sep 13, 2011
  2. Issue #10, stop crying wolf.

    jashkenas committed Sep 13, 2011
Commits on Sep 9, 2011
  1. Merge pull request #19 from edtsech/add_brew_to_gh-pages

    jashkenas committed Sep 9, 2011
    Add `brew` command to installation part of gh-page.
  2. Add `brew` to installation.

    edtsech committed Sep 9, 2011
Commits on Sep 1, 2011
  1. At least on my version of Ubuntu (Natty Narwhal) the tesseract librar…

    palewire committed Sep 1, 2011
    …y is labeled as 'tesseract-ocr'
Commits on Aug 4, 2011
  1. Inspect file mime-type to determine if GraphicsMagick can convert fil…

    Simeon Willbanks committed Aug 4, 2011
    …e to PDF
Commits on Jul 25, 2011
  1. Merge pull request #12 from vrybas/11_escape_incoming_file_names

    jashkenas committed Jul 25, 2011
    11 escape incoming file names
Commits on Jul 22, 2011
Commits on May 16, 2011
Commits on May 13, 2011
  1. despeckle before OCR.

    jashkenas committed May 13, 2011
  2. Docsplit 0.5.2

    jashkenas committed May 13, 2011
Commits on Apr 26, 2011
  1. Docsplit 0.5.1

    jashkenas committed Apr 26, 2011
  2. default tesseract to english

    jashkenas committed Apr 26, 2011
Commits on Oct 20, 2010
  1. vowel_5 instead of vowel_4

    jashkenas committed Oct 20, 2010