@manisandro manisandro released this Sep 26, 2018 · 46 commits to master since this release

gImageReader 3.3.0 (Sep 26 2018):
This is the first stable release of the 3.3.x series. The main change compared to 3.2.99 is support for the script traineddata files, which were introduced with tesseract 4.x.

As with previous releases, the Windows builds using tesseract 4 should still be considered experimental.

For a full list of changes between 3.2.99 and 3.3.0, see the git commit log.

Pre-release

@manisandro manisandro released this Feb 24, 2018 · 157 commits to master since this release

gImageReader 3.2.99 (Feb 24 2018):
This is the beta release for gImageReader 3.3.0. The main highlights are a much-expanded hOCR editor and many bug fixes. Consult the changelog below for details. Special thanks to @ZaMaZaN4iK and @SantosSi for their valuable contributions, both in code and in ideas for improvement.

There are a number of incomplete translations, so this would be a great moment for interested people to update their translations. gImageReader now hosts its translations on Weblate, so translating is easier than ever!

Please report any issues you might find to ensure a polished 3.3.0 release.

As with previous releases, the Windows builds using tesseract 4 should be considered experimental.

Binary packages for Linux are available for Ubuntu in the gImageReader-devel PPA and for Fedora in this COPR repository.

Changelog

  • Add support for reading DJVU documents
  • Add support for encrypted PDF files
  • Rewrite hOCR editor and greatly expand its functionality:
    • Allow displaying confidence values in hOCR tree
    • Allow clicking in the canvas to jump to the corresponding item in the hOCR tree
    • Support mass-editing of hOCR child item attributes from parent
    • Honour font family attributes if possible
    • Honour and allow toggling bold and italic attributes
    • Correctly honour the baseline
    • Add search/replace and substitution list support
    • Add preview mode while editing
    • Allow manually adding lines, words and paragraphs
    • Allow swapping items
    • Automatically adjust parent bounding boxes when resizing and removing children
    • Add navigation toolbar to facilitate navigating through the hOCR tree
    • Use relative paths to source files in hOCR HTML document if source files are on the same level as or below the hOCR file
    • Add export to text
    • Add export to ODT
    • Allow choosing paper size in PDF export
    • Allow setting document metadata in PDF export
    • Allow setting encryption in PDF export
    • [Qt] Allow using QPrinter as PDF export backend, which has better support for complex scripts
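
The relative-path rule for source files listed above can be illustrated with a short Python sketch. It is not taken from gImageReader's source; the function name `source_ref` and the example paths are hypothetical, and POSIX-style paths are assumed. A source file on the same level as or below the hOCR file is referenced relative to the hOCR file's directory, and any other source file keeps its path unchanged.

```python
from pathlib import PurePosixPath


def source_ref(hocr_file: str, source_file: str) -> str:
    """Return the path to store in the hOCR document for a source file.

    Hypothetical helper: if the source file lives in the hOCR file's
    directory or in one of its subdirectories, return a path relative to
    that directory; otherwise return the source path unchanged.
    """
    hocr_dir = PurePosixPath(hocr_file).parent
    source = PurePosixPath(source_file)
    try:
        # relative_to() raises ValueError when source is not inside hocr_dir
        return str(source.relative_to(hocr_dir))
    except ValueError:
        return str(source)


if __name__ == "__main__":
    print(source_ref("/docs/scan/out.html", "/docs/scan/img/p1.png"))
    print(source_ref("/docs/scan/out.html", "/docs/other/p1.png"))
```

Keeping references relative in this case means the hOCR file and its images can be moved together without breaking the links.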

@manisandro manisandro released this Jul 1, 2017 · 452 commits to master since this release

gImageReader 3.2.3 (Jul 01 2017):

  • Fix broken hOCR export
  • Add option to prepend source filename / page to plain text output

Please note that the tesseract4.0.0.git2b854e3 builds are experimental, intended for those who want to try out the latest tesseract 4.0.0 alpha version. Make sure you update your tessdata files if you use that version!

@manisandro manisandro released this Jun 30, 2017 · 461 commits to master since this release

gImageReader 3.2.2 (Jun 30 2017):

  • Attempt to use original source image for PDF output
  • Allow collapsing/expanding branches of hOCR tree via context menu
  • Recognize guillemets as quote characters
  • Fix crash when adding zero-page sources
  • Fix possible crash when rapidly switching documents
  • [Gtk] Fix output pane orientation not properly restored
  • [Gtk] Don't crash when rendering of image fails
  • [Gtk] Fix icons not appearing with recent Gtk versions
  • [Qt] Don't display empty image if rendering of downscaled image fails

Please note that the tesseract4.0.0.git2b854e3 builds are experimental, intended for those who want to try out the latest tesseract 4.0.0 alpha version. Make sure you update your tessdata files if you use that version!

@manisandro manisandro released this Feb 10, 2017 · 486 commits to master since this release

gImageReader 3.2.1 (Feb 10 2017):

  • Add possibility to rotate individual pages of multipage documents
  • Ensure the tessdata manager downloads compatible tesseract language definitions
  • Add CCITT Group4 compression option for monochrome PDF export
  • Allow choosing between diffuse and threshold dithering for monochrome PDF export
  • Preview JPEG compression quality in PDF output preview
  • Make brightness/contrast/resolution changes affect all selected sources
  • [Qt] Support multipage images through QImageReader (Qt5.9+ will support multipage TIFFs)
  • [Gtk] Fix hang when saving selection image
  • [Qt] Fix possible deadlock when rapidly switching sources
  • Updated translations

Update Feb 13 2017
Added experimental Windows builds using tesseract-4.0.0 alpha.

@manisandro manisandro released this Nov 23, 2016 · 522 commits to master since this release

gImageReader 3.2.0 (Nov 23 2016):

This is the first stable release of the 3.2.x series. It includes many bug fixes since 3.1.99, most of which were tracked down and patched by Daniel Plakhotich.

Starting from 3.2.0 I'll be maintaining a FAQ page.

Changelog:

@manisandro manisandro released this Oct 13, 2016 · 583 commits to master since this release

gImageReader 3.1.99 (Oct 13 2016):

This is the release candidate for gImageReader 3.2. The main highlight is a greatly enhanced hOCR editor and PDF export functionality.

Please report any issues you may find to ensure a polished 3.2.0 final release. If the translation for your language is missing or incomplete, this would be a good moment to submit an updated translation according to the instructions in the Readme.

Many thanks to all the users who provided valuable feedback and suggestions.

Changelog

  • General improvements:
    • Catch critical tesseract errors which otherwise result in the application crashing
    • Improve spelling dictionary auto-installation logic
    • Allow choosing whether to store language files (language definitions, spelling dictionaries) in system-wide or user-local directories
  • Plain text mode improvements:
    • Allow recognizing user-defined regions on multiple pages
    • Also treat \u2014 character as a hyphen
    • Make preserve paragraphs option correctly deal with trailing whitespace
  • hOCR editor improvements:
    • Add "Add to dictionary" and "Ignore word" actions to spell-checking menu in hOCR editor
    • Exclude non-word characters from spell-checking
    • Allow merging adjacent word items
    • Allow adjusting bounding boxes of document elements by resizing the selection in the canvas
    • Allow removing arbitrary items from the document tree
    • Allow defining custom graphic regions from context-menu of the respective page item
  • PDF export improvements:
    • Add previewing capability
    • Take into account baseline information to better position the words in the generated PDF
    • Add options to choose color format and compression of images written to PDF, which can greatly reduce the size of the PDF
    • Correctly handle paper size and DPI
    • Improve logic for uniformizing word and line spacing
    • Make sure the correct hyphen character is used, allowing PDF applications to correctly find hyphenated words
  • New and updated translations
  • Various bug fixes
  • Full details in commit log: https://github.com/manisandro/gImageReader/commits/master
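
As an illustration of the plain text item about treating \u2014 as a hyphen, the following Python sketch joins words that were split across a line break, accepting either the ASCII hyphen or \u2014 as the split marker. That this is where the character matters is an assumption, not something the changelog states; the function name `join_hyphenated` and the regular expression are hypothetical and not taken from gImageReader's source.

```python
import re

# Characters accepted as a line-end hyphen: ASCII hyphen-minus and U+2014.
# (Assumption for illustration; the changelog only names U+2014.)
HYPHENS = "-\u2014"

# A word character, a hyphen, optional spaces, a newline, optional spaces,
# then another word character.
_SPLIT_WORD = re.compile(rf"(\w)[{re.escape(HYPHENS)}][ \t]*\n[ \t]*(\w)")


def join_hyphenated(text: str) -> str:
    """Rejoin words that were hyphenated across a line break."""
    return _SPLIT_WORD.sub(r"\1\2", text)


if __name__ == "__main__":
    print(join_hyphenated("recog\u2014\nnition of exam-\nple text"))
```

Hyphens inside a line, as in "well-known", are left untouched because the pattern requires a line break after the hyphen.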

@manisandro manisandro released this May 3, 2016 · 691 commits to master since this release

gImageReader 3.1.91 (May 03 2016):

This is a beta release. Please report any issues you may find.

For the translation status, see https://translations.launchpad.net/gimagereader

Note: On recent Windows versions, if you want to use the Tessdata Manager, you currently need to run the program as administrator (via right-click on the application shortcut).

@manisandro manisandro released this Apr 27, 2016 · 697 commits to master since this release

gImageReader 3.1.90 (Apr 28 2016):

  • gImageReader 3.2 beta 1
  • Add an initial hOCR editor implementation, with possibility to save as hOCR HTML, PDF with invisible text overlay, or a PDF reconstructed from the extracted text and graphics
  • Allow selecting and working on multiple sources at once
  • Add a tessdata manager, to conveniently manage tesseract language definitions directly from the application
  • Show a progress bar when recognizing, add a cancel button
  • Modernized Gtk UI
  • Expose script and orientation detection support
  • Possibility to pan via middle-button drag
  • Remove the need to specify the culture code in custom language definitions, and use a built-in language-culture mapping instead to search for spelling dictionaries
  • Various bug fixes
  • Full details in commit log: https://github.com/manisandro/gImageReader/commits/master

This is a beta release. Please report any issues you may find.

For the translation status, see https://translations.launchpad.net/gimagereader

@manisandro manisandro released this Jun 30, 2015 · 816 commits to master since this release

gImageReader 3.1.2 (Jun 30 2015):

  • Fix incorrect behavior of "Append to current text" with multiple recognition areas

Update Feb 19 2016
Windows installers built against tesseract 3.04.00 are available for testing. People encountering crashes when using traineddata files generated for tesseract 3.04.00 should try these.

Update Feb 27 2016
Tesseract 3.04.00 Windows installers rebuilt to include SSL libraries (fixes dictionary autoinstall failures). Links to tessdata files in the manual have also been updated.