Skip to content

dwightkelly/jbig2enc

 
 

Repository files navigation

Welcome, gentle reader.

This is an encoder for JBIG2:
www.jpeg.org/public/fcd14492.pdf

JBIG2 encodes bi-level (1 bpp) images using a number of clever tricks to get
better compression than G4. This encoder can:
   * Generate JBIG2 files, or fragments for embedding in PDFs
   * Generic region encoding
   * Perform symbol extraction, classification and text region coding
   * Perform refinement coding and,
   * Compress multipage documents

It uses the (Apache-ish licensed) Leptonica library:
  http://www.leptonica.com/

You'll need version 1.63 - which is included with the distribution tarball.

KNOWN BUGS:

The refinement coding causes Acrobat to crash. It's not known if this is a bug
in Acrobat, though it may well be.

BUILDING:

First, read the file PATENTS in the same directory as this file.

Fetch leptonica from its website and build it. For this you will also
need the standard libtiff, libpng and libjpeg.

If leptonica was unpacked in ../leptonlib-1.58/ you should just be able to type
make. Otherwise, fix up the path at the beginning of the Makefile first.

USAGE:

See the jbig2enc.h header for the high level API, or the jbig2 program for an
example of usage. It's very short and most of it is taken up with argument
parsing.

CHANGES:

0.27:
  * Update to the latest Leptonica (1.58)
  * Fix comments in pdf.py (thanks Dan)

0.26:
  * Update to the latest Leptonica (1.53)

0.25:
   * Should build on Windows if you get a leptonica which does so (thanks to
     Dwight Kelly and a man known only as Dennis)

0.24:
   * Update to leptonica 1.38
   * Add ability to gray symbols but it's commented out because it does little
     good
   * Refinement support broke because of memory savings in leptonica

0.23:
   * Quick release to support leptonica 1.37 (now bundled)

0.22:
   * Added segmentation support to extract text from mixed images
   * Flyspeck removal disabled for now
   * A few minor fixes

0.21:
   * Patch from Alberto Accomazzi to fix the pdf.py script and to output more
     sensible filenames in pdf mode
   * Symbols which are only used once are now included in a per-page dictionary
     (great for scans of old books)
   * Fixed several scale bugs (one at 65K symbols and one at 100K symbols)
   * Page numbers > 256 went wrong

About

JBIG2 Encoder

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 94.3%
  • Python 5.7%