
# -*- rd -*-

ChupaText has been re-implemented in Ruby. This code is no longer
maintained. See https://github.com/ranguba/chupa-text/ for the new
implementation.

= README --- An introduction to ChupaText, a text extraction utility

== Name

ChupaText

== Author

  * Nobuyoshi Nakada <nakada@clear-code.com>
  * Kouhei Sutou <kou@clear-code.com>

== License

  * Source: LGPLv2.1 or later. (detail:
    ((<"license/lgpl-2.1.txt"|URL:http://www.gnu.org/licenses/lgpl-2.1.html>)))
  * Documentation: triple-licensed under LGPL, GFDL and/or CC.
    * LGPL: v2.1 or later. (detail:
      ((<"license/lgpl-2.1.txt"|URL:http://www.gnu.org/licenses/lgpl-2.1.html>)))
    * GFDL: v1.3 or later. (detail:
      ((<"license/gfdl-1.3.txt"|URL:http://www.gnu.org/licenses/fdl.html>)))
    * CC: ((<BY-SA|URL:http://creativecommons.org/licenses/by-sa/3.0/>))
  * Exceptions:
    * modules/excel/: GPLv2. (detail:
      ((<"license/gpl-2.txt"|URL:http://www.gnu.org/licenses/gpl-2.html>)))
      The files in this directory are part of ((<Gnumeric|URL:http://projects.gnome.org/gnumeric/>)).
    * ...

== What's this?

ChupaText is a text extraction utility. It can extract text
and metadata from PDF and office documents. You can use it
via a library, the command line or a Web service.

== Dependencies

Required:
  * GLib >= 2.24
  * libgsf

Optional:
  * Poppler
  * wv
  * libgoffice
  * Gnumeric
  * LibreOffice, OpenOffice.org or unoconv
  * ruby >= 1.9.2
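
The required libraries above can be checked before building. The
following is only a sketch: it assumes that pkg-config is installed
and that the libraries are registered under the module names
glib-2.0 and libgsf-1. Neither assumption comes from this README,
and check_dep is a hypothetical helper name.

```shell
# Hypothetical helper, not part of ChupaText.
# $1: pkg-config module name, $2: optional minimum version.
check_dep() {
  if [ -n "$2" ]; then
    pkg-config --atleast-version="$2" "$1"
  else
    pkg-config --exists "$1"
  fi
}

# Module names are assumptions; adjust them for your system.
check_dep glib-2.0 2.24 || echo "GLib >= 2.24 was not found"
check_dep libgsf-1 || echo "libgsf was not found"
```

The optional libraries can be checked the same way once you know
their module names on your system.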

== Get

tar.gz: ((<URL:http://rubyforge.org/frs/?group_id=8073>))

== Repository

The ChupaText repository is hosted on
((<GitHub|URL:http://github.com/ranguba/chupatext>)).

  % git clone git://github.com/ranguba/chupatext.git

== Install

See ((<install>)).

== Usage

  % chupatext [OPTION ...] FILE ...

FILE is a file that you want to extract text from.

See ((<chupatext|"doc/chupatext.rd">)) for more details.
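
Because the command accepts several FILE arguments, it is easy to
wrap in a small script. The sketch below runs chupatext once per
file and saves the result next to the input. It assumes that
chupatext writes its result to standard output, which this README
does not state, and extract_all is a hypothetical helper name.

```shell
# Hypothetical helper, not part of ChupaText.
# Assumption: chupatext writes its result to standard output.
extract_all() {
  for file in "$@"; do
    # Save the output for FILE as FILE.txt and report failures.
    if ! chupatext "$file" > "$file.txt"; then
      echo "extraction failed: $file" >&2
    fi
  done
}
```

For example, calling extract_all with report.pdf and sheet.xls
(hypothetical file names) would create report.pdf.txt and
sheet.xls.txt.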

== Thanks

  * Yuto Hayamizu