Commit

Merge pull request #48 from deborahgu/iamaster
Incorporated all internetarchive fork changes, with some efficiency, PEP8 modifications.
deborahgu committed May 14, 2018
2 parents c76af24 + 89d918d commit c61c24f
Showing 19 changed files with 299 additions and 199 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
# Byte-compiled / optimized / DLL files
__pycache__/
.pytest_cache/
*.py[cod]
*$py.class

1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
include requirements.txt
56 changes: 22 additions & 34 deletions README.rst
@@ -9,7 +9,7 @@ Introduction
This module transforms ABBYY XML documents, generated by ABBYY FineReader 10,
into primitively accessible ePub 3. The code is optimized for ABBYY XML
documents created by the Internet Archive, though it may work for other ABBYY
XML as well.
XML as well.

Features
========
@@ -37,45 +37,43 @@ Requirements

* Python 3
* If running epubcheck, a Java Runtime environment
* If running DAISY Ace, Node.js
* If using Kakadu, `install the binaries <http://kakadusoftware.com/downloads/>`_ and add the your PATH and LD_LIBRARY_PATH

Usage
=====

From within a Python program:

.. code:: python
.. code:: python
from abbyy_to_epub3 import create_epub
book = create_epub.Ebook('docname') # See *Assumptions* below.
book.craft_epub()
From the shell:

.. code:: bash
.. code:: bash
abbyy2epub docname # See *Assumptions* below.
The available command line arguments are:

.. code:: bash
usage: abbyy2epub [-h] [-d] [--epubcheck] [--ace] docname
usage: abbyy2epub [-h] [-d] [--epubcheck] docname
Process an ABBYY file into an EPUB
positional arguments:
item_dir The file path where this item's files are kept.
item_identifier The unique ID of this item.
item_bookpath The prefix to a specific book within an item.In a simple
book, usually the same as the item_identifier.
book, usually the same as the item_identifier.
optional arguments:
-h, --help show this help message and exit
-d, --debug Show debugging information
--epubcheck Run EpubCheck on the newly created EPUB
--ace Run DAISY Ace on the newly created EPUB
System dependencies
@@ -85,47 +83,39 @@ If you'd like to run `epubcheck <https://github.com/IDPF/epubcheck>`_, there
are certain system dependencies. Depending on running environment, these may
need to be manually installed. On Ubuntu, I installed these with:

.. code:: bash
.. code:: bash
sudo apt-get install default-jre libpython3-dev
If you'd like to run the DAISY Ace accessibility checker, you'll also need
Node.js and Ace. On Ubuntu, I installed these with:

.. code:: bash
sudo apt-get install nodejs
sudo npm install ace-core -g
If Ace successfully installed, you should be able to run:

.. code:: bash
ace --help
at the command line. This should display usage information. For more
information see the `Ace Getting Started Guide
<http://inclusivepublishing.org/toolbox/accessibility-checker/getting-started/>`.
Installation
============

This package can be installed on your local system. From the directory
containing setup.py:

.. code:: bash
.. code:: bash
pip install -r requirements.txt
python setup.py develop
pip install .
You can rebuild the documentation, which is generated with Sphinx.

.. code:: bash
.. code:: bash
cd docs
make html
Deploying at the Internet Archive
===================

Before deploying, make sure you bump the version of the package in `__init__.py`. Then, run the `upload.sh` script in the root of the repository and enter the appropriate Internet Archive credentials when prompted.

You can test that the package has been installed correctly by going to https://devpi.archive.org or by running `$ pip3 install --upgrade -i https://petaboxdevpi:{PASSWORD}@devpi.archive.org/books/formats abbyy_to_epub3`.

Note that `petaboxdevpi:{PASSWORD}` is not needed inside IA network`

Testing
===================

@@ -143,11 +133,9 @@ specific book. Given a datanode and an `item_dir` of an item, all the
constituent files for a book can be constructed using `item_identifier` and
`item_bookpath` in the following ways:

In order to access the files of an item, you need to know:

# The `item_identifier` (the unique ID of this item)
# The `item_dir` is the file path where this items files are kept
# The `item_bookpath` is name of the particular book file, often the same as `item_identifier`
- The `item_identifier` (the unique ID of this item)
- The `item_dir` is the file path where this items files are kept
- The `item_bookpath` is name of the particular book file, often the same as `item_identifier`

The structure is assumed to be:

4 changes: 4 additions & 0 deletions abbyy_to_epub3/__init__.py
@@ -0,0 +1,4 @@

__title__ = 'abbyy_to_epub3'
__version__ = '1.6.5'
__author__ = '@deborahgu'
32 changes: 20 additions & 12 deletions abbyy_to_epub3/commandline.py
@@ -19,7 +19,7 @@
import argparse
import logging

from abbyy_to_epub3 import create_epub
from abbyy_to_epub3.create_epub import Ebook

logger = logging.getLogger(__name__)

@@ -51,16 +51,22 @@ def main():
),
)
parser.add_argument(
'--epubcheck',
default=False,
action='store_true',
help='Run EpubCheck on the newly created EPUB',
'-o',
'--out',
default=None,
help='Output path for epub',
)
parser.add_argument(
'--ace',
default=False,
action='store_true',
help='Run DAISY Ace on the newly created EPUB',
'--tmpdir',
default=None,
help='Specify custom path for tmp abbyy and jp2 files'
)
parser.add_argument(
'--epubcheck',
nargs='?',
const=Ebook.DEFAULT_EPUBCHECK_LEVEL,
help='Run EpubCheck on the newly created EPUB. '
'Options: `warning` & worse (default), `error` & worse, `fatal` only',
)
args = parser.parse_args()

@@ -69,14 +75,16 @@ def main():
if debug:
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.DEBUG)
book = create_epub.Ebook(
book = Ebook(
args.item_dir,
args.item_identifier,
args.item_bookpath,
debug=debug,
args=args,
epubcheck=args.epubcheck,
)
book.craft_epub(
epub_outfile=args.out or 'out.epub', tmpdir=args.tmpdir
)
book.craft_epub()


if __name__ == "__main__":
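
Note on the new `--epubcheck` option in commandline.py: with `nargs='?'` and a `const`, argparse should yield `None` when the flag is absent, the `const` when the flag is given bare, and the supplied string when a level follows it. The snippet below is a minimal sketch of that behavior, not the project's code. It assumes a stand-in constant `DEFAULT_EPUBCHECK_LEVEL = 'warning'` (the real value of `Ebook.DEFAULT_EPUBCHECK_LEVEL` is not shown in this diff; 'warning' is only inferred from the help text) and a hypothetical `build_parser` helper with a single positional argument.

```python
import argparse

# Assumption: stand-in for Ebook.DEFAULT_EPUBCHECK_LEVEL, whose real value
# is not visible in this diff. 'warning' is inferred from the help text.
DEFAULT_EPUBCHECK_LEVEL = 'warning'


def build_parser():
    """Hypothetical helper mirroring the options added in this commit."""
    parser = argparse.ArgumentParser(
        description='Process an ABBYY file into an EPUB'
    )
    parser.add_argument('item_dir')
    parser.add_argument('-o', '--out', default=None,
                        help='Output path for epub')
    parser.add_argument(
        '--epubcheck',
        nargs='?',                      # the level after the flag is optional
        const=DEFAULT_EPUBCHECK_LEVEL,  # used when the flag is given bare
        help='Run EpubCheck on the newly created EPUB',
    )
    return parser


parser = build_parser()
absent = parser.parse_args(['/tmp/item'])
bare = parser.parse_args(['/tmp/item', '--epubcheck'])
explicit = parser.parse_args(
    ['/tmp/item', '--epubcheck', 'error', '-o', 'book.epub']
)

# Show the three epubcheck states, then the output-path fallback that
# main() applies with `args.out or 'out.epub'`.
print(absent.epubcheck, bare.epubcheck, explicit.epubcheck)
print(absent.out or 'out.epub', explicit.out or 'out.epub')
```

Because a bare `--epubcheck` accepts an optional value, putting the flag after the positional arguments (or writing `--epubcheck=error`) may avoid argparse reading a following positional as the level.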
3 changes: 1 addition & 2 deletions abbyy_to_epub3/constants.py
@@ -1,4 +1,4 @@
# Copyright 2017 Deborah Kaplan
# Copyright 2017 Deborah Kaplan
#
# This file is part of Abbyy-to-epub3.
# Source code is available at <https://github.com/deborahgu/abbyy-to-epub3>.
@@ -31,7 +31,6 @@
# on each block, just use a custom pagetype for anything where
# "addToAccessFormats" is set to false. This is 'skippable.'
skippable_pages = [
'cover',
'copyright',
'color card',
'skippable',
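
Note on the constants.py change: dropping 'cover' from `skippable_pages` presumably means cover pages are no longer filtered out of the generated EPUB. The snippet below is a rough sketch of how such a membership check could work. It uses only the entries visible in this truncated hunk, and `is_skippable` is a hypothetical helper, not a function from the package.

```python
# Entries visible in the diff after this commit. The real list continues
# past the truncated hunk, so this is deliberately incomplete.
skippable_pages = [
    'copyright',
    'color card',
    'skippable',
]


def is_skippable(pagetype):
    """Hypothetical helper: True when a page type should be left out."""
    return pagetype in skippable_pages


# 'cover' was removed from the list in this commit, so it is now kept.
for pagetype in ('cover', 'copyright'):
    print(pagetype, is_skippable(pagetype))
```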
