Automatically exported from code.google.com/p/oac-ead-to-pdf
fonts
images
javalib
log
sample_files
scripts
xslt
.hgignore
INSTALL.txt
LICENSE.txt
README.txt
check_oac_ead_font_support.sh
fix_relative_css_paths.py
font_supports_file.py
font_supports_file.sh
install.sh
oac4_to_pdf.xslt.in
oac_pdf-unifont.css.in
oac_pdf.css.in
pdf_gen.py
pdf_gen.sh
pdf_gen_by_size.py
pdf_gen_by_size.sh
pdf_gen_by_size_parallel.py
pdf_gen_by_size_parallel.sh
pdf_gen_list_parallel.py
timeout.py

README.txt

This is the code used to generate PDFs from Encoded Archival Description (EAD)
XML files for the Online Archive of California (OAC) website.

This is organically grown code for PDF generation. The pisa tool was chosen so
that the PDFs match the HTML view of the EAD as closely as possible.

This is a refactored release and requires upgrading a number of libraries.
The new libraries and code correctly handle Unicode characters and embed the
appropriate DejaVuSans font into the PDF for Unicode characters outside Latin-1.

It also adds the ability to take advantage of multiple processors with new code
in the *parallel.py files, which use the Parallel Python library to control the
multiple processes.

The main code is in pdf_gen.py, which can be run directly.
The files in the scripts/ directory are more specific to our OAC setup, but they do provide examples of various options.

INSTALL:
========

Run -->

python fix_relative_css_paths.py

to generate css and xslt files with the correct absolute path to your install
directory.
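
The idea behind this step can be sketched as follows. This is only an
illustration, not the contents of fix_relative_css_paths.py: the function name
render_templates and the @INSTALL_DIR@ placeholder are assumptions made for the
example, and the real *.in templates may mark the install path differently.

```python
import os

# Hypothetical placeholder token; the real templates may use something else.
PLACEHOLDER = "@INSTALL_DIR@"


def render_templates(install_dir, placeholder=PLACEHOLDER):
    """For each X.in file in install_dir, write X with the absolute path filled in."""
    install_dir = os.path.abspath(install_dir)
    written = []
    for name in sorted(os.listdir(install_dir)):
        if not name.endswith(".in"):
            continue
        source = os.path.join(install_dir, name)
        target = source[:-len(".in")]  # e.g. oac_pdf.css.in -> oac_pdf.css
        with open(source, encoding="utf-8") as handle:
            text = handle.read()
        with open(target, "w", encoding="utf-8") as handle:
            handle.write(text.replace(placeholder, install_dir))
        written.append(target)
    return written
```

Calling render_templates(".") from the install directory would, under these
assumptions, produce oac_pdf.css from oac_pdf.css.in and so on.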


Known Issues:
=============

On large EAD files (over a few megabytes), this program can take a *very* long
time to run. A timeout function wrapper was introduced to address this when
working on large sets of EAD files. Also see pdf_gen_by_size.py for code that
runs well on a large set of EAD files.

The timeout function relies on signals and does not work on Windows. If running
on Windows, you might try commenting out the timeout wrapping code.
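
For readers unfamiliar with the technique, a signal-based timeout wrapper
generally looks something like the sketch below. It is not the code in
timeout.py: the names timeout, ConversionTimeout and slow_conversion are made up
for this example. It relies on SIGALRM, which is why this approach is limited to
Unix-like systems and must run in the main thread.

```python
import signal
import time
from functools import wraps


class ConversionTimeout(Exception):
    """Raised when the wrapped call runs longer than allowed (hypothetical name)."""


def timeout(seconds):
    """Decorator that aborts the wrapped call after `seconds` (Unix only)."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            def on_alarm(signum, frame):
                raise ConversionTimeout(
                    "%s exceeded %s seconds" % (func.__name__, seconds))

            # Install our handler and start a one-shot real-time timer.
            previous = signal.signal(signal.SIGALRM, on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Always cancel the timer and restore the old handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, previous)
        return wrapper
    return decorator


@timeout(0.2)
def slow_conversion():
    time.sleep(5)  # stand-in for a long-running PDF conversion
    return "done"
```

A caller working through many files could catch ConversionTimeout, log the
file that was skipped, and move on to the next one.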

Contact me at: mark.redar@ucop.edu.