EPUB KB policy testing
Contents of this repository
This repository contains a set of EPUB test files, created to test an automated procedure that validates EPUBs against KB institutional policies. These policies require the following:
- Files must be valid EPUB (version 2 or 3)
- Files must not contain DRM or encryption (edge case: font mangling, which should be permitted)
- All resources in the container must fall within the Core Media Types
- Files must not contain Digital Talking Book (DTB) content documents
As a result, most of the files in this repo deliberately violate one or more of the above requirements.
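The requirements above can be illustrated with a minimal Python sketch. This is not the automated procedure the test files were made for, only an assumption-laden illustration: the function name `policy_violations` is hypothetical, the `ALLOWED` set is an illustrative subset of the Core Media Types rather than the authoritative list, and only the IDPF font obfuscation algorithm is treated as permitted "encryption". Validity against the EPUB specification itself is left to Epubcheck.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}
XMLENC = "http://www.w3.org/2001/04/xmlenc#"
# Algorithm identifier used by IDPF font obfuscation (font mangling).
IDPF_FONT_OBFUSCATION = "http://www.idpf.org/2008/embedding"
DTBOOK = "application/x-dtbook+xml"
# Illustrative subset only (assumption); extend to match the policy in use.
ALLOWED = {
    "application/xhtml+xml", "application/x-dtbncx+xml", "text/css",
    "image/gif", "image/jpeg", "image/png", "image/svg+xml",
    "application/vnd.ms-opentype", "application/font-woff",
}


def policy_violations(epub):
    """Return a list of policy violations for an EPUB (path or file object)."""
    problems = []
    with zipfile.ZipFile(epub) as zf:
        # Encryption check: encryption.xml is tolerated only if every
        # EncryptionMethod is font obfuscation.
        if "META-INF/encryption.xml" in zf.namelist():
            enc = ET.fromstring(zf.read("META-INF/encryption.xml"))
            algorithms = {
                method.get("Algorithm")
                for method in enc.iter("{%s}EncryptionMethod" % XMLENC)
            }
            if algorithms != {IDPF_FONT_OBFUSCATION}:
                problems.append("encryption other than font obfuscation")

        # Locate the package document through container.xml.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).get("full-path")
        package = ET.fromstring(zf.read(opf_path))

        # Media type check on every manifest item; fallbacks are ignored here.
        for item in package.iterfind("opf:manifest/opf:item", NS):
            media_type = item.get("media-type")
            if media_type == DTBOOK:
                problems.append("DTBook content: %s" % item.get("href"))
            elif media_type not in ALLOWED:
                problems.append("foreign resource: %s" % item.get("href"))
    return problems
```

Under these assumptions, a file such as epub20_minimal.epub would be expected to yield an empty list, while a file with a JP2 image in its manifest would yield one "foreign resource" entry whether or not a fallback is defined.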
Some of the files were newly created (with a little help from Sigil), whereas others were taken or adapted from other openly-licensed data sets.
- content - uncompressed contents of each test file (each subdirectory represents one epub)
- build - actual epub builds
- epubcheckout - epubcheck output
- pubresources - various resources (files) that were used for creating the epubs.
The script build.sh iterates over all subdirectories in the content folder and compresses the contents of each to a functional epub file in the build directory.
For an explanation of how the build process works, see here.
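As a rough illustration of what such a build step involves, here is a Python sketch. It is not a translation of build.sh: the function names `build_epub` and `build_all` are hypothetical, and it assumes each subdirectory of content contains a `mimetype` file and that the output file is named after the subdirectory. The one detail that matters is that the EPUB container format expects `mimetype` to be the first entry in the ZIP archive and to be stored uncompressed.

```python
import os
import zipfile


def build_epub(content_dir, epub_path):
    """Zip one unpacked EPUB directory into epub_path."""
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype file goes first and is stored without compression.
        zf.write(
            os.path.join(content_dir, "mimetype"),
            "mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # All other files follow, compressed, with paths relative to the root.
        for root, dirs, files in os.walk(content_dir):
            dirs.sort()
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, content_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)


def build_all(content_root="content", build_root="build"):
    """Build one .epub in build_root for every subdirectory of content_root."""
    os.makedirs(build_root, exist_ok=True)
    for name in sorted(os.listdir(content_root)):
        src = os.path.join(content_root, name)
        if os.path.isdir(src):
            build_epub(src, os.path.join(build_root, name + ".epub"))
```

Calling `build_all()` from the repository root would, under these assumptions, write one EPUB per subdirectory of content into build.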
The script analyse.sh validates all epubs in the build directory with Epubcheck, using both the stable 3.0 version and the alpha 4.0.0 version. Both versions must be installed separately; after installing them, update the file paths to epubcheck3Jar and epubcheck4Jar at the top of the script.
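For readers who want to run Epubcheck on a single file outside analyse.sh, the following Python sketch shows the basic invocation. It assumes Java is on the PATH; the helper names `epubcheck_command` and `run_epubcheck` and any paths passed to them are hypothetical, and the way analyse.sh itself names and stores its output may differ.

```python
import subprocess


def epubcheck_command(jar_path, epub_path):
    """Return the command line that runs an Epubcheck jar on one EPUB."""
    return ["java", "-jar", jar_path, epub_path]


def run_epubcheck(jar_path, epub_path, report_path):
    """Run Epubcheck, save its combined output, and return the exit code."""
    result = subprocess.run(
        epubcheck_command(jar_path, epub_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # Epubcheck messages go to the same report
        text=True,
    )
    with open(report_path, "w", encoding="utf-8") as report:
        report.write(result.stdout)
    return result.returncode
```

Running `run_epubcheck` once per jar (the 3.0 and the 4.0.0 one) on the same file gives two reports that can be compared side by side.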
Description of test files
|File name|Epub version|Description|Epubcheck output|
|---|---|---|---|
|epub20_minimal.epub|2|Basic file with one text resource and one image|3,4|
|epub20_minimal_encryption.epub|2|Includes encryption.xml resource in
|epub30_font_obfuscation.epub|3|Includes fonts that are obfuscated (which results in hasEncryption in epubcheck). Taken from EPUB 3 Sample Documents (wasteland with OTF fonts, obfuscated).|3,4|
|epub20_foreign_resource_no_fallback.epub|2|Includes JP2 image, which is a format that is not on the list of Core Media Types; no fallback defined|3,4|
|epub20_foreign_resource_with_fallback.epub|2|Includes JP2 image, which is a format that is not on the list of Core Media Types; fallback defined in manifest, identifier in content document|3,4|
|epub20_foreign_resource_with_fallback_noID.epub|2|Includes JP2 image, which is a format that is not on the list of Core Media Types; fallback defined in manifest, no identifier in content document|3,4|
|epub20_dtbook.epub|2|Includes Digital Talking Book content. Taken from threepress, published under BSD 3 license.|3,4|
How to add a new test file
- Add uncompressed directory structure to content folder
- Run build.sh to update the builds
- Add descriptive entry to table above
All files here are released under the Creative Commons 3.0 BY-SA license, unless stated otherwise.