The Archivist's PDF Cabinet of Horrors
About these files
Test PDF files, created for detecting PDF features that are undesired in archival settings. Most of these files were originally created in Microsoft Word 2003, and then converted to PDF with Adobe Acrobat Professional 9.5.2. The source Word files are included here as well, but in many cases the PDFs required further processing in Acrobat (e.g. embedding videos, attaching files, encryption) so they're probably not that useful.
One exception is digitally_signed_3D_Portfolio.pdf, which was kindly provided by Adobe.
Here's a more detailed description of the files (arranged by feature class(es)):
- encryption_openpassword.pdf - requires password to open the file
- encryption_nocopy.pdf - requires password to copy document contents
- encryption_noprinting.pdf - requires password for printing
- encryption_notextaccess.pdf - requires password to enable text access for screen reader devices for the visually impaired
- embedded_video_avi.pdf - contains embedded AVI movie
- embedded_video_quicktime.pdf - contains embedded Quicktime movie
- text_only_fontsNotEmbedded.pdf - used fonts are not embedded
- text_only_fontsEmbeddedAll.pdf - used fonts are embedded
- text_only_fontsEmbeddedSubset.pdf - used fonts are embedded as subset
- text_only_pdfa1b.pdf - PDF/A-1b (with embedded fonts)
- test_fontArialNotEmbedded.pdf - fonts Arial and Times New Roman are not embedded
- calistoMTNoFontsEmbedded.pdf - Font Calisto MT is not embedded
- veraPDFHiRes.pdf - Intact and valid PDF/A-1b file with bitmap image
- veraPDFHiResChangedHeight.pdf - As above, but wrong value of Height entry in Image XObject
- veraPDFHiResWrongObjectID.pdf - As veraPDFHiRes.pdf, but with reference to wrong (non-existing) XObject
- balloon_a1b_jp2k.pdf - file claims PDF/A-1b conformance, but contains JPEG 2000 image, which is not allowed in PDF/A-1 (file also has some other violations of PDF/A-1).
- fileAttachment.pdf - contains a document-level file attachment (an oldskool Quattro Pro spreadsheet, no less!) that is defined using an EmbeddedFiles entry in the document’s name dictionary
- fileAttachment_fileAttachmentAnnotation.pdf - contains a page-level file attachment that is defined using a File Attachment Annotation
- externalLink.pdf - contains link to another document
- webCapture.pdf - uses Web Capture feature for importing text from a website
- corruptionOneByteMissing.pdf - one byte missing from comment line following file header
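Several of the features listed above leave recognisable name tokens in the raw bytes of a PDF, so a first rough check is possible without a PDF library. The snippet below is only a minimal sketch: the `FEATURE_MARKERS` table and the `scan_features` function are hypothetical names that are not part of this corpus, and a plain byte search will miss anything stored inside compressed object streams or written with escaped name characters. Treat a hit as a hint, not as proof.

```python
# Hypothetical marker table (an assumption for illustration, not part of the corpus):
# PDF name tokens that hint at some of the features described in the list above.
FEATURE_MARKERS = {
    "encryption": b"/Encrypt",
    "document-level attachment": b"/EmbeddedFiles",
    "file attachment annotation": b"/FileAttachment",
    "JPEG 2000 image": b"/JPXDecode",
    "embedded font program": b"/FontFile",
}


def scan_features(data: bytes) -> list[str]:
    """Return the labels of all markers that occur somewhere in the raw bytes.

    This is a crude heuristic: it does not parse the PDF, so tokens hidden in
    compressed object streams are not found, and a token inside a plain text
    string would be reported as a false hit.
    """
    return [label for label, marker in FEATURE_MARKERS.items() if marker in data]


def scan_file(path: str) -> list[str]:
    """Read a file from disk and run the marker scan on its contents."""
    with open(path, "rb") as handle:
        return scan_features(handle.read())
```

Calling `scan_file("encryption_openpassword.pdf")` would be one way to try it on a file from this folder. A proper validator or PDF parser is still needed for anything beyond a quick triage.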
Digitally Signed 3D Portfolio
- digitally_signed_3D_Portfolio.pdf - a PDF 1.7 portfolio with multiple sheets, forms and 3D images; one of the sheets is digitally signed
- pdf-17-header18.pdf - PDF 1.7, but header string is %PDF-1.8 (which causes a false negative with some identification tools).
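To illustrate why a header such as `%PDF-1.8` can trip up identification, here is a minimal sketch of a header check. The function names `header_version` and `is_known_version` are hypothetical, and so are two assumptions in the code: that the header may appear anywhere in the first 1024 bytes, and that only versions 1.0 to 1.7 and 2.0 count as known. A tool that matches against a fixed list like this would not recognise pdf-17-header18.pdf as PDF.

```python
import re

# Matches a header such as %PDF-1.7 and captures the major and minor digits.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def header_version(data: bytes):
    """Return (major, minor) from the header string, or None if none is found.

    Assumption: the header is searched for in the first 1024 bytes only.
    """
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_known_version(version) -> bool:
    """Check a (major, minor) tuple against a fixed list of version numbers.

    Assumption: 1.0 through 1.7 and 2.0 are the versions treated as known.
    """
    known = {(1, minor) for minor in range(8)} | {(2, 0)}
    return version in known
```

A more tolerant tool could accept any header that `header_version` can parse and report the unexpected version number as a warning instead of rejecting the file.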
All files in this folder: Creative Commons CC0: Public Domain Dedication. See http://creativecommons.org/publicdomain/zero/1.0/