Boris Doubrov edited this page Sep 16, 2023 · 8 revisions

PDF/A-4 validation rules

Rule 6.1.2-1

Requirement

The file header shall begin at byte zero and shall consist of "%PDF-2.n" followed by a single EOL marker, where 'n' is a single digit number between 0 (30h) and 9 (39h)

Error details

File header starts at non-zero offset or does not match the pattern %PDF-2.n, where 'n' is a single digit number between 0 and 9

  • Object type: CosDocument
  • Test condition: headerOffset == 0 && /^%PDF-2\.[0-9]$/.test(header)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.2-2

Requirement

The aforementioned EOL marker shall be immediately followed by a % (25h) character followed by at least four bytes, each of whose encoded byte values shall have a decimal value greater than 127

Error details

Binary comment in the file header is missing or does not start with 4 bytes with byte values above 127.

The presence of encoded character byte values greater than decimal 127 near the beginning of a file is used by various software tools and protocols to classify the file as containing 8-bit binary data that should be preserved during processing.

  • Object type: CosDocument
  • Test condition: headerByte1 > 127 && headerByte2 > 127 && headerByte3 > 127 && headerByte4 > 127
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
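
Rules 6.1.2-1 and 6.1.2-2 can be checked together on the raw bytes of a file. The Python sketch below is only an illustration and is not the veraPDF implementation; the function name `check_header` is hypothetical, and it assumes that an EOL marker is CR LF, a lone CR, or a lone LF.

```python
import re

# "%PDF-2.n" followed by one EOL marker (CR LF, lone CR, or lone LF).
_HEADER = re.compile(rb"%PDF-2\.[0-9](?:\r\n|\r|\n)")


def check_header(data: bytes) -> tuple[bool, bool]:
    """Return (header_ok, binary_comment_ok) for the raw bytes of a file."""
    match = _HEADER.match(data)  # match() anchors the pattern at byte zero
    if match is None:
        return False, False
    rest = data[match.end():]
    # '%' (25h) followed by at least four bytes, each greater than 127.
    comment_ok = (
        len(rest) >= 5
        and rest[0] == 0x25
        and all(b > 127 for b in rest[1:5])
    )
    return True, comment_ok
```

A second EOL marker after the header line makes the binary comment check fail, because the byte that follows the first EOL marker is then not `%`.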

Rule 6.1.3-1

Requirement

File identifiers shall be defined by the ID entry in a PDF file’s trailer dictionary

Error details

Missing or empty ID in the document trailer

Processing systems and documents may contain references to PDF files. Simply storing a file name, however, even in a platform-independent format, does not guarantee that the file can be found. Even if the file still exists and its name has not been changed, different server software applications may identify it in different ways. For example, servers running on DOS platforms must convert all file names to 8 characters and a 3-character extension; different servers may use different strategies for converting longer file names to this format.

External file references can be made more reliable by including a file identifier in the file itself and using it in addition to the normal platform-based file designation. File identifiers are defined by the optional ID entry in a PDF file’s trailer dictionary. The value of this entry is an array of two strings. The first string is a permanent identifier based on the contents of the file at the time it was originally created, and does not change when the file is incrementally updated. The second string is a changing identifier based on the file's contents at the time it was last updated. When a file is first written, both identifiers are set to the same value. If both identifiers match when a file reference is resolved, it is very likely that the correct file has been found; if only the first identifier matches, then a different version of the correct file has been found.

  • Object type: CosDocument
  • Test condition: lastID != null && lastID.length() > 0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 14.4
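
The way the two identifiers are meant to be compared when a file reference is resolved can be sketched as follows. This is a minimal Python illustration; the function name, the returned labels, and the handling of a missing ID or a non-matching first identifier are assumptions that go beyond the text above.

```python
from typing import Optional, Sequence


def classify_reference(stored_id: Sequence[bytes],
                       found_id: Optional[Sequence[bytes]]) -> str:
    """Compare the ID array stored with a reference against a candidate file's ID."""
    if not found_id or len(found_id) != 2 or len(stored_id) != 2:
        return "unknown"            # assumption: no usable ID, nothing can be concluded
    if stored_id[0] != found_id[0]:
        return "no match"           # assumption: permanent identifiers differ
    if stored_id[1] == found_id[1]:
        return "same file"          # both identifiers match
    return "different version"      # only the permanent identifier matches
```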

Rule 6.1.3-2

Requirement

The Encrypt key shall not be present in the trailer dictionary.

Error details

The Encrypt key is present in the trailer dictionary

The explicit prohibition of the Encrypt keyword has the implicit effect of disallowing encryption and password-protected access permissions.

  • Object type: CosTrailer
  • Test condition: isEncrypted != true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.3-3

Requirement

No data shall follow the last end-of-file marker as described in ISO 32000-2:2020, 7.5.5.

Error details

Data is present after the last end-of-file marker

The trailer of a PDF file enables an application reading the file to quickly find the cross-reference table and certain special objects. Applications should read a PDF file from its end. The last line of the file contains only the end-of-file marker, %%EOF. Some PDF viewers require only that the %%EOF marker appear somewhere within the last 1024 bytes of the file. However, any data after the %%EOF marker introduces the risk that the PDF document is not processed correctly.

  • Object type: CosDocument
  • Test condition: postEOFDataSize == 0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 7.5.5
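
A value comparable to postEOFDataSize could be computed along the following lines. This Python sketch is not the veraPDF implementation; the function name is hypothetical, and it assumes that a single EOL marker directly after the last %%EOF is not counted as data.

```python
def post_eof_data_size(data: bytes) -> int:
    """Count the bytes that follow the last %%EOF marker and its optional EOL."""
    pos = data.rfind(b"%%EOF")
    if pos < 0:
        raise ValueError("no %%EOF marker found")
    tail = data[pos + len(b"%%EOF"):]
    # Assumption: one EOL marker after %%EOF is tolerated and not counted.
    for eol in (b"\r\n", b"\r", b"\n"):
        if tail.startswith(eol):
            tail = tail[len(eol):]
            break
    return len(tail)
```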

Rule 6.1.3-4

Requirement

The Info key shall not be present in the trailer dictionary of PDF/A-4 conforming files unless there exists a PieceInfo entry in the document catalog dictionary.

Error details

The Info key is present in the trailer dictionary, but the PieceInfo entry is not present in the document catalog dictionary

  • Object type: CosDocument
  • Test condition: containsPieceInfo == true || containsInfo == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.3-5

Requirement

If a document information dictionary is present, it shall only contain a ModDate entry.

Error details

Document information dictionary is present and does not contain only a ModDate entry

  • Object type: CosInfo
  • Test condition: size == 1 && ModDate != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.4-1

Requirement

The xref keyword and the cross-reference subsection header shall be separated by a single EOL marker

Error details

Extra spacing or missing EOL characters after the 'xref' keyword in the cross-reference table.

The cross-reference table contains information that permits random access to indirect objects within the file, so that the entire file need not be read to locate any particular object. The table contains a one-line entry for each indirect object, specifying the location of that object within the body of the file.

The cross-reference table is the only part of a PDF file with a fixed format; this permits entries in the table to be accessed randomly. Any variations in this format, including unnecessary EOL markers, may result in incorrect parsing of the cross-reference table and, thus, errors in reading the PDF document.

  • Object type: CosXRef
  • Test condition: xrefEOLMarkersComplyPDFA
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
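
As a rough illustration, the start of a cross-reference table can be matched against a pattern that allows exactly one EOL marker between the xref keyword and the subsection header. The Python sketch below uses a hypothetical function name, assumes the input begins at the xref keyword, and is not how veraPDF computes xrefEOLMarkersComplyPDFA.

```python
import re

# "xref", exactly one EOL marker, then the subsection header "first count".
_XREF_START = re.compile(rb"xref(?:\r\n|\r|\n)[0-9]+ [0-9]+")


def xref_header_ok(table: bytes) -> bool:
    """Check the bytes that start at the xref keyword of a cross-reference table."""
    return _XREF_START.match(table) is not None
```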

Rule 6.1.5-1

Requirement

Hexadecimal strings shall contain an even number of non-white-space characters

Error details

The number of hexadecimal digits in a hexadecimal string shall always be even

Strings in PDF documents may be written as a literal byte sequence or as a hexadecimal string; the latter is useful for including arbitrary binary data in a PDF file. A hexadecimal string is written as a sequence of hexadecimal digits (0–9 and either A–F or a–f) enclosed within angle brackets (< and >):

<4E6F762073686D6F7A206B6120706F702E>

Each pair of hexadecimal digits defines one byte of the string. White-space characters (such as space, tab, carriage return, line feed, and form feed) are ignored.

PDF Validation Technical Working Group notes

White-space characters are defined as NULL (00h), TAB (09h), LINE FEED (0Ah), FORM FEED (0Ch), CARRIAGE RETURN (0Dh), SPACE (20h). They may appear within hexadecimal strings for formatting purposes:

<4E6F 7620 7368 6D6F 7A20 6B61 2070 6F70>

  • Object type: CosString
  • Test condition: (isHex != true) || hexCount % 2 == 0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.5-2

Requirement

A hexadecimal string is written as a sequence of hexadecimal digits (0–9 and either A–F or a–f)

Error details

Hexadecimal string contains non-white-space characters outside the range 0 to 9, A to F or a to f

  • Object type: CosString
  • Test condition: (isHex != true) || containsOnlyHex == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
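
Rules 6.1.5-1 and 6.1.5-2 both work on the characters that remain once white space is ignored. A minimal Python sketch, with a hypothetical function name and the white-space set taken from the notes above:

```python
# White-space characters: NULL, TAB, LINE FEED, FORM FEED, CARRIAGE RETURN, SPACE.
WHITE_SPACE = b"\x00\t\n\x0c\r "
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def check_hex_string(token: bytes) -> tuple[bool, bool]:
    """Return (even_digit_count, only_hex_digits) for a token such as b'<4E6F>'."""
    if not (token.startswith(b"<") and token.endswith(b">")):
        raise ValueError("not a hexadecimal string token")
    body = [b for b in token[1:-1] if b not in WHITE_SPACE]
    return len(body) % 2 == 0, all(b in HEX_DIGITS for b in body)
```

Both example strings shown above pass both checks, while `<4E6>` fails the first check and `<4G>` fails the second.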

Rule 6.1.6.1-1

Requirement

The value of the Length key specified in the stream dictionary shall match the number of bytes in the file following the LINE FEED (0Ah) character after the stream keyword and preceding the EOL marker before the endstream keyword.

Error details

Actual length of the stream does not match the value of the Length key in the Stream dictionary

Every stream dictionary has a Length entry that indicates how many bytes of the PDF file are used for the stream's data. (If the stream has a filter, Length is the number of bytes of encoded data.) In addition, most filters are defined so that the data is self-limiting; that is, they use an encoding scheme in which an explicit end-of-data (EOD) marker delimits the extent of the data. Finally, streams are used to represent many objects from whose attributes a length can be inferred. All of these constraints must be consistent.

  • Object type: CosStream
  • Test condition: Length == realLength
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
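
The byte count that Length is compared against can be sketched as follows. This Python illustration is not the veraPDF implementation; the function name is hypothetical, and it assumes that the first occurrence of "stream" in the given bytes is the keyword, that the last "endstream" closes it, and that the keyword is followed by CR LF or LF.

```python
def real_stream_length(obj: bytes) -> int:
    """Count the data bytes between the stream and endstream keywords."""
    start = obj.index(b"stream") + len(b"stream")
    # The data starts after the LINE FEED that follows the stream keyword.
    if obj.startswith(b"\r\n", start):
        start += 2
    elif obj.startswith(b"\n", start):
        start += 1
    else:
        raise ValueError("stream keyword is not followed by CR LF or LF")
    end = obj.rindex(b"endstream")
    # One EOL marker before endstream is not part of the data.
    for eol in (b"\r\n", b"\n", b"\r"):
        if obj.endswith(eol, start, end):
            end -= len(eol)
            break
    return end - start
```

If the stream data itself ends with a byte that looks like an EOL marker and no separate marker precedes endstream, this sketch undercounts; a real parser has to take the Length value into account to resolve that ambiguity.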

Rule 6.1.6.1-2

Requirement

A stream dictionary shall not contain the F, FFilter, or FDecodeParms keys

Error details

A stream object dictionary contains one of the F, FFilter, or FDecodeParms keys

These keys are used to point to document content external to the file. The explicit prohibition of these keys has the implicit effect of disallowing external content that can create external dependencies and complicate preservation efforts.

  • Object type: CosStream
  • Test condition: F == null && FFilter == null && FDecodeParms == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.6.1-3

Requirement

The Subtype entry in a 3D stream dictionary (ISO 32000-2:2020, 13.6.3) shall have a value which is either U3D or PRC as described in Annex B.

Error details

The Subtype entry in the 3D stream dictionary (ISO 32000-2:2020, 13.6.3) has a value that is neither U3D nor PRC.

  • Object type: PD3DStream
  • Test condition: Subtype == 'U3D' || Subtype == 'PRC'
  • Specification: ISO 19005-4:2020
  • Levels: 4E
  • Additional references:
    • ISO 19005-4:2020, Annex B

Rule 6.1.6.2-1

Requirement

All standard stream filters listed in ISO 32000-2:2020, 7.4, Table 6 may be used, with the exception of LZWDecode. Filters that are not listed in ISO 32000-2:2020, 7.4, Table 6 shall not be used. In addition, the Crypt filter shall not be used unless the value of the Name key in the decode parameters dictionary is Identity.

Error details

LZW compression is used

The use of the LZW compression algorithm has been subject to intellectual property constraints. The Crypt filter is used to apply encryption and access control to the file.

  • Object type: CosFilter
  • Test condition: internalRepresentation == "ASCIIHexDecode" || internalRepresentation == "ASCII85Decode" || internalRepresentation == "FlateDecode" || internalRepresentation == "RunLengthDecode" || internalRepresentation == "CCITTFaxDecode" || internalRepresentation == "JBIG2Decode" || internalRepresentation == "DCTDecode" || internalRepresentation == "JPXDecode" || (internalRepresentation == "Crypt" && decodeParms == "Identity")
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 7.4
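
The test condition is an allow-list with one special case for Crypt. A Python sketch of the same logic, with hypothetical names:

```python
from typing import Optional

STANDARD_FILTERS = {
    "ASCIIHexDecode", "ASCII85Decode", "FlateDecode", "RunLengthDecode",
    "CCITTFaxDecode", "JBIG2Decode", "DCTDecode", "JPXDecode",
}


def filter_allowed(name: str, crypt_filter_name: Optional[str] = None) -> bool:
    """Mirror of the test condition: LZWDecode and unlisted filters are rejected."""
    if name == "Crypt":
        # Crypt passes only when its decode parameters name the Identity filter.
        return crypt_filter_name == "Identity"
    return name in STANDARD_FILTERS
```

Because the condition lists the permitted filters instead of the forbidden ones, LZWDecode and any non-standard filter fail in the same way.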

Rule 6.1.7-1

Requirement

Font names, names of colourants in Separation and DeviceN colour spaces, and structure type names, after expansion of character sequences escaped with a NUMBER SIGN (23h), if any, shall be valid UTF-8 character sequences.

Error details

The name object does not represent a correct UTF-8 character sequence

Name objects shall be treated as atomic within a PDF file. Ordinarily, the bytes making up the name are never treated as text to be presented to a human user or to an application external to a conforming reader. However, occasionally the need arises to treat a name object as text, such as one that represents a font name, a colorant name in a Separation or DeviceN colour space, or a structure type.

In such situations, the sequence of bytes (after expansion of NUMBER SIGN sequences, if any) should be interpreted according to UTF-8, a variable-length byte-encoded representation of Unicode in which the printable ASCII characters have the same representations as in ASCII. This enables a name object to represent text virtually in any natural language, subject to the implementation limit on the length of a name.

  • Object type: CosUnicodeName
  • Test condition: isValidUtf8 == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 7.3.5
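
The check amounts to expanding the NUMBER SIGN escapes and then decoding the result strictly as UTF-8. A minimal Python sketch, assuming the name is passed as raw bytes without its leading solidus; the function names are hypothetical.

```python
import re


def expand_name(raw: bytes) -> bytes:
    """Replace each #xx escape in a name by the single byte it encodes."""
    return re.sub(rb"#([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), raw)


def is_valid_utf8_name(raw: bytes) -> bool:
    """Return True if the expanded name is a valid UTF-8 byte sequence."""
    try:
        expand_name(raw).decode("utf-8")  # strict decoding raises on invalid input
    except UnicodeDecodeError:
        return False
    return True
```

For example, `#C3#A9` expands to the two-byte UTF-8 encoding of "é" and passes, whereas a lone `#FF` or a truncated `#C3` fails.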

Rule 6.1.8-1

Requirement

The object number and generation number shall be separated by a single white-space character. The generation number and obj keyword shall be separated by a single white-space character. The object number and endobj keyword shall each be preceded by an EOL marker. The obj and endobj keywords shall each be followed by an EOL marker.

Error details

Extra spacing or missing EOL characters around the object number, the generation number, or the keywords 'obj' and 'endobj' of an indirect object.

The definition of an indirect object in a PDF file consists of its object number and generation number, followed by the value of the object itself bracketed between the keywords "obj" and "endobj". The requirements of this rule guarantee that the definition of an indirect object can be parsed unambiguously.

  • Object type: CosIndirect
  • Test condition: spacingCompliesPDFA
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.1.9-1

Requirement

The value of the F key in the Inline Image dictionary shall not be LZW, LZWDecode, Crypt, a value not listed in ISO 32000-2:2020, 8.9.7, Table 92, or an array containing any such value.

Error details

Inline image uses LZW, Crypt or one of the unknown filters

Inline images are defined directly within the content stream in which they will be painted, rather than as separate objects. The use of LZW, Crypt and other non-standard compression methods is also not permitted for such images.

  • Object type: CosIIFilter
  • Test condition: internalRepresentation == "ASCIIHexDecode" || internalRepresentation == "ASCII85Decode" || internalRepresentation == "FlateDecode" || internalRepresentation == "RunLengthDecode" || internalRepresentation == "CCITTFaxDecode" || internalRepresentation == "DCTDecode" || internalRepresentation == "AHx" || internalRepresentation == "A85" || internalRepresentation == "Fl" || internalRepresentation == "RL" || internalRepresentation == "CCF" || internalRepresentation == "DCT"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 8.9.7

Rule 6.1.11-1

Requirement

No keys other than UR3 and DocMDP shall be present in a permissions dictionary (ISO 32000-2:2020, 12.8.6, Table 263).

Error details

The document permissions dictionary contains keys other than UR3 and DocMDP

  • Object type: PDPerms
  • Test condition: entries.split('&').filter(elem => elem != 'UR3' && elem != 'DocMDP').length == 0 || entries == ''
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 12.8.6
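
The test condition treats entries as a single string of key names joined by '&'. The extra clause for the empty string is needed because splitting an empty string yields one empty element, which is neither UR3 nor DocMDP. A Python sketch of the same logic, with a hypothetical function name:

```python
ALLOWED_KEYS = {"UR3", "DocMDP"}


def forbidden_perms_keys(entries: str) -> list[str]:
    """Return the keys in an '&'-joined key list that the rule does not allow."""
    if entries == "":
        return []  # "".split("&") would give [""], so handle the empty case first
    return [key for key in entries.split("&") if key not in ALLOWED_KEYS]
```

The rule passes when the returned list is empty.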

Rule 6.1.12-1

Requirement

If the Version key is present in the document catalog dictionary, the first character in its value shall be a 2 (32h) and the second character of its value shall be a PERIOD (2Eh) (decimal point). The third character shall be a decimal digit. The number of characters of the value of the Version key shall be exactly 3.

Error details

The value of the Version key does not match the pattern 2.n, where 'n' is a single digit number between 0 and 9.

  • Object type: PDDocument
  • Test condition: Version == null || /^2\.[0-9]$/.test(Version)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.2-1

Requirement

Content streams shall not contain any operators not defined in ISO 32000-2:2020 even if such operators are bracketed by the BX/EX compatibility operators.

Error details

A content stream contains an operator not defined in ISO 32000-2:2020

Ordinarily, when a viewer application encounters an operator in a content stream that it does not recognize, an error will occur. A pair of compatibility operators, BX and EX, modify this behavior. These operators must occur in pairs and may be nested. They bracket a compatibility section, a portion of a content stream within which unrecognized operators are to be ignored without error. This mechanism enables a PDF document to use operators defined in newer versions of PDF without sacrificing compatibility with older viewers.

However, as the use of undefined operators may still result in error for some PDF processors, their use is not permitted in PDF/A-compliant documents, even if they are bracketed by BX/EX compatibility operators.

Earlier versions of the PDF format defined a PostScript operator, "PS". As this operator is not defined in ISO 32000-2:2020, its use is implicitly prohibited by the PDF/A-4 specification.

  • Object type: Op_Undefined
  • Test condition: false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.2-2

Requirement

A content stream that references other objects, such as images and fonts that are necessary to fully render or process the stream, shall have an explicitly associated Resources dictionary as described in ISO 32000-2:2020, 7.8.3. Such resource dictionaries shall define all named resources referenced by this content stream.

Error details

A content stream refers to resources not defined in an explicitly associated Resources dictionary

  • Object type: PDContentStream
  • Test condition: inheritedResourceNames == ''
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 7.8.3

Rule 6.2.2-3

Requirement

A content stream's named resources shall be defined by a resource dictionary, which shall enumerate the named resources needed by the operators in the content stream and the names by which they can be referred to.

Error details

A content stream's named resources are not defined by a resource dictionary that enumerates the named resources needed by the operators in the content stream and the names by which they can be referred to.

  • Object type: PDContentStream
  • Test condition: undefinedResourceNames == ''
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 7.8.3

Rule 6.2.3-1

Requirement

The profile stream that is the value of the DestOutputProfile key shall either be an output profile (Device Class = "prtr") or a monitor profile (Device Class = "mntr"). The profiles shall have a colour space of either "GRAY", "RGB", or "CMYK".

Error details

The embedded PDF/A Output Intent colour profile has invalid header

  • Object type: ICCOutputProfile
  • Test condition: (deviceClass == "prtr" || deviceClass == "mntr") && (colorSpace == "RGB " || colorSpace == "CMYK" || colorSpace == "GRAY") && version < 5.0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 14.11.5
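
The three values used in the test condition come from the fixed 128-byte ICC profile header. The Python sketch below assumes the standard ICC header layout (profile version in bytes 8 and 9, device class in bytes 12 to 15, data colour space in bytes 16 to 19); the function names are hypothetical, and this is not the veraPDF implementation.

```python
def read_icc_header(profile: bytes) -> tuple[str, str, float]:
    """Return (device_class, colour_space, version) from an ICC profile header."""
    if len(profile) < 128:
        raise ValueError("an ICC profile header is 128 bytes long")
    major = profile[8]
    minor = profile[9] >> 4  # the high nibble of byte 9 is the minor version
    device_class = profile[12:16].decode("ascii", errors="replace")
    colour_space = profile[16:20].decode("ascii", errors="replace")
    return device_class, colour_space, major + minor / 10


def output_profile_ok(profile: bytes) -> bool:
    """Mirror of the test condition for an output intent profile."""
    device_class, colour_space, version = read_icc_header(profile)
    return (device_class in ("prtr", "mntr")
            and colour_space in ("RGB ", "CMYK", "GRAY")
            and version < 5.0)
```

Header signatures are always four bytes long, which is why the test condition compares against "RGB " with a trailing space.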

Rule 6.2.3-2

Requirement

If any OutputIntents array contains more than one entry, as might be the case where a file is compliant with this part of ISO 19005 and at the same time with PDF/X or PDF/E, then all entries that contain a DestOutputProfile key shall have as the value of that key the same indirect object, which shall be a valid ICC profile stream.

Error details

OutputIntents array contains output intent dictionaries with non-matching destination output profiles

A PDF document may conform to several PDF standards at the same time, such as PDF/X, PDF/E or PDF/UA. Each of these standards relies on the presence of the OutputIntent color profile and has to identify such a profile via a standard-specific subtype entry (S key in the OutputIntent dictionary). For example, the value of this key is "GTS_PDFA1" for the PDF/A standards, and is "GTS_PDFX" for PDF/X standards.

The requirement for all output intent dictionaries to share the same output ICC profile minimizes the risk that the wrong ICC output profile is used for rendering the PDF document.

  • Object type: OutputIntents
  • Test condition: sameOutputProfileIndirect == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.3-3

Requirement

The DestOutputProfileRef key, as defined in ISO 32000-2:2020, 14.11.5, Table 401, shall not be present in any output intent dictionary.

Error details

The output intent dictionary contains forbidden entry DestOutputProfileRef

  • Object type: PDOutputIntent
  • Test condition: containsDestOutputProfileRef == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 14.11.5, Table 401

Rule 6.2.4.2-1

Requirement

The profile that forms the stream of an ICCBased colour space shall conform to ISO 32000-2:2020, 8.6.5.5.

Error details

The embedded ICC profile is either invalid or does not satisfy PDF 2.0 requirements

ICC profiles can be used in PDF documents to identify the source color spaces. Similar to the requirements for the ICC output profile (see Rule 6.2.3-1), the embedded ICC input profile shall satisfy a number of additional requirements of the PDF 2.0 specification. These requirements cover the device class of the ICC profile, its color space and the version of the ICC standard the input profile is based on.

  • Object type: ICCInputProfile
  • Test condition: (deviceClass == "prtr" || deviceClass == "mntr" || deviceClass == "scnr" || deviceClass == "spac") && (colorSpace == "RGB " || colorSpace == "CMYK" || colorSpace == "GRAY" || colorSpace == "Lab ") && version < 5.0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 8.6.5.5

Rule 6.2.4.2-2

Requirement

Overprint mode (as set by the OPM value in an ExtGState dictionary) shall not be one (1) when an ICCBased CMYK colour space is used for stroke and overprinting for stroke is set to true, or when ICCBased CMYK colour space is used for fill and overprinting for fill is set to true, or both.

Error details

Overprint mode (OPM) is set to 1 when an ICCBased CMYK colour space is used with enabled overprinting

This prohibition avoids unpredictable overprinting behaviour when overprint mode is 1 if implicit colour conversion is applied as described in ISO 32000-2:2020, 8.6.7.

  • Object type: PDICCBasedCMYK
  • Test condition: overprintFlag == false || OPM == 0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 8.6.7

Rule 6.2.4.2-3

Requirement

An ICCBased colour space shall not be used where the profile is a CMYK destination profile and is identical to that in the current PDF/A OutputIntent or the current transparency blending colorspace.

Error details

An ICCBased CMYK color space is identical to the current PDF/A OutputIntent color profile or the current transparency blending color space.

  • Object type: PDICCBasedCMYK
  • Test condition: (ICCProfileIndirect == null || (ICCProfileIndirect != gOutputProfileIndirect && ICCProfileIndirect != currentTransparencyProfileIndirect)) && (ICCProfileMD5 == null || (ICCProfileMD5 != gOutputICCProfileMD5 && ICCProfileMD5 != currentTransparencyICCProfileMD5))
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.4.3-2

Requirement

DeviceRGB shall only be used if a device independent DefaultRGB colour space has been set when the DeviceRGB colour space is used or if the current transparency blending space, when the DeviceRGB colour space is used, is a device independent RGB-based colour space or the current PDF/A OutputIntent, when the DeviceRGB colour space is used, contains an 'RGB ' destination profile.

Error details

DeviceRGB colour space is used without RGB output intent profile and without current transparency blending space being Device-independent RGB.

  • Object type: PDDeviceRGB
  • Test condition: (gPageOutputCS == null ? gDocumentOutputCS == 'RGB ' : gPageOutputCS == 'RGB ') || gTransparencyCS == 'RGB ' || gTransparencyCS == 'CalRGB'
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
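
In the test condition, a page-level output colour space, when present, is evaluated instead of the document-level one. The Python sketch below mirrors only that condition; the function and parameter names are hypothetical, with `None` standing for an absent value.

```python
from typing import Optional


def device_rgb_allowed(page_output_cs: Optional[str],
                       document_output_cs: Optional[str],
                       transparency_cs: Optional[str]) -> bool:
    """Mirror of the test condition for the DeviceRGB colour space."""
    # The page-level value, when present, replaces the document-level one.
    effective_output_cs = (page_output_cs if page_output_cs is not None
                           else document_output_cs)
    return (effective_output_cs == "RGB "
            or transparency_cs in ("RGB ", "CalRGB"))
```

A page with a CMYK output intent therefore fails this check even if the document-level output intent is RGB, unless the transparency blending space is RGB-based.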

Rule 6.2.4.3-3

Requirement

DeviceCMYK shall only be used if a device independent DefaultCMYK colour space has been set when the DeviceCMYK colour space is used or if the current transparency blending space, when the DeviceCMYK colour space is used, is a device independent CMYK-based colour space or the current PDF/A OutputIntent, when the DeviceCMYK colour space is used, contains a 'CMYK' destination profile.

Error details

DeviceCMYK colour space is used without CMYK output intent profile and without current transparency blending space being Device-independent CMYK.

  • Object type: PDDeviceCMYK
  • Test condition: (gPageOutputCS == null ? gDocumentOutputCS == 'CMYK' : gPageOutputCS == 'CMYK') || gTransparencyCS == 'CMYK'
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.4.3-4

Requirement

DeviceGray shall only be used if a device independent DefaultGray colour space has been set when the DeviceGray colour space is used, or if a PDF/A OutputIntent is in effect.

Error details

DeviceGray colour space is used without output intent profile

  • Object type: PDDeviceGray
  • Test condition: gPageOutputCS != null || gDocumentOutputCS != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.4.4-1

Requirement

For any spot colour used in a DeviceN or NChannel colour space, an entry in the Colorants dictionary shall be present.

Error details

A colorant of the DeviceN or NChannel color space is not defined in the Colorants dictionary

  • Object type: PDDeviceN
  • Test condition: areColorantsPresent == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 8.6.6.5

Rule 6.2.4.4-2

Requirement

All Separation arrays within a single conforming PDF/A-4 file (including those in Colorants dictionaries) that have the same name shall have the same tintTransform and alternateSpace. In evaluating equivalence, the PDF objects shall be compared, rather than the computational result of the use of those PDF objects. Compression and whether or not an object is direct or indirect shall be ignored.

Error details

Several occurrences of a Separation colour space with the same name are not consistent

  • Object type: PDSeparation
  • Test condition: name == 'All' || name == 'None' || areTintAndAlternateConsistent == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.5-1

Requirement

A graphics state parameter dictionary (ISO 32000-2:2020, 8.4.5) shall not contain the TR key

Error details

A graphics state parameter dictionary contains the TR key

In PDF, a transfer function adjusts the values of color components to compensate for nonlinear response in an output device and in the human eye. Each component of a device color space (for example, the red component of the DeviceRGB space) is intended to represent the perceived lightness or intensity of that color component in proportion to the component’s numeric value. Many devices do not actually behave this way, however; the purpose of a transfer function is to compensate for the device's actual behavior.

As this may lead to a significantly different visual appearance of PDF documents on different devices, the use of transfer functions is not permitted by PDF/A.

  • Object type: PDExtGState
  • Test condition: TR == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 8.4.5

Rule 6.2.5-2

Requirement

A graphics state parameter dictionary shall not contain the TR2 key with a value other than Default

Error details

A graphics state parameter dictionary contains the TR2 key with a value other than Default

The TR2 key has the same meaning as the TR key (see Rule 6.2.5-1), except that its value may also be the name Default, denoting the transfer function used by default by the output device. This is the only value permitted in PDF/A-compliant documents.

  • Object type: PDExtGState
  • Test condition: TR2 == null || TR2 == "Default"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.5-3

Requirement

A graphics state parameter dictionary shall not contain the HTO key

Error details

A graphics state parameter dictionary contains the HTO key

  • Object type: PDExtGState
  • Test condition: containsHTO == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.5-4

Requirement

All halftones in a conforming PDF/A-4 file shall have the value 1 or 5 for the HalftoneType key.

Error details

A Halftone has type other than 1 or 5

  • Object type: PDHalftone
  • Test condition: HalftoneType != null && (HalftoneType == 1 || HalftoneType == 5)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.5-5

Requirement

Halftones in a conforming PDF/A-4 file shall not contain a HalftoneName key.

Error details

A Halftone dictionary contains the HalftoneName key

  • Object type: PDHalftone
  • Test condition: HalftoneName == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.5-6

Requirement

The TransferFunction key in a halftone dictionary shall be used only as required by ISO 32000-2:2020.

Error details

Custom TransferFunction in a Halftone dictionary is not permitted for primary (CMYK) colorants

  • Object type: PDHalftone
  • Test condition: colorantName == 'Default' || ((colorantName == null || colorantName == 'Cyan' || colorantName == 'Magenta' || colorantName == 'Yellow' || colorantName == 'Black') ? TransferFunction == null : TransferFunction != null)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 10.6.5.2, Table 128
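
The nested conditional in the test condition is easier to read when written out as separate cases. A Python sketch of the same logic, with hypothetical names; `None` stands for a halftone without a colorant name.

```python
from typing import Optional

PRIMARIES = {"Cyan", "Magenta", "Yellow", "Black"}


def transfer_function_ok(colorant_name: Optional[str],
                         has_transfer_function: bool) -> bool:
    """Mirror of the test condition for the TransferFunction key."""
    if colorant_name == "Default":
        return True                       # accepted with or without the key
    if colorant_name is None or colorant_name in PRIMARIES:
        return not has_transfer_function  # the key must be absent
    return has_transfer_function          # the key must be present
```

Under this condition, a halftone for any colorant other than Default or the four primaries passes only if it does contain a TransferFunction key.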

Rule 6.2.7.1-1

Requirement

An Image dictionary shall not contain the Alternates key

Error details

Alternates key is present in the Image dictionary

Alternate images provide a straightforward and backward-compatible way to include multiple versions of an image in a PDF file for different purposes. These variant representations of the image may differ, for example, in resolution or in color space. The primary goal is to reduce the need to maintain separate versions of a PDF document for low-resolution on-screen viewing and high-resolution printing. However, this mechanism is prohibited in PDF/A-compliant documents, as it introduces the risk that different images are chosen for rendering.

  • Object type: PDXImage
  • Test condition: containsAlternates == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.1-2

Requirement

An Image dictionary shall not contain the OPI key

Error details

OPI key is present in the Image dictionary

The Open Prepress Interface (OPI) is a mechanism, originally developed by Aldus Corporation, for creating low-resolution placeholders, or proxies, for high-resolution images. The proxy typically consists of a downsampled version of the full-resolution image, to be used for screen display and proofing. Before the document is printed, it passes through a filter known as an OPI server, which replaces the proxies with the original full-resolution images. As with the Alternates key (Rule 6.2.7.1-1), this mechanism is prohibited in PDF/A-compliant documents, as it introduces the risk of unpredictable PDF rendering.

  • Object type: PDXImage
  • Test condition: containsOPI == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.1-3

Requirement

If an Image dictionary contains the Interpolate key, its value shall be false. For an inline image, the I key, if present, shall have a value of false.

Error details

The value of the Interpolate key in the Image dictionary is true

When the resolution of a source image is significantly lower than that of the output device, each source sample covers many device pixels. This can cause images to appear "jaggy" or "blocky." These visual artifacts can be reduced by applying an image interpolation algorithm during rendering. Instead of painting all pixels covered by a source sample with the same color, image interpolation attempts to produce a smooth transition between adjacent sample values.

However, the interpolation algorithm is implementation-dependent and is not specified by PDF. Image interpolation may not always be performed for some classes of images or on some output devices. Therefore, this mechanism is not permitted in PDF/A-compliant documents.

  • Object type: PDXImage
  • Test condition: Interpolate == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.3-1

Requirement

The number of colour channels in the JPEG2000 data shall be 1, 3 or 4.

Error details

JPEG2000 image has number of colour channels different from 1, 3 or 4.

  • Object type: JPEG2000
  • Test condition: nrColorChannels == 1 || nrColorChannels == 3 || nrColorChannels == 4
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.3-2

Requirement

If the number of colour space specifications in the JPEG2000 data is greater than 1, there shall be exactly one colour space specification that has the value 0x01 in the APPROX field.

Error details

The JPEG2000 image contains more than one colour specification with the best colour fidelity (value 0x01 in the APPROX field).

  • Object type: JPEG2000
  • Test condition: hasColorSpace == true || nrColorSpaceSpecs == 1 || nrColorSpacesWithApproxField == 1
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.3-3

Requirement

The value of the METH entry in its 'colr' box shall be 0x01, 0x02 or 0x03. A conforming processor shall use only that colour space and shall ignore all other colour space specifications.

Error details

Invalid JPEG2000 image: the value of the METH entry in its 'colr' box is different from 0x01, 0x02 or 0x03.

  • Object type: JPEG2000
  • Test condition: hasColorSpace == true || colrMethod == 1 || colrMethod == 2 || colrMethod == 3
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.3-4

Requirement

JPEG2000 enumerated colour space 19 (CIEJab) shall not be used.

Error details

JPEG2000 image uses enumerated colour space 19 (CIEJab), which is not allowed in PDF/A

  • Object type: JPEG2000
  • Test condition: hasColorSpace == true || colrEnumCS == null || colrEnumCS != 19
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.7.3-5

Requirement

The bit-depth of the JPEG2000 data shall have a value in the range 1 to 38. All colour channels in the JPEG2000 data shall have the same bit-depth.

Error details

JPEG2000 image has different bit-depth parameters in 'bpcc' box, or bit depth is out of range

  • Object type: JPEG2000
  • Test condition: bpccBoxPresent == false && (bitDepth >= 1 && bitDepth <= 38)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.8.1-1

Requirement

A form XObject dictionary shall not contain an OPI key.

Error details

The form XObject dictionary contains an OPI key

  • Object type: PDXForm
  • Test condition: containsOPI == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.8.2-1

Requirement

A conforming file shall not contain any reference XObjects

Reference XObjects enable one PDF document to import content from another. The document in which the reference occurs is called the containing document; the one whose content is being imported is the target document. The target document may reside in a file external to the containing document or may be included within it as an embedded file stream.

As this makes the initial PDF document dependent on the presence of external resources, this mechanism is not permitted in PDF/A-compliant documents.

Error details

The document contains a reference XObject (Ref key in the form XObject dictionary)

  • Object type: PDXForm
  • Test condition: containsRef == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.9-1

Requirement

Only blend modes that are specified in ISO 32000-2:2020 shall be used for the value of the BM key in a graphic state dictionary or an annotation dictionary.

Error details

The document uses a blend mode not defined in ISO 32000-2:2020

  • Object type: CosBM
  • Test condition: internalRepresentation == "Normal" || internalRepresentation == "Compatible" || internalRepresentation == "Multiply" || internalRepresentation == "Screen" || internalRepresentation == "Overlay" || internalRepresentation == "Darken" || internalRepresentation == "Lighten" || internalRepresentation == "ColorDodge" || internalRepresentation == "ColorBurn" || internalRepresentation == "HardLight" || internalRepresentation == "SoftLight" || internalRepresentation == "Difference" || internalRepresentation == "Exclusion" || internalRepresentation == "Hue" || internalRepresentation == "Saturation" || internalRepresentation == "Color" || internalRepresentation == "Luminosity"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 11.3.5, Tables 134-135

Rule 6.2.9-2

Requirement

If the document does not contain a PDF/A output intent, then all pages that contain transparency shall either have a page-level PDF/A output intent or the page dictionary shall include the Group key, and the attribute dictionary that forms the value of that Group key shall include a CS entry whose value shall be used as the default blending colour space.

Error details

The page contains transparent objects with no blending colour space defined.

PDF transparency (as described in ISO 32000-2:2020, Clause 11) may be used in a PDF/A-4 file. This requirement ensures that there is always an explicitly defined transparency blending space specified for any content which has associated transparency.

  • Object type: PDPage
  • Test condition: gDocumentOutputCS != null || outputColorSpace != null || containsGroupCS == true || containsTransparency == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 11.3.4
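
The test condition can be read as an implication: only pages that contain transparency need a blending colour space, and any one of three sources satisfies the rule. The following is a minimal JavaScript sketch of that reading. The function name `blendingSpaceIsDefined` is hypothetical, the property names are borrowed from the test condition, and the mapping of `gDocumentOutputCS` and `outputColorSpace` to the document-level and page-level output intents is an assumption based on their names.

```javascript
// Hypothetical helper; property names are borrowed from the test condition.
// documentOutputCS stands in for gDocumentOutputCS (assumed: document-level output intent).
function blendingSpaceIsDefined(page, documentOutputCS) {
  // The rule only constrains pages that actually contain transparency.
  if (!page.containsTransparency) return true;
  return documentOutputCS != null ||   // assumed: document-level PDF/A output intent
    page.outputColorSpace != null ||   // assumed: page-level PDF/A output intent
    page.containsGroupCS === true;     // Group dictionary with a CS entry
}

const page = { containsTransparency: true, outputColorSpace: null, containsGroupCS: false };
console.log(blendingSpaceIsDefined(page, null));       // false
console.log(blendingSpaceIsDefined(page, "ICCBased")); // true
```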

Rule 6.2.10.2-1

Requirement

All fonts and font programs used in a conforming file, regardless of rendering mode usage, shall conform to the provisions in ISO 32000-2:2020, 9.6 and 9.7, as well as to the font specifications referenced by these provisions.

Type - name - (Required) The type of PDF object that this dictionary describes; must be Font for a font dictionary

Error details

A Font dictionary has missing or invalid Type entry

  • Object type: PDFont
  • Test condition: Type == "Font"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.2.1, Table 109
    • ISO 32000-2:2020, 9.6.4, Table 110
    • ISO 32000-2:2020, 9.7.4.1, Table 115
    • ISO 32000-2:2020, 9.7.6.1, Table 119

Rule 6.2.10.2-2

Requirement

All fonts and font programs used in a conforming file, regardless of rendering mode usage, shall conform to the provisions in ISO 32000-2:2020, 9.6 and 9.7, as well as to the font specifications referenced by these provisions.

Subtype - name - (Required) The type of font; must be "Type1" for Type 1 fonts, "MMType1" for multiple master fonts, "TrueType" for TrueType fonts, "Type3" for Type 3 fonts, "Type0" for Type 0 fonts and "CIDFontType0" or "CIDFontType2" for CID fonts

Error details

A Font dictionary has missing or invalid Subtype entry

The correct value of the Subtype entry in the font dictionary is critical for the text rendering in PDF documents.

  • Object type: PDFont
  • Test condition: Subtype == "Type1" || Subtype == "MMType1" || Subtype == "TrueType" || Subtype == "Type3" || Subtype == "Type0" || Subtype == "CIDFontType0" || Subtype == "CIDFontType2"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.2.1, Table 109
    • ISO 32000-2:2020, 9.6.2.3
    • ISO 32000-2:2020, 9.6.3
    • ISO 32000-2:2020, 9.6.4, Table 110
    • ISO 32000-2:2020, 9.7.4.1, Table 115
    • ISO 32000-2:2020, 9.7.6.1, Table 119

Rule 6.2.10.2-3

Requirement

All fonts and font programs used in a conforming file, regardless of rendering mode usage, shall conform to the provisions in ISO 32000-2:2020, 9.6 and 9.7, as well as to the font specifications referenced by these provisions.

BaseFont - name - (Required) The PostScript name of the font

Error details

A BaseFont entry is missing or has invalid type

For Type 1 fonts, this is usually the value of the FontName entry in the font program. The PostScript name of the font can be used to find the font’s definition in the viewer application or its environment. It is also the name that will be used when printing to a PostScript output device.

For TrueType fonts the value of BaseFont is determined in one of two ways. It is defined as the PostScript name that is an optional entry in the "name" table of the TrueType font itself. In the absence of such an entry in the "name" table, a PostScript name is derived from the name by which the font is known in the host operating system.

  • Object type: PDFont
  • Test condition: Subtype == "Type3" || fontName != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.2.1, Table 109
    • ISO 32000-2:2020, 9.7.4.1, Table 115
    • ISO 32000-2:2020, 9.7.6.1, Table 119

Rule 6.2.10.2-4

Requirement

All fonts and font programs used in a conforming file, regardless of rendering mode usage, shall conform to the provisions in ISO 32000-2:2020, 9.6 and 9.7, as well as to the font specifications referenced by these provisions.

FirstChar - integer - (Required except for the standard 14 fonts) The first character code defined in the font's Widths array

Error details

A non-standard simple font dictionary has missing or invalid FirstChar entry

The PostScript names of 14 Type 1 fonts, known as the standard fonts, are as follows:

Times−Roman      Helvetica             Courier             Symbol
Times−Bold       Helvetica−Bold        Courier−Bold        ZapfDingbats
Times−Italic     Helvetica−Oblique     Courier−Oblique
Times−BoldItalic Helvetica−BoldOblique Courier−BoldOblique

These fonts, or their font metrics and suitable substitution fonts, are guaranteed to be available to any viewer application.

  • Object type: PDSimpleFont
  • Test condition: isStandard == true || FirstChar != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.2.1, Table 109
    • ISO 32000-2:2020, 9.6.4, Table 110

Rule 6.2.10.2-5

Requirement

All fonts and font programs used in a conforming file, regardless of rendering mode usage, shall conform to the provisions in ISO 32000-2:2020, 9.6 and 9.7, as well as to the font specifications referenced by these provisions.

LastChar - integer - (Required except for the standard 14 fonts) The last character code defined in the font's Widths array

Error details

A non-standard simple font dictionary has missing or invalid LastChar entry

See also rule 6.2.10.2-4.

  • Object type: PDSimpleFont
  • Test condition: isStandard == true || LastChar != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.2.1, Table 109
    • ISO 32000-2:2020, 9.6.4, Table 110

Rule 6.2.10.2-6

Requirement

All fonts and font programs used in a conforming file, regardless of rendering mode usage, shall conform to the provisions in ISO 32000-2:2020, 9.6 and 9.7, as well as to the font specifications referenced by these provisions.

Widths - array - (Required except for the standard 14 fonts; indirect reference preferred) An array of (LastChar − FirstChar + 1) widths

Error details

Widths array is missing or has invalid size

See also rule 6.2.10.2-4.

  • Object type: PDSimpleFont
  • Test condition: isStandard == true || (Widths_size != null && Widths_size == LastChar - FirstChar + 1)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.2.1, Table 109
    • ISO 32000-2:2020, 9.6.4, Table 110
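
As a worked example of the size check: a font with FirstChar = 32 and LastChar = 126 must carry 126 − 32 + 1 = 95 widths. The sketch below restates the test condition in JavaScript. The function names and the `font` object shape are hypothetical and only mirror the names used in the test condition.

```javascript
// Expected number of entries in the Widths array of a simple font.
function expectedWidthsSize(firstChar, lastChar) {
  return lastChar - firstChar + 1;
}

// Hypothetical helper mirroring the test condition of this rule.
function widthsAreValid(font) {
  if (font.isStandard) return true; // the standard 14 fonts are exempt
  return font.widthsSize != null &&
    font.widthsSize === expectedWidthsSize(font.firstChar, font.lastChar);
}

console.log(expectedWidthsSize(32, 126)); // 95
console.log(widthsAreValid({ isStandard: false, firstChar: 32, lastChar: 126, widthsSize: 94 })); // false
```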

Rule 6.2.10.2-7

Requirement

All fonts used in a conforming file shall conform to the font specifications defined in PDF Reference 5.5. The subtype is the value of the Subtype key, if present, in the font file stream dictionary. The only valid values of this key in PDF 2.0 are Type1C (a Type 1-equivalent font program represented in the Compact Font Format (CFF)), CIDFontType0C (a Type 0 CIDFont program represented in the Compact Font Format (CFF)) and OpenType (an OpenType® font program, as described in ISO/IEC 14496-22).

Error details

Unsupported font file format of the embedded font

  • Object type: PDFont
  • Test condition: fontFileSubtype == null || fontFileSubtype == "Type1C" || fontFileSubtype == "CIDFontType0C" || fontFileSubtype == "OpenType"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.9.1, Table 124
    • Adobe Technical Note #5176, The Compact Font Format Specification
    • ISO/IEC 14496-22:2019

Rule 6.2.10.3.1-1

Requirement

For any given composite (Type 0) font within a conforming file, the CIDSystemInfo entry in its CIDFont dictionary and its Encoding dictionary shall have the following relationship:

  • If the Encoding key in the Type 0 font dictionary has a value of Identity-H or Identity-V, then any values for the Registry, Ordering, and Supplement keys may be used in the CIDSystemInfo dictionary of the CIDFont.
  • Otherwise the corresponding values of the Registry and Ordering keys in both CIDSystemInfo dictionaries shall be identical, and the value of the Supplement key in the CIDSystemInfo dictionary of the CIDFont shall be less than or equal to the value of the Supplement key in the CIDSystemInfo dictionary of the CMap.

Error details

CIDSystemInfo entries of the CIDFont and CMap dictionaries of a Type 0 font are not compatible

CIDFont and CMap dictionaries contain a CIDSystemInfo entry specifying the character collection assumed by the CIDFont or by each CIDFont associated with the CMap - that is, the interpretation of the CID numbers used by the CIDFont. A character collection is uniquely identified by the Registry, Ordering, and Supplement entries in the CIDSystemInfo dictionary. Character collections whose Registry and Ordering values are the same are compatible.

The PDF/A-4 standard requires that the Registry and Ordering strings of the two CIDSystemInfo dictionaries for that font be identical and that the value of the Supplement key in the CIDSystemInfo dictionary of the CIDFont be less than or equal to the Supplement key in the CIDSystemInfo dictionary of the CMap, unless the value of the Encoding key in the Type 0 font dictionary is Identity-H or Identity-V.

  • Object type: PDType0Font
  • Test condition: cmapName == "Identity-H" || cmapName == "Identity-V" || (CIDFontOrdering != null && CIDFontOrdering == CMapOrdering && CIDFontRegistry != null && CIDFontRegistry == CMapRegistry && CIDFontSupplement != null && CMapSupplement != null && CIDFontSupplement <= CMapSupplement)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
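
The compatibility check can be sketched in JavaScript as follows. This is an illustration of the test condition only: the function name `cidSystemInfoIsCompatible` and the nested object layout are hypothetical, chosen to make the two CIDSystemInfo dictionaries easy to compare side by side.

```javascript
// font: { cmapName, cidFont: { registry, ordering, supplement }, cmap: { registry, ordering, supplement } }
// Hypothetical helper mirroring the test condition of this rule.
function cidSystemInfoIsCompatible(font) {
  // Identity encodings impose no constraint on the CIDFont's CIDSystemInfo.
  if (font.cmapName === "Identity-H" || font.cmapName === "Identity-V") return true;
  const a = font.cidFont;
  const b = font.cmap;
  if (!a || !b) return false;
  return a.registry != null && a.registry === b.registry &&
    a.ordering != null && a.ordering === b.ordering &&
    a.supplement != null && b.supplement != null &&
    a.supplement <= b.supplement; // the CIDFont must not be newer than the CMap
}

const font = {
  cmapName: "UniJIS-UTF16-H",
  cidFont: { registry: "Adobe", ordering: "Japan1", supplement: 4 },
  cmap: { registry: "Adobe", ordering: "Japan1", supplement: 6 },
};
console.log(cidSystemInfoIsCompatible(font)); // true
```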

Rule 6.2.10.3.2-1

Requirement

ISO 32000-2:2020, 9.7.4, Table 115 requires that, for all embedded Type 2 CIDFonts, the CIDFont dictionary shall contain a CIDToGIDMap entry that shall be either a stream mapping from CIDs to glyph indices or the name Identity.

Error details

A Type 2 CIDFont dictionary has missing or invalid CIDToGIDMap entry

For Type 2, the CIDFont program is actually a TrueType font program, which has no native notion of CIDs. In a TrueType font program, glyph descriptions are identified by glyph index values. Glyph indices are internal to the font and are not defined consistently from one font to another. Instead, a TrueType font program contains a "cmap" table that provides mappings directly from character codes to glyph indices for one or more predefined encodings.

If the TrueType font program is embedded, the Type 2 CIDFont dictionary must contain a CIDToGIDMap entry that maps CIDs to the glyph indices for the appropriate glyph descriptions in that font program.

  • Object type: PDCIDFont
  • Test condition: Subtype != "CIDFontType2" || CIDToGIDMap != null || fontFile_size == 0
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.7.4, Table 115

Rule 6.2.10.3.3-1

Requirement

All CMaps used within a conforming PDF/A-4 file, except those listed in ISO 32000-2:2020, 9.7.5.2 Table 116, shall be embedded in that file as described in ISO 32000-2:2020, 9.7.5.

Error details

A non-standard CMap is not embedded

  • Object type: PDCMap
  • Test condition: CMapName == "Identity-H" || CMapName == "Identity-V" || CMapName == "GB-EUC-H" || CMapName == "GB-EUC-V" || CMapName == "GBpc-EUC-H" || CMapName == "GBpc-EUC-V" || CMapName == "GBK-EUC-H" || CMapName == "GBK-EUC-V" || CMapName == "GBKp-EUC-H" || CMapName == "GBKp-EUC-V" || CMapName == "GBK2K-H" || CMapName == "GBK2K-V" || CMapName == "UniGB-UCS2-H" || CMapName == "UniGB-UCS2-V" || CMapName == "UniGB-UTF16-H" || CMapName == "UniGB-UTF16-V" || CMapName == "B5pc-H" || CMapName == "B5pc-V" || CMapName == "HKscs-B5-H" || CMapName == "HKscs-B5-V" || CMapName == "ETen-B5-H" || CMapName == "ETen-B5-V" || CMapName == "ETenms-B5-H" || CMapName == "ETenms-B5-V" || CMapName == "CNS-EUC-H" || CMapName == "CNS-EUC-V" || CMapName == "UniCNS-UCS2-H" || CMapName == "UniCNS-UCS2-V" || CMapName == "UniCNS-UTF16-H" || CMapName == "UniCNS-UTF16-V" || CMapName == "83pv-RKSJ-H" || CMapName == "90ms-RKSJ-H" || CMapName == "90ms-RKSJ-V" || CMapName == "90msp-RKSJ-H" || CMapName == "90msp-RKSJ-V" || CMapName == "90pv-RKSJ-H" || CMapName == "Add-RKSJ-H" || CMapName == "Add-RKSJ-V" || CMapName == "EUC-H" || CMapName == "EUC-V" || CMapName == "Ext-RKSJ-H" || CMapName == "Ext-RKSJ-V" || CMapName == "H" || CMapName == "V" || CMapName == "UniJIS-UCS2-H" || CMapName == "UniJIS-UCS2-V" || CMapName == "UniJIS-UCS2-HW-H" || CMapName == "UniJIS-UCS2-HW-V" || CMapName == "UniJIS-UTF16-H" || CMapName == "UniJIS-UTF16-V" || CMapName == "KSC-EUC-H" || CMapName == "KSC-EUC-V" || CMapName == "KSCms-UHC-H" || CMapName == "KSCms-UHC-V" || CMapName == "KSCms-UHC-HW-H" || CMapName == "KSCms-UHC-HW-V" || CMapName == "KSCpc-EUC-H" || CMapName == "UniKS-UCS2-H" || CMapName == "UniKS-UCS2-V" || CMapName == "UniKS-UTF16-H" || CMapName == "UniKS-UTF16-V" || embeddedFile_size == 1
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.7.5.2, Table 116

Rule 6.2.10.3.3-2

Requirement

For those CMaps that are embedded, the integer value of the WMode entry in the CMap dictionary shall be identical to the WMode value in the embedded CMap stream.

Error details

WMode entry in the embedded CMap and in the CMap dictionary are not identical

A CMap also specifies the writing mode (horizontal or vertical) for any CIDFont with which the CMap is combined. This determines which metrics are to be used when glyphs are painted from that font.

In the case of an embedded CMap file, the writing mode is specified in two different places: in the PDF dictionary associated with the embedded CMap, and inside the embedded CMap file itself. To avoid any ambiguity, the PDF/A standard requires that these two values coincide.

  • Object type: CMapFile
  • Test condition: WMode == dictWMode
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.10.3.3-3

Requirement

A CMap shall not reference any other CMap except those listed in ISO 32000-2:2020, 9.7.5.2 Table 116.

Error details

A CMap references another non-standard CMap

  • Object type: PDReferencedCMap
  • Test condition: CMapName == "Identity-H" || CMapName == "Identity-V" || CMapName == "GB-EUC-H" || CMapName == "GB-EUC-V" || CMapName == "GBpc-EUC-H" || CMapName == "GBpc-EUC-V" || CMapName == "GBK-EUC-H" || CMapName == "GBK-EUC-V" || CMapName == "GBKp-EUC-H" || CMapName == "GBKp-EUC-V" || CMapName == "GBK2K-H" || CMapName == "GBK2K-V" || CMapName == "UniGB-UCS2-H" || CMapName == "UniGB-UCS2-V" || CMapName == "UniGB-UTF16-H" || CMapName == "UniGB-UTF16-V" || CMapName == "B5pc-H" || CMapName == "B5pc-V" || CMapName == "HKscs-B5-H" || CMapName == "HKscs-B5-V" || CMapName == "ETen-B5-H" || CMapName == "ETen-B5-V" || CMapName == "ETenms-B5-H" || CMapName == "ETenms-B5-V" || CMapName == "CNS-EUC-H" || CMapName == "CNS-EUC-V" || CMapName == "UniCNS-UCS2-H" || CMapName == "UniCNS-UCS2-V" || CMapName == "UniCNS-UTF16-H" || CMapName == "UniCNS-UTF16-V" || CMapName == "83pv-RKSJ-H" || CMapName == "90ms-RKSJ-H" || CMapName == "90ms-RKSJ-V" || CMapName == "90msp-RKSJ-H" || CMapName == "90msp-RKSJ-V" || CMapName == "90pv-RKSJ-H" || CMapName == "Add-RKSJ-H" || CMapName == "Add-RKSJ-V" || CMapName == "EUC-H" || CMapName == "EUC-V" || CMapName == "Ext-RKSJ-H" || CMapName == "Ext-RKSJ-V" || CMapName == "H" || CMapName == "V" || CMapName == "UniJIS-UCS2-H" || CMapName == "UniJIS-UCS2-V" || CMapName == "UniJIS-UCS2-HW-H" || CMapName == "UniJIS-UCS2-HW-V" || CMapName == "UniJIS-UTF16-H" || CMapName == "UniJIS-UTF16-V" || CMapName == "KSC-EUC-H" || CMapName == "KSC-EUC-V" || CMapName == "KSCms-UHC-H" || CMapName == "KSCms-UHC-V" || CMapName == "KSCms-UHC-HW-H" || CMapName == "KSCms-UHC-HW-V" || CMapName == "KSCpc-EUC-H" || CMapName == "UniKS-UCS2-H" || CMapName == "UniKS-UCS2-V" || CMapName == "UniKS-UTF16-H" || CMapName == "UniKS-UTF16-V"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.7.5.2, Table 116

Rule 6.2.10.4.1-1

Requirement

The font programs for all fonts used for rendering within a conforming file shall be embedded within that file, as defined in ISO 32000-2:2020, 9.9.

Error details

The font program is not embedded

Text rendering mode 3 specifies that glyphs are not stroked, filled or used as a clipping boundary. A font referenced for use solely in this mode is therefore not rendered and is thus exempt from the embedding requirement.

  • Object type: PDFont
  • Test condition: Subtype == "Type3" || Subtype == "Type0" || renderingMode == 3 || fontFile_size == 1
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.9

Rule 6.2.10.4.1-2

Requirement

Embedded fonts shall define all glyphs referenced for rendering within the conforming file. A font referenced for use solely in rendering mode 3 is therefore not rendered and is thus exempt from the embedding requirement. In all cases for TrueType fonts that are to be rendered, character codes shall be able to be mapped to glyphs according to ISO 32000-2:2020, 9.6.5 without the use of a non-standard mapping chosen by the conforming processor.

Error details

Not all glyphs referenced for rendering are present in the embedded font program

All conforming PDF/A readers shall use the embedded fonts, rather than other locally resident, substituted or simulated fonts, for rendering.

The fonts used exclusively with text rendering mode 3 (invisible) are exempt from this requirement. OCR solutions often use such invisible fonts on top of the original scanned image to enable selection and copying of recognized text.

There is no exemption from the requirements of this rule for the 14 standard Type 1 fonts. See Rule 6.2.10.2-4 for the list of all standard fonts.

  • Object type: Glyph
  • Test condition: renderingMode == 3 || isGlyphPresent == null || isGlyphPresent == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 9.6.5

Rule 6.2.10.5-1

Requirement

For every font embedded in a conforming file, the glyph width information in the font dictionary and in the embedded font program shall be consistent for every glyph referenced for rendering. Glyphs that are referenced only with rendering mode 3 are exempt from this requirement.

Error details

Glyph width information in the embedded font program is not consistent with the Widths entry of the font dictionary

This requirement is necessary to ensure predictable font rendering, regardless of whether a given reader uses the metrics in the Widths entry or those in the font program.

  • Object type: Glyph
  • Test condition: renderingMode == 3 || widthFromFontProgram == null || widthFromDictionary == null || Math.abs(widthFromFontProgram - widthFromDictionary) <= 1
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
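
The test condition tolerates a difference of up to 1 between the two width values and skips the comparison when either value is unavailable. A minimal JavaScript sketch of that logic follows. The function name `widthIsConsistent` and the `glyph` object shape are hypothetical, and the unit of the widths is assumed to be the same on both sides.

```javascript
// Hypothetical helper mirroring the test condition of this rule.
function widthIsConsistent(glyph) {
  if (glyph.renderingMode === 3) return true; // invisible text is exempt
  if (glyph.widthFromFontProgram == null || glyph.widthFromDictionary == null) {
    return true; // nothing to compare
  }
  // A difference of at most 1 is accepted, as in the test condition.
  return Math.abs(glyph.widthFromFontProgram - glyph.widthFromDictionary) <= 1;
}

console.log(widthIsConsistent({ renderingMode: 0, widthFromFontProgram: 500, widthFromDictionary: 501 })); // true
console.log(widthIsConsistent({ renderingMode: 0, widthFromFontProgram: 500, widthFromDictionary: 502 })); // false
```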

Rule 6.2.10.6-1

Requirement

For all non-symbolic TrueType fonts used for rendering, the embedded TrueType font program shall contain at least the Microsoft Unicode (3,1 – Platform ID=3, Encoding ID=1) or the Macintosh Roman (1,0 – Platform ID=1, Encoding ID=0) 'cmap' subtable, so that all necessary glyph lookups can be carried out.

Error details

The embedded font program for a non-symbolic TrueType font does not contain the Microsoft Unicode (3,1 – Platform ID=3, Encoding ID=1) or the Mac Roman (1,0 – Platform ID=1, Encoding ID=0) encoding.

  • Object type: TrueTypeFontProgram
  • Test condition: isSymbolic == true || cmap31Present == true || cmap10Present == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.10.6-2

Requirement

All non-symbolic TrueType fonts shall have either MacRomanEncoding or WinAnsiEncoding as the value for the Encoding key in the Font dictionary or as the value for the BaseEncoding key in the dictionary which is the value of the Encoding key in the Font dictionary. In addition, no non-symbolic TrueType font shall define a Differences array unless all of the glyph names in the Differences array are listed in the Adobe Glyph List and the embedded font program contains at least the Microsoft Unicode (3,1 – Platform ID=3, Encoding ID=1) encoding in the 'cmap' subtable.

Error details

A non-symbolic TrueType font encoding does not define a correct mapping to the Adobe Glyph List

  • Object type: PDTrueTypeFont
  • Test condition: isSymbolic == true || ((Encoding == "MacRomanEncoding" || Encoding == "WinAnsiEncoding") && (containsDifferences == false || differencesAreUnicodeCompliant == true))
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.10.6-3

Requirement

Symbolic TrueType fonts shall not contain an Encoding entry in the font dictionary

Error details

A symbolic TrueType font specifies an Encoding entry in its dictionary

A font is called symbolic if it contains characters outside the Adobe standard Latin character set. It is marked by a special flag in its font descriptor dictionary.

  • Object type: PDTrueTypeFont
  • Test condition: isSymbolic == false || Encoding == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.2.10.6-4

Requirement

Symbolic TrueType fonts shall not contain an Encoding entry in the font dictionary, and the 'cmap' subtable in the embedded font program shall either contain the Microsoft Symbol (3,0 – Platform ID=3, Encoding ID=0) or the Mac Roman (1,0 – Platform ID=1, Encoding ID=0) encoding.

Error details

The embedded font program for a symbolic TrueType font does not contain Microsoft Symbol (3,0 – Platform ID=3, Encoding ID=0) or the Mac Roman (1,0 – Platform ID=1, Encoding ID=0) encoding.

  • Object type: TrueTypeFontProgram
  • Test condition: isSymbolic == false || cmap30Present == true || cmap10Present == true
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
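
Rules 6.2.10.6-1 and 6.2.10.6-4 differ only in which Microsoft 'cmap' subtable they accept alongside Mac Roman (1,0): (3,1) for non-symbolic fonts and (3,0) for symbolic ones. The sketch below combines both test conditions in JavaScript. The function name `hasRequiredCmap` is hypothetical, and the property names are taken from the test conditions.

```javascript
// Hypothetical helper combining the test conditions of Rules 6.2.10.6-1 and 6.2.10.6-4.
function hasRequiredCmap(program) {
  // Symbolic fonts need Microsoft Symbol (3,0); non-symbolic fonts need Microsoft Unicode (3,1).
  const microsoftSubtable = program.isSymbolic ? program.cmap30Present : program.cmap31Present;
  // Mac Roman (1,0) is accepted in both cases.
  return microsoftSubtable === true || program.cmap10Present === true;
}

console.log(hasRequiredCmap({ isSymbolic: false, cmap31Present: true, cmap30Present: false, cmap10Present: false })); // true
console.log(hasRequiredCmap({ isSymbolic: true, cmap31Present: true, cmap30Present: false, cmap10Present: false }));  // false
```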

Rule 6.2.10.7-1

Requirement

If a ToUnicode CMap is present, the Unicode values specified there shall all be greater than zero (0), but not equal to either U+FEFF or U+FFFE.

Error details

The glyph has an invalid Unicode value, which is either 0, or is equal to U+FEFF or U+FFFE.

The font dictionary of all fonts, regardless of their rendering mode usage, shall include a ToUnicode entry whose value is a CMap stream object that maps character codes for at least all referenced glyphs to Unicode values, as described in ISO 32000-2:2020, 9.10.3, unless the font meets at least one of the following four conditions:

  • fonts that use the predefined encodings MacRomanEncoding, MacExpertEncoding or WinAnsiEncoding;
  • Type 1 and Type 3 fonts where the glyph names of the glyphs referenced are all contained in the Adobe Glyph List or the set of named characters in the Symbol font, as defined in ISO 32000-2:2020, Annex D;
  • Type 0 fonts whose descendant CIDFont uses the Adobe-GB1, Adobe-CNS1, Adobe-Japan1 or Adobe-KR character collections.
  • Non-symbolic TrueType fonts.

This requirement ensures that the values in the ToUnicode CMap will be useful values and not simply placeholders.

  • Object type: Glyph
  • Test condition: toUnicode == null || (toUnicode.indexOf("\u0000") == -1 && toUnicode.indexOf("\uFFFE") == -1 && toUnicode.indexOf("\uFEFF") == -1)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
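
The test condition rejects a glyph if any of the three placeholder values appears anywhere in its mapped Unicode string. A minimal JavaScript sketch of the same check follows; the function name `toUnicodeIsValid` is hypothetical.

```javascript
// toUnicode: the Unicode string a glyph maps to, or null if no ToUnicode mapping exists.
// Hypothetical helper mirroring the test condition of this rule.
function toUnicodeIsValid(toUnicode) {
  if (toUnicode === null) return true; // the rule only applies when a mapping is present
  const forbidden = ["\u0000", "\uFEFF", "\uFFFE"];
  return forbidden.every((ch) => !toUnicode.includes(ch));
}

console.log(toUnicodeIsValid("fi"));        // true
console.log(toUnicodeIsValid("a\uFEFFb"));  // false
```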

Rule 6.2.10.8-1

Requirement

The ActualText entry shall not contain any PUA values.

Error details

The ActualText entry contains a Unicode Private Use Area (PUA) value.

  • Object type: CosActualText
  • Test condition: containsPUA == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
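
Unicode reserves three private use ranges: U+E000 to U+F8FF in the Basic Multilingual Plane, U+F0000 to U+FFFFD in plane 15, and U+100000 to U+10FFFD in plane 16. The following JavaScript sketch shows one way a check like `containsPUA` could be written. It is an illustration only: the source does not state which ranges the validator's own property covers.

```javascript
// The three Unicode private use ranges (inclusive code point bounds).
const PUA_RANGES = [
  [0xE000, 0xF8FF],     // Private Use Area (BMP)
  [0xF0000, 0xFFFFD],   // Supplementary Private Use Area-A
  [0x100000, 0x10FFFD], // Supplementary Private Use Area-B
];

// Illustrative check; not necessarily how the validator computes containsPUA.
function containsPUA(text) {
  for (const ch of text) { // for...of iterates by code point, not by UTF-16 unit
    const cp = ch.codePointAt(0);
    if (PUA_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi)) return true;
  }
  return false;
}

console.log(containsPUA("Invoice 42"));    // false
console.log(containsPUA("logo: \uE000"));  // true
```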

Rule 6.2.10.9-1

Requirement

A PDF/A-4 compliant document shall not contain a reference to the .notdef glyph from any of the text showing operators, regardless of text rendering mode, in any content stream.

Error details

The document contains a reference to the .notdef glyph.

  • Object type: Glyph
  • Test condition: name != ".notdef"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.3.1-1

Requirement

Annotation types not defined in ISO 32000-2:2020, 12.5.6.1, Table 171 shall not be permitted. Additionally, the Sound, Screen and Movie types shall not be permitted. 3D and RichMedia types shall only be permitted in a PDF/A-4e compliant file as described in Annex B. The FileAttachment type shall only be permitted in a PDF/A-4f compliant file as described in Annex A.

Error details

Unknown or not permitted annotation type.

An annotation associates an object such as a note, sound, or movie with a location on a page of a PDF document, or provides a means of interacting with the user via the mouse and keyboard. PDF includes a wide variety of standard annotation types, described in detail in ISO 32000-2:2020, 12.5.6, "Annotation Types."

Support for multimedia content is outside the scope of PDF/A-4, and, thus, annotations of type "Sound", "Screen" and "Movie" are not permitted.

  • PDF/A-4 validation:

    • Object type: PDAnnot
    • Test condition: Subtype == "Text" || Subtype == "Link" || Subtype == "FreeText" || Subtype == "Line" || Subtype == "Square" || Subtype == "Circle" || Subtype == "Polygon" || Subtype == "PolyLine" || Subtype == "Highlight" || Subtype == "Underline" || Subtype == "Squiggly" || Subtype == "StrikeOut" || Subtype == "Stamp" || Subtype == "Caret" || Subtype == "Ink" || Subtype == "Popup" || Subtype == "Widget" || Subtype == "PrinterMark" || Subtype == "TrapNet" || Subtype == "Watermark" || Subtype == "Redact" || Subtype == "Projection"
    • Specification: ISO 19005-4:2020
    • Levels: 4
    • Additional references:
      • ISO 32000-2:2020, 12.5.6.1, Table 171
  • PDF/A-4e validation:

    • Object type: PDAnnot
    • Test condition: Subtype == "Text" || Subtype == "Link" || Subtype == "FreeText" || Subtype == "Line" || Subtype == "Square" || Subtype == "Circle" || Subtype == "Polygon" || Subtype == "PolyLine" || Subtype == "Highlight" || Subtype == "Underline" || Subtype == "Squiggly" || Subtype == "StrikeOut" || Subtype == "Stamp" || Subtype == "Caret" || Subtype == "Ink" || Subtype == "Popup" || Subtype == "Widget" || Subtype == "PrinterMark" || Subtype == "TrapNet" || Subtype == "Watermark" || Subtype == "Redact" || Subtype == "3D" || Subtype == "Projection" || Subtype == "RichMedia"
    • Specification: ISO 19005-4:2020
    • Levels: 4E
    • Additional references:
      • ISO 32000-2:2020, 12.5.6.1, Table 171
  • PDF/A-4f validation:

    • Object type: PDAnnot
    • Test condition: Subtype == "Text" || Subtype == "Link" || Subtype == "FreeText" || Subtype == "Line" || Subtype == "Square" || Subtype == "Circle" || Subtype == "Polygon" || Subtype == "PolyLine" || Subtype == "Highlight" || Subtype == "Underline" || Subtype == "Squiggly" || Subtype == "StrikeOut" || Subtype == "Stamp" || Subtype == "Caret" || Subtype == "Ink" || Subtype == "Popup" || Subtype == "FileAttachment" || Subtype == "Widget" || Subtype == "PrinterMark" || Subtype == "TrapNet" || Subtype == "Watermark" || Subtype == "Redact" || Subtype == "Projection"
    • Specification: ISO 19005-4:2020
    • Levels: 4F
    • Additional references:
      • ISO 32000-2:2020, 12.5.6.1, Table 171

Rule 6.3.2-1

Requirement

Except for annotation dictionaries whose Subtype value is Popup, all annotation dictionaries shall contain the F key.

Error details

A dictionary of a non-Popup annotation does not contain F key

  • Object type: PDAnnot
  • Test condition: Subtype == "Popup" || F != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.3.2-2

Requirement

If present, the F key's Print flag bit shall be set to 1 and its Hidden, Invisible, ToggleNoView, and NoView flag bits shall be set to 0.

Error details

Annotation flags set the annotation to be hidden, invisible or non-printable

The value of the annotation dictionary's F entry is an unsigned 32-bit integer containing flags specifying various characteristics of the annotation. Bit positions within the flag word are numbered from 1 (low-order) to 32 (high-order).

Flag "Invisible" (bit position 1), if clear, allows an unknown annotation to be displayed using an appearance stream specified by its appearance dictionary, if any.

Flag "Hidden" (bit position 2), if set, specifies to not display or print the annotation or allow it to interact with the user, regardless of its annotation type.

Flag "Print" (bit position 3), if set, specifies to print the annotation when the page is printed.

Flag "NoView" (bit position 6), if set, specifies to not display the annotation on the screen or allow it to interact with the user.

Flag "ToggleNoView" (bit position 9), if set, inverts the interpretation of the NoView flag for certain events.

The restrictions on annotation flags prevent the use of annotations that are hidden or that are viewable but not printable.

  • Object type: PDAnnot
  • Test condition: F == null || ((F & 1) == 0 && (F & 2) == 0 && (F & 4) == 4 && (F & 32) == 0 && (F & 256) == 0)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
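
The bit positions described above map to masks of the form 1 << (position - 1), which is where the constants 1, 2, 4, 32 and 256 in the test condition come from. A minimal sketch of the same check with named masks follows; the constant and function names are illustrative and are not part of veraPDF.

```javascript
// Bit position n (numbered from 1, low-order) corresponds to the mask 1 << (n - 1).
const FLAG_INVISIBLE = 1 << 0;      // bit position 1, mask 1
const FLAG_HIDDEN = 1 << 1;         // bit position 2, mask 2
const FLAG_PRINT = 1 << 2;          // bit position 3, mask 4
const FLAG_NO_VIEW = 1 << 5;        // bit position 6, mask 32
const FLAG_TOGGLE_NO_VIEW = 1 << 8; // bit position 9, mask 256

// Hypothetical helper mirroring the test condition of rule 6.3.2-2.
function flagsAreCompliant(f) {
  // A missing F key is outside this rule; rule 6.3.2-1 covers it.
  if (f === null || f === undefined) return true;
  const forbidden =
    FLAG_INVISIBLE | FLAG_HIDDEN | FLAG_NO_VIEW | FLAG_TOGGLE_NO_VIEW;
  return (f & FLAG_PRINT) !== 0 && (f & forbidden) === 0;
}
```

Under this sketch, F = 4 (Print only) passes, while F = 0 (Print not set) and F = 6 (Print plus Hidden) fail. Flags not named by the rule do not affect the result.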

Rule 6.3.3-1

Requirement

Every annotation (including those whose Subtype value is Widget, as used for form fields), except for the two cases listed below, shall have at least one appearance dictionary:

  • annotations where the value of the Rect key consists of an array where the value at index 1 is equal to the value at index 3 and the value at index 2 is equal to the value at index 4;
  • annotations whose Subtype value is Popup, Link or Projection.

Error details

An annotation does not contain an appearance dictionary

  • Object type: PDAnnot
  • Test condition: (width == 0 && height == 0) || Subtype == "Popup" || Subtype == "Link" || Subtype == "Projection" || AP != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 12.5.2, Table 166
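
The width and height used in the test condition can be derived from the Rect array. The sketch below assumes a hypothetical plain-object annotation with a four-number Rect, and assumes that the positions named in the requirement are counted from 1, so they correspond to zero-based elements 0/2 and 1/3; neither the object shape nor the function name comes from veraPDF.

```javascript
// Hypothetical annotation shape: { Subtype: string, Rect: [x1, y1, x2, y2], AP?: object }
function hasRequiredAppearance(annot) {
  const [x1, y1, x2, y2] = annot.Rect;
  // Zero width and zero height: both corner points coincide.
  const zeroSize = x1 === x2 && y1 === y2;
  // Subtypes that the requirement exempts from having an appearance dictionary.
  const exemptSubtype = ["Popup", "Link", "Projection"].includes(annot.Subtype);
  return zeroSize || exemptSubtype || annot.AP != null;
}
```

With these assumptions, a Text annotation with Rect [0, 0, 10, 10] and no AP entry fails the check, while the same annotation with Rect [5, 5, 5, 5] passes.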

Rule 6.3.3-2

Requirement

For all annotation dictionaries containing an AP key, the appearance dictionary that it defines as its value shall contain only the N key.

Error details

Annotation's appearance dictionary contains entries other than N

An annotation can define as many as three separate appearances. The normal appearance (N key in the appearance dictionary) is used when the annotation is not interacting with the user. This is also the appearance that is used for printing the annotation. The rollover appearance (R key) is used when the user moves the cursor into the annotation's active area without pressing the mouse button. The down appearance (D key) is used when the mouse button is pressed or held down within the annotation's active area.

In accordance with ISO 32000-2:2020, 12.5.5, a Button form field needs to have multiple appearance states, each one associated with the specific values that the button can take.

PDF Validation Technical Working Group notes

Even if a Button form field has only one state, it is still required to have an appearance subdictionary with a single key as its default (and only) state.

  • Object type: PDAnnot
  • Test condition: AP == null || AP == "N"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.3.3-3

Requirement

If an annotation dictionary's Subtype key has a value of Widget and its FT key has a value of Btn, the value of the N key shall be an appearance subdictionary.

Error details

An annotation dictionary's Subtype key has a value of Widget and its FT key has a value of Btn, but the value of the N key is not an appearance subdictionary

See also rule 6.3.3-2.

  • Object type: PDAnnot
  • Test condition: AP != "N" || Subtype != "Widget" || FT != "Btn" || (N_type == "Dict" && appearance_size > 0)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.3.3-4

Requirement

If an annotation dictionary's Subtype key has value other than Widget, or if FT key associated with Widget annotation has value other than Btn, the value of the N key shall be an appearance stream.

Error details

An annotation dictionary's Subtype key has a value other than Widget, or its FT key has a value other than Btn, but the value of the N key is not an appearance stream

See also rule 6.3.3-2.

  • Object type: PDAnnot
  • Test condition: AP != "N" || (Subtype == "Widget" && FT == "Btn") || N_type == "Stream"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
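
Rules 6.3.3-2 to 6.3.3-4 depend on each other: the type of the N value is only checked when the appearance dictionary holds N and nothing else. The following sketch shows one way to read the three test conditions together. The object shape (apKeys, nType, nStates) and the function name are assumptions made for illustration and do not reflect the veraPDF model.

```javascript
// Hypothetical annotation shape:
//   { Subtype, FT, apKeys: string[] | null, nType: "Dict" | "Stream", nStates: number }
// Returns the identifiers of the rules that the annotation violates.
function checkAppearance(annot) {
  if (annot.apKeys == null) return []; // no AP key: none of the three rules applies

  const onlyN = annot.apKeys.length === 1 && annot.apKeys[0] === "N";
  if (!onlyN) return ["6.3.3-2"]; // N-type rules are skipped unless AP holds only N

  const isButtonWidget = annot.Subtype === "Widget" && annot.FT === "Btn";
  if (isButtonWidget) {
    // Button widgets need a non-empty appearance subdictionary.
    const ok = annot.nType === "Dict" && annot.nStates > 0;
    return ok ? [] : ["6.3.3-3"];
  }
  // Every other annotation needs an appearance stream.
  return annot.nType === "Stream" ? [] : ["6.3.3-4"];
}
```

For example, a Btn widget whose N value is a stream is reported under 6.3.3-3, and a Text annotation whose N value is a dictionary is reported under 6.3.3-4.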

Rule 6.4.1-1

Requirement

A Widget annotation dictionary shall not contain the A key.

Error details

A Widget annotation contains the A entry

Such actions may modify the values of the forms and alter the visual representation of the document.

  • Object type: PDWidgetAnnot
  • Test condition: containsA == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.4.1-2

Requirement

The NeedAppearances flag of the interactive form dictionary shall either not be present or shall be false.

Error details

The interactive form dictionary contains the NeedAppearances flag with value true

Every form field shall have an appearance dictionary associated with the field's data. A conforming reader shall render the field according to the appearance dictionary without regard to the form data. Requiring an appearance dictionary ensures the reliable rendering of the form.

  • Object type: PDAcroForm
  • Test condition: NeedAppearances == null || NeedAppearances == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.4.2-1

Requirement

The document's interactive form dictionary that forms the value of the AcroForm key in the document's Catalog of a PDF/A-4 file, if present, shall not contain the XFA key.

Error details

The interactive form dictionary contains the XFA key

  • Object type: PDAcroForm
  • Test condition: containsXFA == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.4.2-2

Requirement

A document's Catalog shall not contain the NeedsRendering key.

Error details

A document's Catalog contains the NeedsRendering flag set to true

  • Object type: CosDocument
  • Test condition: NeedsRendering == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.6.1-1

Requirement

The Launch, Sound, Movie, ResetForm, ImportData, Hide, Rendition and Trans actions shall not be permitted. Additionally, the obsoleted set-state and no-op actions, that were defined in earlier PDF specifications, shall not be permitted. The SetOCGState and GoTo3DView actions shall only be permitted in a PDF/A-4e compliant file as described in Annex B.

Error details

Unknown or not permitted action type

Support for multimedia content is outside the scope of PDF/A-4. The ResetForm action changes the rendered appearance of a form. The ImportData action imports form data from an external file.

  • PDF/A-4 validation:

    • Object type: PDAction
    • Test condition: S == "GoTo" || S == "GoToR" || S == "GoToE" || S == "Thread" || S == "URI" || S == "Named" || S == "SubmitForm" || S == "JavaScript" || S == "RichMediaExecute" || S == "GoToDp"
    • Specification: ISO 19005-4:2020
    • Levels: 4, 4F
    • Additional references:
      • ISO 32000-2:2020, 12.6.4.1, Table 201
  • PDF/A-4e validation:

    • Object type: PDAction
    • Test condition: S == "GoTo" || S == "GoToR" || S == "GoToE" || S == "Thread" || S == "URI" || S == "Named" || S == "SubmitForm" || S == "SetOCGState" || S == "GoTo3DView" || S == "JavaScript" || S == "RichMediaExecute" || S == "GoToDp"
    • Specification: ISO 19005-4:2020
    • Levels: 4E
    • Additional references:
      • ISO 32000-2:2020, 12.6.4.1, Table 201
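
The two test conditions above share ten action types, and PDF/A-4e adds SetOCGState and GoTo3DView. A compact sketch of that relationship follows; the set and function names are illustrative only.

```javascript
// Action types accepted at every level, taken from the PDF/A-4 test condition.
const COMMON_ACTIONS = new Set([
  "GoTo", "GoToR", "GoToE", "Thread", "URI", "Named",
  "SubmitForm", "JavaScript", "RichMediaExecute", "GoToDp",
]);

// Action types that only the PDF/A-4e test condition accepts.
const PDFA4E_ONLY_ACTIONS = new Set(["SetOCGState", "GoTo3DView"]);

// level is one of "4", "4E", "4F" (hypothetical parameter).
function isPermittedAction(s, level) {
  return COMMON_ACTIONS.has(s) || (level === "4E" && PDFA4E_ONLY_ACTIONS.has(s));
}
```

Any action type outside both sets, such as Launch or an unknown custom type, is rejected at every level, which matches the "Unknown or not permitted action type" error text.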

Rule 6.6.1-2

Requirement

Named actions other than NextPage, PrevPage, FirstPage, and LastPage shall not be permitted.

Error details

Unknown or not permitted named action

In response to each of the four allowed named actions, conforming interactive readers shall perform the appropriate action described in ISO 32000-2:2020 Table 215. Viewer applications may support additional, nonstandard named actions, but any document using them will not be portable.

  • Object type: PDNamedAction
  • Test condition: N == "NextPage" || N == "PrevPage" || N == "FirstPage" || N == "LastPage"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 12.6.4.12, Table 215

Rule 6.6.3-1

Requirement

If a document catalog dictionary, a page dictionary or an annotation dictionary (other than a Widget annotation dictionary) includes an AA entry, its value (which is an additional-actions dictionary) shall only contain keys from the following list: E, X, D, U, Fo, and Bl.

Error details

A document catalog dictionary includes an AA entry and its value (which is an additional-actions dictionary) contains key(s) not from the following list: E, X, D, U, Fo, and Bl

  • Object type: PDAdditionalActions
  • Test condition: parentType == 'WidgetAnnot' || parentType == 'FormField' || entries.split('&').filter(elem => elem != 'E' && elem != 'X' && elem != 'D' && elem != 'U' && elem != 'Fo' && elem != 'Bl').length == 0 || entries == ''
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
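
The filter in the test condition can be unpacked into a helper that reports which keys are not allowed. This sketch assumes, as the test condition implies, that entries is a string of key names joined by '&'; the function and constant names are hypothetical.

```javascript
const ALLOWED_AA_KEYS = new Set(["E", "X", "D", "U", "Fo", "Bl"]);

// Returns the key names in an '&'-separated entries string that are not allowed.
function disallowedAAKeys(entries) {
  // "".split("&") yields [""], so an empty dictionary needs its own branch.
  // This is the same case that the `entries == ''` clause handles in the test condition.
  if (entries === "") return [];
  return entries.split("&").filter((key) => !ALLOWED_AA_KEYS.has(key));
}
```

An entries value of "E&X&Fo" yields an empty list, while "E&PO&Bl" yields a list containing only "PO".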

Rule 6.7.2.1-1

Requirement

The document catalog dictionary of a conforming file shall contain the Metadata key whose value is a metadata stream as defined in ISO 32000-2:2020, 14.3.2.

Error details

The document catalog dictionary does not contain the Metadata key.

Metadata is essential for effective management of a file throughout its life cycle. A file depends on metadata for identification and description, as well as for describing appropriate technical and administrative matters.

Metadata, both for an entire document and for components within a document, can be stored in PDF streams called metadata streams. The content of a metadata stream is the metadata represented in Extensible Markup Language (XML). The format of the XML representing the metadata is defined as part of a framework called the Extensible Metadata Platform (XMP).

A metadata stream can be attached to a document through the Metadata entry in the document catalog.

  • Object type: PDDocument
  • Test condition: metadata_size == 1
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 14.3.2

Rule 6.7.2.1-2

Requirement

The bytes attribute shall not be used in the header of an XMP packet.

Error details

The XMP packet header contains the bytes attribute.

Both the bytes and encoding attributes are deprecated in the XMP Specification.

  • Object type: XMPPackage
  • Test condition: bytes == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.7.2.1-3

Requirement

The encoding attribute shall not be used in the header of an XMP packet.

Error details

The XMP packet header contains the encoding attribute.

Both the bytes and encoding attributes are deprecated in the XMP Specification.

  • Object type: XMPPackage
  • Test condition: encoding == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
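
Rules 6.7.2.1-2 and 6.7.2.1-3 both look at the attributes of the XMP packet header, the processing instruction that starts with <?xpacket begin=. The sketch below scans a packet string for the two deprecated attributes. It is a simplified, regex-based illustration that assumes begin is the first attribute and that no attribute value contains a '?'; it is not how veraPDF parses the packet, and the function name is hypothetical.

```javascript
// Returns the deprecated attribute names found in the packet header,
// or null when no header processing instruction is found.
function deprecatedHeaderAttributes(xmp) {
  const header = /<\?xpacket\s+begin=[^?]*\?>/.exec(xmp);
  if (header === null) return null;

  const found = [];
  for (const name of ["bytes", "encoding"]) {
    // Match the attribute name as a whole word followed by '='.
    if (new RegExp(`\\s${name}\\s*=`).test(header[0])) found.push(name);
  }
  return found;
}
```

A header such as <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?> produces an empty list, whereas adding bytes="2048" to it produces a list containing "bytes".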

Rule 6.7.2.1-4

Requirement

All content of all XMP packets located in any metadata stream present in the PDF shall be well-formed as defined by XMP (ISO 16684-1).

Error details

A metadata stream is serialized incorrectly and cannot be parsed.

  • Object type: XMPPackage
  • Test condition: isSerializationValid
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 16684-1:2019

Rule 6.7.2.1-5

Requirement

All metadata streams present in the PDF shall conform to the XMP Specification. The XMP package must be encoded as UTF-8.

Error details

The XMP package uses an encoding different from UTF-8.

  • Object type: XMPPackage
  • Test condition: actualEncoding == "UTF-8"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • XMP Specification September 2005, 5 - Embedding XMP Metadata in Application Files - PDF

Rule 6.7.3-1

Requirement

The PDF/A version and conformance level of a file shall be specified using the PDF/A Identification extension schema. The PDF/A Identification schema defined in Table 2 uses the namespace URI "http://www.aiim.org/pdfa/ns/id/".

Error details

The document metadata stream does not contain the PDF/A Identification Schema.

  • Object type: MainXMPPackage
  • Test condition: Identification_size == 1
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.7.3-2

Requirement

The value of pdfaid:part shall be the part number of ISO 19005 to which the file conforms.

Error details

The "part" property of the PDF/A Identification Schema does not match the PDF/A profile part number.

  • Object type: PDFAIdentification
  • Test condition: part == 4
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.7.3-3

Requirement

A PDF/A-4e conforming file (as described in Annex B) shall specify the value of pdfaid:conformance as E. A PDF/A-4f conforming file (as described in Annex A) shall specify the value of pdfaid:conformance as F. A file that does not conform to either PDF/A-4e or PDF/A-4f shall not provide any pdfaid:conformance.

Error details

A file that does not conform to either PDF/A-4e or PDF/A-4f provides pdfaid:conformance.

A PDF/A-4e conforming file does not specify the value of pdfaid:conformance as E.

A PDF/A-4f conforming file does not specify the value of pdfaid:conformance as F.

  • PDF/A-4 validation:

    • Object type: PDFAIdentification
    • Test condition: conformance == null
    • Specification: ISO 19005-4:2020
    • Levels: 4
  • PDF/A-4e validation:

    • Object type: PDFAIdentification
    • Test condition: conformance == "E"
    • Specification: ISO 19005-4:2020
    • Levels: 4E
    • Additional references:
      • ISO 19005-4:2020, Annex B
  • PDF/A-4f validation:

    • Object type: PDFAIdentification
    • Test condition: conformance == "F"
    • Specification: ISO 19005-4:2020
    • Levels: 4F
    • Additional references:
      • ISO 19005-4:2020, Annex A

Rule 6.7.3-4

Requirement

Property part of the PDF/A Identification Schema shall have namespace prefix pdfaid

Error details

Property part of the PDF/A Identification Schema has an invalid namespace prefix

  • Object type: PDFAIdentification
  • Test condition: partPrefix == null || partPrefix == "pdfaid"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.7.3-5

Requirement

The value of pdfaid:rev shall be the four digit year.

Error details

The value of pdfaid:rev is not a four-digit year.

  • Object type: PDFAIdentification
  • Test condition: /^\d{4}$/.test(rev)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.7.3-6

Requirement

Property rev of the PDF/A Identification Schema shall have namespace prefix pdfaid

Error details

Property rev of the PDF/A Identification Schema has an invalid namespace prefix

  • Object type: PDFAIdentification
  • Test condition: revPrefix == null || revPrefix == "pdfaid"
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
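
Rules 6.7.3-2 to 6.7.3-6 all inspect the same PDF/A Identification properties, so they can be read as one combined check. The sketch below restates the test conditions over a hypothetical plain object (part as a number, the other properties as strings or absent) and returns the identifiers of the rules that fail. The object shape and names are assumptions, not the veraPDF model.

```javascript
// Expected pdfaid:conformance per level; null means the property must be absent.
const EXPECTED_CONFORMANCE = { "4": null, "4E": "E", "4F": "F" };

// id: { part, conformance?, rev, partPrefix?, revPrefix? }; level: "4", "4E" or "4F".
function checkIdentification(id, level) {
  const failed = [];
  if (id.part !== 4) failed.push("6.7.3-2");

  const conformance = id.conformance ?? null; // treat a missing property as null
  if (conformance !== EXPECTED_CONFORMANCE[level]) failed.push("6.7.3-3");

  if (id.partPrefix != null && id.partPrefix !== "pdfaid") failed.push("6.7.3-4");
  if (!/^\d{4}$/.test(id.rev)) failed.push("6.7.3-5");
  if (id.revPrefix != null && id.revPrefix !== "pdfaid") failed.push("6.7.3-6");
  return failed;
}
```

Note that the rev pattern only checks for exactly four digits; it does not check that the year corresponds to a published revision of the standard.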

Rule 6.9-1

Requirement

The embedded file stream dictionary shall include a valid MIME type value for the Subtype key. If the MIME type is not known, the value "application/octet-stream" shall be used.

Error details

The MIME type information (Subtype entry) of an embedded file is missing or invalid

  • Object type: EmbeddedFile
  • Test condition: Subtype != null && /^[-\w+\.]+\/[-\w+\.]+$/.test(Subtype)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 14.13.2
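
The regular expression in the test condition checks only the syntactic shape type/subtype, where each part consists of word characters, hyphens, plus signs and dots. It does not check whether the MIME type is registered. A small sketch, with a hypothetical function name, makes this behaviour visible:

```javascript
// Same pattern as in the test condition of rule 6.9-1.
const MIME_PATTERN = /^[-\w+\.]+\/[-\w+\.]+$/;

function hasValidMimeSubtype(subtype) {
  return subtype != null && MIME_PATTERN.test(subtype);
}
```

Values such as "application/pdf", "image/svg+xml" and "application/octet-stream" pass, while a missing Subtype, "text" (no slash) and "text/plain; charset=utf-8" (parameters are not matched by the pattern) fail.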

Rule 6.9-2

Requirement

The file specification dictionary for an embedded file shall contain the F and UF keys

Error details

The file specification dictionary for an embedded file does not contain either F or UF key

  • Object type: CosFileSpecification
  • Test condition: containsEF == false || (F != null && UF != null)
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.9-3

Requirement

All of the embedded files shall be compliant with ISO 19005-1, 19005-2 or 19005-4

Error details

An embedded file does not comply with any of ISO 19005-1, ISO 19005-2 or ISO 19005-4

  • Object type: EmbeddedFile
  • Test condition: isValidPDFA124 == true
  • Specification: ISO 19005-4:2020
  • Levels: 4

Rule 6.9-4

Requirement

Each embedded file’s file specification dictionary shall contain an AFRelationship key (ISO 32000-2, 7.11.3) that describes how this embedded file relates to the content of the PDF.

Error details

The file specification dictionary for an embedded file does not contain the AFRelationship key

  • Object type: CosFileSpecification
  • Test condition: containsEF == false || AFRelationship != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
  • Additional references:
    • ISO 32000-2:2020, 7.11.3

Rule 6.9-5

Requirement

A PDF/A-4f conforming file shall contain an EmbeddedFiles key in the name dictionary of the document catalog dictionary.

Error details

A PDF/A-4f conforming file does not contain an EmbeddedFiles key in the name dictionary of the document catalog dictionary.

  • Object type: CosDocument
  • Test condition: containsEmbeddedFiles == true
  • Specification: ISO 19005-4:2020
  • Levels: 4F
  • Additional references:
    • ISO 19005-4:2020, Annex A

Rule 6.10-1

Requirement

Each optional content configuration dictionary that forms the value of the D key, or that is an element in the array that forms the value of the Configs key in the OCProperties dictionary, shall contain the Name key.

Error details

Missing Name entry of the optional content configuration dictionary

  • Object type: PDOCConfig
  • Test condition: Name != null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.10-2

Requirement

Each optional content configuration dictionary shall contain the Name key, whose value shall be unique amongst all optional content configuration dictionaries within the PDF/A-4 file.

Error details

Optional content configuration dictionary has duplicated name

  • Object type: PDOCConfig
  • Test condition: hasDuplicateName == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.10-3

Requirement

If an optional content configuration dictionary contains the Order key, the array which is the value of this Order key shall contain references to all OCGs in the conforming file.

Error details

Not all optional content groups are present in the Order entry of the optional content configuration dictionary

  • Object type: PDOCConfig
  • Test condition: OCGsNotContainedInOrder == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F
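
Rules 6.10-1 to 6.10-3 can be pictured as one pass over all optional content configuration dictionaries. The sketch below uses a hypothetical model in which each configuration is a plain object, OCGs are identified by strings, and Order may contain nested arrays. In this simplified version only the second and later occurrences of a repeated name are reported, which may differ from how a validator attributes the error.

```javascript
// configs: array of { Name?: string, Order?: nested array of OCG ids and labels }
// allOCGs: array of all OCG ids in the file (hypothetical string identifiers)
function checkOCConfigs(configs, allOCGs) {
  const errors = [];
  const seenNames = new Set();

  configs.forEach((cfg, index) => {
    if (cfg.Name == null) {
      errors.push({ config: index, rule: "6.10-1" });
    } else if (seenNames.has(cfg.Name)) {
      errors.push({ config: index, rule: "6.10-2" });
    } else {
      seenNames.add(cfg.Name);
    }

    if (cfg.Order != null) {
      // Order can nest groups inside arrays, so flatten before comparing.
      const listed = new Set(cfg.Order.flat(Infinity));
      const missing = allOCGs.filter((ocg) => !listed.has(ocg));
      if (missing.length > 0) {
        errors.push({ config: index, rule: "6.10-3", missing });
      }
    }
  });
  return errors;
}
```

A configuration without an Order key is never reported under 6.10-3, since that rule only applies when the key is present.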

Rule 6.11-1

Requirement

There shall be no AlternatePresentations entry in the document's name dictionary

Error details

The document's name dictionary contains the AlternatePresentations entry

  • Object type: PDDocument
  • Test condition: containsAlternatePresentations == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.11-2

Requirement

There shall be no PresSteps entry in any Page dictionary

Error details

A Page dictionary contains the PresSteps entry

  • Object type: PDPage
  • Test condition: containsPresSteps == false
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F

Rule 6.12-1

Requirement

The document catalog shall not contain the Requirements key

Error details

The document catalog contains the Requirements key

  • Object type: CosDocument
  • Test condition: Requirements == null
  • Specification: ISO 19005-4:2020
  • Levels: 4, 4E, 4F