# St/ISO/PDF 

PDF stands for Portable Document Format. It was created by Adobe[^1] 
and is currently maintained by the International Organization for Standardization (ISO) as an open international standard [^2]. 

Some commonly used specialized PDF types include: 

- ISO 14289/PDF/UA  for accessible PDF documents and processors (extends PDF/A conformance level A) 
- ISO 15930/PDF/X  for printing 
- ISO 19005/PDF/A for **long-term archiving** [^3] 
    - Sub-parts:
        - ISO 19005-1:2005/PDF/A-1 (based on PDF v1.4)
        - ISO 19005-2:2011/PDF/A-2 (based on PDF v1.7)
        - ISO 19005-3:2012/PDF/A-3 (based on PDF v1.7; adds embedding of arbitrary file types)
        - ISO 19005-4:2020/PDF/A-4 (based on PDF v2.0)
    - Not allowed: audio, video, 3D objects, JavaScript, certain actions, encryption, non-standard metadata
    - Required: embedded fonts, licensed to permit embedding 
- ISO 24517/PDF/E  for representing engineering documents (CAD, etc.).
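
A PDF/A file declares its part and conformance level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The following is a minimal sketch of reading that declaration, assuming the XMP packet is stored uncompressed so that it can be found by scanning the raw bytes. The function name `find_pdfa_claim` is hypothetical and not part of `mtbp3` or `pypdf`. It only reports what a file claims; it does not validate conformance.

```python
import re

# XMP can serialize a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); both forms are matched here.
_PART = re.compile(rb'pdfaid:part\s*(?:>|=\s*["\'])\s*(\d)')
_CONF = re.compile(rb'pdfaid:conformance\s*(?:>|=\s*["\'])\s*([A-Za-z])')


def find_pdfa_claim(data: bytes):
    """Return (part, conformance) declared in the XMP packet, or None.

    Assumption: the metadata is uncompressed and therefore visible in `data`.
    """
    part = _PART.search(data)
    if part is None:
        return None
    conf = _CONF.search(data)
    level = conf.group(1).decode("ascii").upper() if conf else None
    return int(part.group(1)), level


# Illustrative XMP fragment (not taken from a real file)
sample = (
    b'<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">'
    b"<pdfaid:part>2</pdfaid:part>"
    b"<pdfaid:conformance>B</pdfaid:conformance>"
    b"</rdf:Description>"
)
print(find_pdfa_claim(sample))
```

For a file on disk, the bytes could be obtained with `open(path, "rb").read()`. A dedicated validator is still needed to confirm that the file actually meets the declared level.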

For regulatory submissions, the FDA currently supports "PDF versions 1.4 through 1.7, PDF/A-1 and PDF/A-2"[^4].
Steps for **creating and validating** PDF/A files can be found in references [^5]<sup>,</sup>[^6].
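
As a rough first screen against the version range quoted above, the `%PDF-n.m` header can be read directly from the file bytes. This is a sketch only: the helper names `pdf_header_version` and `in_quoted_fda_range` are hypothetical, and the check ignores the `/Version` entry in the document catalog, which can override the header version in PDF 1.4 and later.

```python
import re

_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes):
    """Return (major, minor) from the %PDF-n.m header, or None if absent."""
    # Some files carry a few bytes before the header, so search the
    # first 1024 bytes instead of requiring the header at offset 0.
    m = _HEADER.search(data[:1024])
    return (int(m.group(1)), int(m.group(2))) if m else None


def in_quoted_fda_range(version) -> bool:
    """True if the header version lies within PDF 1.4 through 1.7."""
    return version is not None and (1, 4) <= version <= (1, 7)


for head in (b"%PDF-1.3\n", b"%PDF-1.4\n", b"%PDF-1.7\n", b"%PDF-2.0\n"):
    version = pdf_header_version(head)
    print(version, in_quoted_fda_range(version))
```

A header inside the range does not by itself show that a file is acceptable for submission; it only filters out files that declare an unsupported version.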

The module `stiso.pdfsummary` depends on the package `pypdf` [^7].
It includes functions for creating summaries of a specified PDF file.


To use `mtbp3.stiso`:

```python
from mtbp3.stiso.pdfsummary import pdfSummary

pfr = pdfSummary(path="")
print(pfr.get_summary_string())
```

If the path is left empty, an example PDF file is loaded for illustration.

To view the outline tree:

```python
print(pfr.show_outline_tree())
```

## ISO 32000-1:2008/PDF 1.7

### Character 

Classes

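- Regular
    - any character that is not a delimiter or white space
    - in a name, regular characters outside 21h to 7Eh (! to ~) should be written as #xx hex codes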
- Delimiter
    - 28h: '(' (literal string)
    - 29h: ')'
    - 3Ch: '<' (hex string)
    - 3Eh: '>'
    - 5Bh: '[' (array)
    - 5Dh: ']'
    - 7Bh: '{'
    - 7Dh: '}'
    - 2Fh: '/' 
    - 25h: '%' (comment; other than %PDF-n.m and %%EOF)
- White space
    - 00h: null
    - 09h: tab
    - 0Ch: form feed
    - 20h: space
    - end-of-line (EOL) markers
        - 0Dh: CR 
        - 0Ah: LF 
        - 0Dh 0Ah: CRLF
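
The three classes can be expressed as a small lookup. The sketch below uses the delimiter and white-space bytes listed above and treats every other byte as regular, which is how ISO 32000-1 defines the regular class. The function name `char_class` is hypothetical.

```python
# Byte values taken from the delimiter and white-space lists above
WHITE_SPACE = frozenset(b"\x00\t\n\x0c\r ")
DELIMITERS = frozenset(b"()<>[]{}/%")


def char_class(byte: int) -> str:
    """Classify one byte as 'white-space', 'delimiter', or 'regular'."""
    if byte in WHITE_SPACE:
        return "white-space"
    if byte in DELIMITERS:
        return "delimiter"
    return "regular"


# Iterating over a bytes object yields integers, one per byte
for b in b"/Type 1":
    print(f"{b:02X}h", char_class(b))
```

A tokenizer can use this classification to split input: white space separates tokens, a delimiter ends the current token and starts a new one, and runs of regular characters form names, numbers, and keywords.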

### Objects

- Indirect 
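    - Define: `N G obj\n...\nendobj`, where N is the object number (positive integer) and G is the generation number (non-negative integer)
    - Reference: `N G R`
    - Stream Define: `dictionary\nstream\n...\nendstream` or `dictionary\r\nstream\r\n...\r\nendstream`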
- Direct
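    - any object that is not an indirect object
- Name
    - Define: 2Fh (/) followed by the name characters; a name is an atomic symbol with no internal structure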
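
The object syntax above can be illustrated with a few regular expressions. This is a simplified sketch, not a real PDF parser: it ignores the cross-reference table, compressed object streams, and the possibility that stream or string data contains the same byte sequences. The function names are hypothetical, and the `#xx` decoding assumes PDF 1.2 or later.

```python
import re

# "<object number> <generation number> obj ... endobj"
_OBJ_DEF = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# "<object number> <generation number> R"
_OBJ_REF = re.compile(rb"(\d+)\s+(\d+)\s+R\b")
# "#xx" two-digit hex escape inside a name
_NAME_ESC = re.compile(rb"#([0-9A-Fa-f]{2})")


def indirect_objects(data: bytes) -> dict:
    """Map (object number, generation number) to the raw object body."""
    return {(int(n), int(g)): body.strip() for n, g, body in _OBJ_DEF.findall(data)}


def references(body: bytes) -> list:
    """List the (object number, generation number) pairs referenced in a body."""
    return [(int(n), int(g)) for n, g in _OBJ_REF.findall(body)]


def decode_name(token: bytes) -> bytes:
    """Strip the leading '/' from a name token and resolve #xx escapes."""
    if not token.startswith(b"/"):
        raise ValueError("a name token must start with '/' (2Fh)")
    return _NAME_ESC.sub(lambda m: bytes([int(m.group(1), 16)]), token[1:])


# Illustrative fragment (not a complete PDF file)
sample = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
)
objs = indirect_objects(sample)
for key, body in objs.items():
    print(key, body, references(body))
print(decode_name(b"/A#20B"))
```

In the fragment, object `1 0` is defined as an indirect object and refers to object `2 0` through `2 0 R`; the dictionaries and names written inside each body are direct objects.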

## References

[^1]: Adobe. (2024). Everything you need to know about the PDF. ([web page](https://www.adobe.com/acrobat/about-adobe-pdf.html))
[^2]: ISO. (2021). The standard for PDF is revised. ([web page](https://www.iso.org/news/ref2608.html))
[^3]: pdfa.org. (2013). PDF/A in a Nutshell 2.0. ([web page](https://pdfa.org/resource/pdfa-in-a-nutshell-2-0/))
[^4]: FDA. (2016). Portable Document Format (PDF) Specifications. ([pdf](https://www.fda.gov/files/drugs/published/Portable-Document-Format-Specifications.pdf))
[^5]: Adobe. (2023). PDF/X-, PDF/A-, and PDF/E-compliant files (Acrobat Pro). ([web page](https://helpx.adobe.com/acrobat/using/pdf-x-pdf-a-pdf.html))
[^6]: pypdf Contributors. (2024). PDF/A Compliance. ([web page](https://pypdf.readthedocs.io/en/stable/user/pdfa-compliance.html))
[^7]: pypdf Contributors. (2024). pypdf. ([web page](https://pypdf.readthedocs.io/en/stable/index.html))




