Dealing with Embedded Files

Jorj X. McKie edited this page Mar 6, 2018 · 7 revisions

Starting with its v1.11.0 release, which is based on MuPDF v1.11, PyMuPDF can deal with embedded files.

This feature (PDF 1.4 format) allows attaching arbitrary data or files to PDF documents. With PyMuPDF, such embedded data can be added, deleted, extracted and modified.

Here is a script that packs a set of PDF files from one directory into a new PDF and lists their names on its first page.

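import os

import fitz

sdir = "D:/Jorj/Wissen/Spektrum"    # directory holding the files to embed

flist = sorted(os.listdir(sdir))    # process files in alphabetical order
doc = fitz.open()                   # create a new, empty PDF
page = doc.newPage()                # one page to hold the table of contents
text = ["This file contains the following documents:", ""]
for f in flist:
    # only embed the PDF files whose names start with "sdw_2017"
    if not (f.endswith(".pdf") and f.startswith("sdw_2017")):
        continue
    text.append(f)
    # read the file content as bytes; the file is closed right afterwards
    with open(os.path.join(sdir, f), "rb") as infile:
        buffer = infile.read()
    doc.embeddedFileAdd(buffer, f)  # store the content under the file's name

# write the list of embedded file names onto the page
page.insertText(fitz.Point(50, 100), text)
doc.save("embedded.pdf", garbage=4, deflate=True)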

Most PDF viewers will offer to display an embedded file.
