📄 `PDF` Links

A convinient way to create / layout / maintain PDF link annotations in Adobe Illustrator.

Motivation

While you can create automatic links in a PDF (by put the actual url into a textbox), it is limiting in various ways (create links on graphics, create custom hotspot). Also, when iterating on a document design, I found it pretty cumbersome to create / update link annotations in external apps, so after some research on Apple PDFKit I put together this tiny tool.

Usage

Create a layer for the links (that you can hide later on).

Create a text starting with "Link " followed by the actual url.

Wrap into a clipping rectangle to define link hotspot.

Hide layer containing the links before export PDF.

Launch PDF Links, drag PDF into.

Enjoy linked PDF.

Install

An installer is packaged at PDF_Links.dmg.

Background

Besides the use case, this repository is a prototype for PDF content processing in Swift.

The pages, annotations, textual content is pretty accessable with the high-level PDFKit.PDFDocument APIs. However, the actual content streams in a PDF (images / graphics) are only accessible as raw data via PDFKit.CGPDFDocument.

The project contains a Parser.swift class that crawls a PDF object hierarchy and maps out the content as a JSON for further inspection. Using that JSON you can plan out various processing implementations (images / fonts / graphics / layers / metadata / etc.).

// Parse PDF into JSON.
PDFParser.parse(pdfUrl: pdfFileURL, into: jsonFileURL)

// Parse PDF into Dictionary.
let pdf: [String:Any?] = PDFParser.parse(pdfUrl: pdfFileURL)

The resulting JSON gives you the entire PDF content (with type information in angle brackets).

{
  "Catalog" : {
    "Pages<Dictionary>" : {
      "MediaBox<Array>" : [
        0,
        0,
        612,
        792
      ],
      "Type<Name>" : "Pages",
      "Kids<Array>" : [
        {
          "Rotate<Integer>" : 0,
          "MediaBox<Array>" : [
            0,
            0,
            595.27499999999998,
            841.88999999999999
          ],
          "Parent<Dictionary>" : "<PARENT_NOT_SERIALIZED>",
          "Resources<Dictionary>" : {
            "ColorSpace<Dictionary>" : {
              "Cs1<Array>" : [
                "ICCBased",
                {
                  "N<Integer>" : 3,
                  "Filter<Name>" : "FlateDecode",
                  "Alternate<Name>" : "DeviceRGB",
                  "Length<Integer>" : 2612
                }
              ]
            }
...

You can get the PDF content as a Swift dictionary as well (see console output below).

Optional(["Pages<Dictionary>": Optional({
    "Count<Integer>" = 1;
    "Kids<Array>" =     (
                {
            "ArtBox<Array>" =             (
                "28.3465",
                "325.193",
                "393.389",
                "813.543"
            );
            "Contents<Stream>" =             {
                Data = "q Q q 0 0 595.276 841.89 re W n 1 0 1 0 k /Gs1 gs 201.8862 420.9449 m 201.8862\n473.8269 244.7562 516.6959 297.6372 516.6959 c 350.5192 516.6959 393.3892\n473.8269 393.3892 420.9449 c 393.3892 368.0629 350.5192 325.1939 297.6372\n325.1939 c 244.7562 325.1939 201.8862 368.0629 201.8862 420.9449 c f Q q 28.346 530.078 283.464 283.465\nre W n 0 0 0 1 k /Gs1 gs BT 12 0 0 12 28.3467 803.499 Tm /Tc1 1 Tf [ (h) 4\n(ttp://epp) 7 (z.eu) ] TJ ET Q";
                "Filter<Name>" = FlateDecode;
                "Length<Integer>" = 237;
            };
            "MediaBox<Array>" =             (
                0,
                0,
                "595.2760000000001",
                "841.89"
            );
            "Parent<Dictionary>" = "<PARENT_NOT_SERIALIZED>";
            "Resources<Dictionary>" =             {
                "ExtGState<Dictionary>" =                 {
                    "Gs1<Dictionary>" =                     {
                        "OPM<Integer>" = 1;
                        "Type<Name>" = ExtGState;
                    };
                };
...

See Parser.swift for more.

Graphics data is serialized using COS (Carousel Object System). Although Carousel was only a code name for what later became Acrobat, the name is still used to refer to the way a PDF file is composed. From the documentation: "...the data in a content stream is interpreted as a sequence of operators and their operands, expressed as basic data objects according to standard PDF syntax...". See official PDF Reference for more.

Here is what a slice of the contents of the example PDF used in Usage section looks like.

...
/OC /MC1 BDC 
0.02 0.655 0.502 rg
0 586.77 595.275 255.119 re
f
EMC 
/OC /MC2 BDC 
BT
1 1 1 rg
/TT0 1 Tf
32.0407 0 0 32.0407 255.1182 714.3296 Tm
(Geri Borb\\341s)Tj
/TT1 1 Tf
14 0 0 14 255.1182 670.0706 Tm
[(I lo)19.1 (v)17.9 (e this industry)48 (. In the past 8 )28 (y)18 (ears I made )]TJ
0 -1.286 Td
[(numer)26 (ous )31 (Apps and Games )]TJ
/TT0 1 Tf
[(fr)26 (om z)14.1 (er)26 (o t)13 (o )]TJ
0 -1.286 Td
[(mark)27 (e)4 (t)]TJ
/TT1 1 Tf
[(, bo)7.1 (th t)13 (eamed and solo.)]TJ
ET
...

It is somewhat human readable, seemingly designed to direct draw using the operators. In this project I used the regex below to parse link text data with the bounds of the corresponding clipping rectangles. See the expression on Regex101 for more.

# Clipping Rectangle (x, y, width, height)
(?<x>\b[-0-9.]+\b)\s
(?<y>\b[-0-9.]+\b)\s
(?<width>\b[-0-9.]+\b)\s
(?<height>\b[-0-9.]+\b)\s
re\nW

# Spacing
(?:
    .   # Any character
    (?! # Except followed by
        # Clipping Rectangle
        (\b[-\d.]+\b\s){4}
        re\nW
    )
)*? # 0 or more times

# URL
BT
    # Spacing
    (?:
        .      # Any character 
        (?!ET) # Except followed by 'ET'
    )*?        # 0 or more times
\n
    # Link
    (?<URL>
        .[^\n]*? # Any character except new-line 0 or more times
        Link     # Containing 'Link'
        .*?      # Any character 0 or more times
    )
    # Followed by 'TJ' or 'Tj' at the end of the line
    (?:TJ\n|Tj\n)
ET

It parses the graphic content into nicely usable Swift Codable structs. See PageLinks.parseLinks(from contents:) for more. After parsing it can be encoded into JSON easily.

{
"pages" : [
  {
    "links" : [
      {
        "bounds" : {
          "y" : 43.936999999999998,
          "x" : 43.936999999999998,
          "width" : 39.685000000000002,
          "height" : 39.686
        },
        "urlString" : "http:\/\/bit.ly\/GeriBorbasLinkedIn"
      },
      {
        "bounds" : {
          "y" : 43.936999999999998,
          "x" : 86.456999999999994,
          "width" : 39.685000000000002,
          "height" : 39.686
        },
        "urlString" : "http:\/\/bit.ly\/GeriBorbasTwitter"
      },
      {
        "bounds" : {
          "y" : 43.936999999999998,
          "x" : 128.976,
          "width" : 39.685000000000002,
          "height" : 39.686
        },
        "urlString" : "http:\/\/bit.ly\/GeriBorbasGitHub"
      },
...

To create PDFKit.PDFAnnotation, the same coordinate system can be used. Having that, a parsed Link can be directly converted into a PDFKit.PDFAnnotation. Those can be added to a PDF page easily with PDFKit.PDFPage.addAnnotation(_:).

extension Link
{


    var annotation: PDFAnnotation
    {
        PDFAnnotation(
            bounds: CGRect(x: bounds.x, y: bounds.y, width: bounds.width, height: bounds.height),
            forType: PDFAnnotationSubtype.link,
            withProperties: nil
        ).with(url: url)
    }
}

License

Licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
Documentation		Documentation
PDFLinks.xcodeproj		PDFLinks.xcodeproj
PDFLinks		PDFLinks
Regex.playground		Regex.playground
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Documentation

Documentation

PDFLinks.xcodeproj

PDFLinks.xcodeproj

PDFLinks

PDFLinks

Regex.playground

Regex.playground

.gitignore

.gitignore

CHANGELOG.md

CHANGELOG.md

README.md

README.md

Repository files navigation

📄 `PDF` Links

Motivation

Usage

Install

Background

License

About

Releases 1

Packages

Languages

Geri-Borbas/macOS.Production.PDF_Links

Folders and files

Latest commit

History

Repository files navigation

📄 PDF Links

Motivation

Usage

Install

Background

License

About

Topics

Resources

Stars

Watchers

Forks

Languages

📄 `PDF` Links