<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book [
<!ENTITY ldquo "&#x201C;">
<!ENTITY rdquo "&#x201D;">
<!ENTITY mdash "&#x2014;">
<!ENTITY ndash "&#x2013;">
<!ENTITY nbsp "&#xA0;">
<!ENTITY swversion "4.1.0">
<!ENTITY lastreleased "April 14, 2013">
]>
<book>
<bookinfo>
<title>QPDF Manual</title>
<subtitle>For QPDF Version &swversion;, &lastreleased;</subtitle>
<author>
<firstname>Jay</firstname><surname>Berkenbilt</surname>
</author>
<copyright>
<year>2005–2013</year>
<holder>Jay Berkenbilt</holder>
</copyright>
</bookinfo>
<preface id="acknowledgments">
<title>General Information</title>
<para>
QPDF is a program that does structural, content-preserving
transformations on PDF files. QPDF's website is located at <ulink
url="http://qpdf.sourceforge.net/">http://qpdf.sourceforge.net/</ulink>.
QPDF's source code is hosted on GitHub at <ulink
url="https://github.com/qpdf/qpdf">https://github.com/qpdf/qpdf</ulink>.
</para>
<para>
QPDF has been released under the terms of <ulink
url="http://www.opensource.org/licenses/artistic-license-2.0.php">Version
2.0 of the Artistic License</ulink>, a copy of which appears in the
file <filename>Artistic-2.0</filename> in the source distribution.
</para>
<para>
QPDF was originally created in 2001 and modified periodically
between 2001 and 2005 during my employment at <ulink
url="http://www.apexcovantage.com">Apex CoVantage</ulink>. Upon my
departure from Apex, the company graciously allowed me to take
ownership of the software and continue maintaining it as an open
source project, a decision for which I am very grateful. I have
made considerable enhancements to it since that time. I feel
fortunate to have worked for people who would make such a decision.
This work would not have been possible without their support.
</para>
</preface>
<chapter id="ref.overview">
<title>What is QPDF?</title>
<para>
QPDF is a program that does structural, content-preserving
transformations on PDF files. It could have been called something
like <emphasis>pdf-to-pdf</emphasis>. It also provides many useful
capabilities to developers of PDF-producing software or for people
who just want to look at the innards of a PDF file to learn more
about how they work.
</para>
<para>
With QPDF, it is possible to copy objects from one PDF file into
another and to manipulate the list of pages in a PDF file. This
makes it possible to merge and split PDF files. The QPDF library
also makes it possible for you to create PDF files from scratch.
In this mode, you are responsible for supplying all the contents of
the file, while the QPDF library takes care of all the syntactical
representation of the objects, creation of cross-reference tables
and, if you use them, object streams, encryption, linearization,
and other syntactic details. You are still responsible for
generating PDF content on your own.
</para>
<para>
QPDF has been designed with very few external dependencies, and it
is intentionally very lightweight. QPDF is
<emphasis>not</emphasis> a PDF content creation library, a PDF
viewer, or a program capable of converting PDF into other formats.
In particular, QPDF knows nothing about the semantics of PDF
content streams. If you are looking for something that can do
that, you should look elsewhere. However, once you have a valid
PDF file, QPDF can be used to transform that file in ways that
your original PDF creation software can't handle. For example, many
programs generate simple PDF files but can't password-protect them,
web-optimize them, or perform other transformations of that type.
</para>
</chapter>
<chapter id="ref.installing">
<title>Building and Installing QPDF</title>
<para>
This chapter describes how to build and install qpdf. Please see
also the <filename>README</filename> and
<filename>INSTALL</filename> files in the source distribution.
</para>
<sect1 id="ref.prerequisites">
<title>System Requirements</title>
<para>
The qpdf package has relatively few external dependencies. In
order to build qpdf, the following packages are required:
<itemizedlist>
<listitem>
<para>
zlib: <ulink url="http://www.zlib.net/">http://www.zlib.net/</ulink>
</para>
</listitem>
<listitem>
<para>
pcre: <ulink url="http://www.pcre.org/">http://www.pcre.org/</ulink>
</para>
</listitem>
<listitem>
<para>
GNU make 3.81 or newer: <ulink url="http://www.gnu.org/software/make">http://www.gnu.org/software/make</ulink>
</para>
</listitem>
<listitem>
<para>
perl version 5.8 or newer:
<ulink url="http://www.perl.org/">http://www.perl.org/</ulink>;
required for <command>fix-qdf</command> and the test suite.
</para>
</listitem>
<listitem>
<para>
GNU diffutils (any version): <ulink
url="http://www.gnu.org/software/diffutils/">http://www.gnu.org/software/diffutils/</ulink>
is required to run the test suite. Note that this is the
version of diff present on virtually all GNU/Linux systems.
This is required because the test suite uses <command>diff
-u</command>.
</para>
</listitem>
<listitem>
<para>
A C++ compiler that works well with STL and has the <type>long
long</type> type. Most modern C++ compilers should fit the
bill fine. QPDF is tested with gcc and Microsoft Visual C++.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Part of qpdf's test suite does comparisons of the contents of PDF
files by converting them to images and comparing the images. The
image comparison tests are disabled by default. Those tests are
not required for determining correctness of a qpdf build if you
have not modified the code, since the test suite also contains
expected output files that are compared literally. The image
comparison tests provide an extra check to make sure that any
content transformations don't break the rendering of pages.
Transformations that affect the content streams themselves are off
by default and are only provided to help developers look into the
contents of PDF files. If you are making deep changes to the
library that cause changes in the contents of the files that qpdf
generates, then you should enable the image comparison tests.
Enable them by running <command>configure</command> with the
<option>--enable-test-compare-images</option> flag. If you enable
this, the test suite additionally requires the following packages.
Note that in no case are these items required to use
qpdf.
<itemizedlist>
<listitem>
<para>
libtiff: <ulink url="http://www.remotesensing.org/libtiff/">http://www.remotesensing.org/libtiff/</ulink>
</para>
</listitem>
<listitem>
<para>
Ghostscript version 8.60 or newer: <ulink
url="http://www.ghostscript.com">http://www.ghostscript.com</ulink>
</para>
</listitem>
</itemizedlist>
If you do not enable this, then you do not need to have libtiff and
Ghostscript.
</para>
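<para>
As a brief sketch of the steps described above, the following
sequence enables the image comparison tests and then runs the test
suite. It assumes a UNIX-like shell in the top-level source
directory, with libtiff and Ghostscript already installed.
<programlisting># Enable the optional image comparison tests at configure time.
./configure --enable-test-compare-images
# Build, then run the test suite, which now includes the image comparisons.
make
make check
</programlisting>
</para>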
<para>
If Adobe Reader is installed as <command>acroread</command>, some
additional test cases will be enabled. These test cases simply
verify that Adobe Reader can open the files that qpdf creates.
They require version 8.0 or newer to pass. However, in order to
avoid having qpdf depend on non-free (as in liberty) software, the
test suite will still pass without Adobe Reader, and the test
suite still exercises the full functionality of the software.
</para>
<para>
Pre-built documentation is distributed with qpdf, so you should
generally not need to rebuild the documentation. In order to
build the documentation from its docbook sources, you need the
docbook XML style sheets (<ulink
url="http://downloads.sourceforge.net/docbook/">http://downloads.sourceforge.net/docbook/</ulink>).
To build the PDF version of the documentation, you need Apache fop
(<ulink
url="http://xml.apache.org/fop/">http://xml.apache.org/fop/</ulink>)
version 0.94 or higher.
</para>
</sect1>
<sect1 id="ref.building">
<title>Build Instructions</title>
<para>
Building qpdf on UNIX is generally just a matter of running
<programlisting>./configure
make
</programlisting>
You can also run <command>make check</command> to run the test
suite and <command>make install</command> to install. Please run
<command>./configure --help</command> for options on what can be
configured. You can also set the value of
<varname>DESTDIR</varname> during installation to install to a
temporary location, as is common with many open source packages.
Please see also the <filename>README</filename> and
<filename>INSTALL</filename> files in the source distribution.
</para>
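<para>
Put together, a typical build followed by a staged installation
might look like the sketch below. The staging directory
<filename>/tmp/qpdf-stage</filename> is only an illustrative
choice; any writable directory should do.
<programlisting>./configure
make
# Optionally run the test suite before installing.
make check
# Install under a temporary root instead of the live file system.
make install DESTDIR=/tmp/qpdf-stage
</programlisting>
</para>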
<para>
Building on Windows is a little bit more complicated. For
details, please see <filename>README-windows.txt</filename> in the
source distribution. You can also download a binary distribution
for Windows. There is a port of qpdf to Visual C++ version 6 in
the <filename>contrib</filename> area generously contributed by
Jian Ma. This is also discussed in more detail in
<filename>README-windows.txt</filename>.
</para>
<para>
There are some other things you can do with the build. Although
qpdf uses <application>autoconf</application>, it does not use
<application>automake</application> but instead uses a
hand-crafted non-recursive Makefile that requires GNU make. If
you're really interested, please read the comments in the
top-level <filename>Makefile</filename>.
</para>
</sect1>
</chapter>
<chapter id="ref.using">
<title>Running QPDF</title>
<para>
This chapter describes how to run the qpdf program from the command
line.
</para>
<sect1 id="ref.invocation">
<title>Basic Invocation</title>
<para>
When running qpdf, the basic invocation is as follows:
<programlisting><command>qpdf</command><option> [ <replaceable>options</replaceable> ] <replaceable>infilename</replaceable> [ <replaceable>outfilename</replaceable> ]</option>
</programlisting>
This converts PDF file <option>infilename</option> to PDF file
<option>outfilename</option>. The output file is functionally
identical to the input file but may have been structurally
reorganized. Also, orphaned objects will be removed from the
file. Many transformations are available as controlled by the
options below. In place of <option>infilename</option>, the
parameter <option>--empty</option> may be specified. This causes
qpdf to use a dummy input file that contains zero pages. The only
normal use case for using <option>--empty</option> would be if you
were going to add pages from another source, as discussed in <xref
linkend="ref.page-selection"/>.
</para>
<para>
<option>outfilename</option> does not have to be seekable, even
when generating linearized files. Specifying
“<option>-</option>” as <option>outfilename</option>
means to write to standard output. However, you can't specify the
same file as both the input and the output because qpdf reads data
from the input file as it writes to the output file.
</para>
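<para>
The following sketch illustrates both points, using the
placeholder file names <filename>in.pdf</filename>,
<filename>out.pdf</filename>, and <filename>tmp.pdf</filename>.
The first command writes a linearized file to standard output and
redirects it to a file. Because the input and output must be
different files, one way to replace a file with its transformed
version is to write to a temporary name and rename it afterward.
<programlisting># Write the linearized result to standard output.
qpdf --linearize in.pdf - &gt; out.pdf

# Replace in.pdf safely: write to a different file first, and
# rename it only if qpdf reported success.
if qpdf --linearize in.pdf tmp.pdf; then
    mv tmp.pdf in.pdf
fi
</programlisting>
</para>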
<para>
Most options require an output file, but some testing or
inspection commands do not. These are specifically noted.
</para>
</sect1>
<sect1 id="ref.basic-options">
<title>Basic Options</title>
<para>
The following options are the most common ones and perform
commonly needed transformations.
<variablelist>
<varlistentry>
<term><option>--password=password</option></term>
<listitem>
<para>
Specifies a password for accessing encrypted files.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--linearize</option></term>
<listitem>
<para>
Causes generation of a linearized (web-optimized) output file.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--copy-encryption=file</option></term>
<listitem>
<para>
Encrypt the file using the same encryption parameters,
including user and owner password, as the specified file. Use
<option>--encryption-file-password</option> to specify a password
if one is needed to open this file. Note that copying the
encryption parameters from a file also copies the first half
of <literal>/ID</literal> from the file since this is part of
the encryption parameters.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--encryption-file-password=password</option></term>
<listitem>
<para>
If the file specified with <option>--copy-encryption</option>
requires a password, specify the password using this option.
Note that only one of the user or owner password is required.
Both passwords will be preserved since QPDF does not
distinguish between the two passwords. It is possible to
preserve encryption parameters, including the owner password,
from a file even if you don't know the file's owner password.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--encrypt options --</option></term>
<listitem>
<para>
Causes generation of an encrypted output file. Please see <xref
linkend="ref.encryption-options"/> for details on how to
specify encryption parameters.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--decrypt</option></term>
<listitem>
<para>
Removes any encryption on the file. A password must be
supplied if the file is password protected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--pages options --</option></term>
<listitem>
<para>
Select specific pages from one or more input files. See <xref
linkend="ref.page-selection"/> for details on how to do page
selection (splitting and merging).
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
Password-protected files may be opened by specifying a password.
By default, qpdf will preserve any encryption data associated with
a file. If <option>--decrypt</option> is specified, qpdf will
attempt to remove any encryption information. If
<option>--encrypt</option> is specified, qpdf will replace the
document's encryption parameters with whatever is specified.
</para>
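<para>
A few illustrative invocations of these options follow. The file
names and the password <literal>secret</literal> are placeholders.
<programlisting># Produce a linearized (web-optimized) copy.
qpdf --linearize in.pdf out.pdf

# Open a password-protected file and write an unencrypted copy.
qpdf --password=secret --decrypt in.pdf out.pdf

# Encrypt out.pdf with the same parameters as protected.pdf.
qpdf --copy-encryption=protected.pdf --encryption-file-password=secret in.pdf out.pdf
</programlisting>
</para>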
<para>
Note that qpdf does not obey encryption restrictions already
imposed on the file. Doing so would be meaningless since qpdf can
be used to remove encryption from the file entirely. This
functionality is not intended to be used for bypassing copyright
restrictions or other restrictions placed on files by their
producers.
</para>
<para>
In all cases where qpdf allows specification of a password, care
must be taken if the password contains characters that fall
outside of the 7-bit US-ASCII character range to ensure that the
exact correct byte sequence is provided. It is possible that a
future version of qpdf may handle this more gracefully. For
example, if a file was encrypted using a password that was
encoded in ISO-8859-1 and your terminal is configured to use
UTF-8, the password you supply may not work properly. There are
various approaches to handling this. For example, if you are
using Linux and have the <command>iconv</command> executable
(typically provided by glibc or GNU libiconv) installed, you could pass <option>--password=`echo
<replaceable>password</replaceable> | iconv -t
iso-8859-1`</option> to qpdf where
<replaceable>password</replaceable> is a password specified in
your terminal's locale. A detailed discussion of this is out of
scope for this manual, but just be aware of this issue if you have
trouble with a password that contains 8-bit characters.
</para>
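<para>
As a minimal sketch of the approach above, the following makes the
source encoding explicit rather than relying on the locale. It
assumes that the terminal supplies UTF-8, that the file was
encrypted with an ISO-8859-1 encoded password, and that the shell
passes the converted bytes through unchanged. The password and file
names shown are placeholders.
<programlisting># Convert the password typed in a UTF-8 terminal to ISO-8859-1 bytes.
pw=$(printf '%s' 'pässwörd' | iconv -f utf-8 -t iso-8859-1)
# Pass the converted bytes to qpdf as the password.
qpdf --password="$pw" --decrypt in.pdf out.pdf
</programlisting>
</para>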
</sect1>
<sect1 id="ref.encryption-options">
<title>Encryption Options</title>
<para>
To change the encryption parameters of a file, use the
<option>--encrypt</option> flag. The syntax is
<programlisting><option>--encrypt <replaceable>user-password</replaceable> <replaceable>owner-password</replaceable> <replaceable>key-length</replaceable> [ <replaceable>restrictions</replaceable> ] --</option>
</programlisting>
Note that “<option>--</option>” terminates parsing of
encryption flags and must be present even if no restrictions are
present.
</para>
<para>
Either or both of the user password and the owner password may be
empty strings.
</para>
<para>
The value for
<option><replaceable>key-length</replaceable></option> may be 40,
128, or 256. The restriction flags are dependent upon key length.
When no additional restrictions are given, the default is to be
fully permissive.
</para>
<para>
If <option><replaceable>key-length</replaceable></option> is 40,
the following restriction options are available:
<variablelist>
<varlistentry>
<term><option>--print=[yn]</option></term>
<listitem>
<para>
Determines whether or not to allow printing.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--modify=[yn]</option></term>
<listitem>
<para>
Determines whether or not to allow document modification.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--extract=[yn]</option></term>
<listitem>
<para>
Determines whether or not to allow text/image extraction.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--annotate=[yn]</option></term>
<listitem>
<para>
Determines whether or not to allow comments and form fill-in
and signing.
</para>
</listitem>
</varlistentry>
</variablelist>
If <option><replaceable>key-length</replaceable></option> is 128,
the following restriction options are available:
<variablelist>
<varlistentry>
<term><option>--accessibility=[yn]</option></term>
<listitem>
<para>
Determines whether or not to allow accessibility to the visually
impaired.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--extract=[yn]</option></term>
<listitem>
<para>
Determines whether or not to allow text/graphic extraction.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--print=<replaceable>print-opt</replaceable></option></term>
<listitem>
<para>
Controls printing access.
<option><replaceable>print-opt</replaceable></option> may be
one of the following:
<itemizedlist>
<listitem>
<para>
<option>full</option>: allow full printing
</para>
</listitem>
<listitem>
<para>
<option>low</option>: allow low-resolution printing only
</para>
</listitem>
<listitem>
<para>
<option>none</option>: disallow printing
</para>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--modify=<replaceable>modify-opt</replaceable></option></term>
<listitem>
<para>
Controls modify access.
<option><replaceable>modify-opt</replaceable></option> may be
one of the following, each of which implies all the options
that follow it:
<itemizedlist>
<listitem>
<para>
<option>all</option>: allow full document modification
</para>
</listitem>
<listitem>
<para>
<option>annotate</option>: allow comment authoring and form operations
</para>
</listitem>
<listitem>
<para>
<option>form</option>: allow form field fill-in and signing
</para>
</listitem>
<listitem>
<para>
<option>assembly</option>: allow document assembly only
</para>
</listitem>
<listitem>
<para>
<option>none</option>: allow no modifications
</para>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--cleartext-metadata</option></term>
<listitem>
<para>
If specified, any metadata stream in the document will be left
unencrypted even if the rest of the document is encrypted.
This also forces the PDF version to be at least 1.5.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--use-aes=[yn]</option></term>
<listitem>
<para>
If <option>--use-aes=y</option> is specified, AES encryption
will be used instead of RC4 encryption. This forces the PDF
version to be at least 1.6.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-V4</option></term>
<listitem>
<para>
Use of this option forces the <literal>/V</literal> and
<literal>/R</literal> parameters in the document's encryption
dictionary to be set to the value <literal>4</literal>. As
qpdf will automatically do this when required, there is no
reason to ever use this option. It exists primarily for use
in testing qpdf itself. This option also forces the PDF
version to be at least 1.5.
</para>
</listitem>
</varlistentry>
</variablelist>
If <option><replaceable>key-length</replaceable></option> is 256,
the minimum PDF version is 1.7 with extension level 8, and the
AES-based encryption format used is the PDF 2.0 encryption method
supported by Acrobat X. The same options are available as with
128 bits with the following exceptions:
<variablelist>
<varlistentry>
<term><option>--use-aes</option></term>
<listitem>
<para>
This option is not available with 256-bit keys. AES is always
used with 256-bit encryption keys.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-V4</option></term>
<listitem>
<para>
This option is not available with 256-bit keys.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-R5</option></term>
<listitem>
<para>
If specified, qpdf sets the minimum version to 1.7 at
extension level 3 and writes the deprecated encryption format
used by Acrobat version IX. This option should not be used in
practice to generate PDF files that will be in general use,
but it can be useful to generate files if you are trying to
test proper support in another application for PDF files
encrypted in this way.
</para>
</listitem>
</varlistentry>
</variablelist>
The default for each permission option is to be fully permissive.
</para>
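<para>
The following invocations sketch how these options combine. The
passwords <literal>user-pw</literal> and
<literal>owner-pw</literal> and the file names are placeholders.
<programlisting># 40-bit key: only yes/no restrictions are available.
qpdf --encrypt user-pw owner-pw 40 --print=n --modify=n -- in.pdf out.pdf

# 128-bit key with AES, low-resolution printing, no modification,
# and an empty user password.
qpdf --encrypt "" owner-pw 128 --use-aes=y --print=low --modify=none -- in.pdf out.pdf

# 256-bit key with no restrictions; the closing "--" is still required.
qpdf --encrypt user-pw owner-pw 256 -- in.pdf out.pdf
</programlisting>
</para>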
</sect1>
<sect1 id="ref.page-selection">
<title>Page Selection Options</title>
<para>
Starting with qpdf 3.0, it is possible to split and merge PDF
files by selecting pages from one or more input files. Whatever
file is given as the primary input file is used as the starting
point, but its pages are replaced with pages as specified.
<programlisting><option>--pages <replaceable>input-file</replaceable> [ <replaceable>--password=password</replaceable> ] <replaceable>page-range</replaceable> [ ... ] --</option>
</programlisting>
Multiple input files may be specified. Each one is given as the
name of the input file, an optional password (if required to open
the file), and the range of pages. Note that
“<option>--</option>” terminates parsing of page
selection flags.
</para>
<para>
For each file that pages should be taken from, specify the file, a
password needed to open the file (if any), and a page range. The
password needs to be given only once per file. If any of the
input files are the same as the primary input file or the file
used to copy encryption parameters (if specified), you do not need
to repeat the password here. The same file can be repeated
multiple times. If a file that is repeated has a password, the
password only has to be given the first time. All non-page data
(info, outlines, page numbers, etc.) are taken from the primary
input file. To discard these, use <option>--empty</option> as the
primary input.
</para>
<para>
It is not presently possible to specify the same page from the
same file directly more than once, but you can make this work by
specifying two different paths to the same file (such as by
putting <filename>./</filename> somewhere in the path). This can
also be used if you want to repeat a page from one of the input
files in the output file. This may be made more convenient in a
future version of qpdf if there is enough demand for this feature.
</para>
<para>
The page range is a set of numbers separated by commas, ranges of
numbers separated by dashes, or combinations of those. The character
“z” represents the last page. Pages can appear in any
order. Ranges can appear with a high number followed by a low
number, which causes the pages to appear in reverse. Repeating a
number will cause an error, but you can use the workaround
discussed above should you really want to include the same page
twice.
</para>
<para>
Example page ranges:
<itemizedlist>
<listitem>
<para>
<literal>1,3,5-9,15-12</literal>: pages 1, 3, 5, 6, 7, 8,
9, 15, 14, 13, and 12.
</para>
</listitem>
<listitem>
<para>
<literal>z-1</literal>: all pages in the document in reverse
</para>
</listitem>
</itemizedlist>
</para>
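<para>
As an illustration, these ranges could be used with
<option>--pages</option> as sketched below. The file names are
placeholders.
<programlisting># Reverse the page order of a document.
qpdf in.pdf --pages in.pdf z-1 -- reversed.pdf

# Keep pages 1, 3, and 5 through 9, followed by 15 down to 12.
qpdf in.pdf --pages in.pdf 1,3,5-9,15-12 -- selected.pdf
</programlisting>
</para>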
<para>
Note that qpdf doesn't presently do anything special about other
constructs in a PDF file that may know about pages, so semantics
of splitting and merging vary across features. For example, the
document's outlines (bookmarks) point to actual page objects, so
if you select some pages and not others, bookmarks that point to
pages that are in the output file will work, and remaining
bookmarks will not work. On the other hand, page labels (page
numbers specified in the file) are just sequential, so page labels
will be messed up in the output file. A future version of
<command>qpdf</command> may do a better job at handling these
issues. (Note that the qpdf library already contains all of the
APIs required in order to implement this in your own application
if you need it.) In the meantime, you can always use
<option>--empty</option> as the primary input file to avoid
copying all of that from the first file. For example, to take
pages 1 through 5 from <filename>infile.pdf</filename> while
preserving all metadata associated with that file, you could use
<programlisting><command>qpdf</command> <option>infile.pdf --pages infile.pdf 1-5 -- outfile.pdf</option>
</programlisting>
If you wanted pages 1 through 5 from
<filename>infile.pdf</filename> but you wanted the rest of the
metadata to be dropped, you could instead run
<programlisting><command>qpdf</command> <option>--empty --pages infile.pdf 1-5 -- outfile.pdf</option>
</programlisting>
If you wanted to take pages 1–5 from
<filename>file1.pdf</filename> and pages 11–15 from
<filename>file2.pdf</filename> in reverse, you would run
<programlisting><command>qpdf</command> <option>file1.pdf --pages file1.pdf 1-5 file2.pdf 15-11 -- outfile.pdf</option>
</programlisting>
If, for some reason, you wanted to take the first page of an
encrypted file called <filename>encrypted.pdf</filename> with
password <literal>pass</literal> and repeat it twice in an output
file, and if you wanted to drop metadata (like page numbers and
outlines) but preserve encryption, you would use
<programlisting><command>qpdf</command> <option>--empty --copy-encryption=encrypted.pdf --encryption-file-password=pass
--pages encrypted.pdf --password=pass 1 ./encrypted.pdf --password=pass 1 --
outfile.pdf</option>
</programlisting>
Note that we had to specify the password all three times because
giving a password as <option>--encryption-file-password</option>
doesn't count for page selection, and as far as qpdf is concerned,
<filename>encrypted.pdf</filename> and
<filename>./encrypted.pdf</filename> are separate files. These
are all corner cases that most users should hopefully never have
to be bothered with.
</para>
</sect1>
<sect1 id="ref.advanced-transformation">
<title>Advanced Transformation Options</title>
<para>
These transformation options control fine points of how qpdf
creates the output file. Mostly these are of use only to people
who are very familiar with the PDF file format or who are PDF
developers. The following options are available:
<variablelist>
<varlistentry>
<term><option>--stream-data=<replaceable>option</replaceable></option></term>
<listitem>
<para>
Controls transformation of stream data. The value of
<option><replaceable>option</replaceable></option> may be one
of the following:
<itemizedlist>
<listitem>
<para>
<option>compress</option>: recompress stream data when
possible (default)
</para>
</listitem>
<listitem>
<para>
<option>preserve</option>: leave all stream data as is
</para>
</listitem>
<listitem>
<para>
<option>uncompress</option>: uncompress stream data when
possible
</para>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--normalize-content=[yn]</option></term>
<listitem>
<para>
Enables or disables normalization of content streams.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--suppress-recovery</option></term>
<listitem>
<para>
Prevents qpdf from attempting to recover damaged files.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--object-streams=<replaceable>mode</replaceable></option></term>
<listitem>
<para>
Controls handling of object streams. The value of
<option><replaceable>mode</replaceable></option> may be one of
the following:
<itemizedlist>
<listitem>
<para>
<option>preserve</option>: preserve original object streams
(default)
</para>
</listitem>
<listitem>
<para>
<option>disable</option>: don't write any object streams
</para>
</listitem>
<listitem>
<para>
<option>generate</option>: use object streams wherever
possible
</para>
</listitem>
</itemizedlist>
</para>
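<para>
For example, the following sketch (with placeholder file names)
combines this option with <option>--stream-data</option>:
<programlisting># Use object streams wherever possible and recompress stream data,
# which typically yields a smaller file.
qpdf --object-streams=generate --stream-data=compress in.pdf out.pdf

# Write no object streams and uncompress stream data where possible,
# which makes the output easier to inspect.
qpdf --object-streams=disable --stream-data=uncompress in.pdf out.pdf
</programlisting>
</para>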
</listitem>
</varlistentry>
<varlistentry>
<term><option>--ignore-xref-streams</option></term>
<listitem>
<para>
Tells qpdf to ignore any cross-reference streams.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--qdf</option></term>
<listitem>
<para>
Turns on QDF mode. For additional information on QDF, please
see <xref linkend="ref.qdf"/>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--min-version=<replaceable>version</replaceable></option></term>
<listitem>
<para>
Forces the PDF version of the output file to be at least
<replaceable>version</replaceable>. In other words, if the
input file has a lower version than the specified version, the
specified version will be used. If the input file has a
higher version, the input file's original version will be
used. It is seldom necessary to use this option since qpdf
will automatically increase the version as needed when adding
features that require newer PDF readers.
</para>
<para>
The version number may be expressed in the form
<replaceable>major.minor.extension-level</replaceable>, in
which case the version is interpreted as
<replaceable>major.minor</replaceable> at extension level
<replaceable>extension-level</replaceable>. For example,
version <literal>1.7.8</literal> represents version 1.7 at
extension level 8. Note that qpdf performs only minimal syntax
checking on the version given on the command line.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--force-version=<replaceable>version</replaceable></option></term>
<listitem>
<para>
This option forces the PDF version to be the exact version
specified <emphasis>even when the file may have content that
is not supported in that version</emphasis>. The version
number is interpreted in the same way as with
<option>--min-version</option> so that extension levels can be
set. In some cases, forcing the output file's PDF version to
be lower than that of the input file will cause qpdf to
disable certain features of the document. Specifically,
256-bit keys are disabled if the version is less than 1.7 with
extension level 8 (except that R5 is disabled only if the
version is less than 1.7 with extension level 3), AES
encryption is disabled if the version is less than 1.6,
cleartext metadata and object streams are disabled if the
version is less than 1.5, 128-bit encryption keys are disabled
if the version is less than 1.4, and all encryption is disabled
if the version is less than 1.3. Even with these precautions,
qpdf won't be
able to do things like eliminate use of newer image
compression schemes, transparency groups, or other features
that may have been added in more recent versions of PDF.
</para>
<para>
As a general rule, with the exception of big structural things
like the use of object streams or AES encryption, PDF viewers
are supposed to ignore features in files that they don't
support from newer versions. This means that forcing the
version to a lower version may make it possible to open your
PDF file with an older viewer, though bear in mind that some
of the original document's functionality may be lost.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
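<para>
As a hedged illustration of the two version options described
above, the following commands sketch typical usage. The file names
<filename>in.pdf</filename> and <filename>out.pdf</filename> are
placeholders chosen for this example. The first command asks for
an output version of at least 1.7 at extension level 8; the second
forces the output version to exactly 1.4.
<programlisting>qpdf --min-version=1.7.8 in.pdf out.pdf
qpdf --force-version=1.4 in.pdf out.pdf
</programlisting>
Applying the thresholds listed under
<option>--force-version</option>, the second command would be
expected to disable AES encryption (1.4 is less than 1.6) as well
as cleartext metadata and object streams (1.4 is less than 1.5),
while 128-bit encryption keys would remain available because 1.4
is not less than 1.4.
</para>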
<para>
By default, when a stream is encoded using non-lossy filters that
qpdf understands and is not already compressed using a good
compression scheme, qpdf will uncompress and recompress streams.
Assuming proper filter implementations, this is safe and generally
results in smaller files. This behavior may also be explicitly
requested with <option>--stream-data=compress</option>.
</para>
<para>
When <option>--stream-data=preserve</option> is specified, qpdf
will never attempt to change the filtering of any stream data.
</para>
<para>
When <option>--stream-data=uncompress</option> is specified, qpdf
will attempt to remove any non-lossy filters that it supports.
This includes <literal>/FlateDecode</literal>,
<literal>/LZWDecode</literal>, <literal>/ASCII85Decode</literal>,
and <literal>/ASCIIHexDecode</literal>. This can be very useful
for inspecting the contents of various streams.
</para>
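<para>
For example, assuming a placeholder input file named
<filename>in.pdf</filename>, a command along the following lines
would write a copy in which supported non-lossy filters have been
removed, so that the stream contents can be read in a text editor:
<programlisting>qpdf --stream-data=uncompress in.pdf uncompressed.pdf
</programlisting>
Substituting <option>preserve</option> for
<option>uncompress</option> in the same command would instead
leave the filtering of every stream unchanged.
</para>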
<para>
When <option>--normalize-content=y</option> is specified, qpdf
will attempt to normalize whitespace and newlines in page content
streams. This is generally safe but could, in some cases, cause
damage to the content streams. This option is intended for people
who wish to study PDF content streams or to debug PDF content.
You should not use this for “production” PDF files.
</para>
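<para>
As a sketch of how this might be used when studying a file, the
command below combines content normalization with uncompressed
stream data so that the normalized content streams can be read
directly. The file names are placeholders, and the output is
meant for inspection only:
<programlisting>qpdf --stream-data=uncompress --normalize-content=y in.pdf inspect.pdf
</programlisting>
</para>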
<para>
Ordinarily, qpdf will attempt to recover from certain types of
errors in PDF files. These include errors in the cross-reference
table, certain types of object numbering errors, and certain types
of stream length errors. Sometimes, qpdf may think it has
recovered when it actually has not, so files produced after
recovery should be checked with care, as some data loss is
possible. The
<option>--suppress-recovery</option> option prevents qpdf from
attempting recovery. In this case, it will fail on the first
error that it encounters.
</para>
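<para>
One possible use of this option, shown here with placeholder file
names, is to find out whether a file is damaged at all: with
recovery suppressed, qpdf stops at the first error instead of
attempting a repair.
<programlisting>qpdf --suppress-recovery in.pdf out.pdf
</programlisting>
</para>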
<para>
Object streams, also known as compressed objects, were introduced
into the PDF specification at version 1.5, corresponding to
Acrobat 6. Some older PDF viewers may not support files with
object streams. qpdf can be used to transform files with object
streams to files without object streams or vice versa. As
mentioned above, there are three object stream modes:
<option>preserve</option>, <option>disable</option>, and
<option>generate</option>.
</para>
<para>
In <option>preserve</option> mode, the relationship between objects
and the streams that contain them is preserved from the original file.
In <option>disable</option> mode, all objects are written as
regular, uncompressed objects. The resulting file should be
readable by older PDF viewers. (Of course, the content of the
files may include features not supported by older viewers, but at
least the structure will be supported.) In
<option>generate</option> mode, qpdf will create its own object
streams. This will usually result in more compact PDF files,
though they may not be readable by older viewers. In this mode,
qpdf will also make sure the PDF version number in the header is
at least 1.5.
</para>
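<para>
The two non-default modes can be sketched as follows, again with
placeholder file names. The first command writes every object as
a regular, uncompressed object for the benefit of older viewers;
the second lets qpdf create its own object streams, in which case
the header version of the output will be at least 1.5.
<programlisting>qpdf --object-streams=disable in.pdf no-object-streams.pdf
qpdf --object-streams=generate in.pdf with-object-streams.pdf
</programlisting>
</para>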
<para>
Ordinarily, qpdf reads cross-reference streams when they are
present in a PDF file. If <option>--ignore-xref-streams</option>
is specified, qpdf will ignore any cross-reference streams for
hybrid PDF files. The purpose of hybrid files is to make some
content available to viewers that are not aware of cross-reference
streams. It is almost never desirable to ignore them. The only
time when you might want to use this feature is if you are testing
creation of hybrid PDF files and wish to see how a PDF consumer
that doesn't understand object and cross-reference streams would
interpret such a file.
</para>
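<para>
A minimal sketch of that testing scenario, assuming a hybrid file
named <filename>hybrid.pdf</filename> (a placeholder), might look
like this. The output reflects only what qpdf finds without
consulting the cross-reference streams:
<programlisting>qpdf --ignore-xref-streams hybrid.pdf old-reader-view.pdf
</programlisting>
</para>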
<para>
The <option>--qdf</option> flag turns on QDF mode, which changes
some of the defaults described above. Specifically, in QDF mode,
by default, stream data is uncompressed, content streams are
normalized, and encryption is removed. These defaults can still
be overridden by specifying the appropriate options as described
above. Additionally, in QDF mode, stream lengths are stored as
indirect objects, objects are laid out in a less efficient but
more readable fashion, and the documents are interspersed with
comments that make it easier for the user to find things and also
make it possible for <command>fix-qdf</command> to work properly.
QDF mode is intended for people, mostly developers, who wish to
inspect or modify PDF files in a text editor. For details, please
see <xref linkend="ref.qdf"/>.
</para>
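<para>
As an illustration with placeholder file names, the first command
below produces a QDF file using the QDF defaults described above.
The second shows one way a QDF default might be overridden, here
by turning content stream normalization back off.
<programlisting>qpdf --qdf in.pdf out.qdf
qpdf --qdf --normalize-content=n in.pdf out.qdf
</programlisting>
</para>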
</sect1>
<sect1 id="ref.testing-options">
<title>Testing, Inspection, and Debugging Options</title>
<para>
These options can be useful for digging into PDF files or for use
in automated test suites for software that uses the qpdf library.
When any of the options in this section are specified, no output
file should be given. The following options are available:
<variablelist>
<varlistentry>
<term><option>--static-id</option></term>
<listitem>
<para>
Causes generation of a fixed value for <literal>/ID</literal>. This is intended
for testing only. Never use it for production files.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--static-aes-iv</option></term>
<listitem>
<para>
Causes use of a static initialization vector for AES-CBC.
This is intended for testing only so that output files can be
reproducible. Never use it for production files. This option
in particular is not secure since it significantly weakens the