# PyPDF2

Basic Example with Reading, Writing, and Copying PDFs

## PdfFileWriter Class

class PyPDF2.PdfFileWriter<br><br>
This class supports writing PDF files out, given pages produced by another class (typically PdfFileReader).

addAttachment(fname, fdata)<br><br>
Embed a file inside the PDF.

Parameters:<br><br>
fname (str) – The filename to display.<br>
fdata (str) – The data in the file.<br>
Reference: https://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf<br>
Section 7.11.3


```python
from PyPDF2 import PdfFileReader, PdfFileWriter
import sys

# Open the source PDF in binary mode; the reader needs this stream to stay open.
pdf_file = open(file='./Python- OCR for PDF or Compare textract, pytesseract, and pyocr.pdf', mode='rb')
pdf_reader = PdfFileReader(stream=pdf_file, strict=True, warndest=sys.stderr, overwriteWarnings=True)
pdf_writer = PdfFileWriter()
```




addBlankPage(width=None, height=None)<br><br>
Appends a blank page to this PDF file and returns it. If no page size is specified, use the size of the last page.

Parameters:<br><br>
width (float) – The width of the new page expressed in default user space units.<br>
height (float) – The height of the new page expressed in default user space units.<br>
Returns: the newly appended page<br>
Return type: PageObject<br>
Raises PageSizeNotDefinedError: if width and height are not defined and previous page does not exist.
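
Page sizes are given in default user space units, which are 1/72 inch each unless a PDF overrides them, so a small conversion helper can make `addBlankPage` calls easier to read. The following is a minimal sketch, not part of PyPDF2: `mm_to_points` and `add_blank_a4` are hypothetical helper names, and `writer` is assumed to be a `PdfFileWriter` instance.

```python
POINTS_PER_INCH = 72.0  # one default user space unit is 1/72 inch
MM_PER_INCH = 25.4


def mm_to_points(mm):
    """Convert millimetres to default user space units (points)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def add_blank_a4(writer):
    """Append a blank A4 page (210 mm x 297 mm) and return it.

    Hypothetical helper: `writer` is assumed to be a PdfFileWriter, and the
    only library call used is the documented addBlankPage(width, height).
    """
    return writer.addBlankPage(width=mm_to_points(210), height=mm_to_points(297))
```

With the writer from the earlier cell, this would be called as `add_blank_a4(pdf_writer)`.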

addBookmark(title, pagenum, parent=None, color=None, bold=False, italic=False, fit='/Fit', *args)<br><br>
Add a bookmark to this PDF file.<br>

Parameters:<br><br>
title (str) – Title to use for this bookmark.<br>
pagenum (int) – Page number this bookmark will point to.<br>
parent – A reference to a parent bookmark to create nested bookmarks.<br>
color (tuple) – Color of the bookmark as a red, green, blue tuple from 0.0 to 1.0<br>
bold (bool) – Bookmark is bold<br>
italic (bool) – Bookmark is italic<br>
fit (str) – The fit of the destination page. See addLink() for details.
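
The `parent` parameter can be used to build a nested outline. The sketch below assumes that `addBookmark` returns a reference that can be passed back as `parent`, and that `pagenum` is a zero-based page index; both are assumptions not stated above. `add_chapter_bookmarks` and the shape of `chapters` are hypothetical.

```python
def add_chapter_bookmarks(writer, chapters):
    """Add one bold top-level bookmark per chapter and nest its sections.

    `chapters` is a list of (title, pagenum, sections) tuples, where
    `sections` is a list of (title, pagenum) pairs. `writer` is assumed
    to be a PdfFileWriter that already contains the target pages.
    """
    for title, pagenum, sections in chapters:
        # Assumption: the return value is a reference usable as `parent`.
        chapter_ref = writer.addBookmark(title, pagenum, bold=True)
        for section_title, section_page in sections:
            writer.addBookmark(section_title, section_page, parent=chapter_ref)
```

A call might look like `add_chapter_bookmarks(pdf_writer, [("Intro", 0, [("Scope", 1)])])`, made after the pages have been added to the writer.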

addJS(javascript)<br><br>
Add JavaScript that will run when this PDF is opened.

Parameters:<br><br>
javascript (str) – Your JavaScript code.

addLink(pagenum, pagedest, rect, border=None, fit='/Fit', *args)<br><br>
Add an internal link from a rectangular area to the specified page.

Parameters:<br><br>
pagenum (int) – index of the page on which to place the link.<br>
pagedest (int) – index of the page to which the link should go.<br>
rect – RectangleObject or array of four integers specifying the clickable rectangular area <br>[xLL, yLL, xUR, yUR], or string in the form "[ xLL yLL xUR yUR ]".<br>
border – if provided, an array describing border-drawing properties. See the PDF spec for details. No border will be drawn if this argument is omitted.<br>
fit (str) – Page fit or ‘zoom’ option (see below). Additional arguments may need to be supplied. Passing None will be read as a null value for that coordinate.

Valid zoom arguments (see Table 8.2 of the PDF 1.7 reference for details):<br>
/Fit	No additional arguments<br>
/XYZ	[left] [top] [zoomFactor]<br>
/FitH	[top]<br>
/FitV	[left]<br>
/FitR	[left] [bottom] [right] [top]<br>
/FitB	No additional arguments<br>
/FitBH	[top]<br>
/FitBV	[left]<br>
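
Since each fit option expects a specific number of extra arguments, it can help to check them before calling `addLink`. This is a minimal sketch built only from the table above; `FIT_ARG_COUNTS`, `check_fit_args`, and `add_checked_link` are hypothetical names, and `writer` is assumed to be a `PdfFileWriter`.

```python
# Number of additional arguments each fit option expects (from the table above).
FIT_ARG_COUNTS = {
    "/Fit": 0,
    "/XYZ": 3,
    "/FitH": 1,
    "/FitV": 1,
    "/FitR": 4,
    "/FitB": 0,
    "/FitBH": 1,
    "/FitBV": 1,
}


def check_fit_args(fit, *args):
    """Return (fit, *args) if the argument count matches, else raise ValueError."""
    if fit not in FIT_ARG_COUNTS:
        raise ValueError("unknown fit option: %r" % (fit,))
    expected = FIT_ARG_COUNTS[fit]
    if len(args) != expected:
        raise ValueError("%s expects %d argument(s), got %d" % (fit, expected, len(args)))
    return (fit,) + args


def add_checked_link(writer, pagenum, pagedest, rect, fit="/Fit", *args):
    """Validate the fit arguments, then add a borderless internal link."""
    check_fit_args(fit, *args)
    # border=None: no border is drawn around the clickable rectangle.
    writer.addLink(pagenum, pagedest, rect, None, fit, *args)
```

For example, `add_checked_link(pdf_writer, 0, 3, [50, 700, 200, 720], "/FitH", 800)` would link a rectangle on the first page to the fourth page, while omitting the `800` would raise `ValueError` before the writer is touched.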


addMetadata(infos)<br><br>
Add custom metadata to the output.<br>

Parameters:<br><br>
infos (dict) – a Python dictionary where each key is a field and each value is your new metadata.
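
The standard entries of a PDF document information dictionary are PDF names such as `/Title` and `/Author`. The sketch below assumes that `addMetadata` expects keys in that slash-prefixed form and string values; `build_info` and `add_info` are hypothetical helpers, not PyPDF2 functions.

```python
def build_info(fields):
    """Return a copy of `fields` with '/'-prefixed keys and string values."""
    return {
        (key if key.startswith("/") else "/" + key): str(value)
        for key, value in fields.items()
    }


def add_info(writer, fields):
    """Normalize `fields` and pass them to the writer's addMetadata.

    `writer` is assumed to be a PdfFileWriter.
    """
    writer.addMetadata(build_info(fields))
```

For instance, `add_info(pdf_writer, {"Title": "Quarterly Report", "Author": "Ada"})` would pass `{"/Title": "Quarterly Report", "/Author": "Ada"}` to `addMetadata`.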

addPage(page)<br><br>
Adds a page to this PDF file. The page is usually acquired from a PdfFileReader instance.<br>

Parameters:<br><br>
page (PageObject) – The page to add to the document. Should be an instance of PageObject

appendPagesFromReader(reader, after_page_append=None)<br><br>
Copy pages from reader to writer. Includes an optional callback parameter which is invoked after pages are appended to the writer.<br><br>

Parameters:<br><br>
reader – a PdfFileReader object from which to copy page annotations to this writer object. The writer’s annots will then be updated.<br>
after_page_append (function) – Callback function that is invoked after each page is appended to the writer.<br><br>
Callback signature:<br>
writer_pageref (PDF page reference) – Reference to the page appended to the writer.
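
As an illustration of the callback, the sketch below counts the pages as they are appended. `copy_pages_counting` is a hypothetical helper; `writer` and `reader` are assumed to be `PdfFileWriter` and `PdfFileReader` instances.

```python
def copy_pages_counting(writer, reader):
    """Copy every page from `reader` into `writer` and return the page count."""
    appended = []

    def on_page_appended(writer_pageref):
        # Invoked once per page with a reference to the page just appended.
        appended.append(writer_pageref)

    writer.appendPagesFromReader(reader, after_page_append=on_page_appended)
    return len(appended)
```

The same callback shape could be used for per-page work such as logging progress on large documents.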

cloneDocumentFromReader(reader, after_page_append=None)<br><br>
Create a copy (clone) of a document from a PDF file reader<br><br>

Parameters:<br><br>
reader – PDF file reader instance from which the clone should be created.<br>
after_page_append (function) – Callback function that is invoked after each page is appended to the writer. Its signature includes a reference to the appended page (delegates to appendPagesFromReader).<br><br>
Callback signature:<br>
writer_pageref (PDF page reference) – Reference to the page just appended to the document.

cloneReaderDocumentRoot(reader)<br><br>
Copy the reader document root to the writer.<br>

Parameters:<br><br>
reader – PdfFileReader from which the document root should be copied.

encrypt(user_pwd, owner_pwd=None, use_128bit=True)<br><br>
Encrypt this PDF file with the PDF Standard encryption handler.<br><br>

Parameters:<br><br>
user_pwd (str) – The “user password”, which allows for opening and reading the PDF file with the restrictions provided.<br>
owner_pwd (str) – The “owner password”, which allows for opening the PDF file without any restrictions. By default, the owner password is the same as the user password.<br>
use_128bit (bool) – Whether to use 128-bit encryption. When false, 40-bit encryption is used. By default, this flag is on.
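
Encryption has to be configured before the file is written out. The following sketch shows that ordering; `write_encrypted` is a hypothetical helper, and `writer` is assumed to be a `PdfFileWriter` that already holds its pages.

```python
def write_encrypted(writer, stream, user_pwd, owner_pwd=None):
    """Encrypt the writer's output with 128-bit encryption, then write it.

    If `owner_pwd` is None, the owner password defaults to the user password.
    """
    # Set the encryption parameters first, then serialize to the stream.
    writer.encrypt(user_pwd, owner_pwd=owner_pwd, use_128bit=True)
    writer.write(stream)
```

A typical call would pass a file opened in binary write mode, for example `write_encrypted(pdf_writer, open("out.pdf", "wb"), "secret")`, where the output filename is only a placeholder.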

insertBlankPage(width=None, height=None, index=0)<br><br>
Inserts a blank page into this PDF file and returns it. If no page size is specified, use the size of the last page.<br><br>

Parameters:<br><br>
width (float) – The width of the new page expressed in default user space units.<br>
height (float) – The height of the new page expressed in default user space units.<br>
index (int) – Position to add the page.<br><br>
Returns: the newly appended page<br>
Return type: PageObject<br>
Raises PageSizeNotDefinedError: if width and height are not defined and previous page does not exist.

insertPage(page, index=0)<br><br>
Insert a page in this PDF file. The page is usually acquired from a PdfFileReader instance.<br><br>

Parameters:<br><br>
page (PageObject) – The page to add to the document. This argument should be an instance of PageObject.<br>
index (int) – Position at which the page will be inserted.
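
When several pages are inserted at the front, the index needs to advance to keep them in order. The sketch below assumes `index` behaves like the index of Python's `list.insert`; `prepend_pages` is a hypothetical helper, and `writer` is assumed to be a `PdfFileWriter`.

```python
def prepend_pages(writer, pages):
    """Insert `pages` at the front of the document, keeping their order."""
    for offset, page in enumerate(pages):
        # Inserting at 0, 1, 2, ... keeps the given order; inserting every
        # page at index 0 would place them in reverse.
        writer.insertPage(page, index=offset)
```

This could be used to put a cover page and a table of contents in front of pages that were already added with `addPage`.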

removeImages(ignoreByteStringObject=False)<br><br>
Removes images from this output.<br>

Parameters:<br><br>
ignoreByteStringObject (bool) – optional parameter to ignore ByteString Objects.

removeLinks()<br><br>
Removes links and annotations from this output.

updatePageFormFieldValues(page, fields)<br><br>
Update the form field values for a given page from a fields dictionary. Copy field texts and values from fields to page.<br><br>

Parameters:<br><br>
page – Page reference from PDF writer where the annotations and field data will be updated.<br>
fields – a Python dictionary of field names (/T) and text values (/V)
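
A short sketch of filling a form page follows. `fill_text_fields` is a hypothetical helper that converts values to text before passing them on; `writer` is assumed to be a `PdfFileWriter` and `page` a page that has already been added to it. The field names in the usage note are made up for illustration.

```python
def fill_text_fields(writer, page, values):
    """Write `values` into the form fields of `page` and return what was sent.

    Keys are field names (/T); values are converted to text (/V).
    """
    fields = {name: str(value) for name, value in values.items()}
    writer.updatePageFormFieldValues(page, fields)
    return fields
```

For example, `fill_text_fields(pdf_writer, page, {"name": "Ada", "age": 36})` would send `{"name": "Ada", "age": "36"}` to the writer.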

write(stream)<br><br>
Writes the collection of pages added to this object out as a PDF file.

Parameters:<br><br>
stream – An object to write the file to. The object must support the write method and the tell method, similar to a file object.
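
Because the stream only needs `write` and `tell`, an in-memory buffer works as well as a file on disk. The sketch below uses the standard library's `io.BytesIO`; `writer_to_bytes` is a hypothetical helper, and `writer` is assumed to be a `PdfFileWriter`.

```python
import io


def writer_to_bytes(writer):
    """Serialize the writer's pages into memory and return the raw PDF bytes."""
    buffer = io.BytesIO()  # supports both write() and tell()
    writer.write(buffer)
    return buffer.getvalue()
```

This can be handy when the PDF is sent over a network or stored in a database instead of being saved to a file.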