# PyPDF PDF Manipulation

Alejandro Ricciardi (Omegapy)  
created date: 01/10/2024   
[GitHub](https://github.com/Omegapy)  

Credit: 
[Control PDF with Python & PyPDF2](https://www.udemy.com/course/control-pdf-with-python-pypdf2) Udemy - Conny Soderholm
The original code was substantially modified to meet my requirements and to add functionality to the program.

Project Description:  
Using the pypdf library to manipulate PDF files:
- How to work with pages
- How to scale, rotate, crop, clip, and watermark pages
- How to split and join pages
- How to read a PDF into memory instead of having to write to disk

The [PageObject Class](https://pypdf.readthedocs.io/en/stable/modules/PageObject.html?highlight=add_transformation#the-pageobject-class) represents a single page within a PDF file. 

Typically this object is created by accessing a single page with the ```pdf_reader.get_page()``` method of the PdfReader class, but it is also possible to create an empty page with the ```create_blank_page()``` static method, or to get all the pages through the ```pdf_reader.pages``` property.


Project map:
- Transformation Matrix -```page.add_transformation( (a,b,c,d,e,f) )```-
    - Shear Transformation x Axis To The Right
    - Scaling Pages -```page.add_transformation( (scale,0,0,scale,0,0) )```-
- Rotated Page -```page.rotate(90)```-


In [1]:
from pypdf import PdfReader, PdfWriter

### Transformation Matrix

[The Transformation Class](https://pypdf.readthedocs.io/en/stable/modules/Transformation.html)

Represents a 2D transformation.

The transformation between two coordinate systems is represented by a 3-by-3 transformation matrix with the following form:

$$
\begin{bmatrix} a & b & \mathbf{0} \\ c & d & \mathbf{0} \\ e & f & \mathbf{1} \end{bmatrix}
$$

Because a transformation matrix has only six elements that can be changed,  
it is usually specified in PDF as the six-element array ```[ a b c d e f ]```.  
In pypdf, the same six values are passed as a tuple:
```page.add_transformation( (a,b,c,d,e,f) )```

Coordinate transformations are expressed as matrix multiplications:

                           
$$
\begin{bmatrix} x' & y' & 1 \end{bmatrix} =
\begin{bmatrix} x & y & 1 \end{bmatrix} \times
\begin{bmatrix} a & b & \mathbf{0} \\ c & d & \mathbf{0} \\ e & f & \mathbf{1} \end{bmatrix}
$$
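
Multiplying the row vector by the matrix gives x′ = a·x + c·y + e and y′ = b·x + d·y + f. The short sketch below applies that formula in plain Python; the helper name ```apply_matrix``` and the sample matrices are hypothetical and only serve to show how the six values act on a point.

```python
def apply_matrix(point, ctm):
    """Map (x, y) through the six-element matrix (a, b, c, d, e, f)."""
    x, y = point
    a, b, c, d, e, f = ctm
    # Row vector [x y 1] times the 3-by-3 matrix shown above.
    return (a * x + c * y + e, b * x + d * y + f)


identity = (1, 0, 0, 1, 0, 0)        # leaves every point unchanged
half_scale = (0.5, 0, 0, 0.5, 0, 0)  # shrinks toward the origin
shift = (1, 0, 0, 1, 10, 20)         # moves right by 10 and up by 20

point = (100, 200)
unchanged = apply_matrix(point, identity)
scaled = apply_matrix(point, half_scale)
shifted = apply_matrix(point, shift)
```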


<p></p>
<img src="pics/2D Matrix Transformation.png" alt="Alternative text" />


#### Shear Transformation x Axis To The Right

In [28]:
pdf_reader = PdfReader("docs/docs_pdf/Section 4 Manipulating Pages/camera.pdf")

# Create a writer object
pdf_writer_x = PdfWriter()

# Get all the pages
pages = pdf_reader.pages

# Shear factor: each point moves to the right by shear * y
shear = 1

for page in pages:
    # Matrix order is (a, b, c, d, e, f): x' = a*x + c*y + e, y' = b*x + d*y + f
    page.add_transformation( (1,0,shear,1,0,0) ) # Apply a transformation matrix to the page.
    pdf_writer_x.add_page(page)

# Save the new PDF to disk
with open("Manipulated PDFs/camera_sheer_xright.pdf", "wb") as f:
    pdf_writer_x.write(f)
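
To see what a shear along the x axis does, it can help to push the corners of a unit square through the matrix by hand. The sketch below is plain Python and assumes the PDF value order (a, b, c, d, e, f); the helper names are hypothetical. It also computes the determinant a·d − b·c, since a matrix whose determinant is zero flattens the page content onto a line.

```python
def apply_matrix(point, ctm):
    """Map (x, y) through the six-element matrix (a, b, c, d, e, f)."""
    x, y = point
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


def determinant(ctm):
    """Determinant of the 2-by-2 linear part of the matrix."""
    a, b, c, d, _, _ = ctm
    return a * d - b * c


# Shear along x with factor 1: x' = x + y, y' = y.
shear_x = (1, 0, 1, 1, 0, 0)

# Corners of a unit square, counter-clockwise from the origin.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
sheared = [apply_matrix(p, shear_x) for p in corners]

# A matrix with d = 0 and c = 0 has a zero determinant and is not a shear.
singular = (1, 1, 0, 0, 1, 0)
dets = (determinant(shear_x), determinant(singular))
```

The bottom edge stays in place while the top edge moves one unit to the right, which is the lean expected from a shear to the right.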

#### Scale Transformation Using the Transformation Matrix

In [30]:
pdf_reader = PdfReader("docs/docs_pdf/Section 4 Manipulating Pages/camera.pdf")
# Create a writer object
pdf_writer = PdfWriter()

# Get all the pages
pages = pdf_reader.pages

# Scale factor
scale = 0.5

for page in pages:
    page.add_transformation( (scale,0,0,scale,0,0) ) # Apply a transformation matrix to the page.
    pdf_writer.add_page(page)

# Save the new PDF to disk
with open("Manipulated PDFs/camera_T_scaledby_05.pdf", "wb") as f:
    pdf_writer.write(f)

### Rotated Page
```page.rotate()``` rotates a page clockwise in increments of 90 degrees.

In [35]:
pdf_reader = PdfReader("docs/docs_pdf/Section 4 Manipulating Pages/camera.pdf")
# Create a writer object
pdf_writer = PdfWriter()

# Get all the pages
pages = pdf_reader.pages

for page in pages:
    page.rotate(90) # Rotate the page 90 degrees clockwise
    pdf_writer.add_page(page)

# Save the new PDF to disk
with open("Manipulated PDFs/camera_rotated_clockwise_90.pdf", "wb") as f:
    pdf_writer.write(f)

Passing an angle that is not a multiple of 90 raises ```ValueError: Rotation angle must be a multiple of 90```.

### Creating Blank Pages

```pdf_writer.add_blank_page(width=None, height=None)```  
Appends a blank page to this PDF file and returns it. If no page size is specified, the size of the last page is used.

- Parameters:
    - width (float) – The width of the new page expressed in default user space units.
    - height (float) – The height of the new page expressed in default user space units.
- Returns: the newly appended page
- Return type: PageObject
- Raises PageSizeNotDefinedError: if width and height are not defined and the previous page does not exist.

***************************
```pdf_writer.insert_blank_page(width=None, height=None, index=0)```  
Inserts a blank page into this PDF file and returns it. If no page size is specified, the size of the last page is used.

- Parameters:
    - width (float) – The width of the new page expressed in default user space units.
    - height (float) – The height of the new page expressed in default user space units.
    - index (int) – Position to add the page.
- Returns: the newly inserted page
- Return type: PageObject
- Raises PageSizeNotDefinedError: if width and height are not defined and the previous page does not exist.

***************************
https://pythonhosted.org/PyPDF2/PageObject.html#PyPDF2.pdf.PageObject.createBlankPage
```PageObject.create_blank_page(pdf=None, width=None, height=None)``` (static method)  
Returns a new blank page. If width or height is None, it tries to get the page size from the last page of pdf.

- Parameters:
    - pdf – PDF file the page belongs to
    - width (float) – The width of the new page expressed in default user space units.
    - height (float) – The height of the new page expressed in default user space units.
- Returns: the new blank page
- Return type: PageObject
- Raises PageSizeNotDefinedError: if the page size is not given and pdf is None or contains no page


In [ ]:
pdf_reader = PdfReader("docs/docs_pdf/Section 4 Manipulating Pages/camera.pdf")
# Create a writer object
pdf_writer = PdfWriter()

# Get all the pages
pages = pdf_reader.pages

