# PDF Files - Pages Selection

The libraries required for this code are:

- **PyPDF2** (`pip install PyPDF2`): reads and manipulates PDF files
- **glob**: part of the Python standard library (no installation needed); finds all the files in a folder that match a pattern
- **ipywidgets** (`pip install ipywidgets`): provides interactive widgets, used here to build a simple graphical user interface inside Jupyter Notebook

In [None]:
import os
import glob
from PyPDF2 import PdfReader, PdfWriter
import ipywidgets as widgets
from IPython.display import display, clear_output

### Function 1 - select_pages
The following function selects specific pages from a PDF file and creates a new PDF containing only those pages. Page numbers start at 1, and numbers outside the document are skipped.

In [None]:
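def select_pages(input_file, output_file, pages_to_select):
    """Copy the requested 1-based pages of input_file into a new PDF, output_file."""
    with open(input_file, "rb") as file:
        pdf_reader = PdfReader(file)
        pdf_writer = PdfWriter()
        total_pages = len(pdf_reader.pages)

        for page_num in pages_to_select:
            # Page numbers are 1-based; numbers outside the document are skipped
            if 1 <= page_num <= total_pages:
                pdf_writer.add_page(pdf_reader.pages[page_num - 1])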

        with open(output_file, "wb") as output:
            pdf_writer.write(output)
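
The page filter inside `select_pages` can be tried on its own, without a PDF at hand. The block below is a minimal sketch of that step; `valid_page_indices` is a hypothetical helper introduced here for illustration and is not part of PyPDF2 or of the notebook.

```python
def valid_page_indices(pages_to_select, total_pages):
    """Return 0-based indices for the 1-based page numbers that exist.

    Hypothetical helper for illustration; it is not part of PyPDF2.
    """
    return [page_num - 1 for page_num in pages_to_select
            if 1 <= page_num <= total_pages]


# A 4-page document: pages 5, 0 and 7 do not exist and are dropped
print(valid_page_indices([1, 3, 5, 0, 7], total_pages=4))  # prints [0, 2]
```

Because out-of-range numbers are dropped without a message, a request that matches no page at all would presumably produce an output PDF with no pages.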

### Function 2 - process_folder
The following function finds all the PDF files in a folder and processes each one with `select_pages`, saving the result under the same file name in the output folder.

In [None]:
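def process_folder(input_folder, output_folder, pages_to_select):
    """Run select_pages on every PDF in input_folder, writing to output_folder."""
    # Create the output folder if it does not exist yet
    os.makedirs(output_folder, exist_ok=True)
    # Collect every file in the input folder whose name ends in .pdf
    pdf_files = glob.glob(os.path.join(input_folder, "*.pdf"))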

    for pdf_file in pdf_files:
        file_name = os.path.basename(pdf_file)
        output_file = os.path.join(output_folder, file_name)

        select_pages(pdf_file, output_file, pages_to_select)

        print(f"Selected pages from '{file_name}' exported to '{output_file}'.")
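
The file matching and path building in `process_folder` can be checked without touching any real PDF. The following is a minimal sketch using only the standard library; `plan_outputs` is a hypothetical helper, and the guard against identical folders is an added assumption: since each output keeps its input's file name, pointing both paths at the same folder could overwrite a file while it is still being read.

```python
import glob
import os
import tempfile


def plan_outputs(input_folder, output_folder):
    """Pair each PDF in input_folder with its destination path.

    Hypothetical helper for illustration. It refuses to run when both
    folders are the same, to avoid overwriting the files being read.
    """
    if os.path.abspath(input_folder) == os.path.abspath(output_folder):
        raise ValueError("Input and output folders must be different.")
    # sorted() gives a stable order; glob itself does not guarantee one
    pdf_files = sorted(glob.glob(os.path.join(input_folder, "*.pdf")))
    return [(pdf_file, os.path.join(output_folder, os.path.basename(pdf_file)))
            for pdf_file in pdf_files]


# Demonstrate on a throwaway folder holding empty placeholder files
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.pdf", "a.pdf", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    for source, target in plan_outputs(folder, os.path.join(folder, "out")):
        print(os.path.basename(source), "->", target)
```

Only `a.pdf` and `b.pdf` are listed; `notes.txt` does not match the `*.pdf` pattern.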

### GUI Widgets
The widget cell in this section creates three text boxes and a button, displayed below the cell. Enter the input path (the folder containing the PDF files to process), the output path (the folder where the processed PDF files are stored) and the page numbers to keep, separated by commas (for example `1, 3, 5`). Clicking **Process Files** runs `process_folder` with these values.
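
The comma-separated page list described above has to be turned into integers before it reaches `process_folder`. The block below is one possible sketch of that parsing step with basic input checking; `parse_pages` is a hypothetical helper, not something defined in this notebook.

```python
def parse_pages(text):
    """Turn a string such as "1, 3, 5" into a list of positive integers.

    Hypothetical helper for illustration. Empty entries (for example from a
    trailing comma) are ignored; anything else that is not a positive whole
    number raises ValueError with a readable message.
    """
    pages = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        if not token.isdecimal() or int(token) == 0:
            raise ValueError(f"Invalid page number: {token!r}")
        pages.append(int(token))
    return pages


print(parse_pages("1, 3, 5"))  # prints [1, 3, 5]
print(parse_pages("2,,4,"))    # prints [2, 4]
```

Raising an error for unusable entries is a design choice made for this sketch; it surfaces typing mistakes in the text box early instead of silently selecting the wrong pages.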

In [1]:
# Create widgets for user input
input_folder_widget = widgets.Text(
    value="path/to/input/folder",
    description="Input Folder:",
)
output_folder_widget = widgets.Text(
    value="path/to/output/folder",
    description="Output Folder:",
)
pages_to_select_widget = widgets.Text(
    value="1, 3, 5",
    description="Pages to Select:",
)

# Define a function to trigger the file processing
def process_files(button):
    input_folder = input_folder_widget.value.strip()
    output_folder = output_folder_widget.value.strip()
    # Split on commas and skip empty entries (for example after a trailing comma)
    pages_to_select = [int(page) for page in pages_to_select_widget.value.split(",") if page.strip()]

    process_folder(input_folder, output_folder, pages_to_select)

# Create a button widget to trigger the file processing
process_button = widgets.Button(description="Process Files")
process_button.on_click(process_files)

# Display the widgets and button
display(input_folder_widget)
display(output_folder_widget)
display(pages_to_select_widget)
display(process_button)

Selected pages from 'Atomic Pd@GCN.pdf' exported to 'C:\Users\GadolS01\OneDrive - Johnson Matthey\Solar2Chem - Electrochemical Transformations\Important Papers\Atomic Pd@GCN.pdf'.
Selected pages from 'B and P co-doped EGCN.pdf' exported to 'C:\Users\GadolS01\OneDrive - Johnson Matthey\Solar2Chem - Electrochemical Transformations\Important Papers\B and P co-doped EGCN.pdf'.
Selected pages from 'B-doped EGCN Thermal Air.pdf' exported to 'C:\Users\GadolS01\OneDrive - Johnson Matthey\Solar2Chem - Electrochemical Transformations\Important Papers\B-doped EGCN Thermal Air.pdf'.
Selected pages from 'B-doped EGCN.pdf' exported to 'C:\Users\GadolS01\OneDrive - Johnson Matthey\Solar2Chem - Electrochemical Transformations\Important Papers\B-doped EGCN.pdf'.
Selected pages from 'Colloidal Au Catalyst Preparation Selective Removal of PVP from Active Au sites.pdf' exported to 'C:\Users\GadolS01\OneDrive - Johnson Matthey\Solar2Chem - Electrochemical Transformations\Important Papers\Colloidal Au Catal