PdfSnipper

PdfSnipper is a Python package for managing PDF pages and images, and converting between them, in NLP, CV, and similar workflows. It replaces repetitive boilerplate with a single function call per operation.

Installation

To install PdfSnipper, use:

```
pip install -i https://test.pypi.org/simple/ pdf-snip
```

Dependencies

If you face an error involving `poppler-utils`, install Poppler for your platform.

For Google Colab:
```
!apt-get install -y poppler-utils
```
For Ubuntu/Debian:
```
sudo apt install poppler-utils
```
For Windows:
Download the latest Poppler release for Windows. After installing it (for example under Program Files), add Poppler's `bin` folder to the PATH environment variable, for instance from Python:
```
import os
os.environ['PATH'] += os.pathsep + r'C:\path\to\poppler\bin'
```
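Before calling the image-related functions, it can help to confirm that Poppler is actually reachable. The following is a minimal sketch using only the standard library; the helper name `poppler_available` is hypothetical and not part of PdfSnipper, and it assumes that finding Poppler's `pdftoppm` tool on PATH is a good enough signal.

```python
import shutil


def poppler_available(tool: str = "pdftoppm") -> bool:
    """Return True if the given Poppler command-line tool is found on PATH.

    Hypothetical helper, not part of PdfSnipper.
    """
    return shutil.which(tool) is not None


if not poppler_available():
    print("Poppler not found on PATH; install poppler-utils or extend PATH first.")
```

If the check fails on Windows after editing `os.environ['PATH']`, the path added above probably does not point at the folder that contains the Poppler executables.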

Features

1. Remove First N Pages

Removes the first N pages from all PDFs in a folder.

```
remove_first_pages(input_folder: str, output_folder: str, pages_to_remove: int)
```

Arguments

input_folder: Path to the folder containing PDFs.
output_folder: Path to save modified PDFs.
pages_to_remove: Number of pages to remove from the start.

Usage

```
from PDFSNIPPER import remove_first_pages
remove_first_pages('/content/input', '/content/output', 2)
```
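Because the function works on a whole folder, it may be worth checking which files it is likely to pick up before running it. This is a rough sketch with a hypothetical helper, `list_pdfs`; it assumes only files directly inside the folder with a `.pdf` extension (any letter case) count, which may differ from how PdfSnipper selects files.

```python
import tempfile
from pathlib import Path


def list_pdfs(folder: str) -> list[Path]:
    """Return the PDF files directly inside `folder`, sorted by name.

    Hypothetical helper; PdfSnipper's own file selection may differ.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


# Demo on a throwaway folder with two PDFs and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "b.PDF", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [p.name for p in list_pdfs(tmp)]
    print(found)
```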

2. Remove Last N Pages

Removes the last N pages from all PDFs in a folder.

```
remove_last_pages(input_folder: str, output_folder: str, pages_to_remove: int)
```

Arguments

input_folder: Path to the folder containing PDFs.
output_folder: Path to save modified PDFs.
pages_to_remove: Number of pages to remove from the end.

Usage

```
from PDFSNIPPER import remove_last_pages
remove_last_pages('/content/input', '/content/output', 3)
```
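To see which pages should survive a trim from either end, the index arithmetic can be sketched on its own. The helper `kept_after_trim` below is hypothetical and only illustrates the expected result; how PdfSnipper treats a PDF with fewer pages than `pages_to_remove` is not documented here, so returning an empty list in that case is an assumption.

```python
def kept_after_trim(total_pages: int, first: int = 0, last: int = 0) -> list[int]:
    """Return the 0-indexed pages left after dropping `first` pages from the
    start and `last` pages from the end.

    Hypothetical helper; an over-long trim yields an empty list (assumption).
    """
    if first < 0 or last < 0:
        raise ValueError("page counts must be non-negative")
    return list(range(first, max(first, total_pages - last)))


print(kept_after_trim(10, first=2))  # pages left after removing the first 2
print(kept_after_trim(10, last=3))   # pages left after removing the last 3
```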

3. Remove Pages Outside a Specified Range

Keeps only the pages within a specified range [start_page, end_page] inclusive, removing all others.

```
remove_pages_outside_range(input_folder: str, output_folder: str, start_page: int, end_page: int)
```

Arguments

input_folder: Path to the folder containing PDFs.
output_folder: Path to save modified PDFs.
start_page: First page to keep (0-indexed).
end_page: Last page to keep (0-indexed).

Usage

```
from PDFSNIPPER import remove_pages_outside_range
remove_pages_outside_range('/content/input', '/content/output', 2, 5)
```
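Since both bounds are 0-indexed and inclusive, the call above keeps four pages, not three. A small sketch of that arithmetic follows; `kept_in_range` is a hypothetical helper, and clamping `end_page` to the last page of a short PDF is an assumption about behavior, not something the package documents.

```python
def kept_in_range(total_pages: int, start_page: int, end_page: int) -> list[int]:
    """Return the 0-indexed pages inside [start_page, end_page], inclusive.

    Hypothetical helper; clamping to the document length is an assumption.
    """
    if start_page < 0 or end_page < start_page:
        raise ValueError("expected 0 <= start_page <= end_page")
    last = min(end_page, total_pages - 1)
    return list(range(start_page, last + 1))


print(kept_in_range(10, 2, 5))  # the third through sixth pages of a 10-page PDF
```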

4. Save Specific Pages

Saves only specific pages from PDFs into a new folder.

```
save_specific_pages(input_folder: str, output_folder: str, pages_to_save: list)
```

Arguments

input_folder: Path to the folder containing PDFs.
output_folder: Path to save modified PDFs.
pages_to_save: List of page numbers (0-indexed) to keep.

Usage

```
from PDFSNIPPER import save_specific_pages
save_specific_pages('/content/input', '/content/output', [0, 2, 3])
```
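When the page list comes from user input or another tool, it may contain duplicates or indices beyond the end of a shorter PDF. One way to tidy it beforehand is sketched below; `normalize_pages` is a hypothetical helper, and whether PdfSnipper itself ignores, reorders, or rejects such entries is not stated in this README.

```python
def normalize_pages(pages_to_save: list[int], total_pages: int) -> list[int]:
    """Drop duplicates and out-of-range indices, then sort ascending.

    Hypothetical pre-processing step, not PdfSnipper's documented behavior.
    """
    return sorted({p for p in pages_to_save if 0 <= p < total_pages})


print(normalize_pages([3, 0, 2, 2, 9], total_pages=5))
```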

5. Save Pages as Images

Saves specific pages as PNG images in a new folder.

```
save_pages_as_images(input_folder: str, output_folder: str, pages_to_save: list)
```

Arguments

input_folder: Path to the folder containing PDFs.
output_folder: Path to save PNG images.
pages_to_save: List of page numbers (0-indexed) to save as images.

Usage

```
from PDFSNIPPER import save_pages_as_images
save_pages_as_images('/content/input', '/content/output', [0, 2, 4])
```
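Poppler's `pdftoppm` tool numbers pages from 1, while the page lists here are 0-indexed, so an off-by-one is easy to introduce when debugging conversions by hand. The sketch below builds an equivalent `pdftoppm` command for a single page; the helper name, the default resolution, and the idea that this mirrors what PdfSnipper does internally are all assumptions.

```python
def pdftoppm_command(pdf_path: str, out_prefix: str, page_index: int,
                     dpi: int = 200) -> list[str]:
    """Build a pdftoppm command that renders one 0-indexed page as PNG.

    Hypothetical helper; PdfSnipper's internals may use a different route.
    """
    page_number = page_index + 1  # pdftoppm's -f/-l flags are 1-indexed
    return [
        "pdftoppm", "-png",
        "-r", str(dpi),            # resolution in DPI
        "-f", str(page_number),    # first page to render
        "-l", str(page_number),    # last page to render
        pdf_path, out_prefix,
    ]


print(" ".join(pdftoppm_command("doc.pdf", "out/doc", 2)))
```

The resulting list can be passed to `subprocess.run` once Poppler is installed.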

6. Split PDF

Splits every PDF in a folder into individual single-page PDF files.

```
split_pdf(input_folder: str, output_folder: str)
```

Arguments

input_folder: Path to the folder containing PDFs.
output_folder: Path to save split PDFs.

Usage

```
from PDFSNIPPER import split_pdf
split_pdf('/content/input', '/content/output')
```
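Splitting produces one output file per page, so a large input folder can generate many files. The sketch below estimates the output for one PDF under a purely hypothetical naming scheme (`<stem>_page_<n>.pdf`); the actual file names PdfSnipper writes are not documented here and may well differ.

```python
from pathlib import Path


def split_output_names(pdf_name: str, total_pages: int) -> list[str]:
    """List one output name per page, zero-padded so they sort correctly.

    Hypothetical naming scheme for illustration only.
    """
    stem = Path(pdf_name).stem
    width = len(str(total_pages))
    return [f"{stem}_page_{i + 1:0{width}d}.pdf" for i in range(total_pages)]


print(split_output_names("report.pdf", 3))
```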
