PDF Splitter CLI

A simple and efficient Python CLI tool for splitting PDF files based on their outline (bookmarks).

Originally optimized for O'Reilly-style technical books (handling structures like Parts -> Chapters), this tool can be used with any PDF that has a valid outline.

Features

Smart Splitting: Automatically detects the PDF outline and splits the document into separate files for each section.
Filename Sanitization: Generates safe filenames from chapter titles, removing illegal characters.
Flexible Output: Allows specifying a custom output directory. Defaults to a folder named <input_filename>_split, created next to the input file.
Cross-Platform: Works on Windows, macOS, and Linux (Python-based).
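
To make the filename sanitization feature concrete, here is a minimal sketch of how a chapter title could be turned into a safe filename. The helper name `sanitize_filename`, the exact set of characters removed, and the `untitled` fallback are assumptions for illustration; the tool's own rules may differ.

```python
import re

# Characters rejected by Windows filesystems, plus ASCII control characters.
# This character set is an assumption, not necessarily what the tool uses.
_ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(title: str, fallback: str = "untitled") -> str:
    """Hypothetical helper: turn an outline title into a safe file stem."""
    # Remove illegal characters outright.
    cleaned = _ILLEGAL_CHARS.sub("", title)
    # Collapse whitespace runs, then trim leading/trailing spaces and dots.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    # Fall back to a placeholder if nothing usable is left.
    return cleaned or fallback


print(sanitize_filename("Chapter 1: Files/Paths?"))
```

Removing characters (rather than replacing them) mirrors the feature description above; replacing them with an underscore would be an equally reasonable choice.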

Requirements

Python 3.9+
pypdf

Installation

Recommended: `uv`

Clone this repository:

git clone https://github.com/katsuki-a/pdf-splitter.git
cd pdf-splitter

Create a virtual environment with uv:
```
uv venv
```
Install dependencies:
```
uv pip install -r requirements.txt
```
Run commands with uv run:
```
uv run python -m src.cli --help
```

Alternative: `venv` + `pip`

If you don't use uv, you can still use the standard Python workflow:

python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

Usage

To split a PDF file, run the tool as a module from the project root directory.

Basic Usage

# Recommended
uv run python -m src.cli <input_file_path> [-o <output_directory>] [-d <max_depth>] [--dry-run]

Alternatively, if you are not using uv:

python -m src.cli <input_file_path> [-o <output_directory>] [-d <max_depth>] [--dry-run]

If you encounter module import errors, you can explicitly set the PYTHONPATH:

PYTHONPATH=. python src/cli.py <input_file_path> ...

Arguments

input_file: Path to the input PDF file (Required).
-o, --output: Directory to save the split PDF files. If omitted, a directory named <input_filename>_split will be created in the same location as the input file.
-d, --max-depth: Maximum depth of the outline to process.
- 1: Top-level chapters only (default).
- 2: Chapters and sub-sections.
--dry-run: Print the planned split without writing PDF files.
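
As a rough illustration of how these arguments fit together, the sketch below wires them up with `argparse` and derives the default output directory. The function names `build_parser` and `resolve_output_dir` and the `prog` value are hypothetical; the actual code in `src/cli.py` may be organized differently.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments documented above."""
    parser = argparse.ArgumentParser(
        prog="pdf-splitter", description="Split a PDF by its outline."
    )
    parser.add_argument("input_file", type=Path, help="Path to the input PDF file.")
    parser.add_argument(
        "-o", "--output", type=Path, default=None,
        help="Directory to save the split PDF files.",
    )
    parser.add_argument(
        "-d", "--max-depth", type=int, default=1,
        help="Maximum depth of the outline to process.",
    )
    parser.add_argument(
        "--dry-run", action="store_true",
        help="Print the planned split without writing PDF files.",
    )
    return parser


def resolve_output_dir(input_file: Path, output: Optional[Path]) -> Path:
    """Use the given directory, else <input_filename>_split next to the input."""
    if output is not None:
        return output
    return input_file.with_name(f"{input_file.stem}_split")


args = build_parser().parse_args(["books/my_book.pdf", "-d", "2", "--dry-run"])
print(args.max_depth, args.dry_run, resolve_output_dir(args.input_file, args.output))
```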

Examples

1. Split a file using default settings (top-level chapters only):

uv run python -m src.cli my_book.pdf

2. Split a file including nested sections (up to depth 2):

uv run python -m src.cli my_book.pdf --max-depth 2

3. Split a file and save to a specific directory:

uv run python -m src.cli my_book.pdf --output ./chapters/

4. Preview the split without writing files:

uv run python -m src.cli my_book.pdf --dry-run
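
The examples above can be pictured with a small model of the split plan. The sketch below assumes the outline has already been flattened into `(depth, title, start_page)` tuples and that each section runs until the next kept section begins. Both the data shape and the `plan_split` function are hypothetical, and the page-range rule is an assumption about how the tool behaves.

```python
from typing import List, NamedTuple, Tuple


class Section(NamedTuple):
    title: str
    start: int  # first page, 0-based
    end: int    # last page, 0-based, inclusive


def plan_split(
    entries: List[Tuple[int, str, int]], total_pages: int, max_depth: int = 1
) -> List[Section]:
    """Hypothetical planner: entries are (depth, title, start_page) in document order."""
    # Keep only outline entries at or above the requested depth.
    kept = [(title, start) for depth, title, start in entries if depth <= max_depth]
    sections = []
    for i, (title, start) in enumerate(kept):
        # A section ends where the next kept section starts, or at the last page.
        next_start = kept[i + 1][1] if i + 1 < len(kept) else total_pages
        # Guard against two entries that start on the same page.
        sections.append(Section(title, start, max(start, next_start - 1)))
    return sections


# Made-up outline for illustration only.
outline = [
    (1, "Preface", 0),
    (1, "Chapter 1", 4),
    (2, "1.1 Setup", 6),
    (1, "Chapter 2", 20),
]

# Print the plan in the spirit of --dry-run, using 1-based page numbers.
for section in plan_split(outline, total_pages=30, max_depth=2):
    print(f"{section.title}: pages {section.start + 1}-{section.end + 1}")
```

With `max_depth=1` the nested "1.1 Setup" entry is skipped and "Chapter 1" covers everything up to "Chapter 2"; with `max_depth=2` it becomes its own file.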

Development

Install development dependencies:

uv venv
uv pip install -r requirements-dev.txt

Run the checks locally:

uv run ruff check .
uv run ruff format --check .
uv run python -m pytest

Apply formatting:

uv run ruff format .

License

This project is licensed under the MIT License - see the LICENSE file for details.
