lucasmrdt/pdf-sanitizer
sanitized PDF

Quickly remove useless pages from a huge PDF to get a readable PDF.

Installation

git clone https://github.com/lucasmrdt/pdf-sanitizer
cd pdf-sanitizer
pip3 install -r requirements.txt --user

Usage

> ./pdf-sanitizer -h
usage: pdf-sanitizer [-h] [--title-ratio TITLE_RATIO]
                     [--content-ratio CONTENT_RATIO]
                     input_file output_file

Quickly remove useless page from a huge pdf to get a readable pdf

positional arguments:
  input_file            pdf file to be sanitized
  output_file           output sanitized pdf file name

optional arguments:
  -h, --help            show this help message and exit
  --title-ratio TITLE_RATIO
                        float between [0, 1] which is responsible of detecting
                        similar pages from title. The higher the ratio, the
                        more sensitive the sanitizer will be to any changes.
                        (default: 0.5)
  --content-ratio CONTENT_RATIO
                        float between [0, 1] which is responsible of detecting
                        similar pages from content. The higher the ratio, the
                        more sensitive the sanitizer will be to any changes.
                        (default: 0.8)
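
The two ratios are similarity thresholds between 0 and 1. As an illustration only, here is a minimal Python sketch of how such thresholds could decide which pages to drop. It is not taken from pdf-sanitizer's source: it assumes pages are compared with the standard library's `difflib.SequenceMatcher`, that each page is a hypothetical `(title, content)` pair of strings, and that only the last page of a run of similar pages is kept. The names `is_similar` and `drop_redundant_pages` are made up for this example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Page = Tuple[str, str]  # hypothetical (title, content) pair, not the tool's real type


def is_similar(a: str, b: str, ratio: float) -> bool:
    """Return True when the two strings are at least `ratio` similar (0 to 1)."""
    return SequenceMatcher(None, a, b).ratio() >= ratio


def drop_redundant_pages(
    pages: List[Page],
    title_ratio: float = 0.5,
    content_ratio: float = 0.8,
) -> List[Page]:
    """Keep only the last page of each run of similar consecutive pages.

    Assumption: a page is redundant when BOTH its title and its content
    are similar enough to the page kept just before it.
    """
    kept: List[Page] = []
    for title, content in pages:
        if (
            kept
            and is_similar(kept[-1][0], title, title_ratio)
            and is_similar(kept[-1][1], content, content_ratio)
        ):
            # The newer page supersedes the previous, near-identical one.
            kept[-1] = (title, content)
        else:
            kept.append((title, content))
    return kept


if __name__ == "__main__":
    slides = [
        ("Goals", "Goals\n- speed\n- safety"),
        ("Goals", "Goals\n- speed\n- safety\n- cost"),
        ("Budget", "Total: 12k"),
    ]
    for title, content in drop_redundant_pages(slides):
        print(title, "|", content.replace("\n", " / "))
```

In this sketch, raising a ratio means two pages must be closer to identical before one is dropped, so more pages survive. That matches the help text above: the higher the ratio, the more sensitive the sanitizer is to changes.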

Example

> ./pdf-sanitizer my_huge_file.pdf my_readable_file.pdf
✅  Your file has been sanitized at my_readable_file.pdf

Contributing

  • Test on Windows

Feel free to add more useful features, test the tool, and report issues.

Support

Reach out to me at one of the following places!

License

