High-Quality Open Source Batch PDF Compressor
Modern, privacy-respecting, and built for both end-users & developers by Divyesh Vishwakarma
- Best-in-class PDF compression (multi-layered, lossless/lossy as needed)
- Batch compress multiple PDFs at once (UI & backend)
- Aggressive image recompression (PyMuPDF) – JPEGs optimized per user quality
- Stream/object deduplication and linearization (pikepdf/qpdf)
- Maximal vector/font/structure optimization (Ghostscript, industry standard)
- No cloud upload, 100% on-prem/local privacy
- Beautiful, modern UI with file listing, size breakdown, and easy download
- Fully open source, MIT licensed
Deploy locally and open http://localhost:5000 in your browser!
- Backend: Python 3, Flask
- Image Recompression: PyMuPDF
- PDF Optimization: pikepdf (qpdf under the hood)
- Advanced Compression: Ghostscript
- Frontend: HTML5, CSS3, JS, Material Icons
- Batch Uploads: Native HTML5 + Flask multi-file POST
git clone https://github.com/divyesh1099/compressPDF.git
cd compressPDF
It is recommended to use a virtual environment.
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
requirements.txt contains:
Flask
PyMuPDF
pikepdf
- Linux (Debian/Ubuntu): sudo apt install ghostscript
- macOS (Homebrew): brew install ghostscript
- Windows: Download Ghostscript and add the install directory (containing gswin64c.exe) to your PATH.
python app.py
Now open http://localhost:5000 in your browser.
- Visit the site and select one or more PDF files.
- (Optional): Adjust JPEG image quality (10–100) for best compression/quality tradeoff.
- Click Compress PDFs.
- Wait for processing – results (original/optimized size, percent reduction) are shown per file.
- Download the compressed PDFs with a single click.
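The percent reduction reported per file can be computed along these lines. This is a minimal sketch, not the app's actual code; the function name `percent_reduction` and the rounding to one decimal place are assumptions made for illustration.

```python
def percent_reduction(original_bytes: int, optimized_bytes: int) -> float:
    """Return how much smaller the optimized file is, as a percentage."""
    # Guard against empty input files to avoid dividing by zero.
    if original_bytes <= 0:
        return 0.0
    saved = original_bytes - optimized_bytes
    return round(saved / original_bytes * 100, 1)
```

A negative result would mean the "compressed" file came out larger than the original, which is worth surfacing in the results rather than hiding.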
Each PDF is processed as follows:
- Image Optimization (PyMuPDF): All grayscale/RGB images are re-encoded as JPEG at your chosen quality if that makes them smaller.
- Stream & Structure Optimization (pikepdf): Deduplicates and compresses streams and packs objects into object streams.
- Maximal Compression (Ghostscript): Optimizes vector content and fonts, downsamples images to the preset resolution, and further linearizes and compresses the entire PDF.
Temp files are automatically cleaned. Batch uploads are supported out-of-the-box.
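The staged flow above, with automatic temp-file cleanup, could be organized roughly as follows. This is only a sketch under assumptions: `run_pipeline` and the `Stage` type are hypothetical names, and the real stages (PyMuPDF, pikepdf, Ghostscript) are represented by any callable that reads one file and writes another.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Sequence

# A stage reads the PDF at `src` and writes its result to `dst`.
# (Hypothetical type; the real stages would wrap PyMuPDF, pikepdf, Ghostscript.)
Stage = Callable[[Path, Path], None]


def run_pipeline(src: Path, dst: Path, stages: Sequence[Stage]) -> None:
    """Run each stage in order, feeding one stage's output into the next."""
    # Intermediate files live in a temporary directory that is removed
    # automatically when the `with` block exits, even if a stage raises.
    with tempfile.TemporaryDirectory() as tmp:
        current = src
        for index, stage in enumerate(stages):
            out = Path(tmp) / f"stage_{index}.pdf"
            stage(current, out)
            current = out
        # Copy the final result out before the temp directory is deleted.
        shutil.copyfile(current, dst)
```

Keeping every stage behind the same `(src, dst)` signature makes it easy to reorder, skip, or add stages without touching the cleanup logic.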
- Compression Quality: The default is 75 (good for most cases). Lower = smaller PDF, higher = better image quality.
- Ghostscript Quality Presets:
  - /screen – 72 dpi, smallest, low-res
  - /ebook – 150 dpi, perfect for sharing/archival (default)
  - /printer – 300 dpi, best for print
  (You can add more or change defaults in the Python code.)
- Output Directory: All compressed files are saved in compressed/, which is cleared before each batch.
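To show how a preset maps onto a Ghostscript call, here is a minimal sketch that builds the command line. The function name `ghostscript_command` and the `PRESETS` tuple are assumptions for illustration; the flags themselves (`-sDEVICE=pdfwrite`, `-dPDFSETTINGS`, `-dNOPAUSE`, `-dBATCH`, `-dQUIET`, `-sOutputFile`) are standard Ghostscript options, but the app's actual invocation may differ.

```python
from typing import List

# Presets listed in this README; Ghostscript defines others as well.
PRESETS = ("/screen", "/ebook", "/printer")


def ghostscript_command(src: str, dst: str, preset: str = "/ebook",
                        exe: str = "gs") -> List[str]:
    """Build (but do not run) a Ghostscript pdfwrite command."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return [
        exe,                        # use "gswin64c" on Windows
        "-sDEVICE=pdfwrite",        # write a PDF as output
        f"-dPDFSETTINGS={preset}",  # quality/resolution preset
        "-dNOPAUSE",                # do not pause between pages
        "-dBATCH",                  # exit when processing is done
        "-dQUIET",                  # suppress routine console output
        f"-sOutputFile={dst}",
        src,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with file names that contain spaces.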
- For development: python app.py (runs on localhost:5000)
- For production: Use a WSGI server (e.g., Gunicorn, Waitress) and a reverse proxy (nginx, etc.).
  gunicorn -w 4 app:app
- Ghostscript not found? Make sure it's installed and on your PATH (gswin64c.exe for Windows, gs for Linux/macOS).
- "Compression failed" errors: Check the console for details – usually a corrupt or non-PDF file.
- Large PDFs or memory errors: Consider increasing system RAM or compressing in smaller batches.
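For the "Ghostscript not found" case, a quick way to check what your PATH resolves to is a small lookup like the one below. This is a diagnostic sketch, not part of the app; `find_ghostscript` is a hypothetical helper name.

```python
import shutil
from typing import Optional, Sequence


def find_ghostscript(
    candidates: Sequence[str] = ("gs", "gswin64c"),
) -> Optional[str]:
    """Return the full path of the first Ghostscript binary found on PATH."""
    for name in candidates:
        path = shutil.which(name)  # None if the name is not on PATH
        if path:
            return path
    return None
```

If this returns `None` in the same shell and virtual environment you use to start the app, the app will not find Ghostscript either.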
- Branding & UI: Tweak index.html and success.html for your own style, colors, and footer info.
- Support more filetypes: Extend the allowed files/extensions in the Python code if needed.
- Add authentication, analytics, Sentry, etc.: Standard Flask best-practices apply.
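Extending the allowed file types usually comes down to an extension check like the one below. This follows a common Flask upload pattern rather than the app's actual code; `ALLOWED_EXTENSIONS` and `allowed_file` are assumed names.

```python
from pathlib import Path
from typing import Collection

# Add more suffixes here only if the pipeline can actually process them.
ALLOWED_EXTENSIONS = frozenset({".pdf"})


def allowed_file(filename: str,
                 allowed: Collection[str] = ALLOWED_EXTENSIONS) -> bool:
    """Return True if the file name ends in an allowed extension."""
    # Compare case-insensitively so "REPORT.PDF" is accepted too.
    return Path(filename).suffix.lower() in allowed
```

An extension check only filters obvious mistakes; a renamed non-PDF file would still pass it and fail later in the pipeline.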
Created and maintained by Divyesh Vishwakarma
PDF compression pipeline inspired by the best of the open-source Python/PDF community. Uses PyMuPDF, pikepdf, Ghostscript, and Flask.
MIT License – free for commercial and personal use. Attribution appreciated but not required. Star ⭐️ the repo if you found it useful!
Happy compressing! For professional projects, consulting, or dev work, contact me on LinkedIn.
❤️Moti
