Releases: YM162/gulagcleaner
v0.14.1
v0.13.0
- Added a new function to the Python package that cleans a PDF file directly from its bytes.
- @carlosiborra refactored the PDF tests for the Rust distribution and added a README for it.
What's Changed
- Refactor PDF Cleaning Tests for Improved Modularity and Error Handling + Add Rust Distribution README by @carlosiborra in #18
New Contributors
- @carlosiborra made their first contribution in #18
Full Changelog: v0.12.2...v0.13.0
0.12.2
Re-enabled the Wuolah cleaning method.
Full Changelog: v0.12.1...v0.12.2
v0.12.1
WARNING: This release has the "Wuolah" method disabled. It will clean those PDFs using the "Naive" method instead.
This is meant to be a temporary fix to keep things working while we work on fixing the rest.
Full Changelog: v0.11.1...v0.12.1
0.11.1
- Temporarily removed the method code from the gulagcleaner_wasm crate because serde serialization was corrupting the data.
Full Changelog: v0.11.0...v0.11.1
0.11.0
WARNING: The clean_pdf function now returns the numerical code of the method used to clean the PDF along with the cleaned PDF itself. The change is small, but it WILL break your code unless you update your calls.
Example for the gulagcleaner_wasm package:
// Previous:
var cleaned_pdf = await clean_pdf(data, 0);
// Current:
var cleaning_result = await clean_pdf(data, 0);
var cleaned_pdf = cleaning_result.result;
var method_code = cleaning_result.method;
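For code that has to run against both the old and the new return shape during a migration, a small wrapper can normalize the value. This is only a sketch: unpackCleanResult is a hypothetical helper, not part of gulagcleaner_wasm, and it assumes the pre-0.11.0 return value was a Uint8Array or ArrayBuffer of PDF bytes. Only the `result` and `method` field names come from the example above.

```javascript
// Hypothetical helper (not part of gulagcleaner_wasm): normalizes whatever
// clean_pdf returned into a { pdf, method } pair.
function unpackCleanResult(value) {
  // Assumed old shape (before 0.11.0): the cleaned PDF bytes themselves.
  if (value instanceof Uint8Array || value instanceof ArrayBuffer) {
    return { pdf: value, method: null };
  }
  // New shape (0.11.0 and later): an object with `result` and `method`.
  if (
    value !== null &&
    typeof value === "object" &&
    "result" in value &&
    "method" in value
  ) {
    return { pdf: value.result, method: value.method };
  }
  // Anything else is unexpected; fail loudly instead of guessing.
  throw new TypeError("Unexpected value returned by clean_pdf");
}

// Usage (inside an async function, with clean_pdf imported from the package):
//   const { pdf, method } = unpackCleanResult(await clean_pdf(data, 0));
```

With the old shape, `method` is null, so callers can tell that no method code was reported.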
NEW:
- Added a method to clean StuDocu PDFs.
- Now the clean_pdf function also returns the method used to clean the PDF.
Full Changelog: v0.5.2...v0.11.0
0.10.3
NEW:
- The main functionality of the package was rewritten in Rust. Bindings for Python are provided, so installing the package and using it from the CLI or from Python code remain identical.
- The Rust code can be used by importing the crate https://crates.io/crates/gulagcleaner_rs
- The JS code can be used by importing the npm package https://www.npmjs.com/package/gulagcleaner_wasm
- The Python code can still be used by importing the PyPI package https://pypi.org/project/gulagcleaner
- Added workflows to automate publishing on the 3 platforms/languages at the same time.
Full Changelog: v0.10.2...v0.10.3
0.10.2
v0.5.2
NEW: Fixes for newer files.
All PDFs downloaded after 18/05/23 have a different internal structure, making the old method of extracting pages obsolete.
This version introduces a new method for extracting the original page via /Contents dictionary manipulation. The old method of PDF.Form extraction can still be used with the '-o' flag.
v0.8.2
NEW:
- Fixed an edge case for PDFs with strange MediaBoxes.
- Added support for cleaning multiple PDFs or full folders recursively.
What's Changed
- Multiple files / folders feature added by @jseg380 in #6
- Fixes for MediaBoxes not starting in (0,0) by @YM162 in #8
- Fix for PDFs with unusual MediaBoxes by @YM162 in #9
Full Changelog: v0.7.0...v0.8.1