- Fixed segfault when overwriting the pikepdf file that is currently open on Linux.
- Fixed removal of an attribute metadata value when values were present on the same node.
- Avoid canonical XML since it is apparently too strict for XMP.
- Fixed several issues related to generating XMP metadata that passed veraPDF validation.
- Fixed a random test suite failure for very large negative integers.
- The lxml library is now required.
- Added all of the commonly used XML namespaces to XMP metadata handling, so we are less likely to name something 'ns1', etc.
- Skip a test that fails on Windows.
- Fixed build errors in documentation.
- Fix `Object.write()` accepting positional arguments it wouldn't use
- Fix handling of XMP data with timezones (or missing timezone information) in a few cases
- Fix generation of XMP with invalid XML characters if the invalid characters were inside a non-scalar object
- New API to access and edit PDF metadata and make consistent edits to the new and old style of PDF metadata.
- 32-bit binary wheels are now available for Windows
- PDFs can now be saved in QPDF's "qdf" mode
- The Python package defusedxml is now required
- The Python package python-xmp-toolkit and its dependency libexempi are suggested for testing, but not required
- Fixed handling of filenames that contain multibyte characters on non-UTF-8 systems
- The `Pdf.metadata` property was removed, and replaced with the new metadata API
- `Pdf.attach()` has been removed, because the interface as implemented had no way to deal with existing attachments.
- Add API for inline images to unparse themselves
- Performance of reading files from memory improved to avoid unnecessary copies.
- It is finally possible to use `for key in pdfobj` to iterate contents of PDF Dictionary, Stream and Array objects. Generally these objects behave more like Python containers should now.
- Package API declared beta.
- `Pdf.save(...stream_data_mode=...)` has been dropped in favor of the newer `compress_streams=` and `stream_decode_level` parameters.
- A use-after-free memory error that caused occasional segfaults and "QPDFFakeName" errors when opening from stream objects has been resolved.
- pybind11 vendoring has ended now that v2.2.4 has been released
- libqpdf 8.2.1 is now required
- Improved support for working with JPEG2000 images in PDFs
- Added progress callback for saving files, `Pdf.save(..., progress=)`
- Updated pybind11 subtree
- `del obj.AttributeName` was not implemented. The attribute interface is now consistent
- Deleting named attributes now defers to the attribute dictionary for Stream objects, as get/set do
- Fixed handling of JPEG2000 images where metadata must be retrieved from the file
- Added support for direct image extraction of CMYK and grayscale JPEGs, where previously only RGB (internally YUV) was supported
- `Array()` now creates an empty array properly
- The syntax `Name.Foo in Dictionary()`, e.g. `Name.XObject in page.Resources`, now works
- `pikepdf.open` now validates its keyword arguments properly, potentially breaking code that passed invalid arguments
- libqpdf 8.1.0 is now required
- libqpdf 8.1.0 API is now used for creating Unicode strings
- If a non-existent file is opened with `pikepdf.open`, a `FileNotFoundError` is raised instead of a generic error
- We are now temporarily vendoring a copy of pybind11 since its main branch contains unreleased and important fixes for Python 3.7.
- The syntax `Name.Thing` (e.g. `Name.DecodeParms`) is now supported as equivalent to `Name('/Thing')` and is the recommended way to refer to names within a PDF
- New API `Pdf.remove_unneeded_resources()` which removes objects from each page's resource dictionary that are not used in the page. This can be used to create smaller files.
- Fixed an error parsing inline images that have masks
- Fixed several instances of catching C++ exceptions by value instead of by reference
- Modified `Object.write` method signature to require `filter` and `decode_parms` as keyword arguments
- Implement automatic type conversion from the PDF Null type to `None`
- Removed `Object.unparse_resolved` in favor of `Object.unparse(resolved=True)`
- libqpdf 8.0.2 is now required at minimum
- Improved IPython/Jupyter interface to directly export temporary PDFs
- Updated to qpdf 8.1.0 in wheels
- Added Python 3.7 support for Windows
- Added a number of missing options from QPDF to `Pdf.open` and `Pdf.save`
- Added ability to delete a slice of pages
- Began using Jupyter notebooks for documentation
- Added Python 3.7 support to build and test (not yet available for Windows, due to lack of availability on Appveyor)
- Removed setter API from `PdfImage` because it never worked anyway
- Improved handling of `PdfImage` with trivial palettes
- `Object.check_owner` renamed to `Object.is_owned_by`
- `Object.objgen` and `Object.get_object_id` are now public functions
- Major internal reorganization with `pikepdf.models` becoming the submodule that holds support code to ease access to PDF objects as opposed to wrapping QPDF.
- Implemented automatic type conversion for `int`, `bool` and `Decimal`, eliminating the `pikepdf.{Integer,Boolean,Real}` types. Removed a lot of associated numerical code.
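
Two of the entries above, the progress callback for `Pdf.save` and the ability to delete a slice of pages, can be illustrated with a minimal sketch. The helper names `save_with_progress` and `delete_page_range` are hypothetical and not part of pikepdf. The sketch assumes `pdf` is an already opened `pikepdf.Pdf`, and that the callback receives a single argument giving the percentage written so far.

```python
def save_with_progress(pdf, target):
    """Save pdf to target and return the progress values that were reported.

    Assumption: pdf is an open pikepdf.Pdf whose save() accepts progress=,
    a callable taking one argument (the percentage written so far).
    """
    reported = []
    # list.append serves as the callback, so every reported value is recorded
    pdf.save(target, progress=reported.append)
    return reported


def delete_page_range(pdf, start, stop):
    """Delete pages start..stop-1 (zero-based) and return the new page count."""
    # Slice deletion on the page list, as described in the entry above
    del pdf.pages[start:stop]
    return len(pdf.pages)
```

With pikepdf installed, one might open a file with `pikepdf.open("input.pdf")`, call `delete_page_range(pdf, 1, 3)` to drop the second and third pages, and then call `save_with_progress(pdf, "output.pdf")`. Both file names are placeholders.
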
Everything before v0.2.0 can be considered too old to document.