Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Fetching contributors…

Cannot retrieve contributors at this time

93 lines (80 sloc) 4.804 kb
v0.7.7 (11th September 2009)
- Trigger callbacks contained in Form XObjects when we encounter them in a
content stream
- Fix inheritance of page resources to comply with section 3.6.2
v0.7.6 (28th August 2009)
- Various bug fixes that increase the files we can successfully parse
- Treat float and integer tokens differently (thanks Neil)
- Correctly handle PDFs where the Kids element of a Pages dict is an indirect
reference (thanks Rob Holland)
- Fix conversion of PDF strings to Ruby strings on 1.8.6 (thanks Andrès Koetsier)
- Fix decoding with ASCII85 and ASCIIHex filters (thanks Andrès Koetsier)
- Fix extracting inline images from content streams (thanks Andrès Koetsier)
- Fix extracting [ ] from content streams (thanks Christian Rishøj)
- Fix conversion of text to UTF8 when the cmap uses bfrange (thanks Federico Gonzalez Lutteroth)
v0.7.5 (27th August 2008)
- Fix a 1.8.7ism
v0.7.4 (7th August 2008)
- Raise a MalformedPDFError if a content stream contains an unterminated string
- Fix an bug that was causing an endless loop on some OSX systems
- valid strings were incorrectly thought to be unterminated
- thanks to Jeff Webb for playing email ping pong with me as I tracked this
issue down
v0.7.3 (11th June 2008)
- Add a high level way to get direct access to a PDF object, including a new executable: pdf_object
- Fix a hard loop bug caused by a content stream that is missing a final operator
- Significantly simplified the internal code for encoding conversions
- Fixes YACC parsing bug that occurs on Fedora 8's ruby VM
- New callbacks
- page_count
- pdf_version
- Fix a bug that prevented a font's BaseFont from being recorded correctly
v0.7.2 (20th May 2008)
- Throw an UnsupportedFeatureError if we try to open an encrypted/secure PDF
- Correctly handle page content instruction sets with trailing whitespace
- Represent PDF Streams with a new object, PDF::Reader::Stream
- their really wasn't any point in separating the stream content from it's associated dict. You need both
parts to correctly interpret the content
v0.7.1 (6th May 2008)
- Non-page strings (ie. metadata, etc) are now converted to UTF-8 more accurately
- Fixed a regression between 0.6.2 and 0.7 that prevented difference tables from being applied
correctly when translating text into UTF-8
v0.7 (6th May 2008)
- API INCOMPATIBLE CHANGE: any hashes that are passed to callbacks use symbols as keys instead of PDF::Reader::Name instances.
- Improved support for converting text in some PDF files to unicode
- Behave as expected if the Contents key in a Page Dict is a reference
- Include some basic metadata callbacks
- Don't interpret a comment token (%) inside a string as a comment
- Small fixes to improve 1.9 compatibility
- Improved our Zlib deflating to make it slightly more robust - still some more issues to work out though
- Throw an UnsupportedFeatureError if a pdf that uses XRef streams is opened
- Added an option to PDF::Reader#file and PDF::Reader#string to enable parsing of only parts of a PDF file(ie. only metadata, etc)
v0.6.2 (22nd March 2008)
- Catch low level errors when applying filters to a content stream and raise a MalformedPDFError instead.
- Added support for processing inline images
- Support for parsing XRef tables that have multiple subsections
- Added a few callbacks to improve the way we supply information on page resources
- Ignore whitespace in hex strings, as required by the spec (section 3.2.3)
- Use our "unknown character box" when a single character in an Identity-H string fails to decode
- Support ToUnicode CMaps that use the bfrange operator
- Tweaked tokenising code to ensure whitespace doesn't get in the way
v0.6.1 (12th March 2008)
- Tweaked behaviour when we encounter Identity-H encoded text that doesn't have a ToUnicode mapping. We
just replace each character with a little box.
- Use the same little box when invalid characters are found in other encodings instead of throwing an ugly
- Added a method to RegisterReceiver that returns all occurrences of a callback
v0.6.0 (27th February 2008)
- all text is now transparently converted to UTF-8 before being passed to the callbacks.
before this version, text was just passed as a byte level copy of what was in the PDF file, which
was mildly annoying with some encodings, and resulted in garbled text for Unicode encoded text.
- Fonts that use a difference table are now handled correctly
- fixed some 1.9 incompatible syntax
- expanded RegisterReceiver class to record extra info
- expanded rspec coverage
- tweaked a README example
v0.5.1 (1st January 2008)
- Several documentation tweaks
- Improve support for parsing PDFs under windows (thanks to Jari Williamsson)
v0.5 (14th December 2007)
- Initial Release
Jump to Line
Something went wrong with that request. Please try again.