update TODO

1 parent 9165f5f commit eacd7d835d6ef928d46541885ccbf8fd6d0e5391 @yob committed Jan 16, 2012
Showing with 6 additions and 17 deletions.
  1. +6 −17 TODO
@@ -1,27 +1,19 @@
-- add extra callbacks
- - list implemented features
- - encrypted? tagged? bookmarks? annotated? optimised?
-- Allow more than just page content and metadata to be parsed (see spec section 3.6.1)
+This stuff would be great
+- improved access to document level objects and data
- bookmarks?
- outline?
- articles?
- viewer prefs?
-- Don't remove comment when tokenising in the middle of a string
+- Improve the speed of Encoding#to_utf8
- Tweak encoding mappings to differentiate between bytes that are invalid for an encoding, and bytes that are unchanged.
poppler seems to do this in a quite reasonable way. Original Encoding -> Glyph Names -> Unicode. As of 0.6 we go straight
from the Original encoding to Unicode.
- detect when a font's encoding is a CMap (generally used for pre-Unicode, multibyte asian encodings), and display a user friendly error
- Improve interpretation of non content stream data (ie metadata). recognise dates, etc
-- Fix inheritance of page attributes. Resources has been done, but plenty of other attributes
- are inheritable. See table 3.2.7 in the spec
-- Add a way to extract raster images
- - see XObjects section of spec (section 4.7)
-- Add a way to extract font data?
+This might be useful, more research required
- Support for CJK text (convert to UTF-8 like all other encodings. See Section 5.9 of the PDF spec)
- Will require significantly improved handling of CMaps, including creating a bunch of predefined ones
@@ -30,10 +22,7 @@ Sometime
- Ship some extra receivers in the standard package, particularly ones that are useful for running
rspec over generated PDF files
-- When we encounter Identity-H encoded text with no ToUnicode CMap, render the glyphs and treat them as images, as there's no
- sensible way to convert them to unicode
-- Add support for additional filters: ASCIIHexDecode, ASCII85Decode, LZWDecode, RunLengthDecode, CCITTFaxDecode, JBIG2Decode, DCTDecode, JPXDecode, Crypt?
+- Add support for additional filters: CCITTFaxDecode, JBIG2Decode, DCTDecode, JPXDecode
- Add support for additional encodings:
- Identity-V (I *think* this relates to vertical text. Not sure how we'd support it sensibly)
