Commits on Jun 10, 2012
  1. experiment with caching the tokenised content of Form XObjects

    * Form XObjects are designed to save disk space and parsing time by
      reusing content across pages
    * it doesn't make sense to parse them over and over
    * This is a silly global variable cache to just test the theory and see
      what happens to memory usage and CPU time
    committed Jun 10, 2012
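The caching idea in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `FormTokenCache` and the `tokenize` lambda are hypothetical names, and the whitespace split only stands in for a real content stream lexer.

```ruby
# Hypothetical sketch: memoise tokenised content keyed by the raw stream bytes.
# FormTokenCache is an illustrative name, not part of the library's API.
class FormTokenCache
  attr_reader :hits, :misses

  def initialize
    @store  = {}
    @hits   = 0
    @misses = 0
  end

  # Returns the cached tokens for +key+, running the block only on a miss.
  def fetch(key)
    if @store.key?(key)
      @hits += 1
      @store[key]
    else
      @misses += 1
      @store[key] = yield
    end
  end
end

cache   = FormTokenCache.new
content = "BT /F1 12 Tf (Hello) Tj ET"

# Stand-in tokenizer (assumption): real content streams need a proper lexer.
tokenize = ->(str) { str.split(/\s+/) }

first  = cache.fetch(content) { tokenize.call(content) }
second = cache.fetch(content) { tokenize.call(content) }
```

The second `fetch` returns the same token array without tokenising again, which is the saving the experiment sets out to measure. The hit and miss counters are only there to make that visible.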
Commits on Jun 9, 2012
Commits on May 23, 2012
  1. these strings are binary, not UTF-8

    * it shouldn't matter, but a test is blowing up on RBX
    committed May 23, 2012
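As a small illustration of why the encoding label can matter, the sketch below tags a string of arbitrary bytes as binary. The byte values are made up for the example and are not taken from the failing test.

```ruby
# Four high bytes that do not form a valid UTF-8 sequence (example data only).
raw = "\xE2\xE3\xCF\xD3".dup

# Labelled as UTF-8, the string is reported as invalid.
utf8_valid = raw.valid_encoding?

# Relabelling as binary changes no bytes, only how Ruby interprets them.
binary = raw.dup.force_encoding(Encoding::BINARY)
```

After the relabel, `binary.valid_encoding?` is true and the byte count is unchanged, so code that is strict about encodings no longer trips over the data.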
Commits on May 22, 2012
  1. use Fixnums instead of Strings in our homegrown lexer

    * perftools.rb was showing ~49% of the time extracting text from a large
      sample file was spent in Buffer#prepare_regular_token
    * with this patch that falls to ~40% of time
    * still not great, but a significant improvement
    * this shows the dangers of writing our own lexer and parser. It's
      probably worth investigating if we can switch to a parser gem that
      does a better job
    committed May 22, 2012
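A rough sketch of the integer-based approach: scanning with `String#getbyte` compares integers instead of allocating a one-character String per step. The `regular_token` helper below is hypothetical and much simpler than `Buffer#prepare_regular_token`.

```ruby
# PDF whitespace bytes: NUL, TAB, LF, FF, CR, SPACE.
WHITESPACE_BYTES = [0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20].freeze
# PDF delimiter characters, stored as byte values.
DELIMITER_BYTES  = "()<>[]{}/%".bytes.freeze

# Reads one regular token starting at byte offset +pos+ (hypothetical helper).
def regular_token(str, pos = 0)
  start = pos
  while pos < str.bytesize
    byte = str.getbyte(pos) # an Integer, so no per-character String is created
    break if WHITESPACE_BYTES.include?(byte) || DELIMITER_BYTES.include?(byte)
    pos += 1
  end
  str.byteslice(start, pos - start)
end

operator = regular_token("Tf 12")
number   = regular_token("12.5]")
```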
Commits on May 14, 2012
  1. add a spec PDF that contains an outline

    * for Sergey Zavilkin to use in specs
    committed May 14, 2012
Commits on May 9, 2012
  1. prepare for release

    committed May 9, 2012
Commits on May 7, 2012
  1. always de-ref the document page count

    * just in case it's an indirect object
    * thanks to Igor Jorobus for reporting
    committed May 7, 2012
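The fix can be pictured with a small stand-in object store. `Reference` and `TinyObjects` below are simplified, hypothetical versions of the library's reference and object lookup, written only to show why the count is always passed through a dereference step.

```ruby
# Simplified stand-in for an indirect object reference (assumption).
Reference = Struct.new(:id, :gen)

class TinyObjects
  def initialize(objects)
    @objects = objects
  end

  # Follows references until a direct object is reached.
  def deref(obj)
    obj = @objects.fetch(obj) while obj.is_a?(Reference)
    obj
  end
end

objects = TinyObjects.new({ Reference.new(7, 0) => 3 })

pages_direct   = { :Count => 3 }
pages_indirect = { :Count => Reference.new(7, 0) }

direct_count   = objects.deref(pages_direct[:Count])
indirect_count = objects.deref(pages_indirect[:Count])
```

Dereferencing a direct value returns it unchanged, so calling `deref` unconditionally is safe in both cases.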
Commits on Apr 24, 2012
  1. Merge pull request #52 from rstawarz/master

    Add spec and fix issue for extra token whitespace during token parsing
    committed Apr 24, 2012
  2. Modify buffer token parsing to account for extra token whitespace that
     may be well beyond 10-20 characters

    rstawarz committed Apr 24, 2012
Commits on Apr 10, 2012
  1. always join multiple content streams with whitespace

    * to avoid smashing two unrelated tokens together
    committed Apr 10, 2012
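A small example of the problem, using made-up stream fragments and a whitespace split as a stand-in tokenizer. `join_streams` is a hypothetical helper name.

```ruby
# Joins the content streams of a page with a separating space.
def join_streams(streams)
  streams.join(" ")
end

# Example data: the first stream ends with a number, the second starts
# with an operator.
streams = ["q 1 0 0 1 10 20", "cm Q"]

naive = streams.join.split          # no separator between streams
safe  = join_streams(streams).split # separator keeps tokens apart
```

Without the separator, the trailing `20` and the leading `cm` fuse into the single bogus token `20cm`.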
Commits on Mar 25, 2012
  1. Revert "add a cache_stats method to ObjectHash"

    This reverts commit 6727e84.
    committed Mar 25, 2012
  2. Revert "add a utility class for reporting on cache stats"

    This reverts commit c751718.
    committed Mar 25, 2012
  3. Revert "alter CacheReport to generate a CSV"

    This reverts commit c74fd4d.
    committed Mar 25, 2012
  4. update changelog

    committed Mar 25, 2012
  5. bump minor version

    committed Mar 25, 2012
Commits on Mar 5, 2012
  1. a Hash is not an inline image

    * these specs are getting unwieldy. It will soon be time to break out a
      proper lexer/parser
    committed Mar 5, 2012
Commits on Feb 25, 2012
  1. require YAML where it's needed

    * closes #49
    committed Feb 25, 2012
Commits on Feb 19, 2012
  1. Merge pull request #48 from bradediger/master

    Properly tokenize "/\n" as an empty PDF name
    committed Feb 19, 2012
  2. Properly tokenize "/\n" as empty PDF name

    Per PDF 32000-1:2008 sec 7.3.5, whitespace in a PDF name must always be
    escaped hexadecimally, and no whitespace may come between the / and the
    start of the name. So a name starting "/\n" should always be tokenized
    as an empty name.
    bradediger committed Feb 19, 2012
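The rule described above can be sketched as follows. `read_name` is a hypothetical helper, not the library's tokenizer, and it ignores `#xx` hex escapes for brevity.

```ruby
NAME_WHITESPACE = [0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20].freeze
NAME_DELIMITERS = "()<>[]{}/%".bytes.freeze

# Reads the name that starts with the "/" at byte offset +pos+ and returns
# it as a Symbol. The name ends at the first whitespace or delimiter byte.
def read_name(str, pos = 0)
  raise ArgumentError, "expected '/'" unless str.getbyte(pos) == 0x2F
  pos += 1
  start = pos
  while pos < str.bytesize
    byte = str.getbyte(pos)
    break if NAME_WHITESPACE.include?(byte) || NAME_DELIMITERS.include?(byte)
    pos += 1
  end
  str.byteslice(start, pos - start).to_sym
end

empty_name   = read_name("/\n/Type")
regular_name = read_name("/Type /Page")
```

Because the newline directly after the slash ends the name at once, the first call yields the empty name rather than swallowing the following `/Type`.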
  3. PageState needs public helpers to find resources

    * it encapsulates the search for resources up the stack of nested Form
    committed Feb 19, 2012
  4. ignore xref table entries that point to byte offset 0

    * they can't be correct
    committed Feb 19, 2012
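One way to picture this filter, assuming the xref table has already been parsed into a Hash of `[object id, generation]` pairs to byte offsets. That shape and the `usable_xref_entries` name are assumptions made for the example.

```ruby
# Drops entries whose byte offset is 0. A PDF file begins with its
# "%PDF-" header, so no object can start at the very first byte.
def usable_xref_entries(entries)
  entries.reject { |_ref, offset| offset == 0 }
end

# Example data only.
entries = { [1, 0] => 17, [2, 0] => 0, [3, 0] => 312 }
usable  = usable_xref_entries(entries)
```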
Commits on Feb 18, 2012
  1. update a require call

    committed Feb 18, 2012
  2. Use a PageState object to track state as a page is rendered

    * the tracked state includes current transforms, the current font and
      font size, etc
    * receivers that need to understand state should delegate much of their
      logic to a PageState instance
    * PageTextReceiver is the canonical example of how to use PageState
    committed Feb 13, 2012
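The delegation pattern described above might look roughly like this. `MiniPageState` and `MiniTextReceiver` are deliberately tiny, hypothetical classes that track only the font and font size; they are not the library's PageState or PageTextReceiver.

```ruby
require "forwardable"

# Tracks a small slice of rendering state (illustrative only).
class MiniPageState
  attr_reader :font_label, :font_size

  def initialize
    @stack      = []
    @font_label = nil
    @font_size  = nil
  end

  def save_graphics_state
    @stack.push([@font_label, @font_size])
  end

  def restore_graphics_state
    @font_label, @font_size = @stack.pop unless @stack.empty?
  end

  def set_text_font_and_size(label, size)
    @font_label = label
    @font_size  = size
  end
end

# A receiver that forwards state-changing callbacks to the state object
# and only implements the callback it actually cares about.
class MiniTextReceiver
  extend Forwardable
  def_delegators :@state, :save_graphics_state, :restore_graphics_state,
                 :set_text_font_and_size

  attr_reader :runs

  def initialize
    @state = MiniPageState.new
    @runs  = []
  end

  def show_text(string)
    @runs << [string, @state.font_label, @state.font_size]
  end
end

receiver = MiniTextReceiver.new
receiver.set_text_font_and_size(:F1, 12)
receiver.save_graphics_state
receiver.set_text_font_and_size(:F2, 8)
receiver.show_text("small")
receiver.restore_graphics_state
receiver.show_text("normal")
```

The receiver never stores font information itself. It asks the state object at the moment text is shown, so the save and restore bookkeeping lives in one place.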
Commits on Feb 11, 2012
  1. alter CacheReport to generate a CSV

    committed Feb 11, 2012
  2. add a cache_stats method to ObjectHash

    * handy for debugging performance issues
    committed Feb 11, 2012
Commits on Feb 7, 2012