Commits on Oct 25, 2011
  1. Bump version

Commits on Oct 4, 2011
  1. Fix a corner case in lazy text search.

    On a chunk boundary, we were not passing the correct mask and skip values
    along to the function that would process the next chunk.
  2. Silence a compiler warning.

  3. Eliminate a useless fromIntegral.

Commits on Aug 22, 2011
  1. Widen dependency on directory

  2. Add top-level QuickCheck test support.

    The "real" tests remain in tests/tests - this test suite is built without
    optimization, and simply lets us do a quick pass/fail during automated builds.
  3. Merge

  4. Merge 1b33e08 into 3845ffd

    GitHub Merge Button authored
Commits on Aug 18, 2011
  1. @jaspervdj
Commits on Aug 13, 2011
  1. @jaspervdj
  2. @jaspervdj
  3. @jaspervdj

    Add Streaming benchmarks

    jaspervdj authored
Commits on Jul 22, 2011
Commits on Jul 20, 2011
  1. Bump version

  2. Fix an overly cautious bit of arithmetic checking.

    Even though the value behind a Size is an Int, we actually intend that those
    values should always be non-negative. (We don't use the notionally more
    appropriate Word because GHC doesn't do a very good job with it.)
    But non-negative means that 0+0 should be 0! Um, oops.
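The arithmetic check described in the last commit can be sketched as follows. This is a minimal illustration, not the library's actual code: `addSize` is a hypothetical name, and the assumption is that the overly cautious version rejected a zero result (for example by testing `r > 0` rather than `r >= 0`).

```haskell
-- Hypothetical stand-in for checked addition on sizes.
-- Sizes are plain Ints that are intended to stay non-negative.
addSize :: Int -> Int -> Int
addSize m n
  | r >= 0    = r  -- zero is a valid size, so 0 + 0 must be accepted
  | otherwise = error "addSize: overflow"
  where
    -- With GHC's fixed-width Int, the sum of two non-negative values
    -- that overflows wraps around to a negative number.
    r = m + n
```

Under these assumptions, `addSize 0 0` yields `0`, and only a wrapped (negative) sum is treated as an overflow.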
Commits on Jul 15, 2011
  1. Merge e1bc8a8 into 9e9d83e

    GitHub Merge Button authored
  2. @tibbe

    Bump dependency on integer-gmp

    tibbe authored
Commits on Jul 11, 2011
  1. Change where we look for test data

  2. Update

Commits on Jul 10, 2011
  1. Portable native UTF-8 decoder gives 3.7x faster decoding

    This code is derived from Björn Höhrmann's UTF-8 decoder.  Compared
    to the original Haskell decoder from cac7dbcbc392, it's between
    2.17 and 3.68 times faster.  It's even between 1.18 and 3.58 times
    faster than the improved Haskell decoder from 71ead801296a.
    The x86-specific decoding path gives a substantial win for entirely
    and partly ASCII text, e.g. HTML and XML, at the cost of being about
    17% slower than the portable C decoder for entirely non-ASCII text.
  2. Merge

Commits on Jul 8, 2011
  1. Benchmark the performance of iconv.

    On my Mac, it takes 33ms, vs about 20ms for the Haskell code.
  2. Bump version

  3. Merge

  4. Speed up UTF-8 decoding by a little over 2x

    The previous code was more concise, but alas GHC boxed each Word8
    it read from the ByteString, which resulted in poor performance.
    This mankier code adds (seemingly required) strictness annotations,
    along with a little bit of manual CSE.
    Timing of the DecodeUtf8/Strict benchmark went from 41.8ms to 19.6ms,
    a pleasing improvement.
Commits on Jun 29, 2011
  1. Bump version

  2. Merge

Commits on Jun 28, 2011
  1. Oh noes! I was miscalculating the initial buffer size!

    When performance testing encodeUtf8, I noticed that "ensure" was still
    showing up in the profile, when I expected it not to.
    Turns out I was using a "min" where I should have been using a "max",
    and thus allocating an initial bytestring that would almost always be
    too small, thus forcing reallocations and copying. Boo!
  2. Eliminate unnecessary resizes from encodeUtf8.

    We had been performing a resize any time that (a) we had data to write
    and (b) we got to within 4 bytes of filling the target bytestring.
    This was safe, but suboptimal, as it meant that in the common case of
    encoding ASCII text, we would *always* perform a resize.
    Now, we check the exact number of bytes we need to fit, and resize
    only if they won't fit.  This eliminates resizes for ASCII data, and
    makes them a little less likely for other data.
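The two Jun 28 fixes can be illustrated with a small sketch. All names here (`utf8Width`, `initialSize`, `needsResizeOld`, `needsResize`) are hypothetical and not taken from the library, and the floor of 4 bytes in `initialSize` is an assumed value chosen only to show the `max` versus `min` distinction.

```haskell
import Data.Char (ord)

-- Number of bytes a single Char occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where n = ord c

-- Assumed initial buffer size: one byte per input character, but never
-- below a small floor. Writing min here instead of max would cap the
-- buffer at the floor and force reallocation for almost any input.
initialSize :: Int -> Int
initialSize len = max 4 len

-- Old, conservative rule: resize whenever fewer than 4 bytes remain.
needsResizeOld :: Int -> Int -> Bool
needsResizeOld capacity used = capacity - used < 4

-- Exact rule: resize only if the next character will not fit.
needsResize :: Int -> Int -> Char -> Bool
needsResize capacity used c = used + utf8Width c > capacity
```

For ASCII input in a buffer sized to the input length, `needsResizeOld` fires near the end of the buffer even though every remaining character needs one byte, whereas `needsResize` never fires.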