Commits on Dec 23, 2011
  1. Oops! Back out part of 59aad6977070 - it was wrong

    My assertion that it was safe to skip the "do I have 1 byte available?" check
    was incorrect.
  2. Make encoding slightly faster.

    The improvement mainly comes from dropping a redundant check when decoding
    an ASCII byte.
Commits on Oct 4, 2011
  1. Silence a compiler warning.

Commits on Jul 11, 2011
Commits on Jul 10, 2011
  1. Portable native UTF-8 decoder gives 3.7x faster decoding

    This code is derived from Björn Höhrmann's UTF-8 decoder.  Compared
    to the original Haskell decoder from cac7dbcbc392, it's between
    2.17 and 3.68 times faster.  It's even between 1.18 and 3.58 times
    faster than the improved Haskell decoder from 71ead801296a.
    The x86-specific decoding path gives a substantial win for entirely
    and partly ASCII text, e.g. HTML and XML, at the cost of being about
    17% slower than the portable C decoder for entirely non-ASCII text.
Commits on Jul 8, 2011
  1. Speed up UTF-8 decoding by a little over 2x

    The previous code was more concise, but alas GHC boxed each Word8
    it read from the ByteString, which resulted in poor performance.
    This mankier code adds (seemingly required) strictness annotations,
    along with a little bit of manual CSE.
    Timing of the DecodeUtf8/Strict benchmark went from 41.8ms to 19.6ms,
    a pleasing improvement.
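
As a rough illustration of the strictness annotations mentioned above, here is a minimal sketch of a byte loop with a strict accumulator. It is not the code from this commit: `countCodePoints` is a hypothetical helper that works on a plain list of `Word8` rather than a ByteString, and it assumes well-formed UTF-8 input.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Hypothetical helper: count code points in well-formed UTF-8 by
-- counting every byte that is not a continuation byte (10xxxxxx).
countCodePoints :: [Word8] -> Int
countCodePoints = go 0
  where
    -- The bang pattern forces the accumulator on every iteration, so
    -- no chain of unevaluated (+1) thunks builds up.
    go !n []       = n
    go !n (b : bs)
      | b .&. 0xC0 == 0x80 = go n bs        -- continuation byte
      | otherwise          = go (n + 1) bs  -- lead byte or ASCII
```

Whether GHC actually keeps the values unboxed still depends on optimisation settings; the bang pattern only guarantees that the accumulator is evaluated at each step.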
Commits on Jun 28, 2011
  1. Oh noes! I was miscalculating the initial buffer size!

    When performance testing encodeUtf8, I noticed that "ensure" was
    still showing up in the profile, when I expected it not to.
    Turns out I was using a "min" where I should have been using a "max",
    and thus allocating an initial bytestring that would almost always be
    too small, thus forcing reallocations and copying. Boo!
  2. Eliminate unnecessary resizes from encodeUtf8.

    We had been performing a resize any time that (a) we had data to write
    and (b) we got to within 4 bytes of filling the target bytestring.
    This was safe, but suboptimal, as it meant that in the common case of
    encoding ASCII text, we would *always* perform a resize.
    Now, we check the exact number of bytes we need to fit, and resize
    only if they won't fit.  This eliminates resizes for ASCII data, and
    makes them a little less likely for other data.
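
The two buffer-sizing fixes above can be sketched roughly as follows. All names here (`utf8Width`, `initialSize`, `needsResizeOld`, `needsResize`) are hypothetical and chosen for illustration; the actual encodeUtf8 code works on a mutable buffer and is not reproduced here.

```haskell
import Data.Char (ord)

-- Number of bytes the UTF-8 encoding of a character occupies.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where n = ord c

-- Assumed shape of the initial-size fix: take the larger of the input
-- length and a small floor. Using min here would cap the buffer at the
-- floor and force reallocation for almost any input.
initialSize :: Int -> Int
initialSize len = max 4 len

-- Old, conservative check: resize whenever fewer than 4 bytes remain.
needsResizeOld :: Int -> Int -> Bool
needsResizeOld capacity used = capacity - used < 4

-- New check: resize only if the next character's exact encoding
-- would not fit in the remaining space.
needsResize :: Int -> Int -> Char -> Bool
needsResize capacity used c = used + utf8Width c > capacity
```

Under these assumptions, ASCII input of length n written into an n-byte buffer never triggers `needsResize`, whereas `needsResizeOld` fires as soon as the last few bytes are reached.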
Commits on Mar 16, 2011
  1. Improve error message.

  2. Add decodeUtf8'.

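A short usage sketch for `decodeUtf8'`, assuming the signature found in current releases of the text package (`ByteString -> Either UnicodeException Text`). The wrapper `decodeOrReport` is a hypothetical name used only for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')

-- Hypothetical wrapper: decode UTF-8, turning a decoding failure into
-- a printable message instead of a thrown exception.
decodeOrReport :: B.ByteString -> Either String T.Text
decodeOrReport bs =
  case decodeUtf8' bs of
    Left err -> Left (show err)  -- invalid input is reported as a value
    Right t  -> Right t
```

This is useful when input comes from an untrusted source and invalid bytes should be handled in pure code rather than by catching an exception.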
Commits on Nov 30, 2010
Commits on Oct 14, 2010
  1. Get rid of the old decode function

    extra : convert_revision : 7e57067874ddcc3e6e87160e474005285c488abf
  2. Add a rewrite rule for fusion

    extra : convert_revision : ad8873d09eb7c252ca36bdcfe1b56ad0ad473bca
  3. Write a faster UTF-8 decoder

    extra : convert_revision : 60bd4f818c27426be5f00e5ab9fb9092c369b2ce
  4. Remove old UTF-8 encoding functions

    extra : convert_revision : c10b0a8c51e03f7f3f0d7913eddf1052952a0e8b
  5. Update copyright

    extra : convert_revision : 2ab281eea400bc63ee7dbff76bf715d1def98398
  6. Rewrite encodeUtf8 for speed

    This was inspired by a patch from Simon Meier, who wrote a direct
    implementation of encodeUtf8 using his 'blaze-builder' package.  His code
    showed a very impressive speedup.  My code is similar in both structure
    and performance, its chief difference being that it doesn't require
    extra : convert_revision : 1b338ee3a345ac1e437be1f5d8cd0919d9690c14
Commits on Apr 29, 2010
  1. Change Tom's email address

    extra : convert_revision : e868dbddc8fdaff7a034db9689b0a8b3caeb1010
Commits on Jun 6, 2009
  1. Add controllable error handling and recovery code.

    extra : convert_revision : 3795901067732c91b235f9281f8e3691756dc5d3
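
For readers unfamiliar with controllable error handling, the sketch below uses `decodeUtf8With` and `lenientDecode` as they exist in current releases of the text package. The API introduced by this 2009 commit may have looked different, so treat this as an assumption; `decodeLenient` is a hypothetical name.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical wrapper: decode UTF-8, recovering from invalid input by
-- substituting U+FFFD (the replacement character) instead of failing.
decodeLenient :: B.ByteString -> T.Text
decodeLenient = decodeUtf8With lenientDecode
```

Other handlers from `Data.Text.Encoding.Error`, such as `strictDecode` or `ignore`, can be passed in the same position to select a different recovery policy.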
Commits on May 23, 2009
  1. Update copyrights and maintainers.

    extra : convert_revision : 9b03b888951923de7ea04083728df0641c6d2515
Commits on Feb 27, 2009
  1. Fix Haddocks

    extra : convert_revision : de2ceb4afe3fb2418470beb76bec7ec9b46e6b90
Commits on Feb 24, 2009
  1. Move Utf* modules into Data.Text.Encoding

    extra : convert_revision : 9c17f649ac992fe759e0ab4693e9ec48687283ba
  2. Test the remaining supported encodings

    extra : convert_revision : c6d3dc3494fe5bd49d18cc49a589a03a9f3347d5
Commits on Jan 27, 2009
  1. Split encoding support out into new modules

    extra : convert_revision : 1132eb117b6ebdfe42a897ff200a34c914f415d3