TODO

TODO:

    * back port streams fusion code.
    * show instance for LPS
    * stress testing
    * strictness testing
    * rewrite C code to Haskell
    * eliminate use of -fno-warn-orphans

TODO before full builder release
--------------------------------

  [DONE] rename: encodeWithSize ~~> encodeSizePrefixed
  [DONE] fix naming scheme for varInt encodings
  [DONE] switch to Int64 and Int for representing sizes
  [DONE] remove 'Monoidal' type-class
  [DONE] ensure correspondance of 'Contravariant' with 'contravariant' package
  [DONE] merge BasicEncoding.Extras to BasicEncoding
  [DONE] add 'variable length encoding builders'
  [DONE] add fast decimal float/double conversion based on blaze-textual.
    Our findings are that the 'blaze-textual' float version is always slower
    than Haskell's built-in version due to the 'realToFrac' conversion. For
    'Double's, Haskell's built-in conversion is sometimes faster and sometimes
    slower than 'blaze-textual'. In no case, one function significantly
    outperforms the other function. We therefore use Haskell's built-in
    version instead of maintaining a complicated piece of code that is anyways
    a lot slower than the 'double-conversion' package. The 'BoundedEncoding'
    type will serve perfectly to wrap the functionality of the
    'double-conversion' package.
  [DONE] add intro documentation mentioning lazy bytestring builders
  [DONE] fix documentation in BasicEncoding.Extras
  [DONE] check all documentation
  [DONE] add 'Put' documentation
  [DONE] also export 'Put' transformers from internals: exported from
    hidden "Transformers"
  [DONE] use an allocation strategy for the inner builder in 'putSizePrefixed'
  [DONE] remove 'intXXXBase128LE' encodings. Their semantics is unclear and
    must be determined by the library user.
  [DONE] Upload to Github repo
  [DONE] Make a second call for review.
  [DONE] Comment on trick of using 'ensureFree' to avoid too big closures.
  [DONE] switch back to 'PaddedEncoding Word64' for builder transformers
  [DONE] Use a 'Builder' instead of a 'BoundedEncoding' for terminating chunks.
    Not recommended, as it makes impossible reserving the new buffer size such
    that it fits directly.
  [DONE] document all functions in 'Put' module. Decide which functions to add to
    public API.

  [DONE] investigate performance of 'encodeWithListF E.word8'
  [DONE] improve performance of varInt encodings
  [DONE] benchmark all varInt encodings
  [DONE] improve performance of padded varInt encodings
  [DONE] improve performance of encodeChunked and encodeWithSize
  [DONE] investigate performance of running a one byte builder.
  [DONE] investigate performance of bytestring insertion
  [DONE] improve the Fixed varint encoding performance using the same
    construction as the Google code: not possible due different control flow.
  [DONE] test 'varInt' encodings
  - test with GHC 6.12, 7.0, 7.2, 7.4
  [DONE] consider putting builder transformers into a separate package
    'bytestring-builder-transformers' (its just too experimental):
    no they have a clean semantics, thei are well-documented and form an
    important building block for higher-level libraries. Moreover, they
    strongly depend on the internals of the library and must be maintained
    jointly with it.
  - Reread documentation and check if it guides the user well-enough with
    respect what to expect from what module.
  - release library


Todo items
----------

* check that api again.
    - in particular, unsafeHead/Tail for Char8?
    - scanr,scanr1... in Char8

* would it make sense to move the IO bits into a different module too?
        - System.IO.ByteString
        - Data.ByteString.IO

* can we avoid joinWithByte? 
        - Hard. Can't do it easily with a rule.

* think about Data.ByteString.hGetLines. is it needed in the presence of
    the cheap "lines =<< Data.ByteString.Lazy.getContents" ?

* unchunk, Data.ByteString.Lazy -> [Data.ByteString]
    -  and that'd work for any Lazy.ByteString, not just hGetContents >>= lines

* It might be nice to have a trim MutableByteArray primitive that can release
  the tail of an array back to the GC. This would save copying in cases where
  we choose to realloc to save space. This combined with GC-movable strings
  might improve fragmentation / space usage for the many small strings case.

* if we can be sure there is very little variance then it might be interesting to look 
 into the cases where we're doing slightly worse eg the map/up, filter/up cases
 and why we're doing so much better in the up/up case!?  that one makes no sense
 since we should be doing the exact same thing as the old loopU for the up/up
 case

* then there are the strictness issues eg our current foldl & foldr are
  arguably too strict we could fuse unpack/unpackWith if they wern't so strict

* look at shrinking the chunk size, based on our cache testing.

* think about horizontal fusion (esp. when considering nofib code)

* fuseable reverse.

* 'reverse' is very common in list code, but unnecessary in bytestring
  code, since it takes a symmertric view.
    look to eliminate it with rules. loopUp . reverse --> loopDown

* work out how robust the rules are .

* benchmark against C string library benchmarks

* work out if we can convince ghc to remove NoAccs in map and filter.

* Implement Lazy:
    scanl1
    partition
    unzip

* fix documentation in Fusion.hs

* Prelude Data.ByteString.Lazy> List.groupBy (/=) $ [97,99,103,103]
  [[97,99,103,103]]
  Prelude Data.ByteString.Lazy> groupBy (/=) $ pack [97,99,103,103]
  [LPS ["ac","g"],LPS ["g"]]