Commits on Dec 2, 2010
  1. Reimplementation of huffman-rle encoding using a fixed unit size of 64 bits. This version is extremely hacky

    committed Dec 2, 2010
  2. Merge branch 'master' into huffman

    committed Dec 2, 2010
Commits on Dec 1, 2010
  1. Added code to place markers for Huffman/RLE BWT. Encoding/decoding now functional but needs to be cleaned up and moved to the BWTWriter, not reader. Occurrence calculations not implemented yet.

    committed Dec 1, 2010
Commits on Nov 26, 2010
  1. Cleaned up encoding code

    committed Nov 26, 2010
Commits on Nov 25, 2010
  1. Working symbol encoder/decoder. Encodes an arbitrary stream of symbols into a buffer of unsigned chars using a double-huffman encoding of symbol,runlength pairs.

    committed Nov 25, 2010
Commits on Nov 24, 2010
  1. Merge branch 'huffman' of /nfs/users/nfs_j/js18/work/git_repository/sga into huffman

    committed Nov 24, 2010
  2. Hacked together huffman tree implementation. Good improvement of run length compression. Needs to be refactored/cleaned up

    committed Nov 24, 2010
Commits on Nov 22, 2010
  1. Improvements to experimental huffman encoder. buildHuffman now returns the minimum number of free bits needed to encode a character

    committed Nov 22, 2010
  2. Very experimental implementation of a mixed-unit BWT string. Development only, this does not function as an FM-index.

    committed Nov 22, 2010
Commits on Nov 21, 2010
  1. Started implementation of huffman coded BWT. Added class to hold encoded structure using a simple 3-bit encoder

    committed Nov 21, 2010
  2. Refactored RLUnit into its own class

    committed Nov 21, 2010
  3. Refactored the BWT Markers into their own file. Also moved accumulation code out of RLBWT into the RLUnit

    committed Nov 21, 2010
Commits on Nov 19, 2010
  1. Updated version to 0.9.4. The main difference in this version is an improved strategy for managing the Occurrence array in the BWT, which requires substantially less memory.

    committed Nov 19, 2010
Commits on Nov 18, 2010
  1. Fixed bug in two-tier implementation where the count for the last SmallBlock placed was incorrect.

    committed Nov 18, 2010
  2. Implemented second version of two-tier occurrence array markers.

    This version keeps the symbol counts since the last superblock in a
    2-byte integer as opposed to keeping the count since the last relative block
    in a 1-byte integer. The second approach is faster for most ranges of sample rates
    and can allow even lower memory use than the 1-byte approach. This version
    will be merged into the master branch.
    committed Nov 18, 2010
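
    The second marker layout described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the names CumulativeMarkers, buildCumulative, occCumulative and symbolRank, the five-symbol alphabet, and the 1024/128 spacing are all hypothetical. Only the idea of a 2-byte count measured from the last superblock comes from the commit message.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// All names and constants here are assumptions made for illustration.
constexpr std::size_t kSymbols = 5;       // assumed alphabet: $, A, C, G, T
constexpr std::size_t kSuperRate = 1024;  // assumed superblock spacing
constexpr std::size_t kBlockRate = 128;   // assumed small-marker spacing
static_assert(kSuperRate % kBlockRate == 0, "blocks must tile a superblock");
static_assert(kSuperRate <= 65535, "counts since a superblock must fit in 2 bytes");

inline std::size_t symbolRank(char c) {
    switch (c) {
        case '$': return 0;
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        default:  return 4;  // treated as 'T'
    }
}

struct CumulativeMarkers {
    std::string bwt;
    // Absolute counts at every superblock boundary (8 bytes per symbol).
    std::vector<std::array<std::uint64_t, kSymbols>> superCounts;
    // Counts since the last superblock at every block boundary (2 bytes per symbol).
    std::vector<std::array<std::uint16_t, kSymbols>> blockCounts;
};

CumulativeMarkers buildCumulative(const std::string& bwt) {
    CumulativeMarkers m;
    m.bwt = bwt;
    std::array<std::uint64_t, kSymbols> total{};
    std::array<std::uint16_t, kSymbols> sinceSuper{};
    // Iterate one past the end so a marker exists for position bwt.size().
    for (std::size_t i = 0; i <= bwt.size(); ++i) {
        if (i % kSuperRate == 0) {
            m.superCounts.push_back(total);
            sinceSuper.fill(0);
        }
        if (i % kBlockRate == 0) m.blockCounts.push_back(sinceSuper);
        if (i < bwt.size()) {
            ++total[symbolRank(bwt[i])];
            ++sinceSuper[symbolRank(bwt[i])];
        }
    }
    return m;
}

// Number of occurrences of c in bwt[0, i), for i <= bwt.size().
std::uint64_t occCumulative(const CumulativeMarkers& m, char c, std::size_t i) {
    const std::size_t r = symbolRank(c);
    std::uint64_t count = m.superCounts[i / kSuperRate][r]
                        + m.blockCounts[i / kBlockRate][r];
    // Scan the remaining symbols after the nearest block boundary.
    for (std::size_t p = (i / kBlockRate) * kBlockRate; p < i; ++p)
        if (symbolRank(m.bwt[p]) == r) ++count;
    return count;
}
```

    In this sketch a lookup reads one superblock marker and one block marker and then scans fewer than 128 symbols, so it never has to sum a chain of 1-byte relative counts.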
Commits on Nov 17, 2010
  1. Removed gcc force-inline attributes

    committed Nov 17, 2010
  2. More clean up of two-tier code.

    committed Nov 17, 2010
  3. Cleaned up two-tier code.

    committed Nov 17, 2010
  4. Fixed error in SmallMarker - was using size_t to hold the unitCount when it will be at most 128. Changed to uint8_t, which makes for a huge memory saving.

    Changed default sample rate for LargeMarker to be 1024.
    committed Nov 17, 2010
  5. Complete re-write of how the BWT occurrence array markers are represe…

    We are using a two-tier system where LargeMarkers are placed every 2048 symbols with the absolute count of the number of symbols seen up to that point.
    Every 128 symbols a SmallMarker is placed which holds the count of symbols over the last 128 symbols. This allows us to store these relative counts with
    an 8-bit integer instead of the 64-bit integer used for the absolute counters. The absolute counts every 128 symbols are interpolated from the relative counts.
    This is a much more space efficient representation - thanks to Travis Wheeler for this suggestion.
    This version of the code has the old marker system (absolute counts every 128 symbols) left in and a testing function
    is placed at the end of initializeFMIndex(). This version should only be used for testing/debugging the
    two-tier system. The old system will be removed in the next revision.
    committed Nov 17, 2010
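
    The two-tier layout described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the names TwoTierIndex, buildIndex, occ and rankOf and the five-symbol alphabet are assumptions. Only the 2048/128 spacing, the absolute LargeMarker counts, and the 8-bit relative SmallMarker counts come from the commit message.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// All names here are assumptions made for illustration.
constexpr std::size_t kAlphabet = 5;      // assumed alphabet: $, A, C, G, T
constexpr std::size_t kLargeRate = 2048;  // symbols per LargeMarker
constexpr std::size_t kSmallRate = 128;   // symbols per SmallMarker

inline std::size_t rankOf(char c) {
    switch (c) {
        case '$': return 0;
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        default:  return 4;  // treated as 'T'
    }
}

struct TwoTierIndex {
    std::string bwt;
    // Absolute counts before each 2048-symbol boundary.
    std::vector<std::array<std::uint64_t, kAlphabet>> large;
    // Counts within each complete 128-symbol block (at most 128, so 8 bits suffice).
    std::vector<std::array<std::uint8_t, kAlphabet>> small;
};

TwoTierIndex buildIndex(const std::string& bwt) {
    TwoTierIndex idx;
    idx.bwt = bwt;
    std::array<std::uint64_t, kAlphabet> total{};
    std::array<std::uint8_t, kAlphabet> block{};
    for (std::size_t i = 0; i < bwt.size(); ++i) {
        if (i % kLargeRate == 0) idx.large.push_back(total);
        ++total[rankOf(bwt[i])];
        ++block[rankOf(bwt[i])];
        if ((i + 1) % kSmallRate == 0) {
            idx.small.push_back(block);
            block.fill(0);
        }
    }
    // Ensure a marker exists for a query at position bwt.size().
    if (bwt.size() % kLargeRate == 0) idx.large.push_back(total);
    return idx;
}

// Number of occurrences of c in bwt[0, i), for i <= bwt.size().
std::uint64_t occ(const TwoTierIndex& idx, char c, std::size_t i) {
    const std::size_t r = rankOf(c);
    const std::size_t largeIdx = i / kLargeRate;
    std::uint64_t count = idx.large[largeIdx][r];
    // Interpolate: add the relative counts of the full blocks since the LargeMarker.
    const std::size_t perLarge = kLargeRate / kSmallRate;
    for (std::size_t s = largeIdx * perLarge; s < i / kSmallRate; ++s)
        count += idx.small[s][r];
    // Scan the partial block that ends at position i.
    for (std::size_t p = (i / kSmallRate) * kSmallRate; p < i; ++p)
        if (rankOf(idx.bwt[p]) == r) ++count;
    return count;
}
```

    With these rates a lookup sums at most 15 one-byte counters and scans fewer than 128 symbols, while the 8-byte absolute counters are stored only once per 2048 symbols.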