Commits on Nov 19, 2010
Commits on Nov 18, 2010
  1. Fixed bug in two-tier implementation where the count for the last SmallBlock placed was incorrect.

    committed Nov 18, 2010
  2. Implemented second version of two-tier occurrence array markers.

    This version keeps the symbol counts since the last superblock in a
    2-byte integer as opposed to keeping the count since the last relative block
    in a 1-byte integer. The second approach is faster for most ranges of sample rates
    and can allow even lower memory use than the 1-byte approach. This version
    will be merged into the master branch.
    committed Nov 18, 2010
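
The second marker scheme described above can be sketched roughly as follows. This is a minimal illustration, not the SGA code: the names `SuperblockOcc`, `kSuperRate` and `kSmallRate` are hypothetical, the sample rates are assumed values, and only one symbol is tracked to keep the example short. Each small marker stores the count since the enclosing superblock in a 2-byte integer, so a lookup adds one superblock value and one small-marker value and then scans fewer than 128 symbols.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Assumed sample rates. kSuperRate must be a multiple of kSmallRate and
// must stay below 65536 so that every relative count fits in a uint16_t.
constexpr std::size_t kSuperRate = 8192;
constexpr std::size_t kSmallRate = 128;

// Hypothetical single-symbol occurrence index.
struct SuperblockOcc {
    std::string text;
    char symbol;
    std::vector<std::uint64_t> super; // absolute count at each superblock start
    std::vector<std::uint16_t> small; // count since the enclosing superblock start

    SuperblockOcc(std::string s, char c) : text(std::move(s)), symbol(c) {
        std::uint64_t total = 0;   // occurrences seen in text[0, i)
        std::uint64_t atSuper = 0; // value of total at the last superblock
        for (std::size_t i = 0; i <= text.size(); ++i) {
            if (i % kSuperRate == 0) { super.push_back(total); atSuper = total; }
            if (i % kSmallRate == 0)
                small.push_back(static_cast<std::uint16_t>(total - atSuper));
            if (i < text.size() && text[i] == symbol) ++total;
        }
    }

    // Number of occurrences of symbol in text[0, i), for i <= text.size().
    std::uint64_t occ(std::size_t i) const {
        std::uint64_t n = super[i / kSuperRate] + small[i / kSmallRate];
        for (std::size_t p = (i / kSmallRate) * kSmallRate; p < i; ++p)
            if (text[p] == symbol) ++n;
        return n;
    }
};
```

Compared with 1-byte counts relative to the previous small block, this layout needs no summation over intermediate markers at query time, at the cost of one extra byte per stored count.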
Commits on Nov 17, 2010
  1. Removed gcc force-inline attributes

    committed Nov 17, 2010
  2. More clean up of two-tier code.

    committed Nov 17, 2010
  3. Cleaned up two-tier code.

    committed Nov 17, 2010
  4. Fixed error in SmallMarker - was using size_t to hold the unitCount when it will be at most 128. Changed to uint8_t, which makes for a huge memory saving.

    Changed default sample rate for LargeMarker to be 1024.
    committed Nov 17, 2010
  5. Complete re-write of how the BWT occurrence array markers are represe…

    We are using a two-tier system where LargeMarkers are placed every 2048 symbols with the absolute count of the number of symbols seen up to that point.
    Every 128 symbols a SmallMarker is placed which holds the count of each symbol over the last 128 symbols. This allows us to store these relative counts with
    an 8-bit integer instead of the 64-bit integer needed for the absolute counters. The absolute counts every 128 symbols are interpolated from the relative counts.
    This is a much more space efficient representation - thanks to Travis Wheeler for this suggestion.
    This version of the code has the old marker system (absolute counts every 128 symbols) left in and a testing function
    is placed at the end of initializeFMIndex(). This version should only be used for testing/debugging the
    two-tier system. The old system will be removed in the next revision.
    committed Nov 17, 2010
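
The two-tier layout described in the last commit above can be sketched as follows. This is only an illustration under stated assumptions: `TwoTierOcc`, `rankOf` and the five-symbol alphabet are hypothetical names and choices, not the SGA classes; the 2048 and 128 sample rates are taken from the commit message. Large markers hold absolute 64-bit counts, small markers hold 8-bit counts for the preceding 128 symbols, and the absolute count at a 128-symbol boundary is recovered by summing the small markers placed since the last large marker.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t kLargeRate = 2048; // absolute counts every 2048 symbols
constexpr std::size_t kSmallRate = 128;  // relative counts every 128 symbols
constexpr std::size_t kAlphabet = 5;     // assumed alphabet: $, A, C, G, T

// Map a symbol to an array index; anything unrecognised is treated as '$'.
inline std::size_t rankOf(char c) {
    switch (c) {
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        case 'T': return 4;
        default:  return 0;
    }
}

struct TwoTierOcc {
    std::string bwt;
    // large[k]: counts of each symbol in bwt[0, k * 2048)
    std::vector<std::array<std::uint64_t, kAlphabet>> large;
    // small[k]: counts in bwt[(k - 1) * 128, k * 128); small[0] is all zero
    std::vector<std::array<std::uint8_t, kAlphabet>> small;

    explicit TwoTierOcc(std::string s) : bwt(std::move(s)) {
        std::array<std::uint64_t, kAlphabet> total{};
        std::array<std::uint8_t, kAlphabet> block{};
        for (std::size_t i = 0; i <= bwt.size(); ++i) {
            if (i % kSmallRate == 0) { small.push_back(block); block.fill(0); }
            if (i % kLargeRate == 0) large.push_back(total);
            if (i < bwt.size()) {
                ++total[rankOf(bwt[i])];
                ++block[rankOf(bwt[i])]; // at most 128, so it fits in 8 bits
            }
        }
    }

    // Occurrences of c in bwt[0, i), for i <= bwt.size().
    std::uint64_t occ(char c, std::size_t i) const {
        const std::size_t r = rankOf(c);
        const std::size_t li = i / kLargeRate;
        const std::size_t si = i / kSmallRate;
        std::uint64_t n = large[li][r];
        // Interpolate: add the relative counts placed after the large marker.
        for (std::size_t k = li * (kLargeRate / kSmallRate) + 1; k <= si; ++k)
            n += small[k][r];
        // Scan the remaining (< 128) symbols directly.
        for (std::size_t p = si * kSmallRate; p < i; ++p)
            if (rankOf(bwt[p]) == r) ++n;
        return n;
    }
};
```

With these rates a query sums at most 15 small markers and scans at most 127 symbols, while most stored counts take one byte per symbol instead of eight.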
Commits on Nov 16, 2010
  1. Rewrote AlphaCount class to take in a template parameter indicating the storage size. Replaced all existing uses of AlphaCount in the code with AlphaCount64, the 64-bit storage version.

    committed Nov 16, 2010
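
A counter class parameterised on its storage type might look roughly like the sketch below. The class name `AlphaCountSketch`, its member functions, the five-symbol alphabet and the `AlphaCount8` alias are assumptions made for illustration; only the `AlphaCount64` name comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-symbol counter whose storage width is a template parameter.
template <typename Storage>
class AlphaCountSketch {
public:
    static constexpr std::size_t kSize = 5; // assumed alphabet: $, A, C, G, T

    void increment(char b) { ++counts_[index(b)]; }
    Storage get(char b) const { return counts_[index(b)]; }

private:
    // Anything unrecognised is counted as '$'.
    static std::size_t index(char b) {
        switch (b) {
            case 'A': return 1;
            case 'C': return 2;
            case 'G': return 3;
            case 'T': return 4;
            default:  return 0;
        }
    }
    Storage counts_[kSize] = {};
};

using AlphaCount64 = AlphaCountSketch<std::uint64_t>; // absolute counts
using AlphaCount8 = AlphaCountSketch<std::uint8_t>;   // small relative counts (assumed alias)
```

The narrow instantiation occupies 5 bytes per marker against 40 bytes for the 64-bit one, which is where a two-tier marker scheme gets its saving.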
Commits on Nov 15, 2010
  1. Fixed typo in README spotted by Matthias Haimel. Added instructions for running the program if it was downloaded from GitHub.

    committed Nov 15, 2010
Commits on Nov 14, 2010
  1. Added --run-lengths parameter to sga-stats to print the run length distribution of the BWT

    committed Nov 14, 2010
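
As a rough illustration of what a run length distribution of a BWT string is, the sketch below counts how many maximal runs of identical symbols have each length. The function name `runLengthDistribution` is hypothetical, and the sketch operates on a plain `std::string` rather than on SGA's BWT representation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Returns a map from run length to the number of maximal runs of that length.
std::map<std::size_t, std::size_t> runLengthDistribution(const std::string& bwt) {
    std::map<std::size_t, std::size_t> dist;
    std::size_t i = 0;
    while (i < bwt.size()) {
        std::size_t j = i;
        // Extend j to the end of the run that starts at i.
        while (j < bwt.size() && bwt[j] == bwt[i]) ++j;
        ++dist[j - i];
        i = j;
    }
    return dist;
}
```

For the string `AAACCAGTTT` the runs are `AAA`, `CC`, `A`, `G` and `TTT`, so lengths 1 and 3 each occur twice and length 2 occurs once.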
Commits on Nov 10, 2010
  1. Minor formatting change in configure

    committed Nov 10, 2010
  2. Added --with-hoard=PATH option to configure to allow the use of the Hoard memory allocator.

    committed Nov 10, 2010
Commits on Nov 9, 2010
  1. Fixed bug in the bubble popper. The counter would never be incremented so it would always be reported that no bubbles were popped.

    committed Nov 9, 2010
Commits on Nov 8, 2010
Commits on Nov 6, 2010
  1. Added new statistics to sga-stats. Now outputs the estimated error rate in the reads and the mean overlap depth.

    committed Nov 6, 2010
Commits on Nov 5, 2010
Commits on Nov 4, 2010
  1. Rewrote Util/HashMap.h logic to explicitly define the StringHasher function. This is to fix a problem where tr1::unordered_map was available but the sparsehash was still trying to use __gnu_cxx::hash<std::string>, which does not exist.

    committed Nov 4, 2010
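
Defining the string hasher explicitly, rather than relying on whichever default hash the available headers provide, could look something like this sketch. Only the `StringHasher` name comes from the commit message; the FNV-1a body, the `StringIntMap` typedef, and the use of `std::unordered_map` in place of `tr1::unordered_map` or sparsehash are assumptions made so the example compiles with a current compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Explicit hash functor for std::string (FNV-1a, chosen for illustration).
struct StringHasher {
    std::size_t operator()(const std::string& s) const {
        std::uint64_t h = 14695981039346656037ULL; // FNV-1a 64-bit offset basis
        for (unsigned char c : s) {
            h ^= c;
            h *= 1099511628211ULL; // FNV 64-bit prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Passing the functor as the third template argument means the container
// never looks for a library-provided hash of std::string.
typedef std::unordered_map<std::string, int, StringHasher> StringIntMap;
```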
Commits on Oct 27, 2010
  1. Update the new sga-connect program to mark vertices in the graph that are covered by a pe-walk

    committed Oct 27, 2010
  2. Implemented gmap subprogram which is a very basic read-read mapper.

    Currently this is used to map reads that were rmdup'd into an unmerged graph
    for use in the connect program.
    committed Oct 27, 2010
Commits on Oct 26, 2010
  1. Added new subprogram sga-stats which prints out a histogram of the kmer counts for a read set.

    committed Oct 26, 2010
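
A histogram of k-mer counts answers the question "how many distinct k-mers occur exactly n times in the read set?". The sketch below computes it naively with a hash table; the function name `kmerCountHistogram` is hypothetical, and this is not a claim about how sga-stats itself does the counting.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Returns a map from occurrence count to the number of distinct k-mers
// that occur that many times across all reads.
std::map<std::size_t, std::size_t> kmerCountHistogram(
        const std::vector<std::string>& reads, std::size_t k) {
    std::map<std::size_t, std::size_t> hist;
    if (k == 0) return hist;

    // First pass: count every k-mer occurrence.
    std::unordered_map<std::string, std::size_t> counts;
    for (const std::string& r : reads)
        for (std::size_t i = 0; i + k <= r.size(); ++i)
            ++counts[r.substr(i, k)];

    // Second pass: bin the distinct k-mers by their count.
    for (const auto& kv : counts) ++hist[kv.second];
    return hist;
}
```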
Commits on Oct 25, 2010
  1. Added new output file to sga-connect to record the pe reads that could not be connected

    Added new parameter to sga-connect to specify the maximum distance to search for
    committed Oct 25, 2010
Commits on Oct 21, 2010
  1. Implemented sga qc subprogram. This program looks for, and discards, problematic reads. Right now, the qc check requires each read to have a tiling of high confidence k-mers (with a short kmer length).

    committed Oct 21, 2010
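
One possible reading of the tiling check is that every base of a read must be covered by at least one k-mer whose count meets a threshold. The sketch below implements that interpretation only; the function name `hasSolidTiling`, the `minCount` threshold and the precomputed `counts` table are all assumptions and may differ from what sga qc actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Returns true if every position of the read lies inside at least one k-mer
// whose count in `counts` is >= minCount.
bool hasSolidTiling(const std::string& read, std::size_t k, std::size_t minCount,
                    const std::unordered_map<std::string, std::size_t>& counts) {
    if (k == 0 || read.size() < k) return false;
    std::size_t covered = 0; // positions [0, covered) are covered so far
    for (std::size_t i = 0; i + k <= read.size(); ++i) {
        // A k-mer starting past `covered` can no longer close the gap.
        if (i > covered) return false;
        auto it = counts.find(read.substr(i, k));
        if (it != counts.end() && it->second >= minCount) covered = i + k;
    }
    return covered == read.size();
}
```

A read failing this test would be the kind of problematic read the check discards, for example one whose final bases only appear in low-count k-mers.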
Commits on Oct 18, 2010
Commits on Oct 15, 2010