v1.4.1

gatb-admin released this on Dec 14, 2017

This is a bugfix release.

  • fixed a segfault in some multi-threaded situations.
  • removed some files to reduce the size of the distribution.
  • fixed a bug with the -storage-type file option.

v1.4.0

gatb-admin released this on Nov 20, 2017

  • Integration of the Leon compressor into GATB-Core

  • Time and memory optimisations :

    • Faster k-mer counting (inspired by KMC3 but not yet as fast :)

    • More efficient graph representation using compressed vectors (in GraphUnitigs.cpp)

    • Faster unitigs compaction (engineering improvements in BCALM code)

    • New compact encoding scheme to load the abundance values in memory (encoded on 8 bits, value range = 0 to 50k with 5% max error)
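
The release note does not describe the 8-bit encoding itself. As a rough illustration of how one byte can cover 0 to 50k within a bounded relative error, here is a minimal sketch that uses buckets whose width grows with the value. The class name `AbundanceCodec`, the 1.05 growth factor and the midpoint decoding are assumptions made for this example, not GATB-Core's actual scheme.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 8-bit codec for k-mer abundances (NOT GATB-Core's real scheme).
// Each code maps to a bucket of values; bucket width grows by about 5% of the
// bucket's lower bound, so small counts are exact and large ones approximate.
class AbundanceCodec {
public:
    static constexpr std::uint32_t kMaxValue = 50000;  // larger values saturate

    AbundanceCodec() {
        std::uint32_t lo = 0;
        for (std::size_t i = 0; i < lower_.size(); ++i) {
            lower_[i] = lo;
            // The next bucket starts 5% higher, or at least one count higher.
            const auto grown = static_cast<std::uint32_t>(lo * 1.05);
            lo = std::max(lo + 1, grown);
        }
    }

    // Returns the index of the last bucket whose lower bound is <= value.
    std::uint8_t encode(std::uint32_t value) const {
        if (value > kMaxValue) value = kMaxValue;
        const auto it = std::upper_bound(lower_.begin(), lower_.end(), value);
        return static_cast<std::uint8_t>(it - lower_.begin() - 1);
    }

    // Returns the midpoint of the bucket as the approximate abundance.
    std::uint32_t decode(std::uint8_t code) const {
        const std::uint32_t lo = lower_[code];
        const std::uint32_t hi = (code < 255) ? lower_[code + 1] - 1 : lo;
        return lo + (hi - lo) / 2;
    }

private:
    std::array<std::uint32_t, 256> lower_{};  // lower bound of each bucket
};
```

With this layout, `decode(encode(v))` returns `v` exactly for small counts and a nearby bucket midpoint for large ones, at a cost of one byte per k-mer.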

  • Parameterizable graph simplification steps (see Graph.hpp and Minia): optional tip-clipping, bulge removal and erroneous connection removal

  • Preliminary support for loading unitigs (in GraphUnitigs.cpp) from a GFA1 graph format generated by BCALM (using scripts/convertToGFA.py in BCALM repository)

  • Adding new ways to compile, making compilation easier :

    • Added a simple makefile to compile a GATB tool without CMake (see examples/Makefile)

    • Added support for Docker. Using docker/Dockerfile one can build a docker image containing GATB-core.

    • Two new ways to compile the example code snippets:

      • cmake -DGATB_CORE_INCLUDE_EXAMPLES=True ..
        or
      • cd example ; make [folder]/[examplename.cpp]. For instance, make kmer/kmer2 will compile kmer2.cpp
  • Various bugfixes

v1.3.0

gatb-admin released this on Mar 16, 2017

Summary

  • A new graph object is introduced: GraphUnitigs, optimized to traverse unitigs but not to query individual kmers.
  • A few graph API functions changed.
  • Updated MPHF and HDF5.
  • This release now requires your compiler to be C++11-compatible.

Details

  • Tech notice

    • Compiling the GATB-Core library now requires a C++11-capable compiler.

    • CMake 3.1.0 is the minimum release of CMake required to compile GATB-Core.

    • HDF5 library (used for data storage) upgraded to release 1.8.18.

    • Parameters "-mphf none", "-mphf emphf" and "-mphf boophf" and variable WITH_MPHF are deprecated. Please remove them from your applications (e.g. in Graph::create()). BooPHF is now the default MPHF object and it is always compiled. Emphf has been removed from the library.

    • Debug compilation is now done using the standard CMake rule "-DCMAKE_BUILD_TYPE=Debug", instead of "-Ddebug=1".

  • API changes

    • Developers, please pay attention to these breaking changes:

      • Graph::Vector is now GraphVector
      • Graph::Iterator is now GraphIterator
      • Graph::create() no longer accepts '-mphf ...' (see Tech notice, above)
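
Applications that build their Graph::create() option string at run time may need to drop the deprecated "-mphf" parameter. The following is a minimal sketch of such a clean-up step. The helper name `stripMphfOption` is hypothetical (it is not part of GATB-Core), and it assumes options are separated by whitespace.

```cpp
#include <sstream>
#include <string>

// Hypothetical migration helper (not part of GATB-Core): removes
// "-mphf <value>" from a whitespace-separated option string, so the result
// can be passed to a GATB-Core release that no longer accepts that parameter.
std::string stripMphfOption(const std::string& options) {
    std::istringstream in(options);
    std::string token;
    std::string result;
    while (in >> token) {
        if (token == "-mphf") {
            in >> token;  // discard the value that follows the flag
            continue;
        }
        if (!result.empty()) result += ' ';
        result += token;
    }
    return result;
}
```

For instance, "-in reads.fa -mphf emphf -kmer-size 31" would be reduced to "-in reads.fa -kmer-size 31" (the file name and values here are only illustrative).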
  • New features

    • New GraphUnitigs class that offers a de Bruijn graph representation based on unitigs (created by BCALM2) loaded in memory. It has the same API as the Graph class although some functions aren't implemented, as accessing a node that is not an extremity of a unitig isn't supported in this representation. The representation is designed to traverse unitigs quickly, skipping over all non-branching nodes. This representation doesn't use the Bloom filter nor the MPHF. To use this representation, have a look at Minia's code: https://github.com/GATB/minia/blob/ee00a34f1a49a1fcdd757e0bdaf7d03190896322/src/Minia.cpp#L116

    • New functions to traverse the graph have been added. See simplePath* in Graph.hpp. These functions are mostly designed to take advantage of GraphUnitigs, and they have the same API in Graph too. They will also replace the Traversal class. Partial compatibility with the original Graph class has been implemented so far.

    • BooPHF is now the default MPHF object used by GATB-Core

    • In addition to HDF5, we introduce experimental support for a raw file format. It was made for two reasons: to avoid potential memory leaks due to HDF5 (unclear at this point), and to avoid HDF5 file corruption (when a job is interrupted after k-mer counting, the h5 file containing the k-mer counts sometimes cannot be re-opened). The format is experimental, so use it at your own risk. It holds essentially the same content as the previous HDF5 format, but with each dataset stored in its own file. Also, JSON is used instead of XML for structured configuration. To enable this format, pass "-storage-type file" in your configuration string (e.g. in Graph::create()).
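
To illustrate what traversing a simple path (a unitig) means in the GraphUnitigs and simplePath* items above, here is a toy sketch on a de Bruijn graph stored as a set of strings. It is not GATB-Core code: the function names are made up for this example, and it assumes forward-strand k-mers only, with no reverse complements.

```cpp
#include <set>
#include <string>
#include <vector>

using KmerSet = std::set<std::string>;

// K-mers of the set that extend `kmer` by one nucleotide on the right.
std::vector<std::string> successors(const KmerSet& kmers, const std::string& kmer) {
    std::vector<std::string> out;
    for (char c : std::string("ACGT")) {
        const std::string next = kmer.substr(1) + c;
        if (kmers.count(next)) out.push_back(next);
    }
    return out;
}

// K-mers of the set that extend `kmer` by one nucleotide on the left.
std::vector<std::string> predecessors(const KmerSet& kmers, const std::string& kmer) {
    std::vector<std::string> out;
    for (char c : std::string("ACGT")) {
        const std::string prev = c + kmer.substr(0, kmer.size() - 1);
        if (kmers.count(prev)) out.push_back(prev);
    }
    return out;
}

// Follows non-branching nodes to the right of `start` and returns the
// sequence spelled by the path. Stops at a dead end, a branch, a merge
// point, or when a k-mer is seen twice (cycle).
std::string simplePathForward(const KmerSet& kmers, const std::string& start) {
    std::string path = start;
    std::string current = start;
    KmerSet visited{start};
    while (true) {
        const std::vector<std::string> next = successors(kmers, current);
        if (next.size() != 1) break;                          // dead end or branch
        if (predecessors(kmers, next[0]).size() != 1) break;  // merge point
        if (!visited.insert(next[0]).second) break;           // cycle guard
        current = next[0];
        path += current.back();
    }
    return path;
}
```

A unitig-based representation stores the result of such walks once, so a traversal can jump from one end of a unitig to the other instead of visiting every intermediate k-mer.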

v1.2.2

gatb-admin released this on Jul 20, 2016

GATB-Core version 1.2.2, release notes

This is a bug-fix release :

  • fixed a compilation issue with old versions of the clang compiler (prior to clang 4.3 on mac). This gatb-core release is the last one to officially support clang versions older than 4.3 on mac and 3.2 on linux.

v1.2.1

gatb-admin released this on Jun 28, 2016

GATB-Core version 1.2.1, release notes

This is a bug-fix release :

  • bug fixes when MPHF is queried on a false positive node.
  • bug fixes that caused "Pool allocation failed" on some large instances.
  • fixed some compilation issues regarding the clang version (version numbers are inconsistent between mac and linux).
  • fixed include problem in binary distribution that caused undue dependency on boost.

Released on 2016-06-28/10:42:44

v1.2.0

gatb-admin released this on May 24, 2016

GATB-Core version 1.2.0, release notes

Assembly-inspired de Bruijn graph simplifications

// removes tips, bubbles and erroneous connections,
// similar to some of the algorithms implemented in the SPAdes assembler
graph.simplify();

Faster graph traversal can be activated using a single command.

// allocates 1 byte/node to precompute adjacency for each node
// in the MPHF.
// Faster graph traversal (especially using neighbors()).
graph.precomputeAdjacency();
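
The note above only states that adjacency costs 1 byte per node. One plausible layout, shown here purely as an assumption, uses 4 bits for the possible outgoing nucleotides and 4 bits for the incoming ones. The sketch below computes such a byte on a toy k-mer set; the function names and the forward-strand-only model are not taken from GATB-Core.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>

// Hypothetical adjacency byte (the bit layout is an assumption):
//   bits 0..3 : a successor exists when appending A, C, G, T
//   bits 4..7 : a predecessor exists when prepending A, C, G, T
std::uint8_t adjacencyByte(const std::unordered_set<std::string>& kmers,
                           const std::string& kmer) {
    const std::string bases = "ACGT";
    std::uint8_t bits = 0;
    for (int i = 0; i < 4; ++i) {
        const std::string next = kmer.substr(1) + bases[i];
        const std::string prev = bases[i] + kmer.substr(0, kmer.size() - 1);
        if (kmers.count(next)) bits |= static_cast<std::uint8_t>(1u << i);
        if (kmers.count(prev)) bits |= static_cast<std::uint8_t>(1u << (i + 4));
    }
    return bits;
}

// Number of successors, read from the precomputed byte alone.
int outDegree(std::uint8_t bits) {
    int n = 0;
    for (int i = 0; i < 4; ++i) n += (bits >> i) & 1;
    return n;
}
```

Once the byte is stored, a neighbor query reduces to a few bit tests instead of up to eight membership lookups, which is the kind of saving the precomputation is meant to provide.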

Improvements in MPHF and kmer counting

  • New implementation for the minimal perfect hash function (switched from emphf to BooPHF)
  • Non-canonical k-mer counting is supported via "cmake -DNONCANONICAL=1"
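
For readers unfamiliar with the term: in canonical counting, a k-mer and its reverse complement are counted as the same item, represented by the smaller of the two. A minimal sketch follows; it assumes plain lexicographic order on strings, whereas GATB-Core works on its own binary encoding, so the chosen representative may differ.

```cpp
#include <algorithm>
#include <string>

// Complement of a single nucleotide; anything else maps to 'N'.
char complement(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// Reverses the k-mer, then complements each nucleotide.
std::string reverseComplement(const std::string& kmer) {
    std::string rc(kmer.rbegin(), kmer.rend());
    std::transform(rc.begin(), rc.end(), rc.begin(), complement);
    return rc;
}

// Canonical form: the smaller of the k-mer and its reverse complement
// (lexicographic order is an assumption made for this example).
std::string canonical(const std::string& kmer) {
    const std::string rc = reverseComplement(kmer);
    return std::min(kmer, rc);
}
```

Non-canonical counting skips the last step and counts each k-mer exactly as it is read.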

Breaking API changes

  • neighbors<Node>(..) becomes neighbors(..)
  • neighbors<Edge>(..) becomes neighborsEdge(..)
  • iterator<Node>(..) becomes iterator(..)
  • iterator<BranchingNode>(..) becomes iteratorBranching(..)
  • node.kmer.get<T>() becomes node.template getKmer<T>()
  • successors<Node>(..) becomes successors(..)
  • const Node& becomes Node& (as MPHF indices are now cached in Node objects)

and so on for all graph functions:
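- xxx<Node>(..) becomes xxx(..),
- xxx<Edge>(..) becomes xxxEdge(..),
- xxx<BranchingNode>(..) becomes xxxBranching(..),
- xxx<BranchingEdge>(..) becomes xxxBranchingEdge(..).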

Technicalities

  • The basic kmer type (Kmer<>::Type) no longer has a constructor. Use [kmer].setVal(0) to set the value of the variable [kmer] to zero.

For instance, the following code:

optimum = Kmer<span>::Type(0)

becomes:

optimum.setVal(0);
  • Graph is now a templated object (GraphTemplate<Node_t, Edge_t, GraphDataVariant_t>) behind the scenes. However this change is transparent to users of previous versions of GATB-core, as compatibility with the Graph class is preserved.
  • bug fixes in how queries with dir=DIR_INCOMING are handled.
  • various minor bug fixes

CMake files update only

cdeltel released this on Feb 8, 2016

This is a maintenance release that does not impact GATB-Core. Please use the v1.1.0 binaries.