A succinct colored dBG representation

Rainbowfish

A succinct data structure for storing and querying colored de Bruijn graphs (dBGs).

Note: Rainbowfish builds on the BOSS representation of the dBG, as implemented in VARI. Currently, installing and using Rainbowfish requires first installing VARI properly and then checking out the Rainbowfish repository as a subdirectory of that project.

Installation Guide

Dependencies:

  1. VARI and all its dependencies:
    1. KMC2
    2. sdsl-lite
    3. stxxl
    4. tclap
  2. sparsepp

How to install:

  1. Go through all the steps of installing VARI.
  2. git clone sparsepp into 3rd_party_inst/include.
  3. Run make in the cosmo directory.
  4. git clone rainbowfish inside the main cosmo directory.
  5. cd into the rainbowfish sub-directory and build:
    • > mkdir build
    • > cd build
    • > cmake ..
    • > make
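
The installation steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: `COSMO_DIR` (the path to an already installed VARI/cosmo checkout), the `run`, `build_in`, and `install_rainbowfish` helpers, and the sparsepp clone URL are not given in this README. The script defaults to a dry run that only prints each step; set `DRY_RUN=0` to actually execute them.

```shell
# Assumption: COSMO_DIR points at the VARI (cosmo) checkout, already built.
COSMO_DIR="${COSMO_DIR:-cosmo}"
# Dry run by default: print the steps instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical helper: configure and build inside the given build directory.
build_in() {
  (cd "$1" && cmake .. && make)
}

install_rainbowfish() {
  # Step 2: clone sparsepp into 3rd_party_inst/include (URL is an assumption).
  run git -C "$COSMO_DIR/3rd_party_inst/include" clone https://github.com/greg7mdp/sparsepp
  # Step 3: run make in the cosmo directory.
  run make -C "$COSMO_DIR"
  # Step 4: clone rainbowfish inside the main cosmo directory.
  run git -C "$COSMO_DIR" clone https://github.com/COMBINE-lab/rainbowfish
  # Step 5: build rainbowfish out of tree.
  run mkdir -p "$COSMO_DIR/rainbowfish/build"
  run build_in "$COSMO_DIR/rainbowfish/build"
}

install_rainbowfish
```

Running it unchanged prints the five planned steps, which makes it easy to check the paths before anything is cloned or built.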

How to run:

  1. Run the first four steps of the VARI workflow (building the KMC2 files and sorting the k-mers), then build the color matrix with the "cosmo-build" command.

To compare Rainbowfish (rb) results with VARI, you also need to run VARI's "pack-color" command.

  2. Pack the colors using the color file created by cosmo and save them to disk:
> rb-pack-color kmc2_list_1000.colors 1000 <Address to bitvector dir> <1pass/2pass>
  3. Validate our (color, edge) queries against VARI's:
> rb-validate kmc2_list_1000.dbg kmc2_list_1000.colors.sd_vector <Address to bitvector dir> <validation-type> > rb-validate 

validation-type must be one of the following words:

  • compare (to compare results with cosmo),
  • query (to go over all pairs of edge/color sequentially),
  • cosmo-query (to go over all pairs of edge/color on cosmo data structure),
  • random-query (to go over a random pair of edge/color)
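
As a small illustration of the four accepted words, the sketch below checks a validation-type before assembling an rb-validate command line. `rb_validate_cmd` is a hypothetical helper, not part of Rainbowfish, and `bitvectors` stands in for the bitvector directory; the function only prints the command and does not run it.

```shell
# Hypothetical helper: validate the mode, then print the rb-validate command.
rb_validate_cmd() {
  dbg="$1"; colors="$2"; bvdir="$3"; mode="$4"
  case "$mode" in
    # The four validation types listed in this README.
    compare|query|cosmo-query|random-query) ;;
    *) echo "unknown validation-type: $mode" >&2; return 1 ;;
  esac
  echo "rb-validate $dbg $colors $bvdir $mode"
}

# "bitvectors" is a placeholder for the bitvector directory.
rb_validate_cmd kmc2_list_1000.dbg kmc2_list_1000.colors.sd_vector bitvectors compare
```

Rejecting an unknown word up front avoids starting a long run with a mistyped mode.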
  4. Find bubbles:
> rb-find-bubble -a 1 -b 2 ecoli6_63kmc2_list.dbg ecoli6_63kmc2_list.colors.sd_vector <Address to bitvector dir> <type of the bitvector (uuu/uuc/ucc/ccc)> > rb-bubbles

The bitvector type has one letter for each of the three bitvectors, in the order label, boundary, and equivalence table: c means that bitvector is compressed and u means it is uncompressed. For example, ucc means the label bitvector is uncompressed, while the boundary and equivalence table bitvectors are both compressed. As in the paper, we use ucc.
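
To make the encoding concrete, here is a minimal sketch that spells out a bitvector type string. `describe_bv_type` is a hypothetical helper, not part of Rainbowfish; it accepts only the four types listed above (uuu, uuc, ucc, ccc).

```shell
# Hypothetical helper: explain a three-letter bitvector type such as "ucc".
describe_bv_type() {
  case "$1" in
    # Only the four types named in this README are accepted.
    uuu|uuc|ucc|ccc) ;;
    *) echo "unsupported bitvector type: $1" >&2; return 1 ;;
  esac
  rest="$1"
  # Letters map, in order, to the label, boundary, and equivalence table.
  for name in "label" "boundary" "equivalence table"; do
    first="${rest%"${rest#?}"}"   # first remaining letter
    rest="${rest#?}"              # drop it for the next iteration
    if [ "$first" = "c" ]; then state="compressed"; else state="uncompressed"; fi
    echo "$name: $state"
  done
}

describe_bv_type ucc
# label: uncompressed
# boundary: compressed
# equivalence table: compressed
```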
