Normalized Conditional Compression Distance

NCCD

Programs to compute the NCCD (Normalized Conditional Compression Distance) and perform whole-genome phylogenomics on 48 bird species. The distance is computed with a state-of-the-art genomic compressor based on a mixture of finite-context models.

INSTALLATION

Simply run:

wget https://github.com/pratas/NCCD/archive/master.zip
unzip master.zip
cd NCCD-master

EXECUTION

Make sure you have at least 200 GB of free disk space. Then, simply run:

. run.sh 

It will download and install GeCo (https://github.com/pratas/geco/), although you may need to install cmake first. Then, it will download all 48 bird sequences and run the NCCD.
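Because the run needs a lot of disk space and may need cmake, a quick pre-flight check can save a failed download. The sketch below is not part of the repository: the helper names has_free_space and has_tool are made up for illustration, and it assumes a POSIX shell with df and awk available.

```shell
#!/bin/sh
# Pre-flight check before sourcing run.sh (illustrative; not part of the repository).

# has_free_space REQUIRED_GB DIR
# Succeeds if the filesystem holding DIR has at least REQUIRED_GB GiB available.
has_free_space() {
  required_kb=$(( $1 * 1024 * 1024 ))
  # df -P gives one line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  available_kb=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
  [ "$available_kb" -ge "$required_kb" ]
}

# has_tool NAME
# Succeeds if NAME can be found on PATH.
has_tool() {
  command -v "$1" >/dev/null 2>&1
}

has_free_space 200 . || echo "warning: less than 200 GB free in $(pwd)" >&2
has_tool cmake || echo "warning: cmake not found; install it before running run.sh" >&2
```

Run it from inside NCCD-master, since the check looks at the filesystem of the current directory.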

For other purposes, such as computing the information distance between two sequences (fileA and fileB), go to the scripts directory:

cd scripts

and run

. NCCD.sh ../examples/fileA ../examples/fileB

It will calculate the NCCD between the two synthetic example sequences included in the repository.
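To give a feel for what such a distance measures, here is a rough sketch that does not use GeCo or NCCD.sh. It assumes the definition usually given for the NCCD, max{C(x|y), C(y|x)} / max{C(x), C(y)}, which this README does not state. It also swaps in gzip as the compressor and estimates the conditional size C(x|y) as C(yx) - C(y), which is only an approximation. The function names csize, ccsize and approx_nccd are hypothetical.

```shell
#!/bin/sh
# Illustrative only: approximates the distance with gzip instead of GeCo.

# csize FILE...
# Compressed size, in bytes, of the concatenation of the given files.
csize() {
  cat "$@" | gzip -9 -c | wc -c | tr -d ' '
}

# ccsize X Y
# Rough stand-in for C(X|Y), estimated as C(YX) - C(Y) (an assumption).
ccsize() {
  echo $(( $(csize "$2" "$1") - $(csize "$2") ))
}

# approx_nccd X Y
# Prints max{C(X|Y), C(Y|X)} / max{C(X), C(Y)} with four decimals.
approx_nccd() {
  cx=$(csize "$1"); cy=$(csize "$2")
  cxy=$(ccsize "$1" "$2"); cyx=$(ccsize "$2" "$1")
  awk -v a="$cxy" -v b="$cyx" -v c="$cx" -v d="$cy" 'BEGIN {
    num = (a > b) ? a : b
    den = (c > d) ? c : d
    printf "%.4f\n", num / den
  }'
}

# Usage (from the scripts directory):
#   approx_nccd ../examples/fileA ../examples/fileB
```

Values near 0 suggest the two files share most of their information, and values near 1 suggest they share little. Because gzip only looks back 32 KB, this sketch is meaningful for small files only; the actual scripts rely on GeCo's finite-context models, which are designed for genomic sequences.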

ISSUES

For any issue, let us know at the issues link.

LICENSE

GPL v2.

For more information:

http://www.gnu.org/licenses/gpl-2.0.html