Discriminative discovery of sequence motifs with hidden Markov models
Copyright 2011, Jonas Maaskola. This is free software under the GPL version 3, or later. See the file COPYING for detailed conditions of distribution.
Building Discrover requires several software packages. During the configuration phase it is checked whether these packages are found on your system. Some functionality of Discrover is only enabled if certain optional dependencies are present. The required and optional dependencies are:
Discrover is written in C++11, so it is necessary to use an up-to-date version of your compiler. The GNU compiler collection (GCC) supports all necessary features to compile this project as of version 4.6.
Link-time optimization (LTO) will be used if your compiler is new enough. LTO support in GCC prior to version 4.8 is somewhat unreliable, and LTO is only enabled if you have at least version 4.8.
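The version thresholds above can be checked by hand before configuring. The following is a minimal sketch, not part of Discrover's build system. It assumes that g++ on your system is GCC and that your sort supports -V (version sort, as in GNU coreutils). The helper name version_ge is hypothetical.

```shell
#!/bin/sh
# version_ge A B: succeed if version A is greater than or equal to version B.
# Hypothetical helper; relies on 'sort -V' (version sort), as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Ask the compiler for its version, if g++ is installed.
if command -v g++ >/dev/null 2>&1; then
    gcc_version=$(g++ -dumpversion)
    if version_ge "$gcc_version" 4.8; then
        echo "g++ $gcc_version: new enough for C++11 and for LTO"
    elif version_ge "$gcc_version" 4.6; then
        echo "g++ $gcc_version: new enough for C++11, but not for LTO"
    else
        echo "g++ $gcc_version: older than 4.6, too old to build Discrover"
    fi
else
    echo "g++ not found"
fi
```

The comparison puts both versions through a version sort and checks that the required one comes first, so that 4.10 is correctly treated as newer than 4.8.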
We use CMake to construct Makefiles for building Discrover. As of version 1.5.0 of Discrover we require CMake 2.8.8 or later.
We make use of code in the Boost C++ libraries, so please install them. These libraries are available under the Boost License. The version of Boost must be recent enough to include the V3 filesystem library. Version 1.48 and more recent versions are known to work.
We use OpenMP to support parallelization.
As of version 4.2, GCC supports OpenMP out of the box, so we suggest building with a recent compiler version.
At the time of writing, clang++ from the LLVM project does not yet have full OpenMP support.
As of Discrover version 1.4.0 Cairo is used to generate sequence logos in PDF and PNG format.
We are using code from the R library to compute the logarithm of the chi-square distribution function. During configuration it is checked if the R library is found. If that is the case then we link to it, otherwise code extracted from it is built and used.
Optionally, the program DREME from the MEME suite of motif analysis tools can be integrated for seeding of HMM motifs.
During configuration it is checked if any programs named dreme or meme-dreme can be found.
If that is the case, you will be able to use DREME for motif seeding by specifying --algo dreme.
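To see in advance whether DREME is likely to be picked up, you can look for the two program names yourself. This is a minimal sketch of such a lookup in a POSIX shell. It is not the check that the build system performs, and the helper name find_first_program is hypothetical.

```shell
#!/bin/sh
# find_first_program NAME...: print the location of the first NAME found on
# PATH and succeed; fail if none of the names can be found.
# Hypothetical helper, not the check used by Discrover's build system.
find_first_program() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            command -v "$name"
            return 0
        fi
    done
    return 1
}

if dreme_path=$(find_first_program dreme meme-dreme); then
    echo "DREME found at $dreme_path"
else
    echo "Neither dreme nor meme-dreme found on PATH"
fi
```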
LaTeX is another optional dependency.
In particular, pdflatex is used to compile the manual.
During configuration it is checked if your system has pdflatex.
If it has, then the manual is built, otherwise building of the manual is skipped.
Note that some LaTeX packages are used that may not be part of your core LaTeX installation.
Unfortunately, while our build system will detect the presence of pdflatex, it is currently not sophisticated enough to figure out if these LaTeX packages are present.
Thus, if later during compilation you see errors while building the manual, please ensure that you have the requisite LaTeX packages available on your system.
For example, on Debian, they are part of the texlive-latex-extra package.
Similarly, on Gentoo, they are part of the dev-texlive/texlive-latexextra package.
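Since the build system does not test for individual LaTeX packages, you can test for them yourself with kpsewhich, the file lookup tool that ships with TeX Live and MacTeX. The following is a minimal sketch. The helper name have_latex_package is hypothetical, and xcolor and pgf are used only as example package names; the manual may need others.

```shell
#!/bin/sh
# have_latex_package NAME: succeed if kpsewhich can locate NAME.sty.
# Hypothetical helper; fails as well if kpsewhich itself is not installed.
have_latex_package() {
    command -v kpsewhich >/dev/null 2>&1 &&
        kpsewhich "$1.sty" >/dev/null 2>&1
}

# xcolor and pgf are example names only; the manual may need other packages.
for pkg in xcolor pgf; do
    if have_latex_package "$pkg"; then
        echo "LaTeX package found: $pkg"
    else
        echo "LaTeX package not found: $pkg"
    fi
done
```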
Note that the above-given list of dependencies is required only for BUILDING, not for running.
On Debian and Ubuntu, you can install all necessary and optional software to build Discrover with the following command:
apt-get install git cmake g++ libboost-all-dev texlive texlive-latex-base latex-xcolor texlive-latex-extra pgf libcairo2-dev
Similarly, on Gentoo you can use:
emerge -av dev-vcs/git dev-util/cmake sys-devel/gcc dev-libs/boost dev-texlive/texlive-latexextra x11-libs/cairo
The corresponding command for Arch Linux:
pacman -S git cmake make gcc boost texlive-core texlive-latexextra cairo
On Fedora 20 and 21 the following command will install all dependencies required for building:
yum install gcc-c++ cmake git boost boost-devel texlive-latex-bin texlive-pgf texlive-xcolor texlive-collection-latexextra cairo-devel
On Mac OS X, git is provided with Xcode. Using brew, you can install CMake and ImageMagick like this:
brew install cmake imagemagick
While brew also provides a binary package for GCC, we cannot use it, as it does not support OpenMP. For this reason you need to rebuild GCC on your system:
brew install gcc --without-multilib
The boost brew package cannot be used either, because it relies on the GCC brew package and does not work with the manually built GCC.
Therefore, please build boost according to these instructions.
Finally, adapt the SET(BOOST_ROOT "...") statement in CMakeLists.txt to point to the place where you installed boost.
For TeX, please install MacTeX.
The code contained in this package is built in four steps.
- Edit the CMake build script (optional)
- Execute the CMake build script
- Compile the source code
- Link and install the binary file
The file CMakeLists.txt in the root directory of the source code tree contains the instructions to configure and build Discrover.
Among other things, you can set the installation target directory in it.
By default, it will be installed into the directory rooted at /usr/local.
To change the installation target directory, find the following lines, and modify them to your liking.
SET(LOCAL_PREFIX "/usr/local")
SET(CMAKE_INSTALL_PREFIX ${LOCAL_PREFIX})
SET(CMAKE_PREFIX_PATH ${LOCAL_PREFIX})
Explanation:
The variable CMAKE_INSTALL_PREFIX determines where the software will be installed after building.
In particular, programs will be installed in ${CMAKE_INSTALL_PREFIX}/bin, libraries in ${CMAKE_INSTALL_PREFIX}/lib, and documentation into ${CMAKE_INSTALL_PREFIX}/share/doc/discrover.
The variable CMAKE_PREFIX_PATH has to be set such that the required dependencies may be found.
In particular, the headers of Boost need to be found in ${CMAKE_PREFIX_PATH}/include, if not installed in a standard system directory like /usr/include.
Similarly, the directory ${CMAKE_PREFIX_PATH}/lib tells CMake where to find the Boost libraries if they are not installed in a standard system directory.
By default, both CMAKE_INSTALL_PREFIX and CMAKE_PREFIX_PATH are constructed from LOCAL_PREFIX.
Hence, setting LOCAL_PREFIX to match your system might be the only change required in most cases, if necessary at all.
After this variable has been adapted to your system, you may proceed with the next step.
The software needs to be built out-of-source.
For this, change to the root directory of the package, create a new subdirectory (e.g. name it build), change into it and run cmake from there:
cd <discrover_directory>
mkdir build && cd build && cmake ..
Alternatively, you can execute
mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX:PATH=/desired/installation/path ..
where you would replace /desired/installation/path by the path to which you want to install the package.
Note that you would have to issue this every time you configure the package, so it may be preferable to set this permanently as described in step 1 above.
Explanation: This will create a directory called build, search for the paths to the required headers and libraries, and prepare everything for the subsequent compilation. It will also check if your compiler supports the required features (C++11 and OpenMP support).
If anything fails at this step please have a look at CMakeLists.txt and see if some of the commented-out statements may help you.
Note again that only out-of-source builds are supported, and for this reason it is necessary for you to create a subdirectory, change into it, and run cmake from there.
If difficulties persist, please contact the author of this software.
Execute from the build directory in which you ran cmake
make
Explanation: This will compile the source code. You may make use of parallel building by running
make -j N
where N is the number of CPUs that you want to use.
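If you do not know how many CPUs your machine has, the shell can usually tell you. The following is a minimal sketch with a hypothetical helper named cpu_count. It assumes that either nproc (GNU coreutils) or getconf _NPROCESSORS_ONLN is available, and falls back to 1 otherwise.

```shell
#!/bin/sh
# cpu_count: print the number of online CPUs, falling back to 1.
# Hypothetical helper; 'nproc' is part of GNU coreutils, and
# 'getconf _NPROCESSORS_ONLN' is a common alternative on other systems.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif n=$(getconf _NPROCESSORS_ONLN 2>/dev/null); then
        echo "$n"
    else
        echo 1
    fi
}

# Print the command rather than running it, so the sketch is safe to try.
echo "make -j $(cpu_count)"
```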
While still in the build directory, execute
make install
Explanation:
This will copy the libraries and binaries into the default installation path.
If you did not specify otherwise via CMakeLists.txt in step 1, or via the command line in step 2, then the installation will go to /usr/local/bin, /usr/local/lib, /usr/local/share/doc and /usr/local/share/man.
If these locations are part of your PATH, LD_LIBRARY_PATH, and MANPATH environment variables you can then simply run the discrover program from anywhere.
Otherwise, you might still have to add the directories you installed to the PATH, LD_LIBRARY_PATH, and MANPATH variables.
This can be done with commands like
export PATH="HERE/bin:$PATH"
export LD_LIBRARY_PATH="HERE/lib:$LD_LIBRARY_PATH"
export MANPATH="HERE/share/man:$MANPATH"
where HERE will have to be the path that you installed to (the value of the CMAKE_INSTALL_PREFIX variable that you set above).
You might consider putting these commands into your ~/.bashrc file or some place similar such that they are executed every time you log into your machine.
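When such commands run at every login, the same directory can end up in a variable several times. The following is a minimal sketch of one way to avoid this, using a hypothetical helper named prepend_path that adds a directory only if it is not yet listed. HERE again stands for your installation prefix; the sketch falls back to /usr/local if HERE is not set.

```shell
#!/bin/sh
# prepend_path DIR LIST: print LIST with DIR prepended, unless DIR is already
# one of the colon-separated entries of LIST. Hypothetical helper.
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;     # already listed: keep LIST as is
        "::")     printf '%s\n' "$1" ;;     # LIST is empty: result is DIR
        *)        printf '%s\n' "$1:$2" ;;  # otherwise: put DIR in front
    esac
}

# HERE stands for your installation prefix; /usr/local is only the default.
HERE=${HERE:-/usr/local}
PATH=$(prepend_path "$HERE/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$HERE/lib" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

MANPATH is deliberately left out of this sketch: some implementations of man treat an empty entry in MANPATH as a placeholder for the system default search path, so the plain export shown earlier may be the safer form for that variable.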