Home
Welcome to the Brownie wiki page!
Brownie requires a number of packages to be installed on your system. Required: CMake; Google's Sparsehash; gcc (GCC 4.7 or a more recent version). Optional: ZLIB; Googletest Unit Testing.
How to install these packages:
As root, execute the following commands:
on Redhat / Fedora distributions
* yum install cmake
* yum install sparsehash-devel
* yum install zlib-devel (optional)
on Ubuntu / Debian distributions
* aptitude install cmake
* aptitude install libsparsehash-dev
* aptitude install zlib1g-dev (optional)
Installing Brownie is now simple. First, unpack the brownie-xxx.tar.gz file (where xxx denotes the version number) and change into the resulting directory:
tar -xzvf brownie-xxx.tar.gz
cd brownie-xxx
From this directory, run the following commands:
* mkdir build
* cd build
* cmake ..
* make install
Running ./brownie without arguments prints:
brownie: missing input read file
Try 'brownie --help' for more information
A useful option to specify where cmake should install the software is CMAKE_INSTALL_PREFIX. For example, to install in your local $HOME/brownie directory you would run:
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/brownie
To correct a read file, run:
./brownie reads.fastq
reads.fastq is the input file to be corrected by Brownie. The corrected reads are written to reads.corr.fastq, in the same order as in the input.
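When scripting around Brownie it can help to know in advance which file to look for. The sketch below assumes the default output name is formed by inserting .corr before the last extension of the input file name, as in the reads.fastq example; default_corrected_name is a hypothetical helper, not part of Brownie, and it only derives the file name, not the directory Brownie writes to.

```shell
# Hypothetical helper: guess the default output file name for an input file.
# Assumption: ".corr" is inserted before the last extension
# (reads.fastq -> reads.corr.fastq); names without an extension get ".corr" appended.
default_corrected_name() {
    base=$(basename -- "$1")
    case "$base" in
        ?*.?*) printf '%s.corr.%s\n' "${base%.*}" "${base##*.}" ;;
        *)     printf '%s.corr\n' "$base" ;;
    esac
}

# Example (not executed here):
#   ./brownie reads.fastq && ls -l "$(default_corrected_name reads.fastq)"
```

If the name Brownie actually produces differs, passing -o explicitly avoids the guesswork.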
Running ./brownie --help prints the help information on your screen, like this:
Usage: brownie [options] [file_options] file1 [[file_options] file2]...
Corrects sequence reads in file(s)
[options]
-h --help
display help page
-i --info
display information page
-s --singlestranded
enable single stranded DNA [default = false]. Sequence data are assumed to be double stranded by default; specify this option explicitly if your input data are single stranded.
[options arg]
-k --kmersize
kmer size [default = 31], an optional parameter in the range 9 to 31; only odd numbers are allowed. To use a larger kmer size, change the line add_definitions("-DMAXKMERLENGTH=31") in the CMakeLists.txt file to, for instance, add_definitions("-DMAXKMERLENGTH=63") if your kmer size is less than 64, then rebuild Brownie.
-t --threads
number of threads [default = available cores]
-g --genomesize
size of the genome [default = auto]
-p --pathtotmp
path to directory to store temporary files [default = current directory]. The given directory must already exist.
[file_options]
-o --output
output file name [default = inputfile.corr]
--perfectgraph
skip read and graph correction. With this option Brownie only builds the de Bruijn graph and does not modify it (see the explanation of Brownie's output files).
--graph
skip read correction. With this option Brownie builds the de Bruijn graph and modifies it to remove erroneous nodes.
examples:
./brownie inputA.fastq
./brownie -k 29 -t 4 -o outputA.fasta inputA.fasta -o outputB.fasta inputB.fastq
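The constraints on -k listed in the options (odd, between 9 and the compiled-in maximum, which is 31 by default) can be checked before launching a long run. The following is a minimal sketch; valid_kmer_size is a hypothetical helper, not part of Brownie, and its optional second argument is assumed to match the MAXKMERLENGTH value Brownie was built with.

```shell
# Hypothetical helper: succeed only if $1 is an acceptable value for -k.
# $2 is the assumed compiled-in maximum (MAXKMERLENGTH), defaulting to 31.
valid_kmer_size() {
    k=$1
    max=${2:-31}
    case "$k" in
        # Reject empty input, leading zeros and anything that is not a plain number.
        ''|0*|*[!0-9]*) return 1 ;;
    esac
    # In range and odd.
    [ "$k" -ge 9 ] && [ "$k" -le "$max" ] && [ $((k % 2)) -eq 1 ]
}

# Example (not executed here):
#   valid_kmer_size 29 && ./brownie -k 29 -t 4 inputA.fastq
```

A value such as 45 passes only when the second argument says the build allows it, for example valid_kmer_size 45 63.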
Brownie performs graph construction and error correction in two separate stages. You can therefore build a perfect graph from the reference genome sequence that is closest to your read file; Brownie then processes the reference genome like a FASTA read file. Because Brownie automatically ignores reads that occur only once in the input file, you should duplicate your genome file first. Once the perfect graph has been built, you can correct your read file.
In summary, execute the following commands for reference-based error correction:
cat genome.fasta genome.fasta > genome2.fasta
./brownie -t 12 -p sillyDir -k 25 --perfectgraph genome2.fasta
./brownie -t 12 -p sillyDir -k 25 -o BrownieCorrected.fastq initialReads.fastq
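The three commands above can be wrapped in a small function so that the thread count, k-mer size and temporary directory stay identical across both Brownie runs. This is only a sketch: reference_correct and the BROWNIE, THREADS, KMER and BROWNIE_TMP variables are hypothetical names introduced here, and the temporary directory is created first because -p expects an existing directory.

```shell
# Hypothetical wrapper around the reference-based workflow.
# Usage: reference_correct genome.fasta initialReads.fastq BrownieCorrected.fastq
reference_correct() {
    genome=$1
    reads=$2
    out=$3
    brownie=${BROWNIE:-./brownie}      # path to the brownie executable (assumed)
    threads=${THREADS:-12}
    k=${KMER:-25}
    tmpdir=${BROWNIE_TMP:-sillyDir}

    # -p requires a directory that already exists.
    mkdir -p "$tmpdir" || return 1

    # Duplicate the genome so that no sequence occurs only once.
    doubled="${genome%.fasta}2.fasta"
    cat "$genome" "$genome" > "$doubled" || return 1

    # Stage 1: build the perfect graph from the duplicated reference.
    "$brownie" -t "$threads" -p "$tmpdir" -k "$k" --perfectgraph "$doubled" || return 1

    # Stage 2: correct the reads with the same -p directory and k-mer size.
    "$brownie" -t "$threads" -p "$tmpdir" -k "$k" -o "$out" "$reads"
}
```

Setting BROWNIE=echo before calling the function prints the two Brownie command lines instead of running them, which is a cheap way to review the arguments.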
To generate simulated Illumina data, you can refer to this paper: ART: a next-generation sequencing read simulator
After downloading the application, you can run the following command to produce reads of length 250 at a coverage of 100:
./art_illumina -i genome.fasta -p -l 250 -f 100 -m 300 -s 30 -o reads250
genome.fasta is the input file from which the reads are generated.
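To get a feel for how much data such a run produces, the usual relation coverage = number of reads × read length / genome size can be evaluated directly. The sketch below is an illustration under that assumption; genome_size and reads_for_coverage are hypothetical helpers, and the result is the number of individual reads, rounded up.

```shell
# Hypothetical helper: count the bases in a FASTA file, skipping header lines.
genome_size() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); n += length($0) } END { print n + 0 }' "$1"
}

# Hypothetical helper: reads needed for a target coverage.
# reads = ceil(coverage * genome_size / read_length)
reads_for_coverage() {
    size=$1
    coverage=$2
    read_len=$3
    echo $(( (size * coverage + read_len - 1) / read_len ))
}

# Example (not executed here):
#   reads_for_coverage "$(genome_size genome.fasta)" 100 250
```

For a genome of 1,000,000 bases, coverage 100 and read length 250, this gives 400000 reads.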