Isaac Turner edited this page Sep 6, 2016 · 5 revisions

Example:

mccortex31 build --kmer 31 --sample NA12878 --seq NA12878.bam --seq2 r1.fq:r2.fq NA12878.k31.ctx

Command help:

  usage: mccortex31 build [options] <out.ctx>

    Build a cortex graph.  

    -h, --help               This help message
    -q, --quiet              Silence status output normally printed to STDERR
    -f, --force              Overwrite output files
    -m, --memory <mem>       Memory to use
    -n, --nkmers <kmers>     Number of hash table entries (e.g. 1G ~ 1 billion)
    -t, --threads <T>        Number of threads to use [default: 2]
    -k, --kmer <kmer>        Kmer size must be odd (31 >= k >= 3)
    -s, --sample <name>      Sample name (required before any seq args)
    -1, --seq <in.fa>        Load sequence data
    -2, --seq2 <in1:in2>     Load paired end sequence data
    -i, --seqi <in.bam>      Load paired end sequence from a single file
    -Q, --fq-cutoff <Q>      Filter quality scores [default: 0 (off)]
    -O, --fq-offset <N>      FASTQ ASCII offset    [default: 0 (auto-detect)]
    -H, --cut-hp <bp>        Breaks reads at homopolymers >= <bp> [default: off]
    -p, --remove-pcr         Remove (or keep) PCR duplicate reads
    -P, --keep-pcr           Don't do PCR duplicate removal [default]
    -M, --matepair <orient>  Mate pair orientation: FF,FR,RF,RR [default: FR]
                             (for --keep_pcr only)
    -g, --graph <in.ctx>     Load samples from a graph file (.ctx)
    -I, --intersect <i.ctx>  Only load kmers that appear in i.ctx. Multiple -I
                             graphs will be merged, not intersected. Treated as
                             single colour graphs.

Note: Options apply to the input files that follow them, so each option must come before the input files it is meant to affect.

PCR duplicate removal works by ignoring a read if it starts at the same k-mer as any previous read, or a read pair if both of its reads do. It is carried out per sample, not per file.
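
The rule above can be illustrated with a minimal Python sketch. This is not McCortex's implementation: the function names (`start_kmer`, `dedup_reads`, `dedup_pairs`) are hypothetical, reads shorter than k are simply kept, and strand and mate pair orientation are ignored. It only shows the idea of one set of start k-mers shared across all files of a sample.

```python
from typing import Iterable, List, Optional, Set, Tuple


def start_kmer(read: str, k: int) -> Optional[str]:
    """Return the first k bases of a read, or None if the read is shorter than k."""
    return read[:k] if len(read) >= k else None


def dedup_reads(files: Iterable[Iterable[str]], k: int) -> List[str]:
    """Drop single-end reads whose start k-mer was already seen in this sample.

    `files` holds the reads of every input file of ONE sample, so the `seen`
    set is shared across files (per sample, not per file).
    """
    seen: Set[str] = set()
    kept: List[str] = []
    for reads in files:
        for read in reads:
            kmer = start_kmer(read, k)
            if kmer is not None and kmer in seen:
                continue  # an earlier read started at this k-mer
            if kmer is not None:
                seen.add(kmer)
            kept.append(read)
    return kept


def dedup_pairs(pairs: Iterable[Tuple[str, str]], k: int) -> List[Tuple[str, str]]:
    """Drop a read pair only if BOTH reads start at previously seen k-mers."""
    seen: Set[str] = set()
    kept: List[Tuple[str, str]] = []
    for r1, r2 in pairs:
        k1, k2 = start_kmer(r1, k), start_kmer(r2, k)
        if k1 is not None and k2 is not None and k1 in seen and k2 in seen:
            continue  # both start k-mers were seen before
        seen.update(x for x in (k1, k2) if x is not None)
        kept.append((r1, r2))
    return kept
```

In this sketch a pair with only one previously seen start k-mer is kept, which mirrors the "(both)" wording above.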

--sample <name> is required before sequence input can be loaded. Consecutive sequence options are loaded into the same colour.
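
As a rough illustration of how sequence inputs group into colours, the Python sketch below walks an argument list and starts a new colour at each `--sample`. It assumes that every `--sample` opens a new colour and that colours are numbered from 0 in command-line order. The function `assign_colours` is hypothetical and is not part of mccortex31.

```python
from typing import List, Tuple

SAMPLE_OPTS = ("-s", "--sample")
SEQ_OPTS = ("-1", "--seq", "-2", "--seq2", "-i", "--seqi")


def assign_colours(args: List[str]) -> List[Tuple[str, List[str]]]:
    """Group sequence inputs by sample; the list index stands for the colour."""
    colours: List[Tuple[str, List[str]]] = []
    i = 0
    while i < len(args):
        opt = args[i]
        if opt in SAMPLE_OPTS:
            colours.append((args[i + 1], []))  # assumed: new sample, new colour
            i += 2
        elif opt in SEQ_OPTS:
            if not colours:
                raise ValueError("--sample <name> is required before sequence input")
            colours[-1][1].append(args[i + 1])  # same colour as the previous input
            i += 2
        else:
            i += 1  # other options and their values are ignored in this sketch
    return colours
```

For example, `--sample bob --seq bob.bam --sample dylan --seq dylan.sam --seq2 dylan.1.fq.gz:dylan.2.fq.gz` would put `bob.bam` in the first colour and both `dylan` inputs in the second.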

The --graph argument can have colours specified, e.g. in.ctx:0,6-8 will load samples 0,6,7,8. Graphs are loaded into new colours.
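
The colour specification can be read as a comma-separated list of single indices and inclusive ranges. The Python sketch below expands it under that reading. The function `parse_graph_arg` is hypothetical, and splitting at the first `:` is an assumption that would not suit paths containing a colon.

```python
from typing import List, Optional, Tuple


def parse_graph_arg(arg: str) -> Tuple[str, Optional[List[int]]]:
    """Split 'path:spec' and expand spec such as '0,6-8' into [0, 6, 7, 8].

    Returns (path, None) when no colours are specified.
    """
    path, sep, spec = arg.partition(":")
    if not sep:
        return path, None
    colours: List[int] = []
    for part in spec.split(","):
        first, dash, last = part.partition("-")
        if dash:
            colours.extend(range(int(first), int(last) + 1))  # inclusive range
        else:
            colours.append(int(first))
    return path, colours


print(parse_graph_arg("in.ctx:0,6-8"))  # ('in.ctx', [0, 6, 7, 8])
```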

Extended Example:

   mccortex31 build -k 31 -m 60G --fq-cutoff 5 --cut-hp 8 \
                    --sample bob --seq bob.bam \
                    --sample dylan --remove-pcr --seq dylan.sam \
                                   --keep-pcr --fq-offset 33 \
                                   --seq2 dylan.1.fq.gz:dylan.2.fq.gz \
                    bob.dylan.k31.ctx

See mccortex31 join to combine .ctx files.
