/
5-mapping-and-quantitation.txt
74 lines (44 loc) · 1.69 KB
/
5-mapping-and-quantitation.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
=====================================
5. Mapping and abundance quantitation
=====================================
.. shell start
Let's do some simple mapping to do abundance estimation in final assembly.
Bowtie Mapping
---------------
Let's start by installing `bowtie <http://bowtie-bio.sourceforge.net/index.shtml>`
::
cd /root
curl -O -L http://sourceforge.net/projects/bowtie-bio/files/bowtie/0.12.7/bowtie-0.12.7-linux-x86_64.zip
unzip bowtie-0.12.7-linux-x86_64.zip
cd bowtie-0.12.7
cp bowtie bowtie-build bowtie-inspect /usr/local/bin
Next, build a bowtie reference from the assembly
::
cd /mnt/work/
bowtie-build final-assembly.fa metagenome
and then do the mapping
::
gunzip -c *.pe.qc.fq.gz | bowtie -p 4 -q metagenome - > metagenome.map
At the moment, there seems to be no good way to do automated differential
analysis of two samples, so we'll just show you how to annotate the
assembled sequences with the mapping abundance. This will allow MG-RAST
to properly weight annotation calls.
To do this, we will need to make two copies of the annotated assembly
with the first abundances.
::
python /usr/local/share/khmer/sandbox/make-coverage.py final-assembly.fa metagenome.map
::
mv final-assembly.fa.cov metagenome.fa
What you will see now is that there's a [cov] annotation for each
sequence in every file -- try
::
head -4 metagenome.fa
and you should see
>testasm.1[cov=259]
CAATTTATTTAAATTTTTCTACGATTCCAACA...
>testasm.2[cov=610]
ATTCTACTAATGTCATCTTTTTACCTTCTAGA...
This format can be uploaded directly to MG-RAST as an
abundance-annotated assembly, although there's no good way to do
comparative analysis yet.
Next: :doc:`6-annotating`