# Notes:
* Transitive alignments are not accurate; find cases where transitive alignments fail.
* Can two sequences that do not co-occur in a sample (group) ever be merged together? (Command 14)

# 1. About
This notebook runs through the Mothur MiSeq SOP tutorial found here: http://www.mothur.org/wiki/MiSeq_SOP .

# 2. Necessary Files
First, we should download all the files used in the tutorial.  The cell below does just that.  In total, the files are about 65MB and may take a few minutes to download and unzip.  See http://www.mothur.org/wiki/MiSeq_SOP#Logistics for more details.

In [7]:
#####################################################
##### Downloads and unzips all tutorial files #######
#####################################################

import os
import zipfile

# urlopen lives in different modules in Python 2 and Python 3
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

# Make a directory for our tutorial, and jump into it
root = "mothur_notebook"
if os.path.basename(os.getcwd()) != root:
    if not os.path.isdir(root):
        os.mkdir(root)
    os.chdir(root)
print("Current working directory is: " + os.getcwd())

# Zips to grab
zips = ['http://www.mothur.org/w/images/5/59/Trainset9_032012.pds.zip',
        'http://www.mothur.org/w/images/9/98/Silva.bacteria.zip',
        'http://www.mothur.org/w/images/d/d6/MiSeqSOPData.zip']

# Zip directory names
seq = 'MiSeq_SOP'
silva = 'Silva.bacteria/silva.bacteria'
train = 'Trainset9_032012.pds'

# Grab and unzip the zip files, skipping any that were already downloaded
for url in zips:
    target = url.split('/')[-1]
    if not os.path.isfile(target):
        print("Downloading " + target + "...\n")
        resource = urlopen(url)
        with open(target, 'wb') as handle:
            handle.write(resource.read())

        print("Extracting " + target + "...\n")
        with zipfile.ZipFile(target) as archive:
            archive.extractall()

Current working directory is: /home/qiime/Desktop/Mothur_Notebook/mothur_notebook


# 3. MothurMagic
To use Mothur in IPython, we need to load the mothurmagic extension. See https://github.com/SchlossLab/ipython-mothurmagic for more details.

In [4]:
%install_ext https://raw.githubusercontent.com/SchlossLab/ipython-mothurmagic/master/mothurmagic.py
%load_ext mothurmagic



Installed mothurmagic.py. To use it, type:
  %load_ext mothurmagic


# 4. Commands

## 1.  make.contigs(file=stability.files, processors=8)
Joins the forward and reverse reads of each pair into contigs and writes them to a fasta file.
### 1.1 Output Files:
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.contigs.groups</td><td>Which sample each sequence belongs to</td></tr>
<tr><td>stability.scrap.contigs.fasta</td><td>Contigs thrown out because they don't pass some criterion</td></tr>
<tr><td>stability.scrap.contigs.qual</td><td>Quality of contigs thrown out because they don't pass some criterion</td></tr>
<tr><td>stability.trim.contigs.fasta</td><td>Assembled forward and reverse sequences</td></tr>
<tr><td>stability.contigs.qual</td><td>Quality of the assembled forward and reverse sequences</td></tr>
<tr><td>stability.contigs.report</td><td>Assembly report for each sequence</td></tr>
</table>
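
To build some intuition for what joining a read pair means, here is a minimal Python sketch. It is not mothur's algorithm: it assumes the reverse read can simply be reverse-complemented and glued onto the forward read at their longest exact overlap, with no quality scores involved. The function names and the toy reads are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def merge_pair(forward, reverse, min_overlap=4):
    """Merge a read pair on the longest exact suffix/prefix overlap.

    Returns None when no overlap of at least min_overlap bases exists.
    """
    rev = reverse_complement(reverse)
    for size in range(min(len(forward), len(rev)), min_overlap - 1, -1):
        if forward[-size:] == rev[:size]:
            return forward + rev[size:]
    return None


# Hypothetical toy reads; the tutorial's contigs are about 250 bases long.
print(merge_pair("ACGTTGCA", "GGATGCAA"))  # prints ACGTTGCATCC
```

Real read pairs contain mismatches in the overlap, so an exact-match rule like this one would discard many valid pairs; it only illustrates the idea of a contig.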

In [30]:
%%mothur
set.dir(input=./MiSeq_SOP)
make.contigs(file=stability.files, processors=8)

mothur > # Joins left and right reads
[ERROR]: You are missing (
Invalid.

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > make.contigs(file=stability.files, processors=8)

Using 8 processors.

>>>>>	Processing file pair /home/qiime/Desktop/mothur_notebook/MiSeq_SOP/F3D0_S188_L001_R1_001.fastq - /home/qiime/Desktop/mothur_notebook/MiSeq_SOP/F3D0_S188_L001_R2_001.fastq (files 1 of 20)	<<<<<
Making contigs...
Done.

It took 2 secs to assemble 7793 reads.


>>>>>	Processing file pair /home/qiime/Desktop/mothur_notebook/MiSeq_SOP/F3D141_S207_L001_R1_001.fastq - /home/qiime/Desktop/mothur_notebook/MiSeq_SOP/F3D141_S207_L001_R2_001.fastq (files 2 of 20)	<<<<<
Making contigs...
Done.

It took 2 secs to assemble 5958 reads.


>>>>>	Processing file pair /home/qiime/Desktop/mothur_notebook/MiSeq_SOP/F3D142_S208_L001_R1_001.fastq - /home/qiime/Desktop/mothur_notebook/MiSeq_SOP/F3D142_S208_L001_R2_001.fastq (files 3 of 20)

## 2.  summary.seqs(fasta=stability.trim.contigs.fasta)
Summarizes a fasta file.
### 2.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.summary</td><td>For each sequence name, records the start, end, number of bases, number of ambiguous bases, longest homopolymer and number of sequences represented.</td></tr>
</table>
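
The NBases, Ambigs and Polymer columns can be reproduced for a single sequence with a few lines of Python. This is a rough sketch, assuming that any character other than A, C, G or T counts as ambiguous and that gap characters are ignored; `summarize` is a hypothetical helper, not part of mothur.

```python
from itertools import groupby


def summarize(seq):
    """Return the number of bases, ambiguous bases and longest homopolymer."""
    bases = seq.replace("-", "").replace(".", "").upper()
    ambigs = sum(1 for base in bases if base not in "ACGT")
    # groupby yields one run per stretch of identical consecutive characters
    polymer = max((len(list(run)) for _, run in groupby(bases)), default=0)
    return {"nbases": len(bases), "ambigs": ambigs, "polymer": polymer}


print(summarize("ACGGGTNA"))
```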

In [32]:
%%mothur
set.dir(input=./MiSeq_SOP)
summary.seqs(fasta=stability.trim.contigs.fasta)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > summary.seqs(fasta=stability.trim.contigs.fasta)

Using 1 processors.

Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	248	248	0	3	1
2.5%-tile:	1	252	252	0	3	3810
25%-tile:	1	252	252	0	4	38091
Median: 	1	252	252	0	4	76181
75%-tile:	1	253	253	0	5	114271
97.5%-tile:	1	253	253	6	6	148552
Maximum:	1	502	502	249	243	152360
Mean:	1	252.811	252.811	0.70063	4.44854
# of Seqs:	152360

Output File Names:
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.summary

It took 2 secs to summarize 152360 sequences.

mothur > quit()


## 3.  screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, maxlength=275)
Removes sequences that do not match specified parameters.
### 3.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.fasta</td><td>Sequences that contain no ambiguous bases and are at most 275 bases long</td></tr>
<tr><td>stability.trim.contigs.bad.accnos</td><td>List of bad sequences and the reason they were discarded, e.g. number of ambiguous bases or length</td></tr>
<tr><td>stability.contigs.good.groups</td><td>For each sequence, lists the group to which it belongs</td></tr>
</table>
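
The screening rule itself is simple enough to sketch in Python. The version below assumes sequences are held in a dictionary keyed by name and that anything other than A, C, G or T is ambiguous; the `screen` function and the reason labels are hypothetical, and the toy call uses a small `maxlength` so the example stays readable.

```python
def screen(seqs, maxambig=0, maxlength=275):
    """Split sequences into those that pass and those that fail the screen."""
    good, bad = {}, {}
    for name, seq in seqs.items():
        ambigs = sum(1 for base in seq.upper() if base not in "ACGT")
        reasons = []
        if ambigs > maxambig:
            reasons.append("ambig")
        if len(seq) > maxlength:
            reasons.append("length")
        if reasons:
            bad[name] = "|".join(reasons)  # plays the role of the .accnos file
        else:
            good[name] = seq
    return good, bad


toy = {"s1": "ACGTAC", "s2": "ACNTAC", "s3": "ACGTACGT"}
good, bad = screen(toy, maxambig=0, maxlength=6)
print(good)
print(bad)
```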

In [33]:
%%mothur
set.dir(input=./MiSeq_SOP)
screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, maxlength=275)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > screen.seqs(fasta=stability.trim.contigs.fasta, group=stability.contigs.groups, maxambig=0, maxlength=275)

Using 1 processors.

Output File Names:
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.fasta
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.bad.accnos
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.contigs.good.groups


It took 2 secs to screen 152360 sequences.

mothur > quit()


## 4.  unique.seqs(fasta=stability.trim.contigs.good.fasta)
Dereplicates identical sequences, keeping track of related sequences.
### 4.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.names</td><td>Tab-delimited file; for each unique sequence, lists all the sequences that are completely identical to it</td></tr>
<tr><td>stability.trim.contigs.good.unique.fasta</td><td>List of unique sequences</td></tr>
</table>
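
Dereplication can be pictured as grouping names by sequence. The sketch below assumes the first name seen for a sequence becomes its representative, which may differ from how mothur picks one; `dereplicate` is a hypothetical helper.

```python
def dereplicate(seqs):
    """Collapse identical sequences.

    Returns (unique, names): unique maps each representative name to its
    sequence, and names maps each representative to every identical name.
    """
    first_seen = {}  # sequence -> representative name
    names = {}       # representative name -> all names sharing the sequence
    for name, seq in seqs.items():
        rep = first_seen.setdefault(seq, name)
        names.setdefault(rep, []).append(name)
    unique = {rep: seqs[rep] for rep in names}
    return unique, names


unique, names = dereplicate({"a": "ACGT", "b": "ACGT", "c": "TTTT"})
print(unique)
print(names)
```

Here `unique` corresponds to the `.unique.fasta` file and `names` to the `.names` file.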

In [35]:
%%mothur
set.dir(input=./MiSeq_SOP)
unique.seqs(fasta=stability.trim.contigs.good.fasta)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > unique.seqs(fasta=stability.trim.contigs.good.fasta)
128872	16426

Output File Names:
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.names
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.fasta


mothur > quit()


## 5.  count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)
Creates a table counting the abundance of each sequence in each sample (group).
### 5.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.count_table</td><td>Table where rows are unique sequences and columns are samples. Each cell holds the abundance of that sequence in that sample</td></tr>
</table>
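
As a rough illustration, a count table can be derived from a names mapping and a groups mapping as follows. The data structures and the `count_table` function are hypothetical stand-ins for the `.names` and `.groups` files, and the `total` key is an assumption about how to store the row sum.

```python
from collections import Counter


def count_table(names, groups):
    """Count, per unique sequence, how many of its members fall in each group."""
    table = {}
    for rep, members in names.items():
        per_group = Counter(groups[member] for member in members)
        row = {"total": len(members)}
        row.update(per_group)
        table[rep] = row
    return table


names = {"a": ["a", "b", "d"], "c": ["c"]}
groups = {"a": "F3D0", "b": "F3D1", "d": "F3D0", "c": "F3D1"}
print(count_table(names, groups))
```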

In [37]:
%%mothur
set.dir(input=./MiSeq_SOP)
count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > count.seqs(name=stability.trim.contigs.good.names, group=stability.contigs.good.groups)

Using 1 processors.
It took 1 secs to create a table for 128872 sequences.


Total number of sequences: 128872

Output File Names:
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.count_table


mothur > quit()


## 6.  summary.seqs(fasta=stability.trim.contigs.good.unique.fasta, count=stability.trim.contigs.good.count_table)
Summarizes a fasta file.  In this case, it summarizes the list of unique sequences, using the count table to account for their abundance.
### 6.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.summary</td><td>Summary of the unique sequences only. The only difference from their description in the previous summary is the numSeqs column, which now contains the total number of sequences represented by each unique sequence</td></tr>
</table>

In [38]:
%%mothur
set.dir(input=./MiSeq_SOP)
summary.seqs(fasta=stability.trim.contigs.good.unique.fasta, count=stability.trim.contigs.good.count_table)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.fasta, count=stability.trim.contigs.good.count_table)

Using 1 processors.

Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	250	250	0	3	1
2.5%-tile:	1	252	252	0	3	3222
25%-tile:	1	252	252	0	4	32219
Median: 	1	252	252	0	4	64437
75%-tile:	1	253	253	0	5	96655
97.5%-tile:	1	253	253	0	6	125651
Maximum:	1	270	270	0	12	128872
Mean:	1	252.462	252.462	0	4.36693
# of unique seqs:	16426
total # of seqs:	128872

Output File Names:
/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.summary

It took 1 secs to summarize 128872 sequences.

mothur > quit()


## 7.  pcr.seqs(fasta=silva.bacteria/silva.bacteria.fasta, start=11894, end=25319, keepdots=F, processors=4)
Trims the fasta file (in this case, the reference database) to the specified region, i.e., alignment positions 11894 to 25319.
### 7.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>../silva.bacteria/silva.bacteria.pcr.fasta</td><td>The trimmed fasta file (database)</td></tr>
</table>
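
Trimming an alignment to a region amounts to slicing every aligned sequence at the same columns. The sketch below assumes 1-based, inclusive coordinates and a simplified reading of `keepdots` (pad the removed flanks with dots when true, drop them when false); both are assumptions for illustration rather than a statement of mothur's exact behavior, and `trim_alignment` is a hypothetical name.

```python
def trim_alignment(aligned, start, end, keepdots=False):
    """Keep only alignment columns start..end (1-based, inclusive)."""
    trimmed = {}
    for name, seq in aligned.items():
        region = seq[start - 1:end]
        if keepdots:
            # Preserve the original alignment width by padding with dots
            region = "." * (start - 1) + region + "." * (len(seq) - end)
        trimmed[name] = region
    return trimmed


toy = {"r1": "..ACG-T..", "r2": ".AACGGTA."}
print(trim_alignment(toy, start=3, end=7))
print(trim_alignment(toy, start=3, end=7, keepdots=True))
```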

In [60]:
%%mothur
set.dir(input=./MiSeq_SOP)
pcr.seqs(fasta=silva.bacteria/silva.bacteria.fasta, start=11894, end=25319, keepdots=F, processors=4)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/mothur_notebook/MiSeq_SOP/

mothur > pcr.seqs(fasta=silva.bacteria/silva.bacteria.fasta, start=11894, end=25319, keepdots=F, processors=4)

Using 4 processors.

Output File Names:
silva.bacteria/silva.bacteria.pcr.fasta


It took 7 secs to screen 14956 sequences.

mothur > quit()


## 8. system(mv silva.bacteria/silva.bacteria.pcr.fasta silva.bacteria/silva.v4.fasta)
Renames the file 'silva.bacteria/silva.bacteria.pcr.fasta' to 'silva.bacteria/silva.v4.fasta'.
### 8.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>silva.bacteria/silva.v4.fasta </td><td> 	The renamed file</td></tr>
</table>

In [64]:
%%mothur
system(mv silva.bacteria/silva.bacteria.pcr.fasta silva.bacteria/silva.v4.fasta)

mothur > system(mv silva.bacteria/silva.bacteria.pcr.fasta silva.bacteria/silva.v4.fasta)


mothur > quit()


## 9.  align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.bacteria/silva.v4.fasta)
Aligns our unique sequences against the reference database.
### 9.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.align</td><td>The alignment file for the reads (not including the database sequences)</td></tr>
<tr><td>stability.trim.contigs.good.unique.align.report</td><td>Contains pairwise alignment information for the sequences and their hits, e.g. alignment length, start in query, start in hit, end in query, etc.</td></tr>
</table>

In [6]:
%%mothur
set.dir(input=./MiSeq_SOP)
align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.bacteria/silva.v4.fasta)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
/home/qiime/Desktop/Mothur_Notebook/MiSeq_SOP/ directory does not exist or is not writable.

mothur > align.seqs(fasta=stability.trim.contigs.good.unique.fasta, reference=silva.bacteria/silva.v4.fasta)
Unable to open stability.trim.contigs.good.unique.fasta. Trying default /home/qiime/mothur/mothur/stability.trim.contigs.good.unique.fasta
Unable to open /home/qiime/mothur/mothur/stability.trim.contigs.good.unique.fasta. It will be disregarded.
no valid files.

Using 1 processors.
Unable to open silva.bacteria/silva.v4.fasta. Trying default /home/qiime/mothur/mothur/silva.v4.fasta
Unable to open /home/qiime/mothur/mothur/silva.v4.fasta
[ERROR]: did not complete align.seqs.

mothur > quit()


************************************************************
************************************************************
************************************************************
Detected 1 [ERROR] messages, please review.
***************

## 10.  summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)
Summarizes a fasta file.  In this case, updates the start & stop columns to reflect the new alignment.
### 10.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.summary</td><td>Updates the start and end values of each sequence within the context of the larger alignment</td></tr>
</table>
 

In [8]:
%%mothur
set.dir(input=./MiSeq_SOP)
summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > summary.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table)

Using 1 processors.

Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1250	10693	250	0	3	1
2.5%-tile:	1968	11550	252	0	3	3222
25%-tile:	1968	11550	252	0	4	32219
Median: 	1968	11550	252	0	4	64437
75%-tile:	1968	11550	253	0	5	96655
97.5%-tile:	1968	11550	253	0	6	125651
Maximum:	1982	13400	270	0	12	128872
Mean:	1967.99	11550	252.462	0	4.36693
# of unique seqs:	16426
total # of seqs:	128872

Output File Names:
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.summary

It took 5 secs to summarize 128872 sequences.

mothur > quit()


## 11.  screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1968, end=11550, maxhomop=8)
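Removes sequences that start after position 1968 or end before position 11550, as well as sequences with a homopolymer longer than 8 bases (maxhomop=8).  This ensures that the remaining sequences all overlap the same region of the alignment.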
### 11.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.good.count_table</td><td>New count table, updated to reflect the removed sequences</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.align</td><td> 	 Alignment file without the removed sequences</td></tr>
<tr><td>stability.trim.contigs.good.unique.bad.accnos</td><td>   Sequences that were removed and reason for removal</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.summary</td><td> New summary file without the removed sequences</td></tr>
</table>
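
The decision made for each aligned sequence can be sketched in Python. This version assumes positions are 1-based alignment columns, that '.' and '-' are the only gap characters, and that every sequence contains at least one base; the helper names are hypothetical.

```python
from itertools import groupby


def alignment_span(aligned_seq):
    """Return the 1-based columns of the first and last base."""
    positions = [i + 1 for i, ch in enumerate(aligned_seq) if ch not in ".-"]
    return positions[0], positions[-1]


def longest_homopolymer(aligned_seq):
    """Length of the longest run of one base, ignoring gap characters."""
    bases = aligned_seq.replace(".", "").replace("-", "")
    return max(len(list(run)) for _, run in groupby(bases))


def keep(aligned_seq, start, end, maxhomop):
    """True if the sequence covers start..end and has no overlong homopolymer."""
    first, last = alignment_span(aligned_seq)
    return (first <= start and last >= end
            and longest_homopolymer(aligned_seq) <= maxhomop)


# Toy alignment of width 11; the tutorial uses start=1968 and end=11550.
print(keep("..ACGT-ACG.", start=3, end=10, maxhomop=8))
print(keep("...CGT-ACG.", start=3, end=10, maxhomop=8))
```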

In [10]:
%%mothur
set.dir(input=./MiSeq_SOP)
screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1968, end=11550, maxhomop=8)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > screen.seqs(fasta=stability.trim.contigs.good.unique.align, count=stability.trim.contigs.good.count_table, summary=stability.trim.contigs.good.unique.summary, start=1968, end=11550, maxhomop=8)

Using 1 processors.

Output File Names:
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.summary
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.align
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.bad.accnos
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.good.count_table


It took 5 secs to screen 16426 sequences.

mothur > quit()


## 12.  filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)
Removes gap-only ('-') columns, and any columns containing a '.' from all alignments.
### 12.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.fasta</td><td>New filtered alignment without gap-only columns or columns containing a '.'</td></tr>
</table>
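
The two column rules described above can be sketched as a single pass over the alignment columns. This is a simplified illustration, assuming all sequences have the same aligned length and that a gap-only column may contain '-' or '.'; `filter_columns` is a hypothetical helper.

```python
def filter_columns(aligned, vertical=True, trump="."):
    """Drop columns containing the trump character and gap-only columns."""
    seqs = list(aligned.values())
    width = len(seqs[0])
    kept_columns = []
    for col in range(width):
        column = [seq[col] for seq in seqs]
        if trump is not None and trump in column:
            continue  # any sequence with the trump character removes the column
        if vertical and all(ch in "-." for ch in column):
            continue  # no sequence has a base here
        kept_columns.append(col)
    return {name: "".join(seq[c] for c in kept_columns)
            for name, seq in aligned.items()}


toy = {"a": ".A-CG-.", "b": "AA-C-T."}
print(filter_columns(toy))
```

Note that gaps can survive the filter whenever at least one other sequence has a base in that column.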

In [13]:
%%mothur
set.dir(input=./MiSeq_SOP)
filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > filter.seqs(fasta=stability.trim.contigs.good.unique.good.align, vertical=T, trump=.)

Using 1 processors.
Creating Filter...


Running Filter...



Length of filtered alignment: 376
Number of columns removed: 13049
Length of the original alignment: 13425
Number of sequences used to construct filter: 16298

Output File Names:
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.filter
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.filter.fasta


mothur > quit()


## 13.  unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count=stability.trim.contigs.good.good.count_table)
Filtering in the previous step may have introduced duplicates.  This step dereplicates them and keeps track of their counts.
### 13.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.count_table</td><td> 	 New table with updated counts that show dereplicated sequences</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.fasta</td><td> 	 New alignment without the replicated sequences</td></tr>
</table>

In [17]:
%%mothur
set.dir(input=./MiSeq_SOP)
unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count=stability.trim.contigs.good.good.count_table)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > unique.seqs(fasta=stability.trim.contigs.good.unique.good.filter.fasta, count=stability.trim.contigs.good.good.count_table)
16298	16295

Output File Names:
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.filter.count_table
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.filter.unique.fasta


mothur > quit()


## 14.  pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)
Removes sequences that are likely due to sequencing errors by clustering, within each group (sample), sequences that are within 2 bp of each other.
### 14.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta</td><td>Contains the representative sequence of each cluster</td></tr>
<tr><td> stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table </td><td> New counts that represent merged, similar (within 2bp) sequences </td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.F3D0.map  </td><td> Shows the clusters for each sample (group) </td></tr>
<tr><td> ...other samples map files </td><td> ... </td></tr>
<tr><td> stability.trim.contigs.good.unique.good.filter.unique.precluster.F3D9.map </td><td> Shows the clusters for each sample (group) </td></tr>
 
</table>
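
To make the clustering idea concrete, here is a minimal sketch for a single group. It assumes sequences are ranked by abundance and that each rarer sequence is folded into the first more abundant sequence within `diffs` mismatches; this ordering is an assumption for illustration, and `mismatches` and `pre_cluster` are hypothetical names.

```python
def mismatches(a, b):
    """Number of differing columns between two equal-length aligned sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def pre_cluster(counts, seqs, diffs=2):
    """Fold rare sequences into abundant ones that are within diffs mismatches."""
    order = sorted(counts, key=lambda name: (-counts[name], name))
    merged = dict(counts)  # work on a copy so the input is left untouched
    absorbed = set()
    clusters = {}
    for i, rep in enumerate(order):
        if rep in absorbed:
            continue
        clusters[rep] = [rep]
        for other in order[i + 1:]:
            if other in absorbed:
                continue
            if mismatches(seqs[rep], seqs[other]) <= diffs:
                merged[rep] += merged.pop(other)
                absorbed.add(other)
                clusters[rep].append(other)
    return merged, clusters


seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACGTTTTA", "s4": "TTTTTTTT"}
counts = {"s1": 10, "s2": 3, "s3": 2, "s4": 5}
print(pre_cluster(counts, seqs, diffs=2))
```

Running this once per sample, on that sample's counts only, mirrors the per-group processing visible in the log output.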

In [18]:
%%mothur
set.dir(input=./MiSeq_SOP)
pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > pre.cluster(fasta=stability.trim.contigs.good.unique.good.filter.unique.fasta, count=stability.trim.contigs.good.unique.good.filter.count_table, diffs=2)

Using 1 processors.

Processing group F3D0:
1523	564	959
Total number of sequences before pre.cluster was 1523.
pre.cluster removed 959 sequences.

It took 0 secs to cluster 1523 sequences.

Processing group F3D1:
1195	475	720
Total number of sequences before pre.cluster was 1195.
pre.cluster removed 720 sequences.

It took 0 secs to cluster 1195 sequences.

Processing group F3D141:
1115	446	669
Total number of sequences before pre.cluster was 1115.
pre.cluster removed 669 sequences.

It took 0 secs to cluster 1115 sequences.

Processing group F3D142:
723	322	401
Total number of sequences before pre.cluster was 723.
pre.cluster removed 401 sequences.

It took 0 secs to cluster 723 sequences.

Pro

## 15.  chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)
Removes chimeras from the count file.  Chimeras are still present in the fasta file.
### 15.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.pick.count_table</td><td>New count table not including chimeras in counts</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.chimeras</td><td>Log of analysis done to find chimeras</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.uchime.accnos</td><td>List of chimeras</td></tr>
</table>

In [22]:
%%mothur
set.dir(input=./MiSeq_SOP)
chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > chimera.uchime(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.count_table, dereplicate=t)

Using 1 processors.
/home/qiime/mothur/mothur/uchime file does not exist. Checking path...
Found uchime in your path, using /home/qiime/mothur//uchime

uchime by Robert C. Edgar
http://drive5.com/uchime
This code is donated to the public domain.

Checking sequences from /home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta ...

It took 3 secs to check 564 sequences from group F3D0.

It took 2 secs to check 475 sequences from group F3D1.

It took 2 secs to check 446 sequences from group F3D141.

It took 1 secs to check 322 sequences from group F3D142.

It took 1 secs to check 342 se

## 16.  remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.accnos)
Removes chimeras from the fasta file.
### 16.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta</td><td>  New fasta file without the chimeras </td></tr>
</table>
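
Conceptually, this step is a set difference between the fasta file and the accnos list. The sketch below assumes the accnos content is plain text with one sequence name per line and that sequences are held in a dictionary; `remove_seqs` and the toy names are hypothetical.

```python
def remove_seqs(seqs, accnos_text):
    """Return the sequences whose names are not listed in the accnos text."""
    drop = {line.strip() for line in accnos_text.splitlines() if line.strip()}
    return {name: seq for name, seq in seqs.items() if name not in drop}


toy = {"s1": "ACGT", "s2": "ACGA", "s3": "TTTT"}
print(remove_seqs(toy, "s2\ns3\n"))
```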


In [24]:
%%mothur
set.dir(input=./MiSeq_SOP)
remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.accnos)

mothur > set.dir(input=./MiSeq_SOP)
Mothur's directories:
inputDir=/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/

mothur > remove.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.fasta, accnos=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.accnos)
Removed 3054 sequences from your fasta file.

Output File Names:
/home/qiime/Desktop/Mothur_Notebook/mothur_notebook/MiSeq_SOP/stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta


mothur > quit()


## 17.  classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)
Classifies the sequences into taxa using the training set.
### 17.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy</td><td>Taxonomic classification for each sequence based on the algorithm implemented in classify.seqs</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.tax.summary</td><td>Summary of how many sequences occur at each lineage, split by taxonomic level according to the taxonomic tree hierarchy.</td></tr>
</table>

In [None]:
%%mothur
set.dir(input=./MiSeq_SOP)
classify.seqs(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)

## 18.  remove.lineage(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)
Removes non-bacterial sequences
### 18.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy</td><td>The new taxonomy without the undesirable taxa. Numbers in the taxonomy indicate the confidence score assigned by the Wang method</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pick.fasta</td><td>New fasta alignment file without the undesirable taxa</td></tr>
<tr><td>stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.pick.count_table</td><td>New count table without the undesirable taxa</td></tr>
</table>
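
The lineage filter can be sketched as follows. It assumes each taxonomy string looks like `Bacteria(100);Firmicutes(98);`, that the `taxon` argument is a dash-separated list, and that a sequence is dropped when any level of its lineage exactly equals one of the listed names. The exact matching rule is an assumption, and `remove_lineage` here is a hypothetical Python helper, not the mothur command.

```python
def remove_lineage(taxonomy, taxon):
    """Keep only sequences whose lineage contains none of the unwanted taxa."""
    unwanted = set(taxon.split("-"))
    kept = {}
    for name, lineage in taxonomy.items():
        # Strip the "(score)" suffix from every level of the lineage
        levels = [level.split("(")[0] for level in lineage.strip(";").split(";")]
        if not unwanted.intersection(levels):
            kept[name] = lineage
    return kept


toy = {
    "s1": "Bacteria(100);Firmicutes(98);",
    "s2": "Bacteria(100);Cyanobacteria_Chloroplast(100);Chloroplast(100);",
    "s3": "unknown(100);unclassified(100);",
    "s4": "Archaea(99);Euryarchaeota(90);",
}
print(remove_lineage(toy, "Chloroplast-Mitochondria-unknown-Archaea-Eukaryota"))
```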


In [None]:
%%mothur
set.dir(input=./MiSeq_SOP)
remove.lineage(fasta=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.fasta, count=stability.trim.contigs.good.unique.good.filter.unique.precluster.denovo.uchime.pick.count_table, taxonomy=stability.trim.contigs.good.unique.good.filter.unique.precluster.pick.pds.wang.taxonomy, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)

## 19.  pre.clu
Removes sequences
### 19.1 Output Files
<table>
<tr><th>Output File</th><th>Description</th></tr>
<tr><td>      </td><td>       </td></tr>
<tr><td>      </td><td>       </td></tr>
<tr><td>      </td><td>       </td></tr>
<tr><td>      </td><td>       </td></tr>
<tr><td>      </td><td>       </td></tr>
</table>

In [None]:
%%mothur
set.dir(input=./MiSeq_SOP)
