# Basic Usage

This notebook demonstrates the basic usage of the mothur-py module for running mothur.

Using this module requires that mothur is on the user's `PATH` environment variable.
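
Before creating any `Mothur` instance, it can help to confirm that the requirement above is met. The following is a minimal sketch using only the Python standard library. The helper name `mothur_on_path` is hypothetical and not part of mothur-py, and the executable is assumed to be named `mothur`.

```python
import shutil


def mothur_on_path(executable="mothur"):
    """Return the full path to the executable if it is on PATH, else None."""
    # shutil.which searches the directories listed in the PATH variable
    return shutil.which(executable)


location = mothur_on_path()
if location is None:
    print("mothur was not found on PATH; add its directory to PATH first")
else:
    print(f"mothur found at {location}")
```

If the check reports that mothur is missing, add the directory containing the mothur executable to `PATH` and restart the notebook kernel.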

Run using v0.3.0

In [1]:
# import all required modules/packages
import os
import sys

# change working directory to allow import of package without installation
work_dir = os.getcwd()
base_dir = os.path.join(work_dir, '..', '..')
os.chdir(base_dir)

# import the Mothur class from the package
from mothur_py import Mothur
os.chdir(work_dir)

This module revolves around the `Mothur` class, which catches method calls and passes them to mothur to be run
as commands. Create an instance of the `Mothur` class before running any commands:


In [2]:
# create instance of Mothur class with verbosity at 1 to show normal mothur output
m = Mothur(verbosity=1)

Commands in mothur can then be executed as methods of the `Mothur` class instance using the same names you would use 
within the command line version of mothur:

In [3]:
# run the mothur help command
m.help()

mothur > help()
Valid commands are: align.check, align.seqs, amova, anosim, bin.seqs, biom.info, catchall, chimera.bellerophon, chimera.ccode, chimera.check, chimera.perseus, chimera.pintail, chimera.slayer, chimera.uchime, chimera.vsearch, chop.seqs, classify.otu, classify.seqs, classify.svm, classify.tree, clearcut, cluster, cluster.classic, cluster.fragments, cluster.split, collect.shared, collect.single, consensus.seqs, cooccurrence, corr.axes, count.groups, count.seqs, create.database, degap.seqs, deunique.seqs, deunique.tree, dist.seqs, dist.shared, fastq.info, filter.seqs, filter.shared, get.commandinfo, get.communitytype, get.coremicrobiome, get.current, get.dists, get.group, get.groups, get.label, get.lineage, get.mimarkspackage, get.otulabels, get.otulist, get.oturep, get.otus, get.rabund, get.relabund, get.sabund, get.seqs, get.sharedseqs, heatmap.bin, heatmap.sim, help, homova, indicator, kruskal.wallis, lefse, libshuff, list.otulabels, list.otus, list.seqs, make.biom, make
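
To illustrate how a Python object can turn method calls into mothur-style command text, here is a minimal sketch. It is not the mothur-py implementation: the class `CommandBuilder` is hypothetical, it only builds the command string, and it never runs mothur. The output format is assumed from the `mothur >` lines echoed in this notebook.

```python
class CommandBuilder:
    """Hypothetical stand-in that collects attribute names into a command."""

    def __init__(self, parts=()):
        self._parts = tuple(parts)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so chained
        # access such as builder.make.contigs accumulates the names.
        if name.startswith("_"):
            raise AttributeError(name)
        return CommandBuilder(self._parts + (name,))

    def __call__(self, **params):
        # Join the collected names with dots and format keyword arguments
        # as comma-separated key=value pairs.
        command = ".".join(self._parts)
        args = ",".join(f"{key}={value}" for key, value in params.items())
        return f"{command}({args})"


builder = CommandBuilder()
command_text = builder.make.contigs(file="basic_usage.files", processors=2)
print(command_text)  # make.contigs(file=basic_usage.files,processors=2)
```

Because `__getattr__` is consulted only for names that do not already exist on the object, any mothur command name can be used without defining a method for each one in advance.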

---

Unlike in the command line version, command parameters must be passed as Python strings, integers, or floats, so values such as file names need quotes:

In [4]:
m.make.contigs(file='basic_usage.files', processors=2)

mothur > make.contigs(file=basic_usage.files,processors=2)

Using 2 processors.

>>>>>	Processing file pair basic_usage_R1.fastq - basic_usage_R2.fastq (files 1 of 1)	<<<<<
Making contigs...
12
13
Done.

It took 1 secs to assemble 25 reads.

It took 1 secs to process 25 sequences.

Group count: 
basic_usage_group	25

Total of all groups is 25

Output File Names: 
basic_usage.trim.contigs.fasta
basic_usage.trim.contigs.qual
basic_usage.contigs.report
basic_usage.scrap.contigs.fasta
basic_usage.scrap.contigs.qual
basic_usage.contigs.groups


<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<^>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<^>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<^>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<^>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<^>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<^>>>>>>>>>>>>>>>>>>>>>>>>>>>>>


---

Failing to do so will generally cause Python to raise a `NameError`, because an unquoted file name is evaluated as a variable name:

In [5]:
m.make.contigs(file=basic_usage.files)

NameError: name 'basic_usage' is not defined
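
The error comes from Python itself and is raised before any command reaches mothur. The sketch below reproduces it without mothur installed. The function `echo_params` is a hypothetical stand-in for a mothur command and simply returns the keyword arguments it receives.

```python
def echo_params(**params):
    """Hypothetical stand-in for a mothur command; returns its keyword arguments."""
    return params


# Quoted: the file name is an ordinary Python string.
quoted = echo_params(file="basic_usage.files")

# Unquoted: Python first tries to look up a variable named basic_usage,
# which fails before echo_params is ever called.
try:
    echo_params(file=basic_usage.files)
except NameError as err:
    error_message = str(err)

print(quoted)
print(error_message)
```

Quoting the value is therefore the fix: `file='basic_usage.files'` passes the file name through as a string.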

---

The `current` keyword from the command line version of mothur is also fully implemented:

In [6]:
# run the mothur summary.seqs command using the 'current' option
# NOTE: current is being passed as a string
m.summary.seqs(fasta='current')

mothur > summary.seqs(fasta=current)
Using basic_usage.trim.contigs.fasta as input file for the fasta parameter.

Using 2 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	252	252	0	3	1
2.5%-tile:	1	252	252	0	3	1
25%-tile:	1	252	252	0	4	7
Median: 	1	252	252	0	4	13
75%-tile:	1	253	253	0	4	19
97.5%-tile:	1	254	254	3	5	25
Maximum:	1	254	254	3	5	25
Mean:	1	252.36	252.36	0.24	4.12
# of Seqs:	25

Output File Names: 
basic_usage.trim.contigs.summary

It took 0 secs to summarize 25 sequences.



In [7]:
# like the command line version, you don't even need to specify 
# the 'current' keyword for some commands
m.summary.seqs() 

mothur > summary.seqs()
Using basic_usage.trim.contigs.fasta as input file for the fasta parameter.

Using 2 processors.

		Start	End	NBases	Ambigs	Polymer	NumSeqs
Minimum:	1	252	252	0	3	1
2.5%-tile:	1	252	252	0	3	1
25%-tile:	1	252	252	0	4	7
Median: 	1	252	252	0	4	13
75%-tile:	1	253	253	0	4	19
97.5%-tile:	1	254	254	3	5	25
Maximum:	1	254	254	3	5	25
Mean:	1	252.36	252.36	0.24	4.12
# of Seqs:	25

Output File Names: 
basic_usage.trim.contigs.summary

It took 0 secs to summarize 25 sequences.

