# Running Python
The first, and simplest, way to run this program is to execute each command interactively at the prompt:

In [1]:
import sys

In [4]:
print("Python {}".format(sys.version_info))
print("Hello world")

Python sys.version_info(major=3, minor=5, micro=1, releaselevel='final', serial=0)
Hello world


The second way is to copy the commands into a text editor, save them as a `.py` file (here, `mflp.py`), and then run the script with the `%run` magic command:

In [13]:
%run mflp.py

Python sys.version_info(major=3, minor=5, micro=1, releaselevel='final', serial=0)
Hello world


A third way is to read the script and pass its contents to the built-in `exec` function:

In [15]:
exec(open('mflp.py').read())

Python sys.version_info(major=3, minor=5, micro=1, releaselevel='final', serial=0)
Hello world
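As a hedged sketch of the third approach, the cell below writes a small stand-in script to a temporary folder and runs it with `exec`. The file name `mflp_demo.py` and the two-line script are assumptions made for illustration, not the original `mflp.py`. Passing an explicit namespace dictionary is a choice made here so the script's variables do not leak into the calling session.

```python
import tempfile
from pathlib import Path

# Stand-in for the contents of mflp.py (hypothetical, for illustration only).
SCRIPT = 'greeting = "Hello world"\nprint(greeting)\n'

with tempfile.TemporaryDirectory() as tmp:
    script_path = Path(tmp) / "mflp_demo.py"
    script_path.write_text(SCRIPT)

    # Read the file inside a with-block so the handle is closed promptly.
    with open(script_path) as handle:
        source = handle.read()

# compile() attaches the file name so tracebacks point at the script.
code = compile(source, str(script_path), "exec")

# Run the script in its own namespace instead of the caller's globals.
namespace = {"__name__": "__main__"}
exec(code, namespace)

# Names defined by the script are available in the namespace dictionary.
print(namespace["greeting"])
```

Calling `exec(code)` with no dictionary would instead define `greeting` directly in the current session, which is what the one-line version above does.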


# Imports

In [16]:
import numpy

The `import` statement enters only the name of the module (in this case, `numpy`) into the current symbol table, not the names defined within that module. Those are reached through the module name, for example `numpy.array`.

In [17]:
from numpy import *

This variant of the `import` statement enters all the public names defined within the module into the current symbol table: those listed in the module's `__all__` if it defines one, otherwise every name not beginning with an underscore.

In [23]:
from numpy import random

This statement imports only the name `random` from the `numpy` package. Note that `numpy.random` is a submodule, not a function.

Of these three, the `from numpy import random` form is probably the most useful, because it brings in only the names you ask for, which keeps the risk of overwriting a name you have already defined small (substitute `random` with the name of your choice). The risk is not zero, though, as the next cell shows:

In [33]:
random

<module 'numpy.random' from '/Users/jsalt/anaconda/lib/python3.5/site-packages/numpy/random/__init__.py'>

The statement `from numpy import random` rebound the name `random`, so it no longer refers to the standard library (`stdlib`) `random` module, as evidenced by the path above. To point the name back at the `stdlib` version, execute the following commands:

In [36]:
import random
random

<module 'random' from '/Users/jsalt/anaconda/lib/python3.5/random.py'>
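The name clash shown above can be reproduced with the standard library alone. In this minimal sketch, `os.path` stands in for `numpy.random` so the cell runs without NumPy installed, and the string `"results/run_1.txt"` is a made-up value. It also shows one possible way around the clash, importing under an alias with `as`.

```python
import types

# A name we already use for our own data (hypothetical value).
path = "results/run_1.txt"

# Importing a submodule under the same name silently rebinds it.
from os import path
name_was_rebound = isinstance(path, types.ModuleType)

# An alias keeps both objects reachable under distinct names.
path = "results/run_1.txt"
from os import path as os_path

print(name_was_rebound)        # True
print(path)                    # results/run_1.txt
print(os_path.basename(path))  # run_1.txt
```

The same pattern applies to the example above: `from numpy import random as np_random` would leave the `stdlib` `random` name untouched.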

# Paths

Path to a file on the desktop: `/users/jsalt/desktop/file_1`

Path to a file in the home directory: `/users/jsalt/file_2`
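Paths like the two above can also be built in Python rather than typed by hand. The sketch below uses `pathlib` from the standard library and assumes the desktop folder is named `Desktop`, as on a default macOS setup. The files `file_1` and `file_2` are the example names from the text and need not exist.

```python
from pathlib import Path

# The current user's home directory, e.g. /Users/jsalt on the author's machine.
home = Path.home()

# The / operator joins path components with the correct separator.
desktop_file = home / "Desktop" / "file_1"
home_file = home / "file_2"

print(desktop_file)
print(home_file)

# .parent gives the enclosing directory, .name the final component.
print(desktop_file.parent == home / "Desktop")
print(home_file.parent == home)
```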

`$PATH` is an environment variable that lists the directories the shell searches for executable programs. Users can add directories to it so that the programs they work with most can be run by name from anywhere. (Source: https://en.wikipedia.org/wiki/PATH_(variable) (accessed 19 Jan 2016))

The `$PYTHONPATH` environment variable lists additional directories that Python searches when importing modules, so modules stored outside the default locations can be imported easily. (Source: https://support.enthought.com/hc/en-us/articles/204469160-How-do-I-set-PYTHONPATH-and-other-environment-variables-for-Canopy- (accessed 19 Jan 2016))
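Both variables can be inspected from inside Python. This is a small sketch using only the standard library. The output depends entirely on the machine it runs on, so none is shown, and `PYTHONPATH` is often unset, in which case its list is empty.

```python
import os
import sys

def split_dirs(value):
    """Split a PATH-style string into a list, dropping empty entries."""
    return [entry for entry in value.split(os.pathsep) if entry]

# Directories searched for executable programs.
path_dirs = split_dirs(os.environ.get("PATH", ""))

# Extra directories searched for importable modules (may be empty).
pythonpath_dirs = split_dirs(os.environ.get("PYTHONPATH", ""))

print(path_dirs[:3])
print(pythonpath_dirs)

# sys.path is the full list Python actually searches when importing;
# PYTHONPATH entries are added to it when the interpreter starts.
print(sys.path[:3])
```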

# Modules

The `biopython` package (imported as `Bio`) was created to enable bioinformatics applications in the Python language. Biopython has two main advantages. It lets you interact directly with online bioinformatics resources, such as those hosted by NCBI (BLAST, PubMed, etc.). Most importantly, it parses bioinformatics files (e.g. FASTA files, GenBank records) into Python-friendly formats. Central to this functionality is the `Seq` object, which allows the user to define a sequence (which is what we're all doing here, right?) and then manipulate it in various ways. The `Seq` object behaves similarly to a Python string, but with some exciting biologically-based differences. The following example is taken from the Biopython documentation (accessed 19 Jan 2016):

In [51]:
from Bio.Seq import Seq
my_seq = Seq("AGTACACTGGT")
my_seq

Seq('AGTACACTGGT', Alphabet())

In [53]:
print(my_seq)

AGTACACTGGT


In [48]:
my_seq.complement()

Seq('TCATGTGACCA', Alphabet())

In [49]:
my_seq.reverse_complement()

Seq('ACCAGTGTACT', Alphabet())

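Wow, how cool is that! `Alphabet`, which we haven't defined here, refers to the type of sequence (e.g. DNA, amino acid, etc.). It appears as the generic `Alphabet()` because no sequence type was given when `my_seq` was created. (Biopython 1.78 and later removed alphabets, so newer versions print `Seq('AGTACACTGGT')` instead.)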
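To make the `complement()` and `reverse_complement()` results above concrete, here is a plain-Python sketch of the same two operations on an ordinary string. It is not how Biopython implements them, and the helper names are made up for this example. It only handles the four unambiguous DNA bases.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(sequence):
    """Return the complementary strand, read in the same direction."""
    return sequence.translate(COMPLEMENT_TABLE)

def reverse_complement(sequence):
    """Return the complementary strand, read in the opposite direction."""
    return complement(sequence)[::-1]

my_seq = "AGTACACTGGT"
print(complement(my_seq))          # TCATGTGACCA
print(reverse_complement(my_seq))  # ACCAGTGTACT
```

Both results match the `Seq` output above.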

As mentioned above, the beauty of Biopython is its ability to parse bioinformatics files using modules such as `Bio.SeqIO`, which reads an input file and returns one record per sequence, giving access to, for example, the accession number, the sequence itself, its alphabet, and its length. Biopython has many different parsers, each suited to a different type of input file.
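To show roughly what "parsing" a sequence file involves, the sketch below reads FASTA-formatted text with plain Python. It is a simplified stand-in, not Biopython's parser, and the two records are invented for the example. With Biopython installed, the usual route would be `Bio.SeqIO.parse(handle, "fasta")`, which yields record objects with `.id` and `.seq` attributes.

```python
from io import StringIO

# Two made-up records; a real file would be opened with open(...) instead.
FASTA_TEXT = """>seq1 hypothetical example record
AGTACACTGGT
ACGT
>seq2 another hypothetical record
TTGACA
"""

def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA-formatted handle."""
    identifier, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            words = line[1:].split()
            # The identifier is the first word after '>'.
            identifier, chunks = (words[0] if words else ""), []
        else:
            # Sequence lines of one record are joined into a single string.
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)

records = list(read_fasta(StringIO(FASTA_TEXT)))
for identifier, sequence in records:
    print(identifier, len(sequence), sequence)
```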

The Biopython package will be very useful to me as I learn how to manipulate large sequence datasets to produce alignments and phylogenetic trees. For my own research I will need genomic sequence data from several species, which I can now access and interact with through Biopython. Furthermore, when I have generated new sequence data, I can use this package to write it out in standard formats such as GenBank or FASTA.

(Source: the very helpful Biopython documentation, http://biopython.org/DIST/docs/tutorial/Tutorial.pdf (accessed 19 Jan 2016)).