![Welcome](../pix/python-27624.png)

<center>
    <img src="files/images/aim.png" /> 
</center>


## What is Python?
[Python](http://www.python.org/) is a modern, general-purpose, object-oriented, high-level programming language.

General characteristics of Python:

* **clean and simple language:** Easy-to-read and intuitive code, easy-to-learn minimalistic syntax, maintainability scales well with size of projects.
* **expressive language:** Fewer lines of code, fewer bugs, easier to maintain.

Technical details:

* **dynamically typed:** No need to define the type of variables, function arguments or return types.
* **automatic memory management:** No need to explicitly allocate and deallocate memory for variables and data arrays, which avoids most memory leak bugs.
* **interpreted:** No need to compile the code. The Python interpreter reads and executes the Python code directly.
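
To make "dynamically typed" concrete, here is a minimal sketch (the variable names are made up for illustration). The same name is first bound to an integer and then to a string, without any type declaration. The parenthesised `print(...)` form with a single argument is used so the snippet runs under both Python 2.7 and Python 3.

```python
# The same name can refer to values of different types over time.
value = 42
first_type = type(value).__name__    # name of the type currently held

value = "ACGT"
second_type = type(value).__name__   # the type changed with the new value

print(first_type)    # int
print(second_type)   # str
```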

![popular](../pix/popular2015.jpg)

### Python compared to other languages

Let's make a simple program that **counts the number of adenines in a DNA sequence**. The point is not to understand every piece of code, but to see how differently the same task is written in each language.

Here is an outline of what we want to do:
<ol>
    <li>Go through each nucleotide in a DNA sequence ACGAAGTCGAGG that we call "dna_sequence". </li>
    <li>Ask if the current nucleotide is "A".</li>
    <ul>
        <li>If yes: Add 1 to a variable we call "nAde".</li>
        <li>If no: Don't add anything.</li>
    </ul>
    <li>Continue to next nucleotide until the end of the sequence</li>
    <li>Print the variable "nAde" to the screen</li>
</ol>


**In C:**

While C is one of the fastest computer languages in terms of running time, it has a rather steep learning curve for newcomers and offers little built-in functionality.



**In perl:**

Perl is excellent for making tools for life-science research, but like C it can be a little hard to read for newcomers.


**In Python:**

So far, you may not have understood the code for counting adenines in a DNA sequence, and that's okay. Try to read this piece of code out loud:

In [None]:
dna_sequence = "ACGAAGTCGAGGTCGAGGTCGAGGTCGAGGTCGAGGTCGAGGTCGAGG"
nAde = 0
for nucleotide in dna_sequence:
    nAde += 1 if nucleotide == "A" else 0
print nAde

In fact, Python has many built-in functions and methods to help us along the way. This code does the same trick in just one line:

In [None]:
"ACGAAGTCGAGG".count('A')

<img src="../pix/play2.jpg">
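
As a small extension of the `count` method used above, the following sketch tallies all four nucleotides of the example sequence in a dictionary. The names `dna_sequence` and `counts` are just illustrative choices, and the single-argument `print(...)` form is used so the snippet runs under both Python 2.7 and Python 3.

```python
dna_sequence = "ACGAAGTCGAGG"

# Build a dictionary mapping each nucleotide to how often it occurs.
counts = {}
for nucleotide in "ACGT":
    counts[nucleotide] = dna_sequence.count(nucleotide)

print(counts["A"])   # 4
print(counts["C"])   # 2
print(counts["G"])   # 5
print(counts["T"])   # 1
```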

### Advantages and disadvantages
Advantages:

* The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.
* Well-designed language that encourages many good programming practices:
 * Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.
 * Documentation tightly integrated with the code.
* A large standard library, and a large collection of add-on packages.
* Great performance due to close integration with time-tested and highly optimized codes written in C and Fortran:
    * Using in-built functions can really boost performance
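
The claim about built-in functions can be checked with the standard `timeit` module. The sketch below is only an illustration: the repeated sequence and the two helper functions are made up for the comparison, and the actual timings depend on your machine, so no numbers are shown.

```python
import timeit

# A longer, made-up sequence so that the timing difference is measurable.
dna_sequence = "ACGAAGTCGAGG" * 1000


def count_with_loop(sequence):
    """Count adenines one nucleotide at a time."""
    n_ade = 0
    for nucleotide in sequence:
        if nucleotide == "A":
            n_ade += 1
    return n_ade


def count_with_builtin(sequence):
    """Count adenines with the built-in string method."""
    return sequence.count("A")


# Run each version 20 times and report the total time in seconds.
loop_time = timeit.timeit(lambda: count_with_loop(dna_sequence), number=20)
builtin_time = timeit.timeit(lambda: count_with_builtin(dna_sequence), number=20)

print("loop:     %.5f s" % loop_time)
print("built-in: %.5f s" % builtin_time)
```

Both functions return the same count; on a typical installation the built-in version is expected to be considerably faster because the counting happens in compiled code.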

Disadvantages:

* Since Python is an interpreted and dynamically typed programming language, the execution of python code can be slow compared to compiled statically typed programming languages, such as C and Fortran. 
* Somewhat decentralized, with different environments, packages and documentation spread out in different places. This can make it harder to get started.
* Traditionally, the bioinformatics community was heavily Perl oriented


### Python Versions

Python is evolving. At the moment, there are two major versions of Python available: Python 2.7 and Python 3.2. In this course, **we'll be using Python 2.7**. Why aren't we using the more up-to-date Python 3? There's a bit of history: For most of Python's lifetime, each new version of the language would introduce new features, but would try very, very hard to not break any code that other people had already written; in other words, most changes were backwards compatible. In about 2006, however, Python's creator and "Benevolent Dictator For Life" Guido van Rossum decided that there were a number of things that he had gotten wrong in the original Python specification, and would like to change. Making those changes would break other people's code, so it was decided to make them all at once. <br /><br />

<center>
<img src="files/images/pyhistory.svg" />
</center>
<br /><br />
Although that version was released almost 5 years ago, there are lots of libraries that haven't completely switched. The number of "gotchas" between Python 2 and 3 is relatively small, and we'll try to point out where there are differences, so if you do decide to make the leap, things should go relatively smoothly.
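
Two of the best-known differences are integer division and `print`. The sketch below is written so that it runs unchanged under both versions; the variable names are illustrative only.

```python
import sys

# In Python 2, 7 / 2 between integers truncates to 3; in Python 3 it gives 3.5.
# Floor division (//) and division by a float behave the same in both versions.
floor_result = 7 // 2
float_result = 7 / 2.0
print(floor_result)   # 3
print(float_result)   # 3.5

# print is a statement in Python 2 and a function in Python 3.
# With a single argument, the parenthesised form is valid in both.
print("Running Python %d.%d" % sys.version_info[:2])
```

Another difference you will meet in this course: `raw_input()` in Python 2 was renamed to `input()` in Python 3.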

## Python Environments
There are also many different environments through which the Python interpreter can be used. Each environment has different advantages and is suitable for different workflows. One strength of Python is that it is versatile and can be used in complementary ways, but this can be confusing for beginners, so we will start with a brief survey of Python environments that are useful for scientific computing.

<p></p>


#### Python interpreter

The standard way to use the Python programming language is to use the Python interpreter to run Python code. The Python interpreter is a program that reads and executes the Python code in files passed to it as arguments. At the command prompt, the command ``python`` is used to invoke the Python interpreter.

For example, to run a file <cb>my-program.py</cb> that contains python code from the command prompt, use:

<cb>python my-program.py</cb>

We can also start the interpreter by simply typing <cb>python</cb> at the command line, and interactively type python code into the interpreter. 

<center>
<img src="files/images/python_prompt.png" width="600px"/>
</center>

<p></p>

#### IPython
IPython is an interactive shell that addresses the limitations of the standard Python interpreter, and it is a workhorse for scientific use of Python. It provides an interactive prompt to the Python interpreter with greatly improved user-friendliness.

<center>
<img src="files/images/ipython_prompt.png" width="600px"/>
</center>

Some of the many useful features of IPython include:

* Command history, which can be browsed with the up and down arrows on the keyboard.
* Tab auto-completion.
* In-line editing of code.
* Object introspection, and automatic extraction of documentation strings from Python objects like classes and functions.
* Good interaction with the operating system shell.
* Support for multiple parallel back-end processes that can run on computing clusters or cloud services like Amazon EC2.

#### IPython Notebook

IPython notebook is an HTML-based notebook environment for Python, similar to Mathematica or Maple. It is based on the IPython shell, but provides a cell-based environment with great interactivity, where calculations can be organized and documented in a structured way.

<img src="files/images/notebook.png" width="800" />

Although they use a web browser as the graphical interface, IPython notebooks are usually run locally, on the same computer that runs the browser. To start a new IPython notebook session, run the following command:

<cb>ipython notebook</cb>

from a directory where you want the notebooks to be stored. This will open a new browser window (or a new tab in an existing window) with an index page where existing notebooks are shown and from which new notebooks can be created.
<p></p>

#### Komodo and sublime-text

Komodo and Sublime-Text are feature-rich editors made for multiple languages (not just Python). They are both highly extensible and both have a large repertoire of plugins and packages. They don't include built-in Python interpreters, but they offer a minimalistic programming experience, which can be quite productive in itself.

<img src="files/images/komodo.png" width=800px />

<p></p>

<img src="files/images/sublime.png" width=800px />

<p></p>

#### Spyder

Spyder is a MATLAB-like IDE for scientific computing with python. It has the many advantages of a traditional IDE environment, for example that everything from code editing, execution and debugging is carried out in a single environment, and work on different calculations can be organized as projects in the IDE environment.

If you are coming from MATLAB or something similar this is a good editor to start with.

<img src="files/images/spyder.png" width="800px" />

#### Emacs python-mode.el

![Emacs](../pix/emacs2.png)



## Let's get started: 

##### Form groups of two

## Our first python code:

### Printing values

In [None]:
print 

This is a **code snippet** (blocks of text in gray boxes like this indicate Python code). This code can be put into a text file and saved as a *myprogram.py* file. When run, it displays the output shown in the lines underneath the code snippet.

In [None]:
print 12 + 14 + 100.0

You can print multiple expressions by separating them with commas:

In [None]:
print "the answer:", 21*2

<img src="../pix/play2.jpg">

### Using variables

Instead of directly operating on values, like "the answer:" and 42, we can store them in variables. This is especially useful when working with multiple values at the same time. 

In [None]:
A = 12
B = 16
print A + B

Variables can be used on the right hand side of an assignment as well, in which case they will be evaluated before the value is assigned to the variable on the left hand side.

In [None]:
C = 2 * A
print C

In [None]:
C = C + 1
C

Updates like this can be written more compactly with an augmented assignment. For example, to subtract 1 from C:

In [None]:
C -= 1
print C
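
The same shorthand exists for the other arithmetic operators. A minimal sketch, with a starting value chosen only for illustration (the parenthesised single-argument `print(...)` runs under both Python 2.7 and Python 3):

```python
C = 10     # arbitrary starting value for this illustration
C += 5     # same as C = C + 5  -> 15
C -= 1     # same as C = C - 1  -> 14
C *= 2     # same as C = C * 2  -> 28
print(C)   # 28
```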

<img src="../pix/play2.jpg">

### Using Calls
In Python there are two types of calls:

<ul>
<li>Function call</li>
<li>Method call</li>
</ul>

#### Function call
The simplest kind of call is to invoke a *function*. 

In [None]:
len("TATA")

A *function* is invoked by a <span style="color: green; font-style: italic">function name</span>, a pair of parentheses and zero or more <span style="color: red; font-style: italic">arguments</span>. Here are some more examples:

In [None]:
type("TATA")

In [None]:
raw_input()

In [None]:
abs(-12)

<img src="../pix/play2.jpg">

#### Method call
Besides being passed as values to Python's built-in functions, most objects in Python have their own functions, called *methods*.

A method is called like a function, except that it is attached to an object with a dot and is specific to that type of object:

In [None]:
mystring = "TATA"
mystring.upper?

In [None]:
"TATA".count("A")

In [None]:
"GCTAGTCAAGCTTACTATTTTGGCATTGGCATGAG".find("TGGCAT")

In [None]:
X = 5.25
X.as_integer_ratio()
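
To illustrate that methods belong to a particular type of object, the sketch below uses the built-in `hasattr` function to check whether a string and an integer offer an `upper` method. The variable names are made up for this example.

```python
sequence = "tata"

# Strings have an upper() method...
upper_sequence = sequence.upper()
has_upper_str = hasattr(sequence, "upper")

# ...but integers do not, so 42.upper would raise an AttributeError.
has_upper_int = hasattr(42, "upper")

print(upper_sequence)   # TATA
print(has_upper_str)    # True
print(has_upper_int)    # False
```

By contrast, a built-in function such as `len` is not tied to one type and works on strings, lists and other containers alike.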

<img src="../pix/play2.jpg">

### Different usage of python

#### Interactive mode (PROMPT):
Allows you to get an immediate reply to each instruction.

In [None]:
print "hello world"

In [None]:
name = raw_input("Type your name: ")
print "Your name is", name, "!"

#### Batch mode (PROGRAM):
Allows you to run a series of instructions before getting the output.

Let's program a doorman in a file called **doorman.py:**

In [None]:
#!/usr/bin/python

""" My first Python program """

# GET INPUT FROM USER
name = raw_input("Type your name: ").capitalize()

# VALIDATE INPUT
guest_list = ["John", "Charles", "Hans"]

if name in guest_list:
    print "Please enter", name
else:
    print "Your name is not on the guest list! Get lost!"

In [None]:
name = raw_input("Type your name: ").capitalize()

The above program can be run with either <cb>python doorman.py</cb> or, once the file has been made executable (for example with <cb>chmod +x doorman.py</cb>), simply <cb>./doorman.py</cb>

<qq> Q: How would you make sure that JOHN and john would also be accepted?</qq>

### Commenting

When you are writing a program it is often convenient to annotate your code to remind you what you intended it to do. In programming these annotations are known as comments. You can include a comment in Python by prefixing some text with a # character. All text following the # on that line will then be ignored by the interpreter. You can start a comment on its own line, or you can include it at the end of a line of code.

In [None]:
print "Hi" # this will be ignored
# as will this
print "Bye"
# print "Never seen"

Being able to write **meaningful** comments is one of the best attributes of a skilled programmer! It makes it really easy for others to read your code. Here are some examples of meaningful comments (don't worry about what the code does!)


In [None]:
    ########################## PARSE INPUTFILE ##########################
    validate.checkFile(inputfile)
    seqdat = seqParse.parse(inputfile, type="fasta")

    ...
   
    ############################ COMPUTE PSSM ###########################
    # SET NUMBER OF SEQUENCES
    seqdat.numseq = len(seqdat.aln)
    # CALCULATE SEQUENCE WEIGHTING
    if sequenceWeighting == 1: # Heuristics
        alpha, weights = Heuristic(seqdat.aln, alphabet, gaps)
    elif sequenceWeighting == 2: # Hobohm
        alpha, weights = Hobohm(seqdat.aln, alphabet, gaps, threshold)
    else: 
        alpha, weights = seqdat.numseq - 1, [1]*seqdat.numseq # Set weights to 1
    # APPLY SEQUENCE WEIGHTING
    seqdat.set_sw(weights)
    # CALCULATE FREQUENCIES
    seqdat.calc_freq()
    # CALCULATE PROBABILITIES
    seqdat.calc_prob(alpha, beta)

Again, don't worry if you don't understand the terminology of the code above. You will get hands-on experience with it later in the course.

### Getting more out of Python with modules

You can extend the functionality of Python by importing more features (**modules**). For instance, if you want to do more than the basic math operations you can import the math module

In [None]:
import math

x = math.cos(2 * math.pi)
print x

You can import the *cos* and *pi* objects specifically using

In [None]:
from math import cos, pi

x = cos(2*pi)
print x
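
A third common style is to import a module under a shorter alias. The alias `m` below is an arbitrary choice for this sketch, and `sqrt`, `log` and `e` are all part of the standard `math` module.

```python
import math as m   # 'm' is just a shorter, arbitrary alias

root = m.sqrt(16)        # square root
log_value = m.log(m.e)   # natural logarithm of e

print(root)        # 4.0
print(log_value)   # 1.0
```

Importing the whole module (with or without an alias) keeps it clear where each function comes from, while `from math import ...` saves typing at the cost of that context.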

We will see a lot more of modules later in the course.

## More information about Python

S. Bassi, Python for Bioinformatics, Chapter 1-3.

P. Barry, Head First: Python, Chapter 1.

M. Model, Bioinformatics Programming Using Python, Chapter 1 and 3.

J.R. Johansson (robert@riken.jp) http://dml.riken.jp/~rob/,
[github scientific-python-lectures](http://github.com/jrjohansson/scientific-python-lectures)

The Python Documentation https://docs.python.org

The Python Tutorial https://docs.python.org/tutorial/

The Biopython project http://www.biopython.org


<img src="../pix/book.gif" width=50px> Suggested reading for today's exercises: 
* Python for Bioinformatics by S. Bassi - Chapter 2.2, 2.3 and 2.4

<img src="../pix/book.gif" width=50px> Required reading for next week: 
* Python for Bioinformatics by S. Bassi - Chapter 3

## Markup and styles

In [None]:
from IPython.core.display import HTML


def css_styling():
    # Read the custom stylesheet and close the file when done
    with open("../styles/custom_slide.css", "r") as css_file:
        styles = css_file.read()
    return HTML(styles)
css_styling()