# A crash course in Python 
**Authors**: Thierry D.G.A Mondeel, Stefania Astrologo, Ewelina Weglarz-Tomczak & Hans V. Westerhoff <br/>
University of Amsterdam <br/>
2016 - 2019

**Acknowledgements:**
This tutorial draws in part on material from various notebooks on the web. Most notably:

* [Learning IPython for Interactive Computing and Data Visualization, second edition](http://ipython-books.github.io/minibook/)

---

<span style="color:red">**Assignment (1 sec):**</span> Execute the cell below. Don't worry about what it means (it makes sure you will see all the output you compute below). 

In [None]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## The goal of this notebook
The goal here is **not** to give you a complete introduction to Python.
We just want you to be familiar enough with it to interact with prewritten Python code for the FBA calculations later on in the tutorial.

![title](https://imgs.xkcd.com/comics/python.png)

## Some properties of [Python](http://www.python.org)
* Python is a modern programming language, first released in the early 1990s by [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum), a Dutch guy!
* It is beginner friendly
* It is easy to read and understand
* It is free
* Its use is pervasive in computational biology

## Printing text in Python: The central dogma
The `print()` function is used to print text, numbers, and results of computations to the screen.
**Note:** If you want to print text, put the text between quotation marks: "text".

In [None]:
print("The classic view of the central dogma of biology states that \
'the coded genetic information hard-wired into DNA is transcribed into \
individual transportable cassettes, composed of messenger RNA (mRNA); \
each mRNA cassette contains the program for synthesis of a particular \
protein (or small number of proteins).'")

<span style="color:red">**Assignment (1 min):**</span> print the name of your favorite protein (or your name) below. 

Congratulations! You are now a Python programmer.

## Numbers matter in biology
<span style="color:red">**Assignment (1 min):**</span> Open the website of [Bionumbers](https://bionumbers.hms.harvard.edu/Includes/KeyNumbersLinks.pdf) and keep the tab open.

Luckily we can use Python as a calculator.

<span style="color:red">**Assignment (1 min):**</span> Below we show some examples. Execute the cell. Do you understand what each one does?

In [None]:
42
17+19
2 * 2
3 / 2
2**5

Python's built-in mathematical operators include `+` for addition, `-` for subtraction, `*` for multiplication, `**` for exponentiation, and `/` for division.
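As a small sketch of how these operators combine (the numbers and variable names below are made up for illustration): Python follows the usual order of operations, and `/` returns a floating-point number even when the result is whole.

```python
# Illustrative values only; these names are not used elsewhere in the tutorial.
total = 2 + 3 * 4        # multiplication happens before addition
grouped = (2 + 3) * 4    # parentheses change the order
ratio = 10 / 4           # division returns a float
whole = 10 / 5           # still a float, even though the result is whole
cube = 2 ** 3            # exponentiation

print(total, grouped, ratio, whole, cube)
```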

<span style="color:red">**Assignment (2 min):**</span> 
According to the [Bionumbers](https://bionumbers.hms.harvard.edu/Includes/KeyNumbersLinks.pdf) website you opened above, *E. coli* has a volume of up to $5~\mu m^3$ and yeast has a volume of up to $160~\mu m^3$.

Calculate in the cell below how many times bigger the volume of yeast is compared to *E. coli*.

<span style="color:red">**Assignment (3 min):**</span> Using [Bionumbers](https://bionumbers.hms.harvard.edu/Includes/KeyNumbersLinks.pdf), find the "average protein diameter" and the "diameter of a yeast cell". Look carefully at the units. How many "average proteins" could theoretically lie side by side across a yeast cell? **Neglect any aspects of folding or interactions.**

## Variables
Variables form a fundamental concept of any programming language. A variable has a name and a value. Here is how to create a new variable in Python:

In [None]:
avogadro = 6e23

print("Question: How many molecules are contained in a mole?")

# show the answer using a print statement
print("Answer:",avogadro,"molecules.")

# here is an alternative way to print the answer
avogadro

<span style="color:red">**Question (1 min):**</span> Observe the two ways the variable is used: inside the print statement it is combined with text, while writing the variable name on a line by itself simply displays its value.

Also note how we used the `#` character to write **comments**. Whereas Python discards the comments completely, adding comments in the code is important when the code is to be read by other humans (including your future self).

You can perform calculations using an existing variable:

In [None]:
print("How many molecules in 2 moles?") 
2 * avogadro

<span style="color:red">**Assignment (2 min):**</span> 
* Look up the molar mass of water ($H_2O$) 
* Use the Avogadro constant (the `avogadro` variable defined above) to calculate and print the number of molecules in 1 L (1 kg) of water

## Types of variables

There are different types of variables. Above, we have used various numbers, for example **integers** (whole numbers such as `42`). Other important types include **floating-point numbers** (such as `3.1415`) to represent real numbers, **strings** to represent text, and **booleans** to represent `True`/`False` values. Here are a few examples:

In [None]:
somefloat = 3.1415
sometext = 'pi is about'  # You can also use double quotes.
print(sometext, somefloat)  # Display several variables.
I_am_true = False
I_am_true 

<span style="color:red">**Assignment (1 min):**</span>
Make your own piece of text, e.g. your name and age, and print it to the screen. Do not just write everything as one string of text; make your age a variable, like `somefloat` in the example above.

## Murphy's law: (In a tutorial) Anything that can go wrong will go wrong
When you write something Python doesn't understand, it throws a so-called "exception" and tries to explain what went wrong, but it can only speak in a broken Pythonesque English.

Let's see some examples by running these code blocks. This is helpful later on because you will likely encounter (and produce) some errors.

<span style="color:red">**Assignment (1 min):**</span>
Execute the cells below and look at the error messages that appear. Do you understand what is wrong in each case?

In [None]:
gibberish

In [None]:
print('Hello'

In [None]:
1_my_variable_name_starting_with_a_number = 1

In [None]:
2000 / 0

Python tries to tell you where it stopped understanding, although in the above examples each program is only one line long.

It also tries to show you where on the line the problem happened, using a caret ("^").

Finally, it tells you the type of thing that went wrong (NameError, SyntaxError, ZeroDivisionError) and gives a bit more information, such as "name 'gibberish' is not defined" or "unexpected EOF while parsing".

Unfortunately, you might not find "unexpected EOF while parsing" too helpful. EOF stands for End of File, but what file? And what is parsing? Here, the "file" is simply the command you entered, and parsing is Python's attempt to read it. The problem is that we forgot to close the print statement with a final parenthesis ')'. The exact wording depends on your Python version; version 3.10 and later say more directly that the '(' was never closed.

Python does its best, but it does take a bit of time to develop a knack for what these messages mean. If you run into an error you don't understand, please ask a tutor.
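If you want to look at an error without the cell stopping, one option is Python's `try`/`except` construct. It is not needed for the rest of this tutorial; the sketch below only reuses the division by zero from above to show that the error type is something you can inspect. The variable names are made up for illustration.

```python
# Catch the error from the division-by-zero example and inspect it.
try:
    result = 2000 / 0
except ZeroDivisionError as error:
    error_type = type(error).__name__   # the kind of error, as text
    error_message = str(error)          # the explanation Python gives
    print(error_type, "->", error_message)
```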

## Types of variables in Python
When you define a variable in Python it has a type. Above we dealt with numbers, which are of type integer or float (numbers with a decimal point). Below we briefly introduce the other types you might see in the rest of the tutorial.

Simply put, there is text, i.e. strings, and there are two kinds of containers: lists and dictionaries.
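To check the type of any value, you can use the built-in `type()` function. A short sketch with made-up example values:

```python
# Example values (chosen for illustration) and their types.
cell_count = 42                  # integer
volume = 5.0                     # float
organism = "yeast"               # string
volumes = [5, 160]               # list
genome_sizes = {"yeast": 12}     # dictionary

print(type(cell_count))
print(type(volume))
print(type(organism))
print(type(volumes))
print(type(genome_sizes))
```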

### The Written Word, i.e. strings

Numbers are great... but most of our day to day computing needs involves text, from emails to tweets to documents. Or in biology: DNA sequences, chemical formulas, hyperlinks between databases etc.

We have already seen a couple of strings in Python. Programmers call text *strings* because they are weird like that. From now on we will only refer to strings, but we just mean pieces of text inside our code.

In [None]:
"Hello, World!"

Strings are surrounded by quotes. Without the quotes Hello by itself would be viewed as a variable name.

You can use either double quotes (") or single quotes (') for text/strings. As we saw before we can also save text in variables.

Let's use strings with variables!

In [None]:
your_name = "James Watson"
print("Hello,",your_name)

Strings in Python are a bit more complicated because the operations on them aren't just + and * (though those are valid operations).
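A brief sketch of what `+` and `*` do with strings, using a made-up DNA fragment: `+` glues strings together and `*` repeats a string.

```python
start_codon = "ATG"
stop_codon = "TAA"

gene = start_codon + "GCC" + stop_codon   # + concatenates strings
repeat = "CAG" * 3                        # * repeats a string

print(gene)
print(repeat)
print(len(gene))   # number of characters in the string
```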

## Finding functions that belong to an object or are part of a module: What kind of functions does NumPy contain?

Execute the cell below to load the famous [NumPy](https://www.numpy.org/) package, "the fundamental package for scientific computing with Python". We will give it a short name for brevity: 'np'.

In [None]:
import numpy as np

Packages (or "modules" in Python speak) like NumPy, and COBRApy later on in this tutorial, contain various specialized functions to do cool things with.

If you want to find out which ones, type "np." (notice the dot!) and then press "TAB".

<span style="color:red">**Assignment (1 min):**</span> Use the TAB key on the numpy library to find out some math functions you could use. Do you see any that you recognize as mathematical functions?

Below we will see how you could figure out what each function in NumPy actually does.

## Getting help in the notebook

When you want to know what a command or function in Python does, you can type a question mark `?` in front of the command and execute the cell. An alternative to the question mark is Shift-Tab. First type the name of the command (i.e. **module_name.function_name**), then press Shift-Tab. A tooltip will appear showing you the help file for the function.

The first lines of the window that appears give you information on the input to the function. A little below that, you will find a general explanation of the purpose of the function and its parameters.
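Outside the notebook interface, similar information is available from plain Python. The sketch below uses Python's standard `math` module instead of NumPy, purely so that it runs without extra packages. `dir()` lists the names inside a module (much like pressing TAB), and a function's `__doc__` attribute holds the documentation text that forms the core of what the question mark displays.

```python
import math

# dir() lists the names inside a module, similar to pressing TAB after "math."
names = dir(math)
print(names)

# The documentation text of a function is stored in its __doc__ attribute
print(math.sqrt.__doc__)

print(math.sqrt(16))
```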

<span style="color:red">**Assignment (1 min):**</span> Ask for help on the "np.sqrt" command using the Shift-Tab method described above. Start by typing np.sqrt in the cell below and then press Shift-Tab. Read some of the documentation and then click away the popup that appeared. Try calculating the "sqrt" of some number. Also try out the question mark approach.

In [None]:
# Look at the help for np.sqrt here


<span style="color:red">**Assignment (1 min):**</span> Why does the np.sqrt documentation talk about "arrays"? NumPy is optimized for large mathematical computations on big sets of numbers. An array is simply a list of numbers. NumPy deals nicely with such lists or arrays. Execute the cell below to see an example.

In [None]:
np.sqrt([1, 2, 100])

## Dot notation and object oriented programming
**I.e. life involves organisms, which have cells, which have organelles, which contain metabolites, enzymes (reactions) and molecules; molecules contain protons and electrons, etc.**

Python, like many programming languages, supports Object Oriented Programming, or OOP for short. In this paradigm, we approach ideas as Objects much as we do in the real world. Each Object is an instance of a Class, i.e. a type of object. Such an object may have certain properties, and functions that can be applied to it.

So what does all this have to do with dot notation? Dot notation allows us to tell an instance of a class to use one of the functions inside that class. That is why we access the 'upper' function of a string as my_string.upper(). The same notation is used to reach the functions inside a module, which is why we access the sqrt function from the numpy module as np.sqrt.
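A small sketch of dot notation on a string object, using a made-up DNA sequence. `count` and `replace` are standard string functions; the variable names are only for illustration.

```python
dna = "ATGGCGTAA"

g_count = dna.count("G")       # how often does "G" occur in the string?
rna = dna.replace("T", "U")    # make a new string with every T swapped for a U

print(g_count)
print(rna)
```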

<span style="color:red">**Assignment (2 min):**</span> Below, first execute the definition of 'your_string'. After that, in the second cell, type a dot after 'your_string' and press Tab. You will get a list of functions you can apply to the string. Pick the 'upper' function. Look at the documentation for this function. What does it do? Now apply the function by writing your_string.upper and adding parentheses '()'. `upper` is a function, and to call (run) a function you always add parentheses.

In [None]:
your_string = 'something'

In [None]:
your_string

**Note** that you have to define a string first to be able to get help. So first execute the cell so that the variable your_string is known. Then using the Tab key find upper. 

**Note:** The point here is just to get you comfortable with the dot notation and finding functions and properties of objects.

# If you have spent > 45 min. on the tutorial so far. Stop this part of the tutorial and continue on to the next part of the tutorial.
If you are fast and you spent < 45 min on the above part of the tutorial. Feel free to finish the part below and become a Python genius.

# Lists, Loops and Dictionaries

## Lists

A list contains a sequence of items. You can concisely instruct Python to perform repeated actions on the elements of a list. Let's first create a list of numbers:

In [None]:
items = [1, 3, 0, 4, 1]

Note the syntax we used to create the list: square brackets `[]`, and commas `,` to separate the items.

The *built-in* function `len()` returns the number of elements in a list:

In [None]:
len(items)

We can also access individual elements in the list, using the following syntax:

In [None]:
items[0]

items[-1] 

Note that indexing starts at `0` in Python: the first element of the list is indexed by `0`, the second by `1`, and so on. Also, `-1` refers to the last element, `-2` to the penultimate element, and so on.
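The same indexing rules apply to lists of any kind of item. A short sketch with a made-up list of organism names:

```python
organisms = ["E. coli", "yeast", "mouse", "human"]

first = organisms[0]          # index 0 is the first element
second = organisms[1]
last = organisms[-1]          # negative indices count from the end
penultimate = organisms[-2]

print(first, second, last, penultimate)
```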

The same syntax can be used to alter elements in the list:

In [None]:
items[1] = 9
items

We can access sublists with the following syntax:

In [None]:
items[1:3]

Here, `1:3` represents a **slice** going from element `1` _included_ (this is the second element of the list) to element `3` _excluded_. Thus, we get a sublist with the second and third elements of the original list. The first-included/last-excluded asymmetry leads to an intuitive treatment of overlaps between consecutive slices. Also, note that slicing a list creates a new list (a copy of the selected elements); changing elements in the sublist does not change them in the original list. NumPy arrays behave differently: a slice of an array is a *view*, so changes to it do show up in the original array.
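The following sketch (with a made-up list) takes a slice and then checks what happens to the original list when the sublist is modified. For built-in Python lists the slice is a new list, so the original stays as it was.

```python
numbers = [10, 20, 30, 40, 50]

middle = numbers[1:3]     # elements at index 1 and 2
print(middle)

middle[0] = 99            # change the sublist...
print(middle)
print(numbers)            # ...and look at the original list again
```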

<span style="color:red">**Assignment (3 min):**</span> In the code cell below make a list of the first 5 prime numbers: https://en.wikipedia.org/wiki/Prime_number. Use the sum() function to figure out the sum of the first 5 prime numbers. 

<span style="color:red">**Assignment (1 min):**</span>
Print the second-to-last prime number from your list to the screen. 

## Dictionaries
Dictionaries contain key-value pairs. They are extremely useful and common. They allow you to map, or point, **keys** to **values**. In the example below the letters a, b, c point to the numbers 1, 2, 3.

For a flux balance analysis application of a dictionary, you can think of a dictionary that points each of the reactions in a network to its flux in the FBA solution. 
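As a sketch of that idea, the dictionary below maps reaction names to flux values. The reaction names and numbers are invented for illustration and do not come from a real FBA solution.

```python
# Hypothetical reaction names and flux values, for illustration only.
fluxes = {"glucose_uptake": 10.0, "glycolysis": 9.5, "biomass": 0.8}

print(fluxes["biomass"])        # look up the flux of one reaction

fluxes["respiration"] = 4.2     # add a new reaction and its flux
print(len(fluxes))              # number of reactions in the dictionary
```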

You can access the **value** a certain **key** points to with square bracket notation:

In [None]:
my_dict = {'a': 1, 'b': 2, 'c': 3}
print('a:', my_dict['a'])

In [None]:
list(my_dict.keys())

Keys do not have to be strings; numbers, for example, work as well:

In [None]:
my_dict = {18: 1, 23: 2, 0: 3}
my_dict[18]

<span style="color:red">**Assignment (1 min):**</span> Make your own dictionary. For yourself and a friend add a 'key' to the dictionary, the name of the person, and a value, the age of the person. Then print the dictionary to the screen.

## for loops

We can run through all elements of a list using a `for` loop:

In [None]:
a_list = [1, 2, 3, 4, 5, 6]
for number in a_list:
    print(number)  # print() shows the value regardless of the notebook's display settings

* Note that the for loop steps in sequence through the numbers in the list.
* In every iteration of the loop, the variable `number` is assigned the value of the next number in the list.
* You may call this variable whatever you wish, as long as you also change its name in the third line.

As a more complex example, we can also loop over the **keys** of a dictionary. 

In [None]:
genome_sizes = {'e. coli': 5, 'yeast': 12, 'human': 2.9e3}  # dictionary

print('A list of genome-sizes in #Mbp:')

for organism in genome_sizes.keys():
    print(organism, genome_sizes[organism])

There are several things to note here:

* The `for organism in genome_sizes.keys()` syntax means that a temporary variable named `organism` is created at every iteration. This variable holds one key of the dictionary at a time.
* Note the colon `:` at the end of the `for` statement. Forgetting it will lead to a syntax error!
* The `print` statement will be executed once for every key in the dictionary.
* Note the four spaces before `print`: this is called the **indentation**. Python uses indentation to decide which lines belong to the body of the loop, so every line that should be repeated must be indented by the same amount.
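A possible variation on this loop uses the dictionary function `items()`, which hands over each key together with its value, so that no lookup inside the loop is needed. The dictionary is repeated here so that the sketch runs on its own; the name `size` is only for illustration.

```python
genome_sizes = {'e. coli': 5, 'yeast': 12, 'human': 2.9e3}

# Each iteration provides one key (organism) and its value (size)
for organism, size in genome_sizes.items():
    print(organism, size)
```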

<span style="color:red">**Assignment (3 min):**</span> Write your own for loop that divides each element of your list of prime numbers by 2 and prints each result separately to the screen.

## List comprehensions: for loops in one line

Python supports a concise syntax to perform a given operation on all elements of a list using for loops:

In [None]:
items = [1,2,3,4,5,6]
squares = [item * item for item in items]
squares

This is called a **list comprehension**. A new list is created here; it contains the squares of all numbers in the list. This concise syntax leads to highly readable and *Pythonic* code.
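To see what a list comprehension replaces, the sketch below writes the same calculation twice: once as an ordinary `for` loop and once as a comprehension. The genome sizes (in Mbp) are the ones used earlier; the variable names and the use of the list function `append` are only for illustration.

```python
sizes_mbp = [5, 12, 2900]

# Version 1: an ordinary for loop that fills a list step by step
sizes_bp_loop = []
for size in sizes_mbp:
    sizes_bp_loop.append(size * 1_000_000)

# Version 2: the same result with a list comprehension
sizes_bp = [size * 1_000_000 for size in sizes_mbp]

print(sizes_bp_loop == sizes_bp)
print(sizes_bp)
```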

<span style="color:red">**Assignment (3 min):**</span> Write a list comprehension that calculates the square of each prime number in the list you defined above. Start from the example above, but loop over your list of prime numbers.

# Now that you are a Python genius we can move on to flux balance analysis!