# Welcome to BT2100 - Computational Biotechnology

---

### Before we start...

During this course we will use **Jupyter notebooks** for practical exercises. If you haven't used Jupyter before, take a moment to look at its features on the [Jupyter website](https://jupyter.org/).

![jupyter logo](files/jupyter_logo.png)

For this course we will run these notebooks in the cloud, but you can always install Python on your own computer and run everything locally. The easiest option is to install Anaconda, a Python distribution that comes pre-packaged with several scientific libraries and utilities such as the conda package manager and Jupyter. Here you can find the individual (free) edition: https://www.anaconda.com/products/individual.

![anaconda logo](files/anaconda_logo.png)

---

## Exercises

During this course, we will have several exercises that you need to solve by typing a few lines of code. These exercises are not graded; they are here to help you practice your coding skills.

> **Attention**: If you are running this notebook in the cloud, your work will disappear once you close the session (or if it expires)! If you want to save your work, go to `File -> Download` and save the notebook on your computer.

Each exercise will have a description followed by a code cell where you can type and execute your code. An example solution is provided in a hidden cell after each exercise so you can self-evaluate your progress. 

> **Attention**: Make sure you are running the newer **JupyterLab** interface and not the classic **Jupyter Notebook** interface. The latter does not support hidden code cells, so you would immediately see the solutions.

---
## A quick introduction to Python

Our goal for this session is to cover the basics of Python programming. If some exercises seem too easy for you, just skip them.

If you need to improve your Python skills, the [official tutorial](https://docs.python.org/3/tutorial/index.html) is a good place to start!

### 1. Just a fancy calculator

Python is an interpreted programming language (just like *MATLAB* and *R*), which means you can run code interactively, one expression at a time, and simply use it as a calculator.

You type something, and you get a result:

In [None]:
1 + 1

You can also declare variables that store values and use them to create more complex expressions:

In [None]:
x = 1.5
y = 2.0 * x
z = x**2 + y   # this means "x squared" (** is the power operator)

print(z)

It's your turn! Just use the space below to try something yourself:

In [None]:
# type some code here... 


### 2. Variable types

The simplest variable types in Python are:

* bool: True, False
* int: 1, 2, 3, ...
* float: 1.0, -15.347, 2e-5, ...

In [None]:
x = 0
y = 2.3
z = (x == y)  # compare x and y

print(x, y, z)
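
If you are ever unsure which type a value has, the built-in `type` function will tell you. The short sketch below reuses the values from the cell above (the extra variable `w` is only there for illustration):

```python
x = 0
y = 2.3
z = (x == y)

# type() reports the type of any value
print(type(x))   # <class 'int'>
print(type(y))   # <class 'float'>
print(type(z))   # <class 'bool'>

# mixing an int and a float in an expression gives a float
w = x + y
print(type(w))   # <class 'float'>
```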

#### 2.1 Lists 

Lists are used to group several values together. They are quite flexible: you can *merge* them, *slice* them, etc.

In [None]:
a = [0, 1, 2, 3, 4]
b = [5, 6, 7, 8, 9]
c = a + b   # merge two lists
d = c[3:7]  # get elements from index 3 to 6 (indices start at 0, and the end index 7 is excluded)
e = [d, d]  # create nested lists

print('c =', c)
print('c[0] =', c[0])
print('c[-1] =', c[-1])
print('d =', d)
print('e =', e)
print('sum(c) =', sum(c))
print('len(c) =', len(c))
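
Slices follow the pattern `start:stop:step`, where `start` is included, `stop` is excluded, and any of the three parts can be left out. Here is a small sketch of the most common variations (the variable names are just for illustration):

```python
c = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = c[:3]     # from the beginning up to (not including) index 3
last_three = c[-3:]     # negative indices count from the end
every_second = c[::2]   # a step of 2 skips every other element
backwards = c[::-1]     # a negative step walks the list in reverse

print('first_three =', first_three)    # [0, 1, 2]
print('last_three =', last_three)      # [7, 8, 9]
print('every_second =', every_second)  # [0, 2, 4, 6, 8]
print('backwards =', backwards)        # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```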

List comprehensions are a really powerful syntax to build complex lists:

In [None]:
a = range(1, 11)                 # numbers from 1 to 10 (remember, the 11 is excluded)
b = [2 * x for x in a]           # create a list with the double of every number in a
c = [x for x in a if x%2 == 0]   # get all the even numbers in a 
print('b =', b)
print('c =', c)

# combine all elements in [1,2,3] with all elements in ['a', 'b']
d = [(x, y) for x in [1,2,3] for y in ['a', 'b']] 
print('d =', d)

#### 2.2 Tuples

Tuples are also groups of values but, unlike lists, once created they cannot be modified. You can use them to store anything that is supposed to have a fixed number of elements, for instance, a pair of GPS coordinates.

In [None]:
x = (1, 2, 3)
y = (x, x)

print(x)
print(y)
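
To see what "cannot be modified" means in practice, here is a small sketch using a pair of coordinates (the values and the name `position` are made up for illustration). Reading and unpacking a tuple works fine, but assigning to one of its elements raises a `TypeError`:

```python
position = (63.4305, 10.3951)   # hypothetical (latitude, longitude) pair

# unpacking: assign each element to its own variable
latitude, longitude = position
print('latitude =', latitude)
print('longitude =', longitude)

# trying to change an element fails, because tuples are immutable
try:
    position[0] = 0.0
except TypeError as error:
    print('cannot modify a tuple:', error)
```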

#### 2.3 Sets

Sets are also similar to lists, but they cannot contain repeated values and they are not ordered (so they cannot be indexed or sliced). 

However, sets are very useful when you need to calculate unions and intersections.

In [None]:
a = {1, 2, 3, 4, 5, 6}
b = {4, 5, 6, 7, 8, 9, 10}

c = a & b  # intersection
d = a | b  # union
e = a ^ b  # symmetric difference (elements in exactly one of the two sets)

print('c =', c)
print('d =', d)
print('e =', e)
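
Because sets cannot contain repeated values, converting a list into a set is a quick way to remove duplicates and to test whether a value is present. A minimal sketch, with a made-up list of nucleotide letters:

```python
samples = ['A', 'C', 'G', 'A', 'T', 'G', 'A']   # made-up example data

unique = set(samples)   # duplicates are dropped

print('number of samples:', len(samples))       # 7
print('number of unique values:', len(unique))  # 4

# the "in" operator checks membership
print('T' in unique)   # True
print('U' in unique)   # False
```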

#### 2.4 Dictionaries

Dictionaries are mappings between *keys* and *values*. They work similarly to lists, but you use the keys instead of a numbered index to retrieve values:

In [None]:
colors = {'apple': 'green', 'banana': 'yellow', 'strawberry': 'red'}  # create a dictionary

colors['blueberry'] = 'blue' # we can always add more items
colors['orange'] = 'orange'  # after the dictionary was created

x = 'banana'
print('the', x, 'is', colors[x])
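
Asking for a key that is not in the dictionary raises a `KeyError`. The sketch below shows two ways to avoid that, and how to walk through all key-value pairs (the `'kiwi'` key and the `'unknown'` default are just illustrative choices):

```python
colors = {'apple': 'green', 'banana': 'yellow', 'strawberry': 'red'}

# check if a key exists before using it
print('banana' in colors)   # True
print('kiwi' in colors)     # False

# get() returns a default value instead of raising a KeyError
kiwi_color = colors.get('kiwi', 'unknown')
print('the kiwi is', kiwi_color)

# items() gives you every (key, value) pair
for fruit, color in colors.items():
    print(fruit, '->', color)
```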

#### 2.5 Strings

Strings are used to represent text. They behave much like lists of characters (you can index and slice them), except that, like tuples, they cannot be modified once created:

In [None]:
x = "I am a string"        # you can use double quotes
y = 'I am also a string.'  # or single quotes
z = x + ', and ' + y       # and you can append strings together

print(z)
print(z[:28] + '.')
print(z.split())
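
Since strings are sequences of characters, the indexing and slicing tricks you saw for lists work here as well. A brief sketch (the variable names are only for illustration):

```python
x = 'I am a string'

first = x[0]       # first character
last_word = x[-6:] # last six characters
length = len(x)    # number of characters, spaces included

print(first)       # I
print(last_word)   # string
print(length)      # 13
print('am' in x)   # True
print(x.upper())   # I AM A STRING

# unlike lists, strings cannot be changed in place
try:
    x[0] = 'i'
except TypeError as error:
    print('cannot modify a string:', error)
```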

Strings are so versatile that they come with their own [format mini-language](https://docs.python.org/3/library/string.html#format-specification-mini-language).

In [None]:
x = 1000
y = 3
z = x/y

# option 1 (using format function)
print('{} / {} is approximately {:.5f} or {:.2g}'.format(x, y, z, z))

# option 2 (using f-strings)
print(f'{x} / {y} is approximately {z:.5f} or {z:.2g}')
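
The part after the colon inside the curly braces controls how each value is displayed. Here is a small sketch with a few common options (padding, alignment, decimals, percentages); the values are made up, and the square brackets are only there to make the padding visible:

```python
city = 'Oslo'
value = 3.14159

left = f'[{city:<10}]'       # pad to 10 characters, left-aligned
right = f'[{city:>10}]'      # pad to 10 characters, right-aligned
fixed = f'{value:.2f}'       # two decimal places
padded = f'[{value:8.2f}]'   # two decimal places, padded to 8 characters
percent = f'{0.256:.1%}'     # percentage with one decimal place

print(left)      # [Oslo      ]
print(right)     # [      Oslo]
print(fixed)     # 3.14
print(padded)    # [    3.14]
print(percent)   # 25.6%
```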

### 3. Control flow

#### 3.1 if-else statements

As you start developing more complex code, you will need to create pieces of code that only get executed under certain conditions. 

You can do this using so-called **if** statements:

```
if condition is True:
    do this!
```

You can also use **if-else** statements:

```
if condition is True:
    do this!
else:
    do that instead!
```

> **Attention:** Note that the blocks of code inside the if-else clauses are indented with four spaces.

In [None]:
x = 5
y = 7

if x > y:
    print('x is larger than y')
else:
    print('x is smaller than or equal to y')

#### 3.2 iterating with loops

Sometimes you want to execute something multiple times. 

You can do this using **for** loops:

```
for element in list:
    do something
```

Or you can use the **while** statement:

```
while condition is True:
    do something
```

In [None]:
for x in [1, 2, 3]:
    print('x =', x)
    
y = 0
while y < 3:
    y = y + 1
    print('y =', y)

You can also terminate loops using **break** statements:

In [None]:
for x in [1, 2, 3, -1, 4, 5]:
    if x < 0:
        print("I didn't expect a negative number.")
        break
    print('x =', x)

> **Important**: don't forget the correct indentation when you start creating nested statements!

### 4. Functions

Very often you will want to reuse a piece of code that performs some task that you use multiple times. 
Or you may want to divide a long piece of code into smaller blocks that are easier to read. 
In both cases, it is very convenient to store that piece of code as a **function**.

The general definition of a function is as follows:

```
def function_name(argument_1, argument_2, ...):
    # some lines of code
    # more lines of code
    
    return value
```

In [None]:
def double(x): # this function doubles every number it receives
    return 2 * x

x = double(1)
y = double(-5)
z = double(double(3))

print('x =', x, 'y =', y, 'z =', z)

def trouble(x, y, z): 
    a = x**2 + y**2
    b = a / (z + 1)
    return b

w = trouble(1, 2, 4)
print('w =', w)

In Python, functions are values too (just like numbers or lists), so you can pass functions as arguments to other functions:

In [None]:
def is_even(x):
    if x%2 == 0:
        print(f'{x} is even')
    else:
        print(f'{x} is not even')
        
def test_all(f, values):
    for value in values:
        f(value)

x = [1, 2, 3, 4, 5]
test_all(is_even, x)
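
Several built-in functions accept other functions as arguments too. For instance, `sorted` takes an optional `key` function that decides what to sort by. The sketch below sorts a list of made-up pairs by their second element, first with a named helper function (`second`, a hypothetical name) and then with a `lambda`, which is just a short way of writing a small unnamed function:

```python
def second(pair):
    # return the second element of a pair
    return pair[1]

pairs = [('a', 3), ('b', 1), ('c', 2)]   # made-up example data

# sort by the value returned by the key function
by_value = sorted(pairs, key=second)
print(by_value)   # [('b', 1), ('c', 2), ('a', 3)]

# the same thing using a lambda instead of a named function
by_value_lambda = sorted(pairs, key=lambda pair: pair[1])
print(by_value_lambda == by_value)   # True
```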

### 5. Modules

For the sake of organization, Python functions can be grouped together into **modules** (which can be further divided into sub-modules, sub-submodules, etc.).

Python's motto is *Batteries Included*: take a look at the modules that come with [The Python Standard Library](https://docs.python.org/3/library/index.html#library-index).

There are two main ways of importing functions from modules:

```
import module

# use with module name as prefix
y = module.function(x)

from module import function

# use with function name only
y = function(x)
```

In [None]:
from random import randint

for i in range(10):
    x = randint(1, 2)
    if x == 1:
        print('heads')
    else:
        print('tails')

In [None]:
import math

values = [math.factorial(x) for x in range(10)]
    
print(values)

### 6. Classes

#### 6.1 Short introduction

Python is a *multi-paradigm* programming language. This means that it supports several programming styles, including *functional programming* and *object-oriented programming*.

Understanding the difference between different programming paradigms is out of the scope of this course. But you will often have to use *objects* so it is important to clarify some basic points.

- In **functional programming** you apply a *function* to a variable and the result is a new variable: `y = function(x)` 
- In **object-oriented programming** variables are structured objects that have an internal *state* and implement *methods* that modify their state: `x.method()`

For instance, if you want to add a new element (`x`) to a list (`L`) you have two options:

- **functional**: `L2 = L + [x]`
- **object-oriented**: `L.append(x)`

The first option looks more like a mathematical formulation, and often results in a more ***elegant*** coding style (because [math is beautiful](https://en.wikipedia.org/wiki/Mathematical_beauty)). Here we create a new list `L2` that is the result of merging `L` and `[x]`. In the second option, we use the method `append` implemented by the list object to modify its internal state by adding element `x`. This alternative is more ***efficient*** because the content of the original list does not have to be copied into a new list.
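
The difference between the two options is easy to see if you print the original list after each step. A minimal sketch (the name `result` is only used to show what `append` returns):

```python
L = [1, 2, 3]
x = 4

# functional style: build a new list, the original is untouched
L2 = L + [x]
print('L  =', L)    # [1, 2, 3]
print('L2 =', L2)   # [1, 2, 3, 4]

# object-oriented style: the list modifies itself
L.append(x)
print('L  =', L)    # [1, 2, 3, 4]

# careful: append changes the list in place and returns None
result = L.append(5)
print('result =', result)   # None
print('L  =', L)            # [1, 2, 3, 4, 5]
```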

#### 6.2 Example

Let's create a class to represent rectangles. This class will contain two internal variables (called *attributes*) to represent the dimensions of a rectangle.


In [None]:
class Rectangle:
    def __init__(self, l, w):
        self.length = l
        self.width = w

We can now create rectangle *objects* that are *instances* of class `Rectangle`:

In [None]:
r1 = Rectangle(10, 20)
r2 = Rectangle(6, 4)

print(f'r1 dimensions: {r1.length} x {r1.width}')
print(f'r2 dimensions: {r2.length} x {r2.width}')

This is not a very useful rectangle class. If we just wanted to store the two dimensions, we could simply have used a tuple `(length, width)` instead. But now we can add some functionality to our class. We can implement methods to calculate area, perimeter, or re-scale rectangles to a different size:

In [None]:
class Rectangle:
    def __init__(self, l, w):
        self.length = l
        self.width = w

    def area(self):
        return self.length * self.width

    def perimeter(self):
        return 2 * (self.length + self.width)

    def rescale(self, factor):
        self.length = self.length * factor
        self.width = self.width * factor

Now there is a lot more we can do:

In [None]:
r1 = Rectangle(10, 20)
print(f'dimensions: {r1.length} x {r1.width}')
print('area:', r1.area())
print('perimeter:', r1.perimeter())

r1.rescale(0.5)
print(f'dimensions: {r1.length} x {r1.width}')
print('area:', r1.area())
print('perimeter:', r1.perimeter())
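
Each instance keeps its own internal state, so calling a method on one object does not affect the others. The sketch below repeats a shortened version of the class so that it runs on its own, and creates two rectangles with the same dimensions:

```python
class Rectangle:
    def __init__(self, l, w):
        self.length = l
        self.width = w

    def area(self):
        return self.length * self.width

    def rescale(self, factor):
        self.length = self.length * factor
        self.width = self.width * factor

r1 = Rectangle(10, 20)
r2 = Rectangle(10, 20)

# only r1 is rescaled, r2 keeps its original dimensions
r1.rescale(2)

print('r1 area:', r1.area())   # 800
print('r2 area:', r2.area())   # 200

# both objects are instances of the same class
print(isinstance(r1, Rectangle), isinstance(r2, Rectangle))   # True True
```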

There is a lot more we could discuss about object-oriented programming, but if you understand the concepts of *object*, *class*, *instance*, *method*, and *attribute*, you are ready for the rest of this course ;)

# Exercises

## Exercise 1 (easy)

Your equipment has collected several measurements, but you know the detection signal is noisy and you want to remove all the values lower than a given noise threshold. 

Implement a function called `noise_filter` that receives two inputs, a list of measurements and a threshold value, and returns all measurements that are above the threshold.

In [None]:
# type your code here

Time to test your solution:

In [None]:
from random import uniform

measured = [uniform(0, 100) for i in range(30)]
threshold = 12.5
filtered = noise_filter(measured, threshold)

def beautify(x):
    return '\t'.join([f'{y:.2f}' for y in x])

print('measured:', beautify(measured))
print()
print('filtered:', beautify(filtered))

### Solution:
Click to reveal (please note that this only works with JupyterLab).

In [None]:
# possible solution

def noise_filter(values, t):
    new_values = []
    for x in values:
        if x > t:
            new_values.append(x)
    return new_values


# alternative solution

def noise_filter(values, t):
    return [x for x in values if x > t]

## Exercise 2 (hard)

You are trying to see if there is an association between the city where people live and their height. So you reached out to 10 friends and collected some data:

Name | City | Height (cm)
--- | --- | ---
Anna | Oslo | 168
Bruno | Oslo | 172
Charlie | Bergen | 190
Daniel | Trondheim | 170
Eva | Bergen | 174
Fred | Oslo | 167
Greg | Trondheim | 182
Hugo | Oslo | 176
Ida | Trondheim | 165
Jake | Trondheim | 177

Now try to calculate the average height for each city and print the averages in ascending order.

> Suggestion: use dictionaries to represent your data.

In [None]:
# type your code here

### Solution:

Click to reveal (please note that this only works with JupyterLab).

In [None]:
# possible solution

cities = {
    'Anna': 'Oslo',
    'Bruno': 'Oslo',
    'Charlie': 'Bergen',
    'Daniel': 'Trondheim',
    'Eva': 'Bergen',
    'Fred': 'Oslo',
    'Greg': 'Trondheim',
    'Hugo': 'Oslo',
    'Ida': 'Trondheim',
    'Jake': 'Trondheim',
}

heights = {
    'Anna': 168,
    'Bruno': 172,
    'Charlie': 190,
    'Daniel': 170,
    'Eva': 174,
    'Fred': 167,
    'Greg': 182,
    'Hugo': 176,
    'Ida': 165,
    'Jake': 177,
}

grouped = {}

for name, city in cities.items():
    if city not in grouped:
        grouped[city] = []
    
    height = heights[name]
    grouped[city].append(height)

averages = [] 
for city, values in grouped.items():
    height = sum(values) / len(values)
    averages.append((city, height))

averages.sort(key=lambda x: x[1])

for city, height in averages:
    print(f'{city:<10} {height:.1f}')