# Program control and logic

## `if` statement

An `if` statement chooses a code fragment to execute based on the value of a logical expression.

In [1]:
fragments = ['AGGCT', 'UGGCA', 'AGGCC', 'TAATG']
nr_dna = 0
nr_rna = 0
nr_unknown = 0
for fragment in fragments:
    if 'T' in fragment:
        nr_dna += 1
    elif 'U' in fragment:
        nr_rna += 1
    else:
        nr_unknown += 1
print(f'DNA: {nr_dna:d}')
print(f'RNA: {nr_rna:d}')
print(f'unknown: {nr_unknown:d}')

DNA: 2
RNA: 1
unknown: 1


In the code fragment above, `'T' in fragment` is a logical expression.  If it evaluates to `True`, i.e., if the fragment contains a T nucleotide, the number of DNA fragments `nr_dna` is incremented.  If that is not the case, the logical expression in the `elif` part, `'U' in fragment`, is evaluated, and if `True`, the number of RNA fragments `nr_rna` is incremented. If that expression also evaluates to `False`, the fragment contains neither U nor T, so we cannot tell whether it is DNA or RNA, and we increment `nr_unknown`.

An `if` statement can have zero or more `elif` parts, and optionally an `else` part.
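
As a minimal sketch of these variants (the fragments and the length thresholds below are made up for illustration), the loop contains one `if` statement without any `elif` or `else` part, and one with two `elif` parts and an `else` part.

```python
# Illustrative data, not taken from the example above.
fragments = ['', 'AGG', 'AGGCTA', 'AGGCTATTGCA']
labels = []
nr_empty = 0
for fragment in fragments:
    # An if statement with neither elif nor else parts.
    if len(fragment) == 0:
        nr_empty += 1
    # An if statement with two elif parts and an else part;
    # the thresholds 5 and 10 are arbitrary.
    if len(fragment) == 0:
        labels.append('empty')
    elif len(fragment) < 5:
        labels.append('short')
    elif len(fragment) < 10:
        labels.append('medium')
    else:
        labels.append('long')
print(labels)
print(nr_empty)
```

Only the first branch whose logical expression evaluates to `True` is executed; the remaining `elif` and `else` parts are then skipped.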

#### Your turn now: odd and even

Replace the `____` in the following code fragment so that it prints odd and even appropriately for integers.

In [None]:
numbers = [3, 7, 4, 9, 2]
for number in numbers:
    if ____:
        print(f'{number:d} is even')
    else:
        print(f'{number:d} is odd')

### Comparisons and truth

In Python, there is often an automatic conversion of values to `True` or `False` that may seem bizarre, and sometimes counter-intuitive.

In [13]:
for item in [None, 0, 3, 0.0, -2.3, '', 'abc',
             list(), [1, 2],
             set(), {1, 2},
             tuple(), (1, 2),
             dict(), {'a': 3, 'b': 7}]:
    print(f'{str(item):20s} {type(item).__name__:20s} {bool(item)}')

None                 NoneType             False
0                    int                  False
3                    int                  True
0.0                  float                False
-2.3                 float                True
                     str                  False
abc                  str                  True
[]                   list                 False
[1, 2]               list                 True
set()                set                  False
{1, 2}               set                  True
()                   tuple                False
(1, 2)               tuple                True
{}                   dict                 False
{'a': 3, 'b': 7}     dict                 True


Strictly speaking, you need not know this, since each of these automatic conversions can be expressed as a logical expression, e.g.,

In [17]:
len([1, 2]) > 0

True

This is obviously more verbose, and experienced Python programmers will typically use the automatic conversions. However, feel free to be verbose: it is more important that you understand the code than that it is brief.
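
To make the comparison concrete, here is a small sketch (the variable names `implicit` and `explicit` are only for illustration) that tests the same empty list in both styles.

```python
fragments = []

# Implicit conversion: an empty list is treated as False.
if fragments:
    implicit = 'has fragments'
else:
    implicit = 'no fragments'

# Explicit, more verbose test with the same meaning for a list.
if len(fragments) > 0:
    explicit = 'has fragments'
else:
    explicit = 'no fragments'

print(implicit)
print(explicit)
```

Both tests take the `else` branch here, since the list is empty.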

Values of Python's numerical types (i.e., `int`, `float`) can be compared using the operators you'd expect: `<`, `<=`, `>`, `>=`, `==`, and `!=`. The last two test for equality and inequality.

However, all these operators also work on string values and lists, while equality and inequality can be used on almost all types.
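
The sketch below illustrates this with a few made-up values. Strings and lists are compared element by element from left to right (lexicographically), and a sequence that is a proper prefix of another one is the smaller of the two.

```python
# Strings are compared character by character.
str_less = 'AGG' < 'AGT'        # 'G' comes before 'T'
prefix_less = 'AG' < 'AGG'      # a proper prefix is smaller

# Lists are compared element by element in the same way.
list_less = [1, 2, 3] < [1, 3]  # decided by 2 < 3

# Equality and inequality work across types as well.
list_vs_tuple = [1, 2] != (1, 2)  # a list never equals a tuple

print(str_less, prefix_less, list_less, list_vs_tuple)
```

All four expressions evaluate to `True`.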

#### Your turn now: comparing `float`

Given what you know about the representation of floating point numbers in Python, would you consider the following fragment of code safe? Assume that the values stored in `x` and `y` are of type `float`.

In [None]:
x = some_computation()
y = some_other_computation()
if x == y:
    do_something()

### Logical operations

Forget about this section, it leads to code that is quite error prone, and difficult to maintain.

## `for` statement

The `for` statement is an iteration statement, i.e., it will execute a block of code a number of times. It is often used to iterate over elements of `list`, `set`, or other collective types.

In [18]:
values = [1.1, 2.2, 3.3, 4.4]
total = 0.0
for value in values:
    total += value
print(total)

11.0
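
The same pattern applies to other collective types. As a small sketch with made-up data, the loops below iterate over the characters of a `str` and over the keys of a `dict`.

```python
# Iterating over a str visits its characters one by one.
nr_g = 0
for nucleotide in 'AGGCT':
    if nucleotide == 'G':
        nr_g += 1
print(nr_g)

# Iterating over a dict visits its keys.
counts = {'A': 1, 'G': 2}
total = 0
for nucleotide in counts:
    total += counts[nucleotide]
print(total)
```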


#### Your turn now: product

Replace the `____` in the following code fragment so that it will print the product of the elements in `values`.

In [None]:
values = [1.1, 2.2, 3.3, 4.4]
total = ____
for value in values:
    total ____
print(total)


## `while` statement

Often, we want something done as long as a condition holds true. This can be done in Python using the `while` statement, the second iteration statement.

As an example, let's find the first element of a list that is larger than a given value. For instance, which number is the first in `[3, 9, 15, 19, 53]` that is larger than 10?

In [19]:
values = [3, 9, 15, 19, 53]
i = 0
while i < len(values) and values[i] <= 10:
    i += 1
if i < len(values):
    print(values[i])

15


Okay, this works, but you'll probably not be happy with this code since it looks kind of ugly.

## Skipping and breaking loops

The example of the `while` statement above could be improved a bit.

In [21]:
values = [3, 9, 15, 19, 53]
i = 0
while i < len(values):
    if values[i] > 10:
        print(values[i])
        break
    i += 1

15


In the example above, the `while` loop is interrupted as soon as a value is found that is larger than 10. This is caused by the `break` statement. A `break` can be used in the body of both `for` and `while` statements. To illustrate this, let's rewrite the code fragment to use a `for` loop, rather than a `while` loop.

In [22]:
values = [3, 9, 15, 19, 53]
for value in values:
    if value > 10:
        print(value)
        break

15


Sometimes, we want to skip some iterations. For instance, consider a list that contains both DNA and RNA fragments, where we only want to process the DNA fragments in order to compute their total AT-content.

In [28]:
fragments = ['AGGTC', 'TTGCAG', 'UGGACC', 'AAGCC']
at_count = 0
total_count = 0
for fragment in fragments:
    if 'U' in fragment:
        print(f'skipping {fragment}')
        continue
    at_count += fragment.count('A') + fragment.count('T')
    total_count += len(fragment)
at_content = at_count/total_count
print(f'AT-content: {at_content:.2f}')

skipping UGGACC
AT-content: 0.44


When a `continue` statement is executed, the rest of the current loop iteration, i.e., all remaining statements in the loop body, is skipped, and the next iteration starts immediately. In this case, the fourth iteration, i.e., the one for `'AAGCC'`, is started as soon as `continue` is executed, skipping the updates of `at_count` and `total_count` for `'UGGACC'`, which is clearly an RNA fragment.

Note: we are of course assuming that any sequence that does not contain a `U` nucleotide is a DNA fragment, which is clearly not necessarily true.
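
A somewhat stricter check is sketched below. It skips every fragment that contains a symbol other than A, C, G or T. The extra fragment `'AGXTC'` and the names `dna_alphabet` and `dna_fragments` are made up for illustration. Note that even this check cannot tell DNA from RNA for a fragment that contains neither T nor U.

```python
fragments = ['AGGTC', 'TTGCAG', 'UGGACC', 'AAGCC', 'AGXTC']
dna_alphabet = set('ACGT')
dna_fragments = []
for fragment in fragments:
    # Skip fragments with any symbol outside A, C, G, T.
    if not set(fragment).issubset(dna_alphabet):
        print(f'skipping {fragment}')
        continue
    dna_fragments.append(fragment)
print(dna_fragments)
```

Here, both `'UGGACC'` and `'AGXTC'` are skipped.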

#### Your turn now: number of statements

In the code fragment above, how many statements are there in
  * the complete fragment,
  * the body of the `for` loop, and
  * the body of the `if` statement?

#### Your turn now: counting nucleotides

The following code fragment will, given a DNA sequence, count the number of times each nucleotide A, C, G, T occurs. However, if the sequence contains a character that is not a nucleotide, an error message is printed.  Replace the `____` so that the code works appropriately.

In [None]:
sequence = 'AGGTACQTTACG'
nucl_count = dict()
for nucl in sequence:
    if nucl not in ____:
        print(f'### error sequence contains invalid symbol {nucl}')
    if nucl not in ____:
        nucl_count[____] = ____
    nucl_count[____] += 1

## `range` function

The `range` function can be used to generate a sequence of integers, starting at 0, and ending at the specified value, minus 1.

In [29]:
print(list(range(5)))

[0, 1, 2, 3, 4]


Note that we had to create a `list` out of the result of the `range` function in order to print its elements. The type of the value that `range` returns may be somewhat unexpected.

In [30]:
type(range(5))

range

However, we don't need to (and shouldn't) convert a `range` to a `list` when we want to iterate over it with, e.g., a `for` loop.

In [31]:
for i in range(5):
    print(f'{i}**2 = {i**2}')

0**2 = 0
1**2 = 1
2**2 = 4
3**2 = 9
4**2 = 16


The `range` function can be called with one, two, or three arguments. When multiple arguments are specified, the first is the start value, the second is the end value (not included), and the third is the step. Compare the results of the following function calls.

In [34]:
print(list(range(10)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [33]:
print(list(range(1, 10)))

[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [35]:
print(list(range(1, 10, 2)))

[1, 3, 5, 7, 9]


Note the similarity between the notation for `str` and `list` slicing, and the `range` function's arguments.
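
To illustrate the similarity, the sketch below (with made-up values) uses the same start, end, and step for a `list` slice, a `str` slice, and a call to `range`.

```python
values = list(range(10))
text = 'ABCDEFGHIJ'

# Start at 1, stop before 10, step 2, in all three cases.
sliced_list = values[1:10:2]
sliced_text = text[1:10:2]
ranged = list(range(1, 10, 2))

print(sliced_list)
print(sliced_text)
print(ranged)
```

Since `values` holds the integers 0 to 9, the slice `values[1:10:2]` selects exactly the numbers that `range(1, 10, 2)` generates.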

## `enumerate` function

The `enumerate` function is used in `for` loops, and provides both the index and the value of what we iterate over.

In [36]:
data = [3, 5, 7]
for i, value in enumerate(data):
    print(f'{value} at index {i}')

3 at index 0
5 at index 1
7 at index 2
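
For comparison, the same pairs of index and value can be obtained with `range` and `len`, as in the sketch below. The names `with_range` and `with_enumerate` are only for illustration.

```python
data = [3, 5, 7]

# Using range and len: we have to index into the list ourselves.
with_range = []
for i in range(len(data)):
    with_range.append((i, data[i]))

# Using enumerate: index and value are provided together.
with_enumerate = []
for i, value in enumerate(data):
    with_enumerate.append((i, value))

print(with_range)
print(with_enumerate)
```

Both loops produce the same result; the version with `enumerate` avoids the explicit indexing.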


#### Your turn now: enumerate and nucleotide counts

Modify the code of the previous exercise (counting nucleotides) such that the `print` call below produces the correct error message. For the given value of `sequence`, this should be:

`### error: sequence contains invalid symbol Q at index 6`

In [None]:
sequence = 'AGGTACQTTACG'
nucl_count = dict()
for ____ in ____:
    if nucl not in ____:
        print(f'### error: sequence contains invalid symbol {nucl} at index {i}')
    if nucl not in ____:
        nucl_count[____] = ____
    nucl_count[____] += 1

## Exceptions and error handling

Let's forget that for the time being.