# BMI 565: Bioinformatics Programming & Scripting

#### (C) Michael Mooney (mooneymi@ohsu.edu)

## Week 1 - Control Structures

1. If/Else Blocks
2. Iterable Objects, Sequences, Generators
    * Some Useful Functions
3. While Loops
    * The `range()` function
4. For Loops
    * The `enumerate()` function
5. List Comprehension

#### Requirements
- Python 2.7 or 3.x

In [1]:
from __future__ import print_function, division

## If / Else Blocks

`if` and `else` statements control which blocks of code are executed. In Python, a block of code is introduced by a colon (`:`) and defined by its indentation. For example:

    if expression:
        code block 1
    else:
        code block 2

\* Indentation must be consistent (don't mix spaces and tabs).

### If / Else Examples

In [2]:
## Import the random module (random number generation)
## Compare two dice rolls
import random
roll1 = random.randint(1,6)
roll2 = random.randint(1,6)
if roll1 > roll2:
    print("Roll #1 is greater.")

In [3]:
## Import the random module
## Compare two dice rolls
import random
roll1 = random.randint(1,6)
roll2 = random.randint(1,6)
if roll1 > roll2:
    print("Roll #1 is greater.")
elif roll2 > roll1:
    print("Roll #2 is greater.")
else:
    print("The rolls are equal.")

Roll #1 is greater.


In [4]:
## Use the pass statement as a placeholder for a block that does nothing
import random
roll1 = random.randint(1,6)
roll2 = random.randint(1,6)
if roll1 > roll2:
    print("Roll #1 is greater.")
elif roll2 > roll1:
    pass
else:
    print("The rolls are equal.")

The rolls are equal.


## Iterable Objects, Sequences, Generators

Loops repeat an operation on multiple pieces of data. The data processed by a loop can be held in any iterable object. In Python, an iterable is any object capable of returning its members one at a time. Examples include any sequence data type (lists, tuples, strings), dictionaries, and file objects. A generator is a special kind of function that returns an iterator (i.e., it produces one value at a time). We'll cover generators a little later (`enumerate()`, shown below, is a built-in that likewise returns an iterator).
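
As a minimal sketch of these ideas, the built-in `iter()` and `next()` functions show how an iterable hands out its members one at a time, and a small function using `yield` shows what a generator looks like. The names `nucleotides` and `count_up_to` are made up for this illustration.

```python
# Any iterable can be turned into an iterator with iter();
# next() then returns its members one at a time.
nucleotides = ["A", "C", "G", "T"]
it = iter(nucleotides)
first = next(it)
second = next(it)
print(first)
print(second)

# A generator function uses 'yield' to produce one value at a time.
# (count_up_to is a hypothetical example, not a built-in.)
def count_up_to(n):
    i = 1
    while i <= n:
        yield i
        i += 1

# list() consumes the generator and collects every value it yields
counts = list(count_up_to(3))
print(counts)
```

Calling `next()` on an exhausted iterator raises `StopIteration`, which is how a `for` loop knows when to stop.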

### Some Useful Functions

In [5]:
## Use the range() function to create a numeric list
## In Python 3 range returns a special range type object (not a list),
## so we wrap it in list() for compatibility
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [6]:
help(range)

Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |  
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      True if self else False
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  ...

In [7]:
list(range(5, 10, 2))

[5, 7, 9]
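
The help text notes that `step` can also be a decrement. A short sketch of that case follows; the variable names are arbitrary.

```python
# Count down from 10 in steps of 2; the stop value (0) is excluded
countdown = list(range(10, 0, -2))
print(countdown)

# With a positive step, a start that is already past stop
# produces an empty sequence rather than an error
empty = list(range(5, 1))
print(empty)
```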

In [8]:
## The enumerate() function produces an iterator of tuples
## containing the index and value of each element in an iterable
list(enumerate(range(2,6)))

[(0, 2), (1, 3), (2, 4), (3, 5)]
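
`enumerate()` also accepts an optional `start` argument, which can be handy when positions are counted from 1 (as sequence positions often are). A small sketch, using a made-up sequence:

```python
# Number the elements starting from 1 instead of 0
sequence = "ACGT"
positions = list(enumerate(sequence, start=1))
print(positions)
```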

## While Loops

In [9]:
## Create a list using the range() function
L = list(range(4,12))

## Iterate through the list
i = 0
while i < len(L):
    print(L[i])
    i += 1 ## i = i + 1

4
5
6
7
8
9
10
11


In [10]:
## How many dice rolls will it take to roll a 5?
import random
desired = 5
cur_roll = None
nrolls = 0

## It is important to ensure that a loop's stop condition is 
## possible. Use an assert statement to check that the 
## desired number is between 1 and 6
assert 1 <= desired <= 6, "Desired roll must be between 1 and 6."

## Roll dice until the desired number is rolled and count 
## the number of rolls
## Be careful using 'True' as the condition in a while loop!
## Make sure there is an appropriate way to exit the 
## loop ('break' statement)
while True:
    cur_roll = random.randint(1,6)
    nrolls = nrolls + 1
    if cur_roll == desired:
        ## the break statement exits the loop
        break

print("It took %d roll(s) to roll a %d" % (nrolls, desired))

It took 2 roll(s) to roll a 5
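
One possible alternative to `while True` with `break` is to put the stop condition directly in the `while` statement. The sketch below reuses the variable names from the cell above and should behave the same way; it is one way to write the loop, not the only one.

```python
import random

desired = 5
assert 1 <= desired <= 6, "Desired roll must be between 1 and 6."

cur_roll = None
nrolls = 0

# The loop ends as soon as the condition becomes False,
# so no 'break' statement is needed
while cur_roll != desired:
    cur_roll = random.randint(1, 6)
    nrolls += 1

print("It took %d roll(s) to roll a %d" % (nrolls, desired))
```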


## For Loops

In [11]:
## Iterate through a range of numbers and print each one
nrolls = 7
for i in range(nrolls):
    print(i)

0
1
2
3
4
5
6


In [12]:
## Iterate through a string
word = "Hello"
for letter in word:
    print(letter)

H
e
l
l
o


In [13]:
## Use enumerate() to get an element's index while iterating 
## through a sequence (each item is an (index, value) tuple)
for pair in enumerate(word):
    print(pair)

(0, 'H')
(1, 'e')
(2, 'l')
(3, 'l')
(4, 'o')


In [14]:
## Unpacking values in a tuple
for index, letter in enumerate(word):
    print(index, letter)

0 H
1 e
2 l
3 l
4 o
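
Tuple unpacking in a `for` loop is not specific to `enumerate()`; it works with any iterable that yields tuples. As a sketch, the example below iterates over the key/value pairs of a dictionary. The three-entry `codon_table` is a made-up excerpt for illustration, and `sorted()` is used only to make the order predictable.

```python
# A small, hypothetical codon table used only for illustration
codon_table = {"ATG": "Met", "TGG": "Trp", "TAA": "Stop"}

lines = []
# items() yields (key, value) tuples, which are unpacked into two names
for codon, amino_acid in sorted(codon_table.items()):
    lines.append("%s codes for %s" % (codon, amino_acid))

for line in lines:
    print(line)
```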


## List Comprehension

A concise alternative to a for loop is a list comprehension. List comprehensions are used to iterate over an iterable object and to **create a new list**. The basic format is as follows:

    [TASK for VARIABLE in ITERABLE if CONDITION]

This is equivalent to:
    
    new_list = []
    for VARIABLE in ITERABLE:
        if CONDITION:
            result = TASK
            new_list.append(result)
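
As a concrete sketch of this equivalence, the two forms below build the same list of squares of the even numbers below 10. The variable names are chosen for this example only.

```python
# List comprehension: TASK is n * n, CONDITION is n % 2 == 0
squares_comp = [n * n for n in range(10) if n % 2 == 0]

# Equivalent for loop
squares_loop = []
for n in range(10):
    if n % 2 == 0:
        result = n * n
        squares_loop.append(result)

print(squares_comp)
print(squares_comp == squares_loop)
```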


In [15]:
## Use list comprehension to create a new list
[n+1 for n in range(9)]

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [16]:
## List comprehension with a condition
## Select every third element from sequence
[letter for idx, letter in enumerate('California') if idx % 3 == 0]

['C', 'i', 'r', 'a']

In [17]:
list(enumerate('California'))

[(0, 'C'),
 (1, 'a'),
 (2, 'l'),
 (3, 'i'),
 (4, 'f'),
 (5, 'o'),
 (6, 'r'),
 (7, 'n'),
 (8, 'i'),
 (9, 'a')]

In [18]:
## You can also create a nested list comprehension
## Create a matrix (a list of lists)
mat = [[1,2,3],[4,5,6],[7,8,9]]

## Iterate through the rows of the matrix, then through the
## elements of each row, and create a list with elements divisible by 3
[x for row in mat for x in row if x % 3 == 0]

[3, 6, 9]

You can also create a generator version of a list comprehension, called a generator expression. These are more memory efficient, since they return one value at a time (rather than an entire list at once). Generator expressions can be useful when:
- You only need to process one item at a time
- The list is very large
- You don't need the individual results, just a final answer

In [19]:
sum(x for row in mat for x in row if x % 3==0)

18
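
To illustrate the memory difference, the sketch below compares the size of a list with that of an equivalent generator expression using `sys.getsizeof()`. Note that `getsizeof()` reports only the size of the container object itself (not the integers it refers to), and the exact numbers depend on the Python version and platform, so treat this as a rough comparison.

```python
import sys

n = 100000
squares_list = [x * x for x in range(n)]   # builds all n values up front
squares_gen = (x * x for x in range(n))    # produces values only on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)

# The generator still yields every value when it is consumed
total = sum(squares_gen)
print(total == sum(squares_list))

# A generator can only be consumed once; afterwards it is empty
leftover = list(squares_gen)
print(leftover)
```

The trade-off is that a generator expression cannot be indexed or reused, so a list is still the better choice when the results are needed more than once.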

## In-Class Exercises

In [None]:
## Exercise 1.
## Generate a random nucleotide sequence [A,T,C,G] 
## using a while loop


In [None]:
## Exercise 2.
## Generate a random nucleotide sequence [A,T,C,G] 
## using a for loop


In [None]:
## Exercise 3.
## Use list comprehension to create a list of random 
## integers between 0 and 3


In [None]:
## Exercise 4.
## Use list comprehension to convert the list above 
## to a nucleotide sequence


In [None]:
## Exercise 5. -- Extra
## Create a large number (100000) of random sequences (strings).
## Store these sequences in a list, set, or dictionary
## (for a dictionary, use sequences as keys and indices as values).
## Check whether some random sequence exists in your collection
## and observe the efficiency of the different data types


## References

- <u>Python Essential Reference</u>, David Beazley, 4th Edition, Addison-Wesley (2008)
- <u>Problem Solving, Abstraction and Design Using C++</u>, Frank Friedman, Elliot Koffman, 4th Edition, Addison-Wesley (2004)
- <u>Python for Bioinformatics</u>, Sebastian Bassi, CRC Press (2010)
- [http://docs.python.org/](http://docs.python.org/)

#### Last Updated: 15-Sep-2022