# 2. Loops and Logic

Now that we understand the basics of different datatypes and data structures, we can start to think about how we might interact with sequential data. This notebook looks at two key areas in Python programming: `for` loops, and the `if`, `elif` and `else` conditional statements. The notebook is structured as follows:

2.1 - *`for` loops*

2.2 - *List comprehensions*

2.3 - *`if`, `elif` and `else`*

2.4 - *`and` and `or`*

## 2.1 `for` loops

When we have collections of data, such as lists or tuples, it is often helpful to iterate over that data to sequentially interact with each element. We use the `for` loop to do this. 

Consider the peptide chain for Oxytocin:

```python

oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
```

A simple `for` loop might print out every element in the list:

```python

for amino_acid in oxytocin: # iteratively assign each element in `oxytocin` to `amino_acid`
    print(amino_acid) # print the current value of `amino_acid`
```

Python will iterate over the list `oxytocin`, sequentially updating the value of `amino_acid` to the current element in `oxytocin`. Hence, in the first iteration `amino_acid="C"`, then `amino_acid="Y"`, and so on. Every indented line after `for amino_acid in oxytocin:` will be run at each step in the loop.

*Note:* We can indent the line(s) after our `for` statement with either `Tab` or four spaces. It is good practice to be consistent about which of these you use: Python 3 raises a `TabError` if tabs and spaces are mixed inconsistently within the same block!

Copy the code above into Code Block 1 to see the output.

In [5]:
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
for amino_acid in oxytocin: 
    print(amino_acid)
oxytocin_string = ''  # start from an empty string
for amino_acid in oxytocin:  # build the string up one residue at a time
    oxytocin_string += amino_acid
    print(oxytocin_string)

C
Y
I
Q
N
C
P
L
G
C
CY
CYI
CYIQ
CYIQN
CYIQNC
CYIQNCP
CYIQNCPL
CYIQNCPLG


We can also use a `for` loop to update a variable outside of the loop. In the following example, we will use the `oxytocin` list to build a string representation of the peptide:

```python
oxytocin_str = '' # this initialises variable to an empty string

for amino_acid in oxytocin:
    oxytocin_str += amino_acid # add `amino_acid` to `oxytocin_str` on each iteration

print(oxytocin_str)
```
Copy this into the code block below:

In [8]:
# code block 2


CYIQNCPLG


Sometimes, it is helpful to keep track of the current iteration number while completing a `for` loop. To do this, we use the `enumerate` function:

```python

for index, amino_acid in enumerate(oxytocin):
    print(f'Amino acid {index} in oxytocin is {amino_acid}')
```

Take a look at the output of the `enumerate()` function by running `print(enumerate(oxytocin))` in the code block below:

In [14]:
for index, amino_acid in enumerate(oxytocin):
    print(f'Amino acid {index} in oxytocin is {amino_acid}')

Amino acid 0 in oxytocin is C
Amino acid 1 in oxytocin is Y
Amino acid 2 in oxytocin is I
Amino acid 3 in oxytocin is Q
Amino acid 4 in oxytocin is N
Amino acid 5 in oxytocin is C
Amino acid 6 in oxytocin is P
Amino acid 7 in oxytocin is L
Amino acid 8 in oxytocin is G
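
Printing `enumerate(oxytocin)` directly shows only an `enumerate` object rather than its contents, because the pairs are produced one at a time as the loop asks for them. The sketch below wraps it in `list()` to make the `(index, element)` tuples visible, and uses the optional `start` argument to count from 1 instead of 0. The names `pairs` and `position` are chosen here for illustration.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']

# enumerate() yields (index, element) tuples; list() collects them
pairs = list(enumerate(oxytocin))
print(pairs[:3])  # [(0, 'C'), (1, 'Y'), (2, 'I')]

# start=1 numbers the residues from 1 rather than 0
for position, amino_acid in enumerate(oxytocin, start=1):
    print(f'Residue {position} is {amino_acid}')
```

Counting from 1 can be convenient here, since residues in a peptide are usually numbered from 1.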


We can do multiple operations during a single `for` loop. Every line below the `:` that is indented by one level will be run on each iteration.

In the following example, we take our `oxytocin` list, convert the symbols to three-letter abbreviations, and store the result in a string.

```python
# Set up a dictionary of amino acid abbreviations
aa_abbr = {'C':'Cys', 'Y':'Tyr', 'I':'Ile', 'Q':'Gln', 'N':'Asn', 'P':'Pro', 'L':'Leu', 'G':'Gly'}
oxytocin_3str = ''

for amino_acid in oxytocin:
    aa3 = aa_abbr[amino_acid] # looks up the amino acid in our abbr dictionary
    oxytocin_3str += aa3 # appends the abbreviation to the string
    oxytocin_3str += '-' # adds a '-' to separate each abbreviation

print(oxytocin_3str) # note this is outside of the block
``` 

💡 Use Code Block 4 to try out the code above. How might we get rid of the trailing '-' at the end of the string? What if we wanted to add $NH_2$ at the end?
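
One possible answer to the question above, offered as a sketch rather than the only approach: the string method `str.join` places the separator between elements only, so no trailing `-` is produced. The names `abbreviations` and `oxytocin_amide` are chosen here for illustration, and the amide group is written as the plain text `NH2`.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_abbr = {'C':'Cys', 'Y':'Tyr', 'I':'Ile', 'Q':'Gln', 'N':'Asn', 'P':'Pro', 'L':'Leu', 'G':'Gly'}

# Collect the abbreviations in a list first
abbreviations = []
for amino_acid in oxytocin:
    abbreviations.append(aa_abbr[amino_acid])

# join() puts '-' between the elements, never after the last one
oxytocin_3str = '-'.join(abbreviations)
print(oxytocin_3str)  # Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly

# The suffix can then be added with ordinary string concatenation
oxytocin_amide = oxytocin_3str + '-NH2'
print(oxytocin_amide)  # Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-NH2
```

If the string has already been built with a trailing separator, calling `.rstrip('-')` on it would also remove the final dash.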

💡 Use a `for` loop and the `aa_mass` dictionary in Code Block 5 to calculate the molar mass of Oxytocin. 

In [7]:
# Code Block 5
aa_mass = {'C':103.01, 'Y':163.06, 'I':113.08, 'Q':128.06, 'N':114.04, 'P':97.05, 'L':113.08, 'G':57.02}
aa_abbr = {'C':'Cys', 'Y':'Tyr', 'I':'Ile', 'Q':'Gln', 'N':'Asn', 'P':'Pro', 'L':'Leu', 'G':'Gly'}
oxytocin_3str = ''
for amino_acid in oxytocin:
    aa3 = aa_abbr[amino_acid]
    oxytocin_3str += aa3
    oxytocin_3str += '-'

    print(oxytocin_3str)

Cys-
Cys-Tyr-
Cys-Tyr-Ile-
Cys-Tyr-Ile-Gln-
Cys-Tyr-Ile-Gln-Asn-
Cys-Tyr-Ile-Gln-Asn-Cys-
Cys-Tyr-Ile-Gln-Asn-Cys-Pro-
Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-
Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-
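
The molar mass task above follows the same accumulate-in-a-loop pattern as the string examples, but with a number that starts at zero. A minimal sketch, assuming the total is simply the sum of the `aa_mass` values for each element of `oxytocin` (`total_mass` is a name chosen here):

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_mass = {'C':103.01, 'Y':163.06, 'I':113.08, 'Q':128.06, 'N':114.04, 'P':97.05, 'L':113.08, 'G':57.02}

total_mass = 0.0  # running total, updated on every iteration

for amino_acid in oxytocin:
    total_mass += aa_mass[amino_acid]  # look up the mass and add it on

# round() hides any tiny floating-point error from repeated addition
print(round(total_mass, 2))  # 991.41
```

Note that the `print()` call sits outside the loop, so only the final total is shown.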


## 2.2 List Comprehensions

Often, we want to construct a new list based on the values of an existing list. One way to approach this is to initialise a blank list, then iteratively add new elements to it as we loop over our existing list. 

In the example below, we will use the `oxytocin` list and the `aa_abbr` dictionary to produce a new list with the three-letter abbreviations for each amino acid in oxytocin.

```python

oxytocin_3 = []

for amino_acid in oxytocin:
    aa3 = aa_abbr[amino_acid]
    oxytocin_3.append(aa3)

print(oxytocin_3)
```

This is all well and good, but uses a lot of lines of code to do quite a simple task. We can make this much simpler by using a *list comprehension*:

```python

oxytocin_3 = [aa_abbr[amino_acid] for amino_acid in oxytocin]
print(oxytocin_3)
```

This allows us to compress the full `for` loop into a single line of code.

💡 Try to use a list comprehension to construct an `oxytocin_mass` list in Code Block 6.

💡 Wrap your list comprehension in the `sum()` function to calculate the total mass of Oxytocin.


In [51]:
# Code Block 6
oxytocin_3 = [aa_abbr[amino_acid] for amino_acid in oxytocin]
print(oxytocin_3)

['Cys', 'Tyr', 'Ile', 'Gln', 'Asn', 'Cys', 'Pro', 'Leu', 'Gly']


## 2.3 Conditionals
Conditionals allow us to run blocks of code based on whether a given statement evaluates as `True` or `False`. They require an `if` statement as a minimum, but may also contain `elif` or `else` statements to chain together multiple conditionals.


### 2.3.1 `if` statements

These statements share a similar structure to a `for` loop: a statement ending with a `:` (e.g. `if x == y:`), followed by a sequence of indented lines which are run if the `if` statement evaluates as `True`.

In the following example, an `if` statement and the `in` operator are used to check if `cysteine` is a member of the `oxytocin` list. Here, the statement `x in y` evaluates as `True` if the value of `x` is found in the list `y`:

```python

cysteine = 'C'

if cysteine in oxytocin:
    print("Cysteine is in Oxytocin")
```

💡 Try and rewrite the code above in such a way that prevents the `print()` statement from triggering. *Hint: you can either use the `not` operator, or alter either of the variables*. 

In [18]:
# Code Block 7
Primary_Colour = ['red', 'blue', 'yellow'] 
Secondary_Colour = ['orange', 'pink', 'purple', 'brown']
if 'red' in Primary_Colour:
    print(f'Red is a primary colour')
if 'blue' in Primary_Colour: 
    print(f'Blue is a primary colour')
if 'pink' in Primary_Colour:  # False, so the `else` branch below runs
    print(f'pink is a primary colour')
else:  # pairs only with the `if 'pink'` test directly above
    print(f'false')

Red is a primary colour
Blue is a primary colour
false
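
Returning to the peptide example, the hint in the task above can be sketched in two ways. `not` inverts the result of a test, and `x not in y` is the more readable spelling of `not x in y`. The residue `'W'` (tryptophan) is picked here purely for illustration, because it does not occur in `oxytocin`.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
cysteine = 'C'
tryptophan = 'W'  # illustrative choice: not present in oxytocin

# `not` flips True to False, so this print() is skipped
if not cysteine in oxytocin:
    print("Cysteine is not in Oxytocin")

# `not in` is equivalent and reads more naturally; this one does print
if tryptophan not in oxytocin:
    print("Tryptophan is not in Oxytocin")
```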


### 2.3.2 `if` and `else` statements
We can chain together conditionals using the `if` and `else` statements. The `else` statement tells Python what to run if the `if` statement evaluates as `False`. It must sit directly after the `if` statement's indented block.

The `else` statement doesn't need a condition of its own: it runs simply because its corresponding `if` statement evaluated as `False`. 

```python

cysteine = 'C'

if cysteine in oxytocin:
    print("Cysteine is in Oxytocin")

else:
    print("Cysteine is not in Oxytocin")
```
💡 You can (but probably shouldn't) write a statement equivalent to `else` using a second `if` statement and the `not` operator. Try doing this below. 

In [25]:
# Code Block 8
Primary_Colour = ['red', 'blue', 'yellow'] 
Secondary_Colour = ['orange', 'pink', 'purple', 'brown']
    
if 'pink' in Primary_Colour:
    print(f'FALSE')
elif 'pink' in Secondary_Colour:  # checked only because the `if` above was False
    print(f'TRUE')
else: 
    print(f'Neither Correct')

TRUE
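
For the peptide example, the `else`-free version mentioned in the task above might look like the sketch below. It behaves the same as `if`/`else`, but the membership test is written (and evaluated) twice, and the two copies can drift apart if one is edited later, which is one reason to prefer `else`.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
cysteine = 'C'

if cysteine in oxytocin:
    print("Cysteine is in Oxytocin")

# The same test again, inverted with `not`; exactly one of the two prints runs
if not (cysteine in oxytocin):
    print("Cysteine is not in Oxytocin")
```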


### 2.3.3 `if`, `elif` and `else` statements
In some cases, we might want to evaluate multiple conditionals in the same bit of code. For this we use the `elif` statement. An `elif` follows an initial `if` block and has its own conditional statement. If the initial `if` statement evaluates as `False`, and the `elif` statement evaluates as `True`, then the code block under the `elif` statement is run. If both evaluate as `False` then neither is run. 

These statements always follow an `if` statement. If an `else` statement is needed, it will always follow any `elif` statements. 

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']

if cysteine in bradykinin: # evaluates as False
    print("Cysteine is in Bradykinin.")

elif cysteine in oxytocin: # Evaluates as True
    print("Cysteine is in Oxytocin.")

else: # Not run as a statement above was True
    print("Cysteine is not in either Bradykinin or Oxytocin.")
```

💡 **Task:** Conditionals can be used inside `for` loops. Use this to build lists of the following:

* All the amino acids in Oxytocin with molar mass `m < 100`,
* All with molar mass `100 <= m < 120`,
* All with molar mass `m >= 120`.

To achieve this, you'll need:

* The `aa_mass` dictionary, 
* a `for` loop,
* `if`, `elif` and `else` statements,
* the `<`, `>`, `<=` and `>=` operators.

Use the outline in Code Block 9 to help.

Tips:
* The statement `x < y` evaluates as `True` if `x` and `y` are numbers, and `x` is strictly smaller than `y`.
* The `<=` and `>=` operators correspond to *less/greater than or equal to*  respectively. 
* It is possible to solve this problem using only `<` or only `>`.

In [66]:
# Code Block 9
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']

if cysteine in bradykinin: # evaluates as False
    print("Cysteine is in Bradykinin.")

elif cysteine in oxytocin: # Evaluates as True
    print("Cysteine is in Oxytocin.")

else: # Not run as a statement above was True
    print("Cysteine is not in either Bradykinin or Oxytocin.")

oxytocin_mass = [aa_mass[amino_acid] for amino_acid in oxytocin]
print(oxytocin_mass)

sum(oxytocin_mass)

Cysteine is in Oxytocin.
[103.01, 163.06, 113.08, 128.06, 114.04, 103.01, 97.05, 113.08, 57.02]


991.41
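
One way the binning task above could be approached is sketched below. It treats the three bins as `m < 100`, `100 <= m < 120` and `m >= 120`, which is an assumption about the intended boundaries, and the list names `light`, `medium` and `heavy` are chosen here for illustration. Because `elif` is only checked when the `if` above it was `False`, a single `<` comparison per branch is enough.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_mass = {'C':103.01, 'Y':163.06, 'I':113.08, 'Q':128.06, 'N':114.04, 'P':97.05, 'L':113.08, 'G':57.02}

light, medium, heavy = [], [], []  # one empty list per mass range

for amino_acid in oxytocin:
    mass = aa_mass[amino_acid]
    if mass < 100:
        light.append(amino_acid)
    elif mass < 120:  # only reached when mass >= 100
        medium.append(amino_acid)
    else:  # everything left has mass >= 120
        heavy.append(amino_acid)

print(light)   # ['P', 'G']
print(medium)  # ['C', 'I', 'N', 'C', 'L']
print(heavy)   # ['Y', 'Q']
```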

### 2.3.4 Conditionals and Comprehensions
We can use conditionals in list comprehensions to filter lists by certain values. 

In the following code, we compare the amino acids in `oxytocin` with a list `aa_hphobic` to return just the amino acids in Oxytocin which have hydrophobic side chains:

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_hphobic = ['G', 'A', 'V', 'L', 'I', 'P', 'F', 'M', 'W'] # amino acids with hydrophobic side chains
oxytocin_hphobic = [aa for aa in oxytocin if aa in aa_hphobic]
print(oxytocin_hphobic)
```

💡 **Task:** Use list comprehensions with conditionals to rewrite your code from Code Block 9 into just three lines of code.


In [67]:
# Code Block 10
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_hphobic = ['G', 'A', 'V', 'L', 'I', 'P', 'F', 'M', 'W'] # amino acids with hydrophobic side chains
oxytocin_hphobic = [aa for aa in oxytocin if aa in aa_hphobic]
print(oxytocin_hphobic)

['I', 'P', 'L', 'G']
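
The same filtering pattern extends to the mass-range task above. The sketch below is one possible three-line version, again assuming bins of `m < 100`, `100 <= m < 120` and `m >= 120` and using the illustrative names `light`, `medium` and `heavy`. Python allows comparisons to be chained, so `100 <= x < 120` checks both bounds at once.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_mass = {'C':103.01, 'Y':163.06, 'I':113.08, 'Q':128.06, 'N':114.04, 'P':97.05, 'L':113.08, 'G':57.02}

# Each comprehension keeps only the elements that pass its `if` test
light = [aa for aa in oxytocin if aa_mass[aa] < 100]
medium = [aa for aa in oxytocin if 100 <= aa_mass[aa] < 120]
heavy = [aa for aa in oxytocin if aa_mass[aa] >= 120]

print(light)   # ['P', 'G']
print(medium)  # ['C', 'I', 'N', 'C', 'L']
print(heavy)   # ['Y', 'Q']
```

Unlike the `if`/`elif`/`else` loop, each comprehension loops over `oxytocin` separately, so every condition has to state its full range.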


## 2.4 `and` and `or`

In some cases, we might want an `if` statement to trigger only when two conditions are both satisfied, or when at least one of them is. To do this, we use `and` and `or`. These operators combine two statements that each evaluate as `True` or `False`.

Consider the following statements:

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']

cysteine = 'C'
proline = 'P'

print(cysteine in oxytocin) # -> True
print(cysteine in bradykinin) # -> False
print(proline in oxytocin) # -> True
print(proline in bradykinin) # -> True

print((cysteine in oxytocin) and (cysteine in bradykinin)) # -> True and False -> False

print((cysteine in oxytocin) or (cysteine in bradykinin)) # -> True or False -> True

print((proline in oxytocin) and (proline in bradykinin)) # -> True and True -> True
```
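Note that each statement combined by `and` or `or` is wrapped in its own brackets. Python would evaluate `in` before `and` and `or` even without them, but the brackets make the grouping explicit.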

💡 **Task:** Use `and` and `or` in a `for` loop to count how many amino acids in `oxytocin`:
* Have both hydrophobic side chains AND molar mass greater than 100 g/mol,
* Have either hydrophobic side chains OR molar mass greater than 100 g/mol.

**Warning: Be careful with the order you put `and` and `or` conditionals. Remember, if `x` and `y` are both `True`, `x or y` will still trigger. Try with the statements in either order and compare the results.**

In [68]:
# Code Block 11
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']

cysteine = 'C'
proline = 'P'

print(cysteine in oxytocin) # -> True
print(cysteine in bradykinin) # -> False
print(proline in oxytocin) # -> True
print(proline in bradykinin) # -> True

print((cysteine in oxytocin) and (cysteine in bradykinin)) # -> True and False -> False

print((cysteine in oxytocin) or (cysteine in bradykinin)) # -> True or False -> True

print((proline in oxytocin) and (proline in bradykinin)) # -> True and True -> True

# if mass < 100: ...  (unfinished attempt at the task above)

True
False
True
True
False
True
True
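
A possible approach to the counting task above is sketched below, with `count_and`, `count_or`, `is_hphobic` and `is_heavy` as illustrative names. It uses two separate `if` statements rather than `if`/`elif`: any amino acid that satisfies the `and` test also satisfies the `or` test, so chaining them with `elif` would stop the second counter from seeing those cases.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
aa_mass = {'C':103.01, 'Y':163.06, 'I':113.08, 'Q':128.06, 'N':114.04, 'P':97.05, 'L':113.08, 'G':57.02}
aa_hphobic = ['G', 'A', 'V', 'L', 'I', 'P', 'F', 'M', 'W']  # hydrophobic side chains

count_and = 0  # hydrophobic AND mass > 100
count_or = 0   # hydrophobic OR mass > 100

for amino_acid in oxytocin:
    is_hphobic = amino_acid in aa_hphobic
    is_heavy = aa_mass[amino_acid] > 100

    if is_hphobic and is_heavy:
        count_and += 1
    if is_hphobic or is_heavy:  # a separate `if`, not `elif`
        count_or += 1

print(count_and)  # 2
print(count_or)   # 9
```

Storing each test in a named variable first keeps the `and` and `or` lines short and avoids repeating the dictionary lookup.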
