# Functions, Booleans, Conditionals
In this notebook, we'll encounter another type of variable, a **Boolean**. We'll talk about the ways that we can use conditionals, and begin writing functions. We'll ultimately put all of this together to write a program that can compute the GC content of a nucleotide sequence.
 
## At the end of this notebook, you'll be able to:
* [Write a simple function](#Functions)
* [Recognize and use different types of operators (Boolean and comparison)](#Operators)
* [Write and test conditional statements in Python](#conditionals)
* [Use these tools to test the GC content in a DNA string](#GCcontent)

<hr>

<a id="Functions"></a>

## Functions

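A function definition starts with `def`, followed by the function name, its parameters in parentheses, and a colon. The indented body does the work, and `return` hands a value back to the caller:

```
    def function(a):
        a = a*2
        return a
```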

In [None]:
# Define the function here
def fun1(a):
    ''' fun1 is a function that raises the input number a to the third power'''
    b = a**3
    return b

In [None]:
# Run the function here
a  = fun1(2)
aa = fun1(fun1(2)) 
print(a,aa)

fun1?

### Additional notes about functions
* We can add a **docstring** to describe what a function does by placing a string wrapped in `'''` on the first line of the function body, right after the `def` line. This text is displayed when you run `help(function)` or `function?`.
* Functions can have many, many lines!
* Functions can call other functions.
* A **program** is one or more functions that work together.
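
As a small illustration of the notes above, the sketch below defines two functions, one of which calls the other. The names `square` and `sum_of_squares` are made up for this example and are not part of the original notebook.

```python
def square(x):
    '''Return x raised to the second power.'''
    return x**2


def sum_of_squares(a, b):
    '''Return a**2 + b**2 by calling square() twice.'''
    return square(a) + square(b)


# A function can call another function
print(sum_of_squares(3, 4))   # 9 + 16 = 25

# The docstring is stored on the function and is what help() displays
print(square.__doc__)
help(sum_of_squares)
```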

<a id="Operators"></a>
   
## Comparison and Boolean Operators

We can use comparison operators to test the relationship between two objects. These statements return Booleans.

**Booleans** are variables that store `True` or `False`. They are named after the British mathematician George Boole, who first formulated Boolean algebra, a set of rules for how to reason with and combine these values. Boolean algebra is the basis of all modern computer logic.

**Note:** Capitalization *still* matters! `TRUE` and `true` are not Booleans in Python; only `True` and `False` are.


| Symbol |    Operation   | Usage | Boolean Outcome |
|:------:|:--------------:|:-----:|:-------:|
|    ==   |  Is equal to  | `10 == 5*2` | True |
|    !=   | Is not equal to | `10 != 5*2` | False |
|    >  | Greater than |  `10 > 2` | True |
|    <   |    Less than    |  `10 < 2` | False |
| >= | Greater than _or_ equal to | `10 >= 10` | True |
| <= | Less than _or_ equal to | `10 <= 10` | True |


<div class="alert alert-success"><b>Task:</b> Test each of the comparison operators, saving their output to a variable.</div>

In [None]:
# Test the comparison operators here
a = 10
b = 10.0

print(a<b)
print(int(a<b))
print(a==b)
print(float(a==b))

print(a==b+0.00001)
print(a==b+0.00000000001)
print(a==b+0.0000000000000001)

print(a!=b)
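
The last of the three `a==b+...` comparisons in the cell above evaluates as `True`, because floats store only a limited number of significant digits and `0.0000000000000001` is too small to change `10.0`. The sketch below shows the same effect, plus the opposite surprise, and uses `math.isclose` from the standard library as one possible way to compare floats within a tolerance. The variable names are made up for this example.

```python
import math

# Adding a value smaller than float precision near 10 leaves 10.0 unchanged
tiny_equal = (10 == 10.0 + 1e-16)
print(tiny_equal)          # True

# Rounding error can also make "obviously equal" values compare as unequal
sum_equal = (0.1 + 0.2 == 0.3)
print(sum_equal)           # False

# math.isclose compares within a tolerance instead of requiring exact equality
close_enough = math.isclose(0.1 + 0.2, 0.3)
print(close_enough)        # True
```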

**Boolean operators** use Boolean logic, and include:
- `and` : True if both are true
- `or` : True if at least one is true
- `not` : True only if the operand is false
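
The three operators can be tabulated by trying every combination of `True` and `False`. This is only a sketch; the list names `and_table`, `or_table`, and `not_table` are made up for this example.

```python
# and: True only when both operands are True
and_table = [True and True, True and False, False and True, False and False]
print(and_table)   # [True, False, False, False]

# or: True when at least one operand is True
or_table = [True or True, True or False, False or True, False or False]
print(or_table)    # [True, True, True, False]

# not: flips a single Boolean
not_table = [not True, not False]
print(not_table)   # [False, True]
```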

What will this statement return?

In [None]:
(6 > 10) and (4 == 4)

<div class="alert alert-success"><b>Task:</b> Test each of the Boolean operators <code>and</code>, <code>or</code>, and <code>not</code> to see how they behave, putting integers, floats, and comparison expressions on each side.</div>

In [None]:
# Test Boolean operators here
# (print is needed because a notebook cell only displays its last expression)
print((6 > 10) or (4 == 4))
print((6 > 10) or not (4 == 4))


### Short circuit evaluation!
To determine the final result of an `and` expression, Python starts by evaluating the left operand. If it’s false, then the whole expression is false. In this situation, there’s no need to evaluate the operand on the right. Python already knows the final result. This is called **short circuit evaluation** and it saves Python time.
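
Short circuit evaluation can be observed directly by making each operand leave a trace when it is evaluated. In the sketch below, `check` and `calls` are hypothetical names invented for this example. The second half shows that `or` short circuits as well, when its left operand is true.

```python
calls = []  # records which operands were actually evaluated


def check(label, value):
    '''Hypothetical helper: note that it was called, then return value.'''
    calls.append(label)
    return value


# Left operand is False, so `and` never evaluates the right operand
result_and = check('left', False) and check('right', True)
print(result_and, calls)   # False ['left']

calls.clear()

# Left operand is True, so `or` never evaluates the right operand
result_or = check('left', True) or check('right', False)
print(result_or, calls)    # True ['left']
```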

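Boolean operators also work on values that are not Booleans. A string that contains at least one character is treated as `True`, and an empty string (`''`) is treated as `False`. Note that `and` and `or` return one of their operands rather than a strict `True` or `False`.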
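
The built-in `bool()` shows how Python treats a value in a Boolean context. A few cases that are easy to get wrong, as a quick sketch:

```python
print(bool(''))     # False: the empty string
print(bool('a'))    # True
print(bool(' '))    # True: a space is still a character
print(bool('0'))    # True: the string '0' is not the number 0
print(bool(0))      # False: the number 0
```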

<div class="alert alert-success"><b>Task:</b> Test how Boolean operators work with strings, by testing <code>'a' and 'b'</code> and then <code>'b' and 'a'</code>.</div>

In [None]:
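# Test Boolean operators with strings

print('a' and 'b')            # 'and' returns the last operand it evaluated
print('b' and 'a')
print(int('a' and not 'b'))   # 'a' and False -> False -> 0
print(bool('a' and 'b'))

# c has not been defined in this notebook.
# Short circuit: (0 == 1) is False, so (c == c*c) is never evaluated and no error occurs.
print((0 == 1) and (c == c*c))
# Here (c == c*c) is evaluated first, so this line raises a NameError.
print((c == c*c) and (0 == 1))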


<a id="conditionals"></a>

## Conditionals
**Conditionals** are statements that check a condition with `if` and execute a block of code only if the condition evaluates as `True`.

- `if`
- `elif` (else if): After an if, you can use elif statements to check additional conditions.
- `else`: After an if, you can use an else that will run if the conditional(s) above have not run.

### If/elif/else syntax
- Indentation matters! Your statements in the `if` block need to be indented by a tab or four spaces.
- You need a colon at the end of each `if`, `elif`, and `else` line.

In [None]:
condition = False

if condition:
    print('This code executes if the condition evaluates as True.')
    
else: 
    print('This code executes if the condition evaluates as False')

### Properties of Conditionals
- Conditionals can take any expression that can be evaluated as `True` or `False`. 
- The order of conditional blocks is always `if` then `elif`(s) then `else`.
- An `elif` cannot come after the `else`. If it does, Python raises a `SyntaxError`.
- An `else` statement is not required. Without one, if neither the `if` nor the `elif` conditions are met (all evaluate as `False`), none of the blocks run.
- **At most one component (`if` / `elif` / `else`) of a conditional will run**
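
To see these properties in action, the sketch below uses conditions that overlap on purpose. The function name `describe` is made up for this example. Because Python tests the branches from top to bottom and stops at the first one that is true, `15` only ever reaches the first branch, even though `15 > 5` is also true.

```python
def describe(n):
    '''Hypothetical helper: return a label for the size of n.'''
    if n > 10:
        label = 'greater than 10'
    elif n > 5:
        label = 'greater than 5'
    else:
        label = '5 or less'
    return label


print(describe(15))   # greater than 10
print(describe(7))    # greater than 5
print(describe(2))    # 5 or less
```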

<a id="GCcontent"></a>

## Writing a program to count GC content

Below, there is a function to calculate the GC content of a DNA string of length 4. It includes a few new elements that we haven't discussed, but can you see what it's doing?

In [None]:
# Write our function

def GCcontent4(DNA):
    
    '''counts GC content for a DNA string of length 4'''
    
    counter = 0 # Initialize counter
    
    if DNA[0]=='G' or DNA[0]=='C':
        counter = counter+1
    if DNA[1]=='G' or DNA[1]=='C':
        counter = counter+1
    if DNA[2]=='G' or DNA[2]=='C':
        counter = counter+1
    if DNA[3]=='G' or DNA[3]=='C':
        counter = counter+1
        
    return counter/4.0

In [None]:
# Call our function
dna = 'ACGG'
GC_1 = GCcontent4(dna)
print(GC_1)

GC_2 = (dna.count('G') + dna.count('C'))/len(dna)
print(GC_2)

The `GCcontent4` function uses **conditional statements** to test whether a given nucleotide in the sequence is equal to either a G or a C. In other words, it is doing a **value comparison**. If either of those conditions is met, it increments a **counter**. Finally, it returns the counter divided by 4, which is the fraction of bases that are G or C.

**Question**: Why can't we write `DNA[0] == 'C' or 'G'`?

In [None]:
print('C' or 'G')
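
The cell above prints `C` because `or` returns its first operand that counts as true, and a non-empty string always does. Python reads `DNA[0] == 'C' or 'G'` as `(DNA[0] == 'C') or 'G'`, so the result counts as true for every base. The sketch below contrasts the two versions; `is_gc_wrong` and `is_gc` are hypothetical names for this example.

```python
def is_gc_wrong(base):
    '''Buggy on purpose: parsed as (base == 'C') or 'G'.'''
    return bool(base == 'C' or 'G')


def is_gc(base):
    '''Compare the base against each letter separately.'''
    return base == 'G' or base == 'C'


print(is_gc_wrong('A'))   # True, even though 'A' is neither G nor C
print(is_gc('A'))         # False
print(is_gc('G'))         # True

# The `in` operator is a compact alternative for a single character
print('A' in 'GC')        # False
```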

## Creating a function that catches errors
If we put in a string that's not length 4, what happens?
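
One way to find out is to try a string that is too short and one that is too long. The sketch below repeats `GCcontent4` from above so that it runs on its own. `try`/`except` has not been introduced in this notebook; it is used here only so that the cell keeps running after the error.

```python
def GCcontent4(DNA):
    '''counts GC content for a DNA string of length 4 (copied from above)'''
    counter = 0
    if DNA[0]=='G' or DNA[0]=='C':
        counter = counter+1
    if DNA[1]=='G' or DNA[1]=='C':
        counter = counter+1
    if DNA[2]=='G' or DNA[2]=='C':
        counter = counter+1
    if DNA[3]=='G' or DNA[3]=='C':
        counter = counter+1
    return counter/4.0


# Too short: DNA[3] does not exist, so Python raises an IndexError
try:
    print(GCcontent4('GCG'))
except IndexError as err:
    print('IndexError:', err)

# Too long: no error, but the fifth base is silently ignored
print(GCcontent4('AAAAG'))   # 0.0, although the string contains a G
```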

<div class="alert alert-success"><b>Task:</b> Pseudocode a function (<code>GCcontent3or4</code>) that will work with strings of length 3 or 4. Then, write your new function below.</div>

In [None]:
# Write your function here

def GCcontent3or4(DNA):
    
    '''counts GC content for a DNA string of length 3 or 4'''

    if len(DNA) == 3:
        counter = 0 # Initialize counter
    
        if DNA[0]=='G' or DNA[0]=='C':
            counter = counter+1
        if DNA[1]=='G' or DNA[1]=='C':
            counter = counter+1
        if DNA[2]=='G' or DNA[2]=='C':
            counter = counter+1
        
        return counter/3.0
    elif len(DNA)== 4:
        counter = 0 # Initialize counter
    
        if DNA[0]=='G' or DNA[0]=='C':
            counter = counter+1
        if DNA[1]=='G' or DNA[1]=='C':
            counter = counter+1
        if DNA[2]=='G' or DNA[2]=='C':
            counter = counter+1
        if DNA[3]=='G' or DNA[3]=='C':
            counter = counter+1
        
        return counter/4.0

    else:
        print('Error: DNA string must be length 3 or 4')
        return 0

    


In [None]:
dna = 'CGGAA'
GC_1 = GCcontent3or4(dna)
print(GC_1)
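
In the cell above, `GCcontent3or4('CGGAA')` prints the error message and returns `0`. That is also the correct answer for a valid sequence with no G or C (for example `'AAT'`), so the caller cannot tell the two cases apart from the return value alone. One possible alternative, sketched here as an assumption rather than as this notebook's approach, is to raise an exception instead. The name `GCcontent3or4_strict` is hypothetical, and the counting uses `str.count` as in the `GC_2` line earlier.

```python
def GCcontent3or4_strict(DNA):
    '''Return GC content for a DNA string of length 3 or 4; raise otherwise.'''
    if len(DNA) != 3 and len(DNA) != 4:
        raise ValueError('DNA string must be length 3 or 4')
    return (DNA.count('G') + DNA.count('C')) / len(DNA)


print(GCcontent3or4_strict('ACGG'))   # 0.75
print(GCcontent3or4_strict('AAT'))    # 0.0

# An invalid length now stops with an error instead of returning a number
try:
    GCcontent3or4_strict('CGGAA')
except ValueError as err:
    print('ValueError:', err)
```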

Typically, we'll work with DNA strings that are longer than 4! What if we need to work with a longer string? Turns out, there's a good way to tackle this: a **for** loop. More on that in the next notebook.


## Additional Resources
* <a href="https://www.python-course.eu/python3_functions.php">Python Course: Functions</a>
* <a href="https://swcarpentry.github.io/python-novice-plotting/17-conditionals/">Software Carpentry: Conditionals</a>

## About this notebook
* This notebook is largely derived from UCSD COGS18 Materials, created by Tom Donoghue & Shannon Ellis, as well as exercises in [*Computing for Biologists*](https://www.cambridge.org/highereducation/books/computing-for-biologists/5B08EEEE2AE8A602113A8F225E89F5FD#overview).