# Notebook 3 - Functions
Functions are crucial for writing code that is concise and reusable. They allow us to wrap multiple lines of code in a single, callable block and pass variables into that block. We've already seen a few built-in functions and methods, like `len()` and `.append()`.

## 3.1 `def`, `return` and arguments
We define a function using the `def` keyword, and use the `return` keyword to define the output of the function. Every indented line after the `:` is part of the function.

Functions take in *arguments*: variables passed into the function when it is called, which are then used in the rest of its code. For example, the function below takes an argument `x` and returns the square of that input:

```python
def square_number(x):
    return x*x
```
We can then call that function and use it in other parts of our code:

```python
y = 3
y_squared = square_number(y)
print(y_squared) # 9
```

💡 **Task:** Use a similar structure to write a function which reverses a string.

In [4]:
# Code Block 1
y = 3
def square_number(x): 
    return x*x
y_squared = square_number(y)
print(y_squared) # 9

9
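
One possible answer to the string-reversal task above is sketched below. The name `reverse_string` is only a suggestion, and the loop-based approach is an assumption chosen to match the `def`/`return` structure shown so far; other solutions work equally well.

```python
def reverse_string(text):
    # Build the reversed string by placing each new character at the front
    reversed_text = ''
    for character in text:
        reversed_text = character + reversed_text
    return reversed_text

print(reverse_string('peptide'))  # editpep
```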


Most of the time, we write functions to prevent us from rewriting the same blocks of code over and over again. 

The following block of code builds 3-letter abbreviation strings for two peptides - Oxytocin and Bradykinin:

```python

oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']
aa_abbr = {'C':'Cys', 'Y':'Tyr', 'I':'Ile', 'Q':'Gln', 'N':'Asn', 'P':'Pro', 'L':'Leu', 'G':'Gly','R':'Arg', 'F':'Phe', 'S':'Ser'}

# build the 3-letter representation for oxytocin:
oxytocin_3str = ''
for aa in oxytocin:
    abbr = aa_abbr[aa]
    oxytocin_3str += abbr
    oxytocin_3str += '-'
# Add a CONH2 C-terminal group
oxytocin_3str += 'CONH2'

# build the 3-letter representation for bradykinin:
bradykinin_3str = ''
for aa in bradykinin:
    abbr = aa_abbr[aa]
    bradykinin_3str += abbr
    bradykinin_3str += '-'
bradykinin_3str += 'CONH2'

print(bradykinin_3str)
print(oxytocin_3str)

```
There are two large blocks in this code that achieve the same thing: cycling through the amino acids in a peptide, looking up their abbreviations, and appending them to the output string. We can use functions to make this much cleaner!

The function below takes a `peptide` as an argument and returns a string of the 3 letter abbreviations.

```python
def make_peptide_3str(peptide):

    peptide_3str = ''
    for aa in peptide:
        abbr = aa_abbr[aa]
        peptide_3str += abbr
        peptide_3str += '-'
    
    peptide_3str += 'CONH2'

    return peptide_3str
```
We can now call our `make_peptide_3str()` function on any arbitrary peptide:

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']

oxytocin_3str = make_peptide_3str(oxytocin)
bradykinin_3str = make_peptide_3str(bradykinin)

print(oxytocin_3str)
print(bradykinin_3str)
```
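
For reference, the pieces above can be combined into one self-contained sketch that runs on its own. It simply repeats the dictionary, the function and the two calls from this section, with the expected output shown as comments.

```python
aa_abbr = {'C': 'Cys', 'Y': 'Tyr', 'I': 'Ile', 'Q': 'Gln', 'N': 'Asn', 'P': 'Pro',
           'L': 'Leu', 'G': 'Gly', 'R': 'Arg', 'F': 'Phe', 'S': 'Ser'}

def make_peptide_3str(peptide):
    peptide_3str = ''
    for aa in peptide:
        peptide_3str += aa_abbr[aa]  # look up the 3-letter abbreviation
        peptide_3str += '-'
    peptide_3str += 'CONH2'          # C-terminal group
    return peptide_3str

oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']
bradykinin = ['R', 'P', 'P', 'G', 'F', 'S', 'P', 'F', 'R']

print(make_peptide_3str(oxytocin))    # Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-CONH2
print(make_peptide_3str(bradykinin))  # Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg-CONH2
```

The loop now exists in one place only, so a change to the output format needs to be made once rather than once per peptide.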

💡 **Task:** Use a similar structure to the code above, along with the `aa_mass` dictionary in Code Block 2 to build a function called `calculate_peptide_mass` which takes in a `peptide` list as an argument and returns the molar mass of the peptide. Try your function with Oxytocin and Bradykinin.

In [None]:
# Code Block 2
aa_mass = {'C':121.16, 'Y':181.19, 'I':131.17, 'Q':146.14, 'N':132.12, 'P':115.13, 'L':131.17, 'G':75.05,'R':174.20, 'F':165.19, 'S':105.09}
        
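def calculate_peptide_mass(peptide):
    # Add up the aa_mass value of every amino acid in the peptide
    peptide_mass = 0.0
    for aa in peptide:
        peptide_mass += aa_mass[aa]
    return peptide_mass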



## 3.2 Functions with default arguments

Functions can have multiple arguments. The example below shows an updated version of the `make_peptide_3str` function which takes both `peptide` and `c_terminal` as arguments. This gives the function a little more utility and flexibility.

Additionally, the `c_terminal` argument is given a default value of `'CONH2'`. We do this by putting `c_terminal='CONH2'` in the function definition. We will explore what this does in the next Code Block.


```python

def make_peptide_3str(peptide, c_terminal='CONH2'):
    peptide_3str = ''
    for aa in peptide:
        abbr = aa_abbr[aa]
        peptide_3str += abbr
        peptide_3str += '-'

    peptide_3str += c_terminal
    return peptide_3str
```
Unless you specify otherwise, Python will assume you are passing arguments in the same order they appear in the function definition. For example, `make_peptide_3str(oxytocin, 'COOH')` is equivalent to both `make_peptide_3str(peptide=oxytocin, c_terminal='COOH')` and `make_peptide_3str(c_terminal='COOH', peptide=oxytocin)`.
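
The equivalence of positional and keyword arguments can be checked directly. The sketch below is self-contained and uses a short made-up peptide (`tripeptide`, a hypothetical example, not one from this notebook) with a trimmed-down `aa_abbr` dictionary.

```python
aa_abbr = {'G': 'Gly', 'P': 'Pro', 'R': 'Arg'}  # trimmed-down lookup for this sketch

def make_peptide_3str(peptide, c_terminal='CONH2'):
    peptide_3str = ''
    for aa in peptide:
        peptide_3str += aa_abbr[aa]
        peptide_3str += '-'
    peptide_3str += c_terminal
    return peptide_3str

tripeptide = ['G', 'P', 'R']  # hypothetical peptide used only for illustration

by_position = make_peptide_3str(tripeptide, 'COOH')
by_keyword = make_peptide_3str(peptide=tripeptide, c_terminal='COOH')
by_keyword_reordered = make_peptide_3str(c_terminal='COOH', peptide=tripeptide)

print(by_position)                                        # Gly-Pro-Arg-COOH
print(by_position == by_keyword == by_keyword_reordered)  # True
```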

💡 **Task:** Copy the code above into Code Block 3, then try calling `make_peptide_3str` with the following arguments:
* `make_peptide_3str(peptide=oxytocin, c_terminal="CONH2")`
* `make_peptide_3str(peptide=oxytocin, c_terminal="COOH")`
* `make_peptide_3str(peptide=oxytocin)`
* `make_peptide_3str(c_terminal="COOH")`

What is the role of the default argument? What happens if you don't pass a value for `c_terminal` when calling the function? What happens if you leave out `peptide`?

In [10]:
# Code Block 3
# Requires aa_abbr and oxytocin from the earlier cells
def make_peptide_3str(peptide, c_terminal='CONH2'):
    peptide_3str = ''
    for aa in peptide:
        abbr = aa_abbr[aa]
        peptide_3str += abbr
        peptide_3str += '-'

    peptide_3str += c_terminal  # use the c_terminal argument rather than a fixed group
    return peptide_3str

print(make_peptide_3str(peptide=oxytocin, c_terminal="CONH2"))
print(make_peptide_3str(peptide=oxytocin, c_terminal="COOH"))
print(make_peptide_3str(peptide=oxytocin))  # c_terminal falls back to its default
# make_peptide_3str(c_terminal="COOH")  # raises a TypeError: the required peptide argument is missing

## 3.3 Functions with multiple returns 

We can design a function to give us different returns based on some condition. The function below checks to see if a given amino acid is present in a given peptide. To achieve this, it uses two different `return` statements in the same function.

```python

def is_amino_acid_in_peptide(amino_acid='C', peptide=oxytocin):

    if amino_acid in peptide:
        return True
    else:
        return False
```
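
Only one `return` statement runs per call: as soon as Python reaches a `return`, the function stops. The self-contained sketch below calls the function in three ways to show both branches; the short list passed in the last call is a made-up example.

```python
oxytocin = ['C', 'Y', 'I', 'Q', 'N', 'C', 'P', 'L', 'G']

def is_amino_acid_in_peptide(amino_acid='C', peptide=oxytocin):
    if amino_acid in peptide:
        return True   # the function stops here when the amino acid is found
    else:
        return False  # only reached when the amino acid is absent

print(is_amino_acid_in_peptide())                        # True  (defaults: 'C' in oxytocin)
print(is_amino_acid_in_peptide('R'))                     # False (no 'R' in oxytocin)
print(is_amino_acid_in_peptide('R', ['R', 'P', 'P']))    # True  (hypothetical short peptide)
```
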
💡 **Task:** Use your `calculate_peptide_mass` function to build a new function which returns `True` if a given peptide has a molar mass greater than 1000, otherwise returns `False`. How could you amend this to work for an arbitrary mass?

In [14]:
# Code Block 3


## 3.4 Documentation and `help()`

Documentation is vital for making our code more readable and reusable. We include documentation in our functions in the form of a *docstring* by writing text inside triple quotes (i.e., `"""<text>"""`) directly after the `def` line. This usually tells the user what the function does, what data types the inputs should be, and what types to expect in the output. Below is an example of a docstring for `is_amino_acid_in_peptide`:

```python
def is_amino_acid_in_peptide(amino_acid='C', peptide=oxytocin):
    """Checks whether a given `amino_acid` is found in a given `peptide`.

    args:
        amino_acid (str) : A single-letter amino acid code.
        peptide (list[str]) : A list of single-letter amino acid codes in a peptide.
    returns:
        bool : True if the amino acid is in the peptide, else False.
    """ 
    if amino_acid in peptide:
        return True
    else:
        return False
```
In this example, we tell the user that the type expected for `peptide` is `list[str]`, as it is a list of string elements. If it were a list of `float` numbers we would use `list[float]`.

We can view the documentation for a function `function1` by calling `help(function1)`.
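
As a small illustration, the sketch below adds a docstring to the `square_number` function from Section 3.1 and then displays it. The docstring wording is only an example; the same text is also stored on the function as `__doc__`.

```python
def square_number(x):
    """Returns the square of a number.

    args:
        x (float) : The number to square.
    returns:
        float : The value of x multiplied by itself.
    """
    return x * x

help(square_number)           # prints the function signature followed by the docstring
print(square_number.__doc__)  # the raw docstring text stored on the function
```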

💡 **Task:** Rewrite your `calculate_peptide_mass` function to include documentation, then use the `help()` function to view the documentation. 

In [15]:
# Code block 4


## 3.5 Exercise 1: building peptides from DNA

This exercise will pull together a few of the things we have covered already to build a function that takes in a sequence of nucleotides as an argument, and returns a 3-letter string formatted peptide as an output. 

To get us started, Code Block 5 contains a dictionary `dna_to_aa`, which has DNA codon triplets as keys and the amino acids they encode as values. The input to your function will be a string of nucleotides, e.g. `"GGGATACCGTAG"`, and will always end with the `"TAG"` stop codon.

The final function should work something like this:

```python

peptide_3str = peptide_from_dna(dna_sequence='GGGGTGGATTAG')

print(peptide_3str) # -> Gly-Val-Asp
```

Think carefully about the different steps required in your function, and how you will address them. You might be able to reuse some functions from elsewhere in this notebook!

💡 **Tip:** You can write multiple functions here. It's usually best to aim for a function to only do one thing at a time, so try to split the tasks across a few different functions!

💡 **Extension:** This function will fail if the length of the input nucleotide sequence is not divisible by three. Luckily, we have the `%` (mod) operator, which returns the remainder of a number when it is divided by another number. Hence, if `x=9` and `y=3` then `x % y` evaluates to `0`, while `(x+1) % y` evaluates to `1`.
* Use the `%` operator to add a check in your function which prints `"Check DNA sequence length!"` if the length is not a multiple of three.
* What other checks could you put in to the function to ensure safe use?
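
A minimal sketch of the `%` operator applied to a sequence length is shown below. The variable `truncated_sequence` is a hypothetical input made by dropping the last nucleotide from the example sequence; how the check is wired into your own function is left open.

```python
x = 9
y = 3
print(x % y)        # 0
print((x + 1) % y)  # 1

dna_sequence = 'GGGGTGGATTAG'  # 12 nucleotides, i.e. four complete codons
print(len(dna_sequence) % 3)   # 0

truncated_sequence = dna_sequence[:-1]  # hypothetical 11-nucleotide input
if len(truncated_sequence) % 3 != 0:
    print("Check DNA sequence length!")
```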

In [13]:
# Code Block 5
dna_to_aa = {'GGG':'G', 'GAG':'E', 'GAT':'D', 'GTG':'V', 'GCG':'A', 'AGA':'R', 
             'AGT':'S', 'AAA':'K', 'AAT':'N', 'ATA':'I', 'ACC':'T', 'TAT':'Y', 
             'TTG':'L', 'TGT':'C', 'TTT':'F', 'CCG':'P', 'CAG':'Q', 'ATG':'M', 'TAG':'End'}

dna_sequence = 'TGTTATATACAGAATTGTCCGTTGGGGTAG'
X = 9
Y = 3
# X % Y gives the remainder when X is divided by Y
print(X % Y == 0)      # 9 is a multiple of 3, so the remainder is 0
print((X+1) % Y == 1)  # 10 divided by 3 leaves a remainder of 1



True
True
