# Lecture 3 - Functions, Dictionaries, Arguments, and Scopes
### University of California, Berkeley - Spring 2022

## What have we learned?

- Python
- Variables
- Operators
- Strings
- Lists
- Flow control: conditions (if, elif, else) and loops (for, while)

## Dictionaries
### Reminder - Lists
Lists are a data structure used to store collections of elements (int, float, str etc.) in an __ordered__ way.

In [1]:
organisms = ['Pan troglodytes', 'Gallus gallus', 'Xenopus laevis', 'Vipera palaestinae']

We access elements of lists by using their _index_:

In [2]:
print(organisms[0])
print(organisms[2])

Pan troglodytes
Xenopus laevis


__Dictionaries__ are another data structure used to store collections of elements, only this time they are accessed through a _key_ rather than a position. Keys can be any immutable (hashable) object, such as a string, an integer, or a float. Each key is connected to a _value_, which can be of any type.
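A quick sketch of which objects can serve as keys. The dictionary `mixed_keys` and its values are made up for illustration, and the `try`/`except` construct is used here only to show the error without stopping the program. Keys must be hashable, so a (mutable) list is rejected:

```python
# Keys of several hashable types living in one dictionary (illustrative values).
mixed_keys = {
    'Pan troglodytes': 'a string key',
    42: 'an integer key',
    3.14: 'a float key',
    ('Gallus', 'gallus'): 'a tuple key',
}
print(mixed_keys[42])
print(mixed_keys[('Gallus', 'gallus')])

# A list is mutable and therefore unhashable, so Python refuses it as a key.
try:
    bad = {['Gallus', 'gallus']: 'a list key'}
except TypeError as error:
    print('TypeError:', error)
```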

### Defining dictionaries:

In [3]:
organisms_classes = {'Pan troglodytes': 'Mammalia', 'Gallus gallus': 'Aves', 'Xenopus laevis': 'Amphibia', 'Vipera palaestinae': 'Reptilia'}

In this dictionary, the _keys_ are the organisms and the _values_ are the class of each organism. Both are of type `str`.

Another example would be a dictionary representing the number of observations of various species:

In [4]:
observations = {'Equus zebra': 143,
                'Hippopotamus amphibius': 27,
                'Giraffa camelopardalis': 71,
                'Panthera leo': 112}

Here, the keys are of type `str` and the values are of type `int`. Any other combination could be used.

### Accessing dictionary records
Accessing a dictionary record is similar to what we did with lists, only this time we'll call a _key_ instead of an _index_:

In [5]:
print(organisms_classes['Pan troglodytes'])
print(organisms_classes['Gallus gallus'])

Mammalia
Aves


### Changing and adding records
We can change the dictionary by simply assigning a new value to a key.

In [6]:
organisms_classes['Pan troglodytes'] = 'Mammals'
print(organisms_classes['Pan troglodytes'])

Mammals


Similarly, we can use this syntax to add new records: 

In [7]:
organisms_classes['Danio rerio'] = 'Actinopterygii'
print(organisms_classes['Danio rerio'])

Actinopterygii


__Note__: A dictionary may not contain multiple records with the same _key_, but it may contain many keys with the same _value_.
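The note above can be checked directly. In this small sketch (the dictionaries `counts` and `classes` are illustrative), a key that is written twice keeps only its last value, while two different keys can happily share a value:

```python
# The key 'Panthera leo' appears twice; only the last value survives.
counts = {'Panthera leo': 112, 'Equus zebra': 143, 'Panthera leo': 5}
print(counts)
print(len(counts))

# Two different keys may share the same value.
classes = {'Pan troglodytes': 'Mammalia', 'Bos taurus': 'Mammalia'}
print(classes)
```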

### Looping through dictionaries
Remember the __for__ loop and how we used it to loop over lists?

In [8]:
for organism in organisms:
    print(organism)

Pan troglodytes
Gallus gallus
Xenopus laevis
Vipera palaestinae


Well, it also works on dictionaries! The for loop simply iterates over the _keys_ of the dictionary.

In [9]:
for organism in organisms_classes:
    print(organism, 'belongs to the', organisms_classes[organism], 'class.')

Pan troglodytes belongs to the Mammals class.
Gallus gallus belongs to the Aves class.
Xenopus laevis belongs to the Amphibia class.
Vipera palaestinae belongs to the Reptilia class.
Danio rerio belongs to the Actinopterygii class.
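As an alternative to looking up each value inside the loop, the dictionary method `items()` yields every key together with its value. A minimal sketch, using a shortened copy of the dictionary so that the example runs on its own:

```python
organisms_classes = {'Pan troglodytes': 'Mammals',
                     'Gallus gallus': 'Aves',
                     'Danio rerio': 'Actinopterygii'}

# items() hands us the key and the value together on every iteration.
for organism, organism_class in organisms_classes.items():
    print(organism, 'belongs to the', organism_class, 'class.')

# keys() and values() give the two halves separately.
print(list(organisms_classes.keys()))
print(list(organisms_classes.values()))
```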


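Notice that the items come out in the order in which they were added. Dictionaries preserve insertion order since Python 3.7; older versions did not guarantee any particular order.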

We can even change values while looping:

In [10]:
for animal in observations:
    if observations[animal] > 50:
        observations[animal] = True
    else:
        observations[animal] = False
print(observations)

{'Equus zebra': True, 'Hippopotamus amphibius': False, 'Giraffa camelopardalis': True, 'Panthera leo': True}
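Overwriting the counts with booleans throws the original numbers away. One possible alternative, sketched here with a hypothetical second dictionary named `is_common`, is to store the results separately. Also note that replacing values during a loop is fine, but adding or removing keys while looping over the same dictionary raises a `RuntimeError`.

```python
observations = {'Equus zebra': 143,
                'Hippopotamus amphibius': 27,
                'Giraffa camelopardalis': 71,
                'Panthera leo': 112}

# Collect the True/False flags in a new dictionary (hypothetical name).
is_common = {}
for animal in observations:
    is_common[animal] = observations[animal] > 50

print(is_common)
print(observations)  # the original counts are still available
```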


### Is it in the dictionary?
We can check if a __key__ is in the dictionary using the `in` operator, which can also serve as the condition of an _if_ statement:

In [11]:
'Vipera palaestinae' in organisms_classes

True

In [12]:
'Bos taurus' in organisms_classes

False

In [13]:
new_organism = ['Vipera palaestinae', 'Bos taurus']
for organism in new_organism:
    if organism in organisms_classes:
        print(organism, 'belongs to the', organisms_classes[organism], 'class.')
    else:
        print(organism, 'not found in dictionary.')

Vipera palaestinae belongs to the Reptilia class.
Bos taurus not found in dictionary.
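For simple lookups, the dictionary method `get()` combines the membership check and the access in one step: it returns the value if the key exists and a fallback otherwise. A small sketch, where the fallback string `'unknown'` is just an illustrative choice:

```python
organisms_classes = {'Vipera palaestinae': 'Reptilia',
                     'Gallus gallus': 'Aves'}

# Second argument = value to return when the key is missing.
print(organisms_classes.get('Vipera palaestinae', 'unknown'))
print(organisms_classes.get('Bos taurus', 'unknown'))

# Without a fallback, get() returns None instead of raising a KeyError.
print(organisms_classes.get('Bos taurus'))
```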


## What's a function?
### In mathematics:
A function is like a _machine_ that (usually) takes a number, performs some mathematical process, and returns another number.  
For example, take the function $f(x) = 2x + 6$.  
When the function takes 3 (that is, x = 3), it returns 2*3 + 6 = 12.  
And in Python:

In [14]:
x = 3
y = 2*x + 6
print(y)

12


### In computer science
A function is a piece of code that performs some process. Like the mathematical concept, a function receives _inputs_ and returns _outputs_.  
We _define_ functions with the __def__ keyword.  
The general syntax is:  

In [21]:
def function_name(input1, input2, input3): # ...
    # some processes
    # .
    # .
    # .
    return None # output

In [22]:
def linear1(x):
    y = 2*x + 6
    return y

Once a function is defined, we can call it whenever we need it (i.e. multiple times), with different inputs.

In [23]:
result1 = linear1(3)
print(result1)

12


In [24]:
result2 = linear1(7)
print(result2)

20
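Because a call is just an expression, it can also sit inside a loop. A minimal sketch that applies `linear1` to several inputs and collects the results (the list of inputs is arbitrary):

```python
def linear1(x):
    y = 2*x + 6
    return y

# Call the same function once per input and keep every result.
results = []
for value in [0, 1, 3, 7]:
    results.append(linear1(value))

print(results)  # [6, 8, 12, 20]
```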


A function may have more than one input, and the inputs can be of any type, not just numbers.  
For example, the following function receives a __list__ of sequences and concatenates a given sequence __string__ to each sequence in the list. It then returns the new list.

In [25]:
def concat_to_sequences(sequence_list, sequence_to_concat):
    new_list = []
    for seq in sequence_list:
        new_list.append(seq + sequence_to_concat)
    return new_list

In [26]:
my_sequences = ['AGTTAGAGTTA', 'TTACCAGTG', 'GGCAACTTTAGG']
new_sequences = concat_to_sequences(my_sequences, 'GGG')
print(my_sequences)
print(new_sequences)

['AGTTAGAGTTA', 'TTACCAGTG', 'GGCAACTTTAGG']
['AGTTAGAGTTAGGG', 'TTACCAGTGGGG', 'GGCAACTTTAGGGGG']


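The names listed in a function's definition (such as `sequence_list` above) are called __parameters__, or formal parameters. The values passed in when the function is called are called __arguments__. In everyday use, the two terms are often used interchangeably.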
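Arguments can be passed by position or by name. The sketch below reuses `concat_to_sequences`, with one addition that is not part of the original function: a default value of `'GGG'` for the second parameter, added here only to illustrate how defaults behave.

```python
def concat_to_sequences(sequence_list, sequence_to_concat='GGG'):
    # The default 'GGG' is an illustrative assumption, not part of the original.
    new_list = []
    for seq in sequence_list:
        new_list.append(seq + sequence_to_concat)
    return new_list

my_sequences = ['AGTT', 'TTAC']

# Positional: arguments are matched to parameters by their order.
by_position = concat_to_sequences(my_sequences, 'AAA')
# Keyword: arguments are matched by name, so the order does not matter.
by_keyword = concat_to_sequences(sequence_to_concat='AAA', sequence_list=my_sequences)
# Omitting the second argument falls back to the default value.
with_default = concat_to_sequences(my_sequences)

print(by_position)   # ['AGTTAAA', 'TTACAAA']
print(by_keyword)    # ['AGTTAAA', 'TTACAAA']
print(with_default)  # ['AGTTGGG', 'TTACGGG']
```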

### Why do we need functions?
So why bother? Can't we just write code as we did so far and avoid this whole business of functions?  
Functions are good for (at least) three reasons:
* Preventing code duplication - if we perform the same process multiple times, we don't have to write it again every time. We just call the function, which makes the code shorter and more readable and reduces the chance of errors.
* Modularity - your code can easily be separated into small components, which can be reused and recombined.
* Abstraction - breaking a complex task into smaller and simpler tasks.
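To make the first point concrete, here is a small sketch built around a hypothetical helper, `gc_content`, which computes the fraction of G and C bases in a sequence. Without the function, the counting code would have to be copied once per sequence; with it, each use is a single call.

```python
def gc_content(sequence):
    # Count G and C bases and divide by the sequence length.
    gc_count = 0
    for base in sequence:
        if base == 'G' or base == 'C':
            gc_count += 1
    return gc_count / len(sequence)

my_sequences = ['AGTTAGAGTTA', 'TTACCAGTG', 'GGCAACTTTAGG']

# One call per sequence instead of three copies of the counting loop.
for seq in my_sequences:
    print(seq, gc_content(seq))
```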

### A biological example

Now, let's use some of the stuff we've learned to write a function that finds the reverse complement of a given sequence. Let's start by finding the complement.

In [27]:
def complement(sequence):
    # Map each base to its complementary base
    complement_dict = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
    complement_seq = ''
    for base in sequence:
        complement_seq += complement_dict[base]
    return complement_seq

In [28]:
my_dna = 'ACGCTATTAGAGGGCGAGAAGCTAGAGGA'
my_complement = complement(my_dna)
print(my_complement)

TGCGATAATCTCCCGCTCTTCGATCTCCT


Now, let's write another function that reverses a given sequence.

In [29]:
def reverse_sequence(sequence):
    reversed_seq = ''
    seq_as_list = list(sequence)
    for base in reversed(seq_as_list):
        reversed_seq += base
    return reversed_seq

In [30]:
my_reverse_complement = reverse_sequence(my_complement)
print(my_reverse_complement)

TCCTCTAGCTTCTCGCCCTCTAATAGCGT


We can call functions _from within_ a function, thereby wrapping the two functions we have in a third function.

In [31]:
def reverse_complement(sequence):
    complement_seq = complement(sequence)
    reverse_complement = reverse_sequence(complement_seq)
    return reverse_complement

In [32]:
print(reverse_complement(my_dna))

TCCTCTAGCTTCTCGCCCTCTAATAGCGT
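For comparison, the same result can be obtained more compactly with two built-in string features: the slice `[::-1]`, which reverses a string, and `str.maketrans()` with `translate()`, which replace characters according to a table. This is only a sketch of one possible shortcut; the function name `reverse_complement_short` is made up for this example.

```python
# Translation table: A<->T and C<->G.
COMPLEMENT_TABLE = str.maketrans('ACGT', 'TGCA')

def reverse_complement_short(sequence):
    # translate() swaps each base for its complement; [::-1] reverses the string.
    return sequence.translate(COMPLEMENT_TABLE)[::-1]

my_dna = 'ACGCTATTAGAGGGCGAGAAGCTAGAGGA'
print(reverse_complement_short(my_dna))
```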


Functions don't __have__ to return anything. Sometimes they just print stuff to the screen or to a file (next lesson). For example, we can take the function we created above and simply replace 'return' with 'print':

In [33]:
def print_reverse_complement(sequence):
    complement_seq = complement(sequence)
    reverse_complement = reverse_sequence(complement_seq)
    print(reverse_complement)

In [34]:
print_reverse_complement(my_dna)

TCCTCTAGCTTCTCGCCCTCTAATAGCGT


So, what's the difference between __return__ and __print__?  
As the names suggest, __print__ only displays a value on the screen, while __return__ hands the value back to the caller, where it can be stored in a variable. The difference is especially noticeable when the output is not a string (e.g. a list or a dictionary). Even if the output is a string, __return__ lets you manipulate it further, while __print__ does not.

In [35]:
my_reverse_complement = reverse_complement(my_dna)
final_sequence = "ATG" + my_reverse_complement + "TAA"
print(final_sequence)

ATGTCCTCTAGCTTCTCGCCCTCTAATAGCGTTAA
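A function that only prints still returns something: the special value `None`. The sketch below uses two small, made-up functions to show what ends up in a variable in each case.

```python
def add_stop_returning(sequence):
    return sequence + 'TAA'

def add_stop_printing(sequence):
    print(sequence + 'TAA')

returned = add_stop_returning('ATGGCC')
printed = add_stop_printing('ATGGCC')   # prints ATGGCCTAA as a side effect

print(returned)  # ATGGCCTAA
print(printed)   # None, because the function has no return statement
```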


### Documenting your functions
It is considered good practice to add documentation to functions you write - what do they do, what's their input and output etc. It becomes very useful once you have lots of code that you want to reuse. If you document your functions, you won't have to read the whole code when you need them again.  
Documenting functions is done by adding a '_docstring_' right under the definition line. It is enclosed in triple quotes ("""). For example:

In [36]:
def reverse_complement(sequence):
    """
    Receives a string of DNA sequence and returns a string of its reverse complement
    """
    complement_seq = complement(sequence)
    reverse_complement = reverse_sequence(complement_seq)
    return reverse_complement

You can easily access the documentation of a function using the built-in `help()` function.

In [37]:
help(reverse_complement)

Help on function reverse_complement in module __main__:

reverse_complement(sequence)
    Receives a string of DNA sequence and returns a string of its reverse complement
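The docstring is also stored on the function itself, in its `__doc__` attribute. The sketch below shows a slightly longer docstring that describes the input and output separately; the layout is just one reasonable convention, not a required format.

```python
def reverse_sequence(sequence):
    """Return the given sequence string in reverse order.

    sequence: a string such as 'ACGT'.
    Returns a new string; the input is left unchanged.
    """
    return sequence[::-1]

# The raw docstring is available as an attribute of the function.
print(reverse_sequence.__doc__)
```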



### Built-in functions

In fact, we've used functions before without defining them first, for example `print()`, `type()`, `int()` and `len()`. These _built-in_ functions come with Python itself. It is strongly advised not to overwrite built-in functions with your own functions. That is, don't do:

In [38]:
# def len(lst):
#    .
#    .
#    .

Just use another name.  
We can get more functions written by others by __importing__ them into our code. We'll do that in the next lesson.
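Returning to the warning about built-in names, here is a sketch of what goes wrong when a name such as `len` is reused. Both functions are hypothetical; the reuse is kept inside a function so that the real `len()` is unaffected elsewhere, and `try`/`except` is used only to capture the error message.

```python
def count_bases_broken(sequence):
    len = 0  # this local variable hides the built-in len()
    try:
        return len(sequence)  # fails: len is now an integer, not a function
    except TypeError as error:
        return 'TypeError: ' + str(error)

def count_bases(sequence):
    return len(sequence)  # the built-in len() is still intact here

print(count_bases_broken('ACGT'))
print(count_bases('ACGT'))
```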

## Scopes

Assume we have the following function, which calculates the hypotenuse (the side opposite the right angle) given the two other sides of a right triangle. (Remember Pythagoras' theorem?)

In [39]:
def pythagoras(a,b):
    hypo_square = a**2 + b**2
    hypo = hypo_square**0.5

And now we want to run our function on the sides _a_ = 3 and _b_ = 5. So we do:

In [40]:
pythagoras(3,5)
print(hypo)

NameError: name 'hypo' is not defined

__What happened to our result???__  
The answer is _Scope_!  
The variable _hypo_ 'lives' only as long as the function is running. In other words, it exists only within the _scope_ of the function, and so do _a_, _b_ and _hypo_square_!  
If we print _hypo_ from _within_ the function, it works:

In [41]:
def pythagoras(a,b):
    hypo_square = a**2 + b**2
    hypo = hypo_square**0.5
    print(hypo)
pythagoras(3,5)

5.830951894845301


Even better, we can use the __return__ statement to pass the result back to the code that called the function:

In [42]:
def pythagoras(a,b):
    hypo_square = a**2 + b**2
    hypo = hypo_square**0.5
    return hypo

result = pythagoras(3,5)
print(result)

5.830951894845301
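Scope works in the other direction too: a variable assigned inside a function does not overwrite a variable with the same name outside it. In this sketch, the outer `hypo` and the sides 3 and 4 are illustrative choices.

```python
hypo = 'defined outside'

def pythagoras(a, b):
    hypo = (a**2 + b**2) ** 0.5  # a new, local hypo that exists only in here
    return hypo

result = pythagoras(3, 4)
print(result)  # 5.0
print(hypo)    # defined outside (the outer variable was never touched)
```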


## Congrats!

The notebook is available at https://github.com/Naghipourfar/molecular-biomechanics/