# Introduction to the Python programming language

The "Collaborative Chemoinformatics Open Platform" (CHEMO) course is a self-paced learning platform aimed at students and researchers interested in developing biocomputing tools with Python, one of the most widely used programming languages. With advances in the omics sciences and new technologies, it has become necessary to acquire computing skills applied to science and to the management of biological databases.

Welcome to the introduction to the CHEMO course. This first part covers the fundamentals of the Python programming language, with a focus on manipulating, extracting and analyzing data from biological (omics) databases. It starts from the basic concepts of programming and their applications, with examples based on the manipulation of DNA sequences.

> **Note:** This book is available in two ways
> 1. Downloading the repository and following the instructions in the file [README.md](https://github.com/ramirezlab/CHEMO/blob/main/README.md)
> 2. Clicking here on[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ramirezlab/CHEMO/blob/main/1_PART_ONE/1.1_Introduction_Python.en.ipynb?hl=es)

## Contents

In this first *Python_basic* Jupyter notebook you will learn:

1. Introduction to Python
2. Variables
3. Types of data
4. Types of collections
5. Upload files
6. String manipulation
7. Flow control structures
8. Functions

## Jupyter Notebook Overview
A Jupyter notebook is made up of a title, a toolbar, and cells. The toolbar lets us save the notebook, insert a new cell, and cut, copy and paste cells. It also has a section with controls for the kernel, the component in charge of executing the code contained in the notebook. From this section we can run the selected cell and move on to the next one, interrupt the kernel, restart the kernel, or restart the kernel and run all the cells. For all tutorials we recommend restarting the kernel, which is done by clicking the button with the circular (return) arrow. Finally, the toolbar has an option to set the type of a cell. Code cells are identified by the `In [ ]:` label that appears at the beginning of the cell. Markdown cells are where we can include comments, images and links using the Markdown language.

# Theory: basic concepts

## Variables

A variable is a name that refers to a value stored in the computer's memory, and that value can be of different **data types**. Values are assigned to variable names with the `=` operator. For example, in `text = "hello world"`, `text` is the name of the variable and `"hello world"` is its value. This is known as an assignment statement.

It is the software developer who chooses each variable name, ideally one that is easy to remember and describes what the variable holds. Variable names cannot begin with a number, cannot contain spaces, and are case sensitive.
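The naming rules above can be checked with a minimal sketch. The variable names are illustrative and not part of the course material; `str.isidentifier()` is a built-in string method that reports whether a piece of text is a valid name.

```python
# Names are case sensitive: these are two different variables
sequence = "ATG"
Sequence = "TAC"

# Underscores replace spaces; digits are allowed, but not as the first character
dna_sequence_1 = "ATGC"

# str.isidentifier() reports whether a text could be used as a variable name
valid_name = "dna_sequence_1".isidentifier()
starts_with_digit = "1_dna_sequence".isidentifier()
contains_space = "dna sequence".isidentifier()

print(sequence, Sequence)
print(valid_name, starts_with_digit, contains_space)
```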

The following table shows some of the data that can be worked on in Python:

### Native Python data types

|          Name         | Type name  |  Category  |                  Description                  |        Example                                |
| :-------------------- | :--------- | :--------- | :-------------------------------------------- |:---------------------------------------------|
| integer               | `int`      |   Numeric  | Positive/negative integers                     | 1, 2, ..., n                                 |
| floating point number | `float`    |   Numeric  | Real numbers in decimal form                   | 1.0, 1.1, 1.22, 1.333, ..., n                |
| boolean               | `bool`     |   Logical  | True or False                                  | True, False                                  |
| string                | `str`      |   Text     | Text                                           | 'Hello world...'                              |
| list                  | `list`     |  Sequence  | An ordered and mutable collection of objects   | [0, 1, 2, 3, ..., n]                         |
| tuple                 | `tuple`    |  Sequence  | An immutable, ordered collection of objects    | (1, 2, 3, ..., n)                            |
| dictionary            | `dict`     |   Mapping  | A collection of key-value pairs                | {"First Name": "Doe", "Last Name": "Laden"}  |
| none                  | `NoneType` |     Null   | Represents no value                            | `None`                                       |

(_Note: Variable names cannot be the same as Python reserved words._)
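One way to see which words are reserved is the standard-library `keyword` module. The sketch below is only illustrative; the candidate names `for` and `codon` are chosen for the example.

```python
import keyword

# keyword.kwlist lists every reserved word of the Python version in use
print(keyword.kwlist)

# keyword.iskeyword() checks a single candidate name
for_is_reserved = keyword.iskeyword("for")
codon_is_reserved = keyword.iskeyword("codon")
print(for_is_reserved, codon_is_reserved)
```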

## Data types

### Data type: numeric

There are three numeric data types (`int`, `float` and `complex`); here we will generally work with two of them: integers (`int`) and floating point numbers (`float`) **<sup> 1 </sup>**. The `type()` function helps us determine the type of an object in Python.

* **Integers (int)**: this data type includes all whole numbers. Python integers have no fixed maximum size; they are limited only by the available memory.
* **Floating point (float)**: this data type is used to represent real numbers, most of them only approximately. `float` values are stored in a particular way, called floating point representation, which is explained in detail in [the IEEE 754 standard](https://en.wikipedia.org/wiki/IEEE_754). If a whole number is written with a decimal point, for example `1.0`, it is stored as a float.
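A short sketch of both points, using illustrative values: integers grow as large as needed, while floats are approximations, so comparing them with `math.isclose()` is usually safer than `==`.

```python
import math

# Integers have no fixed maximum size
big_integer = 2 ** 100
print(big_integer)

# 0.1 and 0.2 cannot be stored exactly, so their sum is not exactly 0.3
total = 0.1 + 0.2
print(total == 0.3)
print(math.isclose(total, 0.3))

# A decimal point is enough to make a value a float
print(type(1), type(1.0))
```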


In [None]:
# Example: integers and floats
integer_1 = 12361
float_1 = 123.215  # the decimal separator is a period; 123,215 would create a tuple
float_2 = 1236.0

print("The type of the variable integer_1 is: " + str(type(integer_1)))
print("The type of the variable float_1 is: " + str(type(float_1)))
print("The type of the variable float_2 is: " + str(type(float_2)))

#### Arithmetic operations

| Symbol | Description |
|:-------:|:---------------:|
| `+` | addition |
| `-` | subtraction |
| `*` | multiplication |
| `/` | division |
| `**` | power |
| `//` | integer division |
| `%` | modulo (remainder of division) |
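The operators in the table can be tried with a minimal sketch. The values are hypothetical: a sequence of 17 nucleotides read in codons of 3.

```python
sequence_length = 17  # hypothetical number of nucleotides in a sequence
codon_size = 3

total = sequence_length + codon_size         # addition
difference = sequence_length - codon_size    # subtraction
product = sequence_length * codon_size       # multiplication
quotient = sequence_length / codon_size      # division always returns a float
squared = codon_size ** 2                    # power
full_codons = sequence_length // codon_size  # integer division: complete codons
leftover = sequence_length % codon_size      # modulo: nucleotides left over

print(total, difference, product, quotient, squared, full_codons, leftover)
```

Integer division and modulo complement each other: `full_codons * codon_size + leftover` gives back `sequence_length`.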

*Note*: From now on we will work with `f-strings`, which let you embed variables and expressions directly inside a text string. For more information see: [f-strings (Platzi blog)](https://platzi.com/blog/f-strings-en-python/?utm_source=google&utm_medium=cpc&utm_campaign=12915366154&utm_adgroup=&utm_content=&gclid=Cj0KCQjw3IqSBhCoARIsAMBkTb2p5ZOBtPtlGG2B7P0qrtnp8Wwvbgd2OY_F3_P-6OOU1YE_QHHCMaYaAnTaEALw_wcB&gclsrc=aw.ds), [PEP 498](https://peps.python.org/pep-0498/)

In [None]:
# Example operations with integers
x = 10
y = 5

print(f'The sum of two integers is an integer, for example: {x + y}')

In [None]:
# Example operations with real numbers in decimal form
x = 10.0
y = 5.0

print(f'The sum of two float numbers results in a float number, for example: {x + y}')

In [None]:
# Example operations with real numbers and integers
x = 10
y = 5.0

# Note that when an int and a float are combined, Python converts the int to a float.
print(f'The sum of an integer and a float is a float, for example: {x + y}')


In `Jupyter` notebooks (an interactive way of running Python), the value of the last expression in a cell is displayed automatically. This means that it is not always necessary to use the `print()` function.

In [None]:
#Example
x = 15
y = 20

x + y  # The result of this expression is displayed automatically

### Data type: Boolean (bool)

This data type (`bool`) has only two values: `True` and `False` **<sup> 1 </sup>**. It is used with logical operators and wherever it is necessary to evaluate whether a statement is true or not.
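As a minimal sketch, comparisons produce Boolean values that can be combined with the logical operators `and`, `or` and `not`. The values `gc_content` and `length` are hypothetical.

```python
gc_content = 0.55  # hypothetical fraction of G and C in a sequence
length = 120       # hypothetical sequence length

# Comparisons evaluate to True or False
is_gc_rich = gc_content > 0.5
is_short = length < 100

print(is_gc_rich, is_short)
print(is_gc_rich and is_short)  # True only if both are True
print(is_gc_rich or is_short)   # True if at least one is True
print(not is_short)             # inverts the value
print(type(is_gc_rich))
```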

### Data type: text (String)

Strings are denoted as <code>str</code> and are a sequence of symbols that can include uppercase and lowercase letters, numbers, punctuation marks, and spaces **<sup> 1 </sup>**.

There are three ways to write a string; all of them are valid and equivalent:
* **Between single quotes:** 'Donepezil'
* **Between double quotes:** "Donepezil"
* **Between three single quotes or three double quotes:** '''Donepezil''' or """Donepezil""" *(Mainly used to define multi-line strings)*

In [None]:
# Example of a variable of type string
text = 'Hello world'
print(f'text: {text}')

In [None]:
# Example of definitions of string type variables, with single, double and triple quotes.
text = "Hello world, double quotes"
print(f'Text - Double Quotes: {text}')

In [None]:
text = 'Hello World, single quotes'
print(f'Text - Single Quotes: {text}')

In [None]:
# Triple quotes preserve separators (spaces, newlines)
text = """Hello world,
triple double quote"""
print(f'Text - Triple Double Quote: {text}')

In [None]:
text = '''Hello world,
triple single quote'''
print(f'Text - Triple Single Quote: {text}')

If the text is a very long string, it is possible to write it in multiline mode, as shown below:

In [None]:
multiline_text = """This text is multiline
because it has different lines
for this it is printed
"""
# \n is a line separator
print(f'Multiline text \n{multiline_text}')

#### Indexing

One of the easiest ways to manipulate a string is through indexing.

A string behaves like a list of characters in which each element has an _index_; in `Python` the first element has index zero. Indexing and slicing (`text[start:end]`, where position `end` is not included) let you extract parts of the text.

In [None]:
text = "Text to be manipulated"

### EXAMPLE 1
print(f'Extract a substring from text (text[5:10]): {text[5:10]}') # from index 5 up to, but not including, index 10

In [None]:
#### EXAMPLE 2
print(f'Extract the word \'be\' from the text using positive indices (counted from the start): {text[8:10]}')
print(f'Extract the word \'be\' from the text using negative indices (counted from the end): {text[-14:-12]}')

It is even possible to check whether a piece of text is contained in a string with the `in` operator, as seen below:

In [None]:
text = "Text to be manipulated"

### EXAMPLE 1
print('Is \'for\' in text?')
print("for" in text)

#### EXAMPLE 2

print('\nIs \'Text\' in text?')
print("Text" in text)

#### String Methods

Python strings have several methods that return a new value without changing the original string. Among them are **<sup> 1 </sup>**:
* <code>.replace()</code>: replaces a specific value in the string with another and returns the new string.
* <code>.split()</code>: splits the string into substrings at the given separator. Returns a list of items.
* <code>.find()</code>: searches the string for a specific value and returns the index (position) of its first occurrence, or `-1` if the value is not found.

In [None]:
text = "Hello world. This is sample text"
print(text)

# replace a word
text.replace("Hello", "Hi")

In [None]:
# Separate the phrase by spaces
split_list = text.split(" ")
split_list

In [None]:
# Find the position of the word 'world' in text
text.find("world")
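The same three methods apply to sequence data. The following sketch uses a short, made-up DNA string with `-` as a separator, and shows that the original string is left unchanged.

```python
dna = "ATG-GCC-TAA"  # made-up sequence, codons separated by '-'

# .replace() returns a new string; dna itself is not modified
rna = dna.replace("T", "U")

# .split() returns a list with the pieces between separators
codons = rna.split("-")

# .find() returns the index of the first match, or -1 if there is none
position = rna.find("GCC")
missing = rna.find("T")

print(dna)
print(rna)
print(codons)
print(position, missing)
```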

### Data type: Collections

Lists, tuples, sets, and dictionaries are used to store multiple items in the same variable **<sup> 1 </sup>**.
* **Lists (list):** the elements are ordered and indexed, the list can be modified after it is created, and duplicates are allowed.
* **Tuples (tuple):** the elements are ordered and indexed, they cannot be changed, added or deleted once the tuple is created, and duplicates are allowed.
* **Sets (set):** the elements have no order and are not indexed, and duplicates are not allowed. Elements can be added or removed, but an element already in the set cannot be modified in place.
* **Dictionaries (dict):** store data in `key:value` pairs. They keep insertion order (since Python 3.7), values can be modified, and duplicate keys are not allowed.
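A quick way to compare the four collections is to build each one from the same items. The names and values below are illustrative.

```python
bases_list = ["A", "C", "G", "T", "A"]   # list: square brackets
bases_tuple = ("A", "C", "G", "T", "A")  # tuple: parentheses
bases_set = {"A", "C", "G", "T", "A"}    # set: braces, duplicates are dropped
complement = {"A": "T", "C": "G", "G": "C", "T": "A"}  # dict: key: value pairs

print(type(bases_list), type(bases_tuple), type(bases_set), type(complement))

# The repeated "A" is kept in the list and the tuple, but not in the set
print(len(bases_list), len(bases_tuple), len(bases_set), len(complement))
```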

#### Lists

Lists are used to store multiple items in a single variable. The elements or data that are stored can be of any type. The following are the characteristics of this type of data:
* The elements of the lists **are ordered**, that is, they have a defined order; new elements added with `.append()` are placed at the end of the list.
* Items in lists are **modifiable**, meaning you can change, add, and remove items after the list has been created.
* Lists **allow duplicates**, that is, there can be elements with the same value.

In [None]:
# Store in the num variable an ordered list of elements from 0 to 10
num = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(f'The list of integers from 0 to 10 is:')
num

In [None]:
# to write the same list you can use the range function
num = list(range(11))
num

Lists can be defined from different types of data:

In [None]:
num = [1, "2", 3, "4", 5, "6", 7, 8, "9", 10]
# Note how string elements are enclosed in quotes
print(f'The list is:')
num

In the previous example you can see the way in which a list is written:
* It is delimited by square brackets `[ ]`
* Each element is separated by commas `,`

Since the elements of a list are ordered, the index of an element can be found with the <code>.index()</code> method, which returns the index of the first occurrence of the element, counting from index 0, regardless of how many times the element appears in the list.
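A minimal sketch of this behavior, using an illustrative list in which one codon appears twice. The `.count()` method is included only to show how many times the element occurs.

```python
codons = ["AUG", "GCC", "AUG", "UAA"]  # illustrative list with a repeated element

# .index() stops at the first occurrence
first_position = codons.index("AUG")

# .count() reports how many times the element appears
occurrences = codons.count("AUG")

print(first_position, occurrences)
```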

##### Indexing 
The same indexing technique used for strings can be applied to lists:

In [None]:
# num: an ordered list of elements from 0-10
num = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print('list is:')
print(num)
print('---------------------')

#### EXAMPLE 1
print(f'In position 0 of the list is the element (first element):')
print(num[0])
print('---------------------')

#### EXAMPLE 2
print(f'The first two elements of the list are:')
print(num[0:2])
print('---------------------')

#### EXAMPLE 3
print(f'The last two elements of the list are:')
print(num[-2:])

##### List Methods
Sometimes it is necessary to perform basic operations on lists. Some of the available functions and methods are:

* <code>len()</code>
* <code>.index() </code>
* <code>.pop() </code>
* <code>.append()</code>
* <code>.remove() </code>
* <code>.reverse()</code>

###### len() Function
`len()` is a built-in function rather than a list method. With `len(<list>)` you can determine the number of elements in the list:

In [None]:
num = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print('Number of elements in the list:')
len(num)

###### .index() Method
With `.index(<element>)` you can find the position or index of an element in the list:

In [None]:
# .index() returns the position of an element in the list
print(f'What is the position of number 5?')
num.index(5)

###### .pop() method
With `.pop(<index>)` an item is removed from the list and returned.
- If the index is not specified, the last element is removed

This method **modifies the list in place**:

In [None]:
print('Remove the last item in the list')
print(num.pop())
print(num)

###### .append() Method
`.append(<element>)` adds the element to the end of the list. This method __modifies the list in place__.

In [None]:
num.append(6)
print(f'The list now with element 6 at the end:')
print(num)

###### .remove() Method
With `.remove(<element>)` an element is removed from a list; the value to remove must be given:
- If the list has several identical elements, only the first one is removed
- If the element is not in the list, a `ValueError` is raised

This method __modifies the list in place__.

In [None]:
num.remove(2)
print(f'The list now without the 2:')
print(num)

###### .reverse() Method
With `.reverse()` the order of the list is reversed, from the last element to the first. This method __modifies the list in place__.

In [None]:
# recreate num
num = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
num.reverse() # Change the order
print('List in reverse order')
print(num)

#### Tuples
Tuples are used to store multiple items in a single variable. The elements or data that are stored can be of any type. The following are the characteristics of this type of data:
* The elements of the tuples **are ordered**, that is, they have a defined order that will not change.
* Elements in tuples **are immutable**, that is, elements cannot be changed, added, or removed after the tuple has been created.
* Tuples **allow duplicates**, that is, there can be elements with the same value.

In [None]:
# Let tuple_num be an ordered tuple of elements from 1 to 10
tuple_num = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

print(f'The tuple of elements from 1 to 10 is:')
print(tuple_num)

A tuple is immutable, so new values cannot be assigned to its elements. Running the line below without the leading `#` raises a `TypeError`.

In [None]:
# tuple_num[3] = 12
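To see the error without stopping the notebook, the assignment can be wrapped in a `try/except` block (this construct is explained in the Dictionaries section). This is only a sketch of what happens when a tuple element is reassigned.

```python
tuple_num = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
message = None

try:
    tuple_num[3] = 12  # not allowed: tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print(f'TypeError: {message}')

# The tuple is unchanged
print(tuple_num)
```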

Tuple elements can be accessed in the same way as list elements:

In [None]:
tuple_num = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
print('Tuple of elements')
print(tuple_num)
print('---------------------')

#### EXAMPLE 1
print('First element (zero position)')
print(tuple_num[0])
print('---------------------')

#### EXAMPLE 2
print('Last four items')
print(tuple_num[-4:])
print('---------------------')

#### EXAMPLE 3
print('Elements from index 2 up to, but not including, index 5')
print(tuple_num[2:5])
print('---------------------')

#### Sets
Sets are used to store multiple items in a single variable. The elements that are stored can be of any immutable (hashable) type, such as numbers, strings or tuples. The following are the characteristics of this type of data:
* Elements in sets are **unordered**, that is, they do not have a defined order: the elements may appear in a different order each time you use them, and they cannot be referenced by index or key.
* Elements in sets **cannot be modified in place**: an element cannot be changed once it is in the set, although elements can be added to or removed from the set.
* Sets **do not allow duplicates**, that is, there cannot be elements with the same value.

In [None]:
# Set of elements with some repetitions
set_num_1 = {1, 2, 1, 2, 3, 5, 8, 1, 3, 9, 10, 2, 3, 4, 5, 6, 7, 8, 1, 0}

print('The set of elements is:')
set_num_1
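A small sketch of how sets behave, using a made-up DNA string: duplicates collapse when the set is built, and elements can still be added or removed afterwards with `.add()` and `.discard()`.

```python
# Building a set from a string keeps each distinct character once
nucleotides = set("ATGGCCATTGTA")
distinct_count = len(nucleotides)

# Elements can be added and removed after creation
nucleotides.add("U")
nucleotides.discard("T")

# Sets have no order, so sorted() is used to get a predictable listing
ordered = sorted(nucleotides)
print(distinct_count)
print(ordered)
```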

#### Dictionaries
Dictionaries are a special data structure that gives us more flexibility: each element is a pair made up of a key and a value. The key is a unique label that identifies the value stored under it; this type of structure is known as a collection of key-value pairs.

Let's see an example. We will define a dictionary that maps each RNA codon to the amino acid it encodes (the genetic code):

In [None]:
genetic_code = {
    "GUU": "V", "GUC": "V", "GUA": "V",
    "GUG": "V", "GCU": "A", "GCC": "A",
    "GCA": "A", "GCG": "A", "GAU": "D",
    "GAC": "D", "GAA": "E", "GAG": "E",
    "GGU": "G", "GGC": "G", "GGA": "G",
    "GGG": "G", "AGA": "R", "AGG": "R",
    "AGU": "S", "AGC": "S", "AAU": "N",
    "AAC": "N", "AAA": "K", "AAG": "K",
    "ACU": "T", "ACC": "T", "ACA": "T",
    "ACG": "T", "AUU": "I", "AUC": "I",
    "AUA": "I", "AUG": "M", "CGU": "R",
    "CGC": "R", "CGA": "R", "CGG": "R",
    "CCU": "P", "CCC": "P", "CCA": "P",
    "CCG": "P", "CAU": "H", "CAC": "H",
    "CAA": "Q", "CAG": "Q", "UUU": "F",
    "UUC": "F", "UUA": "L", "UUG": "L",
    "UCU": "S", "UCC": "S", "UCA": "S",
    "UCG": "S", "UAU": "Y", "UAC": "Y",
    "UAA": "STOP", "UAG": "STOP", "UGU": "C",
    "UGC": "C", "UGA": "STOP", "UGG": "W",
    "CUU": "L", "CUC": "L", "CUA": "L",
    "CUG": "L"}

# The previously created dictionary is printed
print(genetic_code)

Dictionaries can also be created with the built-in `dict()` constructor, for example from keyword arguments or from a list of key-value pairs.

We access dictionary values by placing the key in square brackets, similar to the way we index lists and strings.
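A minimal sketch of three ways to call `dict()`. The small dictionaries below are illustrative subsets of the genetic code, not replacements for the full table.

```python
# From keyword arguments (the keys must be valid Python names)
stop_codons = dict(UAA="STOP", UAG="STOP", UGA="STOP")

# From a list of (key, value) pairs
start_codon = dict([("AUG", "M")])

# From two parallel sequences joined with zip()
glycine_codons = ["GGU", "GGC", "GGA", "GGG"]
glycine = dict(zip(glycine_codons, ["G"] * len(glycine_codons)))

print(stop_codons)
print(start_codon)
print(glycine)
```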

In [None]:
### Example 1: What is the value of the genetic code of GGG?
print(f'The amino acid of GGG is: {genetic_code["GGG"]}')
print('---------------------')

### Example 2: Accessing a key that does not exist in the dictionary raises a KeyError
genetic_code['YYY']


In example 2 of the cell above, Python raised a `KeyError` because `YYY` is not defined in the dictionary. In `Python` you can use exception handling to define how to respond to errors. This is done with the `try/except` statement. In general, a `try/except` block has the following structure:

```markdown
try:
    code block
except <type>:
    code block
```

Now, the example can be written as follows:

In [None]:
### Example 2: Using try/except for when a key does not exist in the dictionary
try:
    genetic_code['YYY']
except KeyError:
    print('Key YYY does not exist in the dictionary, try another one!')
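For simple lookups there are alternatives to `try/except`, sketched here under the assumption that a missing key is an expected situation rather than an error. The dictionary `genetic_code_sample` is a small illustrative subset; `.get()` and the `in` operator are standard dictionary features.

```python
genetic_code_sample = {"GGG": "G", "AUG": "M"}  # small illustrative subset

# .get() returns None instead of raising KeyError when the key is missing
found = genetic_code_sample.get("GGG")
not_found = genetic_code_sample.get("YYY")

# A second argument sets the value returned for missing keys
with_default = genetic_code_sample.get("YYY", "unknown")

# The in operator checks whether a key exists
has_key = "YYY" in genetic_code_sample

print(found, not_found, with_default, has_key)
```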

# Theory: Flow control structures
Python has control flow statements that let you decide which blocks of code run and how many times. Two of the most used kinds are:
1. Conditional control structures
2. Iterative control structures

### 1. Conditionals

Conditionals allow you to execute a block of statements only if a certain condition is met; the condition is an expression that evaluates to a Boolean value, `True` or `False`. The most used keywords are **<sup> 2 </sup>**:
* <code>if</code>: if the expression is true, the block of statements that follows is executed.
* <code>elif</code>: if the previous conditions are not met, another condition is tested.
* <code>else</code>: if none of the previous conditions is met, this block is executed.

#### Structure of a conditional:

In general, a block of code for a conditional must follow the following form:

```markdown
if <condition_1>:
    code block
elif <condition_2>:
    code block
...
else:
    code block
```

Python evaluates the logical expression `condition_1` and executes the first block of code if it is `True`; if not, it evaluates the following conditions in order until it reaches the first one that is `True` and executes the associated block of code. If no condition is `True`, it executes the block of code after `else`.

*Observations*
- An `if` conditional does not necessarily need `elif` or `else`
- Only one `else` can appear, and it must go at the end
- Code blocks must be indented consistently; the convention is 4 spaces
- The conditional block ends when the indentation returns to the previous level
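The structure can be sketched with a small biological example: classifying a nucleotide base as a purine or a pyrimidine. The variable names are illustrative.

```python
base = "G"  # try changing this value

if base in ("A", "G"):
    base_class = "purine"
elif base in ("C", "T", "U"):
    base_class = "pyrimidine"
else:
    base_class = "unknown"

# This line is back at the previous indentation level, so it always runs
print(f'{base} is classified as: {base_class}')
```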

For example, a conditional can decide whether to increase the value of `x` or, failing that, decrease it.

In [None]:
x = 1 # initial value (try changing this value)

# Condition
if x == 1:
    x = x + 1
else:
    x = x - 1

print(x) # final value

(_Note: The `==` operator is one of the comparison operators, also called relational operators. A common mistake is to use a single equal sign (`=`), which is the assignment operator, instead of the double one (`==`)._)

More than one condition can be checked in the same block via `elif`, for example:

In [None]:
x = 3 # initial value (try changing this value)

if x == 1:
    x = x + 1
elif x == 2:
    x = x - 1
else:
    x = 0

print(x) # final value

`if` statements can be nested as many levels as necessary, although deep nesting is not recommended as a software development practice. For example:

In [None]:
x = 2
y = 2
if x == 1:
    if y == 2:
        x = x + 1
        y = y + 1
elif x == 2:
    if y == 2:
        x = x - 1
        y = y - 1
else:
    x = 0
    y = 0

print(f'Expect x,y to be equal to 1. x={x} y={y}')

### 2. Iterations
Iterations, or loops, allow you to repeat a portion of the code as many times as necessary. Python includes only two loop statements **<sup> 3 </sup>**:
* <code>while</code>: repeats a block of code while the condition is true.
* <code>for</code>: iterates in order over each of the elements of a sequence or collection, be it a list, tuple, dictionary, set or string.



#### Structure of While:

In general, a `while` code block should have the following form:

```markdown
while <condition>:
    code block
```

*Remember that a `while` block has no closing keyword: it ends when the indentation returns to the previous level. The condition must eventually become `False`, otherwise the loop never ends.*

In [None]:
#### EXAMPLE 1
# Let's increase the value of x up to 10
x = 1
print('The initial value of x is:')
print(x)

while x < 10:
    print(f'The new value of x is {x}')
    x = x + 1

print('At the end of the loop, the value of x is:')
print(x)
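A `while` loop is also useful when the number of repetitions is not known in advance. The sketch below reads a made-up RNA string codon by codon and uses `break` to leave the loop when the stop codon `UAA` appears.

```python
rna = "AUGGCCUAAGGG"  # made-up sequence
position = 0
codons_read = []

# Keep reading while a complete codon (3 letters) is still available
while position + 3 <= len(rna):
    codon = rna[position:position + 3]
    if codon == "UAA":
        break  # leave the loop as soon as the stop codon is found
    codons_read.append(codon)
    position += 3

print(f'Codons read before the stop codon: {codons_read}')
print(f'The stop codon starts at index {position}')
```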

#### Structure of for:

In the same way, the `for` block follows a standard structure for its definition:

```markdown
for <var> in <sequence>:
    code block
```

*Remember: as with `while`, the body of the loop is delimited by indentation, so no closing keyword is needed.*

In [None]:
#### EXAMPLE 2
# Let's take as an example a list of pairs, each holding a codon and the amino acid it encodes:
genetic_code= [
    ["GUU", "V"], ["GUC", "V"], ["GUA", "V"],
    ["GUG", "V"], ["GCU", "A"], ["GCC", "A"],
    ["GCA", "A"], ["GCG", "A"], ["GAU", "D"],
    ["GAC", "D"], ["GAA", "E"], ["GAG", "E"],
    ["GGU", "G"], ["GGC", "G"], ["GGA", "G"],
    ["GGG", "G"], ["AGA", "R"], ["AGG", "R"],
    ["AGU", "S"], ["AGC", "S"], ["AAU", "N"],
    ["AAC", "N"], ["AAA", "K"], ["AAG", "K"],
    ["ACU", "T"], ["ACC", "T"], ["ACA", "T"],
    ["ACG", "T"], ["AUU", "I"], ["AUC", "I"],
    ["AUA", "I"], ["AUG", "M"], ["CGU", "R"],
    ["CGC", "R"], ["CGA", "R"], ["CGG", "R"],
    ["CCU", "P"], ["CCC", "P"], ["CCA", "P"],
    ["CCG", "P"], ["CAU", "H"], ["CAC", "H"]
]

# You can iterate through all the elements of the list genetic_code and print each codon and its amino acid:
for codon in genetic_code:
    print(f'The amino acid of {codon[0]} is {codon[1]}')
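A `for` loop works on any sequence, including strings. As a minimal sketch, the loop below counts the G and C nucleotides in a made-up DNA string; the variable names are illustrative.

```python
dna = "ATGGCCATTGTAATGGGCCGC"  # made-up sequence
gc_count = 0

# Each pass of the loop receives one character of the string
for nucleotide in dna:
    if nucleotide in ("G", "C"):
        gc_count += 1

gc_fraction = gc_count / len(dna)
print(f'G and C nucleotides: {gc_count} of {len(dna)}')
print(f'GC fraction: {gc_fraction:.2f}')
```

The same count can be cross-checked with the string method `.count()`: `dna.count("G") + dna.count("C")`.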

We can iterate over the `key-value` pairs of a dictionary using the `.items()` method.

In [None]:
#### EXAMPLE 3: Using for to loop through the elements of a dictionary
genetic_code = {
    "GUU": "V", "GUC": "V", "GUA": "V",
    "GUG": "V", "GCU": "A", "GCC": "A",
    "GCA": "A", "GCG": "A", "GAU": "D",
    "GAC": "D", "GAA": "E", "GAG": "E",
    "GGU": "G", "GGC": "G", "GGA": "G",
    "GGG": "G", "AGA": "R", "AGG": "R",
    "AGU": "S", "AGC": "S", "AAU": "N",
    "AAC": "N", "AAA": "K", "AAG": "K",
    "ACU": "T", "ACC": "T", "ACA": "T",
    "ACG": "T", "AUU": "I", "AUC": "I",
    "AUA": "I", "AUG": "M", "CGU": "R",
    "CGC": "R", "CGA": "R", "CGG": "R",
    "CCU": "P", "CCC": "P", "CCA": "P",
    "CCG": "P", "CAU": "H", "CAC": "H",
    "CAA": "Q", "CAG": "Q", "UUU": "F",
    "UUC": "F", "UUA": "L", "UUG": "L",
    "UCU": "S", "UCC": "S", "UCA": "S",
    "UCG": "S", "UAU": "Y", "UAC": "Y",
    "UAA": "STOP", "UAG": "STOP", "UGU": "C",
    "UGC": "C", "UGA": "STOP", "UGG": "W",
    "CUU": "L", "CUC": "L", "CUA": "L",
    "CUG": "L"}

# .items() yields each key (codon) together with its value (amino acid)
for codon, amino_acid in genetic_code.items():
    print(f"The amino acid of {codon} is {amino_acid}")

# Theory: Functions

Apart from the native Python expressions and features seen so far in the book, there are others that are essential when writing any computer program. One of them is **functions**: a function is a piece of code that can be reused, that has a name that identifies it, and that can receive input parameters **<sup> 4 </sup>**. A function does not always return the same result; its behavior depends on the values of its arguments.

When defining a function, it is important to keep in mind that:

1. It should have a name that explains, at first sight, the operation it performs and the result it produces.
2. Most functions receive one or more parameters necessary to perform the operation, although they may not receive any.
3. Functions usually return a result, although this is not always the case.

#### Structure of a function:

In general, a block of code for a function should follow the following form:

```markdown
def <function_name>(<parameters>):
    code block
    code block
    ...
    return <object>
```

Let's see an example.
The following function returns a value of type `str` if the codon is found in the dictionary; otherwise it returns `None`.

In [None]:
#### EXAMPLE 1: Checking whether a codon exists in the genetic code dictionary
# Dictionary of codons and the amino acids they encode:
genetic_code = {"GUU": "V", "GUC": "V", "GUA": "V", "GUG": "V", "GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A",
                "GAU": "D", "GAC": "D", "GAA": "E", "GAG": "E", "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",
                "AGA": "R", "AGG": "R", "AGU": "S", "AGC": "S", "AAU": "N", "AAC": "N", "AAA": "K", "AAG": "K",
                "ACU": "T", "ACC": "T", "ACA": "T", "ACG": "T", "AUU": "I", "AUC": "I", "AUA": "I", "AUG": "M",
                "CGU": "R", "CGC": "R", "CGA": "R", "CGG": "R", "CCU": "P", "CCC": "P", "CCA": "P", "CCG": "P",
                "CAU": "H", "CAC": "H", "CAA": "Q", "CAG": "Q", "UUU": "F", "UUC": "F", "UUA": "L", "UUG": "L",
                "UCU": "S", "UCC": "S", "UCA": "S", "UCG": "S", "UAU": "Y", "UAC": "Y", "UAA": "STOP", "UAG": "STOP",
                "UGU": "C", "UGC": "C", "UGA": "STOP", "UGG": "W", "CUU": "L", "CUC": "L", "CUA": "L", "CUG": "L"}


# Function that checks whether a codon is defined in the dictionary and returns its amino acid; if the codon is not found, the function returns None by default.
def codon_gen_code(codon):
    if codon in genetic_code:
        return genetic_code[codon]

# execute the function
print(f'The amino acid of GCG is: {codon_gen_code("GCG")}')

*Observations*:
* Every function definition must start with the keyword **def**, followed by the name, followed by the parameters in parentheses, and the line must end with a colon
    * The name of the function is: **codon_gen_code**
    * The parameter that the function receives is: **codon**
* The body of the function is indented four spaces.
* The block ends when the indentation returns to the previous level
* Once the function has been created, it can be called with its name followed by parentheses, passing the arguments it needs inside them.
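A function whose `return` statement is never reached gives back `None`. The sketch below shows this with a hypothetical helper `lookup` and a two-entry dictionary `sample_code`, both introduced only for illustration.

```python
sample_code = {"AUG": "M", "UAA": "STOP"}  # small illustrative subset

def lookup(codon):
    # The return statement only runs when the codon is a key of the dictionary
    if codon in sample_code:
        return sample_code[codon]

found = lookup("AUG")
missing = lookup("YYY")

print(found)
print(missing)
print(missing is None)
```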

### Parameters of a function

A function can receive more than one parameter. In the previous example the function received only one, but we can add more. For example, let's modify the function so that it also receives the codon dictionary:

In [None]:
#### EXAMPLE 2

# Function that checks whether a codon is defined in the dictionary and returns its amino acid; otherwise it returns None.
# Notice that this function takes two parameters
def codon_gen_code(codon_types, name):
    if name in codon_types:
        return codon_types[name]

# Way to execute the previously created function
codon_gen_code(genetic_code, "ACC")

Now the function no longer depends on a variable defined outside it. This is because functions have their own execution context: the dictionary is received as the first argument when the function is called.

About the parameters it is important to highlight that:

* They must have clear and legible names that complement what the function does
* They can be of any type. In the previous example the first parameter is a dictionary while `name` is a string
* They can have default values, let's see an example below.

In [None]:
#### EXAMPLE 3

# Default parameter name='AAU'
def codon_gen_code(codon_types, name='AAU'):
    if name in codon_types:
        return codon_types[name]

# It is not necessary to pass name as an argument, since it has a default value in the function definition.
codon_gen_code(genetic_code)
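Arguments can be passed by position or by name. The sketch below repeats the function from Example 3 so that it is self-contained, and calls it with a small illustrative dictionary, `sample`, that is not part of the course data.

```python
def codon_gen_code(codon_types, name='AAU'):
    if name in codon_types:
        return codon_types[name]

sample = {"AAU": "N", "ACC": "T"}  # small illustrative subset

# Only the dictionary is passed, so name takes its default value 'AAU'
default_result = codon_gen_code(sample)

# Positional arguments are matched to the parameters in order
positional_result = codon_gen_code(sample, "ACC")

# Keyword arguments are matched by name, so their order does not matter
keyword_result = codon_gen_code(name="ACC", codon_types=sample)

print(default_result, positional_result, keyword_result)
```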

Now let's write a more general function that finds the first protein encoded in an RNA sequence.

In [None]:
#### EXAMPLE 4: Function that is responsible for finding a protein from RNA and an initial codon.

import requests

# Define the document URL
url = "https://raw.githubusercontent.com/ramirezlab/CHEMO/main/1_PART_ONE/data/sec_CYP2C9.fasta"

# Make an HTTP GET request to the URL
response = requests.get(url)

# Stop with an exception if the download failed (4xx or 5xx status code)
response.raise_for_status()

# Get the content of the file from the response
sec_CYP2C9 = response.text

# Drop the first line of the file, join the remaining lines and transcribe the DNA into RNA
DNA_CYP2C9 = ''.join(sec_CYP2C9.split('\n')[1:])
RNA_CYP2C9 = DNA_CYP2C9.replace("T", "U")

def rna_a_protein(rna, start_codon='AUG'):
    # Search for the start codon; .find() returns -1 when it is absent
    start = rna.find(start_codon)
    if start == -1:
        print(f'Start codon {start_codon} not found')
        return None

    # Trim the sequence so that it begins at the start codon
    rna = rna[start:]
    protein = []

    # Translate codon by codon (steps of 3) until a STOP codon is reached
    for i in range(0, len(rna) - 2, 3):
        amino_acid = genetic_code[rna[i:i + 3]]
        if amino_acid == 'STOP':
            print('>> Protein found')
            protein_text = ''.join(protein)
            print(f'Protein: {protein_text}')
            return protein_text
        protein.append(amino_acid)

    # The end of the sequence was reached without finding a STOP codon
    print('No STOP codon found')
    return None

# We call the function with the necessary arguments
protein_CYP2C9 = rna_a_protein(RNA_CYP2C9, "AUG")
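Because the example above depends on a network download, a small offline check can help when testing the idea. The following is a sketch under stated assumptions: `translate_fragment` is a hypothetical helper, `codon_subset` contains only four codons, and the DNA fragment is made up.

```python
codon_subset = {"AUG": "M", "GCC": "A", "AUU": "I", "UAA": "STOP"}  # illustrative subset

def translate_fragment(dna, table, start_codon="AUG"):
    """Return the protein between the first start codon and the next STOP, or None."""
    rna = dna.replace("T", "U")   # transcribe DNA into RNA
    start = rna.find(start_codon)
    if start == -1:
        return None               # no start codon in the fragment
    amino_acids = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = table[rna[i:i + 3]]
        if amino_acid == "STOP":
            return "".join(amino_acids)
        amino_acids.append(amino_acid)
    return None                   # no STOP codon found

fragment_protein = translate_fragment("CCATGGCCATTTAAGG", codon_subset)
no_start = translate_fragment("CCCC", codon_subset)
no_stop = translate_fragment("ATGGCC", codon_subset)

print(fragment_protein, no_start, no_stop)
```

Passing the codon table as a parameter keeps the helper independent of any global variable, at the cost of one extra argument in each call.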

# References

1. Built-in types. (2023). Python documentation. https://docs.python.org/3/library/stdtypes.html
2. More control flow tools. (2023). Python documentation. https://docs.python.org/3/tutorial/controlflow.html
3. Compound statements. (2023). Python documentation. https://docs.python.org/3/reference/compound_stmts.html#while
4. Compound statements. (n.d.). Python documentation. https://docs.python.org/3/reference/compound_stmts.html#function-definitions