# Introduction to Python

In these classes we will learn the basics of the programming language Python.  
Python can be easily installed, along with many useful scientific libraries, with [Anaconda](https://www.anaconda.com/). Binaries are available for Windows, macOS, and Linux.  
In these classes we will make use of the [Jupyter Notebook](https://jupyter.org/) or [Jupyter Lab](https://github.com/jupyterlab/jupyterlab).

Once both Python and Jupyter are installed, retrieve the notebooks by cloning this [repository](https://github.com/batterio/introduction_to_python).

To start Jupyter, type in the terminal:  
    `jupyter notebook`  
or  
    `jupyter lab`

This will open a web page in your browser. Open the notebooks folder and double-click the introduction.ipynb file to open it.

The material can also be browsed on [nbviewer](https://nbviewer.jupyter.org/github/batterio/introduction_to_python/blob/master/notebooks/index.ipynb).

## Index

* [Run Python on the shell](#Run-Python-on-the-shell)
* [Syntax](#Syntax)
* [Simple operations](#Simple-operations)
* [Variables](#Variables)
* [Data structure: string](#Data-structure%3A-string)
* [Exercise 1](#Exercise-1)
* [Data structure: list](#Data-structure%3A-list)
* [Data structure: tuple](#Data-structure%3A-tuple)
* [Data structure: dictionary](#Data-structure%3A-dictionary)
* [Exercise 2](#Exercise-2)

## Run Python on the shell
[back to top](#Index)

We'll first run Python on the shell. By convention, text that begins with `#` is a comment and does not need to be typed.

    $> python
    
    Python 3.7.1 (default, Oct 22 2018, 11:21:55) 
    [GCC 8.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 
    >>> # This is a comment and it will not be executed
    >>>

To exit Python, type `exit()` or press the keys CTRL and D together.

## Syntax 
[back to top](#Index)

Unlike languages like C/C++ or Perl, which use braces to define blocks, Python uses line indentation to define a block. The number of spaces in the indentation is variable, but all statements within the block must be indented the same amount.

In [None]:
# The second block in this example will generate an error
if 1 < 10:
    print("1 is smaller than 10")
    print("10 is bigger than 1")
else:
    print("1 is bigger than 10")
   print("10 is smaller than 1")
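
For comparison, here is a version of the same example with consistent indentation, which runs without error. This is only a minimal sketch: the variable `message` is introduced here for illustration and is not part of the original example.

```python
# Every statement inside a block is indented by the same amount (four spaces here)
if 1 < 10:
    message = "1 is smaller than 10"
    print(message)
    print("10 is bigger than 1")
else:
    message = "1 is bigger than 10"
    print(message)
    print("10 is smaller than 1")
```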

## Simple operations
[back to top](#Index)

The following conventions are used in the code examples: blue is used for comments, green for the result of a command and red when an error is raised.

In [None]:
# We can use python as a calculator
3 + 4

In [None]:
# Division
8 / 2

In [None]:
# Product
4 * 2

In [None]:
# Power
4 ** 2

In [None]:
8 + 4 * 3

In [None]:
(8 + 4) * 3

Numbers can be:

* integer: **int()**
* floating point: **float()**
* complex: **complex()**

In [None]:
float(2)

In [None]:
int(3.2)

In [None]:
int('3')
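
As a small illustration of the three numeric types listed above, the sketch below inspects a few values with the built-in `type` function (covered in the next section). The variable names are chosen here only for illustration.

```python
# Each numeric literal already has a type
whole = 7            # integer
decimal = 7.5        # floating point
imaginary = 2 + 3j   # complex (j marks the imaginary part)

print(type(whole))
print(type(decimal))
print(type(imaginary))

# In Python 3, '/' always returns a float, even when the result is a whole number
quotient = 8 / 2
print(quotient)

# int() truncates towards zero instead of rounding
truncated = int(3.9)
print(truncated)
```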

## Variables
[back to top](#Index)

Defining variables in Python is very easy. Python variables don't have types, but their values do. So you can bind a variable to an integer at one point in your program and then rebind it to a string at another point.

In [None]:
# We can bind a variable to an integer...
a = 8
print(a)

In [None]:
a + 4

In [None]:
c = a + 4
print(c)

In [None]:
# ...or we can bind the same variable to a string...
a = "Hello"
print(a)

In [None]:
a + " World!"

In [None]:
# ...but we can't mix different types
4 + " is a number" 

In [None]:
"4" + " is a number" 

In [None]:
str(4) + " is a number" 

The command `type` shows the type of the object passed as argument.

In [None]:
n = 5
type(n)

In [None]:
s = "Hello"
type(s)

In [None]:
import sys
type(sys)

Another very useful command is `dir`, especially when you use Python in interactive mode. This command returns a list of the attributes of the object passed as argument.

In [None]:
x = "this is a string"
dir(x)

In Jupyter you can explore the attributes of an object by typing a dot after its name and pressing `TAB`.

In [None]:
x.

All the elements of this list are methods or attributes of the string type. Now, how can we know how to use a particular method? In this case it is very useful to use the `.__doc__` attribute or the `help` command.

Let's say we are interested in the methods `count` and `replace`. What we can do is:

In [None]:
help(x.count)

In [None]:
x.count("i")

In [None]:
help(x.replace)

In [None]:
x.replace("i", "X")
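
The `.__doc__` attribute mentioned above holds the same help text as a plain string. A minimal sketch, where the name `doc_text` is chosen only for illustration:

```python
x = "this is a string"

# Every documented function or method stores its help text in __doc__
doc_text = x.count.__doc__
print(doc_text)

# The same works for built-in functions
print(len.__doc__)
```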

`count` could be useful to calculate the GC content of a DNA sequence, while `replace` could be useful to change DNA to RNA (T -> U).
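
As a rough sketch of both ideas on a short example sequence: the names `dna`, `rna` and `gc_content` are illustrative, and `len` (introduced in the next section) returns the length of the sequence.

```python
dna = "gattaca"

# Transcribe DNA to RNA by replacing every 't' with 'u'
rna = dna.replace("t", "u")
print(rna)

# GC content: number of 'g' and 'c' divided by the sequence length
gc_content = (dna.count("g") + dna.count("c")) / len(dna)
print(gc_content)
```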

## Data structure: string
[back to top](#Index)

A string, by definition, is a sequence of characters, like "012345ABCDE". Python recognizes as a string everything that is delimited by quotation marks `" "` or `' '`.

In [None]:
dna1 = "gattaca"
dna2 = "acattag"
dna1 == dna2

In [None]:
dna1 != dna2

Strings have indices, so we can refer to every position of a string with its corresponding index. Keep in mind that Python, like many other languages, **starts counting from 0**!  
You can also select a slice of a string by giving an interval in which the first index is included but the last one is not: in `[n:m]`, `n` is included but `m` is not.

```
   0   1   2   3   4   5   6
 +---+---+---+---+---+---+---+
 | g | a | t | t | a | c | a |
 +---+---+---+---+---+---+---+
  -7  -6  -5  -4  -3  -2  -1
```

In [None]:
dna1[1]

In [None]:
dna1[1:3]

In [None]:
dna1[-1]
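
The half-open interval can be checked on the same sequence. A minimal sketch, with the helper names `first_three`, `head`, `tail` and `last_two` chosen only for illustration:

```python
dna1 = "gattaca"

# dna1[n:m] starts at index n and stops just before index m
first_three = dna1[0:3]
print(first_three)

# The length of a slice is m - n
print(len(dna1[2:5]))

# Omitting n or m means "from the start" or "to the end"
head = dna1[:3]
tail = dna1[3:]
print(head + tail == dna1)

# Negative indices count from the end
last_two = dna1[-2:]
print(last_two)
```

Because the end index is excluded, `dna1[:3]` and `dna1[3:]` split the string into two pieces with no overlap and no gap.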

In [None]:
len(dna1)

In [None]:
"c" in dna1

In [None]:
# 'find' and 'index' are two useful methods to extract an index from a string
help(dna1.index)

In [None]:
help(dna1.find)

In [None]:
dna1.index('f')

In [None]:
dna1.find('f')

In [None]:
dna1.find("ta")
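
The difference between the two methods shows up when the substring is missing. The sketch below uses `try`/`except`, which is not covered in this notebook, only to keep the example running after the error; the names `position` and `found` are illustrative.

```python
dna1 = "gattaca"

# find returns -1 when the substring is absent
position = dna1.find("f")
print(position)

# index raises a ValueError instead, which can be caught with try/except
try:
    dna1.index("f")
    found = True
except ValueError:
    found = False
print(found)

# When the substring is present, both return the index of its first occurrence
print(dna1.find("ta"), dna1.index("ta"))
```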

In [None]:
dna1.upper()   # The opposite is .lower()

In [None]:
dna1.count("a")

In [None]:
dna1.replace("a", "U")   # returns 'gUttUcU'

## Exercise 1
[back to top](#Index)

Try to write code that transforms the RNA sequence `"UUgGAagaGcuuACUUag"` to DNA and then calculates its GC content.

Tips:

* Assign the sequence to a variable
* Make all the nucleotides uppercase (or lowercase)
* Replace all the 'U' with 'T'
* Count the number of 'C' and 'G' and divide it by the length of the sequence

[Solution](solutions.ipynb#Exercise-1)

## Data structure: list
[back to top](#Index)

A list is an ordered collection of objects, not necessarily of the same type. The elements of a list are enclosed in square brackets `[` and `]`.

In [None]:
ecoRI = "gaattc"
bamHI = "ggatcc"
hindIII = "aagctt"
enzymes = [ecoRI, bamHI, hindIII]

print(enzymes) 

In [None]:
# A list can also contain another list
my_list = [100, 'bio', enzymes]

print(my_list)

In [None]:
# You can access an element of the list by its index
my_list[0]

In [None]:
# Remember that Python starts counting from 0, so index 3 is out of range here!
my_list[3]

In [None]:
# You can get the last element of the list with the index -1
my_list[-1]

In [None]:
my_list[-1] == my_list[2]

In [None]:
# You can find out how many elements are in the list with 'len'
len(my_list)

In [None]:
# With ':' you can select part of the list
my_list[1:3]

In [None]:
my_list[1:]

In [None]:
my_list[:]

In [None]:
type(my_list[0])

In [None]:
type(my_list[2])

In [None]:
# The command 'range' creates a sequence of numbers; 'list' turns it into a list
list(range(5))
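
`range` also accepts a start value and a step. In Python 3 it returns a lazy range object rather than a list, so the sketch below wraps each call in `list` to show the numbers; the variable names are chosen only for illustration.

```python
# range(stop) counts from 0 up to, but not including, stop
first_five = list(range(5))
print(first_five)

# range(start, stop) and range(start, stop, step) are also available
from_two = list(range(2, 6))
every_other = list(range(0, 10, 2))
print(from_two)
print(every_other)
```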

In [None]:
# The command 'split' returns a list of subsequences
seq = "atg-gct-tta"
seq.split("-")

In [None]:
# The command 'list' returns a list of all the characters of a string
my_list = list("atga")
my_list

In [None]:
# With the command 'append' you can add something to the end of a list
my_list.append('c')
my_list

In [None]:
# The command 'insert' adds something in a specific position of the list
my_list.insert(1, 'c')
my_list

In [None]:
# The command 'pop' removes and returns the element of the list at a given index
my_list.pop(1)

In [None]:
my_list

In [None]:
# The command 'sort' orders a list
my_list.sort()
my_list

In [None]:
# The command 'reverse' reverses the order of the list
my_list.reverse()
my_list

In [None]:
# It is possible to turn a list into a string with the command 'join'
my_string = "".join(my_list)
my_string
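
`join` is called on the separator string, so it can be seen as the inverse of `split`. A minimal sketch reusing the sequence from the `split` example above (the name `codons` is illustrative):

```python
seq = "atg-gct-tta"

# split breaks the string into a list of pieces...
codons = seq.split("-")
print(codons)

# ...and join glues them back together with the chosen separator
print("-".join(codons))
print("".join(codons))
```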

## Data structure: tuple 
[back to top](#Index)

A tuple is an immutable sequence, similar to a list that cannot be changed. The elements of a tuple are enclosed in parentheses `(` and `)`.

In [None]:
my_tuple = ("gttc", 5, [4, "a"])
my_tuple

In [None]:
type(my_tuple)

In [None]:
my_tuple[1] = 3
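
The assignment above fails because a tuple cannot be modified. The sketch below catches the error with `try`/`except` (not covered in this notebook) so that the example keeps running, and also shows that a mutable object stored inside a tuple can still change. The name `changed` is illustrative.

```python
my_tuple = ("gttc", 5, [4, "a"])

# Assigning to a position of a tuple raises a TypeError
try:
    my_tuple[1] = 3
    changed = True
except TypeError:
    changed = False
print(changed)

# The tuple itself cannot change, but a mutable object stored inside it can
my_tuple[2].append("b")
print(my_tuple)
```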

In [None]:
# The only public methods available on tuples are 'count' and 'index'
dir(my_tuple)

In [None]:
my_tuple.count("gttc")

In [None]:
my_tuple.index("gttc")

## Data structure: dictionary
[back to top](#Index)

The dictionary is one of the most useful tools in Python. It is an associative array: **{key: value}**.  
The elements of a dictionary are enclosed in curly brackets `{` and `}`. The key and the value of a dictionary are linked by `:`.  
A key must be an immutable (hashable) object, such as a string, a number or a tuple.
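
In practice, any immutable (hashable) object can serve as a key, while mutable objects such as lists cannot. A minimal sketch, where the dictionary `cut_sites` and its contents are made up for illustration and `try`/`except` is used only to keep the example running after the error:

```python
# Strings, numbers and tuples are immutable, so they can be used as keys
cut_sites = {"ecoRI": "gaattc", 42: "a number key", ("g", "c"): "a tuple key"}
print(cut_sites[("g", "c")])

# A list is mutable, so using it as a key raises a TypeError
try:
    cut_sites[["g", "c"]] = "a list key"
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(list_key_ok)
```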

In [None]:
my_dict = {'hindIII': 'aagctt', 'ecoRI': 'gaattc', 'bamHI': 'ggatcc'}
print(my_dict)

In [None]:
# 'keys' returns a view of all the keys of the dictionary
my_dict.keys()

In [None]:
# 'values' returns a view of all the values of the dictionary
my_dict.values()

In [None]:
# 'items' returns a view of the (key, value) pairs of the dictionary as tuples
my_dict.items()
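
In Python 3 these three methods return view objects rather than plain lists. The sketch below shows two common ways to use them; the `for` loop is not covered in this notebook and the name `enzyme_names` is illustrative.

```python
my_dict = {'hindIII': 'aagctt', 'ecoRI': 'gaattc', 'bamHI': 'ggatcc'}

# list() turns the view returned by keys() into a real list
enzyme_names = list(my_dict.keys())
print(enzyme_names)

# items() yields (key, value) pairs, handy in a for loop
for name, site in my_dict.items():
    print(name, "cuts at", site)

# sorted() returns the keys in alphabetical order
print(sorted(my_dict))
```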

In [None]:
# Given a key, you can quickly look up the corresponding value
my_dict['ecoRI']

In [None]:
# This is how you add an element to a dictionary...
my_dict['BglII'] = 'agatct'
print(my_dict)

In [None]:
# ...and this is how you delete one
del my_dict['bamHI']
my_dict.keys()

In [None]:
# What happens if you ask for a key that doesn't exist?
my_dict['Xho1']

In [None]:
# You can check whether the key exists with 'in', or ask for a default value with 'get'
'Xho1' in my_dict

In [None]:
my_dict.get('Xho1', "The key doesn't exist!")

## Exercise 2
[back to top](#Index)

Try to write code that reverses the sequence `'ACTCGAACGTGTGTCGTTCGGGATTACG'`

Tips:

* Assign the sequence to a variable
* Create a list with the characters of the sequence
* Reverse the list
* Concatenate the elements of the list with the **`join`** command to form a string

[Solution](solutions.ipynb#Exercise-2)