# Python tutorial
https://github.com/yoavram/SciComPy/blob/master/notebooks/python.ipynb
#### All credit to: Yoav Ram

# Hello Jupyter!

To execute code in a Jupyter notebook, press `Shift+Enter` (or `Shift+Return`) or `Control+Enter` (or `Command+Return`). The former executes the cell and advances to the next one; the latter executes the cell and stays on it.
You can also use the ▶️ button on the toolbar above.

In [None]:
print("Hello World!")

In [None]:
print("Welcome to Python!")

`print` is a built-in function. It can print text together with the result of a computation: in general, `print` accepts as many arguments as we want and separates them with spaces.

In [None]:
print("The product of 7 and 8 is", 7 * 8)
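
The separator and the line ending can both be changed. The short sketch below uses the `sep` and `end` keyword arguments of `print`; the example strings are made up for illustration.

```python
# Join the arguments with "-" instead of a space
print("2024", "01", "15", sep="-")

# End with "... " instead of a newline, so the next print continues the line
print("Loading", end="... ")
print("done")
```

By default `sep` is a single space and `end` is a newline, which is why each `print` call normally produces its own line.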

## Exercise: `print`

Print to the screen the following sentences:  

- "I love Python!"
- "7 + 6 = RESULT", replacing `RESULT` with the computation of 6+7
- "my name is NAME", replacing `NAME` with your name

In [None]:
# Your code here:


# Variables

A variable is a _name_ that references an _object_ in memory.
An object has a _value_ and a _type_.

To bind an _object_ to a _variable_, we use the _assignment_ operator `=`.

In [None]:
a = 5

Once a variable has been assigned, we can use its name to get its value:

In [None]:
print(a)

In [None]:
a + 7

# Getting help

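You can use the built-in `help()` function, or the `?` and `??` suffixes provided by IPython and Jupyter (`??` also shows the source code when it is available).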

In [None]:
help("int")

In [None]:
help(a)

In [None]:
a?

In [None]:
a??

Skipping ahead a bit, let's define a function.

In [None]:
def my_print(x):
    """Prints the argument x."""
    print(x)

In [None]:
my_print?

In [None]:
my_print??

# Types

These are the basic Python data types:

| Type | Description | Range | Use |
|--------|-----------|-------|--------|
| `int`  | Integers | unbounded (-∞ to ∞) | counting, indexing |
| `float` | Decimal fractions | limited precision, depends on machine | calculations |
| `complex` | Complex numbers | just two floats | complex calculations  |
| `str` | Strings | unicode | text, categories |
| `bool` | Booleans | `True` and `False` | boolean logic |

We can determine a variable's type using the `type` function.

In [None]:
type(a)

## `int`
The **`int`** type is for integers.

Note that in Python 3 integers have unlimited precision:

In [None]:
n = 13891783871827487875832758374287348205743285742386738476843768327683467432876284368236487283476847684376843768207185275128758785712853783275137587357138757
type(n)

but the larger the number the more memory it requires:

In [None]:
n.bit_length(), a.bit_length()

## `float`

**`float`** is for floating-point numbers (numbers with a decimal point), and is usually implemented using a C `double`:

In [None]:
x = 5.12312983
type(x)

To get info on float precision with this specific Python build, call (we'll learn about `import` later):

In [None]:
import sys
sys.float_info
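
Limited precision has practical consequences. The sketch below (variable names are illustrative) shows a classic rounding surprise, and how `math.isclose` from the standard library compares floats with a tolerance instead of exactly.

```python
import math

total = 0.1 + 0.2

print(total)                     # slightly different from 0.3
print(total == 0.3)              # exact comparison fails
print(math.isclose(total, 0.3))  # tolerance-based comparison succeeds
```

As a rule of thumb, avoid `==` when comparing the results of floating-point calculations.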

## `complex`

**`complex`** is for complex numbers, in which each component is a `float`. Note that the imaginary unit is denoted by `j` rather than `i`, a convention borrowed from electrical engineering.

In [None]:
1j ** 2

In [None]:
z = 4.5 + 3j
type(z)

In [None]:
z.real

In [None]:
type(z.imag)

We saw three numerical types: `int`, `float`, and `complex`. 

The standard library includes additional numeric types. The *fractions* module deals with rational fractions; the *decimal* module deals with floating-point numbers with user-definable precision.
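
As a quick illustration of these two modules, here is a minimal sketch (the variable names are ours): `Fraction` does exact rational arithmetic, and `Decimal` values built from strings avoid binary rounding errors.

```python
from decimal import Decimal
from fractions import Fraction

# Rational arithmetic is exact: 1/3 + 1/6 is exactly 1/2
third_plus_sixth = Fraction(1, 3) + Fraction(1, 6)
print(third_plus_sixth)

# Decimal numbers created from strings are stored exactly as written
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum)
print(decimal_sum == Decimal("0.3"))
```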

## `str`

**`str`** is for strings, used for both characters and text. 
We will deal with strings later.

## `bool`

Lastly, **`bool`** is for boolean variables that are either `True` or `False`:

In [None]:
value = True  # note the capital T
type(value)

# Variable names
* You *can't include spaces*. 
* In principle, you can use almost any Unicode letter (emoji and most other symbols are not allowed).
* You can override built-in names (for example `print`), but don't do it unless you have a good reason.
* The convention is to use *lowercase only* and separate words with *underscores*: `num_atoms`, `first_template`.
* For more details on Python style conventions, see [PEP8](https://www.python.org/dev/peps/pep-0008/), the Python style guide.

In [None]:
שם = 'יואב'
print(שם)

In [None]:
😀 = 'smiley'  # raises SyntaxError: emoji are not valid identifier characters
print(😀)

In [None]:
ᄚ = "HANGUL CHOSEONG RIEUL-HIEUH (U+111A)"
print(ᄚ)

In [None]:
# It gets even weirder
ª = 123  # FEMININE ORDINAL INDICATOR (U+00AA)
print(a) # The standard letter a
# Python applies NFKC normalization (https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization).

More details here:
<https://www.asmeurer.com/python-unicode-variable-names/>

To be clear: **don't do this**.
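
If you are curious, the normalization and the naming rules can be inspected directly. This is a small sketch using the standard `unicodedata` module and the `str.isidentifier` method; the test strings are just examples.

```python
import unicodedata

ordinal = "\u00aa"  # FEMININE ORDINAL INDICATOR, looks like a raised "a"
normalized = unicodedata.normalize("NFKC", ordinal)
print(normalized == "a")

# str.isidentifier reports whether a string is a valid Python name
print("num_atoms".isidentifier())
print("first template".isidentifier())  # contains a space
print("😀".isidentifier())              # emoji are not letters
```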

# Comments

Everything between a hash symbol `#` and the end of the line is a comment.

In [None]:
print("This will be printed")
# print("This will not be printed")
print("Another example") # of a comment 

**Tip:** In the notebook you can comment and uncomment complete lines by selecting them and pressing `Ctrl+/`.

# Operators

## Arithmetic operators

| Symbol | Operator          | Use    |
|--------|-------------------|--------|
| +      | Addition          | x + y  |
| -      | Subtraction       | x - y  |
| *      | Multiplication    | x * y  |
| **     | Power             | x ** y |
| /      | Decimal division  | x / y  |
| //     | Integer division  | x // y |
| %      | Integer remainder | x % y  |

This is fairly straightforward except maybe integer division `//` and power `**`.

In [None]:
a = 5
b = 2

Add:

In [None]:
a + b

Subtract:

In [None]:
a - b

Multiply:

In [None]:
a * b

Power:

In [None]:
a**b

Decimal division:

In [None]:
a / b

Integer division:

In [None]:
a // b

Remainder (modulo):

In [None]:
a % b
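
Integer division and remainder behave in a way that may be surprising for negative numbers: `//` rounds down (toward negative infinity), and the remainder takes the sign of the divisor. A short sketch with illustrative variable names:

```python
quotient = -7 // 2   # rounds down to -4, not toward zero to -3
remainder = -7 % 2   # takes the sign of the divisor: 1
print(quotient, remainder)

# The two results always satisfy x == (x // y) * y + (x % y)
print(quotient * 2 + remainder == -7)

# The built-in divmod returns both results at once
both = divmod(17, 5)
print(both)
```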

## Exercise: Pythagoras

Define two variables, `a` and `b`, and give them numeric values of your choice. 

Assume these are the lengths of the two legs of a right-angled triangle, and use arithmetic operators to calculate the length of the hypotenuse. 

Print out the result.  

Reminder: $c^2 = a^2 + b^2$  

In [None]:
# Your code here



**Hint**: to calculate the square root of a number `x`, use `x**0.5`.

## Comparison operators
These operators are used to compare values. They always return boolean values: `True` or `False`.

| Symbol | Operator          | Use    |
|--------|-------------------|--------|
| ==     | Equals            | x == y |
| !=     | Not equals        | x != y |
| <      | Smaller than      | x < y  |
| >      | Larger than       | x > y  |
| <=     | Smaller or equals | x <= y |
| >=     | Larger or equals  | x >= y |

In [None]:
a == b    # Note: '==', not '='

In [None]:
a > b

In [None]:
b > a

In [None]:
a != b

In [None]:
b < 5

In [None]:
b <= 5

For strings, comparison operators are based on **lexicographical order**.

In [None]:
food = 'Noodles'
drink = 'Ice Tea'
food == drink

In [None]:
food > drink

In [None]:
food <= drink

In [None]:
food == 'Noodles'
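
Lexicographical order compares strings character by character using each character's Unicode code point, so all uppercase ASCII letters come before all lowercase ones. A minimal sketch (the example words are ours):

```python
print(ord("Z"), ord("a"))   # the code points of the two characters

print("Zebra" < "apple")    # True: "Z" has a smaller code point than "a"
print("apple" < "apples")   # True: a prefix comes before the longer string

# For a case-insensitive comparison, normalize the case first
print("Zebra".lower() < "apple".lower())
```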

## Logical operators

| Keyword | Use     |
|---------|---------|
| and     | a and b |
| or      | a or b  |
| not     | not a   |

In [None]:
a > b and a != b

In [None]:
a != b and a < b

In [None]:
a != b or a < b

In [None]:
boolean = a > b
type(boolean)

In [None]:
boolean and b == 5

We can also think of logical operators as 2×2 truth tables or, alternatively, as Venn diagrams.

![logic_venn](https://raw.githubusercontent.com/yoavram/Py4Life/master/lec1_images/logic_venn.jpg)
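
The truth tables can also be generated in code. The sketch below skips ahead a bit and uses `for` loops and a list, both covered later; `rows` is just an illustrative name.

```python
rows = []
for p in (True, False):
    for q in (True, False):
        # Record p, q, and the results of the two binary operators
        rows.append((p, q, p and q, p or q))
        print(p, q, "| and:", p and q, "| or:", p or q)
```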

# Conditional statements
## `if` statements

The `if` statement allows us to condition the program flow on its data.

In [None]:
a = 10
b = 2

if a > b:
    print('Yes')

In [None]:
if a < b:
    print('Yes')

Notice the colon and the indented block. The syntax is always:

```py
if condition:  
    statement1
    statement2
    statement3
    ...
```

The condition does not need to be surrounded by round brackets `(...)`.

**Whitespace marks code blocks**: Only commands within the indented block are conditional. Other commands will be executed whether or not the condition is met. There are no curly brackets and no `end` command: unindenting closes the code block.

__Note__: the condition expression is always converted to a boolean -- if it's not already a boolean, it will be implicitly converted into one. 
The indented commands only occur if the boolean has a `True` value.
Therefore, we can use logical operators to create more complex conditions.
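
To see the implicit conversion in action, we can call `bool` ourselves. In this sketch (names are illustrative), zero, empty containers and `None` convert to `False`, and almost everything else converts to `True`.

```python
print(bool(0), bool(0.0), bool(""), bool([]), bool(None))  # all False
print(bool(-1), bool("0"), bool(" "), bool([0]))            # all True

name = ""
if name:  # an empty string counts as False
    greeting = "Hello " + name
else:
    greeting = "Hello stranger"
print(greeting)
```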

## `if` example: divisibility

Let's write a program that checks if a number is divisible by 17. Remember the modulo operator.

In [None]:
n = 442

if n % 17 == 0:
    print(n, 'is divisible by 17!')

print('End of program')

## `else` statements

We can add _else_ statements to perform commands in case the condition is __not__ met, or in other words, if the boolean is False.

![if else flow](https://raw.githubusercontent.com/yoavram/Py4Life/master/lec1_images/if_else_flow.jpg)

In [None]:
n = 586

if n % 17 == 0:
    print(n, 'is divisible by 17!')
else:
    print(n, 'is not divisible by 17!')
    
print('End of program')

## `elif` statements

When using _elif_ statements, multiple conditions are tested one by one. Once a condition is met, the corresponding indented commands are performed and the remaining conditions are skipped. If none of the conditions is `True`, the `else` block (if it exists) is executed.

In [None]:
n = 586

if n % 17 == 0:
    print(n, 'is divisible by 17!')
elif n % 2 == 0:
    print(n, 'is not divisible by 17, but it is even!')
else:
    print(n, 'is not divisible by 17, and it is odd!')
    
print('End of program')

## `match-case` statements

Similar to `switch` from other languages, but much more powerful. Available in Python 3.10 and later.

In [None]:
http_status = 404

match http_status:
    case 400:
        print("Bad request")
    case 404:
        print("Not found")
    case 418 | 419:
        print("418 or 419")
    case _:
        print("Something's wrong with the Internet")

More details in [this tutorial](https://peps.python.org/pep-0636/).

## Exercise: leap year

A leap year is a year that has 366 days (adding February 29th). A year is a leap year if it is divisible by 400, or divisible by 4 but not by 100. 

For example, 2012 and 2000 are leap years, but 1900 isn't. 

Test a year of your choice by using an appropriate statement and print the result.

## `while` loop

We use `while` loops to do something again and again, as long as a condition is met.  

![while](http://www.tutorialspoint.com/images/python_while_loop.jpg)

The syntax is very similar to that of `if` statement.

Let's count how many trials it takes to draw a random number greater than 90. 

In [None]:
from random import randint # we will get back to import later on

trials = 1
random_num = randint(1, 100)

while random_num <= 90:        # condition
    print(random_num)           # indented block
    random_num = randint(1,100) # indented block
    trials = trials + 1
print('Found a number greater than 90 (', random_num, ') after', trials, 'trials.')
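
Because the example above is random, its output changes on every run. For comparison, here is a deterministic sketch of the same pattern (the variable names are ours): it counts how many doublings are needed to exceed 1000.

```python
value = 1
doublings = 0

while value <= 1000:          # condition is checked before every iteration
    value = value * 2         # the loop body must eventually make the condition False
    doublings = doublings + 1

print(value, doublings)
```

If the body never changes the variables used in the condition, the loop runs forever; in Jupyter you can stop it by interrupting the kernel.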

## Exercise: Collatz Conjecture

The Collatz Conjecture (also known as the 3n+1 conjecture) states that the following process terminates for every natural number:

> If the number n is even, divide it by two; if it is odd, multiply it by 3 and add 1. Repeat this process until you get the number 1.

Write a program that checks whether the process reaches 1 for a number of your choice. Print every step of the process.

![CollatzXKCD](http://imgs.xkcd.com/comics/collatz_conjecture.png)

# Sequences

## Strings

Strings are ordered collections of _characters_. 

_Ordered collections_ means that elements are numbered with _indexes_: 0, 1, 2, 3, 4...  
Note that the first index is 0, __not__ 1!

We can create a new string using single or double quotes: `'` or `"`.

In [None]:
x = "Jupyter"
y = 'I love Python'
print(x)
print(y)

Strings are objects of type `str`:

In [None]:
type(x)

### Multiline strings

Multiline strings can be defined using `"""`:

In [None]:
cheeseshop_dialog = """DIALOG
Customer: 'Not much of a cheese shop really, is it?'
Shopkeeper: 'Finest in the district, sir.'
Customer: 'And what leads you to that conclusion?'
Shopkeeper: 'Well, it's so clean.'
Customer: 'It's certainly uncontaminated by cheese.'
"""
print(cheeseshop_dialog)
type(cheeseshop_dialog)

In [None]:
print(cheeseshop_dialog[6])

### Concatenation 
We can concatenate strings using the addition operator `+`.

In [None]:
print(x)

In [None]:
print(x + "2018")

### Conversion
We can convert strings to numbers and vice versa (when the conversion makes sense).

In [None]:
x = "4"
y = int(x)
print("y + 1 =", y + 1)

Without the conversion, we get an error message, because `x` is a string and cannot be added to a number:

In [None]:
print("x + 1 =", x + 1)

In [None]:
x = str(y)
print("x =", x)

In [None]:
x = "3.14"
y = float(x)
print("y * 2 =", y * 2)

### Strings as sequences

Strings are text but can represent other things, too. For example, DNA sequences.

Again, we can concatenate strings:

In [None]:
upstream = "AAA"
downstream = "GGG"
dna = upstream + "ATG" + downstream
print(dna)

We can find the length of a string using the built-in function `len`:

In [None]:
n = len(dna)
print("The length of the DNA variable is", n)

dna = dna + "AGCTGA"
print("Now it is", len(dna))

### Augmented assignment

We can use augmented assignment to make `dna = dna + x` into `dna += x`:

In [None]:
print(dna)
dna += "AGCTGA"
print(dna)

Augmented assignment also works with numbers and other operators:

In [None]:
x = 10
x *= 7
print(x)

### Access: Indexing

We can access specific characters (sequence items) in a string using square brackets `[i]`:

In [None]:
text = "A musician wakes from a terrible nightmare."

In [None]:
print(text[0])
print(text[5])

Python uses **zero-based** indexing: the first element has index 0.

In addition, there is also support for **reverse indexing** using negative numbers:

In [None]:
print(text[-1])
print(text[-4])

Here, the last element is accessed using -1 index, and so on.

### Access: Slicing
We can extract subsets of a string by using _slicing_, with the corresponding indexes.  
Remember: indexes start from **0**!

We can access specific indexes of the string (_starting from 0_):

In [None]:
# get the 1st and 6th letters
print(text[0])
print(text[5])

Indexes work from the tail as well, using negative indices:

In [None]:
# get the last letter
print(text[-1])
# get 5th letter from the end
print(text[-5])

We can get a range of indexes using _\[start:end\]_

In [None]:
# get the 3rd to 8th letters
print(text[2:8])

Notice that the _start_ position is included, but not the _end_ position. We actually take the characters with indexes 2,3,4,5,6,7.

There are shortcuts for slicing from the start of a string or up to its end:

In [None]:
# get the first 5 letters
print(text[0:5])
# or simply:
print(text[:5])

# get everything from the 4th letter to the end:
print(text[3:])

# last 3 letters
print(text[-3:])
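
Slices also accept an optional third value, the step: `[start:end:step]`. The sketch below reuses the sentence from above; the other variable names are illustrative.

```python
text = "A musician wakes from a terrible nightmare."

every_other = text[::2]            # every second character
reversed_text = text[::-1]         # a negative step walks backwards
word = text[2:10]                  # "musician"
word_backwards = text[9:1:-1]      # the same characters, in reverse order

print(every_other)
print(reversed_text)
print(word, word_backwards)

# Unlike indexing, slicing past the end is not an error
print(text[:1000] == text)
```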

### Exercise: String access

The sequence below (named _seq_) consists of 20 characters. 

1. Print the 2nd and 7th characters.
2. Print the 2nd character from the end.
3. Slice the first half of the sequence.  
4. Slice the second half of the sequence.  
5. Slice the middle 10 characters

In [None]:
seq = "CAAGTAATGGCAGCCATTAA"


### String formatting

There are three ways to do this:
1. Using `%` (the old way)
2. Using the `format` method (the newer way)
3. Using [f-strings](https://docs.python.org/3/reference/lexical_analysis.html#f-strings) (the newest way)

~~We'll mostly use the `format` method~~

The `format` method works on a string template, with placeholders marked by curly brackets (who said Python doesn't like curly brackets?).
The method's arguments fill the placeholders, in order:

In [None]:
message = "Hello {}, would you like {} or {} apples?"
message = message.format("Adam Price", 1, 2)
print(message)

We can also specify which argument replaces each placeholder using indices:

In [None]:
message = 'Hello {0}, my name is {1}, if your name is not {0}, please let me know'
message = message.format('Adam', 'Wendy')
print(message)

Finally, we can also use named placeholders and specify the values as keyword arguments:

In [None]:
message = 'Hello {guest}, my name is {host}, if your name is not {guest}, please let me know'
message = message.format(guest='Adam', host='Wendy')
print(message)

Format automatically handles numbers and other string conversions:

In [None]:
print("Snow White and the {} dwarfs".format(7))
print("Snow White and the {} dwarfs".format(7.0))
print("Snow White and the {} dwarfs".format(7+0j))

But we can specify how to convert numbers, if we want; for example, we can specify the number of decimal digits we want:

In [None]:
x = 7.0554332
print("Snow White and the {:.0f} dwarfs".format(x))
print("Snow White and the {:.4f} dwarfs".format(x))
print("Snow White and the {:.6f} dwarfs".format(x))

See all formatting options in the [docs](https://docs.python.org/3.6/library/string.html#format-string-syntax).
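
A few more format specifications that come up often are sketched below: field width and alignment, zero padding, thousands separators, and percentages. The values and variable names are made up for illustration.

```python
right = "[{:>10.2f}]".format(1234.5)   # right-aligned, width 10, 2 decimals
left = "[{:<10}]".format("left")       # left-aligned, width 10
centered = "[{:^10}]".format("center") # centered, width 10
padded = "{:05d}".format(42)           # zero-padded to 5 digits
grouped = "{:,}".format(1234567)       # thousands separator
percent = "{:.1%}".format(0.256)       # multiply by 100 and add a % sign

print(right, left, centered)
print(padded, grouped, percent)
```

The same specifications work after the colon in f-strings.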

### f-strings

Python f-strings do everything inline!

In [None]:
guest = 'Adam'
host = 'Wendy'
print(f'Hello {guest}, my name is {host}, if your name is not {guest}, please let me know')

You can embed any Python expression, and add formatting.

In [None]:
import math

print(f'2 * 3 = {2 * 3}.')
print(f'π with 6 decimal digits: {math.pi:.6f}.')

### Exercise: bottles of beer

Write a template and fill it with values to produce the following text:

```
3 bottles of beer on the wall, 3 bottles of beer.
Take one down, pass it around, 2 bottles of beer on the wall...
2 bottles of beer on the wall, 2 bottles of beer.
Take one down, pass it around, 1 bottles of beer on the wall...
1 bottles of beer on the wall, 1 bottles of beer.
Take one down, pass it around, 0 bottles of beer on the wall...
```

### String methods

Strings have many methods for text processing.

We can change a string to lowercase:

In [None]:
print(text)

In [None]:
text = text.lower()
print(text)

and to uppercase:

In [None]:
text = text.upper()
print(text)

We can replace characters:

In [None]:
dna = 'AAAATGGGGAGCTGAAGCTGA'
rna = dna.replace("T", "U")
print(rna)

#### Count
We can count characters. 

For example, let's count the number of histidine (`H`) and proline (`P`) residues in the [amino-acid](http://upload.wikimedia.org/wikipedia/commons/a/a9/Amino_Acids.svg) sequence of the [Human Insulin](http://www.uniprot.org/blast/?about=P01308) hormone:

In [None]:
insulin = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN'
print("# of histidine:", insulin.count('H'))
print("# of proline:", insulin.count('P'))

#### Find substrings
We can find a substring within a string.
For example, we can look for the character `D` in the insulin sequence.

In [None]:
pos = insulin.index('D')
print(pos)

In [None]:
type(pos)

In [None]:
print(insulin[pos])

The result is the index (position) of the first `D` found in the sequence.

We can also look for longer substrings, representing motifs. For example, let's find the position of the Insulin [B-chain](http://www.uniprot.org/blast/?about=P01308[25-54]) - a specific subsequence - in the entire protein sequence:

In [None]:
b_chain = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"
position = insulin.index(b_chain)
print("Position:", position)

In [None]:
print(len(b_chain))

In [None]:
found = insulin[position : position + len(b_chain)] # slicing (notice the ':')
print(b_chain == found)
print("Original:", b_chain)
print("Found:   ", found)
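
What if the substring is not there? `index` raises a `ValueError`, whereas the related `find` method returns `-1`. A minimal sketch, using the first part of the insulin sequence as a stand-in (`sequence` is our name for it):

```python
sequence = "MALWMRLLPLLALLALWGPDPAAAFVNQ"  # the beginning of the insulin sequence

print(sequence.find("D"))    # same result as sequence.index("D")
print(sequence.find("X"))    # -1, because "X" does not appear

# To only test for presence, use the in operator
print("GPD" in sequence)
print("X" in sequence)
```

Be careful when using the result of `find` as an index: `-1` is a valid index (the last character), so always check for it first.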

#### Split

We can split a string on every occurrence of a separator character:

In [None]:
names = "banana,ananas,potato,tomato"
foods = names.split(",")
print(foods)

What do we get?

In [None]:
type(foods)
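
The opposite of `split` is the `join` method, which is called on the separator string. A short sketch (the second example string is made up):

```python
names = "banana,ananas,potato,tomato"
foods = names.split(",")

rejoined = ",".join(foods)   # glue the pieces back together
print(rejoined == names)
print(" | ".join(foods))     # any string can serve as the separator

# Without an argument, split() splits on any run of whitespace
words = "  to be   or not  ".split()
print(words)
```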

## Lists

Lists are similar to strings in being sequential, only they can contain **any type of data**, not just characters. They are also mutable (we'll get back to that distinction).

Lists could even include mixed variable types.

We define a list just like any other variable, but use `[ ]` to surround the list elements and `,` to separate the elements.

In [None]:
# a list of strings
apes = ["Human", "Gorilla", "Chimpanzee"]
print(apes)

![Gorila](http://upload.wikimedia.org/wikipedia/commons/thumb/c/c0/Western_Lowland_Gorilla_at_Bronx_Zoo_2_cropped.jpg/338px-Western_Lowland_Gorilla_at_Bronx_Zoo_2_cropped.jpg)

In [None]:
# a list of numbers
nums = [7, 13, 2, 400]
print(nums)

In [None]:
# a mixed list
mixed = [12, 'Mouse', True]
print(mixed)

### Access

You can access list elements just like strings, using indexes (starting from 0):

In [None]:
print(apes[2])
print(apes[-3])

Lists are dynamic and mutable - you can append, remove and insert into them. This is done using _list methods_.

We can access and change list elements:

In [None]:
new_apes = apes[:] # make a copy of the apes list
new_apes[2] = 'Bonobobobo'
print(new_apes)
print(apes)

In [None]:
new_apes = apes # Notice the difference?
new_apes[2] = 'Bonobobobo'
print(new_apes)
print(apes)
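
The difference between the two cells above is that `apes[:]` creates a new list, whereas plain assignment only gives the same list a second name. The `is` operator tells the two situations apart. A self-contained sketch with illustrative names:

```python
original = ["Human", "Gorilla", "Chimpanzee"]
alias = original             # a second name bound to the same list object
shallow_copy = original[:]   # a new list object holding the same elements

print(alias is original)         # same object
print(shallow_copy is original)  # different objects...
print(shallow_copy == original)  # ...with equal contents

alias[2] = "Bonobo"    # changing the list through one name
print(original)        # is visible through the other name
print(shallow_copy)    # the copy is unaffected
```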

In [None]:
# What will this do?

l = [1, 2, 3, 4]
print(l)

l[::2] = ['AB'] * 2
print(l)

In [None]:
id?

In [None]:
# What will this do?

l = [1, 2, 3]
print(id(l))
l += [99, 100]
print(id(l))

print(l)

Changing an element by assigning to an index __does NOT__ work with strings, though...

In [None]:
print(dna)
dna[5] = 'G'

This is because strings are **immutable** whereas lists are **mutable**. We'll get back to this notion soon.
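
Since a string cannot be changed in place, the usual workaround is to build a new string, for example with slicing and concatenation. A minimal sketch on a short made-up sequence:

```python
dna = "AAAATGGGG"

# Replace the character at index 4 by gluing together the parts around it
mutated = dna[:4] + "C" + dna[5:]

print(dna)      # the original string is untouched
print(mutated)
```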

### List methods

Lists also have many methods. 
The most useful ones we'll see here make use of the fact that lists are **mutable**.

`append` adds an element to the end of the list:

In [None]:
apes.append("Macaco")
print(apes)

`insert` adds an element at a given index:

In [None]:
apes.insert(2, "Kofiko")
print(apes)

`remove` finds and deletes the first occurrence of an element from the list:

In [None]:
apes.remove("Human")
print(apes)

`pop` deletes an element from a list by its index and returns it:

In [None]:
print(apes.pop(3))
print(apes)

We can concatenate lists, just like strings, using addition:

In [None]:
apes += ["Orangutan", "Baboon"]
print(apes)

![Organutan](http://upload.wikimedia.org/wikipedia/commons/thumb/b/be/Orang_Utan%2C_Semenggok_Forest_Reserve%2C_Sarawak%2C_Borneo%2C_Malaysia.JPG/220px-Orang_Utan%2C_Semenggok_Forest_Reserve%2C_Sarawak%2C_Borneo%2C_Malaysia.JPG)

Searching in lists is done using `index`:

In [None]:
i = apes.index('Orangutan')
print(apes)
print(i, apes[i])

### EAFP vs. LBYL

If the value is not found a `ValueError` is raised.
We can catch the error with a try-except block.
This idiom is called *EAFP* - easier to ask for forgiveness than permission.
It is based on a quote by [Admiral Grace Hopper](https://en.wikiquote.org/wiki/Grace_Hopper), the famous computer scientist.

![Grace Hopper](https://upload.wikimedia.org/wikipedia/commons/thumb/a/ad/Commodore_Grace_M._Hopper%2C_USN_%28covered%29.jpg/192px-Commodore_Grace_M._Hopper%2C_USN_%28covered%29.jpg)

In [None]:
print(apes)

In [None]:
search_item = 'Kofiko'

try:
    i = apes.index(search_item)
except ValueError as e:
    print('Exception caught!', e)
else:
    print(f'{search_item} in index {i}')
finally:
    print('This always happens')
    

You can also check if something is in a list before accessing it; this is called *LBYL* - look before you leap.

In [None]:
if 'Panda' in apes:
    i = apes.index('Panda')
    print('Panda in index', i)
else:
    print('Panda not in apes list')

Although exceptions are somewhat less efficient than `if` in terms of performance, the former example does only a single lookup (just `index`, no `in` test) and, moreover, it is safe in multi-threaded applications.
In the latter example, a different thread could in principle change the list between the test (`in`) and the lookup (`index`).

### Sorting lists
  
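We can sort lists using the built-in `sorted` function, which returns a new sorted list, or the list method `sort`, which sorts the list in place.  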
If the list is made __entirely__ of strings, then sorting is straightforward -- it will be sorted lexicographically (think about the way '<' and '>' work on strings).

In [None]:
sorted_apes = sorted(apes)
print(apes)
print(sorted_apes)

In [None]:
l = [1, 2, 4, 2]
print(l)
l.sort()
print(l)

But beware of mixed lists!

In [None]:
mixed = apes + [1, 2, 3]
print(mixed)
print(sorted(mixed))

# How would you fix this?
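
One possible fix, offered as a sketch rather than the only answer: pass a `key` function to `sorted`, so that items are compared through a common representation. Here every item is compared by its string form; the list is a made-up example.

```python
mixed = ["Gorilla", 2, "Baboon", 1, 3]

# key=str compares items by str(item); the items themselves are not changed
by_text = sorted(mixed, key=str)
print(by_text)
```

Note that this orders numbers as text, so `10` would come before `2`. Whether that is acceptable depends on what the mixed list is supposed to mean.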

### Access: Slicing
  
We can slice lists just like we did with strings, to get partial lists.  
For example:

In [None]:
import random
 
# Generate a list of random numbers in [0, 10)
measurements = random.choices(range(10), k=30)
print(measurements)

# get the first 10 measurements
print(measurements[:10])
# get the last 3 measurements
print(measurements[-3:])

### Exercise: Lists

- Use the lists `birds` and `snakes` to create a single list of strings with the animal names. 
- Add the string `Mus musculus` to the list. 
- Remove the `Corvus corone` from the list. 
- Print the 2nd to 5th elements of the resulting list, sorted alphabetically.

In [None]:
birds = ['Gallus gallus', 'Corvus corone', 'Passer domesticus']
snakes = ['Ophiophagus hannah', 'Vipera palaestinae', 'Python bivittatus']



## `for` loops

Say we want to print each element of our list:

Python’s `for` loop syntax allows us to iterate over the elements of a `list`, or any `iterable` value. Python's `for` is similar to the `foreach` statement in other languages, rather than `for(i=0; i<n; i++)`:

```py
for loop_variable in iterable:
    statement1
    statement2
    statement3
    ...
```

In [None]:
for ape in apes:
    print(ape, "is an ape")

![Python loop](http://2.bp.blogspot.com/-7lXe1_Gou3k/UX92PWche3I/AAAAAAAAAFA/JxD4u8St-9g/s1600/python+loop.jpg)

### Looping over strings

Let's go over the Insulin AA sequence and count the number of prolines manually. Reminder: `insulin` is a `str`, not a `list`.

In [None]:
print(insulin)

count = 0
for aa in insulin:
    count += (aa == "P")
print("# of prolines:", count)

Do you remember another way of doing this?

### Exercise: string loop

Complete the code below to count the _ratio_ of electrically-charged amino acids in the Insulin sequence.

In [None]:
charged = ['R','H','K','D','E']
insulin = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN'



print(f'Ratio of charged amino acids is: {charged_ratio:.3}')

In [None]:
# Spoiler space: possible solutions follow in the cells below.

In [None]:
charged = ['R','H','K','D','E']
insulin = 'MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN'

n = 0
for c in insulin:
    n += c in charged
        
charged_ratio = n / len(insulin)
        
print(f'Ratio of charged amino acids is: {charged_ratio:.3}')

In [None]:
charged_ratio = sum([insulin.count(c) for c in charged]) / len(insulin)

print(f'Ratio of charged amino acids is: {charged_ratio:.3}')

In [None]:
import re

print(re.findall(f"[{''.join(charged)}]", insulin))

In [None]:
import re

charged_ratio = len(re.findall(f"[{''.join(charged)}]", insulin)) / len(insulin)

print(f'Ratio of charged amino acids is: {charged_ratio:.3}')

In [None]:
from collections import Counter

count = Counter(insulin)
charged_ratio = sum([count[c] for c in charged]) / len(insulin)

print(f'Ratio of charged amino acids is: {charged_ratio:.3}')

### `range`

Sometimes we want to loop over consecutive numbers.

This is accomplished using the `range` function.

`range` accepts one, two, or three arguments: the lower and upper limits and the step size.  
The lower limit can be omitted (the default is zero), and the step can be omitted, too (the default is one).
The upper limit is __not__ included.

In [None]:
for i in range(10): # == range(0, 10, 1)
    print(i)

In [None]:
for i in range(100, 1000, 10):
    print(i, end=' ')

We can turn the range into a list -- so what is `range`?

In [None]:
print(list(range(10)))
print(type(range(10)))
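
A `range` is its own type: it does not store its numbers but computes them on demand, so even a huge range is created instantly. It still supports `len`, indexing and `in`. A small sketch with illustrative names:

```python
big = range(10**12)   # no list of a trillion numbers is built

print(len(big))
print(big[5], big[-1])
print(999 in big)

evens = list(range(0, 20, 2))
countdown = list(range(10, 0, -3))   # a negative step counts down
print(evens)
print(countdown)
```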

We can also use `range` to loop on the indices of a list instead of the elements themselves.
This is useful in some cases.

In [None]:
for i in range(len(apes)):
    print(apes[i])

### `enumerate`

Another elegant way to iterate over lists is with the `enumerate` function. `enumerate` provides two loop variables for every item in the list -- the index and the element:

In [None]:
for i, ape in enumerate(apes):
    print("The ape at index", i, "is", ape)

In [None]:
for x in enumerate(apes):
    print(x)
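
As the last cell shows, `enumerate` produces `(index, element)` pairs. It also accepts an optional `start` argument, which is handy for human-friendly numbering. A sketch with a short list of our own:

```python
apes = ["Human", "Gorilla", "Chimpanzee"]

for rank, ape in enumerate(apes, start=1):   # count from 1 instead of 0
    print(f"{rank}. {ape}")

numbered = list(enumerate(apes, start=1))
print(numbered)
```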

### Exercise: primality check

Implement a simple primality check for the variable `n=97` (or some other value of your choice).

For each number `k` between 2 and `n - 1` (or some other range if you prefer), check if `k` is a divisor of `n` (using the modulo operation, right?).
If `k` is a divisor you can break the loop using `break`.

**Note** `for` can have an `else` clause that will be executed if we exited the `for` normally, without a `break` or an exception.
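
To illustrate `for`...`else` on a different task, the sketch below searches a made-up list for a negative value. The `else` block runs only if the loop finished without hitting `break`.

```python
measurements = [4, 8, 15, 16, 23, 42]

for value in measurements:
    if value < 0:
        result = f"found a negative value: {value}"
        break
else:
    # Reached only when the loop was not interrupted by break
    result = "no negative values found"

print(result)
```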

In [None]:
n = 97 # try other numbers



## Tuples

[Tuples](https://docs.python.org/3.5/tutorial/datastructures.html#tuples-and-sequences) are another data structure for sequential data. They, too, can contain any type and mixed types. The main difference between tuples and lists is that tuples are **immutable**.

Tuples are denoted by round brackets `()`:

In [None]:
t = (15, 76, 'a')
print(t)
type(t)

Tuples are commonly packed and unpacked in Python:

In [None]:
a, b, c = t # unpacking
print('a:', a, 'b:', b, 'c:', c)
t = a, b # packing
print(t)
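
Packing and unpacking enable two common idioms, sketched below with illustrative names: swapping variables without a temporary one, and collecting "the rest" of a sequence with a starred name.

```python
a, b = 1, 2
a, b = b, a   # the right-hand side is packed into a tuple, then unpacked
print(a, b)

first, *rest = (10, 20, 30, 40)   # the starred name collects the leftovers in a list
print(first, rest)
```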

You can also create empty and singleton tuples:

In [None]:
t0 = ()
type(t0)

In [None]:
t1 = (5,) # notice the comma
type(t1)

# Dictionaries

**Dictionaries** are hashtables or maps: a data structure used to store collections of elements to be accessed with a _key_.
Keys can be of any _immutable_ (more precisely, hashable) type - strings, integers, floats, tuples, etc.
Each key refers to a single _value_.

In [None]:
d = {1: 'adas', 'ewer': 231, (5, 7): True}
print(d)
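
The restriction on keys is easy to check. In the sketch below (the `grid` dictionary is a made-up example), a tuple works as a key, whereas a list, being mutable, raises a `TypeError`.

```python
grid = {}
grid[(0, 1)] = "tuple keys work"

try:
    grid[[0, 1]] = "list keys do not"
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

print(grid)
```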

In [None]:
taxonomy = {
    'Pan troglodytes': 'Mammalia', 
    'Gallus gallus': 'Aves', 
    'Xenopus laevis': 'Amphibia', 
    'Vipera palaestinae': 'Reptilia'
}

In this dictionary, the _keys_ are the organisms and the _values_ are the taxonomic classification of each organism. Both are of type `str`.

### Access
Accessing a dictionary record is similar to what we did with lists, only this time we'll use a _key_ instead of an _index_:

In [None]:
print(taxonomy['Pan troglodytes'])
print(taxonomy['Gallus gallus'])
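
Accessing a key that is not in the dictionary with square brackets raises a `KeyError`. The `get` method is a gentler alternative: it returns `None`, or a default value of our choice, when the key is missing. A sketch on a small dictionary of our own:

```python
taxonomy = {"Pan troglodytes": "Mammalia", "Gallus gallus": "Aves"}

known = taxonomy.get("Gallus gallus")
missing = taxonomy.get("Bos taurus")              # no such key: returns None
fallback = taxonomy.get("Bos taurus", "unknown")  # or a default we provide

print(known, missing, fallback)
```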

### Changing and adding records
We can change the dictionary by simply assigning a new value to a key.

In [None]:
taxonomy['Pan troglodytes'] = 'Mammals'
print(taxonomy['Pan troglodytes'])

Similarly, we can use this syntax to add new records: 

In [None]:
taxonomy['Danio rerio'] = 'Actinopterygii'
print(taxonomy['Danio rerio'])

__Note 1__: The fact that we can change elements of the dictionary and dynamically add more elements suggests that `dict` is a **mutable** type.

__Note 2__: A dictionary may not contain multiple records with the same _key_, but it may contain many keys with the same _value_.

### Looping over dictionaries

By default, `for` loops over the dictionary keys:

In [None]:
for key in taxonomy:
    print(f'{key}: {taxonomy[key]}')

In [None]:
for key, value in taxonomy.items():
    print(f'{key}: {value}')

### Dictionaries as containers
We can check if a dictionary contains a *key* using the `in` operator:

In [None]:
'Vipera palaestinae' in taxonomy

In [None]:
'Bos taurus' in taxonomy

### Exercise: secret

Given in the code below is a dictionary (named `code`) where the keys represent encrypted characters and the values are the corresponding decrypted characters. Use the dictionary to decrypt an encrypted message (named `secret`) and print out the resulting cleartext message.

In [None]:
secret = """Mq osakk le eh ue usq qhp, mq osakk xzlsu zh Xcahgq,
mq osakk xzlsu eh usq oqao ahp egqaho,
mq osakk xzlsu mzus lcemzhl gehxzpqhgq ahp lcemzhl oucqhlus zh usq azc, mq osakk pqxqhp ebc Zokahp, msauqjqc usq geou dat rq,
mq osakk xzlsu eh usq rqagsqo,
mq osakk xzlsu eh usq kahpzhl lcebhpo,
mq osakk xzlsu zh usq xzqkpo ahp zh usq oucqquo,
mq osakk xzlsu zh usq szkko;
mq osakk hqjqc obccqhpqc, ahp qjqh zx, mszgs Z pe heu xec a dedqhu rqkzqjq, uszo Zokahp ec a kaclq iacu ex zu mqcq obrfblauqp ahp ouacjzhl, usqh ebc Qdizcq rqtehp usq oqao, acdqp ahp lbacpqp rt usq Rczuzos Xkqqu, mebkp gacct eh usq oucbllkq, bhuzk, zh Lep’o leep uzdq, usq Hqm Meckp, mzus akk zuo iemqc ahp dzlsu, ouqio xecus ue usq cqogbq ahp usq kzrqcauzeh ex usq ekp."""

code = {'w': 'x', 'L': 'G', 'c': 'r', 'x': 'f', 'G': 'C', 'E': 'O', 'h': 'n', 'O': 'S', 'y': 'q', 'R': 'B', 'd': 'm', 'f': 'j', 'i': 'p', 'o': 's', 'g': 'c', 'a': 'a', 'u': 't', 'k': 'l', 'q': 'e', 'r': 'b', 'V': 'Z', 'X': 'F', 'N': 'K', 'B': 'U', 'T': 'Y', 'M': 'W', 'U': 'T', 'm': 'w', 'C': 'R', 'J': 'V', 't': 'y', 'S': 'H', 'v': 'z', 'e': 'o', 'D': 'M', 'p': 'd', 'K': 'L', 'A': 'A', 'P': 'D', 'l': 'g', 's': 'h', 'W': 'X', 'H': 'N', 'j': 'v', 'z': 'i', 'I': 'P', 'b': 'u', 'Z': 'I', 'F': 'J', 'Y': 'Q', 'Q': 'E', 'n': 'k'}



# Sets

A [set](https://docs.python.org/3.5/tutorial/datastructures.html#sets) is an **unordered collection** with **unique elements**, similar to the mathematical concept of a [set](https://en.wikipedia.org/wiki/Set_%28mathematics%29).

Curly braces (`{}`) or the `set()` function can be used to create sets. 

In [None]:
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
print(basket) # duplicates have been removed
type(basket)

Basic uses include eliminating duplicate entries (as above, one apple and one orange were eliminated), and fast membership testing:

In [None]:
print('orange' in basket)
print('crabgrass' in basket)

Set objects also support set-theoretical operations like union, intersection, difference, and symmetric difference.

In [None]:
a = set('abracadabra')
b = set('alacazam')
print(a)
print(b)
type(b)

Letters in `a` but not in `b`:

In [None]:
a - b

Letters in either `a` or `b`:

In [None]:
a | b

Letters in both `a` and `b`:

In [None]:
a & b

Letters in `a` or `b` but not both:

In [None]:
a ^ b
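
Each of the four operators above also has a method form. The sketch below checks that the two forms agree on the same `a` and `b`; one practical difference is that the methods accept any iterable, whereas the operators require both operands to be sets.

```python
a = set('abracadabra')
b = set('alacazam')

print(a.difference(b) == a - b)
print(a.union(b) == a | b)
print(a.intersection(b) == a & b)
print(a.symmetric_difference(b) == a ^ b)

# The method form accepts a plain string; the operator would need set('xyz')
print(a.union('xyz') == a | set('xyz'))
```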

To create an empty set you have to use `set()`, not `{}`; the latter creates an empty dictionary.

In [None]:
Ø = set()
print(Ø)
type(Ø)
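
A quick check of the difference between the two spellings (the variable names are illustrative only):

```python
empty_dict = {}     # empty braces create a dictionary
empty_set = set()   # the only way to spell an empty set

print(type(empty_dict))
print(type(empty_set))
print(len(empty_set))
```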

Note that a `set` is mutable:

In [None]:
print(a)
a.add('z')
print(a)
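
Besides `add`, sets can also shrink. A minimal sketch with a made-up set (`letters` is a hypothetical name) shows `discard`, which silently ignores a missing element, and `remove`, which raises `KeyError` for one:

```python
letters = set('abc')

letters.add('z')      # new element
letters.add('a')      # already present, so nothing changes
letters.discard('q')  # absent, and no error is raised
letters.remove('b')   # present; remove() would raise KeyError if it were absent

print(sorted(letters))
```

The result is passed through `sorted()` because a set has no defined order when printed.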

## `frozenset`

There is also an immutable set, called `frozenset`:

In [None]:
a = frozenset('abracadabra')
print(type(a), a)
a.add('z')  # raises AttributeError: a frozenset cannot be changed
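
The `add` call above fails because a `frozenset` has no methods that modify it. The sketch below catches that error and then shows one thing immutability allows: a `frozenset` is hashable, so it can serve as a dictionary key, which a regular `set` cannot. The names `frozen` and `labels` are illustrative only.

```python
frozen = frozenset('abracadabra')

try:
    frozen.add('z')
except AttributeError as error:
    print('Cannot modify a frozenset:', error)

# A frozenset can be used as a dictionary key
labels = {frozen: 'letters of abracadabra'}

# Any frozenset with the same elements finds the same record
print(labels[frozenset('abrcd')])
```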

# Functions

We _define_ functions with the `def` statement.
The general syntax is:

```py
def function_name(input1, input2, input3,...):
    # some processes
    .
    .
    .
    return output1, output2, ...
```

For example:

In [None]:
def multiply(x, y):
    z = x * y
    return z

In [None]:
x = 3
y = multiply(x, 2)
print(y)

In [None]:
z = multiply(7, 5)
print(z)
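
The general syntax above allows several outputs. As a sketch (the function `divide_with_remainder` is a made-up example), a function that returns two values actually returns a single tuple, which the caller can unpack into separate variables:

```python
def divide_with_remainder(x, y):
    quotient = x // y   # integer division
    remainder = x % y   # what is left over
    return quotient, remainder  # packed into one tuple

q, r = divide_with_remainder(17, 5)  # unpack the tuple
print(q, r)

result = divide_with_remainder(17, 5)  # or keep it as a tuple
print(type(result))
```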

## Exercise: secret (2)

Let's turn the code from the last exercise into a function.
Write a function called `decrypt` that takes two arguments, `secret` and `code`, and returns a string which is the cleartext (decrypted) message. Then call the function to decrypt the secret from above.

In [None]:
secret = """Mq osakk le eh ue usq qhp, mq osakk xzlsu zh Xcahgq,
mq osakk xzlsu eh usq oqao ahp egqaho,
mq osakk xzlsu mzus lcemzhl gehxzpqhgq ahp lcemzhl oucqhlus zh usq azc, mq osakk pqxqhp ebc Zokahp, msauqjqc usq geou dat rq,
mq osakk xzlsu eh usq rqagsqo,
mq osakk xzlsu eh usq kahpzhl lcebhpo,
mq osakk xzlsu zh usq xzqkpo ahp zh usq oucqquo,
mq osakk xzlsu zh usq szkko;
mq osakk hqjqc obccqhpqc, ahp qjqh zx, mszgs Z pe heu xec a dedqhu rqkzqjq, uszo Zokahp ec a kaclq iacu ex zu mqcq obrfblauqp ahp ouacjzhl, usqh ebc Qdizcq rqtehp usq oqao, acdqp ahp lbacpqp rt usq Rczuzos Xkqqu, mebkp gacct eh usq oucbllkq, bhuzk, zh Lep’o leep uzdq, usq Hqm Meckp, mzus akk zuo iemqc ahp dzlsu, ouqio xecus ue usq cqogbq ahp usq kzrqcauzeh ex usq ekp."""

code = {'w': 'x', 'L': 'G', 'c': 'r', 'x': 'f', 'G': 'C', 'E': 'O', 'h': 'n', 'O': 'S', 'y': 'q', 'R': 'B', 'd': 'm', 'f': 'j', 'i': 'p', 'o': 's', 'g': 'c', 'a': 'a', 'u': 't', 'k': 'l', 'q': 'e', 'r': 'b', 'V': 'Z', 'X': 'F', 'N': 'K', 'B': 'U', 'T': 'Y', 'M': 'W', 'U': 'T', 'm': 'w', 'C': 'R', 'J': 'V', 't': 'y', 'S': 'H', 'v': 'z', 'e': 'o', 'D': 'M', 'p': 'd', 'K': 'L', 'A': 'A', 'P': 'D', 'l': 'g', 's': 'h', 'W': 'X', 'H': 'N', 'j': 'v', 'z': 'i', 'I': 'P', 'b': 'u', 'Z': 'I', 'F': 'J', 'Y': 'Q', 'Q': 'E', 'n': 'k'}






## Documenting your functions

We document a function by adding a *docstring* as the first statement in its body, directly below the `def` line. Docstrings are usually enclosed in triple quotes (`"""`). For example:

In [None]:
def decrypt(secret, code):
    """Decrypt a message using a substitution code.
    
    The function only decrypts characters that appear in `code`; other characters remain as they appear in `secret`.
    
    Parameters
    ----------
    secret : str
        an encrypted message
    code : dict
        a substitution code, where the keys are encrypted characters and the values are the cleartext characters.
    
    Returns
    -------
    str
        the decrypted cleartext message.
    """
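    # Look up each character in `code`; keep it unchanged if it is not a key
    return ''.join(code.get(char, char) for char in secret)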


print(decrypt(secret, code))

You can access the documentation of a function with the `help()` function, or with `?` in Jupyter.

In [None]:
help(decrypt)

In [None]:
decrypt?
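
Both `help()` and `?` read the docstring from the function's `__doc__` attribute, which you can also inspect directly. A minimal sketch with a made-up function (`square` is hypothetical):

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x

# The docstring is stored on the function object itself
print(square.__doc__)
```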

## Other stuff to discuss if we have time

- Use slicing to *edit* lists.
- `itertools.chain`.
- `itertools.permutations`.
- Go over the `itertools` docs together.
- `for else`. `while else`. (Do something once at the end, unless `break` was used.)
- Dictionaries: `keys()`, `values()`, `items()`.
- List comprehension cool examples.
- Generators.
- Sieve algorithm using generators.
- Classes and inheritance.
- Exceptions.
- Named arguments. Default values.
- Functions with variable number of arguments.
- Type hinting.
- Decorators.
- `contextlib.contextmanager`. Show example: timing a function.

# Colophon
This notebook was originally written by [Yoav Ram](http://python.yoavram.com) (and tweaked by Ohad Fried).

This work is licensed under a CC BY-NC-SA 4.0 International License.

![Python logo](https://www.python.org/static/community_logos/python-logo.png)