# Statistics and Data Science: Python Basics

![Python.jpg](attachment:51e28707-76ce-4d79-b1b5-f8cfb44efe7d.jpg)

Source: [Agent-X Comics - Perfect Programming](https://www.agent-x.com.au/comic/perfect-programming/)

## Contents

- [Variables](#Variables)
- [Data Type](#Data_Type)
  - [Text data](#Text_data)
  - [Numeric data](#Numeric_data)
  - [Casting and type conversion](#Casting)
- [Operators](#Operators)
  - [Arithmetic Operators](#arithmetic-operators)
    - [Operations on integers](#operating-int)
    - [Operations on floats](#operating-float)
    - [Operations on strings](#operating-str)
    - [Order of operations](#operating-order)
  - [Assignment Operators](#assignment-operators)
  - [Comparison Operators and Boolean](#comparison-operators)
  - [Identity Operators](#identity)
  - [Logical Operators](#logical)
- [Conditionals](#Conditionals)
- [Collection of elements](#collection)
  - [Lists](#Lists)
  - [Tuples](#Tuples)
  - [Conversion](#Conversion)
  - [Indexing](#Indexing)
  - [Slicing](#Slicing)
  - [Membership operators](#membership-operators)
  - [Mutability](#Mutability)
  - [Methods for lists and tuples](#methods)
- [Iteration](#Iteration)
  - [For loop](#for-loop)
  - [While loop](#while-loop)
  - [Break, continue, else](#break-continue-else)

## Variables <a class="anchor" id="Variables"></a>

Whether you are programming in Python or pretty much any other language, you will be working with **variables**. Variables are containers for storing data values. We will talk more about **objects** later, but a variable, like everything in Python, is an object. The following can be properties of a variable:
1. The **type** of variable. E.g., is it an integer, like `2`, or a string, like `'Hello, world.'`?
2. The **value** of the variable.

Depending on the type of the variable, you can do different things to it and other variables of similar type.

A variable is created the moment you first assign a value to it:

In [1]:
a = 3

Notice that when you assign a value to a variable, there is no visible output. However, if we now ask for `a`, its value will be displayed:

In [2]:
a

3

Be careful: when you assign a new value to an existing variable, the old value is overwritten!

In [3]:
a = 7
a

7

<span style='color:blue'> **Tips:** </span>
    
- Use explicit names for your variables: you and anyone reading your code should be able to understand what each variable represents.
- You can - actually, should - comment your code using `#` to explain what you are doing. 

Variable names are **case-sensitive**:

In [4]:
# Defining "A" will not overwrite "a"
A = "What a wonderful day!"
print(A)
print(a)

What a wonderful day!
7


## Data Type <a class="anchor" id="Data_Type"></a>

Variables can store data of different types, and different types can do different things. Python's built-in `type()` function lets you determine the type of a value or variable.

### Text data <a class="anchor" id="Text_data"></a>

The first type of data we have encountered is **text**, such as `"Hello, world"`. In Python, text data are called **strings**. When asked about the type of `"Hello, world"`, the `type()` function will return `str`, the short version of string:

In [5]:
type("Hello world")

str

Note that there are several ways to define strings. You can use single or double quotes. For instance, `'This is a string'` and `"This is a string"` are equivalent. You can also use triple quotes to extend a string over multiple lines:

In [6]:
my_str = '''Triple quotes allows...
to extend strings over multiple lines.'''

print(my_str)

Triple quotes allows...
to extend strings over multiple lines.


### Numeric data <a class="anchor" id="Numeric_data"></a>

There are three numeric types in Python:
- integer `int`
- real `float`
- complex `complex`

**Integer** (**int**) is a whole number, positive or negative, without decimals, of unlimited length:

In [7]:
type(4)

int

**Float** stands for "floating point number" and is a number, positive or negative, containing one or more decimals.

In [8]:
type(3.9)

float

Note that you can also use scientific notation using `e`:

In [9]:
type(9.5e-8)

float

Finally, you can define **complex** numbers. Note that in Python, the imaginary unit is written `j`:

In [10]:
type(1+2j)

complex

Be careful when you define and operate on data: `3.9` is a float, but `'3.9'` is a string!

### Casting and type conversion <a class="anchor" id="Casting"></a>

There may be times when you want to impose a specific type on a variable. This can be done with "casting", using constructor functions:
- `int()` - constructs an integer number from an integer literal, a float literal (by removing all decimals), or a string literal (providing the string represents a whole number)
- `float()` - constructs a float number from an integer literal, a float literal or a string literal (providing the string represents a float or an integer)
- `complex()` - constructs a complex number from a wide variety of data types, including strings, integer literals and float literals
- `str()` - constructs a string from a wide variety of data types, including strings, integer literals and float literals

In [11]:
cast_int = int(1.2)
cast_float = float(4)
cast_complex = complex(3.5)
cast_str = str(9.2)

print(type(cast_int), cast_int)
print(type(cast_float), cast_float)
print(type(cast_complex), cast_complex)
print(type(cast_str), cast_str)

<class 'int'> 1
<class 'float'> 4.0
<class 'complex'> (3.5+0j)
<class 'str'> 9.2


Note that when converting a `float` to an `int`, the interpreter does not round the result: it truncates it, i.e., it drops the decimals.

In [12]:
int(2.6)

2
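
For positive numbers, truncating and taking the floor give the same result, but they differ for negative numbers. The short sketch below (variable names are illustrative only) compares `int()` with `math.floor()` from the standard library and with the built-in `round()`:

```python
import math

negative_value = -2.6

truncated = int(negative_value)        # drops the decimals (moves toward zero)
floored = math.floor(negative_value)   # largest integer below or equal to the value
rounded = round(negative_value)        # nearest integer

print(truncated, floored, rounded)
```

Here `int()` gives `-2`, while `math.floor()` and `round()` both give `-3`.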

The `int()`, `float()`, `complex()`, and `str()` functions are very useful to convert one variable type into another. For example, often times we will import data from a text file, i.e., many strings, but we will want to perform operations on numbers:

In [13]:
imp_str = '5.3'
conv_str = float(imp_str)
print(type(imp_str), imp_str)
print(type(conv_str), conv_str)

<class 'str'> 5.3
<class 'float'> 5.3


## Operators

### Arithmetic operators <a class="anchor" id="arithmetic-operators"></a>

**Operators** allow you to do things with variables, like add them. They are represented by special symbols, like `+` and `*`. For now, we will focus on **arithmetic** operators. Python's arithmetic operators are:

|action|operator|
|:-------|:----------:|
|addition | `+`|
|subtraction | `-`|
|multiplication | `*`|
|division | `/`|
|raise to power | `**`|
|modulo | `%`|
|floor division | `//`|

**Warning**: Do not use the `^` operator to raise to a power. That is actually the operator for bitwise XOR, which we will not cover for now.
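
To see why this matters, here is a small illustration (the variable names are ours) of how different the two operators are on the same operands:

```python
power = 3 ** 5        # exponentiation: 3 * 3 * 3 * 3 * 3
bitwise_xor = 3 ^ 5   # bitwise XOR of the binary numbers 011 and 101

print(power, bitwise_xor)
```

The first value is `243`, the second is `6`. Python raises no error for `3 ^ 5`, which makes this mistake easy to miss.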

#### Operations on integers <a class="anchor" id="operating-int"></a>

Let's try the arithmetic operators on integers:

In [14]:
3+5

8

In [15]:
3-5

-2

In [16]:
3*5

15

In [17]:
3/5

0.6

In [18]:
3**5

243

In [19]:
3%5

3

In [20]:
3//5

0
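
The last two operators, `%` and `//`, are linked: floor division gives the quotient and modulo gives the remainder. A minimal sketch with illustrative variable names:

```python
dividend = 17
divisor = 5

quotient = dividend // divisor     # how many times 5 fits into 17
remainder = dividend % divisor     # what is left over

# Quotient and remainder recombine into the original number
print(quotient * divisor + remainder == dividend)

# The built-in divmod() returns both values at once
print(divmod(dividend, divisor))

# With a negative dividend, floor division rounds down, not toward zero
print(-17 // 5, -17 % 5)
```

The last line prints `-4 3`, since `-4 * 5 + 3` is `-17`.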

Notice that `3/5` produces a `float`, even though `3` and `5` are `int`: 

In [21]:
print(type(3+5))
print(type(3/5))

<class 'int'>
<class 'float'>


Note that you cannot divide by zero:

In [22]:
7/0

ZeroDivisionError: division by zero

#### Operations on floats <a class="anchor" id="operating-float"></a>

Let's now try the arithmetic operators on floats:

In [23]:
2.1 + 3.2

5.300000000000001

Wait a minute!  We know `2.1 + 3.2 = 5.3`, but Python gives `5.300000000000001`. This is due to the fact that floating point numbers are stored with a finite number of binary bits, so most decimal fractions (like `2.1`) can only be stored approximately, which leads to small rounding errors. This means that, as far as the computer is concerned, `2.1 + 3.2` and `5.3` are not equal. This is important to remember when dealing with floats, as we will see later.

In [24]:
# Very very close to zero because of finite precision
5.3 - (2.1 + 3.2)

-8.881784197001252e-16

In [25]:
2.1-3-2

-2.9

In [26]:
2.1*3.2

6.720000000000001

In [27]:
2.1/3.2

0.65625

In [28]:
2.1**3.2

10.74241047739471

In [29]:
2.1%3.2

2.1

In [30]:
2.1//3.2

0.0

Everything works as expected aside from the floating point precision previously mentioned. As before, you cannot divide by zero:

In [31]:
7.4/0.0

ZeroDivisionError: float division by zero

Note that you can operate on integers and floats. In other words, you do not need to convert integers into floats to perform mixed operations:

In [32]:
1+6.8

7.8

#### Operations on strings <a class="anchor" id="operating-str"></a>

Finally let's try some of these operations on strings. Yes, we will perform mathematical operations on strings! What will we get? Well, let's see...

In [33]:
'We can '+'add strings!'

'We can add strings!'

The result is intuitive: adding strings together concatenates them! How about subtracting strings?

In [34]:
'Can we '-'subtract strings?'

TypeError: unsupported operand type(s) for -: 'str' and 'str'

Ah, too bad, we cannot subtract strings. Well, it actually makes sense, subtracting strings would be weird. At least we got a nice error message telling us that `str` and `str` are unsupported operand types for the `-` operation. 

Similarly, we cannot perform multiplication, exponentiation, etc., with two strings. How about multiplying a string by an integer?

In [35]:
'cat '*3

'cat cat cat '

Wow, three `'cat '`! It makes sense: multiplication by an integer is the same thing as just adding multiple times, so the Python interpreter concatenates the string several times.

#### Order of operations <a class="anchor" id="operating-order"></a>

The order of operations follows common convention. Exponentiation comes first, followed by multiplication and division, floor division, and modulo. Next comes addition and subtraction. In order of precedence, our arithmetic operator table is

|precedence|operators|
|:-------:|:----------:|
|1 | `**`|
|2 | `*`, `/`, `//`, `%`|
|3 | `+`, `-`|

You can also group operations with parentheses. Operations within parentheses are always evaluated first.

<span style='color:blue'> **Tips:** </span> *do not* use excessive parentheses. Excessive parentheses make your code less readable, and can lead to mistakes. Trust the order of operations ;)

In [36]:
1**3 + 2**3 + 3**3 + 4**3 + 5**3

225

In [37]:
(1+2+3+4+5)**2

225

Wooow! The sum of the cubes of 1, 2, ..., 5 is equal to the square of the sum from 1 to 5. Can you demonstrate that this property is true for all *n*?
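
Before attempting a proof, you can check the property numerically. The sketch below uses a `for` loop and the built-in `sum()` and `range()` functions, which are only covered later in the course, so treat it as a preview. The upper bound of 50 is an arbitrary choice.

```python
# Check 1^3 + 2^3 + ... + n^3 == (1 + 2 + ... + n)^2 for n = 1, ..., 50
all_equal = True
for n in range(1, 51):
    sum_of_cubes = sum(k**3 for k in range(1, n + 1))
    square_of_sum = sum(range(1, n + 1)) ** 2
    if sum_of_cubes != square_of_sum:
        all_equal = False

print(all_equal)
```

A numerical check is not a proof, of course. One possible route is induction on *n*, using the fact that 1 + 2 + ... + *n* = *n*(*n* + 1)/2.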

### Assignment operators <a class="anchor" id="assignment-operators"></a>

Assignment operators are used to assign values to variables. We have already encountered one of them: that's right, the `=` operator, which lets us create a variable.

In [38]:
var = 7
print(type(var), var)

<class 'int'> 7


Now, let's say we want to update the value of our variable `var`. As previously mentioned, you can directly overwrite a variable. You can also use operations. For instance, suppose we want to add `3.9` to `var`. You can use the `+` operator:

In [39]:
var = var+3.9
print(type(var), var)

<class 'float'> 10.9


Notice that we changed the type of our variable `var` from an `int` to a `float`.

Instead of using the `+` arithmetic operator to update our variable, there is a more concise way, using the assignment operator `+=`:

In [40]:
var = 7
var += 3.9
print(var)

10.9


The `+=` operator told the interpreter to take the value of `var` and add `3.9` to it, changing the type of `var` in the intuitive way if need be. 

The other arithmetic operators have corresponding assignment operators:

|Operator|Example|Same as|
|:-------:|:----------:|:----------:|
|`=` | `var = 7` | `var = 7` |  
|`+=` | `var += 7` | `var = var + 7`| 
|`-=` | `var -= 7` | `var = var - 7`|
|`*=` | `var *= 7` | `var = var * 7`|
|`/=` | `var /= 7` | `var = var / 7`|
|`**=` | `var **= 7` | `var = var ** 7`|
|`%=` | `var %= 7` | `var = var % 7`| 
|`//=` | `var //= 7` | `var = var // 7`|

### Comparison operators and Boolean <a class="anchor" id="comparison-operators"></a>

**Comparison operators** (also called **relational operators**) are used to compare two values.

Let's start by assessing if two values are equal. We use the `==` operator:

In [41]:
8 == 8

True

In [42]:
8==9

False

Wow! Python confirmed that 8 is equal to 8 but is not equal to 9!

Wait a minute, we know what "True" and "False" mean in English, i.e., words that indicate truth. We can guess they have the same meaning in Python. But what is their type? After all, we have so far seen `str`, `int`, `float`, and `complex` data types. Are `True` and `False` strings? No! `True` and `False` have a special type, called `bool`, short for **Boolean**.

In [43]:
print(type(True))
print(type(False))

<class 'bool'>
<class 'bool'>


Booleans are associated with numerical values: `True` has the value `1`, and `False` has the value `0`:

In [44]:
True == 1

True

In [45]:
False == 0

True

You can even perform arithmetic operations on Booleans. The result will be an `int`:

In [46]:
sum_bool = True + False

print(type(sum_bool), sum_bool)

<class 'int'> 1


Ok, now that we understand what Booleans are, let's go back to the `==` operator and test it with some floats:

In [47]:
5.3 == 5.3 

True

As expected. One more time:

In [48]:
2.1+3.2 == 5.3

False

As expect... Wait, what?! How come `2.1 + 3.2` is not `5.3`? Well, remember, there were rounding errors when summing `2.1` and `3.2`. This is the floating point arithmetic issue. In other cases, the rounding errors happen to cancel out, and the comparison gives the answer we expect:

In [49]:
2.2+3.2 == 5.4

True

Unfortunately, it is hard to predict when this happens, so **never use the `==` operator with `float`**.
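
A common alternative is to test whether two floats are close enough rather than exactly equal. The sketch below shows two ways to do so: the standard library function `math.isclose()`, and a manual comparison against a tolerance (the value `1e-9` is an arbitrary choice for illustration).

```python
import math

total = 2.1 + 3.2

print(total == 5.3)               # exact comparison
print(math.isclose(total, 5.3))   # equal within a relative tolerance
print(abs(total - 5.3) < 1e-9)    # manual check with an absolute tolerance
```

The exact comparison gives `False`, while both tolerance-based checks give `True`.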

Comparison is not restricted to equality. Here are the other comparison operators:

|English|Python|
|:-------|:----------:|
|is equal to | `==`|
|is not equal to | `!=`|
|is greater than | `>`|
|is less than | `<`|
|is greater than or equal to | `>=`|
|is less than or equal to | `<=`|

Let's try them!

In [50]:
-1 > 6

False

In [51]:
4 <= 4

True

We can even chain comparison operators:

In [52]:
1<2<3

True

However, even if it is legal, do not mix the direction of the comparison operators: 

In [53]:
1 < 3 > 2

True

See, a chain of comparison operators checks each pair of neighbors separately. In the above example, it means that `1` and `2` are never compared with each other. 

Finally, we can use comparison operators on strings:

In [54]:
'Federer' > 'Nadal'

False

Wait, what?! Python got crazy! I mean, Python has never seen a tennis match, so how does it compare tennis players anyway? Well, it does not. It actually compares the characters of the strings. 

How so? In Python, characters are encoded with [Unicode](https://en.wikipedia.org/wiki/Unicode). This is a standardized set of characters from many languages around the world that contains over 100,000 characters. Each character has a unique number associated with it. We can find out which number is assigned to a character using Python's built-in `ord()` function.

In [55]:
ord('a')

97

The relational operators on characters compare the values that the `ord` function returns. So, using a relational operator on `'a'` and `'b'` means you are comparing `ord('a')` and `ord('b')`. When comparing strings, the interpreter first compares the first character of each string. If they are equal, it compares the second character, and so on. So, the reason that `'Federer' > 'Nadal'` gives a value of `False` is because `ord('F') < ord('N')`. We're safe, but the debate is still not settled...

Note that a result of this scheme is that testing for equality of strings means that **all** characters must be equal. This is the most common use case for relational operators with strings.
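
A few more comparisons make the rule concrete. The example strings below are ours, and `chr()` is the built-in inverse of `ord()`:

```python
print(ord('F'), ord('N'))    # code points of the two first letters
print(chr(97))               # chr() converts a code point back to a character

# Uppercase letters have smaller code points than lowercase letters
print(ord('Z'), ord('a'))
print('Zebra' < 'apple')

# When one string is the beginning of the other, the shorter one is smaller
print('Fed' < 'Federer')
```

Because `ord('Z')` is `90` and `ord('a')` is `97`, `'Zebra' < 'apple'` is `True`, which differs from dictionary order. Keep this in mind when comparing strings with mixed case.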

### Identity operators <a class="anchor" id="identity"></a>

**Identity operators** are used to compare objects, not if they are equal, but if they are actually the same object, with the same memory location. The two identity operators are:

|English|Python|
|:-------|:----------:|
|is the same object | **`is`**|
|is not the same object | **`is not`**|

That's right. The operators are pretty much the same as English! Let's see these operators in action and get at the difference between `==` and `is`. Let's use the **`is`** operator to investigate how Python stores variables in memory, starting with `float`s.

In [56]:
a = 6.1
b = 6.1

a == b, a is b

(True, False)

See, `a` and `b` have the same value, so the `==` operator returns `True`. However, they are not the same object because they are stored in different places in memory, so the `is` operator returns `False`. 

They can occupy the same place in memory if we do a `b = a` assignment:

In [57]:
a = 6.1
b = a

a == b, a is b

(True, True)

Because we assigned `b = a`, they necessarily have the same (immutable) value. The two variables also occupy the same place in memory for efficiency. Thus, both `==` and `is` operators return `True`.

However, if we reassign the value of `a`, the interpreter places `a` in a new space in memory, so `a` and `b` are no longer the same object:

In [58]:
a = 6.1
b = a
a = 8.5

a == b, a is b

(False, False)

The same discussion is valid for most `int` and `str`. Why most and not all?

For integers between `-5` and `256`, Python (more precisely CPython, the standard interpreter) employs **integer caching**, meaning that these integers will occupy the same space in memory. This caching does not happen for more negative or larger integers:

In [59]:
a = 93
b = 93
c = 708
d = 708

a is b, c is d

(True, False)

Similarly, Python sometimes performs [**string interning**](https://en.wikipedia.org/wiki/String_interning), which allows for (sometimes very) efficient string processing. Whether two strings occupy the same place in memory depends on what the strings are:

In [60]:
a = 'Hello'
b = 'Hello'
c = 'Hello world!'
d = 'Hello world!'

a is b, c is d

(True, False)

You generally do not need to worry about caching and interning for **immutable** variables. Immutable means that once the objects are created, their values cannot be changed. If we do change the value, the variable gets a new place in memory. All the types we've encountered so far (`int`, `float`, `complex` and `str`) are immutable.
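
If you want to look at object identity directly, the built-in `id()` function returns an integer that identifies an object for as long as it exists (in CPython, it is derived from the memory address). The sketch below repeats the reassignment experiment from above with `id()`:

```python
a = 6.1
b = a
print(id(a) == id(b), a is b)   # a and b are two names for one object

a = 8.5                         # a now refers to a different object
print(id(a) == id(b), a is b)
print(b)                        # b still refers to the original value
```

The first line prints `True True`, the second `False False`, and `b` is still `6.1`: reassigning `a` did not modify the object that `b` refers to.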

### Logical operators <a class="anchor" id="logical"></a>

**Logical operators** can be used to connect relational and identity operators. Python has three logical operators.

|Logic|Python|
|:-------|:----------:|
|AND | `and`|
|OR | `or`|
|NOT | `not`|

The `and` operator returns `True` only if both operands are `True`. The `or` operator gives `True` if *either* of the operands is `True`. Finally, the `not` operator negates the logical result.

In [61]:
True and True

True

In [62]:
True and False

False

In [63]:
False and False

False

In [64]:
True or False

True

In [65]:
not False

True

In [66]:
not False and True

True

In [67]:
not (True and False)

True

Note that it is important to specify the ordering of your operations, particularly when using the `not` operator.

Note also that

    a < b < c
    
is equivalent to

    (a < b) and (b < c)

With these new types of operators in hand, we can construct a more complete table of operator precedence. Assignment comes last because the expression on the right-hand side is always fully evaluated before its value is assigned.

|precedence|operators|
|:-------|:----------:|
|1 | `**`|
|2 | `*`, `/`, `//`, `%`|
|3 | `+`, `-`|
|4 | `<`, `>`, `<=`, `>=`, `==`, `!=`, `is`, `is not`|
|5 | `not`|
|6 | `and`|
|7 | `or`|
|8 | `=`, `+=`, `-=`, `*=`, `/=`, `**=`, `%=`, `//=`|
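
In Python, `not` binds more tightly than `and`, which in turn binds more tightly than `or`, and all comparison operators are evaluated before any of the three. The examples below are ours and show how the interpreter reads a few expressions without parentheses:

```python
# and is evaluated before or
print(True or False and False)      # read as True or (False and False)
print((True or False) and False)    # parentheses change the result

# not is evaluated before and / or
print(not True or True)             # read as (not True) or True
print(not (True or True))

# comparisons are evaluated before not
print(not 1 == 2)                   # read as not (1 == 2)
```

The five lines print `True`, `False`, `True`, `False`, and `True`. When an expression mixes several logical operators, a pair of parentheses is often the clearest way to state what you mean.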

## Conditionals

**Conditionals** are used to tell your computer to do a set of instructions depending on whether or not a Boolean is `True`. In other words, we are telling the computer:

    if something is true:
        do task a
    otherwise:
        do task b

In fact, the syntax in Python is almost exactly the same. As always, an example speaks volumes. 

We are going to study the condition for cooperation in a [collective action problem](https://en.wikipedia.org/wiki/Collective_action_problem), also called a social dilemma. In such a situation, all individuals would be better off cooperating but fail to do so because of conflicting interests between them, which discourage joint action (see illustration below). Many environmental issues take the form of a social dilemma. For example, recycling requires time and effort but decreases the consumption of materials if widely adopted. Purchasing an electric vehicle is costly but reduces emissions of air pollutants, which are associated with various respiratory and cardiovascular diseases, and of greenhouse gases, which are responsible for climate change.

We will assume that individuals have *homo moralis* preferences: they consider not only their selfish payoff but also what would happen if all others took the same action. The relative weight of selfishness and morality depends on the individual's degree of morality. Recent economic literature has demonstrated that such preferences provide an evolutionary advantage (see e.g., Alger & Weibull, 2013).

We can demonstrate that *homo moralis* individuals cooperate in a social dilemma (e.g., perform a pro-environmental action) when their social benefit weighted by their degree of morality is greater than their individual cost of acting weighted by their degree of selfishness. 

![Social-Dilemma2.png](attachment:eeb7d591-38db-4d01-b8f5-e7bb5998af35.png)

Ok, enough words, let's assess whether a given *homo moralis* cooperates.

*Reference*
Alger, I., & Weibull, J. W. (2013). Homo moralis—preference evolution under incomplete information and assortative matching. Econometrica, 81(6), 2269-2302. [DOI: 10.3982/ECTA10637](https://doi.org/10.3982/ECTA10637)

In [68]:
cost = 1            # individual cost 
benefit = 3         # social benefit
kappa = 0.5         # degree of morality

# condition for cooperation: 
# social benefit times kappa is greater than individual cost times (1-kappa)

if benefit*kappa >= cost*(1-kappa):
    print('The individual cooperates!')

The individual cooperates!


Youhouuu, good news for nature, the individual performs an environmentally friendly action!

Now, let's review the syntax of the `if` statement. The Boolean expression, `benefit*kappa >= cost*(1-kappa)`, is called the **condition**. If it is `True`, the indented statement below is executed. In this case, we print the string `'The individual cooperates!'`. Also, do not forget the `:` at the end of the `if` statement!

This brings up a very important aspect of Python syntax: <span style="color: dodgerblue; font-weight: bold;">
Indentation matters.
</span> Any lines with the same level of indentation will be evaluated together.

In [69]:
if benefit*kappa >= cost*(1-kappa):
    print('The individual cooperates!')
    print('Same level of indentation, so still printed!')

The individual cooperates!
Same level of indentation, so still printed!


What happens if the condition is `False`? Let's try with an individual with degree of morality `kappa=0`, i.e., the infamous fully-selfish *homo oeconomicus*:

In [70]:
kappa = 0                # degree of morality

# condition for cooperation: 
# social benefit times kappa is greater than individual cost times (1-kappa)

if benefit*kappa >= cost*(1-kappa):
    print('The individual cooperates!')

Nothing happened. This is because we did not tell Python what to do if the condition was evaluated as `False`. We can add that with an `else` **clause** in the conditional.

In [71]:
kappa = 0      # degree of morality

# condition for cooperation: 
# social benefit times kappa is greater than individual cost times (1-kappa)

if benefit*kappa >= cost*(1-kappa):
    print('The individual cooperates!')
else:
    print('What a shame, the individual does not cooperate...')

What a shame, the individual does not cooperate...


We can assess several conditions by using an `elif` clause. For example, say we have two individuals, Edoardo and Quentin, and we want to check if they both cooperate, if only one of them cooperates, or if neither of them cares for the environment:

In [72]:
kappa_1 = 0.2     # degree of morality of the first individual
kappa_2 = 0.3     # degree of morality of the second individual

# condition for cooperation: 
# social benefit times kappa is greater than individual cost times (1-kappa)

if benefit*kappa_1 >= cost*(1-kappa_1) and benefit*kappa_2 >= cost*(1-kappa_2):
    print('Both individuals cooperate!')
elif benefit*kappa_1 < cost*(1-kappa_1) and benefit*kappa_2 < cost*(1-kappa_2):
    print('What a shame, nobody cooperates...')
else:
    print('Only one individual cooperates')

Only one individual cooperates


It seems like either Edoardo or Quentin is not cooperating... Anyway, notice the use of the logical operator `and` in addition to the comparison operators `>=` and `<`.
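
Since a comparison returns a Boolean, its result can be stored in a variable and reused. The sketch below is one possible rewrite of the previous cell under that idea. The names `cooperates_1`, `cooperates_2` and `message` are ours, and the parameter values are repeated so that the cell runs on its own.

```python
cost = 1          # individual cost
benefit = 3       # social benefit
kappa_1 = 0.2     # degree of morality of the first individual
kappa_2 = 0.3     # degree of morality of the second individual

# Evaluate each individual's condition once and keep the Boolean result
cooperates_1 = benefit*kappa_1 >= cost*(1-kappa_1)
cooperates_2 = benefit*kappa_2 >= cost*(1-kappa_2)

if cooperates_1 and cooperates_2:
    message = 'Both individuals cooperate!'
elif not cooperates_1 and not cooperates_2:
    message = 'What a shame, nobody cooperates...'
else:
    message = 'Only one individual cooperates'

print(cooperates_1, cooperates_2)
print(message)
```

Printing the two Booleans also tells us who is who: the first individual does not cooperate (`False`), the second one does (`True`).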

## Collection of elements <a class="anchor" id="collection"></a>

We will now explore two important data types in Python: lists and tuples. They are both sequences of objects. Just like a string is a sequence (that is, an ordered collection) of characters, lists and tuples are sequences of arbitrary objects, called items or elements. They are a way to make a single object that contains many other objects.

### Lists

**Lists** are used to store multiple items in a single variable. We create lists by putting Python values or expressions inside **square brackets**, separated by **commas**:

In [73]:
my_list = [2, 3.7, 4+5j, 'dog']
print(type(my_list),my_list)

<class 'list'> [2, 3.7, (4+5j), 'dog']


Note that the type of a list is... a `list`! Also, any Python expression can be part of a list, including another list:

In [74]:
my_list2 = [2, 3.7, 4+5j, 'dog', [0,'Hi!']]
print(my_list2)

[2, 3.7, (4+5j), 'dog', [0, 'Hi!']]


You can also perform operations inside a list. If so, the operations get evaluated:

In [75]:
my_list3 = [8+9, 8-9, 8*9]
print(my_list3)

[17, -1, 72]


Now what happens when you perform operations on lists? Let's find out!

Operators on lists behave much like operators on strings. The `+` operator on lists means list concatenation.

In [76]:
[1,2,3]+[4,5,6]

[1, 2, 3, 4, 5, 6]

The `*` operator on lists means list replication and concatenation.

In [77]:
[1,2,3]*2

[1, 2, 3, 1, 2, 3]

### Tuples

Like lists, **tuples** are used to store multiple items in a single variable. We create tuples by putting Python values or expressions inside **parentheses**, separated by **commas**:

In [78]:
my_tuple = (2, 3.7, 4+5j, 'dog', (0,'Hi!'))
print(type(my_tuple),my_tuple)

<class 'tuple'> (2, 3.7, (4+5j), 'dog', (0, 'Hi!'))


The type of a tuple is, you guessed it, a `tuple`. As with lists, any Python expression can be part of a tuple, including another tuple, and you can also perform operations inside tuples:

In [79]:
(8+9, 8-9, 8*9)

(17, -1, 72)

Just be careful when you create a tuple with a single item: you need to include a comma after the item:

In [80]:
my_tuple = (0,)
not_a_tuple = (0) # this is just the number 0 (normal use of parentheses)

type(my_tuple), type(not_a_tuple)

(tuple, int)

Operators on tuples work as for lists, i.e., you can concatenate and replicate tuples with the `+` and `*` operators:

In [81]:
(1,2,3)+(4,5,6)

(1, 2, 3, 4, 5, 6)

In [82]:
(1,2,3)*2

(1, 2, 3, 1, 2, 3)

### Conversion

You can convert a `tuple` into a `list` using the function `list()`:

In [83]:
tuple_to_convert = (0,1,2,3)
converted_list = list(tuple_to_convert)

converted_list

[0, 1, 2, 3]

Similarly, you can convert a `list` into a `tuple` using the function `tuple()`:

In [84]:
list_to_convert = [0,1,2,3]
converted_tuple = tuple(list_to_convert)

converted_tuple

(0, 1, 2, 3)

### Indexing

Lists and tuples are **ordered**, meaning that the items have a defined order. Thus, we can access a given item in a list or a tuple. To do so, we use **brackets**. We first write the name of our list/tuple and then enclosed in square brackets we write the location (**index**) of the desired element:

In [85]:
list_index = [2, 3.7, 4+5j, 'dog', [0,'Hi!']]

list_index[1]

3.7

Wait, what?! We asked for the first element and we got the second element of our list. Does Python not know how to count? Don't worry, this behavior happens because <span style='color:red'> **indexing in Python starts at zero** </span>. This is very important. (Historical note: [Why Python uses 0-based indexing](http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.html).)

In [86]:
print(list_index[0])
print(list_index[4])

2
[0, 'Hi!']


Much better! 

In our second example, we accessed the list that was within our list, i.e., a sublist. A list that contains another list is called a **nested list**. The sublist can also contain another list (i.e., a subsublist), and so on. We can index a sublist by adding another set of brackets:

In [87]:
nested_list = [[1,2,3],[4,5,6]]

print(nested_list[0][1])
print(nested_list[1][0])

2
4


The same is true for tuples: you can index nested tuples with multiple sets of brackets:

In [88]:
nested_tuple = ((1,2,3),(4,5,6))

print(nested_tuple[0][2])
print(nested_tuple[1][1])

3
5


Ok, now we know the basics of indexing. An amazing feature allowed by Python is **negative indexing**. This just means we start indexing from the last entry, starting at `-1`:

In [89]:
list_index2 = [2, 3.7, 4+5j, 'dog']

list_index2[-1]

'dog'

Indexing in reverse is sometimes very convenient. Let's recap the forward and backward indices for lists and tuples:

|Element|1|2|3|4|5|6|7|8|9|10|
|------|-:|-:|-:|-:|-:|-:|-:|-:|-:|-:|
|Forward indices|0|1|2|3|4|5|6|7|8|9|
|Reverse indices|-10|-9|-8|-7|-6|-5|-4|-3|-2|-1|

In [90]:
tuple_index = (1,2,3,4,5,6,7,8,9,10)

print(tuple_index[7])
print(tuple_index[-3])

8
8
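
Put differently, for a sequence of length `n`, the index `-k` refers to the same element as the index `n - k`. A minimal sketch using the built-in `len()` function (the variable `n` is ours):

```python
tuple_index = (1,2,3,4,5,6,7,8,9,10)
n = len(tuple_index)      # number of elements, here 10

print(tuple_index[-3] == tuple_index[n - 3])   # index -3 and index 7 match
print(tuple_index[-n] == tuple_index[0])       # index -n is the first element
```

Both comparisons give `True`. Indices outside the range `-n` to `n - 1`, such as `tuple_index[10]` here, raise an `IndexError`.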


### Slicing

With indexing, we have accessed a given element. How do we **slice** a list or a tuple, i.e., extract several elements? We can use a colon (`:`) for that:

In [91]:
slicing_list = [0,1,2,3,4,5,6,7,8,9]

slicing_list[0:4]

[0, 1, 2, 3]

In the above, we extracted a list with elements from `0` to `3`, despite writing `[0:4]`. In other words, the last element (`4`) is not included. 

More generally, when using colon indexing `[i:j]`, we get items `i` through `j-1`.  I.e., the range is **inclusive of the first index and exclusive of the last**. If the slice's final index is larger than the length of the sequence, the slice ends at the last element. Thus, be careful when you slice lists/tuples. 

In [92]:
slicing_list[3:100]

[3, 4, 5, 6, 7, 8, 9]

As before, you can use negative indices:

In [93]:
slicing_list[3:-2]

[3, 4, 5, 6, 7]

When using colon indexing `[i:j]`, if the position given by `i` comes after the position given by `j`, we get an empty list:

In [94]:
slicing_list[7:-5]

[]

We have so far extracted consecutive elements. What if you only want even numbers from our list `[0,1,2,3,4,5,6,7,8,9]`? Well then, you can specify a **stride** using a second colon:

In [95]:
slicing_list[0::2]

[0, 2, 4, 6, 8]

In the above example, `0` is the start index and `2` defines the stride, i.e., the step. When the start is not defined, the default is zero:

In [96]:
slicing_list[::2]   #no need to specify "0" when we want to start at the first index

[0, 2, 4, 6, 8]

Suppose we now want the odd numbers, how do we do it? Simple, we just modify the start value of our stride!

In [97]:
slicing_list[1::2]

[1, 3, 5, 7, 9]

And if we want the multiples of three? We modify the stride:

In [98]:
slicing_list[::3]

[0, 3, 6, 9]

What about the value in-between the two colons? Until now, we left it undefined. It is actually the end index:

In [99]:
slicing_list[:6:2]

[0, 2, 4]

Let's recap how indexing and slicing work. The general structure is: `[start:end:stride]`

* If there are no colons, a single element is returned.
* If there are any colons, we are slicing, and a new list (or a new tuple, when slicing a tuple) is returned.
* If there is one colon, `stride` is assumed to be 1.
* If `start` is not specified, it is assumed to be zero (for a positive stride).
* If `end` is not specified, the slice runs to the end of the list (for a positive stride).
* If `stride` is not specified, it is assumed to be 1.

Now let's do some crazy slicing! Imagine we want to reverse a list/tuple. Can you think of a way to do this operation using slicing? Well, we can use a negative stride!

In [100]:
slicing_list[::-1]

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

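Note that the defaults for `start` and `end` change when the stride is negative: an omitted `start` means the last element, and an omitted `end` means going all the way to the first element. We still slice from `start` to `end` (with `end` excluded), but we step backward through the list. In the example below, we start at the last element and stop before reaching index `6`: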

In [101]:
slicing_list[-1:6:-2]

[9, 7]

Take some time to practice slicing since this is a very important concept :)
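As a starting point for practice, the sketch below collects the slicing rules from this section in a few lines. It assumes that `slicing_list` holds the integers 0 to 9, as in the examples above; the other variable names are only illustrative.

```python
# Assumption: same content as the slicing_list used in the examples above
slicing_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = slicing_list[:3]       # start omitted: begins at index 0
last_three = slicing_list[-3:]       # end omitted: runs through the last element
every_third = slicing_list[1:8:3]    # start 1, end 8 (excluded), stride 3
backwards = slicing_list[7:2:-2]     # negative stride: from index 7 down to index 3

print(first_three)    # [0, 1, 2]
print(last_three)     # [7, 8, 9]
print(every_third)    # [1, 4, 7]
print(backwards)      # [7, 5, 3]
```

Try to predict each result before running the cell, then change one index at a time and see what happens.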

### Membership operators <a class="anchor" id="membership-operators"></a>

**Membership operators** are used to test whether a value is present in a sequence such as a list or a tuple. The two membership operators are:

|English|operator|
|:-------|:----------:|
|is a member of | `in`|
|is not a member of | `not in`|

The result of the operator is `True` or `False`. Let's have a look at some examples:

In [102]:
my_list = [2, 3.7, 4+5j, 'dog', [0,'Hi!']]

2 in my_list

True

Indeed, `2` is in our list: it is the first element. What about `'Hi!'`?

In [103]:
'Hi!' in my_list

False

Why is `'Hi!'` not in our list? Well, it is part of our sublist `[0, 'Hi!']` but not of our "main" list `my_list`. `my_list` actually contains five elements, and `'Hi!'` is not one of them.
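To make the distinction concrete, here is a small sketch that tests membership at both levels. It only reuses `my_list` from above; the names `in_main_list`, `sublist_in_main_list`, and `in_sublist` are illustrative.

```python
my_list = [2, 3.7, 4+5j, 'dog', [0, 'Hi!']]

in_main_list = 'Hi!' in my_list                 # 'Hi!' is not a top-level element
sublist_in_main_list = [0, 'Hi!'] in my_list    # the sublist itself is an element
in_sublist = 'Hi!' in my_list[4]                # test the sublist directly

print(in_main_list)            # False
print(sublist_in_main_list)    # True
print(in_sublist)              # True
```

In other words, `in` only looks at the elements of the sequence it is given; it does not search inside nested lists.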

Let's look at an example with a tuple to make sure we master membership operators:

In [104]:
my_tuple = (2, 3.7, 4+5j, 'dog', [0,'Hi!'])

'cat' not in my_tuple

True

That's it for now, we'll use more membership operators later...

### Mutability <a class="anchor" id="Mutability"></a>

So far it seems `list` and `tuple` are very similar. So why would there be two different types if they behave exactly the same? Well, as you might guess, they do not. The important difference between `list` and `tuple` is their mutability.

Lists are **mutable** objects: you can change their values without creating a new list:

In [105]:
mutable_list = [0, 1, 2, 3, 4, 5]
mutable_list[3] = 'two'

mutable_list

[0, 1, 2, 'two', 4, 5]

`list` is the only data type we have encountered so far that is mutable. In other words, `int`, `float`, `complex`, `str`, and `bool` are **immutable**. Immutable means that once an object is created, its value cannot be changed. If we assign a new value to the variable, the variable then refers to a different object, stored at a different place in memory. `tuple` is also an immutable type. Let's try the same operation we performed above on our list and see what happens:

In [106]:
immutable_tuple = (0, 1, 2, 3, 4, 5)
immutable_tuple[3] = 'two'

immutable_tuple

TypeError: 'tuple' object does not support item assignment

We get an error message and rightfully so: since `tuple` is immutable, it does not support item assignment.

We can use the `id()` function to better understand mutability. This function returns the identity of an object, which tells us where in memory the object is stored (in the standard CPython interpreter, it is the memory address). Let's try:

In [107]:
immutable_int = 89
print(id(immutable_int))

immutable_int = 90
print(id(immutable_int))

2333706441840
2333706441872


See, when we assigned a new value to `immutable_int`, we didn't actually modify the original integer; the variable now refers to a different object! Lists behave differently:

In [108]:
mutable_list = [0, 1, 2, 3, 4, 5]
print(id(mutable_list))

mutable_list[1] = 'one'
print(id(mutable_list))

2333780786688
2333780786688


It is still the same list even though we changed the value of the second element.

At this point you may wonder: why do we care? Well, suppose that we have a list, which we wish to keep, and that we want to make a copy of this list with one element that differs. What happens then?

In [109]:
mutable_list = [0, 1, 2, 3, 4, 5]
mutable_list_2 = mutable_list     # copy of mutable_list?
mutable_list_2[0] = 'zero'

print(mutable_list, mutable_list_2)

['zero', 1, 2, 3, 4, 5] ['zero', 1, 2, 3, 4, 5]


Disaster! We lost `mutable_list`!

What happened? Well, assigning a list to a variable does not copy the list into a new object; it just creates a new reference to the same object. Thus, when we modified the first element of `mutable_list_2`, we also modified `mutable_list`! This behavior can lead to nasty bugs that will bite you!

Is there a way to solve this issue? Of course there is: we can use slicing! If both the starting and ending indices of a slice are left out, the slice is a copy of the entire list, stored in a new hunk of memory.

In [110]:
mutable_list = [0, 1, 2, 3, 4, 5]
mutable_list_2 = mutable_list[:]
mutable_list_2[0] = 'zero'

print(mutable_list, mutable_list_2)

[0, 1, 2, 3, 4, 5] ['zero', 1, 2, 3, 4, 5]


What a relief!
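To check whether two names refer to the same list, one option is the identity operator `is` seen earlier. The sketch below, with illustrative variable names that are not part of the example above, compares plain assignment with three common ways of copying a list: the full slice `[:]`, the `copy()` method, and the `list()` function.

```python
original = [0, 1, 2, 3, 4, 5]

alias = original                 # no copy: a second name for the same object
slice_copy = original[:]         # new list built from a full slice
method_copy = original.copy()    # new list built by the copy() method
list_copy = list(original)       # new list built by the list() function

print(alias is original)         # True
print(slice_copy is original)    # False
print(method_copy is original)   # False
print(list_copy is original)     # False

slice_copy[0] = 'zero'           # modifying a copy leaves the original untouched
print(original)                  # [0, 1, 2, 3, 4, 5]
```

Note that all three approaches make a shallow copy: if the list contains other lists, those inner lists are still shared between the original and the copy.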

We have seen that tuples and lists are very similar, differing essentially only in mutability (the differences are actually more profound; see for instance the discussion in the [aforementioned blog post](http://www.asmeurer.com/blog/posts/tuples/)).

So you may ask: "When should I use a tuple and when should I use a list?". Here is the advice of [Justin Bois](http://bois.caltech.edu/), whose [course](http://justinbois.github.io/bootcamp/2022_epfl/) heavily influenced this notebook:

"<span style="color: dodgerblue; font-weight: bold;">
Always use tuples instead of lists unless you need mutability.
</span>
This keeps you out of trouble. It is very easy to inadvertently change one list, and then another list (that is actually the same, but with a different variable name) gets mangled. That said, mutability is often very useful, so you can use it to make your list and adjust it as you need. However, after you have finalized your list, you should convert it to a tuple so it cannot get mangled."

### Methods for lists and tuples <a class="anchor" id="methods"></a>

We have previously performed operations on `list`. Using slicing, we have extracted elements of lists, we copied a list with `[:]`, and we even reversed a list with `[::-1]`.

What if we wish to add an element at the end of a list? Or better, insert or remove an element at a given position? In this case, we can use the built-in list methods.

We already mentioned that lists are objects. Objects contain: 1) data; 2) functions that can operate on the data. The functions inside an object are called methods. Here are the built-in methods you can use on lists:

|Method|Description|
|:-------|:----------:|
|`append()` | Adds an element at the end of the list|
|`clear()` | Removes all the elements from the list|
|`copy()` | Returns a copy of the list|
|`count()` | Returns the number of elements with the specified value|
|`extend()` | Adds the elements of a list (or any iterable) to the end of the current list|
|`index()` | Returns the index of the first element with the specified value|
|`insert()` | Adds an element at the specified position|
|`pop()` | Removes and returns the element at the specified position (the last element by default)|
|`remove()` | Removes the first item with the specified value|
|`reverse()` | Reverses the order of the list|
|`sort()` | Sorts the list|


Another useful function (that is not a method) is the `len()` function. It returns the total number of items in a list. Let's try it!


In [111]:
my_list = [1,2,3,4,5,6,7]
len(my_list)

7

We indeed have seven elements in our list. Now let's count how many times the value `3` appears in our list:

In [112]:
my_list.count(3)

1

As expected, we count one occurrence of the value `3`. But let's pause a minute. Did you notice the syntax? We first specify our list, then `.`, and finally the `count()` method. This is the general syntax for calling a method. Now let's extract the index of the first element whose value is `3`. Recall that indexing starts at `0`.

In [113]:
my_list.index(3)

2

Let's keep going by adding a single element, and then a list of elements, to our list.

In [114]:
my_list.append(8)

my_list

[1, 2, 3, 4, 5, 6, 7, 8]

In [115]:
my_list_2 = [9,10]
my_list.extend(my_list_2)

my_list

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Now, let's do a bit of magic: we will make the element `4` disappear!

In [116]:
my_list.remove(4)
my_list

[1, 2, 3, 5, 6, 7, 8, 9, 10]

Woow! Let's make it appear again. Again, be careful, indexing starts at `0`:

In [117]:
my_list.insert(3,4) #insert the value 4 at the index 3 in our list
my_list

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Ta-da! Instead of making a value disappear, we can also remove an element at a given position:

In [118]:
my_list.pop(8)
my_list

[1, 2, 3, 4, 5, 6, 7, 8, 10]

Let's insert the ninth element again:

In [119]:
my_list.insert(8,9) #insert the value 9 at the index 8 in our list
my_list

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Ok, all good! Now we can reverse our list:

In [120]:
my_list.reverse()
my_list

[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

And sort it again:

In [121]:
my_list.sort() #Alternatively, we could reverse it again!
my_list

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Finally, let's remove all the elements of our list:

In [122]:
my_list.clear()
my_list

[]

`clear()` empties the list in place. In other words, our list was not totally erased from existence: the list object still exists, only its elements were deleted.
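The difference between emptying a list in place and creating a new empty list shows up when two names refer to the same list. Here is a small sketch with hypothetical variable names:

```python
shopping = ['bread', 'cheese']
same_list = shopping          # second name for the same list object

shopping.clear()              # empties the existing object in place
print(shopping, same_list)    # [] []

shopping = ['bread', 'cheese']
same_list = shopping
shopping = []                 # binds the name shopping to a brand-new empty list
print(shopping, same_list)    # [] ['bread', 'cheese']
```

With `clear()`, both names see the empty list because there is only one object. With `shopping = []`, the old list survives under the name `same_list`.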

We have performed some neat operations on lists. What about tuples? `count()`, `index()`, and `len()` work the same with `tuple`. What about the others? Well, unfortunately they do not. Indeed, we have seen before that we cannot modify a tuple or a given element of a tuple because, let's repeat it, tuples are immutable. So what can you do when you actually want to update your tuple? You cannot modify it in place, but there is a way! Remember that you can convert a tuple into a list using the function `list()`, modify the list, and then convert your list back into a tuple using the function `tuple()`. Note that the result is a new tuple assigned to the same variable name:

In [123]:
my_tuple = (0,1,2,3)
my_list = list(my_tuple)
my_list[1]='one'
my_tuple = tuple(my_list)

my_tuple

(0, 'one', 2, 3)

By converting to a `list`, you can use all the list methods, and then convert back to a `tuple`!

In addition, we can do other cool things with tuples. One is called **unpacking**: a single assignment statement that stores each element of a tuple in its own variable. Let's see an example:

In [124]:
unpacking_tuple = (1, 2, 3)
a, b, c = unpacking_tuple  

print(a, b, c)

1 2 3


This is useful when a function returns more than one value and we want to store each returned value in its own variable.
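As an illustration that does not require writing our own function, the built-in `divmod()` function returns a tuple containing the quotient and the remainder of an integer division. The sketch below unpacks that tuple and also shows another common use of unpacking, swapping two variables. The variable names are only illustrative.

```python
# divmod(17, 5) returns the tuple (3, 2), since 17 = 3*5 + 2
quotient, remainder = divmod(17, 5)
print(quotient, remainder)    # 3 2

# Unpacking also lets us swap two variables in a single statement
x, y = 'left', 'right'
x, y = y, x
print(x, y)                   # right left
```

The number of variables on the left must match the number of elements in the tuple; otherwise Python raises a `ValueError`.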

## Iteration <a class="anchor" id="Iteration"></a>

### For loop <a class="anchor" id="for-loop"></a>

A `for` loop is used for iterating over a sequence, such as a list or a tuple.

Let's start with a simple example. We will print all the elements of a list:

In [125]:
canton_romand = ['Jura', 'Neuchâtel', 'Vaud', 'Genève', 'Berne', 'Fribourg', 'Valais']

for c in canton_romand:
    print(c)

Jura
Neuchâtel
Vaud
Genève
Berne
Fribourg
Valais


We printed a list of the Romandy cantons. 

Let's review what we did. For every item in our list `canton_romand`, we printed this item (canton). More generally, a `for` loop will do something for every item `in` a sequence. 

Note that a `str` is also a sequence: just like a list or tuple, a string is an ordered collection of characters. Thus, we can use a `for` loop on a string:

In [126]:
for c in 'circular economy':
    print(c)

c
i
r
c
u
l
a
r
 
e
c
o
n
o
m
y


Let's go back to our illustration on cooperation in a social dilemma. Remember the condition for cooperation that we previously discovered: individuals cooperate if their social benefit weighted by their degree of morality is greater than or equal to their individual cost weighted by their degree of selfishness. Suppose we have several individuals with various degrees of morality stored in a list. For each of them, we wish to know if they cooperate or not. Let's try!

In [127]:
cost = 1            # individual cost 
benefit = 3         # social benefit

kappa_list = [0, 0.2, 0.3, 0.5, 0.7, 1]     # list of degrees of morality

for i in range(len(kappa_list)):
    if benefit*kappa_list[i] >= cost*(1-kappa_list[i]):
        print('Individual '+str(i+1)+' cooperates.')
    else:
        print('Individual '+str(i+1)+' does not cooperate.')

Individual 1 does not cooperate.
Individual 2 does not cooperate.
Individual 3 cooperates.
Individual 4 cooperates.
Individual 5 cooperates.
Individual 6 cooperates.


Wow, we are starting to do really cool stuff! Ok, what actually happened?

First, we used `len(kappa_list)` to know the length of our list `kappa_list` :

In [128]:
len(kappa_list)

6

Then, we created a **range** with the `range()` function. This function gives an iterable that enables counting: 

In [129]:
for i in range(6):
    print(i, end='  ')

0  1  2  3  4  5  

We see that `range(6)` gives us six numbers, from `0` to `5`. As with indexing, `range()` inclusively starts at zero by default, and the ending is exclusive. It turns out that the arguments of the `range()` function work much like the indices of a slice. If you give a single argument, you get that many integers, starting at 0 and incrementing by one. If you give two arguments, you start inclusively at the first and increment by one, ending exclusively at the second argument. Finally, you can specify a stride with the third argument.
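The three forms can be compared side by side in a short sketch. Here `list()` is used only to display the numbers that each range produces, and the variable names are illustrative.

```python
one_argument = list(range(4))            # end only: starts at 0
two_arguments = list(range(2, 6))        # start (included) and end (excluded)
with_stride = list(range(0, 10, 3))      # start, end, and stride
counting_down = list(range(5, 0, -1))    # a negative stride counts backward

print(one_argument)     # [0, 1, 2, 3]
print(two_arguments)    # [2, 3, 4, 5]
print(with_stride)      # [0, 3, 6, 9]
print(counting_down)    # [5, 4, 3, 2, 1]
```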

Going back to our loop. We iterated over our range, and we mixed a `for` loop with an `if` statement. For every element of our range (`0`, `1`, `2`, `3`, `4`, `5`), we assessed whether the condition `benefit*kappa_list[i] >= cost*(1-kappa_list[i])` was `True`. As we have seen before, `kappa_list[i]` extracts the element of the list `kappa_list` located at index `i`. Because we combined `range` and `len`, we iterate over all the elements of the list, checking our condition for each degree of morality.

Finally, when the condition is `True`, we `print()` the string `'Individual '+str(i+1)+' cooperates.'`. This string joins three strings: `'Individual '`, `str(i+1)` where we converted our `int` variable `i+1` into a string using the `str()` function, and `' cooperates.'`. Similarly, when the condition is `False`, we printed `'Individual '+str(i+1)+' does not cooperate.'`.

Neat!

Well, technically, there was an even better way to obtain the same result, using the `enumerate()` function. This function gives an iterator that provides both the index and the item of a sequence. Again, this is best demonstrated in practice:

In [130]:
cost = 1            # individual cost 
benefit = 3         # social benefit

kappa_list = [0, 0.2, 0.3, 0.5, 0.7, 1]     # list of degrees of morality

for i, kappa in enumerate(kappa_list):
    if benefit*kappa >= cost*(1-kappa):
        print('Individual '+str(i+1)+' cooperates.')
    else:
        print('Individual '+str(i+1)+' does not cooperate.')

Individual 1 does not cooperate.
Individual 2 does not cooperate.
Individual 3 cooperates.
Individual 4 cooperates.
Individual 5 cooperates.
Individual 6 cooperates.


The `enumerate()` function allowed us to use an index and a degree of morality `kappa` at the same time. Let's visualize this by printing the index and degree of morality for each individual:

In [131]:
for i, kappa in enumerate(kappa_list):
    print(i, kappa)

0 0
1 0.2
2 0.3
3 0.5
4 0.7
5 1


The `enumerate()` function is really useful and should be used in favor of just doing indexing. It is indeed more generic: the `range(len())` construct will break on an object without support for `len()`. 
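`enumerate()` also accepts an optional `start` argument that sets the first value of the counter. Assuming we want individuals numbered from 1, a sketch of the same loop without the `i+1` arithmetic could look like this. The `results` list and the name `number` are additions made only for this illustration.

```python
cost = 1            # individual cost
benefit = 3         # social benefit

kappa_list = [0, 0.2, 0.3, 0.5, 0.7, 1]     # list of degrees of morality

results = []        # collects one message per individual
for number, kappa in enumerate(kappa_list, start=1):
    if benefit*kappa >= cost*(1-kappa):
        results.append('Individual '+str(number)+' cooperates.')
    else:
        results.append('Individual '+str(number)+' does not cooperate.')

for message in results:
    print(message)
```

The printed messages are the same as before; only the counter now starts at 1 instead of 0.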

Note that you can use the underscore, `_`, as the name of a throwaway variable, i.e., a variable whose value you do not need. There is no rule for this, but it is a widely accepted Python convention that helps signal that you are not going to use the variable.
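Two typical situations are sketched below: a loop whose counter is never used, and an unpacking where only one of the values matters. The names `cheers` and `name` and the values are purely illustrative.

```python
# The loop counter is never used inside the loop, so we call it _
cheers = []
for _ in range(3):
    cheers.append('Hop Suisse!')
print(cheers)      # ['Hop Suisse!', 'Hop Suisse!', 'Hop Suisse!']

# When unpacking, _ can stand for a value we want to ignore
name, _ = ('Larkin', 71)
print(name)        # Larkin
```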

Here are two other useful iterator functions. First, the `zip()` function enables us to iterate over several iterables at once. In the example below, we iterate over the jersey numbers and names of ice hockey players playing for the Detroit Red Wings (What do you mean you do not know them??)

In [132]:
names = ('Raymond', 'Seider', 'Larkin')
numbers = (23, 53, 71)

for num, name in zip(numbers, names):
    print(num, name)

23 Raymond
53 Seider
71 Larkin


Second, the `reversed()` function returns an iterator that goes through a sequence in the reverse direction. Imagine we are NASA counting down:

In [133]:
count_up = ('ignition', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

for count in reversed(count_up):
    print(count)

10
9
8
7
6
5
4
3
2
1
ignition


### While loop <a class="anchor" id="while-loop"></a>

A `while` loop allows iteration until a conditional expression evaluates to `False`.

Let's go back to our example on cooperation in a social dilemma. Suppose we would like to find the threshold degree of morality allowing cooperation. We can use a `while` loop:

In [134]:
cost = 1            # individual cost 
benefit = 3         # social benefit

# Initialize the variable that the loop will increment
k = 0               # degree of morality

# condition for not cooperating: 
# social benefit times k is strictly lower than individual cost times (1-k)
while benefit*k < cost*(1-k):
    k+=0.01

print(k)

0.25000000000000006


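Thus, in this illustration, all individuals with a degree of morality of at least 0.25 will cooperate. The extra digits in the printed value come from floating-point rounding: `0.01` cannot be represented exactly in binary, so small errors accumulate as we add it repeatedly.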

Let's take a minute to understand what is happening in a `while` loop. The value of `k` is changing with each iteration, being incremented by `0.01`. Each time we consider doing another iteration, the condition is checked: is the social benefit weighted by the degree of morality strictly lower than the individual cost weighted by the degree of selfishness? If yes, i.e., the condition is evaluated to `True`, then the iteration continues. In other words, iteration continues in a `while` loop until the condition returns `False`.
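Two small checks can complement this loop; both are sketches that reuse the same `cost` and `benefit` values. First, since `benefit + cost` is positive, the condition `benefit*k >= cost*(1-k)` is equivalent to `k >= cost/(benefit+cost)`, so the threshold can be computed directly. Second, one way to sidestep the rounding that comes with repeatedly adding `0.01` is to count integer steps and divide only when `k` is needed. The names `threshold` and `step` are introduced here for illustration.

```python
cost = 1            # individual cost
benefit = 3         # social benefit

# Direct computation of the threshold degree of morality
threshold = cost / (benefit + cost)
print(threshold)    # 0.25

# Same search as the while loop, counting integer steps of size 0.01
step = 0
while benefit*(step/100) < cost*(1 - step/100):
    step += 1

k = step / 100
print(k)            # 0.25
```

Both approaches agree on a threshold of 0.25 for these parameter values.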

![image.png](attachment:62239492-131b-41c8-a39e-f85293e8e427.png)
Image : Estefania Cassingena Navone - [Python While Loop Tutorial](https://www.freecodecamp.org/news/python-while-loop-tutorial/)

We have to be extra cautious when using a `while` loop. If the condition is always `True`, it never evaluates to `False`. We are thus stuck in an **infinite loop** and the code runs forever! If this happens, you can interrupt the kernel to stop the endless calculation.

Better yet, avoid getting stuck in an infinite loop in the first place! For example, you can first check the condition with a few values before writing the `while` loop to make sure it can evaluate to `False`. Also check outside of the loop that your increment works as expected. Finally, you can add a second condition, one that is guaranteed to become `False` after a given number of iterations. For instance:

```python
cost = 1            # individual cost 
benefit = 3         # social benefit

# Initialize the variables
k = 0               # degree of morality
i = 0               # iteration counter
max_it = 1000       # maximum number of iterations

# condition for not cooperating: 
# social benefit times k is strictly lower than individual cost times (1-k)
while benefit*k < cost*(1-k) and i < max_it:
    k+=0.01
    i+=1
```

In the above code, `i` keeps track of the number of iterations and `max_it` defines a maximum number of iterations (1000). We added the condition `i < max_it`. This condition is guaranteed to evaluate to `False` once we reach 1000 iterations.

Finally, another way to avoid an infinite loop is to use the `break` statement that we will discover below.
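As a preview of that idea, here is a minimal sketch in which the loop condition is always `True` and two `if` tests decide when to leave the loop. It reuses the names `k`, `i`, and `max_it` from the code above; this is one possible design, not the only one.

```python
cost = 1            # individual cost
benefit = 3         # social benefit

k = 0               # degree of morality
i = 0               # iteration counter
max_it = 1000       # maximum number of iterations

while True:
    if benefit*k >= cost*(1-k):    # cooperation condition reached: leave the loop
        break
    if i >= max_it:                # safety net: too many iterations, leave the loop
        break
    k += 0.01
    i += 1

print(k, i)
```

Whichever test succeeds first ends the loop, so the code cannot run forever even if the cooperation condition is never met.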

Now you may wonder: when should you use a `for` loop and when should you use a `while` loop? In most cases, you could use one or the other. Here is a general rule:
- If you know how many times you have to do something (or if your program knows), use a `for` loop. 
- If you don't know how many times the loop needs to run until you run it, use a `while` loop. 
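The rule can be illustrated with two toy problems; the numbers and variable names below are made up for the example. In the first, we know in advance that we need exactly five iterations. In the second, we keep halving a value until it drops to 1 or below, without knowing beforehand how many halvings that takes.

```python
# Known number of repetitions: a for loop
total = 0
for n in range(1, 6):       # exactly five iterations: 1, 2, 3, 4, 5
    total += n
print(total)                # 15

# Unknown number of repetitions: a while loop
value = 100
halvings = 0
while value > 1:            # stop as soon as value is 1 or less
    value = value / 2
    halvings += 1
print(halvings)             # 7
```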

### Break, continue, else <a class="anchor" id="break-continue-else"></a>

The `break` statement can stop a `for` or `while` loop before it has looped through all the items. 

For example, going back to our example on Romandy cantons, imagine we wish to stop the loop once we have reached `'Vaud'`:

In [135]:
canton_romand = ['Jura', 'Neuchâtel', 'Vaud', 'Genève', 'Berne', 'Fribourg', 'Valais']

for c in canton_romand:
    print(c)
    if c == 'Vaud':
        break

Jura
Neuchâtel
Vaud


The `continue` statement stops the current iteration of the loop and continues with the next one.

For example, let's print all the cantons except `'Vaud'`:

In [136]:
canton_romand = ['Jura', 'Neuchâtel', 'Vaud', 'Genève', 'Berne', 'Fribourg', 'Valais']

for c in canton_romand:
    if c == 'Vaud':
        continue
    print(c)

Jura
Neuchâtel
Genève
Berne
Fribourg
Valais


The `else` keyword specifies a block of code to be executed once the loop has finished going through all its items.

In [137]:
canton_romand = ['Jura', 'Neuchâtel', 'Vaud', 'Genève', 'Berne', 'Fribourg', 'Valais']

for c in canton_romand:
    print(c)
else:
    print('In Romandy cantons, French is an official language. Unfortunately, Python is not.')

Jura
Neuchâtel
Vaud
Genève
Berne
Fribourg
Valais
In Romandy cantons, French is an official language. Unfortunately, Python is not.


Note that the `else` block will NOT be executed if the loop is stopped by a `break` statement:

In [138]:
canton_romand = ['Jura', 'Neuchâtel', 'Vaud', 'Genève', 'Berne', 'Fribourg', 'Valais']

for c in canton_romand:
    print(c)
    if c == 'Vaud':
        break
else:
    print('In Romandy cantons, French is an official language. Unfortunately, Python is not.')

Jura
Neuchâtel
Vaud
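
A typical use of this combination is searching: `break` leaves the loop as soon as the item is found, and the `else` block runs only if the loop went through the whole list without finding it. Here is a minimal sketch, where `target` and `message` are illustrative names:

```python
canton_romand = ['Jura', 'Neuchâtel', 'Vaud', 'Genève', 'Berne', 'Fribourg', 'Valais']
target = 'Zürich'

for c in canton_romand:
    if c == target:
        message = target + ' is in the list.'
        break                # found: skip the else block
else:
    # runs only when the loop finished without hitting break
    message = target + ' is not in the list.'

print(message)               # Zürich is not in the list.
```

Changing `target` to `'Vaud'` triggers the `break`, so the `else` block is skipped and the first message is printed instead.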
