# Basics II - Data Structures

**Author**: [Gabriele Pompa](https://www.linkedin.com/in/gabrielepompa/): gabriele.pompa@unisi.com

# Table of contents

[Executive Summary](#summary)
1. [Tuples](#tuple)\
    1.1. [Definition](#tuple_definition)\
    1.2. [Indexing and Slicing](#tuple_index)\
    1.3. [Changing Values (and why you shouldn't do that)](#tuple_change)\
    1.4. [Nested Tuples](#nested_tuple)
2. [Lists](#list)\
    2.1. [Definition](#list_definition)\
    2.2. [Indexing and Slicing](#list_index)\
    2.3. [Nested Lists](#nested_list)\
    2.4. [Indexing of Nested Sequences](#nested_list_index)\
    2.5. [Changing Values](#modify_list)\
    2.6. [Built-in methods](#list_methods)\
    2.7. [`for` loop](#for)\
    &nbsp; &nbsp; &nbsp; &nbsp; 2.7.1. [for loop over a list](#for_over_list)\
    &nbsp; &nbsp; &nbsp; &nbsp; 2.7.2. [Counter-based looping and `range()` function](#for_range)\
    &nbsp; &nbsp; &nbsp; &nbsp; 2.7.3. [`break` Statement in loops](#for_break)\
    &nbsp; &nbsp; &nbsp; &nbsp; 2.7.4. [`enumerate()` looping](#for_enumerate)\
    2.8. [List comprehension](#list_comprehension)
3. [Dicts](#dict)\
    3.1. [Definition](#dict_def)\
    3.2. [key-based indexing](#key_index)\
    3.3. [Changing Values](#modify_dict)\
    3.4. [Built-in methods](#dict_methods)\
    3.5. [Looping over dicts](#dict_loop)
4. [Sets](#set)\
    4.1. [Definition](#set_def)\
    4.2. [Test for membership](#set_membership)\
    4.3. [Set operations](#set_operations)\
    &nbsp; &nbsp; &nbsp; &nbsp; 4.3.1. [`.union()`](#union)\
    &nbsp; &nbsp; &nbsp; &nbsp; 4.3.2. [`.intersection()`](#intersection)\
    &nbsp; &nbsp; &nbsp; &nbsp; 4.3.3. [`.difference()`](#difference)\
    4.4. [Getting rid of duplicates from a list](#set_duplicates_list)

### **Resources**: 

- [_Python for Finance (2nd ed.)_](http://shop.oreilly.com/product/0636920117728.do): Sec. 3.Basic Data Structures (Section 3.Excursus: Functional Programming is optional)
- [_The Python Tutorial_](https://docs.python.org/3.7/tutorial/): Sec. [3.1.3](https://docs.python.org/3.7/tutorial/introduction.html#lists) (Lists), [4.2](https://docs.python.org/3.7/tutorial/controlflow.html#for-statements) (for Statements), [4.3](https://docs.python.org/3.7/tutorial/controlflow.html#the-range-function) (The `range()` Function), [4.4](https://docs.python.org/3.7/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops) (break and continue Statements, and else Clauses on Loops), [5.1](https://docs.python.org/3.7/tutorial/datastructures.html#more-on-lists) (More on Lists), [5.3](https://docs.python.org/3.7/tutorial/datastructures.html#tuples-and-sequences) (Tuples and Sequences), [5.4](https://docs.python.org/3.7/tutorial/datastructures.html#sets) (Sets), [5.5](https://docs.python.org/3.7/tutorial/datastructures.html#dictionaries) (Dictionaries)

# Executive Summary <a name="summary"></a>

Intuitively, a _data structure_ is an object containing other objects, not necessarily of the same _data type_.

Standard Python provides four basic data structures, which can be differentiated at high level by being:
- _ordered_ or _not ordered:_ that is, whether they preserve the order in which entries are added or not;
- _mutable_ or _immutable:_ that is, whether - once defined - they can be modified or not.

These data-structures are:

data-structure | ordered (or not) | mutable (or not)
--- | --- | ---
Tuples  | ordered | immutable |
Lists | ordered | mutable |
Dicts | not ordered (insertion order is preserved from Python 3.7 on) | mutable |
Sets | not ordered | mutable |

The function `type()` can be called over any defined data-structure and returns its type: `tuple` for Tuples, `list` for Lists, `dict` for Dicts and `set` for Sets.
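
As a quick illustration of `type()`, here is a minimal sketch that defines one small example of each data structure and asks for its type. The variable names (`a_tuple`, `a_list`, `a_dict`, `a_set`) are chosen here for illustration only.

```python
# One small example of each built-in data structure
a_tuple = (10, 20, 30)
a_list = [10, 20, 30]
a_dict = {"first": 10, "second": 20}
a_set = {10, 20, 30}

print(type(a_tuple))  # <class 'tuple'>
print(type(a_list))   # <class 'list'>
print(type(a_dict))   # <class 'dict'>
print(type(a_set))    # <class 'set'>
```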

The following sections are organized as follows: 
- In Sec. [1](#tuple) Tuples (`tuple`) are introduced as the Python data-structure for _ordered_ sequence-like objects that _cannot be_ modified once defined. 
- In Sec. [2](#list) Lists (`list`) are introduced as the Python data-structure for _ordered_ sequence-like objects that _can be_ modified once defined. In this context `for` loops are introduced in Sec. [2.7](#for).
- In Sec. [3](#dict) Dicts (`dict`) are introduced as the Python data-structure for _not ordered_ collection-like objects that _can be_ modified once defined and that implement a _key-to-value_ map.
- In Sec. [4](#set) Sets (`set`) are introduced as the Python data-structure for _not ordered_ collection-like objects that _can be_ modified once defined and that contain unique elements (that is, every element appears only once). 

# 1. Tuples <a name="tuple"></a>

[Tuples](https://docs.python.org/3.7/tutorial/datastructures.html#tuples-and-sequences) consist of a number of values - of heterogeneous data-type, in general - packed together in an immutable sequence and separated by commas. 

In my experience, I haven't used tuples that often, probably because their _immutability_ goes against the dynamism of the trial-and-error phases of a typical quantitative analysis. On the other hand, for the very same reason tuples can be a good asset, as they guarantee the safety of the data stored in them.

### 1.1. Definition <a name="tuple_definition"></a>

Tuples can be defined with or without parentheses `()` surrounding the `,`-separated sequence.
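
The short sketch below illustrates the two equivalent syntaxes, plus two corner cases worth knowing: a one-element tuple needs a trailing comma, and an empty pair of parentheses is an empty tuple. All variable names are chosen here for illustration only.

```python
# The parentheses are optional: it is the comma that makes a tuple
with_parentheses = (10, 20, 30)
without_parentheses = 10, 20, 30
print(with_parentheses == without_parentheses)  # True

# A one-element tuple needs a trailing comma...
single_element = (10,)
print(type(single_element))   # <class 'tuple'>

# ...otherwise the parentheses are just grouping an integer
just_an_integer = (10)
print(type(just_an_integer))  # <class 'int'>

# An empty tuple
empty_tuple = ()
print(len(empty_tuple))       # 0
```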

In [1]:
my_tuple = (10, 20, 30)

In [2]:
print(my_tuple)

(10, 20, 30)


In [3]:
my_etherogenous_tuple = (10, "somebody's name", True, 10, True, 0.001, 'single quotes string')

In [5]:
print(my_etherogenous_tuple)

(10, "somebody's name", True, 10, True, 0.001, 'single quotes string')


In [6]:
my_integer = 10
my_string = 'a very nice string'
my_boolean_value = False

In [7]:
my_tuple_with_variables = (my_integer, my_string, my_boolean_value)

In [8]:
print(my_tuple_with_variables)

(10, 'a very nice string', False)


In [9]:
my_integer = 100

In [10]:
print(my_tuple_with_variables)

(10, 'a very nice string', False)


In [11]:
my_evaluated_tuple = (my_integer, my_integer + 100, 'my favorite string: ' + 'something')

In [12]:
print(my_evaluated_tuple)

(100, 200, 'my favorite string: something')


In [13]:
my_error_tuple = (my_integer, my_integer / 0)

ZeroDivisionError: division by zero

In [14]:
my_error_tuple

NameError: name 'my_error_tuple' is not defined

In [15]:
my_nested_tuple = (my_integer, 0.2, my_boolean_value)

In [16]:
print(my_nested_tuple)

(100, 0.2, False)


In [17]:
my_outer_tuple = ('my name', my_nested_tuple, 42)

In [18]:
print(my_outer_tuple)

('my name', (100, 0.2, False), 42)


In [19]:
my_very_crazy_tuple = ( (10, 20, 30), (100, 'nice string',  (True, False)), (0.2, 0.3, True))

In [20]:
print(my_very_crazy_tuple)

((10, 20, 30), (100, 'nice string', (True, False)), (0.2, 0.3, True))


### 1.2. Indexing and Slicing <a name="tuple_index"></a>

Tuples share a lot of properties with other sequence-like data-structure. For details take a look at [Sequence Types — list, tuple, range](https://docs.python.org/3.7/library/stdtypes.html#sequence-types-list-tuple-range) page of the Python standard library.

In particular, tuples share indexing features with strings (see [Basics_I___Data_Types.ipynb](https://github.com/gabrielepompa88/IT-For-Business-And-Finance-2019-20/blob/master/Notebooks/Basics_I___Data_Types.ipynb)) and lists (see Sec. [2](#list)).
Specifically, elements of a tuple can be accessed by _zero-based_ indexes:

In [21]:
tup = (100, 'string')

##### INDEXES START AT 0

In [23]:
tup[1]

'string'

In [24]:
nested_tuple = (10, tup, 42)

In [27]:
nested_tuple[1]

(100, 'string')

In [28]:
my_tuple = (10, 20)
another_tuple = (30, 40)

In [29]:
my_tuple + another_tuple

(10, 20, 30, 40)

In [30]:
my_tuple - another_tuple

TypeError: unsupported operand type(s) for -: 'tuple' and 'tuple'

In [31]:
my_tuple * another_tuple

TypeError: can't multiply sequence by non-int of type 'tuple'
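
The last error message hints at what `*` does support: a tuple can be multiplied by an integer, which repeats its elements. A minimal sketch, re-defining `my_tuple` so that the block is self-contained:

```python
my_tuple = (10, 20)

# Multiplying by an integer repeats the sequence (on either side of *)
repeated_three_times = my_tuple * 3
repeated_twice = 2 * my_tuple

print(repeated_three_times)  # (10, 20, 10, 20, 10, 20)
print(repeated_twice)        # (10, 20, 10, 20)

# The original tuple is left untouched: a new tuple is returned each time
print(my_tuple)              # (10, 20)
```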

In [33]:
help(my_tuple.count)

Help on built-in function count:

count(value, /) method of builtins.tuple instance
    Return number of occurrences of value.



In [34]:
my_tuple

(10, 20)

In [37]:
my_tuple.count(42)

0

In [38]:
my_long_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

In [41]:
my_long_tuple[0:5]   #### last index is NOT contained

(1, 2, 3, 4, 5)

In [43]:
my_long_tuple[0] = 10

TypeError: 'tuple' object does not support item assignment

In [44]:
my_long_tuple[0]

1

In [45]:
changing_tuple = (1, 2, 3)

In [46]:
changing_tuple

(1, 2, 3)

In [47]:
changing_tuple[2] = -1

TypeError: 'tuple' object does not support item assignment

In [48]:
changing_tuple = ( changing_tuple[0],  changing_tuple[1],  -1)

In [49]:
changing_tuple

(1, 2, -1)

Tuples can also be sliced, that is, you can select only some of their elements. The examples below use the tuple `tup = (1, 0.35, 'GBP')`.

In [6]:
tup_slice = tup[0:2] # elements from position 0 (included) to 2 (excluded)

print(tup_slice)
type(tup_slice)

(1, 0.35)


tuple

In [7]:
tup[2:5] # elements from position 2 (included) to 5 (excluded)

('GBP',)

In [8]:
tup[:2]   # elements from the beginning to position 2 (excluded) --- equivalent to tup[0:2]

(1, 0.35)

In [9]:
tup[-2:]  # elements from the second-last (included) to the end

(0.35, 'GBP')
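
Notice that `tup[2:5]` did not raise any error, even though the tuple has only three elements. The sketch below, which assumes the three-element `tup` implied by the outputs above, contrasts this forgiving behaviour of slices with plain indexing, which does raise an `IndexError`. The `try`/`except` construct is used here only to intercept the error and print it.

```python
tup = (1, 0.35, 'GBP')

# Slice bounds beyond the end are silently cut to the length of the tuple
partly_out_of_range = tup[2:5]
fully_out_of_range = tup[5:10]
print(partly_out_of_range)  # ('GBP',)
print(fully_out_of_range)   # ()

# Plain indexing is stricter: an index beyond the end raises an error
index_failed = False
try:
    tup[5]
except IndexError as error:
    index_failed = True
    print("IndexError:", error)  # IndexError: tuple index out of range
```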

### 1.3. Changing Values (and why you shouldn't do that) <a name="tuple_change"></a>

Analogously to strings - but differently from lists - tuples are _immutable_ objects.  That is, if you try to change one of their elements, you get
```python
TypeError: 'tuple' object does not support item assignment
```

In [10]:
# tup[0] = 17

In particular, you cannot simply use the `+` operator to put a new value in front of a slice, as you would do with a string to concatenate characters. That is, something like

```python
17 + tup[1:]
```
would cause the following error

```python
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'
```

that simply tells you that you cannot _add_ `int` objects (like `17`) with `tuple` objects (like the slice `tup[1:]`).

In [11]:
# 17 + tup[1:]
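
The `+` operator does work between two tuples, so wrapping `17` into a one-element tuple makes the concatenation legal. This is only a sketch: it assumes the three-element `tup` used in the slicing examples, and `new_tup` is a name chosen here for illustration. Notice that the result is a new tuple and that `tup` itself is not modified.

```python
tup = (1, 0.35, 'GBP')

# (17,) is a one-element tuple, so this is a tuple + tuple concatenation
new_tup = (17,) + tup[1:]

print(new_tup)  # (17, 0.35, 'GBP')
print(tup)      # (1, 0.35, 'GBP')  <- the original tuple is unchanged
```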

**Workaround to modify a tuple**: of course there is a workaround (and it will be clear once you have covered Sec. [2](#list) on Lists). You can:
- convert the tuple into a list using `list()` casting function, 
- change the list (which is a mutable object)
- convert the list back into a tuple using the `tuple()` casting function.

In [12]:
list_tup = list(tup) # cast tup as a list

print(list_tup)
type(list_tup)

[1, 0.35, 'GBP']


list

In [13]:
list_tup[0] = 17 # change the element

In [14]:
tup = tuple(list_tup) # cast-back as a tuple

print(tup)
type(tup)

(17, 0.35, 'GBP')


tuple

Nevertheless, if you need to modify a tuple, the real question is: why did you pack your data into a tuple in the first place? So, the take-home message is: if you need to modify a tuple, re-think your code... you're likely to need a list instead of a tuple.

### 1.4. Nested Tuples <a name="nested_tuple"></a>
**Read this section once you have covered Sec. [2](#list) on Lists**

Notice that even if the tuple itself is not mutable, its elements can consist of _mutable_ objects (such as lists) and/or _immutable_ objects (such as tuple themselves).

In [15]:
l = [87, 100, 99]          # a list
t = ("ACT/365", "ACT/360") # a tuple

nested_tup = (l, t, 100)

print(nested_tup)
type(nested_tup)

([87, 100, 99], ('ACT/365', 'ACT/360'), 100)


tuple

As we have seen, elements of `nested_tup` can be accessed through indexing:

In [16]:
print(nested_tup[0])
type(nested_tup[0])

[87, 100, 99]


list

In [17]:
print(nested_tup[1])
type(nested_tup[1])

('ACT/365', 'ACT/360')


tuple

In [18]:
print(nested_tup[2])
type(nested_tup[2])

100


int

In the same way as we have seen in Section [2.3. Nested Lists](#nested_list) for nested lists, you can also access the nested elements of list `l` and tuple `t` using nested-indexing of `nested_tup`:

In [19]:
# [0][0] is the index of the first element 
# of (list 'l' which is) the first element of the tuple 'nested_tup'
print(nested_tup[0][0]) 
type(nested_tup[0][0])

87


int

In [20]:
# [0][2] is the index of the third element 
# of (list 'l' which is) the first element of the tuple 'nested_tup'
print(nested_tup[0][2])
type(nested_tup[0][2])

99


int

In [21]:
# [1][0] is the index of the first element 
# of (tuple 't' which is) the second element of the tuple 'nested_tup'
print(nested_tup[1][0])
type(nested_tup[1][0])

ACT/365


str

In [22]:
# [1][1] is the index of the second element 
# of (tuple 't' which is) the second element of the tuple 'nested_tup'
print(nested_tup[1][1])
type(nested_tup[1][1])

ACT/360


str

**Warning**: just as for nested-indexing of lists, indexing of tuples may raise (_repetita iuvant_):

- _out of range_ `IndexError`: if you try to refer to an index that does not correspond to any element of the data structure (or of its nested data-structures, if any)

In [23]:
# produces: IndexError: tuple index out of range 
# because index 3 would refer to the 4th element of nested_tup, that does not exist.

# nested_tup[3]   

In [24]:
# produces: IndexError: list index out of range
# because index 3 would refer to the 4th element of nested_tup[0] (i.e. list 'l'), that does not exist

# nested_tup[0][3]

In [25]:
# produces: IndexError: tuple index out of range
# because index 2 would refer to the 3rd element of nested_tup[1] (i.e. tuple 't'), that does not exist

# nested_tup[1][2] 

-  _object is not subscriptable_ `TypeError`: if you try to refer with an index to an element that is not indexable (like Integers, Floats,...)

In [26]:
nested_tup[2]

100

In [27]:
# produces: TypeError: 'int' object is not subscriptable
# because we are trying to refer to the first element of nested_tup[2] (i.e. integer 100), 
# that, simply put, does not have any element inside and thus doesn't admit indexing.

# nested_tup[2][0]

Keeping in mind that if you need to modify a tuple, you're likely to have chosen the wrong data structure to pack together your data (why not opting for a `list` in the first place?), still for completeness let's briefly discuss what you can modify in a nested tuple. 

Getting back to our nested tuple, you can modify only its mutable nested elements (if any):

In [28]:
nested_tup

([87, 100, 99], ('ACT/365', 'ACT/360'), 100)

In [29]:
nested_tup[0][1] = 98
nested_tup

([87, 98, 99], ('ACT/365', 'ACT/360'), 100)

In [30]:
nested_tup[0].append(75)  # integer 75 is appended at the end of list [87, 98, 99]
nested_tup

([87, 98, 99, 75], ('ACT/365', 'ACT/360'), 100)

but you cannot explicitly re-define elements of the tuple `nested_tup` because this would contrast with the immutability of the (nested) tuple itself. 

To make an example: an explicit redefinition of the element `nested_tup[0]` of `nested_tup` like this one
```python
nested_tup[0] = [67, 89]
```
would produce
```python
TypeError: 'tuple' object does not support item assignment
```
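If you find it strange that you can change the values of the tuple's element `nested_tup[0]`, or even add values to it as above, but you cannot re-define it as a whole... well, I'm with you. The reason is that a tuple stores references to its elements: immutability forbids replacing one reference with another, but it says nothing about the referenced object, which can still change if it is mutable. Let's go ahead, you can live with this :)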
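
One way to look at this behaviour is through object identity. The sketch below uses the built-in `id()` function to check that the first element of the tuple is still the very same list object after the list has been modified; `inner_list`, `identity_before` and `identity_after` are names chosen here for illustration.

```python
inner_list = [87, 100, 99]
nested_tup = (inner_list, ('ACT/365', 'ACT/360'), 100)

# Identity of the first element before and after modifying the list in place
identity_before = id(nested_tup[0])
nested_tup[0].append(75)
identity_after = id(nested_tup[0])

print(identity_before == identity_after)  # True: still the same list object
print(nested_tup[0] is inner_list)        # True: the tuple only refers to it
print(nested_tup)  # ([87, 100, 99, 75], ('ACT/365', 'ACT/360'), 100)

# Replacing the element itself is what the tuple forbids
assignment_failed = False
try:
    nested_tup[0] = [67, 89]
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)
```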

# 2. Lists <a name="list"></a>

[Lists](https://docs.python.org/3.7/tutorial/introduction.html#lists) consist of a number of values - in general, of heterogeneous data-type - packed together in a mutable sequence and separated by commas between square brackets. 

Lists are very versatile data structures: they offer flexibility (being mutable) and feature several built-in methods that can speed up coding. 

### 2.1. Definition <a name="list_definition"></a>

Lists are defined with square brackets `[]` surrounding the `,`-separated sequence.

In [50]:
my_list = [10, 20, 30, 'a string', True, False]

In [51]:
print(my_list)

[10, 20, 30, 'a string', True, False]


In [52]:
my_nested_list = [ 10, 20, [True, 'my name'], False, [True]]

In [53]:
my_nested_list

[10, 20, [True, 'my name'], False, [True]]

In [54]:
my_list_of_one_element = [True]

In [55]:
my_list_of_one_element

[True]

In [56]:
my_crazy_list = [ (10, 20, 'name'), [10, (42, 0.3)], (True, [])]

In [57]:
my_crazy_list

[(10, 20, 'name'), [10, (42, 0.3)], (True, [])]

In [60]:
my_list[0]

10

In [61]:
my_list[1:4]

[20, 30, 'a string']

In [62]:
my_mutable_list = [42, True, 'answer']

In [63]:
my_mutable_list[1] 

True

In [64]:
my_mutable_list[1] = False

In [65]:
my_mutable_list

[42, False, 'answer']

In [66]:
nicholas_list = [ 42, 'nicholas', (29, 30) ]

In [69]:
(nicholas_list[2])[1]

30

In [70]:
help(nicholas_list.append)

Help on built-in function append:

append(object, /) method of builtins.list instance
    Append object to the end of the list.



In [71]:
my_appendage = 1337
nicholas_list.append(my_appendage)

In [72]:
print(nicholas_list)

[42, 'nicholas', (29, 30), 1337]


In [73]:
nicholas_list + 10

TypeError: can only concatenate list (not "int") to list

In [74]:
nicholas_list + [10]

[42, 'nicholas', (29, 30), 1337, 10]

In [75]:
nicholas_list

[42, 'nicholas', (29, 30), 1337]

In [76]:
help(nicholas_list.index)

Help on built-in function index:

index(value, start=0, stop=9223372036854775807, /) method of builtins.list instance
    Return first index of value.
    
    Raises ValueError if the value is not present.



In [79]:
nicholas_list.index((29, 31))

ValueError: (29, 31) is not in list

In [81]:
my_risky_tuple = ( 10, 20, [True, True] )

In [83]:
my_risky_tuple[0] = 10

TypeError: 'tuple' object does not support item assignment

In [87]:
(my_risky_tuple[2])[0] = False

In [88]:
my_risky_tuple

(10, 20, [False, True])

In [118]:
my_trojan = [True, True]

In [119]:
my_risky_tuple = (10, 20, my_trojan)

In [117]:
my_risky_tuple

(10, 20, [True, True])

In [120]:
my_trojan[0] = False

In [121]:
my_risky_tuple

(10, 20, [False, True])
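
If this side effect is not wanted, one possible precaution (a sketch, not the only option) is to store a copy of the list inside the tuple, so that later changes to the original list do not leak into it. The names `shared` and `independent` are chosen here for illustration.

```python
my_trojan = [True, True]

shared = (10, 20, my_trojan)             # the tuple refers to my_trojan itself
independent = (10, 20, list(my_trojan))  # the tuple refers to a copy of it

my_trojan[0] = False

print(shared)       # (10, 20, [False, True])  <- follows the change
print(independent)  # (10, 20, [True, True])   <- unaffected
```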

### 2.2. Indexing and Slicing <a name="list_index"></a>

Lists share a lot of properties with other sequence-like data-structures. In particular, they share _zero-based_ indexing and slicing with Strings and Tuples. The examples below use the list `lis = [1, 0.35, 'GBP']`.

In [33]:
# 0 is the index of the first element of the list
print(lis[0])
type(lis[0])

1


int

In [34]:
# -1 is the index of the last element of the list
print(lis[-1])
type(lis[-1])

GBP


str

Here is how to slice a list (yes, always the same way):

In [35]:
lis_slice = lis[0:2] # elements from position 0 (included) to 2 (excluded)

print(lis_slice)
type(lis_slice)

[1, 0.35]


list

In [36]:
lis[2:5] # elements from position 2 (included) to 5 (excluded)

['GBP']

In [37]:
lis[:2]   # elements from the beginning to position 2 (excluded) --- equivalent to lis[0:2]

[1, 0.35]

In [38]:
lis[-2:]  # elements from the second-last (included) to the end

[0.35, 'GBP']
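
A detail that becomes relevant once lists are modified: a slice is a new list, not a view on the original one. The sketch below assumes the three-element `lis` implied by the outputs above; `full_copy` is a name chosen here for illustration.

```python
lis = [1, 0.35, 'GBP']

# Changing the slice does not change the list it was taken from
lis_slice = lis[0:2]
lis_slice[0] = 17
print(lis_slice)  # [17, 0.35]
print(lis)        # [1, 0.35, 'GBP']

# Slicing without bounds returns a copy of the whole list
full_copy = lis[:]
print(full_copy == lis)  # True:  same elements
print(full_copy is lis)  # False: two distinct list objects
```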

### 2.3. Nested Lists <a name="nested_list"></a>

Lists can nest other data structures, both _mutable_ objects (such as other lists) and/or _immutable_ objects (such as tuples).

In [39]:
l = [87, 100, 99]          # a list
t = ("ACT/365", "ACT/360") # a tuple

nested_lis = [l, t, 100]

print(nested_lis)
type(nested_lis)

[[87, 100, 99], ('ACT/365', 'ACT/360'), 100]


list

As we have seen, elements of `nested_lis` can be accessed through indexing:

In [40]:
print(nested_lis[0])
type(nested_lis[0])

[87, 100, 99]


list

In [41]:
print(nested_lis[1])
type(nested_lis[1])

('ACT/365', 'ACT/360')


tuple

In [42]:
print(nested_lis[2])
type(nested_lis[2])

100


int

You can also access the elements of list `l` and tuple `t` using nested-indexing of `nested_lis`:

In [43]:
nested_lis

[[87, 100, 99], ('ACT/365', 'ACT/360'), 100]

In [44]:
# [0][0] is the index of the first element 
# of (list 'l' which is) the first element of the list 'nested_lis'
print(nested_lis[0][0]) 
type(nested_lis[0][0])

87


int

In [45]:
# [0][2] is the index of the third element 
# of (list 'l' which is) the first element of the list 'nested_lis'
print(nested_lis[0][2])
type(nested_lis[0][2])

99


int

In [46]:
# [1][0] is the index of the first element 
# of (tuple 't' which is) the second element of the list 'nested_lis'
print(nested_lis[1][0])
type(nested_lis[1][0])

ACT/365


str

In [47]:
# [1][1] is the index of the second element 
# of (tuple 't' which is) the second element of the list 'nested_lis'
print(nested_lis[1][1])
type(nested_lis[1][1])

ACT/360


str

### 2.4. Indexing of Nested Sequences <a name="nested_list_index"></a>

OK, you have understood how it works... This is actually a general rule that applies to all the sequence-like data structures: Tuples (`tuple`), Lists (`list`) and also NumPy arrays (`numpy.ndarray`, with the slightly different syntax `[i,j]` instead of `[i][j]`), which will be introduced in a future notebook.

If a sequence-like data structure, say `seq`, has nested sequence-like elements, then

```python
seq[i][j]
```

is the element of index `j` of the element of index `i`, `seq[i]`, of `seq`. That is, `seq[i][j]` is the $(j+1)$-th element of the $(i+1)$-th element, `seq[i]`, of `seq`.
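
The rule can be checked by performing the access in two separate steps. In this sketch `seq`, `inner` and `deep` are names chosen for illustration; the last lines show that the same logic extends to deeper levels of nesting.

```python
seq = [[87, 100, 99], ('ACT/365', 'ACT/360'), 100]
i, j = 1, 0

# Step by step: first take seq[i], then take its element of index j
inner = seq[i]
two_steps = inner[j]

# All at once
one_step = seq[i][j]

print(two_steps)  # ACT/365
print(one_step)   # ACT/365

# The same rule applies to deeper nesting: one pair of brackets per level
deep = (10, [20, (30, 40)])
print(deep[1][1][0])  # 30
```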

**Warning**: nested-indexing of lists may raise:

- _out of range_ `IndexError`: if you try to refer to an index that does not correspond to any element of the data structure (or of its nested data-structures, if any)

In [48]:
nested_lis

[[87, 100, 99], ('ACT/365', 'ACT/360'), 100]

In [49]:
# nested_lis[3] would produce: IndexError: list index out of range 
# because index 3 would refer to the 4th element of nested_lis, that does not exist.
# The last valid index is len(nested_lis)-1:

nested_lis[len(nested_lis)-1]   

100

In [50]:
# produces: IndexError: list index out of range
# because index 3 would refer to the 4th element of nested_lis[0] (i.e. list 'l'), that does not exist

# nested_lis[0][3]

In [51]:
# nested_lis[1][2] would produce: IndexError: tuple index out of range
# because index 2 would refer to the 3rd element of nested_lis[1] (i.e. tuple 't'), that does not exist.
# Index -1 always refers to the last element instead:

nested_lis[1][-1] 

'ACT/360'

- _object is not subscriptable_ `TypeError`: if you try to refer with an index to an element that is not indexable (like Integers, Floats,...)

In [52]:
nested_lis[2]

100

In [53]:
# produces: TypeError: 'int' object is not subscriptable
# because we are trying to refer to the first element of nested_lis[2] (i.e. integer 100), 
# that, simply put, does not have any element inside and thus doesn't admit indexing.

# nested_lis[2][0]
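
To see both errors without interrupting the execution, they can be intercepted with a `try`/`except` construct, as in the sketch below. The construct is used here only as a convenience, and the list `caught` is a name chosen for illustration.

```python
nested_lis = [[87, 100, 99], ('ACT/365', 'ACT/360'), 100]
caught = []  # names of the errors intercepted below

# Index 3 does not exist in the nested list [87, 100, 99]
try:
    nested_lis[0][3]
except IndexError as error:
    caught.append("IndexError")
    print("IndexError:", error)  # IndexError: list index out of range

# The integer 100 has no elements inside, so it cannot be indexed
try:
    nested_lis[2][0]
except TypeError as error:
    caught.append("TypeError")
    print("TypeError:", error)   # TypeError: 'int' object is not subscriptable

print(caught)  # ['IndexError', 'TypeError']
```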

### 2.5. Changing Values <a name="modify_list"></a>

Differently from strings and tuples, lists are _mutable_ objects and their elements can be changed in a straightforward way.

In [54]:
lis

[1, 0.35, 'GBP']

In [55]:
lis[0] = 17
lis

[17, 0.35, 'GBP']

Since a list is a mutable object, you can re-define any of its elements, be they mutable (e.g. nested lists) or immutable (e.g. nested tuples). Of course, you still cannot change the elements inside a nested immutable object (e.g. a tuple inside the list).

Let's go back to our `nested_lis` and suppose you want to change the first day count convention from "ACT/365" to "ACT/360"

In [56]:
nested_lis

[[87, 100, 99], ('ACT/365', 'ACT/360'), 100]

you cannot explicitly modify the element "ACT/365" of the tuple
```python
 ("ACT/365", "ACT/360")
```
because this would result in a 
```python
TypeError: 'tuple' object does not support item assignment
```

In [57]:
# produces TypeError

# nested_lis[1][0] = "ACT/360"

but you could directly re-define the whole tuple `nested_lis[1]` as it is an element of the `nested_lis` list, which is a mutable object.

In [58]:
nested_lis[1] = 0.345
nested_lis

[[87, 100, 99], 0.345, 100]

Other - mutable - elements can be modified as you want: 

In [59]:
nested_lis[0][1] = 98
nested_lis

[[87, 98, 99], 0.345, 100]

In [60]:
nested_lis[0].append(75)  # integer 75 is appended at the end of list [87, 98, 99]
nested_lis

[[87, 98, 99, 75], 0.345, 100]

In [61]:
nested_lis[2] -= 10 # x -= 10 is a short-cut for x = x - 10. Others are +=, *= and /=
nested_lis

[[87, 98, 99, 75], 0.345, 90]

### 2.6. Built-in methods <a name="list_methods"></a>

For details on built-in methods see [5.1. More on Lists](https://docs.python.org/3.7/tutorial/datastructures.html#more-on-lists) of the Python tutorial. Two particularly useful built-in methods are worth mentioning:
- `list.append(x)`: which appends element `x` to the end of the list, extending it

In [62]:
lis

[17, 0.35, 'GBP']

In [63]:
lis.append('EUR')
lis

[17, 0.35, 'GBP', 'EUR']

- `list.sort()`: which sorts a list in place, in ascending order.

Notice that the elements of the list to be sorted must be comparable with each other (e.g. all numbers or all strings), otherwise the interpreter will complain, as in this case:

```python
TypeError: '<' not supported between instances of 'str' and 'float'
```

In [64]:
# lis.sort()

In [65]:
lis[:2]

[17, 0.35]

but, for our list `lis`, the sorting will work if we define a new list as the `[17, 0.35]` slice of the original one

In [66]:
lis_slice = lis[:2]  # [17, 0.35] slice
lis_slice.sort()
lis_slice

[0.35, 17]

In [67]:
lis_slice2 = lis[-2:]
lis_slice2

['GBP', 'EUR']

In [68]:
lis_slice2.sort()
lis_slice2

['EUR', 'GBP']
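
Notice that `.sort()` modifies the list in place and returns `None`. When the original order has to be kept, the built-in function `sorted()` can be used instead, since it returns a new sorted list. A minimal sketch, where the list `prices` and the other names are chosen for illustration only:

```python
prices = [17, 0.35, 3]

# sorted() returns a new list and leaves the original one untouched
ascending = sorted(prices)
descending = sorted(prices, reverse=True)
print(ascending)   # [0.35, 3, 17]
print(descending)  # [17, 3, 0.35]
print(prices)      # [17, 0.35, 3]

# .sort() works in place and returns None
result = prices.sort()
print(result)      # None
print(prices)      # [0.35, 3, 17]
```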

### 2.7. `for` loop <a name="for"></a>

A [`for` loop](https://docs.python.org/3.7/tutorial/controlflow.html#for-statements) in Python is declared as follows:
```python
for variable in sequence:
    statement(s) possibly using variable
```

In [98]:
my_sequence = [10, 20, 30, 40, 'ciao']

In [99]:
for element in my_sequence:
    print(element)

10
20
30
40
ciao


In [102]:
for whatever_you_want in my_sequence:
    print(whatever_you_want)

10
20
30
40
ciao


In [103]:
my_nested_sequence = (10, 20, 30, ['string1', 'string2', 'string3'])

In [104]:
for thing in my_nested_sequence:
    print(thing)

10
20
30
['string1', 'string2', 'string3']


In [105]:
thing

['string1', 'string2', 'string3']

In [113]:
for value in range(0, 10, 2):
    print(value)

0
2
4
6
8


In [114]:
range(10000000000000000000000000000000000000000000000000000000000000000000000000000000)

range(0, 10000000000000000000000000000000000000000000000000000000000000000000000000000000)

In [106]:
help(range)

Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |  
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __hash__(self, /)
 |

#### 2.7.1. for loop over a list <a name="for_over_list"></a>

To make an example, we can print elements of a list:

In [69]:
x = [10, 20, 30]

for xi in x:
    print(xi)

10
20
30


Notice how this `for` loop is not _counter-based_: it is the Python interpreter that loops over the sequence, returning the current element of the sequence at each iteration.

#### 2.7.2. Counter-based looping and `range()` function <a name="for_range"></a>

On some occasions it is useful to loop over a sequence while accessing its elements through their indexes. 
This can be achieved using the [`range()` function](https://docs.python.org/3.7/tutorial/controlflow.html#the-range-function), which generates a sequence of integers as an object of the dedicated type `range`.

In [70]:
x  = range(10)

print(x)
type(x)

range(0, 10)


range

`range()` is mostly used in `for` loops as follows:

In [71]:
# a loop over the first 5 numbers from zero
for i in range(5):
    print(i)

0
1
2
3
4


In [72]:
# a loop over numbers from 1 to 9
for i in range(1,10):
    print(i)

1
2
3
4
5
6
7
8
9
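
Since a `range` object does not show its content when printed, converting it to a list with `list()` is a handy way to inspect which numbers it generates. The sketch below also shows the optional third argument, the step, which can be negative; the variable names are chosen for illustration only.

```python
first_five = list(range(5))
every_third = list(range(1, 10, 3))
counting_down = list(range(10, 0, -2))

print(first_five)     # [0, 1, 2, 3, 4]
print(every_third)    # [1, 4, 7]
print(counting_down)  # [10, 8, 6, 4, 2]

# len() works directly on a range object, without converting it
print(len(range(0, 10, 2)))  # 5
```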


But it can be used also to loop over a list, accessing its indexes:

In [73]:
# a loop over the elements of `x` (the range(10) object defined above), accessed by index
for i in range(len(x)):
    print(x[i])

0
1
2
3
4
5
6
7
8
9


In [74]:
x = [10, 20, 30]

for i in range(3):
    print(i)
    print(x[i])

0
10
1
20
2
30


where we have used the fact that `range(len(x))`, here `range(3)`, returns the numbers from `0` to `len(x)-1`, which are the first and last indexes of list `x`.

**Example**: let's get back to our Fibonacci numbers (see Sec. 3.1 while loop of [Basics_I___Data_Types.ipynb](https://github.com/gabrielepompa88/IT-For-Business-And-Finance-2019-20/blob/master/Notebooks/Basics_I___Data_Types.ipynb) for a refresher) and let's suppose we want to compute the $n$-th number $F_n$:

In [75]:
def fib_nth(n):
    """
    This function computes the n-th Fibonacci number using inline assignments.
    
    Parameters:
        n (int): which number to compute.
    
    Returns:
        F_n2 (int): n-th Fibonacci number.
    """
    
    if n <= 2:
        return n-1
    
    # inline initialization
    F_n2, F_n1 = 0, 1 # F_{n-2}, F_{n-1}

    
    for k in range(n-1):

        # print the current number to screen (comment out the line below to silence it)
        print(F_n2)

        # inline update of the last two numbers
        F_n2, F_n1 = F_n1, F_n1 + F_n2
    
    return F_n2

In [76]:
N = 10


fib_nth(N)

0
1
1
2
3
5
8
13
21


34
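
Instead of printing the intermediate numbers, they can be collected in a list with `.append()`. The function `fib_list` below is a hypothetical variant written for illustration, not part of the original notebook; it follows the same convention as `fib_nth`, in which the first Fibonacci number is 0.

```python
def fib_list(n):
    """
    Hypothetical variant of fib_nth: returns the first n Fibonacci
    numbers as a list (first number is 0, as in fib_nth).
    """
    numbers = []  # an empty list to be filled

    # inline initialization
    F_n2, F_n1 = 0, 1

    for k in range(n):
        numbers.append(F_n2)

        # inline update of the last two numbers
        F_n2, F_n1 = F_n1, F_n1 + F_n2

    return numbers


first_ten = fib_list(10)
print(first_ten)      # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(first_ten[-1])  # 34, the 10-th number
```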

#### 2.7.3. `break` Statement in loops <a name="for_break"></a>

The [`break`](https://docs.python.org/3.7/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops) statement interrupts a `while` or `for` loop before it would otherwise end. It is typically used in combination with an `if` statement, to break the loop once the `if` condition is met. 

To make a practical example, let's check whether an item is in a list looping over the list.

In [77]:
lis = [1, "A", 0.35, 1/4, "@"]

item = 0.35

for element in lis:
    
    print("Checking element {}".format(element))
    
    if element == item:
        print("Item {} found".format(item))
        break

Checking element 1
Checking element A
Checking element 0.35
Item 0.35 found


As you can see, elements `1/4` and `@` are not checked, because the `element == item` condition of the `if` statement is triggered for element `0.35`. 

For the record, the loop above is just a didactic example. If you really want to check whether an item is in a list, it's super-easy: just use the general syntax
```python
object in sequence
```
where the `in` operator performs the check and returns `True` or `False` depending on whether `object` is found in `sequence` or not. Therefore `in` is typically used in the `condition` part of an `if` statement.

In [78]:
if (item in lis):
    print("Item {} found".format(item))
else:
    print("Item {} not found".format(item))    

Item 0.35 found
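
The `in` operator can also be used outside an `if` statement, has a negated form `not in`, and works with the other sequences seen so far. A short sketch, with variable names chosen for illustration:

```python
lis = [1, "A", 0.35, 1/4, "@"]

found = 0.35 in lis
missing = "B" in lis
negated = "B" not in lis
print(found)    # True
print(missing)  # False
print(negated)  # True

# The comparison is by value: 1/4 is stored as 0.25
print(0.25 in lis)  # True

# Tuples and range objects support the same syntax
print("ACT/360" in ("ACT/365", "ACT/360"))  # True
print(7 in range(10))                       # True
```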


#### 2.7.4. `enumerate()` looping  <a name="for_enumerate"></a>

There is one more way to loop over a list, which gives access to both the indexes and the values of a sequence: the built-in function [`enumerate()`](https://docs.python.org/3.7/library/functions.html?highlight=enumerate#enumerate).  

In [79]:
x = [10, 20, 30]

for i, xi in enumerate(x):
    print("index i={}; value xi={}".format(i, xi))  # xi is equivalent to x[i]

index i=0; value xi=10
index i=1; value xi=20
index i=2; value xi=30
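
Two small additions to the example above, as a sketch: `enumerate()` accepts an optional `start` argument to begin counting from a number other than zero, and converting its result to a list shows that it simply produces (index, value) pairs. The names `position` and `pairs` are chosen for illustration.

```python
x = [10, 20, 30]

# Counting from 1 instead of 0
for position, xi in enumerate(x, start=1):
    print("position {}: value {}".format(position, xi))

# enumerate() produces (index, value) tuples
pairs = list(enumerate(x))
print(pairs)  # [(0, 10), (1, 20), (2, 30)]
```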


### 2.8. List comprehension <a name="list_comprehension"></a>

Let's suppose we want to create a list of the squares of the first `n` non-negative integers: 0, 1, 4, 9,... We could do it this way with a `for` loop:

In [80]:
def create_squares_list(n):
    
    lis = []  # an empty list

    for i in range(n):
        print(lis)
        lis += [i**2]  # equivalent to lis.append(i**2) 
    
    return lis

In [81]:
n = 10

In [82]:
x = create_squares_list(n)
x

[]
[0]
[0, 1]
[0, 1, 4]
[0, 1, 4, 9]
[0, 1, 4, 9, 16]
[0, 1, 4, 9, 16, 25]
[0, 1, 4, 9, 16, 25, 36]
[0, 1, 4, 9, 16, 25, 36, 49]
[0, 1, 4, 9, 16, 25, 36, 49, 64]


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Let's time the code (after re-defining the `create_squares_list(n)` function without the `print`, which would otherwise consume a lot of computing time):

In [83]:
def create_squares_list(n):
    
    lis = []  # an empty list

    for i in range(n):
        lis += [i**2]  # equivalent to lis.append(i**2) 
    
    return lis

In [84]:
%timeit create_squares_list(n)

3.46 µs ± 40.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


Alternatively, we can create the list through a list comprehension (see section [5.1.3. List Comprehensions](https://docs.python.org/3.7/tutorial/datastructures.html#list-comprehensions)), which is an elegant and fast way to generate lists:

In [85]:
x = [i**2 for i in range(n)]
x

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [86]:
y = [pippo**3 for pippo in range(n)]
y

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

In [87]:
9**3

729

As you can see, the result is the same. For small sizes like `n=10`, the timing is comparable too:

In [88]:
%timeit [i**2 for i in range(n)]

2.98 µs ± 38.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


But let's see what happens for bigger sizes:

In [89]:
n = int(1e6)
n

1000000

In [90]:
%timeit create_squares_list(n)

368 ms ± 3.51 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [91]:
%timeit [i**2 for i in range(n)]

334 ms ± 30.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


For such an easy definition (we are only computing squares) the improvement is marginal, but for more complex list definitions the speed improvement can be significant. Plus, everything that `create_squares_list(n)` does has been replaced by the single line of code `[i**2 for i in range(n)]`. Nice, isn't it?
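
A list comprehension can also filter elements through an optional `if` clause. As an illustrative sketch (the helper `create_even_squares_list` and the variable `m` are hypothetical, written only for this comparison), here is a loop and its one-line equivalent:

```python
def create_even_squares_list(n):
    """Hypothetical helper: squares of the even numbers in range(n), built with a loop."""
    lis = []
    for i in range(n):
        if i % 2 == 0:        # keep only even values of i
            lis.append(i**2)
    return lis

m = 10  # small size chosen for illustration

loop_version = create_even_squares_list(m)
comprehension_version = [i**2 for i in range(m) if i % 2 == 0]

print(loop_version)
print(comprehension_version)
```

Both versions produce `[0, 4, 16, 36, 64]`.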

# 3. Dicts <a name="dict"></a>

[Dicts](https://docs.python.org/3.7/tutorial/datastructures.html#dictionaries) are collections of _key: value_ pairs, where the _keys_ are _unique_ and can be of any immutable data-type.

Dictionaries are well suited to implement mapping tables or any kind of logical association between a given set of unique indexers (the keys) and data (the values).

### 3.1. Definition <a name="dict_def"></a>

Dicts are defined with curly brackets `{}` surrounding the `,`-separated list of `key: value` pairs.

In [92]:
d = {"AAA": 5, "A+": 20, "BBB": 50, "D": 100}

d

{'AAA': 5, 'A+': 20, 'BBB': 50, 'D': 100}

So, unlike lists, which are indexed by a range of numbers, dictionaries are indexed by (unique) keys.

Notice that keys must be of an immutable data-type: `str`, `int`, `float`... are fine, but `list` is not (because lists can be modified in place).

In [93]:
d1 = {
    10: 1,
    20: 2,
    30: 3
}

d1

{10: 1, 20: 2, 30: 3}

If you try to use lists as keys of a dictionary you get
```python
TypeError: unhashable type: 'list'
```

In [94]:
# d2 = {
#     ["AAA"]: 5,
#     ['A+']: 20, 
#     ['BBB']: 50, 
#     ['D']: 100
# }
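
To see the error without stopping the notebook, the failing definition can be wrapped in a `try`/`except` block. The sketch below also uses a tuple as a key, which works as long as all of its elements are immutable too; the names `d2`, `d3` and `message` are only illustrative.

```python
try:
    d2 = {["AAA"]: 5}            # a list is mutable, so it cannot be a key
except TypeError as err:
    message = str(err)           # the exact wording may vary across Python versions
    print("TypeError:", message)

# a tuple of strings is immutable, so it is a valid key
d3 = {("AAA", "2020"): 5, ("BBB", "2020"): 50}
print(d3[("AAA", "2020")])
```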

Another difference from lists is that dicts are not indexed by position. Since Python 3.7 a dict does remember the order in which its key: value pairs were inserted, but it has no `.sort()` method and there is no indexing like `dictName[0]`, `dictName[1]`, and so on: an expression such as `dictName[0]` is a lookup of the key `0`, not a request for the first pair. Values can only be retrieved through their keys.
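
A short sketch of why positional indexing does not apply: `dictName[0]` is interpreted as a lookup of the key `0`. The dictionaries below use illustrative names (`ratings`, `by_number`) so that the snippet is self-contained.

```python
ratings = {"AAA": 5, "A+": 20, "BBB": 50, "D": 100}

try:
    ratings[0]                    # looks for the key 0, which ratings does not have
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError for key:", missing_key)

# if 0 happens to be a key, by_number[0] returns its value,
# regardless of where that pair was written in the definition
by_number = {10: 1, 0: 99}
print(by_number[0])
```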

### 3.2. key-based indexing <a name="key_index"></a>

Indexing in dictionaries is implemented through their keys. Once you know the key, you can retrieve the corresponding value as `dictName[key]`. Like this:

In [95]:
d

{'AAA': 5, 'A+': 20, 'BBB': 50, 'D': 100}

In [96]:
d["AAA"]

5

Once we have a dict, representing a key-to-value map, we typically want to check whether a key is in the dictionary:

In [97]:
key_list = ["AAA", "AA", "B+", "D"]

for key in key_list:
    if key in d:
        print("key {} found".format(key))
    else:
        print("key {} not found".format(key))

key AAA found
key AA not found
key B+ not found
key D found


Notice that if you try to access a dictionary with a key it does not have, you get a `KeyError`, which simply states that the key is not among the dictionary's keys.

In [98]:
d

{'AAA': 5, 'A+': 20, 'BBB': 50, 'D': 100}

In [99]:
# raises KeyError
# d["B+"]
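
If a missing key is an expected situation, one option is the built-in dict method `.get()`, which returns a default value instead of raising `KeyError`. A minimal sketch (the dict is re-defined under the illustrative name `ratings` to keep the snippet self-contained):

```python
ratings = {"AAA": 5, "A+": 20, "BBB": 50, "D": 100}

found = ratings.get("AAA")          # key present: same result as ratings["AAA"]
missing = ratings.get("B+")         # key absent: returns None, no error
fallback = ratings.get("B+", 0)     # key absent: returns the default we pass

print(found, missing, fallback)
```

This prints `5 None 0`. Whether a silent default or an explicit `KeyError` is preferable depends on whether a missing key signals a bug in your data.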

### 3.3. Changing Values <a name="modify_dict"></a>

If you want to change a value corresponding to a given key, you just assign to it a new value: `dictName[key] = newValue`

In [100]:
d["AAA"] = 1
d

{'AAA': 1, 'A+': 20, 'BBB': 50, 'D': 100}
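
The same assignment syntax adds a new pair when the key is not yet in the dictionary, and the `del` statement removes a pair. A short sketch on an illustrative copy of the dict (named `ratings` here):

```python
ratings = {"AAA": 1, "A+": 20, "BBB": 50, "D": 100}

ratings["B+"] = 70      # "B+" is a new key: the pair is added
ratings["D"] = 99       # "D" already exists: its value is overwritten
del ratings["A+"]       # removes the pair with key "A+"

print(ratings)
```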

### 3.4. Built-in methods <a name="dict_methods"></a>

Some useful methods:
- `dict.keys()` returns a view of the dict keys;
- `dict.values()` returns a view of the dict values;
- `dict.items()` returns a view of the key: value pairs, packed as tuples.

In [101]:
d.keys()

dict_keys(['AAA', 'A+', 'BBB', 'D'])

In [102]:
d.values()

dict_values([1, 20, 50, 100])

In [103]:
d.items()

dict_items([('AAA', 1), ('A+', 20), ('BBB', 50), ('D', 100)])
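
As the `dict_keys(...)`, `dict_values(...)` and `dict_items(...)` outputs suggest, these methods do not return plain lists. When an actual list is needed, the result can be passed to `list()`. A small sketch, again on an illustrative copy named `ratings`:

```python
ratings = {"AAA": 1, "A+": 20, "BBB": 50, "D": 100}

key_list = list(ratings.keys())        # a plain list of the keys
value_total = sum(ratings.values())    # the result can be fed to functions that consume iterables
item_list = list(ratings.items())      # a plain list of (key, value) tuples

print(key_list)
print(value_total)
print(item_list)
```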

### 3.5. Looping over dicts <a name="dict_loop"></a>

Looping can be done in several ways. The basic one is looping over the keys of the dictionary, which can be done in this way:

In [104]:
for key in d:
    print("key: {} - value: {}".format(key, d[key]))

key: AAA - value: 1
key: A+ - value: 20
key: BBB - value: 50
key: D - value: 100


Python recognizes that `d` is a dictionary and loops over its keys, assigning each one in turn to the `key` variable.
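
Another common way, shown here as a sketch, loops over `.items()` and unpacks each `(key, value)` tuple directly, which avoids the extra `d[key]` lookup. The list `lines` is an illustrative addition used only to collect the formatted strings.

```python
d = {"AAA": 1, "A+": 20, "BBB": 50, "D": 100}   # same content as above, re-defined for completeness

lines = []
for key, value in d.items():
    lines.append("key: {} - value: {}".format(key, value))

print("\n".join(lines))
```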

# 4. Sets <a name="set"></a>

A [set](https://docs.python.org/3.7/tutorial/datastructures.html#sets) is an unordered collection with no duplicate elements.

Sets are typically used to store unique values and to check whether a value is among them. They also support basic set operations like union, intersection, etc.

### 4.1. Definition <a name="set_def"></a>

Sets are defined with curly brackets `{}` surrounding a `,`-separated list of elements:

In [105]:
my_set = {1, 2, 2, 4,  6, 6}
my_set

{1, 2, 4, 6}

Alternatively, you can use the built-in `set()` function:

In [106]:
my_set = set([1, 2, 2, 4,  6, 6])
my_set

{1, 2, 4, 6}

Notice how repeated values are automatically stored only once.
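
One detail worth knowing: empty curly brackets `{}` create an empty dict, not an empty set, so an empty set has to be created with `set()`. A quick sketch (the variable names are illustrative):

```python
empty_dict = {}        # this is a dict
s = set()              # this is an empty set

print(type(empty_dict).__name__)
print(type(s).__name__)

# elements can then be added one at a time; adding a duplicate has no effect
s.add(2)
s.add(2)
s.add(6)
print(s)
```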

### 4.2. Test for membership <a name="set_membership"></a>

Checking whether an element is in the set can be done easily:

In [107]:
element_list = [i for i in range(10)]
element_list
for element in element_list:
    if element in my_set:
        print("element {} found".format(element))
    else:
        print("element {} not found".format(element))

element 0 not found
element 1 found
element 2 found
element 3 not found
element 4 found
element 5 not found
element 6 found
element 7 not found
element 8 not found
element 9 not found


In [108]:
element_list

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

### 4.3. Set operations <a name="set_operations"></a>

Let's review basic set operations:

In [109]:
set1 = set([i for i in range(10)])    # yes, you can use list comprehension here
set2 = {i**2 for i in range(10)}      # yes, this is another way to create a set: "set-comprehension"

In [110]:
set1

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [111]:
set2

{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}

#### 4.3.1. `.union()`  <a name="union"></a>

The union of two sets is the set of elements in `set1`, `set2`, or both

In [112]:
set1.union(set2)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 16, 25, 36, 49, 64, 81}

or alternatively

In [113]:
set1 | set2

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 16, 25, 36, 49, 64, 81}

#### 4.3.2. `.intersection()`  <a name="intersection"></a>

The intersection of two sets is the set of elements in common between `set1` and `set2`

In [114]:
set1.intersection(set2)

{0, 1, 4, 9}

or alternatively

In [115]:
set1 & set2

{0, 1, 4, 9}

#### 4.3.3. `.difference()`  <a name="difference"></a>

The difference of two sets is the set of elements in `set1` but not in `set2`

In [116]:
set1.difference(set2)  # notice that this is different from set2.difference(set1)

{2, 3, 5, 6, 7, 8}

or alternatively

In [117]:
set1 - set2

{2, 3, 5, 6, 7, 8}
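
The difference is not symmetric: swapping the two sets gives a different result. A short sketch, with the two sets re-defined to keep it self-contained (the names `only_in_set1` and `only_in_set2` are illustrative):

```python
set1 = set(range(10))
set2 = {i**2 for i in range(10)}

only_in_set1 = set1 - set2     # elements of set1 that are not in set2
only_in_set2 = set2 - set1     # elements of set2 that are not in set1

# sorted() is used only to print the elements in a predictable order
print(sorted(only_in_set1))
print(sorted(only_in_set2))
```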

### 4.4. Getting rid of duplicates from a list <a name="set_duplicates_list"></a>

To conclude, sets are often used to get rid of duplicates in a list:

In [118]:
lis = [(-1)**n for n in range(10)]
lis

[1, -1, 1, -1, 1, -1, 1, -1, 1, -1]

In [119]:
set(lis)

{-1, 1}
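
Note that the result is a set, so the order of the original list is lost. If a list without duplicates is needed, two possible approaches are sketched below: `sorted(set(...))` when a sorted result is acceptable, and an explicit loop when the order of first appearance must be kept. The list `values` and the other names are made up for illustration.

```python
values = [3, 1, 3, 2, 1]   # illustrative list with duplicates

unique_sorted = sorted(set(values))     # duplicates removed, result sorted

# keep the first occurrence of each element, preserving the original order
seen = set()
unique_in_order = []
for item in values:
    if item not in seen:
        seen.add(item)
        unique_in_order.append(item)

print(unique_sorted)
print(unique_in_order)
```

Here `unique_sorted` is `[1, 2, 3]`, while `unique_in_order` is `[3, 1, 2]`.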