# Basics II - Data Structures

# Table of contents

[Executive Summary](#summary)
1. [Tuples](#tuple)
2. [Lists](#list)\
    2.1 [for loop](#for)
3. [Dicts](#dict)
4. [Sets](#set)

### **Resources**: 

- [_Python for Finance (2nd ed.)_](http://shop.oreilly.com/product/0636920117728.do): Sec. 3.Basic Data Structures (Section 3.Excursus: Functional Programming is optional)
- [_The Python Tutorial_](https://docs.python.org/3.7/tutorial/): Sec. 3.1.3 (Lists), 4.2 (for Statements), 4.3 (The range() Function), 4.4 (break and continue Statements, and else Clauses on Loops), 5.1 (More on Lists), 5.3 (Tuples and Sequences), 5.4 (Sets), 5.5 (Dictionaries)

# Executive Summary <a name="summary"></a>

Intuitively, a _data structure_ is an object containing other objects, not necessarily of the same _data type_.

Standard Python provides four basic data structures, which can be differentiated at a high level by being:
- _ordered_ or _not ordered:_ that is, whether they preserve the order in which entries are added or not;
- _mutable_ or _immutable:_ that is, whether - once defined - they can be modified or not.

These data structures are:

data-structure | ordered (or not) | mutable (or not)
--- | --- | ---
Tuples  | ordered | immutable |
Lists | ordered | mutable |
Dicts | not ordered (no positional index; insertion order is preserved since Python 3.7) | mutable |
Sets | not ordered | mutable |

The function `type()` can be called over any defined data-structure and returns its type: `tuple` for Tuples, `list` for Lists, `dict` for Dicts and `set` for Sets.
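
As a quick check of what `type()` returns, the sketch below builds one small instance of each data structure. The variable names and values are illustrative only and are not used elsewhere in this notebook.

```python
# One illustrative instance per data structure (values are arbitrary)
tup_example = (1, 0.35, "GBP")
list_example = [1, 0.35, "GBP"]
dict_example = {"ccy": "GBP", "rate": 0.35}
set_example = {"GBP", "EUR"}

print(type(tup_example))    # <class 'tuple'>
print(type(list_example))   # <class 'list'>
print(type(dict_example))   # <class 'dict'>
print(type(set_example))    # <class 'set'>
```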

The following sections are organized as follows: 
- In Sec. [1](#tuple) Tuples (`tuple`) are introduced as the Python data-structure for _ordered_ sequence-like objects that _cannot be_ modified once defined. 
- In Sec. [2](#list) Lists (`list`) are introduced as the Python data-structure for _ordered_ sequence-like objects that _can be_ modified once defined. In this context `for` loops are introduced in Sec. [2.1](#for).
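- In Sec. [3](#dict) Dicts (`dict`) are introduced as the Python data-structure for _not ordered_ collection-like objects (entries are accessed by key rather than by position, although insertion order is preserved since Python 3.7) that _can be_ modified once defined and that implement a _key-to-value_ map.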
- In Sec. [4](#set) Sets (`set`) are introduced as the Python data-structure for _not ordered_ collection-like objects that _can be_ modified once defined and that contain unique elements (that is, every element appears only once). 

# 1. Tuples <a name="tuple"></a>

Informally, a tuple consists of a number of values, in general of heterogeneous data types, separated by commas and packed together in an immutable sequence. 

Tuples can be defined with or without parentheses `()` surrounding the `,`-separated sequence.

In [26]:
tup = (1, 0.35, "GBP")

print(tup)
type(tup)

(1, 0.35, 'GBP')


tuple

In [2]:
tup = 1, 0.35, "GBP"

print(tup)
type(tup)

(1, 0.35, 'GBP')


tuple

Tuples share indexing features with strings (see [Basics_I___Data_Types.ipynb](Notebooks/Basics_I___Data_Types.ipynb)) and lists (see Sec. [2](#list)).
In particular, elements of a tuple can be accessed by _zero-based_ indexes:

In [6]:
# 0 is the index of the first element of the tuple
print(tup[0])
type(tup[0])

1


int


In [7]:
# -1 is the index of the last element of the tuple
print(tup[-1])
type(tup[-1])

GBP


str

and tuples can be sliced. That is, you can select only some of the elements of the tuple.

In [8]:
tup_slice = tup[0:2] # elements from position 0 (included) to 2 (excluded)

print(tup_slice)
type(tup_slice)

(1, 0.35)


tuple

In [10]:
tup[2:5] # elements from position 2 (included) to 5 (excluded)

('GBP',)

In [11]:
tup[:2]   # elements from the beginning to position 2 (excluded); equivalent to tup[0:2]

(1, 0.35)

In [12]:
tup[-2:]  # elements from the second-last (included) to the end

(0.35, 'GBP')
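
Two details of the slices above may deserve a quick sketch: a slice whose bounds exceed the length of the tuple is silently clipped rather than raising an error, and a one-element tuple is written (and printed) with a trailing comma. The names `fx`, `clipped`, `single`, `not_a_tuple` and `empty` are illustrative stand-ins.

```python
fx = (1, 0.35, "GBP")

clipped = fx[2:5]        # only index 2 exists, so the slice stops there
print(clipped)           # ('GBP',)
print(len(clipped))      # 1

single = ("GBP",)        # the trailing comma makes this a tuple
not_a_tuple = ("GBP")    # without the comma this is just a str in parentheses
print(type(single), type(not_a_tuple))

empty = fx[5:10]         # a slice entirely out of range gives an empty tuple
print(empty)             # ()
```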

Like strings, and unlike lists, tuples are _immutable_. That is, if you try to change one of a tuple's elements, you get
```python
TypeError: 'tuple' object does not support item assignment
```

In [15]:
# tup[0] = 17
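
If you would rather see the error without interrupting the notebook, one option is to wrap the assignment in a `try`/`except` block, as in the sketch below. Note that `try`/`except` is not covered in this notebook, and the names `prices` and `message` are illustrative.

```python
prices = (1, 0.35, "GBP")

try:
    prices[0] = 17            # item assignment is not allowed on a tuple
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

print(prices)                 # (1, 0.35, 'GBP'): the tuple is unchanged
```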

In particular, you cannot simply use the `+` operator as you would do with a string to concatenate characters. Namely, something like

```python
17 + tup[1:]
```
would cause the following error

```python
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'
```

that simply tells you that you cannot _add_ an `int` object (like `17`) to a `tuple` object (like the slice `tup[1:]`). Nevertheless, there is a workaround: read below once you know about the `list` data structure.

In [24]:
# 17 + tup[1:]
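
Besides the list-based workaround, the `+` operator does work between two tuples, so wrapping `17` in a one-element tuple is a possible alternative. This sketch goes slightly beyond the text; `quote` and `new_quote` are illustrative names.

```python
quote = (1, 0.35, "GBP")

# (17,) is a one-element tuple; tuple + tuple builds a brand-new tuple
new_quote = (17,) + quote[1:]

print(new_quote)   # (17, 0.35, 'GBP')
print(quote)       # (1, 0.35, 'GBP'): the original tuple is untouched
```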

---

**Read this once you have covered Sec. [2](#list) on Lists**: even if you cannot directly change an element of a tuple, you can: 
- use the `list()` _casting_ function to cast the tuple as a list
- modify the list
- re-cast it back as tuple using the casting function `tuple()`

In [27]:
list_tup = list(tup) # cast tup as a list

print(list_tup)
type(list_tup)

[1, 0.35, 'GBP']


list

In [28]:
list_tup[0] = 17 # change the element

In [30]:
tup = tuple(list_tup) # cast-back as a tuple

print(tup)
type(tup)

(17, 0.35, 'GBP')


tuple

---

**Read this once you have covered Sec. [2](#list) on Lists**: notice that even if the tuple itself is not mutable, its elements may be mutable objects (such as lists) and/or immutable objects (such as tuples themselves).

In [39]:
l = [1, 0.5, "ITA"]        # a list
t = ("ACT/365", "ACT/360") # a tuple

nested_tup = (l, t, 100)

print(nested_tup)
type(nested_tup)

([1, 0.5, 'ITA'], ('ACT/365', 'ACT/360'), 100)


tuple
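
One consequence is that a list stored inside a tuple can still be changed in place, while the slots of the tuple itself cannot be reassigned. A minimal sketch, using illustrative names (`inner_list`, `outer`, `first`, `slot_error`) rather than the notebook's own variables:

```python
inner_list = [1, 0.5, "ITA"]
outer = (inner_list, ("ACT/365", "ACT/360"), 100)

first = outer[0]            # the list stored in the first slot of the tuple
first[0] = 99               # allowed: this modifies the list, not the tuple
print(outer)                # ([99, 0.5, 'ITA'], ('ACT/365', 'ACT/360'), 100)
print(inner_list)           # [99, 0.5, 'ITA']: it is the very same list object

try:
    outer[0] = [0, 0, 0]    # not allowed: reassigning a slot of the tuple
except TypeError as err:
    slot_error = str(err)
    print("TypeError:", slot_error)
```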

As we have seen, elements of a tuple such as `nested_tup` can be accessed through indexing:

In [40]:
print(nested_tup[0])
type(nested_tup[0])

[1, 0.5, 'ITA']


list

In [41]:
print(nested_tup[1])
type(nested_tup[1])

('ACT/365', 'ACT/360')


tuple

In [42]:
print(nested_tup[2])
type(nested_tup[2])

100


int

You can also access the elements of list `l` and tuple `t` using a nested-indexing syntax:

In [44]:
# [0][0] is the index of the first element 
# of (list 'l' which is) the first element of the tuple 'nested_tup'
print(nested_tup[0][0]) 
type(nested_tup[0][0])

1


int

In [45]:
# [0][2] is the index of the third element 
# of (list 'l' which is) the first element of the tuple 'nested_tup'
print(nested_tup[0][2])
type(nested_tup[0][2])

ITA


str

In [47]:
# [1][0] is the index of the first element 
# of (tuple 't' which is) the second element of the tuple 'nested_tup'
print(nested_tup[1][0])
type(nested_tup[1][0])

ACT/365


str

In [48]:
# [1][1] is the index of the second element 
# of (tuple 't' which is) the second element of the tuple 'nested_tup'
print(nested_tup[1][1])
type(nested_tup[1][1])

ACT/360


str

By now the pattern should be clear. This is actually a general rule that applies to Tuples (`tuple`), Lists (`list`) and NumPy arrays (`numpy.ndarray`, which we'll talk about in a future notebook), that is, to three of the most common sequence-like data structures used in Python.

If a sequence-like data structure, say `seq`, has nested sequence-like elements, then

```python
seq[i][j]
```

will point to the element of index `j` of the element of index `i` of `seq`.
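
The same rule can be sketched on a list (rather than a tuple) with nested elements. Here `seq`, `i`, `j` and `inner` are illustrative names, and the values are borrowed from the examples above.

```python
seq = [("ACT/365", "ACT/360"), [1, 0.5, "ITA"]]
i, j = 1, 2

inner = seq[i]       # first step: the element of index i
print(inner)         # [1, 0.5, 'ITA']
print(inner[j])      # ITA

# Chained indexing is the same two steps written together
print(seq[i][j])     # ITA
```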

**Warning**: 

- if you try to refer to an index that does not correspond to any element of the data structure (or of its nested data structures, if any), the Python interpreter raises an _out of range_ `IndexError`

In [65]:
# produces: IndexError: tuple index out of range 
# because index 3 would refer to the 4th element of nested_tup, that does not exist.

# nested_tup[3]   

In [66]:
# produces: IndexError: list index out of range
# because index 3 would refer to the 4th element of nested_tup[0] (i.e. list 'l'), that does not exist

# nested_tup[0][3]

In [67]:
# produces: IndexError: tuple index out of range
# because index 2 would refer to the 3rd element of nested_tup[1] (i.e. tuple 't'), that does not exist

# nested_tup[1][2] 

- if you try to index an object that is not indexable (like Integers, Floats,...), the Python interpreter raises an _object is not subscriptable_ `TypeError`

In [63]:
nested_tup[2]

100

In [68]:
# produces: TypeError: 'int' object is not subscriptable
# because we are trying to refer to the first element of nested_tup[2] (i.e. integer 100), 
# that, in plain words, does not have any element inside and thus doesn't admit indexing.

# nested_tup[2][0]
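
Both kinds of error can be observed without stopping execution by catching them, as in the hedged sketch below. `try`/`except` is not covered in this notebook; `nested`, `index_msg` and `type_msg` are illustrative names, with `nested` holding the same values as `nested_tup`.

```python
nested = ([1, 0.5, "ITA"], ("ACT/365", "ACT/360"), 100)

try:
    nested[3]                 # only indexes 0, 1 and 2 exist
except IndexError as err:
    index_msg = str(err)
    print("IndexError:", index_msg)

try:
    nested[2][0]              # nested[2] is the int 100, which cannot be indexed
except TypeError as err:
    type_msg = str(err)
    print("TypeError:", type_msg)
```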

# 2. Lists <a name="list"></a>

## 2.1. for loop <a name="for"></a>

# 3. Dicts <a name="dict"></a>

# 4. Sets <a name="set"></a>