# 2. Data Structures

In this second section we learn about the data structures lists, dictionaries, tuples, and sets.
This includes

* when to use which structure,
* how to change the elements of the data structures,
* how indexing in Python works,
* nested structures (e.g. a list in a list) and
* a little bit more on strings.

Keywords: ```list```, ```dict```, ```len```, ```append```, ```extend```, ```insert```,  ```remove```,
```del```, ```help```,
```pop```, ```sort```, ```::2```, ```split```, ```items```, ```keys```, ```values```, ```tuples```, 
```add```, ```discard```

***
## Lists

A list is an ordered sequence of values, where each element is indexed by an integer.

In [1]:
days = [ 'Monday','Tuesday','Wednesday' ]
print(days)

['Monday', 'Tuesday', 'Wednesday']


In [2]:
days[0]

'Monday'

#### Note
that the indexing in Python starts with 0. 

In [3]:
type(days)

list

In [4]:
len(days)

3

#### Note
that with ```len``` the length of the list is obtained.

In [5]:
days[1]

'Tuesday'

In [6]:
days[2]

'Wednesday'

In [7]:
days[3]

IndexError: list index out of range
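
The error above follows from the zero-based indexing: the valid non-negative indices of a list run from `0` to `len(days) - 1`. A minimal sketch of that relation (the names `last_valid_index` and `index` are only illustrative):

```python
days = ['Monday', 'Tuesday', 'Wednesday']

# The last valid index is always one less than the length of the list
last_valid_index = len(days) - 1
print(last_valid_index)        # 2
print(days[last_valid_index])  # Wednesday

# An index is safe to use if it is smaller than the length
index = 3
print(index < len(days))       # False, so days[3] raises an IndexError
```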

#### Lists are mutable

In the following we apply list __methods__, which let you manipulate list objects.
You call a list method by writing the list name followed by a dot and the name of the
method.

Here we see how to change the content of a list. All the changes happen _in place_, i.e. you don't need to 
assign the result back to the list.

In [8]:
days.append('Friday')
print(days)

['Monday', 'Tuesday', 'Wednesday', 'Friday']


In [9]:
days.extend( ['Saturday', 'Sunday'] )
print(days)

['Monday', 'Tuesday', 'Wednesday', 'Friday', 'Saturday', 'Sunday']


In [10]:
days.insert(3, 'Thursday')
print(days)

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


In [11]:
days.pop()

'Sunday'

In [12]:
days

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

In [13]:
days.insert(0, 'Friday')
print(days)

['Friday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']


In [14]:
days.remove('Friday')
print(days)

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']


#### Note
that the ```remove``` method only removes the first occurrence of the specified element. Let's perform 
the removal again:

In [15]:
days.remove('Friday')
print(days)

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Saturday']


In [16]:
del days[0]
print(days)

['Tuesday', 'Wednesday', 'Thursday', 'Saturday']


#### Note 
that you can also delete by index with ```del```. 

In [17]:
help(list.pop)

Help on method_descriptor:

pop(self, index=-1, /)
    Remove and return item at index (default last).
    
    Raises IndexError if list is empty or index is out of range.



In [20]:
test_list = [1,2,3,4]
test_list.pop(2)

3

In [21]:
help(list.insert)

Help on method_descriptor:

insert(self, index, object, /)
    Insert object before index.



#### Note

that you can use ```help``` to display some documentation and
learn e.g. a bit more about particular methods for lists.
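
The removal operations used so far differ in what they expect and what they return. The following sketch puts them side by side on a small example list (the list `letters` is made up for illustration):

```python
letters = ['a', 'b', 'c', 'b', 'd']

# remove: takes a value and deletes its first occurrence
letters.remove('b')
print(letters)           # ['a', 'c', 'b', 'd']

# pop: takes an index, deletes the element and returns it
popped = letters.pop(0)
print(popped, letters)   # a ['c', 'b', 'd']

# del: takes an index and deletes the element without returning it
del letters[2]
print(letters)           # ['c', 'b']
```

Use ```pop``` when you still need the removed element, and ```remove``` when you know the value but not its position.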


Let's add the missing days again:

In [22]:
days.insert(0,'Monday')
days.insert(4,'Friday')
days.append('Sunday')
print(days)

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


You can initialise an empty list with ```list()``` or ```[]```.

In [39]:
numbers = []    # numbers = list()
print("The initialised list is empty:", numbers)
numbers.extend( [14,2,5,1,101] )
print("Now we filled it with numbers:", numbers)

The initialised list is empty: []
Now we filled it with numbers: [14, 2, 5, 1, 101]


In [34]:
numbers.sort()
print(numbers)

numbers.sort(reverse=True)
print(numbers)

[1, 2, 5, 14, 101]
[101, 14, 5, 2, 1]
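
Because ```sort``` works in place, the original order is lost. If the original order is still needed, the built-in function ```sorted``` (not used elsewhere in this section) returns a new sorted list instead. A short sketch of the difference:

```python
numbers = [14, 2, 5, 1, 101]

# sorted() returns a new list and leaves the original untouched
ordered = sorted(numbers)
print(ordered)   # [1, 2, 5, 14, 101]
print(numbers)   # [14, 2, 5, 1, 101]

# sort() changes the list itself and returns None
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 5, 14, 101]
```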


In [35]:
numbers + days

[101,
 14,
 5,
 2,
 1,
 'Monday',
 'Tuesday',
 'Wednesday',
 'Thursday',
 'Friday',
 'Saturday',
 'Sunday']

In [40]:
print(numbers)
print(days)

[14, 2, 5, 1, 101]
['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


In [41]:
result = numbers.extend(days)
print(result)
print(numbers)

None
[14, 2, 5, 1, 101, 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


#### Note
that ```extend``` modifies the list in place and returns ```None```, so there is no need to
assign its result. You get the same list with

```python
numbers.extend(days)
```

In [26]:
numbers*2

[101, 14, 5, 2, 1, 101, 14, 5, 2, 1]

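#### Note
that ```numbers*2``` returns a new list and leaves ```numbers``` unchanged. You would get the same
elements, stored in ```numbers``` itself, with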

```python
numbers.extend(numbers)
```
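
The operators ```+``` and ```*``` create new lists, whereas ```extend``` changes an existing one. The following sketch shows the difference on two small example lists (```a``` and ```b``` are made-up names):

```python
a = [1, 2]
b = [3, 4]

# + builds a new list, a and b stay as they are
combined = a + b
print(combined)  # [1, 2, 3, 4]
print(a)         # [1, 2]

# * also builds a new list
doubled = b * 2
print(doubled)   # [3, 4, 3, 4]
print(b)         # [3, 4]

# extend changes a itself
a.extend(b)
print(a)         # [1, 2, 3, 4]
```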

### More on indexing

In [27]:
days

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

In [28]:
days[1:3]

['Tuesday', 'Wednesday']

In [42]:
days[:5]

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

In [43]:
weekend = days[5:]
print(weekend)

['Saturday', 'Sunday']


In [46]:
days[::2]

['Monday', 'Wednesday', 'Friday', 'Sunday']

#### Note 
that you can also use negative indices to index a list from the end.

In [47]:
days[-1]

'Sunday'
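
All slices above are special cases of the general form ```list[start:stop:step]```, where ```stop``` is excluded and omitted parts fall back to their defaults. A few more combinations, including negative values, as a sketch:

```python
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

print(days[-2:])    # the last two elements: ['Saturday', 'Sunday']
print(days[:-2])    # everything except the last two elements
print(days[1:6:2])  # ['Tuesday', 'Thursday', 'Saturday']
print(days[::-1])   # the whole list in reversed order
```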

***
## Nested lists

A list can itself contain lists. You can, for example, think of a matrix as a list of lists, as in the following example.

In [48]:
matrix = [ [1,2] , [3,4] ]

In [49]:
matrix

[[1, 2], [3, 4]]

In [50]:
matrix[0]

[1, 2]

In [51]:
matrix[0][0]

1

In [52]:
matrix[1][1]

4

In [53]:
matrix[1][1] = 400
print(matrix)

[[1, 2], [3, 400]]


#### Note 
that nested lists can be as deep as you need them. For example, you can construct a 
three-dimensional matrix (a tensor) with nested lists.

In [54]:
tensor = [ [ [1,2],[3,4] ] ,[[5,6],[7,8] ] , [[9,10],[11,12]] ]
tensor

[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
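
Each additional pair of square brackets steps one level deeper into the nesting. A brief sketch using the same tensor:

```python
tensor = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]

print(len(tensor))      # 3, the number of 2x2 matrices
print(tensor[2])        # [[9, 10], [11, 12]]
print(tensor[2][0])     # [9, 10]
print(tensor[2][0][1])  # 10
```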

In [55]:
new_list = [ 'a', 3, 1.5, [1,2,3] ]

In [56]:
print(new_list)

['a', 3, 1.5, [1, 2, 3]]


In [57]:
new_list.sort()

TypeError: '<' not supported between instances of 'int' and 'str'

In [58]:
'a' < 3

TypeError: '<' not supported between instances of 'str' and 'int'
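
Sorting fails here because Python cannot compare elements of different types. One possible workaround, shown only as a sketch, is the optional ```key``` argument of ```sort```: it names a function that is applied to every element before comparing. Using ```str``` as the key (an illustrative choice, not the only one) compares the text representation of each element:

```python
new_list = ['a', 3, 1.5, [1, 2, 3]]

# Compare the elements by their string representation:
# '1.5' < '3' < '[1, 2, 3]' < 'a'
new_list.sort(key=str)
print(new_list)  # [1.5, 3, [1, 2, 3], 'a']
```

Whether such an ordering is meaningful depends on the data. In most cases it is cleaner to keep elements of the same type in one list.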

***
## A little bit more on strings

You can think of a string as a sequence where each character, including spaces, has an index. For

In [59]:
string = "This is an example"

the indices would be:

| T | h | i | s | _ | i | s | _ | a | n | _ | e | x | a | m | p | l | e |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |

In [60]:
string[0]

'T'

In [61]:
string[:7]

'This is'

In [63]:
string[4]

' '

In [64]:
new_string = string + " we would like to extend"
new_string

'This is an example we would like to extend'

In [65]:
new_string.split(" ")

['This', 'is', 'an', 'example', 'we', 'would', 'like', 'to', 'extend']

In [66]:
string[0] = 't'

TypeError: 'str' object does not support item assignment
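
Unlike lists, strings are immutable, so a single character cannot be replaced in place. A common way around this, sketched below, is to build a new string from slices of the old one (the name ```lowered``` is only illustrative):

```python
string = "This is an example"

# Combine a new first character with the rest of the original string
lowered = 't' + string[1:]
print(lowered)  # this is an example
print(string)   # This is an example (unchanged)

print(len(string))  # 18, spaces count as characters
```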

In [67]:
integer_string = '123'
integer_string*3

'123123123'

In [68]:
len(integer_string)

3

***
## Dictionaries

A dictionary is a data structure that stores __key-value__ pairs. 
Suppose you would like to encode a table of country calling codes such as 

| Country | Code |
|---|---|
| France | 33 |
| Germany | 49 |
| Italy | 39 |
| Switzerland | 41 |
| UK | 44 |

In [70]:
country_codes = {'Switzerland': 41, 'France': 33, 'Italy': 39, 
                 'UK': 44, 'Germany': 49}
country_codes

{'Switzerland': 41, 'France': 33, 'Italy': 39, 'UK': 44, 'Germany': 49}

In [71]:
country_codes['Italy']

39

Dictionaries have special methods to query the contained information:

In [72]:
country_codes.items()

dict_items([('Switzerland', 41), ('France', 33), ('Italy', 39), ('UK', 44), ('Germany', 49)])

In [73]:
country_codes.values()

dict_values([41, 33, 39, 44, 49])

In [74]:
country_codes.keys()

dict_keys(['Switzerland', 'France', 'Italy', 'UK', 'Germany'])
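
The objects returned by ```items```, ```values``` and ```keys``` are views of the dictionary and cannot be indexed directly. If you need indexing, you can convert them with ```list()```. A sketch with a shortened dictionary, assuming Python 3.7 or newer, where dictionaries keep their insertion order:

```python
country_codes = {'Switzerland': 41, 'France': 33, 'Italy': 39}

keys = list(country_codes.keys())
values = list(country_codes.values())
pairs = list(country_codes.items())

print(keys)      # ['Switzerland', 'France', 'Italy']
print(values)    # [41, 33, 39]
print(pairs[0])  # ('Switzerland', 41), each item is a (key, value) tuple
print(len(country_codes))  # 3
```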

Adding new entries to the dictionary is straightforward:

In [75]:
country_codes['Spain'] = 34
print(country_codes)

{'Switzerland': 41, 'France': 33, 'Italy': 39, 'UK': 44, 'Germany': 49, 'Spain': 34}


In [76]:
del country_codes['UK']
country_codes

{'Switzerland': 41, 'France': 33, 'Italy': 39, 'Germany': 49, 'Spain': 34}

In [77]:
country_codes['UK']

KeyError: 'UK'

In [78]:
country_codes.get('Italy')

39

In [80]:
country_codes.get('UK')

In [81]:
new_var = country_codes.get('UK')
print(new_var)

None


In [82]:
country_codes

{'Switzerland': 41, 'France': 33, 'Italy': 39, 'Germany': 49, 'Spain': 34}

In [83]:
dict_pop = country_codes.pop("Italy", None)

In [84]:
print(dict_pop)

39


In [85]:
country_codes

{'Switzerland': 41, 'France': 33, 'Germany': 49, 'Spain': 34}

In [86]:
dict_pop = country_codes.pop("UK", 999)
print(dict_pop)

999


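#### Note 
that ```get``` and ```pop``` are useful methods to access keys that 
may be missing from a dictionary. For a missing key, ```get``` returns ```None``` 
(or a default you pass as second argument) and ```pop``` returns the default you pass, 
instead of raising a ```KeyError```. Without a default, ```pop``` still raises a ```KeyError```.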
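
The behaviour for missing keys can be summarised in a short sketch. The ```try```/```except``` construct is used here only to catch the error and is not part of this section's material; the name ```missing_key_raised``` is illustrative:

```python
country_codes = {'Switzerland': 41, 'France': 33}

# get: returns None or the given default for a missing key
print(country_codes.get('UK'))        # None
print(country_codes.get('UK', 0))     # 0

# pop with a default: returns the default for a missing key
print(country_codes.pop('UK', None))  # None

# pop without a default: raises a KeyError for a missing key
missing_key_raised = False
try:
    country_codes.pop('UK')
except KeyError:
    missing_key_raised = True
print(missing_key_raised)             # True
```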

In [87]:
nested_dict = {'Switzerland': {'Capital': 'Bern', 
                               'Country_Code': 41}, 
               'France': {'Capital': 'Paris', 
                          'Country_Code': 33}, 
               'Italy': {'Capital': 'Rome', 
                         'Country_Code': 39}, 
               'Germany': {'Capital': 'Berlin', 
                           'Country_Code': 49}
              }

In [88]:
nested_dict['France']

{'Capital': 'Paris', 'Country_Code': 33}

In [89]:
nested_dict['France']['Capital']

'Paris'

Checking if a key is present in the dictionary:

In [90]:
'France' in nested_dict

True

In [91]:
'UK' in nested_dict

False

In [92]:
nested_dict

{'Switzerland': {'Capital': 'Bern', 'Country_Code': 41},
 'France': {'Capital': 'Paris', 'Country_Code': 33},
 'Italy': {'Capital': 'Rome', 'Country_Code': 39},
 'Germany': {'Capital': 'Berlin', 'Country_Code': 49}}

In [93]:
other_dict = {1: 'a', 2: 'b'}
other_dict

{1: 'a', 2: 'b'}

***
## Other data structures: Tuples & sets



In [29]:
my_tuple = (1,2,3,4)
print(my_tuple)

(1, 2, 3, 4)


In [30]:
len(my_tuple)

4

In [31]:
my_tuple[1]

2

#### Note
that tuples are immutable: you can't change the contained elements.

In [33]:
my_tuple[1] = 100

TypeError: 'tuple' object does not support item assignment

In [34]:
convert2list = list(my_tuple)
print(convert2list)
convert2list[1] = 100
print(convert2list)

[1, 2, 3, 4]
[1, 100, 3, 4]


In [98]:
my_set = {'a','b','c','d','c','b','a'}
print(my_set)

{'c', 'a', 'b', 'd'}


In [99]:
my_set.add('e')
print(my_set)

{'a', 'c', 'b', 'e', 'd'}


In [100]:
my_set.discard('a')
print(my_set)

{'c', 'b', 'e', 'd'}


In [101]:
list(my_set)

['c', 'b', 'e', 'd']
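
As the outputs above show, a set keeps each element only once and does not guarantee any particular order. A typical use, sketched here with made-up names, is to remove duplicates from a list and to test for membership:

```python
letters = ['a', 'b', 'c', 'd', 'c', 'b', 'a']

# Converting to a set drops the duplicates
unique_letters = set(letters)
print(len(letters), len(unique_letters))  # 7 4

# Membership test
print('a' in unique_letters)              # True

# discard does not complain if the element is missing
unique_letters.discard('z')

# sorted() gives a list with a well-defined order
print(sorted(unique_letters))             # ['a', 'b', 'c', 'd']
```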

### Overview

In [102]:
country_codes_dict  = {'Switzerland': 41, 'France': 33, 'Italy': 39, 'UK': 44, 'Germany': 49}
country_codes_list  = ['Switzerland', 'France', 'Italy', 'UK', 'Germany']
country_codes_tuple = ('Switzerland', 'France', 'Italy', 'UK', 'Germany')
country_codes_set   = {'Switzerland', 'France', 'Italy', 'UK', 'Germany'}
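
A rough sketch of how the four structures differ in use: a dictionary is accessed by key, lists and tuples by position, and a set only answers whether an element is contained. Lists, dictionaries and sets can be changed afterwards, tuples cannot.

```python
country_codes_dict  = {'Switzerland': 41, 'France': 33, 'Italy': 39, 'UK': 44, 'Germany': 49}
country_codes_list  = ['Switzerland', 'France', 'Italy', 'UK', 'Germany']
country_codes_tuple = ('Switzerland', 'France', 'Italy', 'UK', 'Germany')
country_codes_set   = {'Switzerland', 'France', 'Italy', 'UK', 'Germany'}

print(country_codes_dict['Italy'])   # 39, access by key
print(country_codes_list[2])         # Italy, access by position
print(country_codes_tuple[2])        # Italy, access by position
print('Italy' in country_codes_set)  # True, membership only

# Adding an element to the mutable structures
country_codes_dict['Spain'] = 34
country_codes_list.append('Spain')
country_codes_set.add('Spain')
print(len(country_codes_dict), len(country_codes_list), len(country_codes_set))  # 6 6 6
```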

***
## Some caveats

In [103]:
matrix = [ [1,2],[3,4] ]

copy_matrix = matrix
print(copy_matrix)

[[1, 2], [3, 4]]


In [104]:
copy_matrix[1][1] = 40
print("The copied matrix looks like\n", copy_matrix)
print("But also the original matrix looks now like\n", matrix)

The copied matrix looks like
 [[1, 2], [3, 40]]
But also the original matrix looks now like
 [[1, 2], [3, 40]]


In [105]:
print(matrix)

[[1, 2], [3, 40]]


In [108]:
print(copy_matrix)

[[1, 2], [3, 40]]


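#### What

is the problem here? The assignment ```copy_matrix = matrix``` does not create a copy. Both names refer to the same list object, as the following check shows: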

In [107]:
copy_matrix is matrix

True
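
To obtain an independent copy, one option is the ```copy``` module of the standard library, which is not covered elsewhere in this section. The sketch below compares a plain assignment, the list method ```copy``` (a shallow copy, which still shares the inner lists) and ```copy.deepcopy``` (which also copies the inner lists). The names ```alias```, ```shallow``` and ```deep``` are illustrative:

```python
import copy

matrix = [[1, 2], [3, 4]]

alias = matrix                # second name for the same object
shallow = matrix.copy()       # new outer list, shared inner lists
deep = copy.deepcopy(matrix)  # fully independent copy

matrix[1][1] = 40             # change an inner list
matrix.append([5, 6])         # change the outer list

print(alias is matrix)  # True
print(alias)            # [[1, 2], [3, 40], [5, 6]]
print(shallow)          # [[1, 2], [3, 40]]
print(deep)             # [[1, 2], [3, 4]]
```

For nested lists, only the deep copy is unaffected by later changes to the original.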

#### Note

that you should not use variable names which are 
already used by Python in some way. For example, 
do not use built-in function names as variable names.

So something like the following would be a bad idea:

```Python
list = [1, 10, 100]
dict = {'a':1, 'b':2}
print = "Just a string"
```

***
## Exercise section

(1.) Rearrange the following list so that the resulting list displays all integers from 1 to 8, i.e. 
```[1, 2, 3, 4, 5, 6, 7, 8]```.

In [21]:
new_numbers = [9,7,6,2]

Put your solution here:

In [22]:
del new_numbers[0] # Variant 3
print(new_numbers)
new_numbers.extend( [1,3,4,5,8] )
new_numbers.sort()
print(new_numbers)
# new_numbers.pop() # Variant 1
# new_numbers.remove(9) # Variant 2

[7, 6, 2]
[1, 2, 3, 4, 5, 6, 7, 8]


In [23]:
print("Your solution is", new_numbers)

Your solution is [1, 2, 3, 4, 5, 6, 7, 8]


***

(2.) Add to the dictionary 

In [24]:
constants = {'c': 299792458}

the following additional constants 

* $\pi$ = 3.14159
* $e$ = 2.71828
* $\phi$ = 1.61803

where you use _pi, e, phi_ as the keys and the respective floats as the values.

Put your solution here:

In [27]:
constants["pi"] = 3.14159
constants["e"] = 2.71828
constants["phi"] = 1.61803

In [28]:
print(constants)

{'c': 299792458, 'pi': 3.14159, 'e': 2.71828, 'phi': 1.61803}


***
## Proposed Solution

(1.) Rearrange the following list so that the resulting list displays all integers from 1 to 8:

In [1]:
new_numbers = [9,7,6,2]

Put your solution here:

In [2]:
new_numbers.extend( [1,3,4,5,8] )
new_numbers.sort()
new_numbers.remove(9)

In [3]:
print("Your solution is", new_numbers)

Your solution is [1, 2, 3, 4, 5, 6, 7, 8]


***

(2.) Add to the dictionary 

In [4]:
constants = {'c': 299792458}

the following additional constants 

* $\pi$ = 3.14159
* $e$ = 2.71828
* $\phi$ = 1.61803

where you use _pi, e, phi_ as the keys and the respective floats as the values.

Put your solution here:

In [5]:
constants['pi'] = 3.14159
constants['e'] = 2.71828
constants['phi'] = 1.61803

In [6]:
print(constants)

{'c': 299792458, 'pi': 3.14159, 'e': 2.71828, 'phi': 1.61803}


In [7]:
constants = {'c': 299792458}
print("Before update:", constants)
constants.update( {'pi': 3.14159, 'e': 2.71828, 'phi': 1.61803} )
print("After update:", constants)

Before update: {'c': 299792458}
After update: {'c': 299792458, 'pi': 3.14159, 'e': 2.71828, 'phi': 1.61803}
