# Containers

**Table of contents**<a id='toc0_'></a>    
- 1. [Lists](#toc1_)    
  - 1.1. [Slicing a list](#toc1_1_)    
  - 1.2. [Referencing](#toc1_2_)    
- 2. [Tuples](#toc2_)    
- 3. [Dictionaries](#toc3_)    
- 4. [Summary](#toc4_)    
- 5. [Extra](#toc5_)    
  - 5.1. [SimpleNamespace](#toc5_1_)    


* A **container** is a more complex type of variable than an atomic type.  
* It is an object that holds several other objects, for instance objects of atomic types.  
* Containers are therefore also called **collection types**. 
* **Types of containers**
    * Lists
    * Tuples 
    * Dictionaries
    * Pandas data frames
    * ...

## 1. <a id='toc1_'></a>[Lists](#toc0_)

A first example is a **list**.  A list contains **elements** each **referencing** some data in memory.

In [1]:
x = [1,'abc'] 
# variable x references a list type object with elements
# referencing 1 and 'abc'

print(x,'is a', type(x))

[1, 'abc'] is a <class 'list'>


The **length** (size) of a list can be found with the **len** function.

In [2]:
print(f'the number of elements in x is {len(x)}')

the number of elements in x is 2


A list is **subscriptable**, and indexing starts, as for all sequences in Python, from **index 0**. Beware!

In [3]:
print(x[0]) # 1st element 
print(x[1]) # 2nd element

1
abc
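
Indexing also works from the end of the list, and an index outside the list raises an error. The following is a minimal sketch of both cases; the list content and the variable names (`first`, `last`, `also_last`, `message`) are chosen only for illustration.

```python
x = [1, 'abc', 2.5]

first = x[0]               # index 0 is the first element
last = x[-1]               # negative indices count from the end
also_last = x[len(x) - 1]  # the same element, written with a positive index

print(first, last, also_last)

try:
    x[3]                   # valid indices are 0, 1 and 2
except IndexError as err:
    message = str(err)
    print('IndexError:', message)
```

Note that `x[len(x)]` is always out of range, because the last valid index is `len(x) - 1`.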


A list is **mutable**, i.e. you can change its elements on the fly.  
That is, you can change its **references** to objects.

In [4]:
x[0] = 'def'
x[1] = 2
print('x =', x, 'has id =',id(x))

# Change x[1]
x[1] = 5
print('x =', x, 'has id =',id(x))

x = ['def', 2] has id = 140276796362624
x = ['def', 5] has id = 140276796362624


You can also add more elements:

In [5]:
x.append('wtf') # add new element to end of list
print(x)

['def', 5, 'wtf']


**Link:** [Why is 0 the first index?](http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.html)  
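
Changing or appending elements modifies the list in place, which is why the id stayed the same in the cells above. As a small sketch of the contrast (variable names are illustrative only), concatenation with `+` builds a new list object instead:

```python
x = [1, 'abc']
list_id = id(x)

x.append(2.5)                          # in-place: same list object
same_after_append = id(x) == list_id

x[0] = 'def'                           # in-place: same list object
same_after_assign = id(x) == list_id

x = x + ['new']                        # builds a new list and rebinds x to it
same_after_concat = id(x) == list_id

print(same_after_append, same_after_assign, same_after_concat)  # True True False
print(x)                                                        # ['def', 'abc', 2.5, 'new']
```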

### 1.1. <a id='toc1_1_'></a>[Slicing a list](#toc0_)

A list is **sliceable**, i.e. you can extract a sub-list from a list.

In [6]:
x = [0,1,2,3,4,5]
print(x[0:3]) # x[0] included, x[3] not included
print(x[1:3])
print(x[:3])
print(x[1:])
print(x[:99]) # slice bounds beyond the end are clipped, so this is not an error (x[99] would be)
print(x[:-1]) # x[-1] is the last element, so this is everything except the last element

print(type(x[:-1])) # slicing yields a list
print(type(x[-1])) # indexing a single element yields the element itself

[0, 1, 2]
[1, 2]
[0, 1, 2]
[1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4]
<class 'list'>
<class 'int'>


**Explanation:** 
* Slices are half-open intervals: the start index is included, the stop index is not. 
* ``x[i:i+n]`` starts at element ``x[i]`` and creates a list of (up to) ``n`` elements.
* This is convenient when you have calculated ``i`` and know you need ``n`` elements. 

In [7]:
# splitting a list at x[3] and x[5] is: 
print(x[0:3])
print(x[3:5])
print(x[5:])

[0, 1, 2]
[3, 4]
[5]
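
Because slices are half-open, consecutive slices `x[i:i+n]` neither overlap nor leave gaps. The sketch below uses this to cut a list into chunks; the chunk length `n` and the names `chunks` and `rejoined` are assumptions made for the example.

```python
x = [0, 1, 2, 3, 4, 5, 6]
n = 3  # illustrative chunk length

# each slice starts where the previous one stopped
chunks = [x[i:i+n] for i in range(0, len(x), n)]
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]

# gluing the chunks back together recovers the original list
rejoined = []
for chunk in chunks:
    rejoined.extend(chunk)
print(rejoined == x)  # True
```

The last chunk is shorter than `n` because slice bounds beyond the end of the list are clipped.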


**Question**: Consider the following code:

In [9]:
x = [0,1,2,3,4,5]
print(x[-4:-2])

[2, 3]


What is the result of `print(x[-4:-2])`?

- **A:** [1,2,3]
- **B:** [2,3,4]
- **C:** [2,3]
- **D:** [3,4]
- **E:** Don't know
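
One way to reason about negative slice bounds is to add `len(x)` to each of them. A short sketch of this equivalence (the names `a` and `b` are illustrative):

```python
x = [0, 1, 2, 3, 4, 5]

a = x[-4:-2]
b = x[len(x)-4:len(x)-2]  # the same slice with positive bounds, i.e. x[2:4]

print(a, b, a == b)  # [2, 3] [2, 3] True
```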

### 1.2. <a id='toc1_2_'></a>[Referencing](#toc0_)
**Container types, incl. lists, are non-atomic** 
* Several variables can refer to the **same** list.
* If you change the data of a list through **one** variable, the change is visible through **all** variables referring to it.
* Variables referring to the same object have the same id.

In [10]:
x = [1,2,3]
print('initial x =',x)
print('id of x is',id(x))
y = x # y now references the same list as x
print('id of y is',id(y))
y[0] = 2 # change the first element in the list y
print('x =',x) # x is also changed because it references the same list as y

initial x = [1, 2, 3]
id of x is 140276779120256
id of y is 140276779120256
x = [2, 2, 3]


If you want to know if two variables contain the same reference, use the **is** operator. 

In [11]:
print(y is x) 
z = [1,2]
w = [1,2] 
print(z is w) # z and w have the same numerical content, but do not reference the same object. 

True
False


**Conclusion:** The `=` sign copies the reference, not the content!
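
The difference between identity and content can be made explicit by comparing `is` with `==`. A minimal sketch, with variable names chosen for illustration:

```python
z = [1, 2]
w = [1, 2]  # a separate list with equal content
v = z       # a second reference to the same list as z

print(z == w)  # True:  same content
print(z is w)  # False: different objects
print(z is v)  # True:  same object
```

In short, `==` asks whether two objects have equal content, while `is` asks whether two variables reference one and the same object.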

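Atomic types are immutable. An operation such as `z += 5` therefore does not change the object in place; it creates a new object and rebinds the variable to it.

In [12]:
z = 10
w = z
print(z is w) # w references the same object as z
z += 5
print(z, w)
print(z is w) # the augmented assignment rebound z to a new object; w is unchanged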

True
15 10
False


If one variable is deleted, the other one still references the list.

In [13]:
del x # delete the variable x
print(y)

[2, 2, 3]


Containers can be **copied** with the `copy` module:

In [14]:
from copy import copy

x = [1,2,3]
y = copy(x) # y now a copy of x
y[0] = 2
print(y)
print(x) # x is not changed when y is changed
print(x is y) # as they are not the same reference

[2, 2, 3]
[1, 2, 3]
False


or by slicing:

In [15]:
x = [1,2,3]
y = x[:] # y now a copy of x
y[0] = 2
print(y)
print(x) # x is not changed when y is changed

[2, 2, 3]
[1, 2, 3]
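
Besides `copy(x)` and `x[:]`, lists can also be copied with the built-in `list()` constructor or the list method `.copy()`. These two are not used elsewhere in this notebook; the sketch below only illustrates that all variants produce a new list with equal content.

```python
from copy import copy

x = [1, 2, 3]

# four ways of making a (shallow) copy of a list
copies = [copy(x), x[:], list(x), x.copy()]

for y in copies:
    print(y == x, y is x)  # True False

all_equal = all(y == x for y in copies)
none_same = all(y is not x for y in copies)
```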


**Advanced**: A **deepcopy** is necessary when the list contains mutable objects that the copy should not share with the original.

In [16]:
from copy import deepcopy

a = [1,2,3]
x = [a,2,3] # x is a list of a list and two integers
y1 = copy(x) # y1 is now a (shallow) copy of x
y2 = deepcopy(x) # y2 is a deep copy

a[0] = 10 # change1
x[-1] = 1 # change2
print(x) # Both changes happened
print(y1) # y1[0] references the same list as x[0]. Only change1 happened 
print(y2) # y2[0] is a copy of the original list referenced by x[0]

[[10, 2, 3], 2, 1]
[[10, 2, 3], 2, 3]
[[1, 2, 3], 2, 3]
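
The difference between the two kinds of copy can also be checked with the `is` operator. In this sketch, which reuses the names from the cell above, the shallow copy is a new outer list that still shares the inner list, while the deep copy shares nothing:

```python
from copy import copy, deepcopy

a = [1, 2, 3]
x = [a, 2, 3]
y1 = copy(x)      # shallow copy: new outer list, same inner list
y2 = deepcopy(x)  # deep copy: new outer list and new inner list

print(y1 is x, y1[0] is a)  # False True
print(y2 is x, y2[0] is a)  # False False
print(y2[0] == a)           # True: equal content, different object
```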


**Question**: Consider the following code:

In [17]:
x = [1,2,3]
y = [x,x]
z = x
z[0] = 3
z[2] = 1
print(y[0])

[3, 2, 1]


What is the result of `print(y[0])`?

- **A:** 1
- **B:** 3
- **C:** [3,2,1]
- **D:** [1,2,3]
- **E:** Don't know

## 2. <a id='toc2_'></a>[Tuples](#toc0_)

* A **tuple** is an **immutable list**.
* Tuples are created with round parentheses, `t = (1,3,9)`.
* As with lists, elements are accessed with square brackets, `t[0]`. 
* **Immutable:** `t[0]=10` will produce an error.
* We use tuples to pass variables around that should not change by accident.  
* **Functions** **return** a tuple if you return multiple values.
* Tuples can also be used as arguments to functions.

In [18]:
x = (1,2,3) # note: parentheses instead of square brackets
print('x =',x,'is a',type(x))
print('x[2] =', x[2], 'is a', type(x[2]))
print('x[:2] =', x[:2], 'is a', type(x[:2]))

x = (1, 2, 3) is a <class 'tuple'>
x[2] = 3 is a <class 'int'>
x[:2] = (1, 2) is a <class 'tuple'>


But it **cannot be changed** (it is immutable):

In [20]:
try: # try to run this block
    x[0] = 2
    print('did succeed in setting x[0]=2')
except TypeError: # item assignment on a tuple raises a TypeError; run this block instead
    print('did NOT succeed in setting x[0]=2')
print(x)

did NOT succeed in setting x[0]=2
(1, 2, 3)
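
The points about functions at the start of this section can be sketched as follows. The functions `min_and_max` and `add` are hypothetical helpers written only for this example.

```python
def min_and_max(values):
    """Hypothetical helper: return the smallest and the largest element."""
    return min(values), max(values)  # the comma packs the two values into a tuple

result = min_and_max([3, 1, 2])
print(result, type(result))  # (1, 3) <class 'tuple'>

lo, hi = result              # tuple unpacking into two variables
print(lo, hi)                # 1 3

def add(a, b):
    """Hypothetical helper: add two numbers."""
    return a + b

args = (2, 5)
total = add(*args)           # * unpacks the tuple into the arguments a and b
print(total)                 # 7
```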


## 3. <a id='toc3_'></a>[Dictionaries](#toc0_)

* A **dictionary** is a **key-based** container. 
* In contrast, lists and tuples use numerical indices.
* Initialized with curly brackets. `d={}` is an empty dictionary.
* Should be used when you need to look up data quickly by a name.
* Archetypal example: a **phone book**.  
    You know the name of a person (*key*), and want the phone number (*value*).
* Frequent use: want to make a **collection** of variables or parameters used in a model. 
* **Keys:** All hashable objects are valid keys, which includes immutable built-in types such as `str` and `int`.
* **Values:** Fully unrestricted; any object can be a value. 

In [24]:
x = {'abc': 1.2, 'D': 1, 'ef': 2.74, 'G': 30} # Create a dictionary
print("x['abc'] =", x['abc']) # Extracting content
x['abc'] = 100 # Changing content
print("x['abc'] =", x['abc'])

x['abc'] = 1.2
x['abc'] = 100


Elements of a dictionary are **extracted** using their key. The key can be stored in a variable:

In [25]:
key = 'abc'
value = x[key]
print(value)

100


**Content is deleted** using its key:

In [26]:
print(x)
del x['abc']
print(x)

{'abc': 100, 'D': 1, 'ef': 2.74, 'G': 30}
{'D': 1, 'ef': 2.74, 'G': 30}
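
Which objects can serve as keys can be checked directly. In this minimal sketch (the dictionary `d` and the flag `list_key_ok` are illustrative names), strings, integers and tuples are accepted, while a list, being mutable, is rejected with a `TypeError`:

```python
d = {}
d['name'] = 1            # str key
d[42] = 'int key'        # int key
d[(1, 2)] = 'tuple key'  # a tuple of integers also works as a key

try:
    d[[1, 2]] = 'list key'  # a list is mutable and cannot be a key
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(list_key_ok)  # False
print(len(d))       # 3
```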


**Task:** Create a dictionary called `capitals` with the capital names of Denmark, Sweden and Norway as values and country names as keys.

In [33]:
capitals = {'denmark':'copenhagen', 'sweden':'stockholm', 'norway':'oslo'}
print(capitals)
print('capital of denmark is =', capitals['denmark'])
print('capital of sweden is =', capitals['sweden'])
print('capital of norway is =', capitals['norway'])

{'denmark': 'copenhagen', 'sweden': 'stockholm', 'norway': 'oslo'}
capital of denmark is = copenhagen
capital of sweden is = stockholm
capital of norway is = oslo


**Answer:**

In [34]:
capitals = {}
capitals['denmark'] = 'copenhagen'
capitals['sweden'] = 'stockholm'
capitals['norway'] = 'oslo'

capital_of_sweden = capitals['sweden']
print(capital_of_sweden)

stockholm
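
Looking up a key that is not in the dictionary raises a `KeyError`. As a small sketch, the `in` operator tests whether a key exists, and the dictionary method `.get()` returns a fallback value instead of raising an error. The key `'finland'` and the fallback `'unknown'` are only examples.

```python
capitals = {'denmark': 'copenhagen', 'sweden': 'stockholm', 'norway': 'oslo'}

has_sweden = 'sweden' in capitals    # membership test on the keys
has_finland = 'finland' in capitals
print(has_sweden, has_finland)       # True False

try:
    capitals['finland']              # the key does not exist
    missing = False
except KeyError:
    missing = True
print(missing)                       # True

fallback = capitals.get('finland', 'unknown')  # no error, returns the fallback
print(fallback)                                # unknown
```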


**Note:** All atomic types are immutable, and among them only strings are subscriptable.

In [35]:
x = 'abcdef'
print(x[:3])
print(x[3:5])
print(x[5:])
try:
    x[0] = 'f'
except TypeError:
    print('strings are immutable')

abc
de
f
strings are immutable
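
Since a string cannot be changed in place, the usual approach is to build a new string from slices of the old one. A minimal sketch (the name `y` is illustrative):

```python
x = 'abcdef'

y = 'f' + x[1:]  # new string: 'f' followed by everything after the first character
print(y)         # fbcdef
print(x)         # abcdef (the original string is untouched)
```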


## 4. <a id='toc4_'></a>[Summary](#toc0_)

The new central concepts are:

1. Containers (lists, tuples, dictionaries)
2. Mutable/immutable
3. Slicing of lists and tuples
4. Referencing (copy and deepcopy)
5. Key-value pairs for dictionaries

## 5. <a id='toc5_'></a>[Extra](#toc0_)

Other interesting containers can be found in [**collections**](https://docs.python.org/3/library/collections.html), see also for example [**sets**](https://docs.python.org/3/library/stdtypes.html#frozenset).

### 5.1. <a id='toc5_1_'></a>[SimpleNamespace](#toc0_)

A [SimpleNamespace](https://docs.python.org/3/library/types.html) is sometimes handy as a replacement for a dictionary. It has fewer capabilities (it is not really a container) but offers shorter syntax, as it replaces `x['abc']` with `x.abc`.

In [None]:
from types import SimpleNamespace

In [None]:
x = {'abc': 1.2, 'D': 1, 'ef': 2.74, 'G': 30} # Re-create the dictionary
y = SimpleNamespace(**x)  # Turn it into a SimpleNamespace
# Alternatively, it can be created as SimpleNamespace(abc=1.2, D=1, ef=2.74, G=30)

print(y)

Now we reference the elements using dot notation:

In [None]:
print('From Dictionary:', x['abc'])
print('From SimpleNamespace:', y.abc)

print('From Dictionary:', x['G'])
print('From SimpleNamespace:', y.G)

The only reason for this transformation is the simpler syntax, as you avoid writing `['']`.
However, you do lose some dictionary capabilities that you will learn about later in the course. You can easily turn a SimpleNamespace back into a dictionary:

In [None]:
y_asdict = vars(y) # vars(y) returns y.__dict__, the dictionary holding the attributes
print(type(y_asdict))
print(y_asdict)
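
The dictionary obtained from a SimpleNamespace is a reference to the namespace's own data, not a copy, so the points about referencing from the section on lists apply here too. The sketch below assumes a small namespace with made-up content; `y_view` and `y_copy` are illustrative names.

```python
from types import SimpleNamespace

y = SimpleNamespace(abc=1.2, D=1)

y.abc = 100             # attributes can be reassigned, like dictionary values
y.new = 'hello'         # new attributes can be added on the fly

y_view = vars(y)        # the namespace's own attribute dictionary (same as y.__dict__)
y_copy = dict(vars(y))  # an independent (shallow) copy of it

y_view['D'] = 2         # changing the view changes y as well
print(y.D)              # 2
print(y_copy['D'])      # 1
```

If the dictionary should be independent of the namespace, wrapping it in `dict()` as above (or using `copy`) is one option.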