
# Data Structures

Data structures are containers that group one or more values into a single object, so that related data can be stored and handled together.

**Python's four basic data structures are:**
 * Lists
 * Dictionaries
 * Tuples
 * Sets

 Note that DataFrames are not a built-in part of Python. Although they are widely used, they come from the `pandas` library.

## Lists 
Lists are defined by square brackets `[]` with elements separated by commas. They can have elements of any data type.

Lists are arguably the most used data structure in Python.

### List syntax 
<code> L = [item_1, item_2, ..., item_n] </code>

### Mutability
Lists are ***mutable***. They can be changed after creation.
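
A minimal sketch of what mutability means in practice. The list name `loads` and its values are made up for illustration:

~~~python
# Hypothetical list used only for illustration
loads = [10, 20, 30]

loads[0] = 15        # replace the first element in place
loads.append(40)     # add an element to the end
loads.remove(20)     # remove the first occurrence of the value 20

print(loads)         # [15, 30, 40]
~~~

In each step the same list object is modified; no new list is created.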

### List examples
Some examples of lists:

In [2]:
# List with integers
a = [10, 20, 30, 40]

# Multiple data types in the same list
b = [1, True, 'SNOTEL', 4.3]                       

# List of lists
c = [['Nested', 'lists'], ['are', 'possible']]  

# Print the lists
print("List a:", a)
print("List b:", b)
print("List c:", c)

List a: [10, 20, 30, 40]
List b: [1, True, 'SNOTEL', 4.3]
List c: [['Nested', 'lists'], ['are', 'possible']]


## Dictionaries
Dictionaries store key/value pairs. They are created with curly brackets `{}` or by calling the `dict` function. A value is fetched by querying the corresponding key. Referring to data by logically named keys instead of list indexes makes the code more readable.

### Dictionary syntax    
    
<code> d = {key1: value1, key2: value2, ..., key_n: value_n} </code>

<code> d = dict(key1= value1, key2 = value2, ..., key_n = value_n) </code>
    
Note that values can be of any data type like floats, strings etc., but they can also be lists or other data structures.

**Keys must be unique within the dictionary**. Otherwise a key would not identify a single value. See the section on extracting values from dictionaries below.

Keys also must be of an immutable type.
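
More precisely, a key must be hashable, which the built-in immutable types (numbers, strings, tuples of immutable items) are. The sketch below, with made-up data, shows a tuple working as a key and a list being rejected:

~~~python
# A tuple is immutable, so it can serve as a dictionary key
coordinates = {(0, 0): 'origin', (1, 2): 'point A'}   # hypothetical data
print(coordinates[(1, 2)])

# A list is mutable (unhashable), so using it as a key raises a TypeError
try:
    invalid = {[0, 0]: 'origin'}
except TypeError as error:
    print("TypeError:", error)
~~~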
 
### Mutability
Dictionaries are ***mutable***. They can be changed after creation.

### Dictionary examples
Some examples of dictionaries:

In [3]:
# Strings as keys and numbers as values
d1 = {'axial_force': 319.2, 'moment': 74, 'shear': 23}      
d1_dict = dict(axial_force=319.2, moment=74, shear=23)

# Strings as keys and lists as values
d2 = {'Point1': [1.3, 51, 10.6], 'Point2': [7.1, 11, 6.7]}  
d2_dict = dict(Point1=[1.3, 51, 10.6], Point2=[7.1, 11, 6.7])

# Keys of different types (int and str, don't do this!)
d3 = {1: True, 'hej': 23}                                   

The first two dictionaries above follow a consistent pattern. For `d1` the keys are strings and the values are numbers. For `d2` the keys are strings and the values are lists. These are well-structured dictionaries.

However, `d3` has keys of mixed types! The first key is an integer and the second is a string. This is valid syntax, but it is rarely a good idea.

As with much of Python, the flexibility is very nice, but mixing many different types in the same data structure can be confusing. To make code more readable, it is usually better to keep a consistent pattern throughout the dictionary, i.e. all keys are of the same type and all values are of the same type, as in `d1` and `d2`.

The keys and values can be extracted separately by the methods `dict.keys()` and `dict.values()`:

In [5]:
#print the keys in the dictionaries
print("d1 keys:", d1.keys())
print("d1_dict keys:", d1_dict.keys())
print("d2 keys:", d2.keys())
print("d2_dict keys:", d2_dict.keys())
print("d3 keys:", d3.keys())



d1 keys: dict_keys(['axial_force', 'moment', 'shear'])
d1_dict keys: dict_keys(['axial_force', 'moment', 'shear'])
d2 keys: dict_keys(['Point1', 'Point2'])
d2_dict keys: dict_keys(['Point1', 'Point2'])
d3 keys: dict_keys([1, 'hej'])


In [6]:
#print the values in the dictionaries
print("d1 values:", d1.values())
print("d1_dict values:", d1_dict.values())
print("d2 values:", d2.values())
print("d2_dict values:", d2_dict.values())
print("d3 values:", d3.values())

d1 values: dict_values([319.2, 74, 23])
d1_dict values: dict_values([319.2, 74, 23])
d2 values: dict_values([[1.3, 51, 10.6], [7.1, 11, 6.7]])
d2_dict values: dict_values([[1.3, 51, 10.6], [7.1, 11, 6.7]])
d3 values: dict_values([True, 23])


In [8]:
#Call items in the dictionaries and print them
print("d1 axial_force:", d1['axial_force'])
print("d1_dict force:", d1_dict['axial_force'])
print("d2 Point1:", d2['Point1'])
print("d2_dict Point1:", d2_dict['Point1'])
print("d3 key 1:", d3[1])

d1 axial_force: 319.2
d1_dict force: 319.2
d2 Point1: [1.3, 51, 10.6]
d2_dict Point1: [1.3, 51, 10.6]
d3 key 1: True
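
Because dictionaries are mutable, entries can be added, overwritten and removed after creation, and `dict.items()` returns the keys and values together. A short sketch, where the dictionary `forces` and its values are hypothetical:

~~~python
forces = {'axial_force': 319.2, 'moment': 74}   # hypothetical values

forces['shear'] = 23          # add a new key/value pair
forces['moment'] = 80         # overwrite the value of an existing key
del forces['axial_force']     # remove a key/value pair

# items() yields (key, value) pairs, which can be unpacked in a loop
for key, value in forces.items():
    print(f"{key}: {value}")
~~~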


## Tuples
Tuples are similar to lists, but they are defined with parentheses `()`. The most notable difference from lists is that tuples are **immutable**.

### Tuple syntax 
<code> t = (item_1, item_2, ..., item_n) </code>

### Mutability
Tuples are ***immutable***. They cannot be changed after creation.

### Tuple examples

In [9]:
# Simple tuple of integers
t1 = (1, 24, 56)   

# Multiple types as tuple elements
t2 = (1, 1.62, '12', [1, 2 , 3])  #we have an integer, a float, a string, and a list

# Tuple of tuples
points = ((4, 5), (12, 6), (14, 9))   

In [10]:
#print the tuples
print("Tuple t1:", t1)
print("Tuple t2:", t2)
print("Tuple points:", points)

Tuple t1: (1, 24, 56)
Tuple t2: (1, 1.62, '12', [1, 2, 3])
Tuple points: ((4, 5), (12, 6), (14, 9))


In [11]:
#try to change a tuple element (this will raise an error)
print('this is the first element of t1:', t1[0]) #we can call the first element
t1[0] = 10  # the [0] index refers to the first element

this is the first element of t1: 1


TypeError: 'tuple' object does not support item assignment
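
One subtlety worth knowing: immutability applies to the tuple itself, not to mutable objects stored inside it. The sketch below uses a made-up tuple `t` whose last element is a list:

~~~python
t = (1, 1.62, '12', [1, 2, 3])   # hypothetical tuple with a list as last element

t[3].append(4)    # allowed: this mutates the list, not the tuple itself
print(t)          # (1, 1.62, '12', [1, 2, 3, 4])

try:
    t[3] = [9, 9]   # not allowed: this would replace a tuple element
except TypeError as error:
    print("TypeError:", error)
~~~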

## Sets
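Sets are defined with curly brackets `{}`. They are **unordered and don't have an index** (indexing is covered in the next module). Sets also have **unique items**. Note that empty curly brackets `{}` create an empty dictionary; an empty set is created with `set()`.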

### Set syntax
   
<code> s = {item_1, item_2, ..., item_n} </code>

The main purpose of sets is to perform set operations. These are known from mathematics and determine, for example, the *union*, *intersection* or *difference* of two given sets.

A list, string or tuple can be converted to a set by `set(sequence_to_convert)`. Since sets only have unique items, the resulting set has the same values as the input sequence, but with duplicates removed. This can be a way to create a list with only unique elements.

For example:
~~~python
# Convert list to set and back to list again with now only unique elements
list_uniques = list(set(list_with_duplicates))  
~~~

### Mutability
Sets are ***mutable***. They can be changed after creation.

### Set examples

In [12]:
s1 = {32, 3, 1, 86, 6, 8} # Set with integers
s2 = {8, 6, 21, 7, 26}  # Another set with integers

print(f"Set s1: {s1}. It has {len(s1)} elements.")
print(f"Set s2: {s2}. It has {len(s2)} elements.")    

# Find the union of the two sets
print(f"Union of s1 and s2: {s1.union(s2)}, which has {len(s1.union(s2))} elements.") # we note that duplicates are removed 

Set s1: {32, 1, 3, 6, 86, 8}. It has 6 elements.
Set s2: {21, 6, 7, 8, 26}. It has 5 elements.
Union of s1 and s2: {32, 1, 3, 6, 7, 8, 21, 86, 26}, which has 9 elements.


In [14]:
# Find the intersection of the two sets
print(f"The intersection of s1 and s2: {s1.intersection(s2)}, which has {len(s1.intersection(s2))} elements.") #both 6 and 8 are in both sets

The intersection of s1 and s2: {8, 6}, which has 2 elements.
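
Besides union and intersection, sets also support the difference and the symmetric difference, and each operation has an operator form. A small sketch with two made-up sets, `left` and `right`:

~~~python
left = {1, 2, 3, 4}    # hypothetical sets for illustration
right = {3, 4, 5}

only_left = left.difference(right)                    # items in left but not in right
either_not_both = left.symmetric_difference(right)    # items in exactly one of the sets

print(only_left == {1, 2})            # True
print(either_not_both == {1, 2, 5})   # True

# Operator forms give the same results as the methods
print((left | right) == left.union(right))          # True
print((left & right) == left.intersection(right))   # True
print((left - right) == only_left)                  # True
~~~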


In [16]:
list_with_duplicates = [1, 2, 3, 4, 5, 2, 2, 3, 1] #create a list with duplicates
print(f"List with duplicates: {list_with_duplicates}. It has {len(list_with_duplicates)} elements.")

# Create a set from the list (which removes duplicates)
s3 = set(list_with_duplicates)   
print(f"Set s3 created from list_with_duplicates: {s3}. It has {len(s3)} elements.")

List with duplicates: [1, 2, 3, 4, 5, 2, 2, 3, 1]. It has 9 elements.
Set s3 created from list_with_duplicates: {1, 2, 3, 4, 5}. It has 5 elements.


If a `list` is wanted again:

In [17]:
list(s3) #convert the set back to a list

[1, 2, 3, 4, 5]

In [18]:
# set() can be used to remove duplicates from a list and get a list back in one line
list_with_duplicates = [1, 5, 4, 2, 3, 4, 5, 2, 2, 3, 1]  # create a list with duplicates
print(f"List with duplicates: {list_with_duplicates}. It has {len(list_with_duplicates)} elements.")
list_without_duplicates = list(set(list_with_duplicates))  # convert to set to remove duplicates, then back to list
print(f"List without duplicates: {list_without_duplicates}. It has {len(list_without_duplicates)} elements.")  # the resulting order is not guaranteed, since sets are unordered

List with duplicates: [1, 5, 4, 2, 3, 4, 5, 2, 2, 3, 1]. It has 11 elements.
List without duplicates: [1, 2, 3, 4, 5]. It has 5 elements.
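
A set has no defined order, so the order of `list(set(...))` should not be relied on, even when small integers happen to come out sorted. If the order matters, two possible approaches are sketched below; the variable names are chosen for illustration:

~~~python
values = [1, 5, 4, 2, 3, 4, 5, 2, 2, 3, 1]   # same values as the list above

# Dictionary keys are unique and keep insertion order (guaranteed since Python 3.7),
# so this keeps the first occurrence of each element in its original position
unique_in_order = list(dict.fromkeys(values))
print(unique_in_order)   # [1, 5, 4, 2, 3]

# sorted() returns a list in explicit ascending order
unique_sorted = sorted(set(values))
print(unique_sorted)     # [1, 2, 3, 4, 5]
~~~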


## The `in` operator
The `in` operator can be used to check whether a certain item is contained in a sequence. The result of the evaluation is a `boolean` (`True` or `False`): 

In [19]:
2 in [1, 2, 3] # see if 2 is in the list

True

In [20]:
l = [1, 2, 3, 4, 5]  # Create a list

#see if 3 is in the list
3 in l #similar to above, l is the same as the list [1, 2, 3, 4, 5]

True

In [None]:
'ma' in 'Denmark' #when might this be useful?

True

In [29]:
'er' in 'Denmark'  

False

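
The `in` operator also works on the other data structures in this module. For a dictionary it checks the keys, not the values. A brief sketch with a made-up dictionary:

~~~python
forces = {'N': 83, 'My': 154}   # hypothetical dictionary

print('N' in forces)            # True: 'N' is a key
print(83 in forces)             # False: 83 is a value, not a key
print(83 in forces.values())    # True: searches the values instead

print(3 in {1, 2, 3})           # True: membership test on a set
print(3 in (1, 2, 3))           # True: membership test on a tuple
~~~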

### Extracting values from dictionaries
Dictionaries differ from data structures like strings, lists and tuples since **they do not have an index**. Instead, a value is extracted by using the corresponding key:

In [39]:
d = {'N': 83, 'My': 154, 'Mz': 317} # Create a dictionary
print("Value for key 'My':", d['My'])

Value for key 'My': 154
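
Asking for a key that does not exist raises a `KeyError`. When a missing key is expected, `dict.get()` returns `None` or a chosen default instead. A minimal sketch, where `section_forces` and the key `'Mx'` are assumptions for illustration:

~~~python
section_forces = {'N': 83, 'My': 154, 'Mz': 317}   # hypothetical dictionary

try:
    section_forces['Mx']          # 'Mx' is not a key in the dictionary
except KeyError as error:
    print("KeyError:", error)

print(section_forces.get('Mx'))       # None: no error is raised
print(section_forces.get('Mx', 0))    # 0: the default given as second argument
~~~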


See the demonstration below, where the key `'a'` is defined twice. The second definition overwrites the first one.

In [None]:
{'a': 1, 'b': 2, 'c':3, 'a': 4} # Dictionary with duplicate keys, the last value for 'a' will be used

{'a': 4, 'b': 2, 'c': 3}

## Copying mutable objects
When copying mutable objects such as lists or dictionaries, there are some things to be aware of. This is demonstrated with a list example below.

In [52]:
x = [1, 2, 3]     
y = x          # <-- This does not make y a copy of x
y              # y now refers to the same underlying object (same id) as x

[1, 2, 3]

In [54]:
#if we change x, y will also change
x.append(4)
print("x after append:", x)
print("y after x append:", y)  # y also changed because it points to the same object as x

#if we change y, x will also change
y.pop()
print("y after pop:", y)
print("x after y pop:", x)  # x also changed because it points to the same object as y

x after append: [1, 2, 3, 4]
y after x append: [1, 2, 3, 4]
y after pop: [1, 2, 3]
x after y pop: [1, 2, 3]


In [55]:
id(x)  # Behind the scenes, the variable x gets assigned a unique object id

22378358311680

In [56]:
id(y)  # y is seen to have the same underlying object id

22378358311680

This means that when we mutate (or modify) `y`, the original list `x` changes as well, because both names refer to the same object. This is often not the intention!

> **When copying** a mutable object `K` such as a list, dictionary or set, use `K.copy()`. For lists, the slice `K[:]` also creates a copy.

An example is shown below using the `list.copy()` method:

In [57]:
# Define a new list, since x was mutated above
x_new = [1, 2, 3]   

# Copy to new list
y_new = x_new.copy()

# Show list
y_new

[1, 2, 3]

In [58]:
# Append a value to y_new 
y_new.append(327)

# Show list
y_new

[1, 2, 3, 327]

In [59]:
# x has not changed
x_new

[1, 2, 3]

In [None]:
# Print the object ids using an f-string
print(f'x_new has object id: {id(x_new)} \ny_new has object id: {id(y_new)}') # the \n creates a new line

x_new has object id: 22378358309888 
y_new has object id: 22378351078208
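
Note that `list.copy()` makes a shallow copy: the outer list is new, but any nested mutable objects are still shared. For nested structures, `copy.deepcopy` from the standard library copies the inner objects as well. A sketch with a made-up nested list:

~~~python
import copy

nested = [[1, 2], [3, 4]]          # hypothetical nested list

shallow = nested.copy()            # new outer list, same inner lists
deep = copy.deepcopy(nested)       # new outer list and new inner lists

nested[0].append(99)               # mutate an inner list of the original

print(shallow)   # [[1, 2, 99], [3, 4]]  (the inner list is shared)
print(deep)      # [[1, 2], [3, 4]]      (fully independent)
~~~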


# Exercise 1
Create the list below and remove its duplicates:

~~~python
L2 = ['Hi', 'Hello', 'Hi!', 'Hey', 'Hi', 'hey', 'Hey']
~~~

# Exercise 2
Given the dictionary

~~~python
d = {2: 122, 3: 535, 't': 'T', 'rum': 'cola'}
~~~
Show that you can extract every value by using its key.

# Exercise 3
Find the intersection of the two lists (the elements that occur in both lists):
~~~python
s1 = ['HE170B', 'HE210B', 'HE190A', 'HE200A', 'HE210A', 'HE210A']

s2 = ['HE200A', 'HE210A', 'HE240A', 'HE200A', 'HE210B', 'HE340A']
~~~

Next [Module](./Module4-SlicingAndIndexing.ipynb)