# Data Structures

## Lesson Goal

 - Compose simple programs that control the order in which the operators we have studied so far are executed on:
  - single-value variables.
  - data structures (holding multiple variables).


## Objectives

 - Express collections of multiple variables as `list`, `tuple` and dictionary (`dict`) data structures.
 
- Use iteration to visit entries in a data structure 

- Learn to select the right data structure for an application

In the last seminar we learnt to generate a range of numbers for use in the control flow of a program, using the function `range()`:

In [110]:
for j in range(20):
    
    if j % 4 == 0:  # Check remainder of j/4
        continue    # continue to next value of j
        
    print(j, "is not a multiple of 4")

1 is not a multiple of 4
2 is not a multiple of 4
3 is not a multiple of 4
5 is not a multiple of 4
6 is not a multiple of 4
7 is not a multiple of 4
9 is not a multiple of 4
10 is not a multiple of 4
11 is not a multiple of 4
13 is not a multiple of 4
14 is not a multiple of 4
15 is not a multiple of 4
17 is not a multiple of 4
18 is not a multiple of 4
19 is not a multiple of 4


## Data Structures

Often we want to manipulate data that is more meaningful than ranges of numbers.

These collections of variables might include:
 - the results of an experiment
 - a list of names
 - the components of a vector
 - a telephone directory with names and associated numbers.
 
Python has different __data structures__ that can be used to store and manipulate these values.

Like variable types (`string`, `int`,`float`...) different data structures behave in different ways.

Today we will learn to use `list`, `tuple` and dictionary (`dict`) data structures.

We will study the differences in how they behave so that you can learn to select the most suitable data structure for an application. 
 
 

Programs use data structures to collect data into useful packages. 

$$
r = [u, v, w]
$$

For example, rather than representing a vector `r` of length 3 using three separate floats `ru`, `rv` and `rw`, we could represent it as a __list__ of floats:

`r = [u, v, w]`. 

We will learn what a __list__ is in a moment.

If we want to store the names of students in a laboratory group, rather than representing each student using an individual string variable, we could use a list of names, e.g.:



In [59]:
lab_group0 = ["Sarah", "John", "Joe", "Emily"]
lab_group1 = ["Roger", "Rachel", "Amer", "Caroline", "Colin"]

This is useful because we can perform operations on lists such as:
 - checking the length of a list (number of students in a lab group)
 - sorting the names in the list into alphabetical order
 - making a list of lists (we call this a *nested list*):


In [60]:
lab_groups = [lab_group0, lab_group1]
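
The list operations mentioned above can be sketched in a few lines. This is a minimal illustration using the two lab groups defined earlier; the functions `len()` and `sorted()` are explained in Section 1.2, and the variable names `group_size` and `sorted_group` are introduced here purely for the example.

```python
lab_group0 = ["Sarah", "John", "Joe", "Emily"]
lab_group1 = ["Roger", "Rachel", "Amer", "Caroline", "Colin"]

# Length: the number of students in a lab group
group_size = len(lab_group0)
print(group_size)

# Sorting: a new list with the names in alphabetical order
sorted_group = sorted(lab_group0)
print(sorted_group)

# Nesting: a list whose elements are themselves lists
lab_groups = [lab_group0, lab_group1]
print(len(lab_groups))
```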

## 1.0 Lists

A list is a sequence of data. 

We call each item in the sequence an *element*. 

A list is constructed using square brackets:



In [61]:
a = [1, 2, 3]

A `range` can be converted to a list with the `list` function.

In [62]:
print(list(range(10)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


When `range` has just one *argument* (the entry in the parentheses), it will generate a range from 0 up to but not including the specified number. 


In [63]:
print(list(range(10,20)))

[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]


When `range` has two arguments:
 - the first value is the starting value.
 - the second value is the stopping value.
 - the stopping value is not included in the range.

You can optionally include a step:

In [64]:
print(list(range(10, 20, 2)))

[10, 12, 14, 16, 18]


A list can hold a mixture of types (`int`, `string`....).

In [65]:
a = [1, 2.0, "three"]

An empty list is created by

In [67]:
my_list = []

A list of length 5 with repeated values can be created by

In [68]:
my_list = ["Hello"]*5
print(my_list)

['Hello', 'Hello', 'Hello', 'Hello', 'Hello']


We can check if an item is in a list using the operator `in`:


In [131]:
print("Hello" in my_list)
print("Goodbye" in my_list)

True
False


### 1.1 Iterating over lists

Looping over each item in a list is called *iterating*. 

To iterate over a list we use a `for` loop.

On each iteration, the variable `data` takes the value of the next item in the list:

In [69]:
for data in [1, 2.0, "three"]:    
    print('the value of data is:', data)

the value of data is: 1
the value of data is: 2.0
the value of data is: three


__Try it yourself__
<br>
In the cell below *iterate* over the list `[1, 2.0, "three"]`.
<br>
Each time the code loops, print the value of `data` __cast as a string__.
<br>
(Hint: Look at Seminar 1, Section 8.5.2 for how to cast a variable as a different type).  

In [112]:
# Iterate over a list and cast each item as a string

### 1.2 Manipulating lists 

There are many functions for manipulating lists. 

We can find the length (number of items) of a list using the function `len()`, by including the name of the list in the brackets. 

In the example below, we find the length of the list `lab_group0`. 

In [149]:
lab_group0 = ["Sara", "Mari", "Quang"]

size = len(lab_group0)

print("Lab group members:", lab_group0)

print("Size of lab group:", size)

print("Check the Python object type:", type(lab_group0))

Lab group members: ['Sara', 'Mari', 'Quang']
Size of lab group: 3
Check the Python object type: <class 'list'>


To sort the list we use `sorted()`.

If the list contains numerical variables, the numbers are sorted in ascending order.

In [114]:
numbers = [7, 1, 3.0]

print(numbers)

numbers = sorted(numbers)

print(numbers)

[7, 1, 3.0]
[1, 3.0, 7]


__Note:__ We can sort a list with mixed numeric types (e.g. `float` and `int`). 
<br>
However, we cannot sort a list containing types that cannot be compared with each other 
<br>
(e.g. `numbers = sorted(["seven", 1, 3.0])` raises a `TypeError`, because a `string` cannot be compared with a number.)

In [115]:
# numbers = sorted(["seven", 1, 3.0])
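
A minimal sketch of what happens if a list containing both a string and numbers is sorted. The `try`/`except` construct has not been covered in this seminar; it is used here only so that the example runs to completion and prints the error message instead of stopping the program. The list name `mixed` is illustrative.

```python
mixed = ["seven", 1, 3.0]

try:
    numbers = sorted(mixed)
except TypeError as error:
    # Python cannot decide whether a string is 'less than' a number
    numbers = None
    print("Cannot sort:", error)

# Lists containing only numbers (int and float mixed) sort without error
print(sorted([7, 1, 3.0]))
```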

If the list contains strings, the list is sorted by alphabetical order. 

In [117]:
lab_group0 = ["Sara", "Mari", "Quang"]

print(lab_group0)

lab_group0 = sorted(lab_group0)

print(lab_group0)

['Sara', 'Mari', 'Quang']
['Mari', 'Quang', 'Sara']


As with `len()` we include the name of the list we want to sort in the brackets. 

There is a shortcut for sorting a list.

`sort` is known as a 'method' of a `list`. 

If we suffix a list with `.sort()`, it performs an *in-place* sort: the list itself is re-ordered, so we do not need to assign the result to a variable.

In [118]:
lab_group0 = ["Sara", "Mari", "Quang"]

print(lab_group0)

#lab_group0 = sorted(lab_group0)
lab_group0.sort()

print(lab_group0)

['Sara', 'Mari', 'Quang']
['Mari', 'Quang', 'Sara']
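
The difference between the two approaches can be seen in a short sketch. `sorted()` returns a new list and leaves the original unchanged, whereas `.sort()` re-orders the original list and returns `None`. The names `original`, `new_list` and `result` are illustrative only.

```python
original = ["Sara", "Mari", "Quang"]

# sorted() builds a new list; the original keeps its order
new_list = sorted(original)
print(original)
print(new_list)

# .sort() changes the list itself and returns None
result = original.sort()
print(original)
print(result)
```

A common mistake is to write `lab_group0 = lab_group0.sort()`, which replaces the list with `None`.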


__Try it yourself__
<br>
In the cell below create a list of numeric or string values.
<br>
Sort the list using `sorted()` or `.sort()`.
<br>
Print the sorted list.
<br>
Print the length of the list using `len()`.

In [None]:
# Sorting a list

We can remove items from a list using the method `pop`.

We place the index of the element we wish to remove in brackets. 

In [75]:
print(lab_group0)

# Remove the second student 
# remember indexing starts from 0
# 1 is the second element

lab_group0.pop(1)
print(lab_group0)

['Mari', 'Quang', 'Sara']
['Mari', 'Sara']
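
`pop` also hands back the element it removes, which can be stored in a variable. If no index is given, the last element is removed. A brief sketch (the list `samples` and the variable names are made up for illustration):

```python
samples = ["a", "b", "c", "d"]

# pop(1) removes the second element and returns it
removed = samples.pop(1)
print(removed)
print(samples)

# pop() with no index removes and returns the last element
last = samples.pop()
print(last)
print(samples)
```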


We can add items at the end of a list using the method `append`.

We place the element we want to add to the end of the list in brackets. 

In [76]:
# Add new student "Lia" at the end of the list
lab_group0.append("Lia")
print(lab_group0)

['Mari', 'Sara', 'Lia']


__Try it yourself__
<br>
In the cell below.
<br>
Remove Sara from the list.
<br>
Print the new list.
<br>
Add a new lab group member, Tom, to the list.
<br>
Print the new list.

In [119]:
# Adding and removing items from a list.

### 1.3 Indexing

Lists store data in order.

We can select a single element of a list using its index.

You are familiar with this process; it is the same as selecting individual characters of a `string`:

In [77]:
a = "string"
b = a[1]
print(b)

t


In [121]:
first_member = lab_group0[0]
print(first_member)

Mari


Indices can be useful when looping through the items in a list.

In [79]:
# We can express the following for loop:
# ITERATING
for i in lab_group0:
    print(i)
    
# as:
# INDEXING
for i in range(len(lab_group0)):
    print(lab_group0[i])

Mari
Sara
Lia
Mari
Sara
Lia


__Note:__<br>
- Some data structures support *iterating* but do not support *indexing*. <br> When possible, it is better to iterate over a list rather than use indexing.
- When indexing:
   - the first index is 0.
   - the last index is (list length - 1). 
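
The index limits in the note above can be checked with a short sketch. Asking for an index beyond the end of the list raises an `IndexError`. The `try`/`except` construct is used here only to keep the example running, and the list `group` is illustrative.

```python
group = ["Mari", "Sara", "Lia"]

first = group[0]               # the first index is 0
last = group[len(group) - 1]   # the last index is (list length - 1)
print(first, last)

try:
    group[len(group)]          # one past the end of the list
except IndexError as error:
    print("Index out of range:", error)
```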

Lists and indexing can be useful for numerical computations. 

### Example: Vectors

__Vector:__ A quantity with magnitude and direction.

Position vectors (or displacement vectors) in 3D space can always be expressed in terms of x,y, and z-directions.  

<img src="../../../ILAS_seminars/intro to python/img/3d_position_vector.png" alt="Drawing" style="width: 175px;"/>

The position vector 𝒓 indicates the position of a point in 3D space.

$$
\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}
$$

𝒊 is the displacement one unit in the x-direction<br>
𝒋 is the displacement one unit in the y-direction<br>
𝒌 is the displacement one unit in the z-direction

We can conveniently express $\mathbf{r}$ as a matrix: 
$$
\mathbf{r} = [x, y, z]
$$

__...which looks a lot like a Python list!__



You will encounter 3D vectors a lot in your engineering studies as they are used to describe many physical quantities, e.g. force.

### Example: The dot product of two vectors:

The __dot product__ is a really useful algebraic operation that takes two equal-length sequences of numbers (usually coordinate vectors) and returns a single number.

It can be expressed mathematically as:

__GEOMETRIC REPRESENTATION__

\begin{align}
\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}| |\mathbf{B}| \cos(\theta)
\end{align}

<img src="../../../ILAS_seminars/intro to python/img/dot_product.gif" alt="Drawing" style="width: 250px;"/>

$|\mathbf{B}| \cos(\theta)$ is the component of $\mathbf{B}$ acting in the direction of $\mathbf{A}$.

For example, the component of a force acting in the direction of the velocity of an object:

<img src="../../../ILAS_seminars/intro to python/img/resolving_force.png" alt="Drawing" style="width: 250px;"/>

$$
\mathbf{F_{app,x}} = \mathbf{F_{app}} \cos(\theta)
$$

__ALGEBRAIC REPRESENTATION__

>The dot product of two $n$-length-vectors:
> <br> $ \mathbf{A} = [A_1, A_2, ... A_n]$
> <br> $ \mathbf{B} = [B_1, B_2, ... B_n]$
> <br> is: 

\begin{align}
\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^n A_i B_i.
\end{align}

>So the dot product of two 3D vectors:
> <br> $ \mathbf{A} = [A_x, A_y, A_z]$
> <br> $ \mathbf{B} = [B_x, B_y, B_z]$
> <br> is:

\begin{align}
\mathbf{A} \cdot \mathbf{B} &= \sum_{i=1}^n A_i B_i \\
&= A_x B_x + A_y B_y + A_z B_z.
\end{align}

__Example:__ 
<br> The dot product $\mathbf{A} \cdot \mathbf{B}$:
> <br> $ \mathbf{A} = [1, 3, -5]$
> <br> $ \mathbf{B} = [4, -2, -1]$



\begin{align}
[1, 3, -5] \cdot [4, -2, -1] &= (1)(4) + (3)(-2) + (-5)(-1) \\
&= 4 - 6 + 5 \\
&= 3
\end{align}

We can solve this very easily using a Python `for` loop.



In [124]:
A = [1.0, 3.0, -5.0]
B = [4.0, -2.0, -1.0]

# Create a variable called dot_product with value, 0.
dot_product = 0.0

for i in range(len(A)):
    dot_product += A[i]*B[i]

print(dot_product)

3.0
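
As an alternative to indexing, the same sum can be formed by iterating over both lists at once. This sketch uses Python's built-in `zip()` function, which is not covered elsewhere in this seminar; it pairs up the elements of two sequences.

```python
A = [1.0, 3.0, -5.0]
B = [4.0, -2.0, -1.0]

dot_product = 0.0

# zip pairs the first elements, then the second elements, and so on
for a, b in zip(A, B):
    dot_product += a * b

print(dot_product)
```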


From its __GEOMETRIC__ representation, we can see that the dot product allows us to quickly solve many engineering-related problems:

\begin{align}
\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}| |\mathbf{B}| \cos(\theta)
\end{align}

Examples:
 - Test if the angle between two vectors is:
   - a right angle, i.e. the vectors are perpendicular ($\mathbf{A} \cdot \mathbf{B} = 0$)
   - acute ($\mathbf{A} \cdot \mathbf{B}>0$)
   - obtuse ($\mathbf{A} \cdot \mathbf{B}<0$)
 - Find the angle between two vectors (from its cosine).
 - Find the magnitude of one vector in the direction of another.
 <br>(e.g. resolving forces into their component directions). 
 - Find physical quantities e.g. the work, W, when pushing an object a certain distance, d, with force, F:
 
 <img src="../../../ILAS_seminars/intro to python/img/work_equation.jpg" alt="Drawing" style="width: 300px;"/>
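
As a sketch of the second use listed above, the angle between two vectors can be recovered by rearranging the geometric form to $\cos(\theta) = \mathbf{A} \cdot \mathbf{B} / (|\mathbf{A}| |\mathbf{B}|)$. The example assumes the standard library `math` module and reuses the vectors from the worked example; the variable names are illustrative.

```python
import math

A = [1.0, 3.0, -5.0]
B = [4.0, -2.0, -1.0]

dot_product = 0.0
mag_A_squared = 0.0
mag_B_squared = 0.0

for i in range(len(A)):
    dot_product += A[i] * B[i]
    mag_A_squared += A[i] * A[i]   # |A|^2 is the dot product of A with itself
    mag_B_squared += B[i] * B[i]   # |B|^2 is the dot product of B with itself

cos_theta = dot_product / (math.sqrt(mag_A_squared) * math.sqrt(mag_B_squared))
theta = math.acos(cos_theta)       # angle in radians

print("Angle in degrees:", math.degrees(theta))
```

Because the dot product here is positive, the angle is acute (less than 90 degrees).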


__Try it yourself:__ 

$\mathbf{C} = [2, 4, 3.5]$

$\mathbf{D} = [1, 2, -6]$

In the cell below find:
$\mathbf{C} \cdot \mathbf{D}$

using a for loop and the indices of Python lists. 

Is the angle between the vectors obtuse or acute or are the vectors perpendicular? 

(Perpendicular if $\mathbf{C} \cdot \mathbf{D} = 0$, acute if $\mathbf{C} \cdot \mathbf{D}>0$, or obtuse if $\mathbf{C} \cdot \mathbf{D}<0$).
 

In [81]:
# The dot product of C and D

A *nested list* is a list within a list. (Recall a *nested loop* from Section 1). 

To access a __single element__ we need as many indices as there are levels of nested list. 

This is more easily explained with an example:

In [82]:
lab_group0 = ["Sara", "Mika", "Ryo", "Am"]
lab_group1 = ["Hemma", "Miri", "Qui", "Sajid"]
lab_group2 = ["Adam", "Yukari", "Farad", "Fumitoshi"]

lab_groups = [lab_group0, lab_group1, lab_group2]

`lab_group0`, `lab_group1` and `lab_group2` are nested within `lab_groups`.

Therefore there are __two__ levels of nested lists.

Therefore we need __two__ indices to select a single element from `lab_group0`, `lab_group1` or `lab_group2`:
 - The first index selects a list (`lab_group0`, `lab_group1` or `lab_group2`). 
 - The second index selects an element in that list. 

In [125]:
group = lab_groups[0]
print(group)

name = lab_groups[1][2]
print(name)

['Sara', 'Mika', 'Ryo', 'Am']
Qui
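
Nested lists can also be visited with a nested loop: the outer loop takes each inner list in turn and the inner loop takes each name in that list. A minimal sketch with shortened, illustrative groups:

```python
lab_groups = [["Sara", "Mika"],
              ["Hemma", "Miri", "Qui"]]

count = 0

for group in lab_groups:      # each element is itself a list
    for name in group:        # each element of the inner list is a string
        print(name)
        count += 1

print("Total students:", count)

# Two indices select one name: second group, third name
print(lab_groups[1][2])
```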


## 2.0 Tuples

Tuples are similar to lists. 

However, after creating a tuple:
 - you can't add or remove elements from it without creating a new tuple. 
 - you can't change the value of a single tuple element e.g. by indexing. 

Tuples are therefore used for values that should not change after being created.
<br> e.g. a vector of length three with fixed entries
<br>It is 'safer' in this case since it cannot be modified accidentally in a program. 

To create a tuple, use round brackets. 

__Example__
<br>
At Kyoto University, each professor is assigned an office.

Philamore-sensei is given room 32:

In [85]:
room = ("Philamore", 32)

print("Room allocation:", room)

print("Length of entry:", len(room))

print(type(room))

Room allocation: ('Philamore', 32)
Length of entry: 2
<class 'tuple'>


We can *iterate* over tuples in the same way as with lists,

In [86]:
# Iterate over tuple values
for d in room:
    print(d)

Philamore
32


and we can index into a tuple:

In [15]:
# Index into tuple values
print(room[1])
print(room[0])

32
Philamore
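
The restriction described at the start of this section can be demonstrated directly. In this sketch, trying to change a tuple element by index raises a `TypeError`; the `try`/`except` construct is used only so that the example runs to completion. To 'change' a tuple we have to build a new one.

```python
room = ("Philamore", 32)

try:
    room[1] = 33               # tuples do not allow item assignment
except TypeError as error:
    print("Cannot modify tuple:", error)

# Creating a new tuple and re-using the name is allowed
room = (room[0], 33)
print(room)
```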


__Note__ Take care when creating a tuple of length 1:

In [23]:
# Creating a list of length 1 
a = [1]
print(a)
print(type(a))
print(len(a))

[1]
<class 'list'>
1


However, if we use the same process for a tuple:

In [26]:
a = (1)
print(a)
print(type(a))
#print(len(a))

1
<class 'int'>


To create a tuple of length 1, we use a comma:

In [28]:
a = (1,)
print(a)
print(type(a))
print(len(a))

(1,)
<class 'tuple'>
1


In [20]:
room = ("Endo",)
print("Room allocation:", room)
print("Length of entry:", len(room))
print(type(room))

Room allocation: ('Endo',)
Length of entry: 1
<class 'tuple'>

As part of a rooms database, we can create a list of tuples:

In [127]:
room_allocation = [("Endo",), 
                   ("Philamore", 32), 
                   ("Matsuno", 31), 
                   ("Sawaragi", 28), 
                   ("Okino", 28), 
                   ("Kumegawa", 19)]

print(room_allocation)

[('Endo',), ('Philamore', 32), ('Matsuno', 31), ('Sawaragi', 28), ('Okino', 28), ('Kumegawa', 19)]


Index into the list `room_allocation`.

(See Section 2.3 for how to index into *nested* data structures.)

In the cell below use indexing to print:
 - Matsuno-sensei's room number
 - Kumegawa-sensei's room number
 - The variable type of Kumegawa-sensei's room number

In [129]:
# Matsuno-sensei's room number

# Kumegawa-sensei's room number

# The Python variable type of Kumegawa-sensei's room number


To make it easier to look up the office number of each professor, we can __sort__ the list of tuples into an office directory.

The ordering rule is determined by the __first element__ of each tuple.

If the first element of each tuple is a numeric type (`int`, `float`...) the tuples are sorted by ascending numerical order of the first element.

If the first element of each tuple is a `string` (as in this case), the tuples are sorted by alphabetical order of the first element.

Look back at Section 1.2 (Manipulating lists) and remind yourself how to sort a list.

In the cell provided below, sort the list `room_allocation` by alphabetical order. 

In [43]:
# room_allocation sorted by alphabetical order

[('Endo',), ('Kumegawa', 19), ('Matsuno', 31), ('Okino', 28), ('Philamore', 32), ('Sawaragi', 28)]


The office directory can be improved by excluding professors who do not have an office at Yoshida campus:

In [45]:
for entry in room_allocation:
    
    # only professors with an office have an entry length > 1
    if len(entry) > 1:
        print("Name:", entry[0], ", Room:", entry[1])

Name: Kumegawa , Room: 19
Name: Matsuno , Room: 31
Name: Okino , Room: 28
Name: Philamore , Room: 32
Name: Sawaragi , Room: 28


In summary, use tuples over lists when the length and the values will not change.

## 3.0 Dictionaries (maps)

We used a list of tuples in the previous section to store room allocations. 

What if we wanted to use a program to find which room a particular professor has been allocated?

We would need to either:
- iterate through the list and check each name. 

> For a very large list, this might not be very efficient.

- use the index to select a specific entry of a list or tuple. 

> This works if we know the index to the entry of interest. For a very large list, this is unlikely.





A human would look up individuals in an office directory by name (or "keyword") rather than by a continuous set of integers. 

Using a Python __dictionary__ we can build a 'map' from names (*keys*) to room numbers (*values*). 

A Python dictionary (`dict`) is declared using curly braces:

In [135]:
room_allocation = {"Endo": None, 
                   "Philamore": 32, 
                   "Matsuno": 31, 
                   "Sawaragi": 28, 
                   "Okino": 28, 
                   "Kumegawa": 19}

print(room_allocation)

print(type(room_allocation))

{'Okino': 28, 'Matsuno': 31, 'Kumegawa': 19, 'Sawaragi': 28, 'Philamore': 32, 'Endo': None}
<class 'dict'>


Each entry is separated by a comma. 

For each entry we have:
 - a 'key' (followed by a colon)
 - a 'value'. 
 
__Note:__ For empty values (e.g. `Endo` in the example above) we use '`None`' for the value.

`None` is a Python keyword for 'nothing' or 'empty'.

Now if we want to know which office belongs to Philamore-sensei, we can query the dictionary by key:

In [136]:
philamore_office = room_allocation["Philamore"]
print(philamore_office)

32


We can __*iterate*__ over the keys in a dictionary as we iterated over the elements of a list or tuple:

__Try it yourself:__
<br>
Refer back to Sections 2.1.3 and 2.2 to remind yourself how to *iterate* over a data structure.
<br>
Using __exactly the same method__, iterate over the entries in the dictionary `room_allocation` using a `for` loop.
<br>
Each time the code loops, print the next dictionary entry. 

In [137]:
# iterate over the dictionary, room_allocation.
# print each entry


We can also access the keys and values as separate variables while iterating by:
 - creating two variable names before `in` 
 - putting `items()` after the dictionary name

In [138]:
for name, room_number in room_allocation.items():
    print(name, room_number)    

Okino 28
Matsuno 31
Kumegawa 19
Sawaragi 28
Philamore 32
Endo None


__Try it yourself__<br>
Copy and paste the code from the cell above.
<br>
Edit it so that it prints the names only. 

Remember you can __"comment out"__ the existing code (instead of deleting it) so that you can refer to it later.
e.g.
```python
#print(name, room_number)
```


In [139]:
# iterate over the dictionary, room_allocation.
# print each name

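Note that the order of the printed entries in the output above is different from the input order. 
<br>
The output shown here appears to come from an older version of Python, in which a dictionary did not keep its entries in input order. From Python 3.7 onwards, a dictionary preserves the order in which entries were inserted. 
<br>
Either way, a dictionary stores data differently from a list or tuple. 
<br>
Lists and tuples store entries as ordered, contiguous sequences, which is why we can access entries by their position (index). 
<br>
The position of an entry cannot be used to access the entries of a dictionary. For example:
```python
print(room_allocation[0])
```
raises an error (a `KeyError`), because `0` is treated as a key and no such key exists. 
<br>

Dictionaries use a different type of storage which allows us to perform look-ups using a 'key'.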
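
A minimal sketch of the behaviour described above, using a shortened version of the dictionary. The `try`/`except` construct is used only so that the example prints the error rather than stopping.

```python
room_allocation = {"Philamore": 32, "Matsuno": 31}

try:
    print(room_allocation[0])      # 0 is treated as a key, not a position
except KeyError as error:
    print("No such key:", error)

# Look-up by key works
print(room_allocation["Matsuno"])
```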





In [140]:
print(room_allocation["Philamore"])

32


We use the same square-bracket syntax to add new entries to an existing dictionary: 

In [141]:
print(room_allocation)

room_allocation["Fujiwara"]= 34

print("")

print(room_allocation)


{'Okino': 28, 'Matsuno': 31, 'Kumegawa': 19, 'Sawaragi': 28, 'Philamore': 32, 'Endo': None}

{'Okino': 28, 'Matsuno': 31, 'Fujiwara': 34, 'Kumegawa': 19, 'Sawaragi': 28, 'Philamore': 32, 'Endo': None}


To remove an item from a dictionary we use the statement `del`.

In [142]:
print(room_allocation)

del room_allocation["Fujiwara"]

print("")

print(room_allocation)

{'Okino': 28, 'Matsuno': 31, 'Fujiwara': 34, 'Kumegawa': 19, 'Sawaragi': 28, 'Philamore': 32, 'Endo': None}

{'Okino': 28, 'Matsuno': 31, 'Kumegawa': 19, 'Sawaragi': 28, 'Philamore': 32, 'Endo': None}


__Try it yourself__
<br>
Okino-sensei is leaving Kyoto University. Her office will be re-allocated to a new member of staff, Ito-sensei.
<br>
In the cell below, update the dictionary by deleting the entry for Okino-sensei and creating a new entry for Ito-sensei.

In [143]:
# Remove Okino-sensei (room 28) from the dictionary.
# Add a new entry for Ito-sensei (room 28)

So far we have used the `string` variable type for the dictionary keys.
<br>
However, we can use almost any variable type as a key, provided it is immutable (e.g. `int`, `float`, `string` or `tuple`, but not `list`), and we can mix types. 

__Example__: We could 'invert' the room allocation dictionary to create a room-to-name map.

We are going to build a new dictionary (`room_map`) by looping through the old dictionary (`room_allocation`) using a `for` loop:

In [144]:
# Create empty dictionary
room_map = {}

# Build dictionary to map 'room number' -> name 
for name, room_number in room_allocation.items():
    
    # Insert entry into new dictionary
    room_map[room_number] = name

print(room_map)

{32: 'Philamore', None: 'Endo', 19: 'Kumegawa', 28: 'Sawaragi', 31: 'Matsuno'}
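
Note that Sawaragi-sensei and Okino-sensei share room 28, so the two entries collide when the room number becomes the key: each key can appear only once, and which of the two names is kept depends on the order in which the entries are visited. One possible way to keep every occupant, sketched below as an illustration rather than as part of the original example, is to store a list of names for each room. The name `occupants` is hypothetical.

```python
room_allocation = {"Endo": None,
                   "Philamore": 32,
                   "Matsuno": 31,
                   "Sawaragi": 28,
                   "Okino": 28,
                   "Kumegawa": 19}

occupants = {}

for name, room_number in room_allocation.items():

    # Start an empty list the first time a room number is seen
    if room_number not in occupants:
        occupants[room_number] = []

    occupants[room_number].append(name)

print(occupants[28])
print(len(room_allocation), len(occupants))
```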


We can now consult the room-to-name map to find out if a particular room is occupied and by whom.

Let's assume some rooms are unoccupied and therefore do not exist in this dictionary.

If we try to use a key that does not exist in the dictionary, e.g.

   occupant17 = room_map[17]

Python will give an error (raise an exception). 
<br>
If we're not sure that a __key__ is present (whether a room is occupied or unoccupied in this case), we can check using the operator '`in`' 
<br>(we used this operator to check whether an entry exists in a __list__)



In [146]:
print(19 in room_map)
print(17 in room_map)

True
False


So we know that:
 - room 17 is unoccupied
 - room 19 is occupied


When using `in`, take care to check for the __key__ (not the value)

In [147]:
print('Kumegawa' in room_map)

False


We could use `in` to avoid generating errors if unoccupied room numbers are entered.  

For example, in a program that checks the occupants of rooms by entering the room number: 

In [148]:
rooms_to_check = [17, 19]

for room in rooms_to_check:
    
    if room in room_map:
        print("Room", room, "is occupied by", room_map[room], "-sensei")
    
    else:
        print("Room", room, "is unoccupied.")

Room 17 is unoccupied.
Room 19 is occupied by Kumegawa -sensei
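
An alternative to testing with `in` is the dictionary method `get()`, which returns a default value (`None` unless another is given) instead of raising an error when the key is missing. This is a sketch using a shortened `room_map`; it is an alternative, not the approach used in the cell above.

```python
room_map = {32: "Philamore", 19: "Kumegawa", 31: "Matsuno"}

for room in [17, 19]:

    # get() returns the second argument if the key is not found
    occupant = room_map.get(room, "nobody")
    print("Room", room, "is occupied by", occupant)
```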


## 4.0 Choosing a data structure

An important task when developing a computer program is selecting the *appropriate* data structure for a task.

Here are some examples of the suitability of the data structures we have studied for some common computing tasks.

- __Dynamically changing individual elements of a data structure.__ 
<br> 
e.g. updating the occupant of a room number or adding a name to a list of group members.<br> 
__Lists and dictionaries__ allow us to do this.<br> 
__Tuples__ do not.


- __Storing items in a particular sequence (so that they can be addressed by index or in a particular order)__.
<br> 
e.g. representing the x, y, z coordinates of a 3D position vector, storing data collected from an experiment as a time series. 
<br> 
__Lists and tuples__ allow us to do this.
<br> 
__Dictionaries__ do not.


- __Performing an operation on every item in a sequence.__ 
<br> 
e.g. checking every item in a data set against a particular condition (e.g. prime number, multiple of 5....etc), performing an algebraic operation on every item in a data set. 
<br> 
__Lists and tuples__ make this simple as we can visit each entry in turn by iterating or by using its index.
<br> 
With __dictionaries__ this is less convenient as it requires more code.


- __Selecting a single item from a data structure without knowing its position in a sequence.__  
<br> 
e.g. looking up the profile of a person using their name, avoiding looping through a large data set in order to identify a single entry. 
<br> 
__Dictionaries__ allow us to select a single entry by an associated (unique) key variable.
<br> 
__Lists and tuples__ make this difficult as to pick out a single value we must either i) know its position in an ordered sequence, or ii) loop through every item until we find it. 


- __Protecting individual items of a data sequence from being added, removed or changed within the program.__
<br>
e.g. representing a vector of fixed length with fixed values, representing the coordinates of a fixed point. 
<br> 
__Tuples__ allow us to do this.
<br> 
__Lists and dictionaries__ do not. 


- __Speed__
<br> 
For many numerical computations, efficiency is essential. More flexible data structures are generally less efficient computationally and require more computer memory. We will study the difference in speed there can be between different data structures in a later seminar.
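
The look-up trade-off described above (selecting a single item without knowing its position) can be sketched by storing the same data in both forms. The data are taken from the room allocation example; the variable names `found_in_list` and `found_in_dict` are illustrative.

```python
directory_list = [("Philamore", 32), ("Matsuno", 31), ("Kumegawa", 19)]
directory_dict = {"Philamore": 32, "Matsuno": 31, "Kumegawa": 19}

# List of tuples: loop through every entry until the name is found
found_in_list = None
for entry in directory_list:
    if entry[0] == "Kumegawa":
        found_in_list = entry[1]
        break

# Dictionary: select the entry directly by its key
found_in_dict = directory_dict["Kumegawa"]

print(found_in_list, found_in_dict)
```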

## 5.0 Review Exercises
Here are a series of engineering problems for you to practise each of the new Python skills that you have learnt today.

### 5.1 Review Exercise:  <a name="back1"></a> Data structures.

__(A)__ In the cell below, what type of data structure is C?

__(B)__ Write a line of code that checks whether 3 exists within the data structure.

__(C)__ Write a line of code that checks whether 3.0 exists within the data structure.

__(D)__ Write a line of code that checks whether "3" exists within the data structure.


In [94]:
C = (2, 3, 5, 6, 1, "hello")

### 5.2 Review Exercise:  <a name="back1"></a> `for` loops.

In the cell below, create a list with the names of the months. 
<br>
Create a second list with the number of days in each month (for a regular year). 
<br>
Create a `for` loop that prints:

`The number of days in MONTH is XX days`

where, `MONTH` is the name of the month and `XX` is the correct number of days in that month.

Hint: See computing the dot product in Section 2.1.3 for how to use two vectors in a loop.

In [95]:
# A for loop to print the number of days in each month


### 5.3 Review Exercise:  <a name="back1"></a> Indexing.

__(A)__ In the cell below write a program that adds two vectors, $\mathbf{A}$ and $\mathbf{B}$, expressed as lists 
<br> (Hint: See computing the dot product in Section 2.1.3 for how to use two vectors in a loop).

The vectors can be of any length ($n$).
<br> However the length of the two vectors must be equal (or an error will be generated). 

 $ \mathbf{A} = [A_1, A_2, ... A_n]$
 
 $ \mathbf{B} = [B_1, B_2, ... B_n]$
 
 $ \mathbf{A} + \mathbf{B} = [(A_1 + B_1), 
                              (A_2 + B_2),
                              ...
                              (A_n + B_n)]$
Use your code to add vectors:

$\mathbf{A} = [-2, 1, 3]$

$\mathbf{B} = [6, 2, 2]$
                              
 
<br>
__(B)__ Use the function `len()` (Section 2.1) to 
find the length of $\mathbf{A}$ and the length of $\mathbf{B}$ before adding the two vectors.

<br>
__(C)__ Use a comparison operator (`==`, `<`, `>`....) to determine if the length of $\mathbf{A}$ and the length of $\mathbf{B}$ are equal or unequal (Seminar 1, Section 7.1) before adding the two vectors.

<br>
__(D)__ Use `if` and `else` statements (Section 1.1) to:
- perform the addition __only__ if the length of $\mathbf{A}$ and the length of $\mathbf{B}$ are equal.
- print a message  (e.g. "`unequal vector length!`") and __do not__ perform the addition, if the length of $\mathbf{A}$ and the length of $\mathbf{B}$ are unequal.   

<br>
__(E)__ Check your code works by testing it using vectors of equal length and vectors of mismatched length.

In [96]:
# Vector addition program with length check.

### 5.4 Review Exercise: <a name="back1"></a> `if` and `else` statements.

__(A)__ Copy and paste the program you wrote earlier to find the dot product of two vectors (Section 2.1.3) into the cell below.

__(B)__ After the loop, use `if`, `elif` and `else` to make the program print:
 - "`The angle between vectors is acute`" if the dot product is positive.
 - "`The angle between vectors is obtuse`" if the dot product is negative.
 - "`The vectors are perpendicular`" if the dot product is 0.

In [97]:
# Determining angle types using the dot product.

### 5.5 Review Exercise: <a name="back1"></a> Dictionaries.

<img src="../../../ILAS_seminars/intro to python/img/periodic_table.gif" alt="Drawing" style="width: 300px;"/>

__(A)__ Choose 5 elements from the periodic table.
<br>
In the cell below create a dictionary: 
 - __keys:__ chemical symbol names 
 - __values:__ atomic numbers 
 
 e.g. 
 ```python
 dictionary = {"C":6, "N":7, "O":8....}
 ```

__(B)__ Remove one entry from the dictionary and print it.

__(C)__ Add a new entry (chemical symbol and atomic number) to the dictionary and print it.

__(D)__ Use a `for` loop and your original dictionary to create a new dictionary (See section 2.3):  
 - __keys:__ atomic numbers 
 - __values:__ chemical symbols

__*Optional Extension*__

Generate a __list__ of the chemical symbols in your dictionary, sorted into alphabetical order.
Hints:
 - Create an empty list (Section 2.1)
 - Use a for loop to add each chemical symbol to the list (Section 2.3 shows how to add to a data structure with each loop, Section 2.1.2 shows how to add an item to a list)
 - Put the list in alphabetical order (2.1.2)

In [98]:
# Dictionary of periodic table items.

### 5.6 Extension Exercise: Selecting data structures.

<img src="../../../ILAS_seminars/intro to python/img/2d_poly.png" alt="Drawing" style="width: 300px;"/>

For a simple (non-intersecting) polygon:

 - with $n$ vertices 
 $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1})$
 - where $(x_n, y_n) = (x_0, y_0)$. 
 - where the vertices are ordered as you move around the polygon.

the area $A$ is given by:
$$
A = \left| \frac{1}{2} \sum_{i=0}^{n-1} \left(x_{i} y _{i+1} - x_{i+1} y_{i} \right) \right|
$$

Write a program that computes the area of a simple polygon with an arbitrary number of vertices.

Write a `for` loop to do the summation as you did when finding the dot product. 

Before doing this you must choose a data structure to represent the coordinates of each vertex of the polygon. 

Test your program for some simple shapes. 

In [99]:
# Program to calculate the area of a polygon.

## Summary: Data Structures
 - A data structure is used to assign a collection of values to a single collection name.
 - A Python list can store multiple items of data in sequentially numbered elements (numbering starts at zero)
 - Data stored in a list element can be referenced using the list name followed by an index number in [] square brackets.
 - The `len()` function returns the length of a specified list.
 - A Python tuple is a sequence whose values cannot be individually changed, removed or added to (except by creating a new tuple).
 - Data stored in a tuple element can be referenced using the tuple name followed by an index number in [] square brackets.
 - A Python dictionary is a collection of key: value pairs of data in which each key must be unique.
 - Data stored in a dictionary element can be referenced using the dictionary name followed by its key in [] square brackets. 