# 02 Data Structures and Libraries
## CLASS MATERIAL

<br> <a href='#DataStructures'>1. Data Structures</a>
<br> <a href='#Libraries__'>2. Libraries</a> 
<br> <a href='#ReviewExercises'>3. Review Exercises</a>

# Download the new class notes.
__Navigate to the directory where your files are stored.__

__Update the course notes by downloading the changes__




##### Windows
Search for __Git Bash__ in the programs menu.

Select __Git Bash__. A terminal will open.

Use `cd` to navigate to *inside* the __ILAS_PyEng2019__ repository you downloaded. 

Run the command:
>`./automerge`



##### Mac
Open a terminal. 

Use `cd` to navigate to *inside* the __ILAS_PyEng2019__ repository you downloaded. 

Run the command:
>`sudo ./automerge`

Enter your password when prompted. 

<a id='Summary'></a>
# Primer Summary 
For more information refer to the primer notebook for this class 02_DataStructures_Libraries__Primer.ipynb


###### Data Structures

- Python has an extensive __standard library__ of built-in functions. 
- More specialised libraries of functions and constants are available. We call these __packages__. 
- Packages are imported using the keyword `import`
- The function documentation tells us what it does and how to use it.
- When calling a library function, it must be prefixed with a __namespace__, which shows which package it should be called from.



###### Libraries
- Python has an extensive __standard library__ of built-in functions. 
- More specialised libraries of functions and constants are available. We call these __packages__. 
- Packages are imported using the keyword `import`
- The function documentation tells us what it does and how to use it.
- When calling a library function, it must be prefixed with a __namespace__, which shows which package it should be called from.
- The magic function `%timeit` can be used to time the execution of a function. 




### Fundamental programming concepts
 - Importing existing libraries of code to use in your program
 - Storing data in grid-like structures

<a id='DataStructures'></a>
# 1. Data Structures

In the last seminar we learnt to generate a range of numbers for use in the control flow of a program, using the function `range()`:


       for j in range(20):
           ...
    
        
Often we want to manipulate data that is more meaningful than ranges of numbers.

These collections of variables might include:
 - the results of an experiment
 - a list of names
 - the components of a vector
 - a telephone directory with names and associated numbers.
    

Python has different __data structures__ that can be used to store and manipulate these values.

Like variable types (`str`, `int`, `float`...), different data structures behave in different ways.

Today we will learn to use `list`s.

A list is a container with compartments in which we can store data:
<p align="center">
  <img src="img/ice_cube_tray.png" alt="Drawing" style="width: 500px;"/>
</p>

Example

If we want to store the names of students in a laboratory group, 
rather than representing each student using an individual string variable, we could use a list of names. 



In [1]:
lab_group0 = ["Yukari", "Sajid", "Hemma", "Ayako"]
lab_group1 = ["Sara", "Mari", "Quang", "Sam", "Ryo", "Nao", "Takashi"]

print(lab_group0)
print(lab_group1)

['Yukari', 'Sajid', 'Hemma', 'Ayako']
['Sara', 'Mari', 'Quang', 'Sam', 'Ryo', 'Nao', 'Takashi']


This is useful because we can perform operations on lists such as:
 - checking its length (number of students in a lab group)
 - sorting the names in the list into alphabetical order
 - making a list of lists (we call this a *nested list*):


In [2]:
lab_groups = [lab_group0, lab_group1]
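
The list operations mentioned above can be sketched as follows. This is a minimal illustration that reuses the lab group names from the example. `len` and `sorted` are Python built-in functions; the variable names `group_size` and `sorted_group` are chosen here for illustration only.

```python
lab_group0 = ["Yukari", "Sajid", "Hemma", "Ayako"]
lab_group1 = ["Sara", "Mari", "Quang", "Sam", "Ryo", "Nao", "Takashi"]

# Number of students in the group
group_size = len(lab_group0)
print(group_size)

# A new list with the names in alphabetical order
sorted_group = sorted(lab_group0)
print(sorted_group)

# A nested list: each element is itself a list
lab_groups = [lab_group0, lab_group1]

# First name in the second group (counting starts at 0)
print(lab_groups[1][0])
```

Note that `sorted` returns a new list and leaves `lab_group0` unchanged.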

<a id='ExampleChangePosition'></a>
### Example: Change in Position: (Representing Vectors using Lists)

__Vector:__ A quantity with magnitude and direction.

The position of a point in 2D space (e.g. the position of a character in a game) can be expressed in terms of horizontal (x) and vertical (y) coordinates. 

The movement to a new position can be expressed as a change in x and y. 

This change is known as the velocity with which the point moves. 

<img src="img/schiffman_velocity_vector.png" alt="Drawing" style="width: 700px;"/>

[Daniel Schiffman, The Nature of Code]


We can conveniently express the position $\mathbf{r}$ in matrix (or basis vector) form using the coefficients $x$ and  $y$: 
$$
\mathbf{r} = [r_x, r_y]
$$


__...which looks a lot like a Python list!__


When we move a character in a game, we change its position. 
<br>The change in position with each time-step is the __velocity__ of the character.

$$
\mathbf{v} = [v_x, v_y]
$$



To get the position at the next time step we simply add the x and y components of the velocity to the x and y components of the initial position vector:

 <img src="img/schiffman_vector.png" alt="Drawing" style="width: 800px;"/>
 


\begin{align}
\mathbf{r}(t+1)
&=\mathbf{r}(t) + \mathbf{v}\\
&=[(r_x(t)+v_x),\;\; (r_y(t)+v_y)]
\end{align}



For example, let's find the position at the next timestep where:
 - initial position,  $\mathbf{r} = [5, 2]$
 - velocity,  $\mathbf{v} = [3, 4]$
 
 <img src="img/schiffman_vector.png" alt="Drawing" style="width: 500px;"/>
 
 [Daniel Schiffman, The Nature of Code]

In [1]:
# Example : Change in Position

r = [5, 2]
v = [3, 4]

r = [r[0] + v[0], 
     r[1] + v[1]]

print(r)

[8, 6]


Arranging the code on separate lines:
 - makes the code more readable
 - does not affect how the code works
 
Line breaks can only be used within code that is enclosed by at least one set of brackets (), []. 
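
As a short illustration of the rule above, the sketch below splits two expressions over several lines. The variable names `total` and `components` are assumptions made for this example only.

```python
# The expression continues over two lines because it is inside ()
total = (1 + 2 +
         3 + 4)
print(total)

# A list can also be split over several lines because it is inside []
components = [5,
              2]
print(components)
```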

<a id='ExampleDotProduct'></a>
### Example: The Dot Product (Representing Vectors using Lists)

__Vector:__ A quantity with magnitude and direction.

The position vector $\mathbf{r}$ indicates the position of a point in 3D space.
$\mathbf{r}$ can be expressed in terms of x,y, and z-directions.

$$
\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}
$$

$\mathbf{i}$ is the displacement one unit in the x-direction<br>
$\mathbf{j}$ is the displacement one unit in the y-direction<br>
$\mathbf{k}$ is the displacement one unit in the z-direction

<img src="img/3d_position_vector.png" alt="Drawing" style="width: 500px;"/>



We can conveniently express $\mathbf{r}$ in matrix (or basis vector) form using the coefficients $x, y$ and $z$: 
$$
\mathbf{r} = [x, y, z]
$$

__...which looks a lot like a Python list!__


You will encounter 3D vectors a lot in your engineering studies.

They are used to describe many physical quantities, e.g. force.

The __dot product__ is a really useful algebraic operation.

It takes two equal-length *sequences of numbers* (often coordinate vectors) and returns a single number. 
 

__ALGEBRAIC REPRESENTATION OF THE DOT PRODUCT__

The dot product of two $n$-length-vectors:
<br> $ \mathbf{A} = [A_1, A_2, ... A_n]$
<br> $ \mathbf{B} = [B_1, B_2, ... B_n]$

\begin{align}
\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^n A_i B_i
\end{align}



So the dot product of two 3D vectors:
<br> $ \mathbf{A} = [A_x, A_y, A_z]$
<br> $ \mathbf{B} = [B_x, B_y, B_z]$


\begin{align}
\mathbf{A} \cdot \mathbf{B} &= \sum_{i=1}^n A_i B_i \\
&= A_x B_x + A_y B_y + A_z B_z
\end{align}



__Example : Dot Product__

Let's write a program to solve this using a Python `for` loop.

1. We initialise a variable, `dot_product`, with a value = 0.0.

1. With each iteration of the loop:
<br>`dot_product +=` the product of `a` and `b`.  

<p align="center">
  <img src="img/flow_diag_for_loop_dot_product.png" alt="Drawing" style="width: 400px;"/>
</p>

In [1]:
# Example : Dot Product

A = [1.0, 3.0, -5.0]
B = [4.0, -2.0, -1.0]

# Create a variable called dot_product with value, 0.0
dot_product = 0.0

# Update the value each time the code loops
for a, b in zip(A, B):
    dot_product += a * b

# Print the solution
print(dot_product)

3.0


(Solution in 02_DataStructures_LibraryFunctions_SOLS.ipynb)

__Check Your Solution:__ 

The dot product $\mathbf{A} \cdot \mathbf{B}$:
<br> $ \mathbf{A} = [1, 3, -5]$
<br> $ \mathbf{B} = [4, -2, -1]$



\begin{align}
[1,3,-5]\cdot [4,-2,-1]&=(1)(4)+(3)(-2)+(-5)(-1)\\
&= 4 \qquad - 6 \qquad + 5 \\
&=3
\end{align}

__Check Your Solution (Change in Position):__ 


$ \mathbf{r} = [5, 2]$
<br> $ \mathbf{v} = [3, 4]$


\begin{align}
\mathbf{r} + \mathbf{v}
&=[5, 2]+ [3, 4]\\
&=[(5+3), \quad (2+4)] \\
&= [8, 6]
\end{align}

<a id='Libraries__'></a>
# 2. Libraries

One of the most important concepts in good programming is to reuse code and avoid repetitions.

Python, like other modern programming languages, has an extensive *library* of built-in functions. 

These functions are designed, tested and optimised by the developers of the Python language.  

We can use these functions to make our code shorter, faster and more reliable.

   

<a id='StandardLibrary'></a>
## 2.1 The Standard Library

<br> &emsp;&emsp; <a href='#StandardLibrary'>__2.1 The Standard Library__</a> 
<br> &emsp;&emsp; <a href='#Packages'>__2.2 Packages__ </a> 
<br> &emsp;&emsp; <a href='#FunctionDocumentation'>__2.3 Function Documentation__</a> 
<br> &emsp;&emsp; <a href='#Namespaces'>__2.4 Namespaces__</a> 
<br> &emsp;&emsp; <a href='#ImportingFunction'>__2.5 Importing a Function__</a> 
<br> &emsp;&emsp; <a href='#Optimise'>__2.6 Using Package Functions to Optimise your Code__</a> 


Python has a large standard library, which includes a set of built-in functions. 

e.g. `print()` takes the __input__ in the parentheses and __outputs__ a visible representation.

The built-in functions are listed on the Python website:
https://docs.python.org/3/library/functions.html

We could write our own code to find the minimum of a group of numbers:




In [41]:
x0 = 1
x1 = 2
x2 = 4

x_min = x0
if x1 < x_min:
    x_min = x1
if x2 < x_min:
    x_min = x2
        
print(x_min)

1


However, it is much faster to use the built-in function:

In [42]:
print(min(1,2,4))

1


Library functions are stored in (.py) files called 'modules'.

The files are neatly arranged into a system of __sub-packages__ (sub-folders) and __modules__ (files).

These files are stored on the computer you are using.

A quick google search for "python function to sum all the numbers in a list"...

https://www.google.co.jp/search?q=python+function+to+sum+all+the+numbers+in+a+list&rlz=1C5CHFA_enJP751JP751&oq=python+function+to+sum+&aqs=chrome.0.0j69i57j0l4.7962j0j7&sourceid=chrome&ie=UTF-8

...returns the function `sum()`.

`sum()` finds the sum of the values in a data structure.





In [43]:
print(sum([1,2,3,4,5]))

print(sum((1,2,3,4,5)))

a = [1,2,3,4,5]
print(sum(a))

15
15
15


The function `max()` finds the maximum value in a data structure.
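
As a minimal sketch, `min()` and `max()` can be applied to a list in the same way as `sum()`. The list `measurements` and its values are made up for this illustration.

```python
# Hypothetical measurements, invented for this example
measurements = [3.2, 1.5, 4.8, 2.1]

smallest = min(measurements)
largest = max(measurements)

print(smallest)
print(largest)
```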

<a id='Packages'></a>
## 2.2 Packages

The standard library tools are available in any Python environment.

More specialised libraries, called packages, are available for more specific tasks 
<br>e.g. solving trigonometric functions.

Packages contain functions and constants.  

A package must be installed before we can use it.   



Two widely used packages for mathematics, science and engineering are `NumPy` and `SciPy`.

These are already installed as part of Anaconda.

A package is a collection of Python modules: 
- a __module__ is a single Python file
- a __package__ is a directory of Python modules.<br>(It contains an `__init__.py` file, to distinguish it from folders that are not libraries).

For example, these are the files that are stored on your computer when Pygame is installed:
<br>https://github.com/pygame/pygame

The `import` statement must appear before the use of the package in the code.  

        import numpy 

After this, any function in `numpy` can be called as:

        numpy.function()
        
and any constant in `numpy` can be called as:

        numpy.constant

There are many mathematical functions available. <br>
https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html
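
As a concrete sketch of the `numpy.function()` and `numpy.constant` pattern, the example below uses the function `numpy.sqrt` and the constant `numpy.pi`. The variable name `root` is chosen for illustration only.

```python
import numpy

# Calling a function from the numpy package
root = numpy.sqrt(16.0)
print(root)

# Using a constant from the numpy package
print(numpy.pi)
```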

We can change the name of a package e.g. to keep our code short and neat.

Using the __`as`__ keyword:

In [138]:
import numpy as np
print(np.pi)

3.141592653589793


We only need to import a package once, at the start of the program or notebook.

<a id='UsingPackageFunctions'></a>
## Using Package Functions. 

Let's learn to use `numpy` functions in our programs. 





In [139]:
# Some example Numpy functions with their definitions (as given in the documentation)

x = 1

# Trigonometric sine
print(np.sin(x))

# Compute tangent 
print(np.tan(x))

# Trigonometric inverse tangent
print(np.arctan(x))



0.841470984808
1.55740772465
0.785398163397


In [140]:
x = 1

# Convert angles from radians to degrees
degrees = np.degrees(x)
print(degrees)

# Convert angles from degrees to radians
radians = np.radians(degrees)
print(radians)   

57.2957795131
1.0


<a id='FunctionDocumentation'></a>
## 2.3 Function Documentation

Online documentation can be used to find out: 
- what to include in the () parentheses
- allowable data types to use as arguments
- the order in which arguments should be given 


A google search for 'numpy functions' returns:

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html

(this list is not exhaustive). 

### Try it yourself:
<br> Find a function in the Python Numpy documentation that matches the function definition and use it to solve the following problem:   

Find the hypotenuse of a right angle triangle if the lengths of the other two sides are 3 and 6. 

In [141]:
# The “legs” of a right angle triangle are 6 units and 3 units, 
# Return its hypotenuse in units.

<a id='Examplenumpycos'></a>
### Example : numpy.cos
Documentation : https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html 

The documentation tells us the following information...

##### What the function does.
"Cosine element-wise."



##### All possible function arguments (parameters)

<img src="img/numpy_cos.png" alt="Drawing" style="width: 500px;"/> 

>numpy.cos(<font color='blue'>x</font>, /, <font color='red'>out=None</font>, *, <font color='green'>where=True, casting='same_kind', order='K', dtype=None, subok=True</font> [, <font color='purple'>signature, extobj</font> ]) 

In the () parentheses following the function name are:
- <font color='blue'>*positional* arguments (required)</font>
- <font color='red'>*keyword* arguments (with a default value, optionally set). Listed after the `/` slash.</font>
- <font color='green'>arguments that must be explicitly named. Listed after the `*` star.</font> 
  <br><font color='purple'>(including arguments without a default value.  Listed in `[]` brackets.)</font>



##### Function argument definitions and acceptable forms.  

<img src="img/numpy_cos_params.png" alt="Drawing" style="width: 500px;"/> 

x : array_like *(it can be an `int`, `float`, `list` or `tuple`)*

out : ndarray, None, or tuple of ndarray and None, optional

where : array_like, optional 



##### What the function returns
__y__ : ndarray<br>
&nbsp; &nbsp; &nbsp; &nbsp; The corresponding cosine values.
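
As a minimal sketch of how this documentation translates into a function call, the example below passes only the required positional argument `x` and accepts the defaults for everything else. The variable names `single` and `several` are assumptions made for this illustration.

```python
import numpy as np

# x can be a single number...
single = np.cos(0.0)
print(single)

# ...or a list, in which case the cosine of each element is returned
several = np.cos([0.0, np.pi])
print(several)
```

Because `x` is array_like, the same function works for a single value and for a list of values.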

Let's look at the function numpy.degrees:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.degrees.html

What does the function do?

What __arguments__ does it take (and are there any default arguments)? 

How would we __write__ the function when __calling__ it (accept defaults)?

What __data type__ should our input be? 

<a id='Namespaces'></a>
## 2.4 Namespaces
<br>By prefixing `cos` with `np`, we are using a *namespace* (which in this case is `np`).



The namespace shows we want to use the `cos` function from the Numpy package.

If `cos` appears in more than one package we import, then there will be more than one `cos` function available.

We must make it clear which `cos` we want to use. 




Often, functions with the same name, from different packages, will use different algorithms for performing the same or similar operation. 

They may vary in speed and accuracy. 

In some applications we might need an accurate method for computing the square root, for example, and the speed of the program may not be important. For other applications we might need speed with an allowable compromise on accuracy.


Below are two functions, both named `sqrt`. 

Both functions compute the square root of the input.

 - `math.sqrt`, from the module `math`, gives an error if the input is a negative number. It does not support complex numbers.
 - `cmath.sqrt`, from the module `cmath`, supports complex numbers.


In [142]:
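import math
import cmath
print(math.sqrt(4))
# Uncomment the next line to see the error raised for a negative input
#print(math.sqrt(-5))
# Uncomment the next line to compute the complex square root instead
#print(cmath.sqrt(-5))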

2.0


Two developers collaborating on the same program might choose the same name for two functions that perform similar tasks. 

If these functions are in different modules, there will be no name clash since the module name provides a 'namespace'. 
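
The sketch below illustrates how the namespace prefix keeps two functions with the same name apart. It uses `math.sqrt` and `cmath.sqrt` from the discussion above; the variable names `real_root` and `complex_root` are chosen for illustration only.

```python
import math
import cmath

# The prefix tells Python which sqrt function to use
real_root = math.sqrt(9)
complex_root = cmath.sqrt(-9)

print(real_root)
print(complex_root)
```

Both functions can be used in the same program without a name clash because each is called through its own module name.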

<a id='ImportingFunction'></a>
## 2.5 Importing a Function
Single functions can be imported without importing the entire package e.g. use:

        from numpy import cos

instead of:

        import numpy 

After this you call the function without the numpy prefix: 

In [143]:
from numpy import cos

cos(x)

0.54030230586813977

Be careful when doing this as there can be only one definition of each function.
In the case that a function name is already defined, it will be overwritten by a more recent definition. 

In [144]:
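from cmath import sqrt
print(sqrt(-1))
# This import overwrites the cmath definition of sqrt
from math import sqrt
# Uncomment the next line to see that sqrt now gives an error for negative input
#print(sqrt(-1))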

1j


A potential solution to this is to rename individual functions or constants when we import them:

In [145]:
from numpy import cos as cosine

cosine(x)

0.54030230586813977

In [146]:
from numpy import pi as pi
pi

3.141592653589793

This can be useful when importing functions from different modules:

In [147]:
from math import sqrt as square_root
from cmath import sqrt as complex_square_root

print(square_root(4))
print(complex_square_root(-1))

2.0
1j


Function names should be chosen wisely.
 - relevant
 - concise

<a id='Optimise'></a>
## 2.6 Using Package Functions to Optimise your Code

Let's look at some examples of where Numpy functions can make your code shorter and neater.

The mean of a group of numbers:

In [149]:
x_mean = (1 + 2 + 3)/3   

Using Numpy:

In [150]:
x_mean = np.mean([1, 2, 3])

<a id='DataStructuresFunctionArguments'></a>
## Data Structures as Function Arguments. 

Notice that the Numpy function `mean` takes a list as its argument.

The numbers must be grouped in a data structure, such as a list, for the function to work. 

In [151]:
ls = [1, 2, 3]
x_mean = np.mean(ls)

<a id='ElementwiseFunctions'></a>
### Elementwise Functions
Numpy functions often operate *elementwise*. 
<br> This means if the argument is a list, they will perform the same function on each element of the list.

For example, to find the square root of each number in a list, we can use:

In [152]:
a = [9, 25, 36]
print(np.sqrt(a))

[ 3.  5.  6.]


Elementwise operation can be particularly important when performing basic mathematical operations:

In [153]:
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

print(a + b)
print(np.add(a,b))

[1, 2, 3, 4, 5, 6]
[5 7 9]


Numpy has its own data structure, the array, that is more suitable for handling numerical data.

The output of `np.add` above is a Numpy array. A list can be converted to a Numpy array using the function `np.array`.
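
As a minimal sketch of the Numpy data structure, the example below converts two lists to arrays using `np.array`. With arrays, the `+` and `*` operators act elementwise rather than joining or repeating the sequences as they do for lists. The variable names `total` and `doubled` are assumptions made for this illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Elementwise addition of the two arrays
total = a + b
print(total)

# Each element multiplied by 2
doubled = a * 2
print(doubled)
```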



<a id='ReviewExercises'></a>
# 3. Review Exercises

Complete the exercises below.

Save your answers as .py files and email them to:
<br>philamore.hemma.5s@kyoto-u.ac.jp

## Review Exercise 1 : Finding Functions to Import

<br>Earlier, we found the dot product of two vectors:
<br>$ \mathbf{A} = [A_x, A_y, A_z]$
<br>$ \mathbf{B} = [B_x, B_y, B_z]$

$\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^n A_i B_i = A_x B_x + A_y B_y + A_z B_z$

There is a `numpy` function to compute the dot product.

<br>Find the `numpy` function online and use it to compute the dot product of two 3D position vectors, expressed as lists `C` and `D`.
```
C = [-1, 2, 6]
D = [4, 3, 3]
```

In [None]:
# Review Exercise 1 : Finding Functions to Import
C = [-1, 2, 6]
D = [4, 3, 3]

## Review Exercise 2 : Combining Imported Functions

The dot product of two vectors can also be expressed as:

\begin{align}
\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}| |\mathbf{B}| \cos(\theta)
\end{align}

Where:

<br>$\theta$ is the angle between the two vectors

$|\mathbf{A}|$ is the magnitude of vector $\mathbf{A}$.

$|\mathbf{B}|$ is the magnitude of vector $\mathbf{B}$.



The magnitude of an $n$-length vector $ \mathbf{A} = [A_1, ..., A_n]$ is:

$|\mathbf{A}| = \sqrt{A_1^2 + ... + A_n^2}$







Find the angle between the vectors `C` and `D` in __Review Exercise 1__.

*Hint:*

Search online to find a numpy function that computes *magnitude*.

Search online to find a numpy function for the *inverse cosine*.







In [None]:
# Find the angle between C and D
C = [-1, 2, 6]
D = [4, 3, 3]




## Review Exercise 3 :  Classifier

The dot product also indicates if the angle between two vectors $\mathbf{A}$ and $\mathbf{B}$ is:

   - acute ($\mathbf{A} \cdot \mathbf{B}>0$)
   - obtuse ($\mathbf{A} \cdot \mathbf{B}<0$)
   - right angle ($\mathbf{A} \cdot \mathbf{B}=0$)

Using `if`, `elif` and `else`, classify the angle between `C` and `D` as acute, obtuse or right angle.

In [1]:
# Review Exercise 3 :  Classifier

C = [-1, 2, 6]
D = [4, 3, 3]



## Review Exercise 4: Numpy Package Functions. 
Find a function in the Python Numpy documentation that matches the function definition and use it to solve the problems below:

__(A)__ Definition: *Calculates the exponential function, $y= e^x$ for all elements in the input array.*

Print a list where each element is the exponential function of the corresponding element in list `a = [0.1, 0, 10]`

In [None]:
# Print a list where each element is the exponential of the corresponding element in list a

__(B)__ Definition: *Converts angles from degrees to radians.*

Convert angle `theta`, expressed in degrees, to radians:
<br>`theta` = 47

In [None]:
# convert angle `theta`, expressed in degrees, to radians

__(C)__ Definition: *Return the positive square-root of an array, element-wise.*

Print a list where each element is the square root of the corresponding element in list `a = [4, 16, 81]`

In [None]:
# Print a list where each element is the square root of the corresponding element in list a

## Review Exercise 5:  Using a single list with a `for` loop.
In the cell below, use a `for` loop to print the first letter of each month in the list.



In [3]:
# Print the first letter of each month in the list

months = ["January",
         "February",
         "March",
         "April",
         "May",
         "June",
         "July",
         "August",
         "September",
         "October",
         "November",
         "December"]