<p style="text-align: center;"><font size="8"><b>Section 1.6: Functions and Modules</b></font><br>


# Calling Functions

Some functions stand on their own and can be called outside the context of a particular object or class. For example, we've already seen the `round` and `len` functions. Remember that we call `round(a)`, not `a.round()`. Python provides many built-in functions. 

![functions](https://github.com/lukasbystricky/ISC-3313/blob/master/lectures/chapter2/images/functions.png?raw=true)

In [None]:
pow(2,3)

8

In [None]:
min(-2,-2.4,9.0)

-2.4

In [None]:
ord("a")

97

# Print function

After performing a computation we need to see the result. In an interactive or notebook session we can simply type the name of the variable to see what it is.

In [None]:
a = 1
a

1

When executing scripts, however, we can no longer do this. Instead we use the `print` function (in fact, we've already used it several times).

The `print` function is used to print information to the console. 

In [None]:
print(a)
print("a")

1
a


Note that the original Python 2 syntax, `print a` (no parentheses), does not work in Python 3. However, the single-argument syntax above (with parentheses) works in both Python 2 and Python 3.


Of course, `print` can be used to display more useful information. For example, suppose we have a variable called `t` that represents some length of time in seconds. We can use `print` to display not only `t`, but also the units by calling `print` with multiple arguments.

In [None]:
t = 5.6
print(t, "s")

5.6 s


The `print` function automatically inserts a space between the two arguments. If we wish to avoid this, we can combine the two arguments into one.

In [None]:
print(str(t)+"s")

5.6s


Note that we have to convert `t` to a string first before "adding" (the correct term is *concatenating*) it to `"s"`. The command

In [None]:
print(t+"s")

TypeError: ignored

is illegal because we are attempting to add a `float` to a `str`.

Another option is to use the f-string formatting we talked about earlier,

In [None]:
print(f"{t}s")

5.6s
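The spacing and number formatting discussed above can also be controlled in other ways. The following is a minimal sketch, assuming Python 3: the `sep` keyword argument of `print` and the `.2f` format specification are standard Python features that this section has not introduced, and the name `two_decimals` is only illustrative.

```python
t = 5.6

# sep sets the text print places between arguments (the default is one space)
print(t, "s", sep="")

# a format specification after the colon controls how the number is displayed
two_decimals = f"{t:.2f} s"
print(two_decimals)
```

Here `sep=""` removes the automatic space without converting `t` by hand, and `.2f` asks for exactly two digits after the decimal point.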


# Modules

All the functions listed above are built in to Python. This means they are automatically available once we start Python. But you'll notice that there are not actually all that many built-in functions. What if we want to take the logarithm of a number, for example?

There are hundreds of other useful functions and classes that have been developed for Python which are not automatically loaded, but instead placed into specialized libraries called *modules* that can be individually loaded as needed.

If you installed Python using Anaconda, you will already have many modules installed. We will first take a look at the math module and the NumPy module, both of which come with Anaconda. 

## Math module

The math module provides functions to do mathematical operations beyond addition, multiplication, exponentiation etc. For example suppose we want to take the cosine of a number. This is not built-in to Python, however a function called `cos` in the math library does this. 

To use the `cos` function we must import it from the math module. There are three possible ways to do this. Whichever method you use, the import is usually placed at the beginning of your code. This is not strictly necessary, but it is the most common placement; at the very least, the import must come *before* you use the function.

1) Import the entire module. 

In [None]:
import math

At this point we still cannot use the `cos` function directly; we must specifically tell Python where the function is coming from using a *qualified name*.

In [None]:
cos(2)

NameError: ignored

In [None]:
math.cos(2)

-0.4161468365471424

2) Specifically importing `cos` from the math library. If we are using `cos` several times in the code, this can avoid repeated typing.

In [None]:
from math import cos
cos(2)

-0.4161468365471424

3) Import everything in the module by using the \* wildcard. This can be an attractive option but is generally discouraged because different modules may use the same name for different functions. This method of importing imports not only the `cos` function, but all functions from the math library, for example `sqrt` and `tan`.

In [None]:
from math import *
cos(2)

-0.4161468365471424

In [None]:
sqrt(2)

1.4142135623730951
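To see why name clashes matter, here is a small sketch of one module's function silently replacing another's. It assumes the standard library's `cmath` module (complex-number math), which this section does not otherwise use; the names `real_root` and `complex_root` are illustrative.

```python
from math import sqrt
print(sqrt(4))          # the math version returns a float

from cmath import sqrt  # same name, different module: this replaces math's sqrt
print(sqrt(4))          # the cmath version returns a complex number

# qualified names keep the two functions apart
import math
import cmath
real_root = math.sqrt(4)
complex_root = cmath.sqrt(4)
print(real_root, complex_root)
```

With wildcard imports the same replacement happens for every shared name at once, which is why qualified names are usually the safer choice.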

# Expressions

Before looking at other modules, let's look quickly at expressions. We have already seen several expressions in isolation (e.g. 18+5.5).

It is quite common to perform several operations as part of a single expression.

In [None]:
a = 18 + 5.5 + 1
a

24.5

In this case behind the scenes Python adds 18 and 5.5 to get 23.5, it then adds 1 to get 24.5. In this case the order of the two operations does not matter, however in more complicated expressions the order can be important.

In [None]:
a = 18*9**2/4
a

364.5

In [None]:
a = 9/4**2*18
a

10.125

## Precedence

When there are two or more operations as part of an expression, we must figure out some way to determine which operation is performed first. We say that an operation that is performed first is given *precedence* over the others.

Mathematical expressions in Python follow standard algebraic conventions:
1. Brackets
2. Exponents
3. Division/Multiplication
4. Addition/Subtraction

For example in the expression `1 + 2 * 3` the multiplication is done first, followed by the addition. 

In Python, as in algebra we can use brackets to prioritize an operation.

In [None]:
1+2*3

7

In [None]:
(1+2)*3

9

Most operations with equal precedence are evaluated left to right, again to mimic standard algebraic rules.  

One exception is exponents which are evaluated right to left, which again is how we typically think of exponents.

$4^{3^2} = 4^9$

In [None]:
4**3**2

262144

In [None]:
4**(3**2)

262144

In [None]:
(4**3)**2

4096

Even though precedence rules are based on algebraic rules, they are enforced for any data type. 

In [None]:
"a"*3+"b"

'aaab'
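One case where the precedence rules can surprise is a minus sign in front of a power. The sketch below uses illustrative variable names `x`, `y` and `z` and shows that exponentiation is applied before the unary minus, and that multiplication and division are worked through left to right.

```python
# exponentiation binds more tightly than the unary minus
x = -2**2       # evaluated as -(2**2)
y = (-2)**2     # brackets force the negation first

# multiplication and division share a level and run left to right
z = 8 / 4 * 2   # evaluated as (8 / 4) * 2

print(x, y, z)
```

When in doubt, adding brackets costs nothing and makes the intended order explicit.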

## Exercise

Write an expression that evaluates $3(7^2 + 4^{3^3} - 10)$.

Of course we can have much more complicated expressions involving the math module for example.

## Example

When boiling an egg, it has been determined that the time it takes for the center of the yolk to reach a desired temperature $T$ is given by

$$ t = \frac{M^{2/3}c\rho^{1/3}}{K\pi^2(4\pi/3)^{2/3}}\ln\left(0.76\frac{T_0 - 100}{T - 100}\right)$$

where
* $M$ is the mass of the egg
* $\rho$ is the density
* $c$ is the specific heat capacity
* $K$ is the thermal conductivity
* $T_0$ is the temperature at $t=0$

In [None]:
T = 70 # desired temperature
M = 47
rho = 1.038
c = 3.7
K = 5.4e-3
T0 = 4

# compute time according to above formula
# we need the natural logarithm function and the pi constant from the math module
from math import log, pi

t = (M**(2/3)*c*rho**(1/3))/(K*pi**2*(4*pi/3)**(2/3))*log(0.76*(T0 - 100)/(T - 100))
print(t)

313.09454902221637
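If we want to evaluate the formula for several sets of inputs, one option is to wrap it in a function. This is only a sketch: the function name `egg_time` and the second starting temperature of 20 are assumptions made for illustration, while the default values are the ones used in the cell above.

```python
from math import log, pi

def egg_time(T, M=47, rho=1.038, c=3.7, K=5.4e-3, T0=4):
    """Time for the centre of the yolk to reach temperature T, using the formula above."""
    prefactor = (M**(2/3) * c * rho**(1/3)) / (K * pi**2 * (4*pi/3)**(2/3))
    return prefactor * log(0.76 * (T0 - 100) / (T - 100))

t_fridge = egg_time(70)        # same inputs as the cell above
t_room = egg_time(70, T0=20)   # hypothetical egg that starts out warmer
print(t_fridge, t_room)
```

As expected, the egg that starts warmer needs less time to reach the same temperature.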


## Exercise 

A quadratic equation can be written as:
$$ ax^2 + bx + c = 0.$$

In general this equation has two (possibly equal) solutions and they are given by the formulas:

\begin{align*}
    x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a},\\
    x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.\\
\end{align*}

Use these formulas to compute the solutions of the equation $8x^2 + 16x + 4 = 0$.


In [None]:
a = ...
b = ...
c = ...

x1 = ...
x2 = ...

## Calling Functions from Within Expressions

Function calls have high precedence. When multiple function calls are used in the same expression they are typically evaluated from left to right.

In [None]:
person = "George Washington"
person.split()[1]

'Washington'

More complicated expressions are evaluated by first resolving commands inside parentheses.

In [None]:
groceries = ["cereal", "milk", "apple"]
groceries.insert(groceries.index("milk") + 1, "eggs")
groceries

['cereal', 'milk', 'eggs', 'apple']

Here we first must evaluate `groceries.index("milk")` and then add 1 to it to find the index where we wish to insert "eggs".

## Exercise

Write a one line expression to capitalize the word to the immediate right of "milk" in the given list. Assume that you don't know the index of "milk" beforehand.

In [None]:
groceries = ["cereal", "milk", "apple"]

# change apple to Apple
groceries[groceries.index("milk") + 1] = ...

# print new list
print(groceries)

['cereal', 'milk', Ellipsis]


# NumPy

The `NumPy` module (http://www.numpy.org/) is an almost indispensable module for scientific computing. It provides objects such as arrays and matrices as well as functions spanning linear algebra, Fourier transforms and statistics, among numerous other things. 

NumPy will be one of the modules you'll use often in this camp (and likely in most other scientific Python codes).

To start with we must import NumPy. To reduce the amount of typing for ourselves later we will rename the module `np` when we import it. Using `np` as shorthand for NumPy is a relatively standard convention in Python programming. 

In [None]:
import numpy as np

## Arrays

One important class that NumPy provides is the `array` class. An array is similar to a `list` in that it is a collection of objects. Typically arrays store numbers. 

NumPy arrays can be initialized in a similar way to lists.

In [None]:
a = np.array([1,2.0,3.2])
print(a)
type(a)

[1.  2.  3.2]


numpy.ndarray

You'll notice the type of `a` is `numpy.ndarray`. NumPy arrays can be multidimensional. You can think of a 1D array as a kind of list (but not a Python list) and a 2D array as a kind of grid (or, if you know linear algebra as a matrix, but not actually a matrix). Higher dimensional arrays are certainly possible, you can think of a 3D array as a stack of grids. 

![np array](https://github.com/lukasbystricky/ISC-3313/blob/master/lectures/chapter2/images/np_array.jpg?raw=true)
(Image credit: Dalesha Hemrajani)


The attribute `ndim` stores the number of dimensions in the array.

In [None]:
a.ndim

1

Multidimensional arrays can be initialized as an array of arrays. 

In [None]:
b = np.array([[1, 2, 3.0], [1.2,2.2,2]])
print(b)

[[1.  2.  3. ]
 [1.2 2.2 2. ]]


In [None]:
print(b.ndim)

2


The `shape` property tells us how many rows and columns are in our array, while the `size` property tells us the total number of elements in the array.

In [None]:
print(b.shape)
print(b.size)

(2, 3)
6


Here `b.shape` is the tuple (2,3) meaning that `b` has 2 rows and 3 columns.

We could also try to initialize an array from lists of different lengths. Recent versions of NumPy raise a `ValueError` for this unless we explicitly ask for an array of Python objects with `dtype=object`.

In [None]:
c = np.array([[1,2],[3,4,5,6]], dtype=object)

This is valid. What are the size and shape of `c` however?

In [None]:
print(c.ndim)
print(c.shape)
print(c.size)
print(c)

1
(2,)
2
[list([1, 2]) list([3, 4, 5, 6])]


The way we are initializing `c`, it looks like we are trying to make a multidimensional array with the first row being [1,2] and the second row being [3,4,5,6]. Clearly since the lengths of these two rows are unequal, we cannot make a grid out of them. 

NumPy recognizes this, and instead of making an array of dimension 2, it creates an array of dimension 1. Instead of having 6 elements, it only has 2. Each of the elements is a Python list.

### Indexing

It's important to know how arrays are numbered. Like lists and strings, arrays are 0 indexed, meaning the first entry in an array is at index 0. 

Two-dimensional arrays have rows and columns. The entry at index [0,0] (first row, first column) is located at the upper left hand corner of the array. 

![row map](https://github.com/lukasbystricky/ISC-3313/blob/master/lectures/chapter2/images/row_column.gif?raw=true)

Like lists and strings, arrays support indexing and slicing.

In [None]:
a = np.array([1,2.0,3.2])
a[0] # first entry in a

1.0

In [None]:
b = np.array([[1, 2, 3.0], [1.2,2.2,2]])
b[1] # second entry in b, each entry is a row

array([1.2, 2.2, 2. ])

When we have a multidimensional array (or an array of arrays of equal or unequal length) we can access the element at row i and column j using the syntax:

In [None]:
b[1][0] # first element in the second row of b

1.2

Or the equivalent syntax:

In [None]:
b[1,0]

1.2

Note that `b[1][0]` is the element in b at row 1, column 0.

`b[1][0]` can also be thought of as the element at index 0 of `b[1]`.

Note that arrays are mutable. For example we can modify an element of `b`.

In [None]:
b[0][0] = 8
print(b)

[[8.  2.  3. ]
 [1.2 2.2 2. ]]


Slicing is done in exactly the same way.

In [None]:
a = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(a)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [None]:
print(a[0,0:2]) # first 2 entries of row 0

[1 2]


In [None]:
print(a[0:2,1:3]) # first 2 rows, second and third columns

[[2 3]
 [5 6]]


The colon operator by itself means the entire row or column.

In [None]:
print(a[:,1]) # entire second column

[ 2  5  8 11]
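Negative indices and step sizes, which work for lists and strings, carry over to arrays as well. The following sketch reuses the 4 by 3 array from above; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

last_row = a[-1]          # negative indices count from the end
every_other_row = a[::2]  # start:stop:step with a step of 2 keeps rows 0 and 2
last_column = a[:, -1]    # every row, last column

print(last_row)
print(every_other_row)
print(last_column)
```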


## Exercise

Create a NumPy array representing the data:
$$ \begin{bmatrix} 1 & 2 & 3 & 4\\ 5 & 6 & 7 & 8\\ 9 & 10 & 11 & 12\\ 13 & 14 & 15 & 16\end{bmatrix}$$

## Exercise

Extract the middle 2x2 array from the array from the previous exercise. i.e. use slicing to extract the array:
$$ \begin{bmatrix} 6 & 7\\ 10 & 11\end{bmatrix}$$

### Operations

Arrays support several familiar operators. For example you can multiply or divide them by a number.

In [None]:
a = np.array([1,2])
print(2*a)
print(a/2)

[2 4]
[0.5 1. ]


You can also add a number to them. 

It should be made clear, when you add/subtract/multiply/divide/exponentiate/etc an array by a single number (known as a scalar) the operation is applied to **every** element of the array.

In [None]:
a = np.array([1,2])
print(a + 1)

[2 3]


Or add two arrays.

In [None]:
b = np.array([3,4])
print(a + b)

[4 6]


When you add two arrays together they must be the same size.

In [None]:
a = np.array([1,2])
b = np.array([2,4,5])
print(a + b)

ValueError: ignored

For those of you familiar with linear algebra, it may be tempting to think of 2D arrays as matrices and 1D arrays as vectors. This is __not__ true. 

For example, suppose we want to multiply two NumPy arrays, $[a_1,a_2,a_3]$ and $[b_1,b_2,b_3]$. There are three possible ways to multiply vectors:
1. dot product
2. cross product
3. outer product

Which does NumPy do?


In [None]:
a = np.array([1,2,3])
b = np.array([3,4,5])

print(a*b)

[ 3  8 15]


It turns out that NumPy doesn't automatically do any of the standard vector products. Instead it does *element-wise multiplication*. In other words, `a*b` is equal to $[a_1 b_1, a_2 b_2, a_3 b_3]$.

Now suppose $A$ is a 2D array and $x$ is a 1D array. In linear algebra, a matrix times a vector returns a vector. So what is `A*x` in Python? 

In [None]:
x = np.array([1,2])
A = np.array([[3,2], [1,2]])

print(A*x)

[[3 4]
 [1 4]]


We get a 2D array instead of a vector. This is because array operations are defined elementwise. `A*x` in this case is defined to be:

$$ \begin{bmatrix} A_{11}x_1 & A_{12}x_2\\ A_{21}x_1 & A_{22} x_2\end{bmatrix}$$

Likewise, if we compute `A**2`, we get

In [None]:
print(A**2)

[[9 4]
 [1 4]]


which is 
$$ \begin{bmatrix} A_{11}^2 & A_{12}^2\\A_{21}^2 & A_{22}^2\end{bmatrix}.$$
This is not what we expect from matrix squaring.

NumPy provides functions to treat arrays like matrices. For example the function `np.dot` computes the dot product of two arrays or the matrix-vector product of a 2D array and a 1D array. Likewise `np.cross` computes the cross product of two 1D arrays. 

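If you need to do matrix calculations beyond matrix-vector multiplication, NumPy also provides a dedicated matrix class that supports operations like matrix-vector multiplication using the \* symbol or exponentiation using \*\*. Note that NumPy's own documentation no longer recommends this class, even for linear algebra, in favor of regular arrays.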

NumPy also provides a submodule `linalg` that can do linear algebra operations: finding eigenvalues, solving linear systems etc.
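As a rough illustration of treating arrays like matrices, the sketch below reuses `A` and `x` from above. The `@` operator (available in Python 3.5 and later) and `np.linalg.matrix_power` are real Python and NumPy features that this section has not introduced; the variable names are illustrative.

```python
import numpy as np

A = np.array([[3, 2], [1, 2]])
x = np.array([1, 2])

Ax = A @ x                                # matrix-vector product
Ax_dot = np.dot(A, x)                     # the same product via np.dot
A_squared = np.linalg.matrix_power(A, 2)  # matrix square, i.e. A @ A

print(Ax)
print(A_squared)
```

Compare these results with the elementwise `A*x` and `A**2` computed earlier: the entries differ because each entry here combines a whole row with a whole column.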

## Exercise

Write a code fragment that adds the arrays:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4\end{bmatrix}$$
and 
$$ B = \begin{bmatrix} 5 & 6\\ 7 & 8\end{bmatrix}.$$

## Other Useful Operations


### arange
NumPy provides many useful operations to generate and manipulate arrays. 

For example suppose we want to create an array ranging from 1 to 20. NumPy provides a function `np.arange` that does just that.

In [None]:
a = np.arange(1,21)
print(a)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]


The arange function works in a similar way to the `range` class we saw earlier. It can take in up to three arguments: a starting value, an end value and a step size. It returns a 1D array that starts at the starting value and keeps adding the step size, stopping before it reaches the end value, so the end value itself is not normally included. 


Note that unlike the `range` class, the starting and ending values as well as the step size can be floats. 

In [None]:
a = np.arange(1.1,2.0,0.1)
print(a)

[1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9]


In [None]:
a = np.arange(1,2.2,0.1)
print(a)

[1.  1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.  2.1 2.2]


You'll notice however that when we use floating point numbers, we may or may not have the ending value as part of our array. This is due to floating point error. 

### linspace

This ambiguity in arange can cause problems. Fortunately, NumPy provides a separate function that creates an array of a specified length. The `np.linspace` function takes in a start and end value as well as the number of points. Unless you are dealing with integers, linspace is preferred over arange to generate equally spaced arrays. 

In [None]:
a = np.linspace(0,2.2,5)
print(a)

[0.   0.55 1.1  1.65 2.2 ]


The last parameter in linspace is the number of desired points in the array. If the start value is greater than the end value, then linspace automatically takes negative step sizes.

In [None]:
a = np.linspace(5,1,5)
print(a)

[5. 4. 3. 2. 1.]
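Two optional keyword arguments of `np.linspace` can be handy here; both are part of NumPy but have not been used in this section, and the variable names are illustrative. This sketch asks linspace to report the step size it chose, and then to leave the end value out.

```python
import numpy as np

# retstep=True also returns the spacing that linspace chose
points, step = np.linspace(0, 2.2, 5, retstep=True)

# endpoint=False leaves the end value out, as arange does
half_open = np.linspace(0, 1, 4, endpoint=False)

print(points, step)
print(half_open)
```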


### reshape

Suppose we wanted to create the array:
$$ A = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 &8&9\\10 & 11& 12\end{bmatrix}.$$

We could create this matrix by hand, but if it was much larger that would be a pain. What if instead we used arange or linspace to create a 1D array with the same data, and then reshaped into an array with 4 rows and 3 columns? 

NumPy provides the function `np.reshape` that does just that.

In [None]:
a = np.arange(1, 13)
A = np.reshape(a, (4, 3))

print(A)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


Note that the second argument to `np.reshape` is a tuple (r,c) where r is the number of rows and c is the number of columns we want the new array to have. 
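If we only know one of the two dimensions, NumPy can work out the other. The sketch below relies on the documented behaviour that a dimension given as `-1` is inferred from the array's size; the array method form `a.reshape(...)` is an equivalent alternative to `np.reshape(a, ...)`.

```python
import numpy as np

a = np.arange(1, 13)

A = np.reshape(a, (4, -1))   # -1 asks NumPy to work out the number of columns
B = a.reshape(-1, 6)         # method form; here the number of rows is inferred

print(A.shape)
print(B.shape)
```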

### zeros

Sometimes it can be useful to create an array of zeros to fill in with non-zero values later:

$$ A = \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 &0&0\\0 & 0& 0\end{bmatrix}.$$

Creating this by hand means typing out row after row of zeros, which, as in the `reshape` example, becomes painful for a larger matrix. 

Again NumPy saves the day with `np.zeros((n,m))`, where n is the number of rows and m is the number of columns of zeros that you want. We can even make 1D arrays of zeros with this function, a la `np.zeros(n)`.

In [None]:
a = np.zeros(5)
A = np.zeros((4, 3))
print(a,'\n')
print(A)

[0. 0. 0. 0. 0.] 

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


Making a large array of zeros is quite useful, especially if we need what is called a sparse array, where only a small number of the entries are non-zero. If we needed an array with only a 1 at row=1, column=2, it would be much easier to do `A = np.zeros((4,3))` and then `A[1,2] = 1`, instead of creating it by hand.
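NumPy has similar helpers for other fill values. The sketch below uses `np.ones` and `np.full`, which are real NumPy functions not covered above, and then builds the mostly-zero array just described.

```python
import numpy as np

ones = np.ones((2, 3))          # every entry is 1.0
sevens = np.full((2, 3), 7.0)   # every entry is the given fill value

A = np.zeros((4, 3))
A[1, 2] = 1                     # the single non-zero entry described above

print(ones)
print(sevens)
print(A)
```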
 

### sum and mean/average

We can also easily get the sum and mean (average) of the values stored in a NumPy array using `np.sum(A)` and `np.mean(A)`, respectively. We can use these functions for arrays with any number of dimensions. If the array is 2D or greater, you can specify the axis over which the sum or mean is taken, e.g. with a 2D array, `np.sum(A,axis=0)` would return the sum of each column and `np.mean(A,axis=1)` would return the mean of each row.

In [None]:
a = np.arange(1, 13)
B = np.reshape(a, (4, 3))
print(a)
print(B)
print("\nSum  of 'a':", np.sum(a))
print("Mean of 'a':", np.mean(a))
print("\nSum  of 'B':", np.sum(B))
print("Mean of 'B':", np.mean(B))
print("\nSum  of each column of 'B':", np.sum(B,axis=0))
print("Mean of each column of 'B':", np.mean(B,axis=0))
print("\nSum  of each row of 'B':", np.sum(B,axis=1))
print("Mean of each row of 'B':", np.mean(B,axis=1))


[ 1  2  3  4  5  6  7  8  9 10 11 12]
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

Sum  of 'a': 78
Mean of 'a': 6.5

Sum  of 'B': 78
Mean of 'B': 6.5

Sum  of each column of 'B': [22 26 30]
Mean of each column of 'B': [5.5 6.5 7.5]

Sum  of each row of 'B': [ 6 15 24 33]
Mean of each row of 'B': [ 2.  5.  8. 11.]
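A quick way to keep the axis convention straight is to look at the shape of the result: the axis you name is the one that disappears. The sketch below uses the same 4 by 3 array; the names `column_sums` and `row_sums` are illustrative.

```python
import numpy as np

B = np.reshape(np.arange(1, 13), (4, 3))   # 4 rows, 3 columns

column_sums = np.sum(B, axis=0)   # collapses the 4 rows: one value per column
row_sums = np.sum(B, axis=1)      # collapses the 3 columns: one value per row

print(B.shape, column_sums.shape, row_sums.shape)
print(column_sums)
print(row_sums)
```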


### size and shape

The size (total number of elements) and shape (number of rows, columns, etc.) of an array can also be obtained from the array directly.

For example, suppose `B` is a 2D array with 4 rows and 3 columns. `B.size` gives 12, the total number of elements in `B`, and `B.shape` gives the tuple `(4, 3)`, the number of elements along each dimension of the array.


In [None]:
a = np.arange(1, 13)
B = np.reshape(a, (4, 3))
print(a)
print(B)
print("\nSize  of 'a':", a.size)
print("Shape of 'a':", a.shape)
print("\nSize  of 'B':", B.size)
print("Shape of 'B':", B.shape)


[ 1  2  3  4  5  6  7  8  9 10 11 12]
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

Size  of 'a': 12
Shape of 'a': (12,)

Size  of 'B': 12
Shape of 'B': (4, 3)


### max and min

Finding the largest and smallest elements in an array of any shape is also very easy with NumPy! One simply needs to use `np.max(Array)` for the largest value and `np.min(Array)` for the smallest value.

The same rules apply with these functions when it comes to arrays with more than 1-dimension. You can specify these functions to get the maximum/minimum values for a given axis of the array, e.g. `np.max(B, axis=1)` would return an array of values giving you the maximum value for each row of `B`, axis=0 would give the maximum value for each column.

In [None]:
a = np.arange(1, 13)
B = np.reshape(a, (4, 3))
print(a)
print(B)
print("\nMax of 'a':", np.max(a))
print("Min of 'a':", np.min(a))
print("\nMax  of 'B':", np.max(B))
print("Max of each column of B:", np.max(B,axis=0))
B[0,0] = 100.0
print("\n",B)
print("Max of each column of B:", np.max(B,axis=0))



[ 1  2  3  4  5  6  7  8  9 10 11 12]
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

Max of 'a': 12
Min of 'a': 1

Max  of 'B': 12
Max of each column of B: [10 11 12]

 [[100   2   3]
 [  4   5   6]
 [  7   8   9]
 [ 10  11  12]]
Max of each column of B: [100  11  12]
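Sometimes we want to know where the largest value sits, not just what it is. The sketch below uses `np.argmax` and `np.unravel_index`, two real NumPy functions not covered above; the choice of placing 100 at row 2, column 1 is purely illustrative.

```python
import numpy as np

B = np.reshape(np.arange(1, 13), (4, 3))
B[2, 1] = 100

flat_index = np.argmax(B)                         # position in the flattened array
row, col = np.unravel_index(flat_index, B.shape)  # convert it to (row, column)

print(flat_index, row, col)
```

`np.argmin` works the same way for the smallest value.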


### where

Suppose we wanted to take all of the values of an array that are larger than 2.5 and make them zero. This is certainly doable by looping through the array, checking one value in each iteration of the loop and changing it if it's greater than 2.5. However, NumPy lets us make changes based on these kinds of logical expressions quickly and without writing a loop.

The `np.where()` function is the tool of choice for this. For our example, you would use `np.where(B < 2.5, B, 0.0)`, which translates to "where B is less than 2.5, use the corresponding value in B; otherwise use 0.0".

In [None]:
a = np.arange(1, 13)
B = np.reshape(a, (4, 3))
print(B,"\n")
print(np.where(B < 2.5, B, 0.0))



[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]] 

[[1. 2. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
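An alternative to `np.where` for this task is to index the array with a boolean array, often called a mask. This is a sketch of that approach, which is standard NumPy behaviour but not covered above; the names `mask` and `C` are illustrative.

```python
import numpy as np

B = np.reshape(np.arange(1, 13), (4, 3))

mask = B > 2.5        # boolean array, True where the condition holds
C = B.copy()          # work on a copy so that B itself is left alone
C[mask] = 0           # assign 0 to every entry selected by the mask

print(mask)
print(C)
```

Unlike `np.where`, which builds a new array, the assignment changes `C` in place, so its entries stay integers.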


### stack

Finally, we'll go over the `stack` function in NumPy. This function allows us to stack multiple arrays together into a larger, higher-dimensional array. Suppose we had three 1D arrays of length 5 each and we wanted to stack them on top of each other to form a single (3 by 5) 2D array. `np.stack` provides the means to do this.

In [None]:
a = np.arange(1, 6)
b = np.arange(6,11)
c = np.arange(11,16)
print("a = ", a)
print("b = ", b)
print("c = ", c)
D = np.stack((a,b,c),axis=0)
print("\nD = \n",D)




a =  [1 2 3 4 5]
b =  [ 6  7  8  9 10]
c =  [11 12 13 14 15]

D = 
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]]
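The `axis` argument decides how the inputs are arranged. The sketch below stacks the same three arrays as columns instead of rows, and also shows `np.vstack`, a related NumPy function not covered above; the names `E` and `F` are illustrative.

```python
import numpy as np

a = np.arange(1, 6)
b = np.arange(6, 11)
c = np.arange(11, 16)

E = np.stack((a, b, c), axis=1)   # each input becomes a column: shape (5, 3)
F = np.vstack((a, b, c))          # inputs stacked as rows: shape (3, 5)

print(E.shape)
print(F.shape)
```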


# Documentation - NumPy

But what if you forget how to properly use a function from NumPy? Or maybe you want to look for other functions that might be of use to you? Well, computer programmers have you covered with **documentation**. Just like the user manuals that come with a new car, microwave, cell phone, etc., Python modules come with their own user manuals! A link to NumPy's documentation can be found here: [NumPy docs link](https://numpy.org/doc/stable/user/index.html). On this website, you'll find a table of contents leading to different helpful resources for using NumPy in Python code, such as their [absolute beginner's guide](https://numpy.org/doc/stable/user/absolute_beginners.html). When in doubt, look at the documentation website!

For completeness, the Python `math` module documentation can be found here [math docs](https://docs.python.org/3/library/math.html#) which is a part of the greater [Python documentation website](https://docs.python.org/3/).

A word of caution: larger projects like NumPy are well supported, but some projects are not so fortunate. Smaller Python modules may not have documentation that is as good, because good documentation takes a lot of time and resources to make! However, there are still ways to find useful information about individual modules and their functions. Often, Google Search will be your friend, particularly search results that take you to stackoverflow.com, a useful resource when you are stuck on a programming problem.

## Exercise

Using the linspace and reshape functions, create the array:
$$ A = \begin{bmatrix} 0.1 & 0.2 & 0.3 & 0.4 & 0.5\\ 0.6 & 0.7 & 0.8 & 0.9 & 1\end{bmatrix}.$$

In [None]:
a = np.array([0.1, 0.2, 0.3])
print(np.sum(a))

0.6000000000000001


In [None]:
A = np.array([[0.1,0.2],[0.3,0.4]])
print(np.sum(A))

1.0


For 2D (or higher) arrays, we can input an optional second parameter that tells NumPy which axis to sum along. 

In [None]:
A = np.array([[0.1,0.2],[0.3,0.4]])
print(np.sum(A,0)) # axis 0: add up the entries in each column

[0.4 0.6]


In [None]:
print(np.sum(A,1)) # axis 1: add up the entries in each row

[0.3 0.7]
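The sum of the 1D array above printed as 0.6000000000000001, not 0.6, because of floating point error. When comparing such results, a tolerance-based check is safer than `==`. This sketch uses `math.isclose` and `np.isclose`, real functions that this section has not introduced, with their default tolerances.

```python
import math
import numpy as np

a = np.array([0.1, 0.2, 0.3])
total = np.sum(a)

print(total == 0.6)               # exact comparison is unreliable for floats
print(math.isclose(total, 0.6))   # comparison within a small tolerance
print(np.isclose(total, 0.6))     # NumPy's version, which also accepts arrays
```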


### diag

The function `np.diag` either creates a 2D diagonal matrix from a 1D array, or extracts and returns the diagonal entries of a 2D matrix. 

For example if the input is a 1D array [1, 2], the output would be

$$ \begin{bmatrix} 1 & 0\\ 0 & 2\end{bmatrix}.$$

If the input is the 2D array

$$\begin{bmatrix} 3 & 4\\5 & 6\end{bmatrix}$$
the output would be the 1D array [3, 6].

In [None]:
a = np.arange(3)
print(np.diag(a))

[[0 0 0]
 [0 1 0]
 [0 0 2]]


In [None]:
A = np.arange(9).reshape((3,3))
print(A)

[[0 1 2]
 [3 4 5]
 [6 7 8]]


In [None]:
print(np.diag(A))

[0 4 8]
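`np.diag` also accepts an optional offset `k` that selects a diagonal above (positive `k`) or below (negative `k`) the main one. A short sketch, with illustrative variable names:

```python
import numpy as np

A = np.arange(9).reshape((3, 3))

above = np.diag(A, k=1)         # entries one step above the main diagonal
below = np.diag(A, k=-1)        # entries one step below it
shifted = np.diag([1, 2], k=1)  # builds a 3x3 array with 1 and 2 above the diagonal

print(above, below)
print(shifted)
```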


## Exercise

The trace of a matrix is defined as the sum of its main diagonal, i.e. the trace of a matrix

$$ A = \begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\end{bmatrix}$$

would be $a_{11} + a_{22} + a_{33}$. 

Using the diag and sum functions find the trace of the matrix represented by the following array.

In [None]:
A = np.linspace(1,100,25).reshape((5,5))
print(A)

[[  1.      5.125   9.25   13.375  17.5  ]
 [ 21.625  25.75   29.875  34.     38.125]
 [ 42.25   46.375  50.5    54.625  58.75 ]
 [ 62.875  67.     71.125  75.25   79.375]
 [ 83.5    87.625  91.75   95.875 100.   ]]


### Mathematical Operations

We saw earlier the math module. The math module provides functions like sine, cosine, arc-tangent and so on. These functions only work on numbers. If we pass in a NumPy array we get an error.

In [None]:
import math

a = np.array([0,1])
print(math.sin(a))

TypeError: ignored

NumPy provides its own implementations of many mathematical functions that take in arrays and perform operations on each element.

In [None]:
a = np.array([0,1])
print(np.sin(a))

[0.         0.84147098]


The fact that the NumPy and math modules provide functions with the same names demonstrates why it is a bad idea to import everything at once from a module.

## Exercise

Construct the following 2D array:

$$ A = \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9\end{bmatrix}$$

Once you have this matrix, replace all of the values that are less than the mean value of the matrix with the sum of the matrix. Then print out the mean of this modified matrix.

HINT: Constructing the initial matrix can be done with some combination of `np.arange`, `np.reshape`, and/or `np.stack`

In [None]:
a = np.arange(1,4)
b = np.arange(4,7)
c = np.arange(7,10)
A = np.stack((a,b,c),axis=0)
print(A)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


Now that we have the matrix, let's replace the values that are less than the mean with the sum.

In [None]:
# where A is below its mean use the sum of A, otherwise keep the value from A
Amod = np.where(A < np.mean(A), np.sum(A), A)
print(Amod)
print("\nMean of A-original = ", np.mean(A))
print("Mean of A-modified = ", np.mean(Amod))

[[45 45 45]
 [45  5  6]
 [ 7  8  9]]

Mean of A-original =  5.0
Mean of A-modified =  23.88888888888889


## Exercise

Evaluate the expression:
$ 3a + b,$
where $a = [1, 2, 3, 4, 5]$ and $b = [e^{-0.1}, e^{-0.2}, e^{-0.3}, e^{-0.4}, e^{-0.5}].$ Use linspace or arange to create $a$ and $b$.

### Arrays vs. Lists

Arrays and lists are similar in many ways. Both represent a collection of objects. In this camp (and beyond) arrays and lists will be the most common data structures you will use. When should you use one over the other? 

For starters, arrays are mutable; however, they do not support methods such as `pop` or `append`. Once initialized, the size of an array cannot be easily changed. If your application needs to change the size of a collection, lists are the preferred option. 

Another difference between arrays and lists is how operators are defined. We saw earlier how we can add two arrays together or multiply them by a number. This behaviour is different from how it is handled with lists.

In [None]:
a_array = np.array([1,2])
b_array = np.array([3,4])

a_list = [1,2]
b_list = [3,4]

print("a_array + b_array:", a_array + b_array)
print("a_list + b_list:  ", a_list + b_list)

print("2*a_array:", 2*a_array)
print("2*a_list: ", 2*a_list)

a_array + b_array: [4 6]
a_list + b_list:   [1, 2, 3, 4]
2*a_array: [2 4]
2*a_list:  [1, 2, 1, 2]
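The difference in resizing can be sketched as follows. `np.append` is a real NumPy function not covered above; note that it builds and returns a new array and does not grow the existing one, which is one reason lists are preferred when a collection changes size often.

```python
import numpy as np

a_list = [1, 2]
a_list.append(3)                  # the list grows in place

a_array = np.array([1, 2])
bigger = np.append(a_array, 3)    # returns a new array; a_array is unchanged

print(a_list)
print(a_array, bigger)
```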


It's possible to convert from a list to an array or vice versa. NumPy arrays provide the method `tolist()` which converts an array to a list.

In [None]:
c = a_array.tolist()
type(c)

list

NumPy also provides the function `asarray` that takes a list and returns an array.

In [None]:
d = np.asarray(c)
type(d)

numpy.ndarray
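One detail worth knowing is how `np.asarray` differs from `np.array` when the input is already an array. The sketch below, with illustrative variable names, relies on the documented behaviour that `np.asarray` returns an existing array unchanged when no conversion is needed, whereas `np.array` copies by default.

```python
import numpy as np

values = [1, 2, 3]
from_list = np.asarray(values)   # a list is converted to a new array

same = np.asarray(from_list)     # already an array: returned without copying
copy = np.array(from_list)       # np.array makes a copy by default

print(same is from_list)
print(copy is from_list)
```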