# Lecture 4.1 - Using Modules and Introduction to NumPy

# Summary

### Programming 

- Modules and namespaces
- Using scientific libraries
- The basic properties of NumPy
- NumPy arrays vs lists
- Creating NumPy arrays
- The dimensions and components of a NumPy array
- Passing NumPy arrays to functions 
- Componentwise arithmetic of NumPy arrays




## Modules and Namespaces

In Week 3 we designed some useful functions, namely `factorial` to compute the factorial of a non-negative integer $n$, and `fib_iter` and `fib_rec` to compute the $n$th Fibonacci number, iteratively and recursively respectively.
Let's say we want to use these functions in other code, for example in other functions or in this Notebook. In that case we save them in a *simple text file* with extension `.py`, i.e. a python file and **not** a jupyter notebook. Roughly speaking, a collection of python functions contained in a single file is said to be a **module** (or **library**) of functions. (Modules can in fact contain other kinds of objects, such as classes and constants.) To use these functions we first need to import them; then we can simply call them in the code that we are writing. We thus have two main steps in this process. 

### Step 1: create, edit and save a python file (script) 

We have two main options for this. 
- Use a text editor such as *Notepad*, *Gedit*, *Atom*, *Sublime*, *Visual Studio Code (VS Code)*, *Emacs*, *Vim* (among others) to create, edit and save the file. 
- Use an Integrated Development Environment (*IDE*)  such as *Spyder* (supplied by Anaconda) to create, edit and run the code in your file and to save the file. Again there are a number of other *IDEs* such as *Idle*, *PyCharm* or *Visual Studio* (different to VS Code). You can also add functionality to text editors such as *Atom*, *Sublime*, *VS Code*, *Emacs* and *Vim* so as to use them as *IDE*s. 

Our advice is that you learn to use *Spyder* or *VS Code* to do this. Next week there will be a short video covering the use of *Spyder*. For this week we will take a shortcut: Jupyter itself lets us create, edit and save a python file/script. (Remember that this is just a text file.) In cases like the present $-$ when the editing basically consists of copying and pasting $-$ this is a perfectly good way of proceeding.

**Note.** Be careful to save your file in your current working folder - i.e. alongside this notebook (or else on the python path if you know what that is).  

**Our Example.** We now create, edit and save a file called `week3_functions.py` using the editor built into Jupyter. (This will take place during the video of the lecture.) We save it in our present working directory. 
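
For readers who do not have the Week 3 code to hand, here is a minimal sketch of what `week3_functions.py` might contain. The function names come from the text above, but the bodies are an assumption (the Week 3 versions may differ), as is the convention $F_0 = 0$, $F_1 = 1$, which is consistent with the outputs shown later in this lecture.

```python
# Hypothetical contents of week3_functions.py.
# A sketch only: the actual Week 3 implementations may differ.

def factorial(n):
    """Return n! for a non-negative integer n."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def fib_iter(n):
    """Return the n-th Fibonacci number iteratively (F(0) = 0, F(1) = 1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_rec(n):
    """Return the n-th Fibonacci number recursively (slow for large n)."""
    if n < 2:
        return n
    return fib_rec(n - 1) + fib_rec(n - 2)
```

Saving this text as `week3_functions.py` in the working folder is all that is needed for the imports in Step 2 to work.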

### Step 2: import and implement the functions 

We have created the file `week3_functions.py` which contains the functions `factorial`, `fib_iter` and `fib_rec`. From the Python interpreter's point of view the code inside this file (i.e. the code defining these functions) is a **module** called `week3_functions`. (Note that the `.py` extension is not included here.) We can now import this module. In fact we have several options, as follows. 

1. Import the module as is: 
```python 
import week3_functions
```
or using an alias 
```python 
import week3_functions as w3_funs
```
Now, suppose that we want to compute the factorial of $10$. Then in the first case we use the function call `week3_functions.factorial(10)` and in the second case we use `w3_funs.factorial(10)`. 

2. Import functions individually. For example
```python
from week3_functions import fib_iter, fib_rec 
```
Then, supposing that we want to compute Fibonacci number with index $15$ we simply call `fib_iter(15)` or 
`fib_rec(15)` (depending on whether we like the iterative or recursive function). Note however that `factorial` has **not** been imported. 
3. Import all the functions in the module using the (so called) *wildcard* `*`: 
```python
from week3_functions import *  
```
In this case we can call and use all of the above functions: each of the calls `factorial(10)`, `fib_iter(15)` and `fib_rec(15)` will work as expected. 

**Our Example continued 2.** Let's try the first option without using  an alias, and also try calling the functions. 

In [1]:
import week3_functions 

print(week3_functions.factorial(10))
print(week3_functions.fib_iter(15))
print(week3_functions.fib_rec(15))

3628800
610
610


Now let's try the second option, using an alias. 

In [2]:
import week3_functions as w3_funs

# For example we can now call the factorial function as follows... 
w3_funs.factorial(10)

3628800

OK. Good. Now let's try importing `fib_iter` and `fib_rec` individually (into the `main`/`global` namespace $-$ see below). 

In [3]:
from week3_functions import fib_iter, fib_rec

print(fib_iter(15))
print(fib_rec(15))

610
610


Good. But note that calling `factorial`  generates an error. 

In [4]:
factorial(20)

NameError: name 'factorial' is not defined

Or we can take the easy route and import all the functions in the module into our `main` namespace. 

In [None]:
from week3_functions import *

print(factorial(7))
print(fib_iter(25))
print(fib_rec(25))

5040
75025
75025


### Using Namespaces 

The module `week3_functions` defines the **namespace** of the same name so that, as we have seen, if we import the module  using 
```python 
import week3_functions 
``` 
we need to prefix calls to the functions inside this module with its name. This is why we used `week3_functions.factorial(10)` to call the factorial function. Note how we were also able to use the alias  `w3_funs` for the namespace and that, when we do this, we can call our function using `w3_funs.factorial(10)`. 

On the other hand, when we use
```python 
from week3_functions import fib_iter, fib_rec
```
we import these functions into our `main` namespace and so can call them directly, for example using the call `fib_iter(15)` or `fib_rec(25)` as above. Likewise when we use the wildcard `*` we import all of the functions in the module into our `main` namespace and so can call them directly using the names used in their definitions. 
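
The difference between the two styles of import can be sketched with the standard `math` module (used here only because it is always available). The example also shows one reason to be careful when importing names directly: an imported name silently replaces any variable of the same name that we have already defined. The variable `pi = 3` is purely illustrative.

```python
import math              # the names stay inside the namespace "math"
from math import sqrt    # the name "sqrt" is placed in our own namespace

print(math.sqrt(16.0))   # call through the module's namespace
print(sqrt(16.0))        # call directly

# A directly imported name overwrites an existing name without warning.
pi = 3                   # our own (crude) value
from math import pi      # "pi" now refers to math.pi instead
print(pi == math.pi)
```

The same overwriting can happen with the wildcard `*`, where it is harder to spot because the imported names are not listed explicitly.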

**Remark.** We have also already seen the **dot** used to call the methods belonging to a class. For example the `list` class contains the method `sort` which we can use to sort objects in the class `list`.  

In [5]:
# Example using the list method "sort"
some_numbers = [1,5,-12,25,7]
print(some_numbers)
some_numbers.sort()
print(some_numbers)

[1, 5, -12, 25, 7]
[-12, 1, 5, 7, 25]


Note that here we can think of an object of the `list` class as defining its own *namespace*. This makes sense since the method `sort` is an *attribute* of the list object, i.e. it effectively belongs to this object. 

## Using Scientific Libraries

Our above example of a module/library is of course quite limited. However it exemplifies a very important aspect of programming in python (and other programming languages): the use of pre-defined libraries. Of interest to us are the scientific libraries such as `math`, `random` or `numpy`. For example, suppose that we want to be able to use any functions or constants  from the `math` library. 

In [6]:
# We import the math library.
import math 

Now, for example,  we can compute $\cos(\pi/3)$. 

In [7]:
math.cos(math.pi/3)

0.5000000000000001
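
The result above is not exactly $0.5$ because floating-point arithmetic is only approximate. As a small sketch of how one might test such a value, the standard function `math.isclose` compares two numbers up to a small relative tolerance.

```python
import math

value = math.cos(math.pi / 3)

# An exact comparison can fail because of rounding in the last digit.
print(value == 0.5)

# Comparing up to a tolerance is the safer test for floats.
print(math.isclose(value, 0.5))
```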

Now let's proceed slightly differently with the `random` module. Let's import three functions from this module into our `main` namespace and call them.

In [8]:
# We import three  functions in the random library into our main namespace
# Run this cell multiple times to test the functions. 
from random import random, randint, randrange

# A random real in [0,1) 
numberA = random()
# A random integer in  [3,20]
numberB = randint(3,20)
# A random integer in [-7,3) (i.e. not including 3)
numberC = randrange(-7,3)

print("The three numbers are: " + str(round(numberA, 5)) + ", " + str(numberB)  + ", " + str(numberC))
print(f"The three numbers are: {numberA:.5f}, {numberB}, {numberC}") 

The three numbers are: 0.71144, 20, -4
The three numbers are: 0.71144, 20, -4
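
Rather than rerunning the cell by hand, one way to check the ranges stated in the comments above is to draw many samples and test them all at once. This is only a sketch; the sample size of 1000 is an arbitrary choice.

```python
from random import random, randint, randrange

samples = 1000
reals = [random() for _ in range(samples)]
ints_closed = [randint(3, 20) for _ in range(samples)]
ints_half_open = [randrange(-7, 3) for _ in range(samples)]

# random() lies in [0, 1)
print(all(0 <= r < 1 for r in reals))
# randint(3, 20) includes both endpoints
print(all(3 <= k <= 20 for k in ints_closed))
# randrange(-7, 3) excludes the upper endpoint 3
print(all(-7 <= k < 3 for k in ints_half_open))
```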


## NumPy

NumPy is the fundamental package  for scientific computing in python. Among other things it provides us with the following.
- A multidimensional array object allowing us to define arrays of any (fixed) dimensions. 
- Many scientific (such as trigonometric) functions that can be applied directly $-$ i.e. componentwise $-$ to NumPy arrays. 
- Specialised tools for linear algebra, transforms such as the Fourier transform, and random number capabilities. 
- Computational efficiency: Numpy is highly optimised in terms of time and space. 

There are many other aspects of NumPy, such as tools for integration with C/C++ and Fortran, and properties that allow it to seamlessly and speedily integrate with a wide variety of databases, that make it of wide interest to the scientific community. 

**To find out more.** Visit the Numpy webpage at:   <a href="https://numpy.org">https://numpy.org</a> 

**Remark.** If you have used *Matlab* you will notice that the functions and objects defined by NumPy closely mirror those of Matlab. 

**The `numpy` library.** The functions, constants and classes of Numpy are defined within the `numpy` library. It is standard practice to import this library under the alias `np` as in the following cell. 

In [9]:
import numpy as np

Having run this cell, all the functions and constants of `numpy` are now available in the namespace `np`. As a basic example run the following cells to compute $\sin(\pi/6)$ and $\sqrt{2}$. 

In [10]:
np.sin(np.pi/6)

0.49999999999999994

In [11]:
np.sqrt(2)

1.4142135623730951

### NumPy arrays 

NumPy arrays (referred to as *numpy arrays* from now on) are one-dimensional or multi-dimensional collections of integers, floats or booleans (and other numeric variable types). At first glance, they resemble lists. However $\dots$
- Lists can contain truly anything, and collections of different types. Numpy arrays only contain one type at a time.
- Numpy arrays can **do mathematics**.  Lists **cannot**.
- For lists, one can arbitrarily append elements, and remove elements from the middle. Numpy arrays are of fixed length (usually).
- Numpy arrays store their values directly in a single block of contiguous memory. A list only stores references to its elements, and the elements themselves may be scattered throughout memory.

In [12]:
# Example of  a list 
a_list = [0,1,2,3,4,5]
print(a_list)
# The same in the form of a NumPy array
an_array = np.array([0,1,2,3,4,5])   
print(an_array)
# The cell returns the value of the last line... 
an_array

[0, 1, 2, 3, 4, 5]
[0 1 2 3 4 5]


array([0, 1, 2, 3, 4, 5])

In [13]:
# We can perform mathematical operations on the NumPy array
# We certainly cannot do this with the list
array2 = an_array * an_array
array3 = an_array**0.5 
print(array2)
print(array3)

[ 0  1  4  9 16 25]
[0.         1.         1.41421356 1.73205081 2.         2.23606798]


In [14]:
# We can freely add and remove elements to the list
# We certainly cannot do this with the NumPy array
a_list.remove(0)
a_list.append(6)
print(a_list)

[1, 2, 3, 4, 5, 6]
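
Two of the differences listed above are easy to miss, so here is a short sketch of them. The variable names are illustrative only. Note that `*` does work on a list, but it means repetition rather than multiplication, and that `np.append` builds a new array rather than extending the existing one.

```python
import numpy as np

a_list = [1, 2, 3]
an_array = np.array([1, 2, 3])

print(a_list * 2)      # repetition: [1, 2, 3, 1, 2, 3]
print(an_array * 2)    # componentwise multiplication: [2 4 6]

# An array holds one type only: mixing ints and floats gives floats.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)     # float64

# np.append returns a new, longer array; the original is unchanged.
longer = np.append(an_array, 4)
print(an_array.size, longer.size)
```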


### Creating numpy arrays

Unlike with a list (object), when you define a numpy array (object) you define it to be of fixed dimensions. On the other hand a very useful aspect of numpy arrays is that there are a number of different ways of creating/initialising them. Here are several of these ways. (You will see more in Lecture 6.1.)

 $\blacksquare$ Use a list to initialise an array.

In [15]:
array1 = np.array([1,2,3,4,5])
print("This array belongs to:", type(array1))
print(array1)

This array belongs to: <class 'numpy.ndarray'>
[1 2 3 4 5]


$\blacksquare$ Yes. This means that you can use a list comprehension when you initialise your array. 

In [16]:
array2 = np.array([n**3 for n in range(1,6)])
print("The array printed:", array2)   # Printing: it looks like a list
array2                                # Executing the code directly as output of this cell

The array printed: [  1   8  27  64 125]


array([  1,   8,  27,  64, 125])

$\blacksquare$ An array that contains 17 numbers evenly spaced in the closed interval [-2,2]. Useful for plotting.

In [17]:
x_values1 = np.linspace(-2,2,17) 
print(x_values1)

[-2.   -1.75 -1.5  -1.25 -1.   -0.75 -0.5  -0.25  0.    0.25  0.5   0.75
  1.    1.25  1.5   1.75  2.  ]


$\blacksquare$ An array with elements of distance 0.25 apart in the closed
interval [-2,2]. The same array as above, via a different definition!  

In [18]:
np.arange(-2,2.25,0.25)  # Or assign to a variable: e.g. x_values2 = np.arange(-2,2.25,0.25)

array([-2.  , -1.75, -1.5 , -1.25, -1.  , -0.75, -0.5 , -0.25,  0.  ,
        0.25,  0.5 ,  0.75,  1.  ,  1.25,  1.5 ,  1.75,  2.  ])
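
As a quick check that the two definitions agree, the sketch below builds both arrays and compares them. The names `by_count` and `by_step` are illustrative. Note that `np.arange` excludes its stop value, which is why the stop is `2.25` rather than `2`.

```python
import numpy as np

by_count = np.linspace(-2, 2, 17)      # 17 points, both endpoints included
by_step = np.arange(-2, 2.25, 0.25)    # step 0.25, stop value 2.25 excluded

print(by_count.size, by_step.size)
print(np.allclose(by_count, by_step))
```

With a step that is not exactly representable in binary (such as `0.1`), the length returned by `np.arange` can be harder to predict, so `np.linspace` is often the safer choice when the number of points matters.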

$\blacksquare$ An array representing a $4 \times 3$ matrix of zeros

In [19]:
matrix1 = np.zeros((4,3))
print(matrix1)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


$\blacksquare$ A $10 \times 5$ array/matrix containing sixes. 

In [20]:
array3 = np.full((10,5), 6)
print(array3)

[[6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]
 [6 6 6 6 6]]


**Note.** The type of the components is fixed at the outset. 

In [21]:
print("The components of array2 belong to:", type(array2[0]))
print("The components of matrix1 belong to:", type(matrix1[0,0]))

The components of array2 belong to: <class 'numpy.int64'>
The components of matrix1 belong to: <class 'numpy.float64'>
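
One consequence of the fixed component type is sketched below: a float assigned into an integer array is converted to an integer, so its fractional part is lost. If floats are needed later, one option is to request them at creation using the `dtype` argument of `np.array`. The array names are illustrative.

```python
import numpy as np

int_array = np.array([1, 2, 3])
int_array[0] = 2.7           # stored as an integer: the .7 is dropped
print(int_array)

float_array = np.array([1, 2, 3], dtype=float)
float_array[0] = 2.7         # stored as a float
print(float_array)
```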


**Remark.** For further tools for initialising numpy arrays see Lecture 6.1 or <a href="https://numpy.org/doc/stable/user/basics.creation.html">https://numpy.org/doc/stable/user/basics.creation.html</a>, or do your own quick web search. 

### The dimensions and components of a numpy array

The numpy function `shape` returns the dimensions of an array as a tuple.

In [None]:
print("The dimension of array1:", np.shape(array1))
print("The dimension of matrix1:", np.shape(matrix1))

The dimension of array1: (5,)
The dimension of matrix1: (4, 3)


Here's a nice way of assigning the dimensions of a matrix. 

In [None]:
rows, cols = np.shape(matrix1)  # Or with brackets: (rows, cols) = np.shape(matrix1)
print(rows)
print(cols)

4
3
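
The same information is also available as attributes of the array object itself. As a brief sketch (with illustrative array names), `shape`, `ndim` and `size` give the dimensions, the number of axes and the total number of components respectively.

```python
import numpy as np

vector = np.array([1, 2, 3, 4, 5])
matrix = np.zeros((4, 3))

print(vector.shape, vector.ndim, vector.size)   # (5,) 1 5
print(matrix.shape, matrix.ndim, matrix.size)   # (4, 3) 2 12
```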


The components of a numpy array can be accessed in a similar way to those of a list. (Numpy arrays, like lists, are indexed from 0.) 

In [22]:
# The component with index 3 of array1
print(array1[3])

4


In [23]:
# The component with index (0,2) of matrix 1
matrix1[0,2]

0.0

And we can modify the components. 

In [24]:
print("Array1 before modification:", array1)
array1[3] = 9
print("Array1 after modification: ", array1)

Array1 before modification: [1 2 3 4 5]
Array1 after modification:  [1 2 3 9 5]


In [25]:
print("Matrix1 before modification:\n",matrix1)
matrix1[2,2] = 3.6
matrix1[3,1] = 1.4
print("\nMatrix1 after modification:\n",matrix1)

Matrix1 before modification:
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Matrix1 after modification:
 [[0.  0.  0. ]
 [0.  0.  0. ]
 [0.  0.  3.6]
 [0.  1.4 0. ]]


###  Slicing numpy arrays

We can slice numpy arrays in a similar way to slicing lists. Below are some examples. 

$\blacksquare$ An array formed by slicing  out every third element from the second to the 14th element of `x_values1`. 

In [26]:
x_slice1 = x_values1[1:14:3]
print("The original array:\n", x_values1)
print("\nThe slice:", x_slice1)

The original array:
 [-2.   -1.75 -1.5  -1.25 -1.   -0.75 -0.5  -0.25  0.    0.25  0.5   0.75
  1.    1.25  1.5   1.75  2.  ]

The slice: [-1.75 -1.   -0.25  0.5   1.25]


$\blacksquare$ The $2 \times 2$ bottom right part of `matrix1`. 

In [28]:
matrix_slice = matrix1[2:,1:]
print("The original matrix:\n", matrix1)
print("\nThe sliced out matrix:\n", matrix_slice)

The original matrix:
 [[0.  0.  0. ]
 [0.  0.  0. ]
 [0.  0.  3.6]
 [0.  1.4 0. ]]

The sliced out matrix:
 [[0.  3.6]
 [1.4 0. ]]


$\blacksquare$ The third row of `matrix1` as a simple one dimensional array. 

In [None]:
matrix1[2,:]

$\blacksquare$ The second column of `matrix1` as a simple one dimensional array.

In [29]:
col2 = matrix1[:,1]
print(col2)

[0.  0.  0.  1.4]


$\blacksquare$ The second column of `matrix1` in matrix form: i.e. as a column vector.  

In [30]:
column2 = matrix1[:,1:2]
print(column2)

[[0. ]
 [0. ]
 [0. ]
 [1.4]]


$\blacksquare$ The matrix consisting of the second and third columns of  `matrix1`.

In [None]:
matrix1[:,1:3]
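
Two points about the slices above are worth a short sketch, using a fresh array called `matrix` (an illustrative name) so as not to disturb `matrix1`. Firstly, indexing with a single number drops a dimension whereas slicing keeps it. Secondly, and unlike with lists, a slice of a numpy array is a *view* of the original data: modifying the slice modifies the original, unless we take a copy.

```python
import numpy as np

matrix = np.zeros((4, 3))

print(matrix[:, 1].shape)      # (4,)   one dimensional array
print(matrix[:, 1:2].shape)    # (4, 1) column vector

corner = matrix[2:, 1:]        # a view of the bottom right 2 x 2 part
corner[0, 0] = 7.0             # this also changes matrix[2, 1]
print(matrix[2, 1])            # 7.0

safe = matrix[2:, 1:].copy()   # an independent copy
safe[1, 1] = -1.0              # matrix[3, 2] is left untouched
print(matrix[3, 2])            # 0.0
```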

### Passing numpy arrays to functions 

When we pass a numpy array with components of a given type (for example `float`) to a function $f$ which takes inputs of that type, the result is an array of the same dimensions containing the result of applying $f$ to each of the components of the array. Let's see some examples using the numpy array `x_values1`.

In [None]:
print(x_values1)

$\blacksquare$ Subtract 3 from every element. (Implicitly we are passing the array to the function $f(x) = x - 3$.)

In [None]:
x_values1 - 3

$\blacksquare$ Square every element. (Here $f(x) = x^2$.)

In [None]:
x_values1**2

$\blacksquare$ Pass the array to the function $f(x) = \sin(x)$.

In [None]:
np.sin(x_values1)

**Warning.** This only works (out of the box) for built-in arithmetic operators such as `+`, `*`, `**` etc. and for `numpy` library functions. For example, try running the following cell. (You will get an unpleasant surprise.) 


In [33]:
math.sin(x_values1)

TypeError: only length-1 arrays can be converted to Python scalars

However we can make functions `numpy` compatible by 
using the `numpy` function `vectorize`. 

In [None]:
# We can do this directly. 
np.vectorize(math.sin)(x_values1)

array([-0.90929743, -0.98398595, -0.99749499, -0.94898462, -0.84147098,
       -0.68163876, -0.47942554, -0.24740396,  0.        ,  0.24740396,
        0.47942554,  0.68163876,  0.84147098,  0.94898462,  0.99749499,
        0.98398595,  0.90929743])

In [None]:
# Or (more transparently) rename the function and then use it.
my_sine = np.vectorize(math.sin)
my_sine(x_values1)

array([-0.90929743, -0.98398595, -0.99749499, -0.94898462, -0.84147098,
       -0.68163876, -0.47942554, -0.24740396,  0.        ,  0.24740396,
        0.47942554,  0.68163876,  0.84147098,  0.94898462,  0.99749499,
        0.98398595,  0.90929743])
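
The same approach can be sketched for a function of our own. The function `step` below is a hypothetical example: because of its `if` statement it only accepts single numbers, and calling it directly on an array raises an error. Bear in mind that `np.vectorize` is essentially a convenient loop, so where a native `numpy` alternative exists (here `np.where`) it is usually the faster option.

```python
import numpy as np

def step(x):
    """Return 1.0 for positive x and 0.0 otherwise (single numbers only)."""
    if x > 0:
        return 1.0
    return 0.0

values = np.linspace(-1, 1, 5)          # [-1.  -0.5  0.   0.5  1. ]

step_on_arrays = np.vectorize(step)     # a version that accepts arrays
print(step_on_arrays(values))           # [0. 0. 0. 1. 1.]

# The same result using a native numpy function.
print(np.where(values > 0, 1.0, 0.0))   # [0. 0. 0. 1. 1.]
```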

### Componentwise arithmetic of numpy arrays

We can also perform arithmetical operations componentwise on arrays that are of the same dimensions. For example let's define a new array `x_values3` of the same dimension as `x_values1`. (The same methods work for any dimensions. Not just one dimensional arrays as here.) 

In [36]:
x_values3 = np.linspace(3,4,17)
print("x_values1 =\n", x_values1)
print("\nx_values3 =\n", x_values3)

x_values1 =
 [-2.   -1.75 -1.5  -1.25 -1.   -0.75 -0.5  -0.25  0.    0.25  0.5   0.75
  1.    1.25  1.5   1.75  2.  ]

x_values3 =
 [3.     3.0625 3.125  3.1875 3.25   3.3125 3.375  3.4375 3.5    3.5625
 3.625  3.6875 3.75   3.8125 3.875  3.9375 4.    ]


$\blacksquare$ Adding the two arrays componentwise. 

In [None]:
x_values1 + x_values3

$\blacksquare$ Multiplying the two arrays componentwise. 

In [None]:
x_values1 * x_values3
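
To illustrate the remark that the same methods work in any dimension, here is a sketch with two small $2 \times 2$ arrays (`A` and `B` are illustrative names). Note that `*` multiplies component by component and is **not** the matrix product. The last lines show what happens when two arrays have incompatible dimensions.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

print(A + B)    # componentwise sum
print(A * B)    # componentwise product, not the matrix product

# Arrays with incompatible dimensions cannot be combined in this way.
try:
    np.ones(3) + np.ones(4)
except ValueError as error:
    print("Dimensions do not match:", error)
```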

**Note.** In Week 6 we will look at matrix and linear algebra operations implemented using `numpy` arrays. 

# Appendix - Boolean masking in numpy

Let's define a boolean array telling us which components of `x_values1` are greater than zero. 


In [37]:
# Firstly a reminder of x_values1
x_values1

array([-2.  , -1.75, -1.5 , -1.25, -1.  , -0.75, -0.5 , -0.25,  0.  ,
        0.25,  0.5 ,  0.75,  1.  ,  1.25,  1.5 ,  1.75,  2.  ])

We call this array `is_positive`. 

In [38]:
is_positive = x_values1 > 0
is_positive

array([False, False, False, False, False, False, False, False, False,
        True,  True,  True,  True,  True,  True,  True,  True])

And now let's extract the positive numbers from `x_values1`. 

In [39]:
x_values1[is_positive]

array([0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])

Let's now create another array of the same length stretched over the interval $[-3, 2.5]$. 

In [40]:
x_values4 = np.linspace(-3, 2.5, 17)
x_values4

array([-3.     , -2.65625, -2.3125 , -1.96875, -1.625  , -1.28125,
       -0.9375 , -0.59375, -0.25   ,  0.09375,  0.4375 ,  0.78125,
        1.125  ,  1.46875,  1.8125 ,  2.15625,  2.5    ])

We compare this array with `x_values1` and create a boolean array telling us (componentwise) where the values of `x_values1` are greater than the values of `x_values4`. 

In [41]:
is_greater = x_values4 < x_values1
is_greater

array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True, False, False, False, False, False, False])

And we can now extract as a new array the part of `x_values1` that satisfies this inequality (i.e. those components of `x_values1` that are greater than the corresponding components of `x_values4`).

In [42]:
x_values1[is_greater]

array([-2.  , -1.75, -1.5 , -1.25, -1.  , -0.75, -0.5 , -0.25,  0.  ,
        0.25,  0.5 ])
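
As a further sketch, boolean arrays can be combined componentwise using `&` (and) and `|` (or), counted using `sum`, and used on the left-hand side of an assignment. The array `x` below has the same values as `x_values1`; the other names are illustrative. Note that the brackets around each comparison are required.

```python
import numpy as np

x = np.linspace(-2, 2, 17)

# Components strictly between -1 and 1.
inside = (x > -1) & (x < 1)
print(x[inside])
print(inside.sum())          # True counts as 1, so this counts the matches

# Replace all negative components by zero (in a copy, leaving x unchanged).
clipped = x.copy()
clipped[clipped < 0] = 0.0
print(clipped)
```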