# Numerical Python Refresher

You should already have used numerical python (numpy) at least a little.

In this worksheet you will refresh that knowledge and explore some things that numpy can do.

You can skip it if you're already comfortable with numpy, or perhaps skip to the last couple of exercises.

---


## Using this notebook

- If you haven't used a notebook like this before, run through the [introduction to Jupyter notebook](0_Notebooks.ipynb).


- Underneath the "Exercise" descriptions there is a place for you to write the code. 


--- 

# Introduction


Python has several built-in data types, including:

- integers (whole numbers)
- floating-point numbers (real/decimal)
- lists (ordered sequences of any other variable)
- dictionaries (look-up tables from one variable to another)

Numpy adds a new data type, the *array*, which is a very efficient way of storing large collections, often of numbers but also other objects.  In this notebook we will learn about creating, using, exploring, and loading/saving arrays.


In [None]:
# This loads numpy, the standard numerical library for python, and gives
# it the short name "np" so we can type it more quickly later.
import numpy as np

Numpy arrays are different to standard python lists in a few ways:

- Lists can mix different types of value, like `[1.4,  72, "cat"]`, but every element of an array has the same type
- You can do mathematics with arrays (we will discuss this below)
- Lists get very slow when they are large; arrays can be very fast even when they have millions of entries
    
If you are doing any kind of complex data analysis then you usually want to use arrays.  Use lists when you have a few small items to pass around.
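
As a small sketch of the first difference above, this shows what happens if you convert the mixed list into an array (the variable names here are just for illustration). Numpy has to pick one type for everything, and in this case it falls back to text:

```python
import numpy as np

# A list keeps each value's own type.
mixed_list = [1.4, 72, "cat"]
print([type(v).__name__ for v in mixed_list])

# Converting the same list to an array forces one common type;
# here numpy falls back to text, so the numbers become strings.
mixed_array = np.array(mixed_list)
print(mixed_array, mixed_array.dtype)
```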

---

# 1. Making Arrays

There are lots of ways to make arrays:
- `np.zeros(n)` makes an array of `n` zeros
- `np.ones(n)` makes an array of `n` ones
- `np.array(x)` converts some `x` (e.g. a list) into an array
- `np.arange(a, b)` makes an array of [$a$, $a+1$, $a+2$, ..., $b-1$]
- `np.linspace(x, y, n)` makes an array of `n` values evenly spaced between `x` and `y` (including both end points)
- `np.repeat(x, n)` makes an array of `n` copies of `x`

For example:

In [None]:
a = np.zeros(5)
print("Array of zeros:", a)

a = np.arange(5, 10)
print("Array from 5 to 9:", a)

a = np.repeat(3, 7)
print("Array of seven threes:", a)


You may notice that some of the arrays we made above have decimal points after the numbers. These arrays contain decimal (or *floating point*) numbers; the ones without the points contain whole numbers (*integers*). This is the data type of the array, and you can set it explicitly by adding the argument `dtype=int` or `dtype=float` to most of the functions above. Otherwise, numpy will try to guess what you want.
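
Here is a short sketch of setting the data type explicitly, using made-up variable names. It also shows `np.linspace`, which was not used in the example cell above:

```python
import numpy as np

# np.zeros makes floating-point numbers unless told otherwise.
default_zeros = np.zeros(3)
int_zeros = np.zeros(3, dtype=int)
print(default_zeros, default_zeros.dtype)
print(int_zeros, int_zeros.dtype)

# np.linspace includes both end points by default.
steps = np.linspace(0, 1, 5)
print(steps)
```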

---

## <font color='blue'>Exercise 1</font>  

Make a list of a few numbers (you can choose which ones and how many) called `b` and convert it into an array called `c`.

Hint: Look at the list of functions above for one which does this.

In [None]:
# Space for your workings for this exercise


---

Arrays can also have more than one *dimension*.  They can be 2D, representing a matrix or a map, or 3D, or higher.  For example, you might use a 3D array to represent an image with red, blue, and green colours.

You make higher dimensional arrays by specifying a shape for the array instead of just a size, as a tuple (round brackets), for example:

In [None]:
d = np.zeros((10, 10))
print(d)
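
The same idea extends to three dimensions. As a rough sketch of the image example mentioned above, assuming a made-up picture that is 4 pixels high and 6 pixels wide with 3 colour channels:

```python
import numpy as np

# Hypothetical tiny image: 4 pixels high, 6 pixels wide, 3 colour channels.
image = np.zeros((4, 6, 3))
print(image.shape)
```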

---

## <font color='blue'>Exercise 2</font>  

Make and print an array called `e` which is a 5 x 6 matrix full of ones.

Hint: See the list of functions above for one that creates an array of ones.

In [None]:
# Space for your working


--- 

# 2. Properties

Arrays have various properties that you access with the dot (`.`) syntax, like other python objects.  These include the size, shape, number of dimensions, and data type of the object.  For example:

In [None]:
f = np.zeros((4, 8))
print(f.shape)
print(f.size)

--- 
You can explore the properties of an array (or any other object) by typing the variable name in a cell, followed by a dot (no space between them), then pressing TAB to show a drop-down list of available properties.

## <font color='blue'>Exercise 3</font>

Display the number of dimensions and the data type of the array `g`.

Hint: Most properties have straightforward names; the datatype is not just called `data` though.

In [None]:
g = np.zeros((4, 8))

# Space for your working


---

While it's possible to change some of these properties manually, it's usually a bad idea.  If you want to change the shape or type of an array (for example to convert text to numbers, or integers to reals), there are methods to do that: `z.reshape(new_shape)` and `z.astype(new_type)`.  Both return a new array and leave `z` itself unchanged.
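
As a minimal sketch, here is each of the two methods used on its own, with illustrative arrays that are not part of the exercises:

```python
import numpy as np

# reshape gives the same six values arranged as 2 rows by 3 columns.
flat = np.arange(6)
grid = flat.reshape((2, 3))
print(grid)

# astype gives a converted copy, here from text to floating-point numbers.
text = np.array(["1.5", "2", "3.25"])
numbers = text.astype(float)
print(numbers)
```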

## <font color='blue'>Exercise 4</font>  

Convert the array `h` to a 2 x 15 array of integers.

Hint: you will need to combine the two methods above.  You can do this without creating a new variable, by "chaining" methods.

In [None]:
h = np.ones((6, 5))

# Space for your working


---


# 3. Mathematics

The first main difference between lists and arrays is that we can do maths on arrays all at once.  

For example, adding two *lists* together merges the two lists into one longer list, but adding two *arrays* together adds them element-by-element:

In [None]:
k = [1, 2, 3]
m = np.array([1, 2, 3])

print("Adding lists: ", k + k)
print("Adding arrays: ", m + m)
print("Squaring arrays: ", m**2)

# This does not work - try uncommenting it and re-executing the cell to see the error
#print("Squaring lists: ", k**2)

---

You can add / multiply / etc. values to an array all at once, either single (scalar) values or other arrays:

In [None]:
n = np.array([1., 3., 4.])

# Let's make a new array with some weird function of this array
# Note that we can both add a single value to the array (the + 1),
# and also take an array's reciprocal, etc.
1 / (n + 1) * 100 + n**2

Arrays generally have to be the same size to do maths with them together:

In [None]:
# these two are size 3 arrays
o = np.array([4, 5, 6])
p = np.array([7, 8, 9])

# this is a size 2 array
q = np.array([10, 11])

# works, same size:
o + p

# This does not work - try uncommenting it to see
# p + q
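
If you would rather see the failure without breaking the cell, one option is to catch the error. This is only a sketch; the flag name `mismatch_raised` is made up for illustration:

```python
import numpy as np

p = np.array([7, 8, 9])
q = np.array([10, 11])

try:
    p + q
    mismatch_raised = False
except ValueError as err:
    # numpy reports that the two shapes cannot be combined
    mismatch_raised = True
    print("Could not add:", err)
```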

---

Numpy has a large number of mathematical functions that can operate on arrays automatically, including trig functions, roots, and linear algebra.
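
For instance, a brief sketch with two of these functions (the arrays are arbitrary examples):

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0])
roots = np.sqrt(values)          # element-by-element square root
print(roots)

angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)           # element-by-element sine
print(sines)
```

The sine of pi may print as a tiny number rather than exactly zero, because floating-point numbers are stored with limited precision.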

## <font color='blue'>Exercise 5</font>  

Make an array of 100 values evenly spaced between 1 and 10, and then show the logarithm to base 10 of them.

(Log values are often useful for visualizing information that extends over a very wide range; we will see that later).

Hint: Look at the list of functions above to make the initial array.  You can explore the list of numpy functions by typing `np.` and pressing TAB.

In [None]:
# Space for your working


---

You can also perform mathematical comparisons (less than, greater than, equal to, etc.) on entire arrays; the result will be an array of True/False type (called `bool`s in python, after [George Boole](https://en.wikipedia.org/wiki/George_Boole)):

In [None]:
r = np.arange(-10, 10)
print(r)
print(r > 0)

You can't use the regular python `and` and `or` keywords to combine multiple conditions on arrays.  Instead you have to use the symbols:
- `&` ("and")
- `|` ("or")
- `~` ("not")
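
A small sketch of "or" and "not", using made-up temperature values. Note the brackets around each comparison; without them python would apply `|` before the comparisons:

```python
import numpy as np

temps = np.array([-4, 3, 12, 25, 31])   # made-up values

# True where the value is below 0 OR above 30
extreme = (temps < 0) | (temps > 30)

# True wherever "extreme" is False
mild = ~extreme

print(extreme)
print(mild)
```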

## <font color='blue'>Exercise 6</font>  

Make a True/False array matching the variable `r` which is True wherever the value in `r` is positive and less than 5.

Hint: Bracket the two parts of your condition carefully.

In [None]:
r = np.arange(-10, 10)

# Space for your working:


---

# 4. Indexing

## 4.1 Single Indices

You can extract values from an array using square brackets (just like you can with a list). Remember that indices in python start at zero.

Handily, you can also use negative indices to start from the end of the array (-1 is the last element, -2 the second-to-last, etc.):

In [None]:
s = np.arange(-5, 5)
print("s =", s)
print("fourth element =", s[3])
print("last element =", s[-1])
print("second-to-last element =", s[-2])

You can modify values in an array with square brackets too:

In [None]:
t = np.arange(-5, 5)
print(t)
t[0] = -2         # setting a single value
t[1] = t[1] + 10  # adding to a single value
t[2] -= 5         # this is shorthand for t[2] = t[2] - 5
t[-1] *= 10       # shorthand; multiplying a single value
print(t)


---

## 4.2 Ranges

You can also reference a range of array elements using the `:` syntax.  The index `a:b` means "the range from `a` to `b`" (not including the end point):

In [None]:
u = np.arange(-5, 5)

print(u)
print(u[1:4])

If you leave out the start or end (`a` or `b`) then the range will go to the start or end of the whole array:

In [None]:
v = np.arange(-5, 5)

print("First three elements = ", v[:3])

## <font color='blue'>Exercise 7</font>  

Print the last three elements of `w`.

In [None]:
w = np.arange(-5, 5)

# Space for your working:


---

As well as using single numbers and ranges, there are two other ways to index numpy arrays:
- with a list or array of integer indices
- with a boolean True/False list or array, the same size as the indexed array


For example:

In [None]:
x = np.array([4, 8, 15, 16, 23, 42])

# A list of integer indices - will print the
# first, fourth, and sixth elements of x
y = [0, 3, 5]
print(x[y])

# A list of bools - will print the values of x
# where y is True, and not where it is False
y = [True, False, True, False, True, False]
print(x[y])

---

## <font color='blue'>Exercise 8</font>  

Use what you learned above about mathematical comparisons to make a True/False index that indicates where the values of the `z` array below are positive, and then use that index to show a version of `z` with negative values removed.

In [None]:
z = np.array([-7, 14, 3, -100, -3, 0, 9, 17])

# Space for your working


This is a common way to filter arrays.

---

You can also modify entire ranges or selections at once.  For example, this will change all -100 values in the data to zero:

In [None]:
A = np.array([-7, 14, 3, -100, -3, 0, 9, -100, 0, -100])
A[A == -100] = 0
print(A)
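
The same works for ranges. A minimal sketch with an illustrative array:

```python
import numpy as np

readings = np.array([5, 8, 2, 9, 4, 7])
readings[:3] = 0      # set the first three elements to zero
readings[-2:] += 10   # add ten to the last two elements
print(readings)
```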

---

## 4.3 Higher dimension indices

You can also look up values in higher dimensional arrays.  To get one element you provide as many indices as there are dimensions, separated by commas.  For example:

In [None]:
# To create a new 2D array you can use nested square brackets (i.e. a list of lists):
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print("Full matrix:\n", B)
print("")
print("The (1, 2) element is: ", B[1, 2])


## <font color='blue'>Exercise 9</font>  

Print out the element in the last row and last column of `C`.


In [None]:
C = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Space for your working


---

If you just use a single index for a 2D array, then this extracts a single row, as a 1D array:

In [None]:
D = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

D[0]

If you want a single column, you can use a bare `:` (a range with no start or end, meaning "every row") as the first index:

In [None]:
E = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(E)
print("")
print(E[:, 1])

---


To extract multiple rows or columns, or a rectangular subset of an array, you can use the same range syntax as above, again separated by commas.
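
As a sketch, using the same matrix as the cells above under the illustrative name `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

top_rows = grid[0:2]        # first two rows, every column
block = grid[0:2, 1:3]      # first two rows, middle two columns

print(top_rows)
print("")
print(block)
```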


## <font color='blue'>Exercise 10</font>  

Show the final two columns of the array `F`

In [None]:
F = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Space for your working


---

# 5. Array methods

Numpy arrays have a large number of helpful methods.  For example, the min and max methods return the smallest and largest values in an array:

In [None]:
G = np.array([-3.1, 5.4, 7.3, 222, 4.444, 3, 0])
print(G.min(), G.max())

## <font color='blue'>Exercise 11</font>  

Find out what the methods `H.argmin` and `H.argmax` do, and demonstrate them.  

Hint: You can look up their documentation either on-line, or here in this notebook by executing `H.argmin?`, or you can just experiment with them.

In [None]:
H = np.array([-3.1, 5.4, 7.3, 222, 4.444, 3, 0])

# Space for your working

# argmin and argmax find the index of the smallest and largest value, respectively


---

You can also sort an array using `x.sort()`.  

This sorts *in-place*, meaning that it modifies the array directly instead of returning a new copy.  You can use `np.sort` instead if you want a copy.

In [None]:
K = np.array([-3.1, 5.4, 7.3, 222, 4.444, 3, 0])

print("Before sorting:", K)
K.sort()
print("After sorting:",K)
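
For comparison, a short sketch of `np.sort`, which leaves the original array alone (variable names are illustrative):

```python
import numpy as np

original = np.array([3, 1, 2])
sorted_copy = np.sort(original)   # returns a new, sorted array

print("Original is unchanged:", original)
print("Sorted copy:", sorted_copy)
```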


If instead you want to know the indices that *would* sort an array, you can use `K.argsort()`.
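
A minimal sketch of what those indices look like, with a made-up array of three numbers:

```python
import numpy as np

heights = np.array([30, 10, 20])
order = heights.argsort()

print(order)            # positions of the smallest, middle, and largest value
print(heights[order])   # indexing with them gives the values in sorted order
```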


## <font color='blue'>Exercise 12</font>  

Below, the two arrays `animals` and `legs` represent the names and number of legs of various species.  Sort both arrays from fewest to most legs.

Hint: make one index with `argsort` and use it with both arrays.

In [None]:
animals = np.array(["Human", "Centipede", "Snail", "Ant", "Cat", "Octopus", "Dog"])
legs = np.array([2, 100, 0, 6, 4, 8, 4])

# Space for your working:
