# Scientific Computing in Python with Numpy 

<a href=https://numpy.org/>Numpy</a> (numerical Python) is a library designed for performing scientific computing in Python. 

In this notebook, we will introduce numpy arrays, a data structure introduced in numpy for working with vectors and matrices. We will explore how to create them, how to manipulate them, and how to use them for efficient numerical calculations using numpy functions. 

**Learning objectives for this notebook:**

After completing this notebook, you will be able to:
* create numpy arrays from a list of numbers
* use indexing and slicing with numpy arrays
* iterate over a numpy array 
* perform mathematical operations on numpy arrays
* use functions for creating arrays (e.g. `np.zeros()`, `np.linspace()`, `np.random.random()`)
* use numpy functions for vectorized calculations

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('xECXZ3tyONo')

## Numpy Arrays

You have encountered numpy arrays in previous notebooks. If you don't remember, open the first notebook and read the section where we introduce numpy arrays.

To use numpy arrays, we first need to import the numpy library, which we will do using the shortened name "np" (to save typing):

In [None]:
import numpy as np

Now that we have imported numpy, we can use functions in numpy to create a numpy array. A simple way to do this is to use the function `np.array()` to make a numpy array from a comma-separated list of numbers in square brackets:

In [None]:
a = np.array([1,2,3,4,5])
print(a)

Note that for one-dimensional arrays like this one, numpy does not make a distinction between row vectors and column vectors: there are just vectors. 
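
As a small illustration of this point, the sketch below inspects the `shape` and `ndim` attributes of the array. These attributes are not introduced elsewhere in this notebook; they are used here only to show that the array has a single axis.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# A 1-D array has a single axis: its shape is (5,), not (1, 5) or (5, 1)
print(a.shape)    # (5,)
print(a.ndim)     # 1

# Transposing a 1-D array changes nothing, since there is only one axis
print(a.T.shape)  # (5,)
```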

### Indexing arrays (and counting from zero)

One useful thing about arrays is that you can access the elements of the array using square brackets:

`a[n]` will give you the n-th element of the array `a`. 

This process of extracting a single element from the array is called **indexing**. 

Note that here we encounter for the first time what is known as the Python **counting from zero** convention. What is the counting from zero convention? In the example above, we created an array:

```
a = np.array([1,2,3,4,5])
```

The first element of `a` is `1`. You might think that if you want to access the first element of `a`, you would use the notation `a[1]`. Right?

Let's try it:

In [None]:
print(a[1])

**WRONG!** Why? Because the makers of Python decided to start counting from zero: the first element of an array `a` is actually `a[0]`. 

(This is a long-standing <a href=https://en.wikipedia.org/wiki/Zero-based_numbering>discussion among computer scientists</a>, and the convention is <a href=https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(array)#Array_dimensions>different</a> in many different languages. There are advantages and disadvantages of both, and even essays written about it...but in any case, Python chose to start arrays at zero.)

This also helps better understand the `range()` function: for example, to loop over all the elements in `a`, I can use this code:

In [None]:
for i in range(len(a)):
    n = a[i]
    print('a[%d] is %d' % (i,n))

Here the `len` function returns the length of the array `a`. As we saw before, Python has very smart `for` loops that can automatically iterate over many types of objects, which means we can also print out all the elements of our array like this:

In [None]:
for n in a:
    print(n)
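
If you want both the index and the value while looping, one possible approach (a sketch, not the only way) is Python's built-in `enumerate()`, which combines the two loops above:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# enumerate() yields (index, value) pairs, so range(len(a)) is not needed
for i, n in enumerate(a):
    print('a[%d] is %d' % (i, n))
```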

In Python, if you try to index beyond the end of the array, you will get an error: 

In [None]:
a[5]

(Remember: indexing starts at zero!)

Python also has a handy feature: negative indices count backwards from the end, and index `-1` corresponds to the last element in the array! 

In [None]:
a[-1]

In [None]:
a[-2]
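
To make the rule for negative indices concrete, here is a short sketch. It assumes the original five-element array and checks that `a[-k]` is the same element as `a[len(a) - k]`:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# a[-k] refers to the same element as a[len(a) - k]
for k in range(1, len(a) + 1):
    print(k, a[-k], a[len(a) - k])
```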

We can also use indexing to change the values of elements in our array:

In [None]:
print(a)
a[2] = -1
print(a)

**`Exercise 4.1`** \
Set the first three, and the last two, entries of the following array to zero:

In [None]:
a = np.array([1,2,3,4,5,6,7,8,9,10,11,32,55,78,22,99,55,33.2,55.77,99,101.3])

#some code to set the first three and last two entries to zero 

print(a)

### Slicing numpy arrays

Numpy arrays also support a special type of indexing called "slicing" that does not just return a single element of an array, but instead returns a whole part of the array. 

To do this, I put not just a single number inside my square brackets, but instead two numbers, separated by a colon `:`

`a[n:m]` will return an array that consists of all the elements in `a`, starting at element `n` and ending at element `m-1`. 

Let's look at a concrete example:

In [None]:
a = np.array(range(10))
print(a)
print(a[0:5])

The notation `a[0:5]` has "sliced" out the first five elements of the array.

With slicing, you can also leave off either `n` or `m` from the slice: if you leave off `n`, it will default to `n=0`, and if you leave off `m`, it will default to the end of the array (the same as `m=len(a)`; note that `m=-1` would instead stop just before the last element):

In [None]:
print(a[:5])

In [None]:
print(a[5:])
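
The following sketch compares three ways of writing the end of a slice, to show what leaving off `m` is and is not equivalent to:

```python
import numpy as np

a = np.array(range(10))

# Leaving off m runs to the end of the array, the same as m = len(a)
print(a[5:])         # [5 6 7 8 9]
print(a[5:len(a)])   # [5 6 7 8 9]

# m = -1 is different: the last element is excluded
print(a[5:-1])       # [5 6 7 8]
```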

Also handy: you can have Python slice an array with a "step" size that is more than one by adding another `:` and a number after that. Find out how it works using:

In [None]:
print(a[0:10:2])

Fun: you can also use negative steps:

In [None]:
print(a[-1:-11:-1])
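
A common use of negative steps is reversing an array. The sketch below shows the shorthand `a[::-1]`, plus one example with explicit start and stop values:

```python
import numpy as np

a = np.array(range(10))

# Leaving off both n and m with a step of -1 runs through the whole array backwards
print(a[::-1])     # [9 8 7 6 5 4 3 2 1 0]

# With a negative step, n should be larger than m; element m is still excluded
print(a[8:2:-2])   # [8 6 4]
```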

And finally, unlike indexing, Python is a bit lenient if you slice off the end of an array:

In [None]:
print(a[0:20])

**`Exercise 4.2`** \
Slicing can also be used to *set* multiple values in an array at the same time. Use slicing to set the first 10 entries of the array below to zero in one line of code.

In [None]:
a = np.array(range(20))+1
print(a)
# some code that sets the first 10 entries to zero
print(a)

### Mathematical operations on arrays

An advantage of using numpy arrays for scientific computing is the way they behave under mathematical operations. In particular, they very often do exactly what we would want them to do if they were a vector:

In [None]:
a = np.array([1,2,3,4,5])
print(2*a)

In [None]:
print(a+a)

In [None]:
print(a+1)

In [None]:
print(a-a)

In [None]:
print(a/2)

In [None]:
print(a**2)

What about if I multiply two vectors together? 

In mathematics, if I multiply two vectors, what I get depends on if I use the "dot product" or the "outer product" for my multiplication:

https://en.wikipedia.org/wiki/Row_and_column_vectors#Operations

The "dot product" corresponds to multiplying a row vector by a column vector to produce a single number. The "outer product" (also called the "tensor product") corresponds to multiplying a column vector by a row vector to make a matrix. 

**Question:** If I type `a*a`, or more generally `a*b`, does Python use the inner or outer product? 

It turns out: it uses **neither!** In Python, the notation `a*a` produces what is commonly called the "element-wise" product: specifically,

`a*b = [a[0]*b[0] a[1]*b[1] a[2]*b[2] ...]`

(Mathematically, this has a fancy name called the <a href=https://en.wikipedia.org/wiki/Hadamard_product_(matrices)>Hadamard product</a>, but as you can see, despite the fancy name, it's actually very simple...)

We can see this in action here:

In [None]:
print(a*a)

What if I actually want the dot product or the outer product? For that, numpy has the functions `np.dot()` and `np.outer()`: 

In [None]:
print(np.dot(a,a))

In [None]:
print(np.outer(a,a))
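
To connect these two functions back to the element-wise product, here is a small sketch. It uses `np.sum()`, which adds up all entries of an array, to check the dot product by hand, and it inspects one entry of the outer product:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# The dot product is the sum of the element-wise products
print(np.sum(a * a))   # 55
print(np.dot(a, a))    # 55

# Entry [i, j] of the outer product is a[i] * a[j]
outer = np.outer(a, a)
print(outer.shape)               # (5, 5)
print(outer[1, 3], a[1] * a[3])  # 8 8
```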

Pretty much all operators work with numpy arrays, even comparison operators, which can sometimes be very handy:

In [None]:
print(a)
print(a>3)
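
One reason comparison results are handy is that the array of `True`/`False` values can be counted or used to select elements. The sketch below assumes numpy's boolean indexing (`a[mask]`), which is not otherwise covered in this notebook:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# The comparison returns an array of booleans, one per element
mask = a > 3
print(mask)          # [False False False  True  True]

# True counts as 1 when summed, so this counts the elements larger than 3
print(np.sum(mask))  # 2

# Indexing with the boolean array keeps only the elements where it is True
print(a[mask])       # [4 5]
```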

**`Exercise 4.3`** \
Generate a sequence of the first 20 powers of 2 in a numpy array (starting at $2^0$). 

Your output should be an array $[2^0, 2^1, 2^2, 2^3, ...]$. 

*(Hint: Start with a numpy array created using an appropriate range function that makes an array [0,1,2,3,...])*

In [None]:
# your code that makes the desired array

## Functions for creating numpy arrays

In numpy, there are also several handy functions for automatically creating arrays:

In [None]:
a = np.zeros(10)
print(a)

In [None]:
a = np.ones(10)
print(a)
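
Note the decimal points in the printed output: by default these functions create arrays of floating point numbers. As a sketch, the optional `dtype` argument can be used to request integers instead:

```python
import numpy as np

a = np.zeros(10)
print(a.dtype)    # float64

# dtype=int requests integers instead of the default floats
b = np.ones(10, dtype=int)
print(b.dtype)
print(b)
```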

### `np.linspace`

To automatically generate an array with linearly increasing values you can use `np.linspace()`:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html

It takes three arguments: the starting number, the ending number, and the number of points.

This is a bit like the `range` function we saw before, but allows you to pick the total number of points, automatically calculating the (non-integer) step size you need:

In [None]:
a = np.linspace(0,20,40)
print(a)
print()
print("Length is: ", len(a))
print("Step size is: ", a[1]-a[0])

Note that if we wanted to have a step size of exactly 0.5, we need a total of 41 points:

In [None]:
a = np.linspace(0,20,41)
print(a)
print()
print("Length is: ", len(a))
print("Step size is: ", a[1]-a[0])
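
The reason 41 points are needed is that `num` points have `num - 1` gaps between them, so the step size is `(stop - start) / (num - 1)`. The sketch below checks this for both examples; the variable names are chosen for illustration only.

```python
import numpy as np

start, stop = 0, 20

for num in (40, 41):
    a = np.linspace(start, stop, num)
    # num points have (num - 1) gaps between them
    expected_step = (stop - start) / (num - 1)
    print(num, a[1] - a[0], expected_step)
```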

**`Exercise 4.4`** \
Generate an array that runs from -2 to 1 with 20 points using `linspace`.

In [None]:
a = ...  # replace ... with your code
print(a)

### `np.arange()`

If we want to have more control over the exact spacing, we can use the `np.arange()` function. It is like `range()`, asking you for the start, stop, and step size:

In [None]:
a = np.arange(0,20,0.5)
print(a)
print()
print("Length is: ", len(a))
print("Step size is: ", a[1]-a[0])

Here, we already see a small quirk of `arange`: it only generates points that are `<` (not `<=`) the stop point, so the stop point itself is not included. If we want to get a range that ends at `20.0`, we need to make the stop point a bit bigger than 20 (but by less than our step size):

In [None]:
a = np.arange(0,20.00000001,0.5)
print(a)
print()
print("Length is: ", len(a))
print("Step size is: ", a[1]-a[0])

For this reason, I do not find myself using `np.arange()` very often, and mostly use `np.linspace()`. There are also several other useful functions, such as <a href=https://docs.scipy.org/doc/numpy/reference/generated/numpy.geomspace.html>np.geomspace()</a>, which produces geometrically spaced points (such that they are evenly spaced on a log scale). 
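
As a brief sketch of `np.geomspace()`: it takes the same three arguments as `np.linspace()`, but each point is a constant factor (rather than a constant amount) larger than the previous one.

```python
import numpy as np

# Four points from 1 to 1000: each point is 10 times the previous one
g = np.geomspace(1, 1000, 4)
print(g)

# The ratio between neighbouring points is constant
print(g[1:] / g[:-1])
```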

**`Exercise 4.5`** \
Generate a numpy array that has a first element with value 60 and last element 50 and takes steps of -0.5 between the values. 

In [None]:
a = ...  # replace ... with your code
print(a)

### Random numbers 

Numpy can also generate arrays of random numbers:

In [None]:
a = np.random.random(40)
print(a)

This will generate uniform random numbers on the range of 0 to 1, but there are also several other random number generator functions that can make <a href=https://en.wikipedia.org/wiki/Normal_distribution>normally distributed</a> random numbers, or random integers, and more.
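
Two of these other generators are sketched below: `np.random.normal()` takes the mean, the standard deviation, and the number of samples, and `np.random.randint()` takes a lower bound, an upper bound (which is excluded), and the number of samples. The specific values are chosen for illustration only, and the output will differ on every run.

```python
import numpy as np

# Normally distributed numbers: mean 0, standard deviation 1, 5 samples
normal = np.random.normal(0, 1, 5)
print(normal)

# Random integers from 1 up to and including 6 (the upper bound 7 is excluded)
dice = np.random.randint(1, 7, 5)
print(dice)
```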

**`Exercise 4.6`** \
Generate a numpy array that contains 300 random grades that have a distribution of a <a href=https://www.mathsisfun.com/data/standard-normal-distribution.html>bell-shaped curve</a> that might represent the final grades of the students in this course, with an average grade of 7.5 and a standard deviation of 1. Make sure your grades are rounded to a half point.

(You may find it useful to look at the help of the `np.random.normal()` function, and at your answer from Assignment 1 for the rounding.)

(Because of the properties of a normal distribution, a small percentage of the grades may be above a 10. You may leave this for now.)

In [None]:
#Your code here that results in a numpy array rounded_grades
# ...some code...
print(rounded_grades)

## Numpy functions 

We can use `for` loops to iterate through numpy arrays and perform calculations. 

For example, this code will calculate the average value of all the numbers in an array:

In [None]:
a = np.linspace(1,20,20)
avg = 0

for x in a:
    avg += x
    
avg /= len(a)

print("Average is", avg)

Because this is a common operation, there is a function built into numpy `np.average()` that can do this for you:

In [None]:
print("Average is", np.average(a))

This is very handy: it saves us loads of typing! From the function name, it is also easy to understand what you are doing, making the code clearer and easier to read. However, the purpose of numpy functions is not only to save lots of typing: they can also often perform calculations MUCH faster than if you program the calculation yourself with a `for` loop, as we will see in the next section.

Numpy also has many other useful functions for performing calculations using arrays:

In [None]:
# The square root
print(np.sqrt(a))

In [None]:
# Numpy also has max and min, here is an example of min
a = np.linspace(-10,10,100)
print(np.min(a**2))

Good question for you to think about: why is this not zero? And what would I have to change above to get the code to return zero? 

In addition to finding the minimum value in a vector, the function `argmin` can tell you **where** (what index number) the minimum is:

In [None]:
# Find the index number of the minimum of the array
i = np.argmin(a**2)
print(i)
print((a**2)[i])

Note also that we used round brackets `()` around `a**2` in the `print` statement so that we could then index the resulting array using square brackets `[]`. 

You can find the full list of mathematical numpy functions on the documentation website:

https://docs.scipy.org/doc/numpy/reference/routines.math.html

and the full list of all functions in the reference guide: 

https://docs.scipy.org/doc/numpy/reference/index.html

**`Exercise 4.7`** \
Make an array `x` that runs from 0 to 4 with 20 points, and calculate an array `y` whose entries are equal to the square root of the entries in `x`. 

In [None]:
# your code to make the requested arrays x and y
print(y)

## Solutions to exercises

**Exercise 4.1:** 

In [None]:
a = np.array([1,2,3,4,5,6,7,8,9,10,11,32,55,78,22,99,55,33.2,55.77,99,101.3])

#some code to set the first three and last two entries to zero 

# A bit tricky, but these work
a[0] = 0
a[1] = 0
a[2] = 0

# We could count forwards (a[19] = 0 and a[20] = 0), but it is much
# easier to count backwards from the end
a[-2] = 0
a[-1] = 0
print(a)

**Exercise 4.2:** 

In [None]:
a = np.array(range(20))+1
print(a)
a[0:10]  =  0
print(a)

**Exercise 4.3:** 

In [None]:
n = np.array(range(20))
out = 2**n
print(out)

**Exercise 4.4:** 

In [None]:
a = np.linspace(-2,1,20)
print(a)

**Exercise 4.5:** 

In [None]:
a = np.arange(60,49.9,-0.5)
print(a)

**Exercise 4.6:** 

In [None]:
raw = np.random.normal(7.5,1,300)
rounded_grades = np.round(raw*2)/2
print(rounded_grades)

**Exercise 4.7:** 

In [None]:
x = np.linspace(0,4,20)
y = np.sqrt(x)
print(y)