# Operations on NumPy arrays

One of the most powerful features of NumPy is its ability to manipulate entire arrays of numbers in one go.

## Numerical operations

In Python, you can multiply a single number by another to get a new number:

In [1]:
# insert code here

However, if you try to multiply a list by a number it will give a perhaps strange result:

In [2]:
# insert code here
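
As a small sketch of the behaviour described here (the name `numbers` and its values are only illustrative), multiplying a list by an integer repeats the list rather than scaling its contents:

```python
# A plain Python list of numbers (illustrative values)
numbers = [1, 2, 3]

# `*` on a list repeats it; it does not multiply each element
repeated = numbers * 2
print(repeated)  # [1, 2, 3, 1, 2, 3]
```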

This happens because Python's lists are not restricted to holding numbers, nor must they hold one consistent type, so they have no special logic for the case where they *do* contain only numbers. The only safe way to interpret `*` that works for all Python lists is "repeat the list".

NumPy, however, is designed to deal with numerical data and so interprets the request differently:

In [3]:
# insert code here

In [4]:
# insert code here

<style>
table.array {
  border-spacing: 0;
  border-collapse: collapse;
}
table.array td {
  border: 2px solid darkblue;
  background-color: lightblue;
  padding: 2px;
  text-align: center;
}
.operation {
  display: flex;
  flex-direction: row;
  align-items: center;
  gap: 10px;
  font-family: monospace;
  font-size: 2em;
  justify-content: center;
}
.vertical {
  writing-mode: vertical-rl;
}
table.array th.axis_label, table.array td.axis_label {
  border: 0;
  background-color: white;
  padding: 2px;
  font-weight: normal;
  font-size: 0.8em;
}
table.array .selected td, table.array td.selected {
  background-color: orange;
}
table.array td.empty, table.array tr.empty td {
  background-color: white;
  color: lightgrey;
}
</style>



<div class="operation">
<div>
    <table class="array"><tr><td>3.14</td></tr><tr><td>2.71</td></tr><tr><td>2.36</td></tr></table>
</div>
<div>×</div>
<div>2</div>
<span style="font-size: 200%;">⇨</span>
<div>
    <table class="array"><tr><td>6.28</td></tr><tr><td>5.42</td></tr><tr><td>4.72</td></tr></table>
</div>
</div>

Here, each number has been multiplied by 2 individually.
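
The same calculation can be sketched in code. The values are taken from the diagram; the variable names `a` and `doubled` are assumptions made for this example:

```python
import numpy as np

# The same three values as in the diagram above
a = np.array([3.14, 2.71, 2.36])

# Multiplying by a scalar scales every element
doubled = a * 2
print(doubled)  # [6.28 5.42 4.72]
```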

You can apply any of the standard numerical operations to NumPy arrays, including `*`, `+`, `/`, `-` and `**`. You can also use comparison operations like `==`, `>` and `<=`. If your array contains booleans (`True`/`False`) then you can also use the binary logical operators `|` ("or") and `&` ("and") as well as the unary logical operator `~` ("not").

In all of these cases, NumPy applies the operation to each element of the array individually and gives you back an array of the same shape.
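
To make the operators listed above concrete, here is a short sketch. The array `a` and its values are made up for illustration:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

# Arithmetic is applied to every element
print(a + 10)   # [11. 12. 13. 14.]
print(a ** 2)   # [ 1.  4.  9. 16.]
print(a / 2)    # [0.5 1.  1.5 2. ]

# Comparisons give a boolean array of the same shape
small = a <= 2
big = a > 3
print(small)    # [ True  True False False]

# Boolean arrays combine element-wise with |, & and ~
print(small | big)   # [ True  True False  True]
print(small & big)   # [False False False False]
print(~small)        # [False False  True  True]
```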

One big benefit of this is an improvement in speed. To demonstrate this, let's try doubling all the values in a large list of 1 million values:

In [5]:
# insert code here

Doing this with plain Python looks like:

In [6]:
# insert code here

(Here we're using the `%%timeit` magic to display the execution time of a specific code block.)

But NumPy allows us to do:

In [7]:
# insert code here
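
One way to sketch this comparison outside a notebook is with the standard-library `timeit` module, used here in place of the `%%timeit` magic. The names `python_list`, `numpy_array` and `runs` are assumptions made for this example, and no particular timings are implied:

```python
import timeit

import numpy as np

# One million values, as a list and as an array (names are illustrative)
python_list = list(range(1_000_000))
numpy_array = np.arange(1_000_000)

# Time a few runs of each approach and keep the average per run
runs = 5
list_time = timeit.timeit(lambda: [x * 2 for x in python_list], number=runs) / runs
array_time = timeit.timeit(lambda: numpy_array * 2, number=runs) / runs

print(f"list comprehension: {list_time * 1000:.2f} ms per run")
print(f"NumPy array:        {array_time * 1000:.2f} ms per run")
```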

You might see different results on your computer, but speedups of anything from 10 to 100 times are common on an example like this. There are plenty of operations which might see speedups of 1000 times or more.

### Exercise

- Try multiplying the array with different numbers
- Subtract $3.04$ from each element of the array
- Use a comparison operator to ask if each number is greater than $2.5$.


In [8]:
# insert code here

##### Element-wise multiplication

In [9]:
# insert code here

In [10]:
# insert code here


In [11]:
# insert code here


##### Element-wise subtraction

In [12]:
# insert code here


##### Element-wise comparison


In [13]:
# insert code here

## Functions

As well as simple numerical operations, you will often want to perform more complex operations on your data, such as taking the cosine of a number. We can do this in plain Python with the `math` module:

In [14]:
# insert code here

This works, but has the same problem as above in that it doesn't work as you want with a Python list:

In [15]:
# insert code here
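
A minimal sketch of the problem, assuming an illustrative list of angles: `math.cos` accepts a single number, and raises a `TypeError` when given a list.

```python
import math

print(math.cos(0))  # 1.0

# math.cos expects a single number, so a list is rejected
raised = False
try:
    math.cos([0, 1, 2])
except TypeError as error:
    raised = True
    print("TypeError:", error)
```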

To help with this, NumPy provides [a large number of operations via the `numpy` namespace](https://numpy.org/doc/stable/reference/routines.math.html). They work the same way as the Python functions for single numbers:

In [16]:
# insert code here

But they also work with Python lists:

In [17]:
# insert code here

You see here that even though we passed it a Python list, it has returned the result as a NumPy array. We can also pass in a NumPy array directly:

In [18]:
# insert code here

There is a cost to passing in Python lists compared with using an array directly, because NumPy first has to convert the list into an array. If you can, it's best to keep things as NumPy arrays throughout your computations.
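
The three cases described above (a single number, a Python list and a NumPy array) might look like this. The input values and variable names are chosen only for illustration:

```python
import numpy as np

# A single number works just like math.cos
print(np.cos(0))            # 1.0

# A Python list is converted and the result is a NumPy array
from_list = np.cos([0, np.pi])
print(type(from_list))      # <class 'numpy.ndarray'>
print(from_list)            # [ 1. -1.]

# A NumPy array can be passed directly, with no conversion needed
angles = np.array([0, np.pi])
from_array = np.cos(angles)
print(from_array)           # [ 1. -1.]
```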

##### Other NumPy functions

NumPy provides a wide range of mathematical functions that operate efficiently on arrays, including trigonometric functions, logarithmic functions, exponential functions, etc.

In [19]:
# insert code here


In [20]:
# insert code here
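
As a hedged illustration of a few of these functions, the sketch below applies `np.log10`, `np.sqrt` and `np.exp` to a small made-up array called `values`:

```python
import numpy as np

values = np.array([1.0, 10.0, 100.0])

# Base-10 logarithm of each element
logs = np.log10(values)
print(logs)  # [0. 1. 2.]

# Square root of each element
roots = np.sqrt(values)
print(roots)

# e raised to the power of each element
powers = np.exp(np.array([0.0, 1.0]))
print(powers)
```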


It also provides functions for aggregating data, such as calculating the sum, mean, minimum, maximum, etc., of an array or along a specific axis. These functions are highly optimized and efficient.


In [21]:
# insert code here


In [22]:
# insert code here
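
A small sketch of aggregation over a whole array and along an axis. The 2D array `grid` is an assumption made for this example:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

# Aggregating over every element
print(grid.sum())              # 21
print(np.mean(grid))           # 3.5
print(grid.min(), grid.max())  # 1 6

# axis=0 collapses the rows, giving one value per column
column_sums = grid.sum(axis=0)
print(column_sums)             # [5 7 9]

# axis=1 collapses the columns, giving one value per row
row_sums = grid.sum(axis=1)
print(row_sums)                # [ 6 15]
```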


##### Generating arrays filled with (pseudo-)random numbers

In [23]:
# Generate a 1D array of 5 random integers between 0 and 10
# insert code here


In [24]:
# Generate a 2D array of shape (3, 4) with random floating-point numbers between 0 and 1
# insert code here
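
The text does not say which random-number API is intended, so the sketch below is one possibility using NumPy's `default_rng` generator. The seed and the names `rng`, `ints` and `floats` are assumptions; note that `rng.integers(0, 10, ...)` excludes the upper bound 10 by default.

```python
import numpy as np

# A seeded generator makes the example reproducible (the seed is arbitrary)
rng = np.random.default_rng(seed=42)

# 5 random integers from 0 up to, but not including, 10
ints = rng.integers(0, 10, size=5)

# A (3, 4) array of floats drawn uniformly from [0, 1)
floats = rng.random((3, 4))

print(ints)
print(floats.shape)  # (3, 4)
```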


## Plotting arrays

As our arrays get longer and more complex, it becomes difficult to see what's going on by just looking at the numbers. Let's see how we can easily plot the data as a line graph. First, let's make some data to plot:

In [26]:
# Numbers from 0 to 20. 100 of them.

# insert code here

First, we need to import `matplotlib`, the de facto standard plotting tool for Python:

In [27]:
# insert code here

Then, we need to make a place for the plotting to happen, which we do with the `plt.figure()` function.
We then draw on the axes with `plt.plot` and pass it the $y$ values:

(Hint: you will also need `plt.show()` to display the figure without the notebook also printing the value returned by `plt.plot`.)

In [28]:
# insert code here

It has done the plot and the $y$ values are correct, but the $x$ axis has just been taken as the integer indexes of the array. If we want the $x$ axis to show the actual $x$ values, we can pass two arguments to `plot`:

In [29]:
# insert code here
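
Putting the plotting steps together, a minimal sketch might look like the following. The text does not specify the $y$ data, so the sine curve is purely an assumption, as are the names `x`, `y` and `lines`:

```python
import matplotlib.pyplot as plt
import numpy as np

# 100 evenly spaced numbers from 0 to 20
x = np.linspace(0, 20, 100)
# The choice of sine is only for illustration
y = np.sin(x)

# Make a new figure to draw on
plt.figure()
# Passing x and y uses the x values for the horizontal axis
lines = plt.plot(x, y)
plt.show()
```

Calling `plt.plot(y)` with a single argument would instead place the points at the integer positions 0 to 99.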

### Exercise : Data manipulation and plotting

- Generate some random data 
- Plot the data as a line graph
- Find the mean, $\mu$, of the data
  - Have a look at the [list of statistical functions](https://numpy.org/doc/stable/reference/routines.statistics.html) to find an appropriate one
- Find the standard deviation, $\sigma$, of the data
- Write a function which returns the ratio between the standard deviation and the mean, $\dfrac{\sigma}{\mu}$
- Check that your function works with both Python lists and the `data` array


In [30]:
# Generate some random data
# insert code here


In [31]:
# Plot the data as a line graph
# insert code here


In [32]:
# Calculate mean and standard deviation of the data - you can use the numpy functions np.mean() and np.std()

# insert code here


In [33]:
# Calculate the ratio between standard deviation and mean using the function
# insert code here


In [34]:
# Check the function with Python lists
# insert code here
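
One possible shape for the ratio function is sketched below. The function name `std_to_mean_ratio` and the random `data` are hypothetical, and other solutions are equally valid. Because `np.std` and `np.mean` accept lists as well as arrays, the same function handles both.

```python
import numpy as np


def std_to_mean_ratio(values):
    """Return the standard deviation of `values` divided by their mean."""
    return np.std(values) / np.mean(values)


# Illustrative random data (the seed is arbitrary)
rng = np.random.default_rng(seed=0)
data = rng.random(1000)

# Works with a NumPy array...
print(std_to_mean_ratio(data))
# ...and with a plain Python list
print(std_to_mean_ratio([1, 2, 3, 4]))
```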


### Homework Exercise

- Generate an array `x` with 100 evenly spaced values from 0 to 10 (hint: use `numpy.linspace()` to create a range of x-values).
- Generate an array `y` by applying a mathematical expression (sin(x)) to the x array. 
- Generate a new array `y_r` which is equal to `y` + some random noise. Use `numpy.random.normal()` to make the data slightly varied.
- Plot the generated data as a scatter plot using `matplotlib.pyplot.scatter()`. This visualizes the relationship between the x and y values.
- Use `numpy.max()` to find the maximum value in the y array.
- Use `numpy.argmax()` to find the index of the maximum value in the y array.
- Print the maximum value and the index of the maximum value.

Hint: you can display both `y` and `y_r` in the plot by calling `plt.plot` or `plt.scatter` multiple times before `plt.show`. Try plotting `y` as a line with `plt.plot(x, y)` and `y_r` as a scatter plot with `plt.scatter(x, y_r)`.
Play with the `color` parameter to distinguish between the two.