# Numerical Operations

## Learning Goals

- What makes NumPy arrays useful for numerical operations
- How arrays compare against lists for math
- An overview of array operations one can perform
- How to find the mean of a NumPy array
- How to sort a NumPy array
- What it means to apply an operation along an axis

## Introduction
NumPy arrays are at their most powerful when they contain numerical data. Many things that are cumbersome with lists come naturally to NumPy arrays. They have math at the core of their being.

For starters, we will create both a list and a NumPy array containing the same numbers. We will see that the difference in container type (list vs. array) changes how easily we can do mathematical operations on the contained elements.

In [None]:
import numpy as np

In [None]:
# Initialize our list and our array
number_list = list(range(10))
number_arr  = np.arange(10)

# print them so we see they contain the same numbers
print(f"{number_list=}")
print(f"{number_arr=}")

### Math with Lists

For a list of numbers, we want to multiply each element by a factor of 2.

In [None]:
# a naive approach
number_list * 2

Multiplying a list by a number just repeats the list a number of times. That is not what we want.  
To do this properly, we might need to use a for-loop or a so-called list comprehension (don't worry about this now), like this:

In [None]:
# for-loop
double_numbers = []
for n in number_list:
    double_numbers.append(n*2)
print("for-loop", double_numbers)

# List comprehension
print("list comprehension", [n*2 for n in number_list])  

That was neither fun nor intuitive.

### Math with NumPy Arrays
Multiplying a NumPy array by a number does what we want straight away, namely multiplying each element in the array by that number.

In [None]:
print(number_arr*2)

The same logic also works for other mathematical operations, which you are welcome to try out.

In [None]:
# Test this for other operations (+, -, /, **)
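As a possible starting point for the cell above, here is a minimal sketch of the other arithmetic operators applied to a small array. The name `demo_arr` is introduced here purely for illustration and is not used elsewhere in this notebook.

```python
import numpy as np

demo_arr = np.arange(1, 5)   # contains 1, 2, 3, 4

print(demo_arr + 10)   # adds 10 to each element
print(demo_arr - 1)    # subtracts 1 from each element
print(demo_arr / 2)    # divides each element by 2 (the result is a float array)
print(demo_arr ** 2)   # squares each element
```

Note that division always yields floating point numbers, even if the input array contains integers.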

### Element-wise addition and subtraction

__Element-wise addition for lists__  
Imagine we have two lists of numbers which we want to add elementwise. Just adding two lists like this:

```python
list_1 = [1,2,3]
list_2 = [3,2,1]
list_1 + list_2
```

will just combine both lists into a single new list like this:

```python
[1, 2, 3, 3, 2, 1]
```

For basic Python lists, implementing element-wise addition is relatively cumbersome:

In [None]:
list_1 = [1,2,3,4,5]
list_2 = [-1,-2,-3,-4,-5]

# list_1 + list_2 won't do what we want!

# The long route would be:
new_list = []
for idx in range(len(list_1)):
    new_list.append(list_1[idx] + list_2[idx])
print(new_list)

__Element-wise addition for NumPy arrays__  
We will see that NumPy arrays make such manipulations much easier, as they natively support element-wise operations if the shapes of the arrays are compatible (or can be made compatible).

In [None]:
one_array = np.ones(10)
rnd_array = np.random.randint(10,size = 10) # Array of random integers from 0 up to incl. 9

Arrays of compatible shapes can be combined element-wise using the normal operators, e.g. `+`, `-`, `*`, `/`.

In [None]:
one_array + rnd_array
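What "can be made compatible" means is easiest to see in a small example. The following sketch, using illustrative names (`table`, `row`) that are not part of this notebook, shows NumPy stretching a one-dimensional array across the rows of a two-dimensional one (so-called broadcasting), and what happens when the shapes do not fit.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])      # shape (3,)

# The row is added to each row of the table (broadcasting)
summed = table + row
print(summed)

# Shapes (2, 3) and (2,) are not compatible, so this raises a ValueError
try:
    table + np.array([1, 2])
except ValueError as err:
    print("Incompatible shapes:", err)
```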

## There are many NumPy operations!

Beyond combining two different arrays, we can also perform operations on a single array. Often we want to extract some summary information from an array, e.g. its mean, standard deviation, maximum, minimum, or percentiles. NumPy provides various functions for that, and the NumPy array class also implements some of them as methods.

The following overview is not meant to be memorized. Again, its main purpose is to illustrate that there are a lot of built-in functions that make working with numerical data easy.


| **Category**                  | **Operation**                     | **Function/Method**                                  | **Description**                                   |
|-------------------------------|-----------------------------------|-----------------------------------------------------|---------------------------------------------------|
| **Statistical Operations**    | Mean                              | `np.mean(array)`, `array.mean()`                    | Computes the average of all elements.             |
|                               | Median                            | `np.median(array)`                                  | Computes the middle value of the array.           |
|                               | Standard Deviation                | `np.std(array)`, `array.std()`                      | Measures the spread of data.                      |
|                               | Variance                          | `np.var(array)`, `array.var()`                      | Computes average squared differences from mean.   |
|                               | Minimum/Maximum                   | `np.min(array)`, `np.max(array)`                    | Finds the smallest or largest element.            |
|                               | Percentile                        | `np.percentile(array, q)`                           | Finds the `q`th percentile of the array data.     |
| **Aggregation Functions**     | Sum                               | `np.sum(array)`, `array.sum()`                      | Computes the sum of all elements.                 |
|                               | Product                           | `np.prod(array)`, `array.prod()`                    | Computes the product of all elements.             |
|                               | Cumulative Sum                    | `np.cumsum(array)`                                  | Calculates cumulative sum across the array.       |
|                               | Cumulative Product                | `np.cumprod(array)`                                 | Calculates cumulative product across the array.   |
| **Mathematical Functions**    | Square Root                       | `np.sqrt(array)`                                    | Computes square root of each element.             |
|                               | Exponentials                      | `np.exp(array)`                                     | Computes exponential (e^x) of each element.       |
|                               | Logarithms                        | `np.log(array)`                                     | Computes natural log of each element.             |
|                               | Absolute Value                    | `np.abs(array)`, `abs(array)`                       | Finds absolute value of each element.             |
|                               | Rounding                          | `np.round(array)`, `np.floor(array)`, `np.ceil(array)` | Rounds to the nearest integer, down, or up.    |
| **Boolean Operations**        | Any                               | `np.any(array)`, `array.any()`                      | Checks if any elements are `True`.                |
|                               | All                               | `np.all(array)`, `array.all()`                      | Checks if all elements are `True`.                |
|                               | Count Non-Zero                    | `np.count_nonzero(array)`                           | Counts non-zero elements in the array.            |
| **Sorting Array Values**      | Sorting                           | `np.sort(array)`, `array.sort()`                    | Sorts elements along a specified axis.            |
|                               | Unique Values                     | `np.unique(array)`                                  | Finds unique values in the array.                 |
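To give a feel for a few entries of the table that are not tested further below, here is a minimal sketch on a small hand-written array. The name `values` and its contents are made up for illustration.

```python
import numpy as np

values = np.array([1.5, -2.0, 0.0, 4.0, 2.5])

print(np.percentile(values, 50))   # the 50th percentile equals the median
print(np.std(values))              # spread of the data around the mean
print(np.abs(values))              # element-wise absolute value
print(np.floor(values))            # round each element down
print(np.ceil(values))             # round each element up
print(np.count_nonzero(values))    # number of non-zero entries
```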
  



## Testing some NumPy Operations

In [None]:
# We initialize our test array
array = np.random.randint(100, size = (4,5))
print(array)

In [None]:
# Mean
mean_value = np.mean(array) # == array.mean()
print("Mean:", mean_value)
# Median
median_value = np.median(array)
print("Median:", median_value)
# Minimum
min_value = np.min(array)   # == array.min()
print("Minimum:", min_value)

__Applying Operations along dimensions__

For multidimensional arrays, we may also apply operations along an axis. Imagine a 2D array (like a table of numbers) in which each row represents a different patient and each column a different day, with the entries storing some variable of interest such as a pain score.  
We could take the overall mean, i.e. the average across all patients and all days, but we might also be interested in a more fine-grained view.  

- A mean along the columns (`axis = 1`) gives the mean of each patient, across all days.
- A mean along the rows (`axis = 0`) gives the average pain value of each day, across all patients.

In [None]:
# Mean of each column, i.e., collapse the row dimension
array.mean(axis = 0) 

In [None]:
# Mean of each row, i.e., collapse the column dimension
array.mean(axis = 1)

In [None]:
# The axes may also be a tuple, so that you can average across multiple dimensions at once
array.mean(axis = (0,1))
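The patient-by-day picture from above can be made concrete with a small sketch. The pain scores below are invented for illustration, and the names `pain`, `per_day`, and `per_patient` are not used elsewhere in this notebook.

```python
import numpy as np

# Hypothetical pain scores: 3 patients (rows) x 4 days (columns)
pain = np.array([[2, 4, 6, 8],
                 [1, 1, 3, 3],
                 [3, 4, 6, 7]])

per_day = pain.mean(axis=0)      # collapses the rows: one value per day
per_patient = pain.mean(axis=1)  # collapses the columns: one value per patient

print(per_day.shape, per_day)          # 4 values, one for each day
print(per_patient.shape, per_patient)  # 3 values, one for each patient
```

A useful rule of thumb: the axis you pass is the one that disappears from the shape of the result.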

#### Logical Operations on Arrays

We can also use logical statements on NumPy arrays, e.g., checking whether elements in an array are larger than a certain value or not. 
Such logical (Boolean) operations on arrays must be performed with a bit of care, as an array of Boolean values is not itself a single Boolean that can be used in a condition. To obtain a single Boolean value, you can test whether `any()` element satisfies a criterion, and likewise you may test whether a criterion is True for `all()` elements of an array.

In [None]:
# Is at least one element greater than 40?
boolean_array = array > 40
print(f"{boolean_array=}")
any_greater_than_40 = np.any(boolean_array)
print("Any elements greater than 40:", any_greater_than_40)

# Are all elements smaller than 20?
boolean_array = array < 20
print(f"{boolean_array=}")
all_smaller_than_20 = np.all(boolean_array)
print("All elements smaller than 20:", all_smaller_than_20)
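To see why a Boolean array cannot be used directly as a condition, consider the following minimal sketch. The names `scores` and `mask` are made up for this illustration.

```python
import numpy as np

scores = np.array([3, 7, 12])
mask = scores > 5            # one Boolean per element

try:
    if mask:                 # ambiguous: which element should decide?
        print("never reached")
except ValueError as err:
    print("ValueError:", err)

# Reducing the array to a single Boolean resolves the ambiguity
if mask.any():
    print("At least one score is above 5")
if not mask.all():
    print("Not every score is above 5")
```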

#### Sorting Arrays

Sometimes, we want to sort an array by the value of the elements. This can be done using the sort function `np.sort()`.

In [None]:
# Sorting along axis 0 sorts the values within each column
print("Original array:")
print(array)
sorted_array = np.sort(array, axis = 0)
print("Sorted array:")
print(sorted_array)
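One detail worth knowing is that `np.sort()` returns a sorted copy, whereas the method `array.sort()` sorts the array in place and returns `None`. The following sketch illustrates the difference on a small array; the names `data`, `sorted_rows`, and `sorted_cols` are chosen here for illustration only.

```python
import numpy as np

data = np.array([[9, 1, 8],
                 [3, 7, 2]])

sorted_rows = np.sort(data, axis=1)   # new array, values sorted within each row
sorted_cols = np.sort(data, axis=0)   # new array, values sorted within each column
print(sorted_rows)
print(sorted_cols)
print(data)                           # the original is unchanged so far

result = data.sort(axis=1)            # sorts in place and returns None
print(result)
print(data)                           # now the original itself is sorted
```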

#### Unique Values

Sometimes, it is useful to check for all unique values in an array. `np.unique()` does just that.

In [None]:
# Unique values
unique_values = np.unique(array)
print("Unique values:", unique_values)
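If you also want to know how often each value occurs, `np.unique()` accepts the optional argument `return_counts=True`. A minimal sketch, with an invented array named `labels`:

```python
import numpy as np

labels = np.array([3, 1, 3, 2, 1, 3])

values, counts = np.unique(labels, return_counts=True)
print(values)   # the unique values, in sorted order
print(counts)   # how often each of those values occurs
```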

#### Summation

We can also sum all elements of an array. As we have shown for taking the mean, this operation can also be applied along an axis.
Likewise, we can also perform a cumulative sum.

In [None]:

total_sum = np.sum(array)
print("Sum of all elements:", total_sum)

row_sum = array.sum(axis = 1)
print("Row sum:", row_sum)

# Cumulative sum of elements (without an axis, the array is flattened first)
cumulative_sum = np.cumsum(array)
print("Cumulative sum of elements:", cumulative_sum)
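The cumulative sum can also be taken along an axis. The sketch below contrasts the default behaviour (the array is flattened first) with a running total within each row, using a small illustrative array called `small`.

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

flat_cumsum = np.cumsum(small)          # flattened first: one long running total
row_cumsum = np.cumsum(small, axis=1)   # running total within each row

print(flat_cumsum)
print(row_cumsum)
```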



## Bonus: Linear Algebra

All of the above operations are already pretty useful. Beyond them, linear algebra comes up regularly in scientific computing and data analysis.
So, of course, NumPy also supports linear algebra operations, some of which need to be accessed via its `linalg` submodule¹. This encompasses, for instance, matrix multiplication, eigendecomposition, singular value decomposition, and many more. Principal component analysis, a dimensionality reduction method some of you might have already encountered, is for instance based on either singular value decomposition or eigendecomposition.

Linear algebra is the workhorse of statistics and dynamical systems, so you might encounter it sooner or later in the wild.


| **Category**                        | **Operation**                     | **Function/Method**                                  | **Description**                                   |
|-------------------------------------|-----------------------------------|-----------------------------------------------------|---------------------------------------------------|
| **Linear Algebra Operations**       | Dot Product                       | `np.dot(array, other)`, `array.dot(other)`          | Computes the dot product for 1D, 2D, and higher-dimensional arrays. |
|                                     | Matrix Inverse                    | `np.linalg.inv(array)`                              | Finds the inverse of a square matrix.             |
|                                     | Eigenvalues and Eigenvectors      | `np.linalg.eig(array)`                              | Computes eigenvalues and eigenvectors.            |
|                                     | Singular Value Decomposition (SVD)| `np.linalg.svd(array)`                              | Computes the singular value decomposition of a matrix. |
|                                     | Determinant                       | `np.linalg.det(array)`                              | Computes the determinant of a matrix.             |
|                                     | Matrix Rank                       | `np.linalg.matrix_rank(array)`                      | Finds the rank of a matrix.                       |
| **Matrix Products**                 | Matrix Multiplication             | `array @ other`, `np.matmul(array, other)`          | Performs matrix multiplication in the linear algebra sense. |
|                                     | Element-wise Multiplication       | `array * other`                                     | Multiplies corresponding elements of arrays.      |



We won't delve into details here. I'd like to mention, though, that there is a convenient shorthand for matrix multiplication in NumPy, namely the `@` operator, so that `matrix_a @ matrix_b` is the matrix product of those two matrices. The usual asterisk `*` implements the element-wise or Hadamard product, `matrix_c = matrix_a * matrix_b`, where `matrix_c[i,j] = matrix_a[i,j] * matrix_b[i,j]`.
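The difference between the two products is easy to see on a pair of small matrices. This is only a minimal sketch; the values are arbitrary.

```python
import numpy as np

matrix_a = np.array([[1, 2],
                     [3, 4]])
matrix_b = np.array([[0, 1],
                     [1, 0]])

matrix_product = matrix_a @ matrix_b     # rows of a combined with columns of b
hadamard_product = matrix_a * matrix_b   # entry-by-entry multiplication

print(matrix_product)
print(hadamard_product)
```

Here, multiplying by `matrix_b` from the right swaps the columns of `matrix_a`, whereas the element-wise product simply zeroes out the diagonal.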

Old code might use the specialised `matrix` object of NumPy. Don't. Its use is no longer recommended; use normal arrays.

¹ _On a personal note, I prefer SciPy's `linalg` for some applications. By and large, the two are very similar, though._

## Summary and Outlook

In this notebook, we discussed a variety of mathematical operations for NumPy arrays. We have seen that NumPy arrays support the regular arithmetic operations element-wise for compatible arrays, and that likewise multiplication by a scalar (i.e., just a single number) multiplies each element of the array by that scalar. We have also seen that there are multiple mathematical operations that can be performed on single arrays, e.g. to get the mean, find the maximum, or sort values.
In the next notebook, we will have another look at masking arrays and at combining different arrays into a new array. We will also learn how to append elements to existing arrays.