# Mathematical and Statistical Operations

This notebook explores how to perform efficient mathematical and statistical computations on NumPy arrays. We'll cover vectorized operations, universal functions (ufuncs), and aggregations.

In [1]:
import numpy as np

Let's start by creating a sample 2x4 array.

In [2]:
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])
print(data)

[[1 2 3 4]
 [5 6 7 8]]


### 1. Vectorized Arithmetic and Universal Functions

NumPy lets you apply an operation to an entire array at once, without writing a Python loop. This is called vectorization, and it is usually much faster than looping because the element-by-element work runs in compiled code.
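
To make the comparison concrete, here is a minimal sketch that computes the same result with an explicit Python loop and with a vectorized expression. The array `values` and the helper `scale_with_loop` are hypothetical names introduced only for this illustration.

```python
import numpy as np

# Hypothetical sample input, used only for this illustration.
values = np.arange(1, 9)

def scale_with_loop(arr, factor):
    """Multiply each element by `factor` using an explicit Python loop."""
    result = np.empty_like(arr)
    for i in range(arr.size):
        result[i] = arr[i] * factor
    return result

looped = scale_with_loop(values, 10)
vectorized = values * 10  # one expression; the loop runs inside NumPy

print(np.array_equal(looped, vectorized))
```

Both approaches produce the same array. On large inputs the vectorized form is typically much faster, which you can check yourself with `%timeit` in a notebook.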

#### Basic Arithmetic
You can perform element-wise arithmetic (addition, subtraction, multiplication, etc.) directly on the array. Here, every element in `data` is multiplied by 10.

In [None]:
print(data * 10)

[[10 20 30 40]
 [50 60 70 80]]
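
The other arithmetic operators behave the same way. A short sketch, with `data` redefined so the cell runs on its own:

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

shifted = data - 1     # subtract 1 from every element
squared = data ** 2    # square every element
doubled = data + data  # add two same-shaped arrays element by element

print(shifted)
print(squared)
print(doubled)
```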


#### Universal Functions (ufuncs)
A ufunc is a function that operates on arrays in an element-by-element fashion. NumPy has many built-in ufuncs. For example, `np.sqrt()` calculates the square root of every element.

In [None]:
print(np.sqrt(data))

[[1.         1.41421356 1.73205081 2.        ]
 [2.23606798 2.44948974 2.64575131 2.82842712]]


Another common ufunc is `np.exp()`, which calculates the exponential ($e^x$) of each element.

In [None]:
print(np.exp(data))

[[2.71828183e+00 7.38905610e+00 2.00855369e+01 5.45981500e+01]
 [1.48413159e+02 4.03428793e+02 1.09663316e+03 2.98095799e+03]]
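
The arithmetic operators are themselves backed by ufuncs. As a small sketch (again redefining `data` so it runs standalone), `np.add` gives the same result as `+`, and `np.sqrt` returns a floating-point array even when the input holds integers:

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

# np.sqrt is an instance of NumPy's ufunc type.
is_ufunc = isinstance(np.sqrt, np.ufunc)

# The + operator on arrays produces the same values as np.add.
same_as_operator = np.array_equal(np.add(data, 10), data + 10)

# Integer input, floating-point output.
sqrt_dtype = np.sqrt(data).dtype

print(is_ufunc)
print(same_as_operator)
print(sqrt_dtype)
```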


### 2. Aggregation Functions

Aggregation functions reduce an array to a summary value. Called without arguments, they operate on all elements and return a single number.

The `.sum()` method calculates the sum of all elements in the array.

In [None]:
print(data)

print(data.sum())

[[1 2 3 4]
 [5 6 7 8]]
36


The `.mean()` method calculates the average of all elements.

In [None]:
print(data.mean())

4.5


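The `.std()` method calculates the standard deviation. By default this is the population standard deviation: the sum of squared deviations is divided by $N$, not $N - 1$.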

In [None]:
print(data.std())

2.29128784747792
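
With default arguments, `.std()` evaluates $\sqrt{\frac{1}{N}\sum_i (x_i - \bar{x})^2}$. The sketch below checks this against a manual calculation and also shows the `ddof=1` argument, which divides by $N - 1$ to give the sample standard deviation. `data` is redefined so the example is self-contained, and `manual_std` and `sample_std` are names introduced here for illustration.

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

# Manual population standard deviation: root of the mean squared deviation.
manual_std = np.sqrt(np.mean((data - data.mean()) ** 2))
print(np.isclose(manual_std, data.std()))

# Sample standard deviation: divide by N - 1 instead of N.
sample_std = data.std(ddof=1)
print(sample_std)
```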


The `.min()` and `.max()` methods find the minimum and maximum values in the array, respectively.

In [None]:
print(f'min = {data.min()}')
print(f'max = {data.max()}')

min = 1
max = 8


### 3. Axis-based Operations

You can also perform aggregations along a specific axis of a 2D array.
- `axis=0` runs down the rows (vertically), so aggregating along it gives one result per column.
- `axis=1` runs across the columns (horizontally), so aggregating along it gives one result per row.

Here, `data.sum(axis=0)` collapses the rows and computes the sum of each **column**. The result is a 1D array where each element is the sum of the corresponding column.

In [None]:
col_sum = data.sum(axis=0)
print(col_sum)

[ 6  8 10 12]


`data.sum(axis=1)` collapses the columns and computes the sum of each **row**.

In [None]:
row_sum = data.sum(axis=1)
print(row_sum)

[10 26]


This principle applies to other aggregation functions as well, such as finding the mean of each column.

In [None]:
col_mean = data.mean(axis=0)
print(col_mean)

[3. 4. 5. 6.]
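
One way to remember the axis rule is to look at shapes: the axis you aggregate over disappears from the result. A brief sketch, with `data` redefined so it runs alone. The optional `keepdims=True` argument keeps the collapsed axis with length 1.

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

col_shape = data.sum(axis=0).shape                 # axis 0 (length 2) collapsed
row_shape = data.sum(axis=1).shape                 # axis 1 (length 4) collapsed
kept_shape = data.sum(axis=1, keepdims=True).shape # collapsed axis kept, length 1

print(data.shape)
print(col_shape)
print(row_shape)
print(kept_shape)
```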


### 4. `argmax` and `argmin`

These methods return the **indices** of the maximum and minimum values in an array, rather than the values themselves.

Imagine this array holds the scores a machine learning model assigns to four classes.
- `.argmax()` returns the index of the highest value, which corresponds to the predicted class.
- `.argmin()` returns the index of the lowest value. If that value occurs more than once, the index of its first occurrence is returned.

In [None]:
model_output = np.array([0.1, 0.2, 0.5, 0.1])
print(model_output)

predict_max = model_output.argmax()
predict_min = model_output.argmin()
print(predict_max)
print(predict_min)

[0.1 0.2 0.5 0.1]
2
0
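
Two details are worth checking in code. When the extreme value occurs more than once, `argmax` and `argmin` return the index of the first occurrence, which is why `argmin` gives 0 here rather than 3. Both methods also accept an `axis` argument, so a whole batch of outputs can be handled in one call. The `batch_output` array below is made-up data for illustration.

```python
import numpy as np

model_output = np.array([0.1, 0.2, 0.5, 0.1])

# 0.1 appears at indices 0 and 3; the first occurrence is reported.
first_min = model_output.argmin()
print(first_min)

# Hypothetical scores for three samples (rows) over four classes (columns).
batch_output = np.array([[0.1, 0.2, 0.5, 0.1],
                         [0.7, 0.1, 0.1, 0.1],
                         [0.2, 0.2, 0.1, 0.5]])

# axis=1 searches within each row, giving one predicted class per sample.
predicted_classes = batch_output.argmax(axis=1)
print(predicted_classes)
```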
