# 4.3 Mathematical and Statistical Methods

In [1]:
import numpy as np

A set of mathematical functions that compute statistics over an entire array, or over the data along one axis, is available as methods of the array class. You can call aggregations (often called *reductions*) such as `sum`, `mean`, and `std` (standard deviation) either as array instance methods or as top-level NumPy functions.

In [2]:
rng = np.random.default_rng(seed=12345)
arr = rng.standard_normal((5, 4))
print(f"Original array:\n{arr}")

Original array:
[[-1.42382504  1.26372846 -0.87066174 -0.25917323]
 [-0.07534331 -0.74088465 -1.3677927   0.6488928 ]
 [ 0.36105811 -1.95286306  2.34740965  0.96849691]
 [-0.75938718  0.90219827 -0.46695317 -0.06068952]
 [ 0.78884434 -1.25666813  0.57585751  1.39897899]]


## 4.3.1 Basic Statistical Methods

#### `mean`
Computes the arithmetic mean.

In [3]:
print(f"Mean (method): {arr.mean()}")
print(f"Mean (NumPy function): {np.mean(arr)}")

Mean (method): 0.0010611661248891013
Mean (NumPy function): 0.0010611661248891013


#### `sum`
Computes the sum of all elements.

In [4]:
print(f"Sum: {arr.sum()}")

Sum: 0.021223322497782027


## 4.3.2 Operations on an Axis

Functions like `mean` and `sum` take an optional `axis` argument. When it is given, the statistic is computed over that axis, and the result is an array with one fewer dimension.

In [5]:
# Compute the mean across the columns (axis=1): one value per row
print(f"Mean across columns: {arr.mean(axis=1)}")

Mean across columns: [-0.32248289 -0.38378196  0.4310254  -0.0962079   0.37675318]


In [6]:
# Compute the sum down the rows (axis=0): one value per column
print(f"Sum down rows: {arr.sum(axis=0)}")

Sum down rows: [-1.10865307 -1.78448912  0.21785956  2.69650595]
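
To make the effect of `axis` on the result's shape concrete, here is a minimal sketch. The array `demo` is a hypothetical example chosen so the numbers are easy to check by hand; it is not part of the data above. The sketch also uses the `keepdims=True` option, which keeps the reduced axis with length 1 so that the result can broadcast against the original array.

```python
import numpy as np

# Hypothetical 2 x 3 array, small enough to verify by hand
demo = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

col_sums = demo.sum(axis=0)    # collapses the rows: one value per column
row_means = demo.mean(axis=1)  # collapses the columns: one value per row

print(col_sums.shape, col_sums)    # (3,) [5. 7. 9.]
print(row_means.shape, row_means)  # (2,) [2. 5.]

# keepdims=True keeps the reduced axis as a length-1 dimension
row_means_2d = demo.mean(axis=1, keepdims=True)
print(row_means_2d.shape)          # (2, 1)

# The (2, 1) result broadcasts against the (2, 3) array,
# which makes it convenient for centering each row on its own mean
centered = demo - row_means_2d
print(centered)
```

Without `keepdims=True`, the row means have shape `(2,)`, which does not broadcast against a `(2, 3)` array, so the subtraction would raise an error.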


## 4.3.3 Cumulative Methods

Other methods, like `cumsum` and `cumprod`, do not aggregate; instead, they produce an array of the intermediate results.

In [7]:
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7])
print(f"Original array: {arr}")

Original array: [0 1 2 3 4 5 6 7]


In [8]:
# Cumulative sum
print(f"Cumulative sum: {arr.cumsum()}")

Cumulative sum: [ 0  1  3  6 10 15 21 28]


In multidimensional arrays, accumulation functions like `cumsum` return an array of the same shape as the input, with the partial aggregates computed along the indicated axis, one lower-dimensional slice at a time.

In [9]:
arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(f"Original 2D array:\n{arr}")

Original 2D array:
[[0 1 2]
 [3 4 5]
 [6 7 8]]


In [10]:
# Cumulative sum down the rows (axis=0)
print(f"Cumulative sum down rows:\n{arr.cumsum(axis=0)}")

Cumulative sum down rows:
[[ 0  1  2]
 [ 3  5  7]
 [ 9 12 15]]


In [11]:
# Cumulative sum across the columns (axis=1)
print(f"Cumulative sum across columns:\n{arr.cumsum(axis=1)}")

Cumulative sum across columns:
[[ 0  1  3]
 [ 3  7 12]
 [ 6 13 21]]
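
One detail worth checking is what happens when no `axis` is passed to `cumsum` on a 2D array: the input is flattened first, so the result is one-dimensional. The sketch below uses a hypothetical name, `grid`, for the same 3 x 3 values, and also confirms that the last entry along the accumulated axis matches the plain reduction.

```python
import numpy as np

# Same values as the 3 x 3 example, under a hypothetical name
grid = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

# No axis: the array is flattened first, so the running sum is 1-D
flat_running = grid.cumsum()
print(flat_running)               # [ 0  1  3  6 10 15 21 28 36]

# With an axis, the input shape is preserved
running_by_row = grid.cumsum(axis=1)
print(running_by_row.shape)       # (3, 3)

# The final partial sum along an axis equals the ordinary sum along it
last_row = grid.cumsum(axis=0)[-1]
print(last_row)                   # [ 9 12 15]
print(grid.sum(axis=0))           # [ 9 12 15]
```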


## 4.3.4 Other Basic Statistical Methods

In [12]:
arr = rng.standard_normal((3, 5))
print(f"New array:\n{arr}")

New array:
[[ 1.32229806 -0.29969852  0.90291934 -1.62158273 -0.15818926]
 [ 0.44948393 -1.34360107 -0.08168759  1.72473993  2.61815943]
 [ 0.77736134  0.8286332  -0.95898831 -1.20938829 -1.41229201]]


#### `std` and `var`
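Standard deviation and variance. By default both use `ddof=0`, that is, they divide by the number of elements (the population formulas).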

In [13]:
print(f"Standard deviation: {arr.std()}")
print(f"Variance: {arr.var()}")

Standard deviation: 1.2291354305745301
Variance: 1.5107739066936354
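
The `ddof` ("delta degrees of freedom") argument of `std` and `var` controls the divisor, which is `N - ddof`. The following sketch compares the default (`ddof=0`) with the sample estimate (`ddof=1`) on a small hypothetical array, `sample`, whose population variance works out to exactly 4.

```python
import numpy as np

# Hypothetical data: mean is 5, squared deviations sum to 32
sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_var = sample.var()         # default ddof=0: 32 / 8
pop_std = sample.std()         # square root of the population variance
samp_var = sample.var(ddof=1)  # ddof=1: 32 / 7

print(pop_var)   # 4.0
print(pop_std)   # 2.0
print(samp_var)  # roughly 4.571

# The two estimates differ by the factor N / (N - 1)
n = sample.size
print(np.isclose(samp_var, pop_var * n / (n - 1)))  # True
```

Which divisor is appropriate depends on whether the data are treated as a full population or as a sample from a larger one.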


#### `min` and `max`
Minimum and maximum values.

In [14]:
print(f"Minimum value: {arr.min()}")
print(f"Maximum value: {arr.max()}")

Minimum value: -1.6215827341822058
Maximum value: 2.61815942636784


#### `argmin` and `argmax`
Indices of the minimum and maximum elements. Without an `axis` argument, the returned index refers to the flattened array.

In [15]:
print(f"Index of minimum value: {arr.argmin()}")
print(f"Index of maximum value: {arr.argmax()}")

Index of minimum value: 3
Index of maximum value: 9
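
For a 2D array, the single integers printed above count positions in row-major order rather than giving a (row, column) pair. A minimal sketch of converting such a flat index back to coordinates with `np.unravel_index` follows; the array `scores` is a hypothetical example, not data from this section.

```python
import numpy as np

# Hypothetical 2 x 3 array with an obvious maximum at row 1, column 0
scores = np.array([[3, 7, 1],
                   [9, 2, 8]])

flat_idx = scores.argmax()  # position in the flattened array
row, col = np.unravel_index(flat_idx, scores.shape)
print(flat_idx, (int(row), int(col)))  # 3 (1, 0)

# With an axis, argmax returns one index per slice instead
best_per_row = scores.argmax(axis=1)
best_per_col = scores.argmax(axis=0)
print(best_per_row)  # [1 0]
print(best_per_col)  # [1 0 1]
```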


#### `cumprod`
Cumulative product of elements.

In [16]:
arr = np.array([1, 2, 3, 4])
print(f"Cumulative product: {arr.cumprod()}")

Cumulative product: [ 1  2  6 24]
