# Aggregations: min, max, and Everything in Between

Aggregation functions in NumPy reduce an array to summary values, either over the entire array or along a specified axis. Because they run in compiled code, NumPy's versions are typically much faster on large arrays than Python's built-in equivalents such as sum, min, and max.

In [None]:
import numpy as np
rng = np.random.default_rng(seed=1701)

x = rng.random(1000000)

%timeit sum(x)
%timeit np.sum(x)
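
The %timeit lines above are IPython magics and only work inside a notebook or IPython shell. As a rough sketch of the same comparison in a plain Python script, the standard-library timeit module can be used instead. The array size and repeat count here are arbitrary choices for illustration, and the timings will vary by machine.

```python
import timeit

import numpy as np

rng = np.random.default_rng(seed=1701)
x = rng.random(100_000)  # smaller than the notebook's array, to keep the run short

# Both approaches compute the same total, up to floating-point rounding
builtin_total = sum(x)
numpy_total = np.sum(x)
print("Totals agree:", np.isclose(builtin_total, numpy_total))

# timeit.timeit returns the total time in seconds for `number` calls
t_builtin = timeit.timeit(lambda: sum(x), number=5)
t_numpy = timeit.timeit(lambda: np.sum(x), number=5)
print(f"built-in sum: {t_builtin:.4f} s   np.sum: {t_numpy:.4f} s")
```

The built-in sum iterates over the array one element at a time at the Python level, which is where most of its extra cost comes from.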

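Aggregations also accept an axis argument that names the dimension to collapse. For a two-dimensional array, axis=0 collapses the rows and returns one value per column, while axis=1 collapses the columns and returns one value per row.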


In [None]:
M = rng.integers(0, 10, (3,4))

print(M)

print('collapse 0', M.min(axis=0))  # minimum of each column (4 values)
print('collapse 1', M.min(axis=1))  # minimum of each row (3 values)
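
Since M above is random, it may help to see the same idea on a small fixed array where the results can be checked by eye. This is a minimal sketch; the array A is made up for illustration.

```python
import numpy as np

# A fixed 3x4 array (hypothetical values) so the results are easy to verify
A = np.array([[3, 7, 1, 5],
              [4, 2, 8, 6],
              [9, 0, 2, 1]])

col_min = A.min(axis=0)  # collapse the 3 rows: one minimum per column
row_min = A.min(axis=1)  # collapse the 4 columns: one minimum per row

print(col_min, col_min.shape)  # [3 0 1 1] (4,)
print(row_min, row_min.shape)  # [1 2 0] (3,)
print(A.min())                 # 0, the minimum over the whole array
```

The axis named in the call is the one that disappears from the result's shape: a (3, 4) array reduces to shape (4,) with axis=0 and to shape (3,) with axis=1.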

A common use of aggregations is computing summary statistics for a dataset, such as the mean, standard deviation, minimum, maximum, and quantiles.

In [None]:
import pandas as pd
import numpy as np

data = pd.read_csv('./data/president_heights.csv')

heights = data['height(cm)']

print("Mean height:", heights.mean())
# heights is a pandas Series, so .std() defaults to the sample standard deviation (ddof=1)
print("Standard deviation:", heights.std())
print("Minimum height:", heights.min())
print("Maximum height:", heights.max())
print("25th percentile:", np.percentile(heights, 25))
print("Median:", np.median(heights))
print("75th percentile:", np.percentile(heights, 75))
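
One detail worth checking when mixing pandas and NumPy aggregations: a pandas Series computes the sample standard deviation by default (ddof=1), while a NumPy array computes the population standard deviation (ddof=0). The sketch below uses three made-up values, not the presidents data, to show the difference.

```python
import numpy as np
import pandas as pd

values = [170.0, 180.0, 190.0]  # hypothetical heights, for illustration only
s = pd.Series(values)
a = np.array(values)

series_std = s.std()            # pandas default: ddof=1 (sample std)
array_std = a.std()             # NumPy default: ddof=0 (population std)
array_std_ddof1 = a.std(ddof=1) # matches the pandas default

print(series_std)       # 10.0
print(array_std)        # about 8.165
print(array_std_ddof1)  # 10.0

# The median is the 50th percentile
print(np.median(a), np.percentile(a, 50))  # 180.0 180.0
```

Passing ddof explicitly to either library removes the ambiguity when results need to match.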

In [None]:
import matplotlib.pyplot as plt

plt.style.use('seaborn-v0_8-whitegrid')

plt.hist(heights)
plt.title('Height Distribution of US Presidents')
plt.xlabel('height (cm)')
plt.ylabel('count')