# NumPy Follow-Along
Hands-on notebook to accompany today's lecture.

**Goals**
- Practice creating and manipulating NumPy arrays
- Understand broadcasting, slicing, reshaping, and combining
- Move between pandas and NumPy
- Explore a simulation of *nontransitive dice* and visualize results


## Setup

In [None]:
import numpy as np
import pandas as pd
# import matplotlib.pyplot as plt

np.random.seed(386)  # for reproducibility; change this if you like
print(np.__version__)


2.3.3


## Creating Arrays
Try the creation routines discussed in class:
- `np.array`, `np.zeros`, `np.ones`, `np.full`
- `np.arange`, `np.linspace`
- `np.random.rand`, `np.random.randn`


In [3]:
# Your turn: execute and modify these:
a = np.array([1, 2, 3])
b = np.zeros((2, 3))
c = np.ones(5)
d = np.full((3, 3), fill_value=7)
e = np.arange(0, 10, 2)
f = np.linspace(0, 1, 5)
g = np.random.rand(2, 2)
h = np.random.randn(3, 2)

print('a:', a)
print('b:', b)
print('c:', c)
print('d:', d)
print('e:', e)
print('f:', f)
print('g:', g)
print('h:', h)


a: [1 2 3]
b: [[0. 0. 0.]
 [0. 0. 0.]]
c: [1. 1. 1. 1. 1.]
d: [[7 7 7]
 [7 7 7]
 [7 7 7]]
e: [0 2 4 6 8]
f: [0.   0.25 0.5  0.75 1.  ]
g: [[0.73820298 0.44082866]
 [0.83119324 0.10062259]]
h: [[ 0.3312869   0.09959823]
 [-0.93937512 -0.22506129]
 [-0.14231118 -0.43709633]]
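
The cell above draws random numbers with the legacy `np.random.rand` and `np.random.randn` functions. NumPy also provides a `Generator` interface through `np.random.default_rng`; the sketch below shows rough counterparts. The seed 386 is reused from Setup purely for illustration, the variable names are made up here, and the values drawn will not match the legacy output shown above.

```python
import numpy as np

# Assumption: seed 386 mirrors the Setup cell; any integer works.
rng = np.random.default_rng(386)

uniform = rng.random((2, 2))            # uniform on [0, 1), like np.random.rand(2, 2)
normal = rng.standard_normal((3, 2))    # standard normal, like np.random.randn(3, 2)
rolls = rng.integers(1, 7, size=5)      # integers from 1 to 6; the upper bound is exclusive

print(uniform.shape, normal.shape, rolls)
```

Note that the `Generator` methods take the shape as a single tuple, whereas `rand` and `randn` take each dimension as a separate argument.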


## Element-wise Operations & Comparisons

In [4]:
arr = np.array([10, 20, 30, 40])
arr2 = np.array([1, 2, 3, 4])

print('arr + arr2 ->', arr + arr2)
print('arr * 2 ->', arr * 2)
print('arr ** 2 ->', arr ** 2)
print('arr > 15 ->', arr > 15)


arr + arr2 -> [11 22 33 44]
arr * 2 -> [20 40 60 80]
arr ** 2 -> [ 100  400  900 1600]
arr > 15 -> [False  True  True  True]
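
A comparison such as `arr > 15` returns a boolean array, which can then be used as a mask. The following is a minimal sketch of three common uses; the names `mask`, `selected`, `count`, and `replaced` are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
mask = arr > 15                      # boolean array with the same shape as arr

selected = arr[mask]                 # keep only the elements where mask is True
count = mask.sum()                   # True counts as 1, so this counts matches
replaced = np.where(mask, arr, 0)    # keep arr where True, otherwise use 0

print(selected)   # -> [20 30 40]
print(count)      # -> 3
print(replaced)   # -> [ 0 20 30 40]
```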


## Broadcasting

In [4]:
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
b = np.array([10., 20., 30.])

print('A shape:', A.shape, 'b shape:', b.shape)
print('A + b ->')
print(A + b)


A shape: (2, 3) b shape: (3,)
A + b ->
[[11. 22. 33.]
 [14. 25. 36.]]
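
In the cell above, `b` has shape `(3,)`, which lines up with the last axis of `A`, so it is added to every row. A one-dimensional array with one entry per row does not line up the same way. The sketch below, using a hypothetical `row_offsets` array, shows one way to handle that case by adding a trailing axis with `np.newaxis`.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
row_offsets = np.array([100., 200.])   # hypothetical per-row offsets, shape (2,)

# Shapes are compared from the trailing axis: (2, 3) vs (2,) puts 3 against 2,
# so adding them directly raises a ValueError.
try:
    A + row_offsets
    direct_add_failed = False
except ValueError:
    direct_add_failed = True

# A length-1 trailing axis gives shape (2, 1), which stretches across the columns.
result = A + row_offsets[:, np.newaxis]
print(result)
# -> [[101. 102. 103.]
#     [204. 205. 206.]]
```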


## Slicing & Fancy Indexing

In [9]:
arr = np.arange(10)
print('arr:', arr)
print('arr[2:6]:', arr[2:6])
print('arr[1:9:2]:', arr[1:9:2])

mat = np.arange(9).reshape(3, 3)
print('\nmat:\n', mat)
print('mat[0:2, 1:3]:\n', mat[0:2, 1:3])

indices = np.array([0, 2])
print('Fancy indexing mat[[0,2], :]:\n', mat[indices, :])


arr: [0 1 2 3 4 5 6 7 8 9]
arr[2:6]: [2 3 4 5]
arr[1:9:2]: [1 3 5 7]

mat:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
mat[0:2, 1:3]:
 [[1 2]
 [4 5]]
Fancy indexing mat[[0,2], :]:
 [[0 1 2]
 [6 7 8]]
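
One difference between the two indexing styles above is worth checking by hand: a basic slice returns a view onto the original data, while fancy indexing returns a copy. A short sketch, with illustrative variable names, that can be run to confirm this:

```python
import numpy as np

arr = np.arange(10)

view = arr[2:6]          # basic slice: a view onto arr's memory
view[0] = 99             # writes through to arr[2]

fancy = arr[[0, 2]]      # fancy indexing: an independent copy
fancy[0] = -1            # arr[0] is unaffected

print(arr)                              # -> [ 0  1 99  3  4  5  6  7  8  9]
print(np.shares_memory(arr, view))      # -> True
print(np.shares_memory(arr, fancy))     # -> False
```

If an independent slice is needed, calling `.copy()` on it (for example `arr[2:6].copy()`) avoids modifying the original by accident.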


## Reshaping & Combining

In [11]:
x = np.arange(12)
print('x:\n', x)
X = np.reshape(x, (3, 4))
print('X:\n', X)

print('\nFlattened:', X.flatten())

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6]])
print('\nnp.concatenate((A,B), axis=0):\n', np.concatenate((A, B), axis=0))

C = np.array([[7, 8], [9, 10]])
print('\nnp.vstack((A, C)):\n', np.vstack((A, C)))
print('\nnp.hstack((A, C)):\n', np.hstack((A, C)))


x:
 [ 0  1  2  3  4  5  6  7  8  9 10 11]
X:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Flattened: [ 0  1  2  3  4  5  6  7  8  9 10 11]

np.concatenate((A,B), axis=0):
 [[1 2]
 [3 4]
 [5 6]]

np.vstack((A, C)):
 [[ 1  2]
 [ 3  4]
 [ 7  8]
 [ 9 10]]

np.hstack((A, C)):
 [[ 1  2  7  8]
 [ 3  4  9 10]]
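
Two details of reshaping that the cell above does not show: a dimension of `-1` asks NumPy to infer that size, and `ravel` differs from `flatten` in that it returns a view when it can. The sketch below is one way to see both; variable names are illustrative.

```python
import numpy as np

x = np.arange(12)
X = x.reshape(3, -1)         # -1 is inferred as 4, since 12 / 3 = 4
print(X.shape)               # -> (3, 4)

flat_copy = X.flatten()      # always returns a copy
flat_view = X.ravel()        # returns a view here, because X is contiguous

print(np.shares_memory(X, flat_copy))   # -> False
print(np.shares_memory(X, flat_view))   # -> True
```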


## From pandas to NumPy (and back)

In [18]:
df = pd.DataFrame({
    'height_cm': [172, 165, 181, 190, 175],
    'weight_kg': [70, 60, 82, 90, 72]
})
print(df)

arr = df.to_numpy()
print('\nUnderlying NumPy array:\n', arr)

df2 = pd.DataFrame(arr, columns=df.columns)
print('\nRound trip back to pandas:\n', df2)

print('\ndf == df2?', df.equals(df2))


   height_cm  weight_kg
0        172         70
1        165         60
2        181         82
3        190         90
4        175         72

Underlying NumPy array:
 [[172  70]
 [165  60]
 [181  82]
 [190  90]
 [175  72]]

Round trip back to pandas:
    height_cm  weight_kg
0        172         70
1        165         60
2        181         82
3        190         90
4        175         72

df == df2? True
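
The round trip above is lossless because both columns are integers. When column dtypes differ, `to_numpy()` has to pick a single dtype for the whole array. The sketch below uses two small, made-up DataFrames (`numeric` and `mixed` are hypothetical names, and the values are invented for illustration) to show the effect.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one integer column and one float column.
numeric = pd.DataFrame({'height_cm': [172, 165], 'weight_kg': [70.5, 60.0]})
print(numeric.to_numpy().dtype)   # -> float64 (integers are upcast to float)

# Hypothetical data: a string column next to an integer column.
mixed = pd.DataFrame({'name': ['a', 'b'], 'height_cm': [172, 165]})
print(mixed.to_numpy().dtype)     # -> object (no common numeric dtype)
```

An `object` array loses most of NumPy's speed advantages, so it is often worth selecting the numeric columns before converting.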


## Quick Aggregations

In [19]:
print('Column means:', arr.mean(axis=0))
print('Column stds:', arr.std(axis=0))
print('Column percentiles (50th):', np.percentile(arr, 50, axis=0))

Column means: [176.6  74.8]
Column stds: [ 8.45221864 10.32279032]
Column percentiles (50th): [175.  72.]
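
These column statistics combine naturally with broadcasting to standardize each column (z-scores). The following is a minimal sketch that rebuilds the same height and weight values as a plain array so it runs on its own. Note that `arr.std` defaults to the population formula (`ddof=0`), while `ddof=1` gives the sample standard deviation.

```python
import numpy as np

# Same values as the height_cm / weight_kg DataFrame above.
arr = np.array([[172, 70],
                [165, 60],
                [181, 82],
                [190, 90],
                [175, 72]])

mean = arr.mean(axis=0)                 # shape (2,), one mean per column
std_pop = arr.std(axis=0)               # population std (ddof=0), matches the output above
std_sample = arr.std(axis=0, ddof=1)    # sample std, slightly larger

# (5, 2) minus (2,) broadcasts across rows, so each column is centred and scaled.
z = (arr - mean) / std_pop

print(np.allclose(z.mean(axis=0), 0))   # -> True
print(np.allclose(z.std(axis=0), 1))    # -> True
```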
