# NumPy Basics

NumPy (short for Numerical Python) provides an efficient interface to store and operate on arrays. NumPy arrays form the core of nearly the entire ecosystem of data science tools in Python, so time spent learning to use NumPy effectively will be valuable no matter what aspect of data science interests you.

In [None]:
import numpy as np

### Creating NumPy Arrays

In [None]:
losses = np.array([[100, 200, 300],
                   [125, 195, np.nan],
                   [125, np.nan, np.nan]])
print(losses)

### NumPy Array Attributes

In [None]:
losses.ndim

In [None]:
losses.shape

In [None]:
losses.size
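
For reference, here is a minimal sketch of what these attributes report for the `losses` array defined earlier. The array is re-created so the block runs on its own. The `dtype` check is an addition that does not appear in the cells above.

```python
import numpy as np

losses = np.array([[100, 200, 300],
                   [125, 195, np.nan],
                   [125, np.nan, np.nan]])

print(losses.ndim)   # number of axes: 2
print(losses.shape)  # length along each axis: (3, 3)
print(losses.size)   # total number of elements: 9
print(losses.dtype)  # float64, because np.nan is a float
```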

### NumPy Array Indexing and Slicing

In [None]:
# select [row, column]
losses[0, 1]

In [None]:
# select first row
losses[0, :]

In [None]:
# select last column
losses[:, 2]

In [None]:
# use negative indexing to select the last column
losses[:, -1]

In [None]:
# select the first two columns
losses[:, 0:2]

In [None]:
# use a negative stop index to select every column except the last (here, the first two)
losses[:, :-1]

### NumPy Array Arithmetic

In [None]:
# multiply by scalar
loss_trend = 1.05
losses * loss_trend

In [None]:
# arithmetic is element-wise
counts = np.array([[10, 12, 15],
                   [10, 12, np.nan],
                   [10, np.nan, np.nan]])
severity = losses / counts
print(severity)
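
The arrays above contain `np.nan` in the cells with no data yet. As a small sketch (the `np.isnan` check is an addition, not part of the original cells), the block below shows that those cells remain `nan` after the division while the observed cells hold ordinary element-wise quotients.

```python
import numpy as np

losses = np.array([[100, 200, 300],
                   [125, 195, np.nan],
                   [125, np.nan, np.nan]])
counts = np.array([[10, 12, 15],
                   [10, 12, np.nan],
                   [10, np.nan, np.nan]])

severity = losses / counts

# nan / nan is nan, so the missing cells stay missing
missing = np.isnan(severity)
print(missing)

# the observed cells are plain element-wise quotients
print(severity[0, 0])  # 100 / 10 = 10.0
```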

### Broadcasting

Broadcasting is a set of rules for applying element-wise operations to arrays of different shapes.

In [None]:
premiums = np.array([[400, 400, 400],
                     [450, 450, 450],
                     [475, 475, 475]])

loss_ratio = losses / premiums
print(loss_ratio)

Repeating each premium across a full row is cumbersome. We would rather store a single column of premiums, one value per row.

In [None]:
premiums = np.array([[400],
                     [450],
                     [475]])
premiums.shape

In [None]:
losses / premiums

Rules of Broadcasting:
1. If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
2. If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
3. If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
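
The three rules can be sketched with small illustrative arrays. The names `a`, `b`, and `c` are hypothetical and are not part of the loss data used elsewhere in this notebook.

```python
import numpy as np

a = np.ones((3, 3))

# Rule 1: b has shape (3,), which is padded on the left to (1, 3).
# Rule 2: the length-1 axis is then stretched, giving shape (3, 3).
b = np.array([1.0, 2.0, 3.0])
result = a + b
print(result.shape)  # (3, 3)

# Rule 3: shapes (3, 3) and (2,) disagree in the last axis and neither size is 1,
# so NumPy raises a ValueError.
c = np.array([1.0, 2.0])
rule_3_error = None
try:
    a + c
except ValueError as err:
    rule_3_error = str(err)
    print(rule_3_error)
```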

In [None]:
# both arrays have the same number of dimensions, so rule #1 doesn't apply
print(losses.ndim)
print(premiums.ndim)

In [None]:
# rule #2: premiums has size 1 along its second axis, so it is stretched from (3, 1) to (3, 3)
print(losses.shape)
print(premiums.shape)
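
To make the stretching visible, one option is `np.broadcast_to`, which returns a read-only view of an array expanded to a target shape. This is a sketch for illustration only: the division works without it, and the name `stretched` is introduced here.

```python
import numpy as np

losses = np.array([[100, 200, 300],
                   [125, 195, np.nan],
                   [125, np.nan, np.nan]])
premiums = np.array([[400],
                     [450],
                     [475]])

# Explicitly stretch premiums from (3, 1) to (3, 3); each row repeats its single value.
stretched = np.broadcast_to(premiums, losses.shape)
print(stretched)

# Dividing by the stretched array matches dividing by the (3, 1) column directly.
same = np.allclose(losses / premiums, losses / stretched, equal_nan=True)
print(same)
```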

## Further Reading
- NumPy docs
- Python Data Science Handbook