# Intro to NumPy

## NumPy Basics, Continued

The examples here use trivial "datasets" for clarity and simplicity. In real-world use, NumPy can easily handle arrays with 1 million elements (e.g. a 1000 x 1000 matrix). With careful memory management, it can handle arrays 100 to 1000 times larger than that.

Note that these tools are complex and this simplified introduction glosses over some important nuances that we will elaborate on later, as required.

Much of the material in this section is adapted from Chapter 4 of *Python for Data Analysis* (McKinney 2022).

### Import

By convention, NumPy is usually imported under the shorthand alias `np`.

In [None]:
import numpy as np

### Arithmetic with Arrays

There is no need for loops when operating on an `ndarray`, because arithmetic operations are **vectorized**. This means that they take advantage of modern CPU architecture to execute the same operation on multiple data elements simultaneously, processing entire arrays in compiled C code rather than interpreting Python loops element by element.
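
As a rough illustration of what vectorization replaces, the sketch below computes the same result twice: once with an explicit Python loop and once with a single array expression. The array `values` and the formula `x * 2 + 1` are arbitrary choices made for this example.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Loop version: visit each element one at a time
looped = np.empty_like(values)
for i in range(len(values)):
    looped[i] = values[i] * 2 + 1

# Vectorized version: one expression applied to the whole array
vectorized = values * 2 + 1

print(looped)
print(vectorized)
print(np.array_equal(looped, vectorized))  # both approaches agree
```

Both versions give the same numbers; the vectorized form is shorter and, on large arrays, typically much faster.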

Any arithmetic operation between equal-size arrays is applied element-wise.

In [None]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])

arr * arr

In [None]:
arr - arr

Operations with scalars propagate the scalar argument to each element in the array. This uses a process called **broadcasting**, which we will explore in more detail later.

In [None]:
1 / arr

In [None]:
arr ** 2
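
One way to picture the scalar case is to imagine the scalar stretched into an array of the same shape. The sketch below builds that array explicitly with `np.full_like` and checks that both routes give the same answer. The name `tens` is made up for this example, and NumPy does not need to build such an array itself.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

# An array with the same shape as arr, holding 10 in every position
tens = np.full_like(arr, 10)

print(arr + 10)    # scalar applied to every element
print(arr + tens)  # explicit element-wise addition
print(np.array_equal(arr + 10, arr + tens))
```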

Matrix multiplication is fundamental to working with this sort of data, and is supported with the `@` operator.

In [None]:
a1 = np.array([1, 2])

a1 @ arr
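
Matrix multiplication requires the inner dimensions to match, which is a different rule from the equal-shape requirement of element-wise `*`. The sketch below multiplies `arr` by its transpose as one example of compatible shapes; choosing `arr.T` as the second operand is just a convenient illustration.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])  # shape (2, 3)

# arr.T has shape (3, 2), so (2, 3) @ (3, 2) gives a (2, 2) result
product = arr @ arr.T
print(product)
print(product.shape)

# Element-wise multiplication keeps the original (2, 3) shape
print((arr * arr).shape)

# np.matmul is the function form of the @ operator
print(np.array_equal(product, np.matmul(arr, arr.T)))
```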

Comparisons between arrays of the same size yield Boolean arrays.

In [None]:
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

arr2 > arr
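
Because `True` counts as 1 and `False` as 0, a Boolean array can be summarized directly. The short sketch below shows a few common summaries; the variable name `mask` is an assumption used here for readability.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

mask = arr2 > arr
print(mask)

# Count how many elements of arr2 exceed the matching element of arr
print(mask.sum())

# Check whether any, or all, of the comparisons are True
print(mask.any(), mask.all())
```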

### Create Arrays from Scratch

A number of functions exist to create arrays from scratch.

`np.zeros` and `np.ones` will create an array of the specified shape filled with zeros, or ones, respectively.

In [None]:
np.zeros(10)

In [None]:
np.ones((3, 6))  # note inner parens!

Note: to create an array with 2 or more dimensions, you must pass the desired shape as a tuple. The tuple **must** be enclosed in its own parentheses so that it is not interpreted as multiple arguments.

In [None]:
# Raises TypeError: 3 and 6 are passed as two separate arguments,
# so 6 is read as the dtype rather than as part of the shape
np.ones(3, 6)

The resulting `TypeError` shows that `6` is being interpreted as the second argument, `dtype`.

In [None]:
help(np.ones)
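
Spelling out the keyword names can make the roles of the two arguments easier to see. The following is a small sketch, not the only way to call `np.ones`; the names `grid`, `int_grid`, and `caught` are invented for this example.

```python
import numpy as np

# The shape is a single tuple argument
grid = np.ones(shape=(3, 6))
print(grid.shape, grid.dtype)

# The second argument, dtype, controls the element type
int_grid = np.ones((3, 6), dtype=np.int64)
print(int_grid.dtype)

# Without the inner parentheses, 6 lands in the dtype slot and is rejected
caught = False
try:
    np.ones(3, 6)
except TypeError as err:
    caught = True
    print("TypeError:", err)
```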

To generate an array filled with a range of values, use `np.arange` with start, stop, and step parameters, which correspond to those of `range()` in base Python. `arange` only produces 1D arrays, but you can reshape the result into higher-dimensional arrays, as we'll see later.

In [None]:
np.arange(10)

In [None]:
np.arange(0, 1, 0.1)

Be careful of floating-point precision issues when using `np.arange`. For exact fractional steps, `np.linspace` is often a more reliable way to create evenly spaced divisions:

In [None]:
# 0 to 1 in 11 evenly spaced points
np.linspace(0, 1, 11)
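
The two functions differ in what you specify: `np.arange` takes a step size and excludes the stop value, while `np.linspace` takes a number of points and includes both endpoints by default. The sketch below puts them side by side; `retstep=True` is an optional `np.linspace` argument that also returns the spacing it used.

```python
import numpy as np

# arange derives the number of elements from (stop - start) / step,
# so rounding error in a fractional step can affect the length
stepped = np.arange(0, 1, 0.1)
print(len(stepped), stepped[-1])

# linspace is told how many points to produce and includes the endpoint
points, spacing = np.linspace(0, 1, 11, retstep=True)
print(len(points), points[0], points[-1], spacing)
```

When the number of points matters more than the exact step, `np.linspace` tends to be the safer choice.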

Several other basic array creation functions exist. The most commonly used are summarized in the following table.

![NumPy Array Creation (McKinney Table 4-1)](images/03a-numpy-array-creation.png)

Other functions create arrays of procedurally generated data. For example, you can easily generate arrays of arbitrary size filled with random samples from various distributions. We will explore this further later.

In [None]:
# Random values
print("\nRandom uniform [0,1):")
print(np.random.random((2, 3)))

print("\nRandom normal (mean=0, std=1):")
print(np.random.normal(size=(2, 3)))
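
Random output changes on every run, which can make examples hard to reproduce. One possible approach, sketched below, is to create a seeded generator with `np.random.default_rng`. This is a different interface from the `np.random.random` calls above, and the seed value `42` is an arbitrary assumption.

```python
import numpy as np

# Two generators created with the same seed produce the same samples
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.random((2, 3))  # uniform samples in [0, 1)
sample_b = rng_b.random((2, 3))

print(sample_a.shape)
print(np.array_equal(sample_a, sample_b))
```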

### Reshaping

We will sometimes need to convert data between row and column form or reshape it in different ways without changing the total number of elements.

In [None]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(arr.shape)

In [None]:
print("As row:", arr.reshape(1, 12))      # one row
print("As column:\n", arr.reshape(12, 1))   # one column
print("As table:\n", arr.reshape(3, 4))     # 3 rows, 4 cols
print("As cube:\n", arr.reshape(2, 2, 3))   # 2x2x3 3D array
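
Reshaping also combines naturally with `np.arange` to build small multi-dimensional arrays. The sketch below additionally uses `-1`, which asks NumPy to infer one dimension from the total number of elements, and shows what happens when the requested shape does not fit. The variable names are made up for this example.

```python
import numpy as np

# arange produces a 1D array; reshape turns it into a 3 x 4 table
table = np.arange(1, 13).reshape(3, 4)
print(table)

# -1 lets NumPy work out the missing dimension (12 elements / 2 rows)
inferred = np.arange(1, 13).reshape(2, -1)
print(inferred.shape)

# A shape whose dimensions do not multiply to 12 is rejected
mismatch = False
try:
    np.arange(1, 13).reshape(5, 3)
except ValueError as err:
    mismatch = True
    print("ValueError:", err)
```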