# An introduction to NumPy

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object (ndarray), and tools for working with these arrays. You might find this [quickstart](https://numpy.org/doc/stable/user/quickstart.html) guide useful to get started with Numpy.

To use Numpy, you will need to install it. See [here](https://numpy.org/install/) for instructions.

Then, to use it in your program, you will need to import the `numpy` package:

In [1]:
import numpy as np  # The convention is to import as np

## Creating arrays

A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The shape of an array is a tuple of integers giving the size of the array along each dimension.
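As a minimal sketch of these terms (the array values here are made up for illustration), the attributes `ndim`, `shape`, and `size` describe the grid, and a tuple of integers selects a single element:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])  # hypothetical example values

print(grid.ndim)   # number of dimensions (axes)
print(grid.shape)  # size of the array along each dimension
print(grid.size)   # total number of elements
print(grid[1, 2])  # element at row 1, column 2 (indices start at 0)
```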

There are a few ways to create arrays:

1. Conversion from other Python structures (e.g., lists and tuples)

2. Intrinsic NumPy array creation functions (e.g. arange, ones, zeros, etc.)

3. Replicating, joining, or mutating existing arrays

4. Reading arrays from disk, either from standard or custom formats

5. Creating arrays from raw bytes through the use of strings or buffers

6. Use of special library functions (e.g., random)

Let's take a look at the first three methods:

### Create from data structures
1. We can initialize numpy arrays from Python lists, and access elements using square brackets:

In [None]:
a = np.array([1, 2, 3])  # Create a 1D array
print(a)

[1 2 3]


In [None]:
print(type(a), a.shape)  # We can see it is a 'numpy.ndarray' type with shape (3,)

<class 'numpy.ndarray'> (3,)


In [None]:
mylist = [1, 5, 2, 6, 89, 43, 5]
myarray = np.array(mylist)
print(myarray)

[ 1  5  2  6 89 43  5]


In [None]:
b = np.array([[1,2,3],[4,5,6]])   # Create an array with two axes
print(b)

[[1 2 3]
 [4 5 6]]


In [None]:
print(b.shape)  # There are two ways to get the shape
print(np.shape(b))
print(b[0, 0], b[0, 1], b[1, 0])

(2, 3)
(2, 3)
1 2 4


Create an array with a shape of (4, 2). Verify this by printing its shape below.

In [None]:
# your code here


### Array creation functions
2. Numpy also provides a few functions to create arrays. Try modifying the parameters of these functions below to create different arrays and understand how they work.

In [None]:
# Create arrays with regularly incremented values (use with integers)
a = np.arange(10) # stop value only
print(a)
b = np.arange(0,15,3) # start, stop, increment values
print(b)
c = np.arange(0,6) # start, stop values
print(c)

[0 1 2 3 4 5 6 7 8 9]
[ 0  3  6  9 12]
[0 1 2 3 4 5]


In [None]:
# Create an array with a specified number of elements, spaced equally
# between the specified start and end values (linspace is best used with floats)
a = np.linspace(1., 4., 6)
print(a)

[1.  1.6 2.2 2.8 3.4 4. ]


In [None]:
a = np.zeros((2,2))  # Create an array of all zeros
print(a)

[[0. 0.]
 [0. 0.]]


In [None]:
b = np.ones((1,2))   # Create an array of all ones
print(b)

[[1. 1.]]


In [None]:
c = np.full((2,2), 7)  # Create a constant array
print(c)

[[7 7]
 [7 7]]


In [None]:
d = np.eye(2)  # Create a 2x2 identity array
print(d)

[[1. 0.]
 [0. 1.]]


In [None]:
from numpy.random import default_rng  # Import the random number generator
e = default_rng().random((2,3))  # Create an array filled with random values
print(e)

[[0.05337115 0.61307726 0.34020867]
 [0.54042865 0.28588556 0.09077572]]


### Replicate/join/change existing arrays
3. We can replicate, join, and change existing arrays to create new arrays. The key here is that a slice of an array is a **view** into the same data, so modifying it will modify the original array.

In [None]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

b = a[:2, 1:3]
print(b)
b[0, 0] = 77    # b[0, 0] is the same piece of data as a[0, 1]
print(a)        # note that a[0, 1] is changed

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[2 3]
 [6 7]]
[[ 1 77  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


To avoid this, we can copy the values from the slice into a new array:

In [None]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

b = a[:2, 1:3].copy()
b[0,1] = 77
print(b) # b is changed
print(a) # a remains unchanged

[[ 2 77]
 [ 6  7]]
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
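One way to check whether two arrays share the same underlying data is `np.shares_memory`. The sketch below reuses the array from the cells above; the names `view` and `copy` are chosen here only for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]         # basic slicing returns a view of a
copy = a[:2, 1:3].copy()  # .copy() allocates new, independent memory

print(np.shares_memory(a, view))  # the view shares data with a
print(np.shares_memory(a, copy))  # the copy does not
```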


Below are a few additional examples for creating new arrays from existing arrays:

In [None]:
# concatenate two arrays together
a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])
a3 = np.concatenate((a1, a2)) # np.concatenate((a1, a2...), axis=0)
print(a3)

Specifying an axis outside of the shape will result in an error:


In [None]:
# attempt to concatenate two 1D arrays along a second axis
a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])
a3 = np.concatenate((a1, a2), axis=1)  # raises an AxisError: 1D arrays only have axis 0
print(a3)
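For `axis=1` to be valid, the inputs need at least two dimensions. The following sketch catches the error and then shows one possible workaround, reshaping each array to shape `(1, 3)` first; the variable names are illustrative only:

```python
import numpy as np

a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])

try:
    np.concatenate((a1, a2), axis=1)
    failed = False
except ValueError as err:  # NumPy's AxisError is a subclass of ValueError
    failed = True
    print("concatenate failed:", err)

# Reshape the 1D arrays into 2D arrays with a single row
row1 = a1.reshape(1, 3)
row2 = a2.reshape(1, 3)

side_by_side = np.concatenate((row1, row2), axis=1)  # shape (1, 6)
stacked = np.concatenate((row1, row2), axis=0)       # shape (2, 3)
print(side_by_side)
print(stacked)
```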

Instead, we can vertically stack arrays:

In [None]:
# vertically stack two arrays
a1 = np.array([0,0])
a2 = np.array([1,1])
a3 = np.vstack((a1,a2))
print(a3)

[[0 0]
 [1 1]]


We can also horizontally stack arrays:

In [None]:
# horizontally stack two arrays
a1 = np.array([0,0])
a2 = np.array([1,1])
a3 = np.hstack((a1,a2))
print(a3)

[0 0 1 1]


In [None]:
# use block to combine arrays
a1 = np.array([0,0])
a2 = np.array([1,1])
a3 = np.block([[a1,a2], [a1,a2]])
print(a3)

[[0 0 1 1]
 [0 0 1 1]]
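The remaining creation methods from the list at the start of this section (4 to 6) are not covered above. As a rough sketch, assuming a temporary file named `example.npy` (a hypothetical name) and a fixed seed chosen only for reproducibility, they could look like this:

```python
import os
import tempfile

import numpy as np

# Method 4: write an array to disk in NumPy's .npy format and read it back
original = np.arange(6).reshape(2, 3)
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.npy")  # hypothetical file name
    np.save(path, original)
    loaded = np.load(path)
print(np.array_equal(original, loaded))

# Method 5: interpret raw bytes as an array of unsigned 8-bit integers
raw = bytes([1, 2, 3, 4])
from_bytes = np.frombuffer(raw, dtype=np.uint8)
print(from_bytes)

# Method 6: draw random integers in [0, 10) from a seeded generator
rng = np.random.default_rng(seed=0)
samples = rng.integers(0, 10, size=5)
print(samples.shape)
```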


## Array indexing and slicing

Numpy offers several ways to index into arrays.

Slicing: Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you can specify a slice for each dimension of the array (any trailing dimensions you leave out are taken in full):

In [None]:
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
print(b)

[[2 3]
 [6 7]]


There are several ways of accessing the data in the middle row of the array.
Mixing integer indexing with slices yields an array of lower dimension,
while using only slices (or a list of indices) yields an array with the same
number of dimensions as the original array:

In [None]:
# Create the following rank 2 array with shape (3, 4)
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [None]:
row_r1 = a[1, :]    # 1D view of the second row of a
row_r2 = a[1:2, :]  # 2D view of the second row of a
row_r3 = a[[1], :]  # 2D copy of the second row of a (indexing with a list copies the data)
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)
print(row_r3, row_r3.shape)

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)
[[5 6 7 8]] (1, 4)
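Although slicing and indexing with a list can return arrays that look identical, they differ in whether the result shares data with the original. A small sketch using `np.shares_memory` (the names `row_slice` and `row_list` are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_slice = a[1:2, :]  # basic slicing: a view into a
row_list = a[[1], :]   # indexing with a list of integers: a copy

print(np.shares_memory(a, row_slice))
print(np.shares_memory(a, row_list))

row_list[0, 0] = -1    # modifying the copy leaves a untouched
print(a[1, 0])
```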


In [None]:
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)
print()
print(col_r2, col_r2.shape)

[ 2  6 10] (3,)

[[ 2]
 [ 6]
 [10]] (3, 1)


Boolean array indexing: Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition. Here is an example:

In [None]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print(bool_idx)

[[False False]
 [ True  True]
 [ True  True]]


In [None]:
# We use boolean array indexing to construct a 1D array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])

# We can do all of the above in a single concise statement:
print(a[a > 2])

[3 4 5 6]
[3 4 5 6]
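Boolean masks can also be combined and used for assignment. The sketch below is an illustration with made-up thresholds; note that conditions are combined with `&` and `|` (not `and`/`or`), and each condition needs its own parentheses:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Select elements that are greater than 1 AND less than 6
in_range = (a > 1) & (a < 6)
print(a[in_range])

# Use a mask on the left-hand side to overwrite selected elements
clipped = a.copy()          # work on a copy so a stays unchanged
clipped[clipped > 4] = 4    # cap every value above 4
print(clipped)
```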


For brevity, a lot of details about advanced numpy array indexing have been left out; if you want to know more you should read the [documentation](https://numpy.org/doc/stable/user/basics.indexing.html).

## Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here are some examples:

In [None]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

print(x.dtype, y.dtype, z.dtype)

int64 float64 int64
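The datatype also affects how values are stored and converted. As a short sketch with made-up values: mixing integers and floats produces a float array, and `astype` converts an existing array to another type (converting floats to integers drops the fractional part):

```python
import numpy as np

mixed = np.array([1, 2.5])  # integers and floats together are stored as floats
print(mixed.dtype)

values = np.array([1.7, -2.2, 3.9])
as_int = values.astype(np.int64)  # fractional part is discarded (truncation toward zero)
print(as_int)

small = np.zeros(3, dtype=np.float32)  # request a 32-bit float explicitly
print(small.dtype, small.itemsize)     # itemsize is the number of bytes per element
```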


You can read all about numpy datatypes in the [documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).

## Array math

Basic mathematical functions operate elementwise on arrays, and are available both as operators and as functions in the numpy module:

In [None]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum
print(x + y)
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


In [None]:
# Elementwise difference
print(x - y)
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [None]:
# Elementwise product
print(x * y)
print(np.multiply(x, y))

[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]


In [None]:
# Elementwise division; produces the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [None]:
# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]
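Because `*` is elementwise, it is not matrix multiplication. For comparison, a brief sketch using the `@` operator (equivalent to `np.matmul`) on the same two arrays:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

elementwise = x * y     # multiplies matching entries
matrix_product = x @ y  # rows of x times columns of y

print(elementwise)
print(matrix_product)
```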


Numpy provides many useful functions for performing computations on arrays; one of the most useful is `sum`:

In [None]:
x = np.array([[1,2],[3,4]])

print(np.sum(x))  # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1))  # Compute sum of each row; prints "[3 7]"

10
[4 6]
[3 7]
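Other reductions accept the same `axis` argument as `sum`. A small sketch with `mean` and `max` on the same array:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

overall_mean = np.mean(x)       # mean of all elements
col_means = np.mean(x, axis=0)  # mean of each column
row_max = np.max(x, axis=1)     # maximum of each row
col_sums = x.sum(axis=0)        # method form of np.sum

print(overall_mean)
print(col_means)
print(row_max)
print(col_sums)
```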


You can find the full list of mathematical functions provided by numpy in the [documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).


## Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

For example, suppose that we want to add a constant vector to each row of a grid of numbers. We could do it like this:

In [None]:
# We will add the vector v to each row of the array x,
# storing the result in the array y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty array with the same shape as x

# Add the vector v to each row of the array x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

print(y)

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


This works; however, when the array `x` is very large, running an explicit loop in Python can be slow.
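One loop-free alternative, sketched here as an illustration, is to stack copies of `v` into an array `vv` (a name chosen for this example) with `np.tile` and then add `x` and `vv` elementwise. This avoids the Python loop but stores every copy of `v` in memory:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])

vv = np.tile(v, (4, 1))  # Stack 4 copies of v on top of each other
print(vv)

y = x + vv               # Elementwise sum of two (4, 3) arrays
print(y)
```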

Numpy broadcasting allows us to perform this computation without actually creating multiple copies of v. Consider this version, using broadcasting:

In [None]:
import numpy as np

# We will add the vector v to each row of the array x,
# storing the result in the array y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


The line `y = x + v` works even though `x` has shape `(4, 3)` and `v` has shape `(3,)` due to broadcasting; this line works as if v actually had shape `(4, 3)`, where each row was a copy of `v`, and the sum was performed elementwise.

Broadcasting two arrays together follows this rule:

**In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.**

More precisely, NumPy compares the shapes of the two arrays using the following rules:

Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
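These rules can be tried out on shapes alone. The sketch below assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available; the shapes are arbitrary examples:

```python
import numpy as np

# Rule 1: (3,) is padded to (1, 3); Rule 2: the 1 is stretched to 4
shape_a = np.broadcast_shapes((4, 3), (3,))
# Both shapes are stretched: (3, 1) and (1, 3) give (3, 3)
shape_b = np.broadcast_shapes((3, 1), (3,))
# (4, 1) is padded to (1, 4, 1), then stretched against (2, 1, 5)
shape_c = np.broadcast_shapes((2, 1, 5), (4, 1))
print(shape_a, shape_b, shape_c)

# Rule 3: (4,) is padded to (1, 4); 3 and 4 disagree and neither is 1
try:
    np.broadcast_shapes((4, 3), (4,))
    incompatible = False
except ValueError as err:
    incompatible = True
    print("cannot broadcast:", err)
```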


If this explanation does not make sense, try reading the explanation from the [documentation](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the [documentation](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).

Here are some applications of broadcasting:

In [None]:
# Add a vector to each row of an array
x = np.array([[1,2,3], [4,5,6]])
v = np.array([1, 0, 1])

# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:

print(x + v)

[[2 2 4]
 [5 5 7]]


In [None]:
# Multiply an array by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
print(x * 2)

[[ 2  4  6]
 [ 8 10 12]]


While these examples are relatively easy to understand, more complicated cases can involve broadcasting of both arrays. Consider the following example:

In [None]:
a = np.arange(3)
b = np.arange(3)[:, np.newaxis]

print(a)
print(b)
print(a + b) # We've "stretched" *both* ``a`` and ``b`` to match a common shape,
           # and the result is a two-dimensional array

[0 1 2]
[[0]
 [1]
 [2]]
[[0 1 2]
 [1 2 3]
 [2 3 4]]
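Two further applications, given here as a sketch with made-up data: subtracting the column means from every row, and dividing every row by its own sum. The second case uses `keepdims=True` so that the row sums keep shape `(2, 1)` and broadcast across the columns:

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Center each column: (2, 3) minus (3,) broadcasts to (2, 3)
col_means = x.mean(axis=0)
centered = x - col_means
print(centered)

# Normalize each row: (2, 3) divided by (2, 1) broadcasts to (2, 3)
row_sums = x.sum(axis=1, keepdims=True)
fractions = x / row_sums
print(fractions)
```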


Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.

This brief overview has touched on many of the important things that you need to know about numpy, but is far from complete. Check out the [numpy reference](http://docs.scipy.org/doc/numpy/reference/) to find out much more about numpy.

# Numpy examples

## Array creation: Vectors

We can very easily create vectors to use in engineering computations with one line of code. The example below creates a vector over which to compute a sine function, and then plots it (plotting is our next subject!). This is also a useful approach for computing time vectors.

In [None]:
import matplotlib.pyplot as plt

xstart = 0
xstop = 2*np.pi
increment = .1

x = np.arange(xstart, xstop, increment)  # Create a vector of x values
y = np.sin(x)

plt.plot(x,y,'g')
plt.title('sin function')
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.grid()
plt.show()
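With a non-integer increment, the number of elements returned by `arange` can be hard to predict because of floating-point rounding. When the number of samples matters, as it often does for time vectors, a possible alternative is `linspace` or an integer `arange` scaled afterwards. The sample rate and duration below are hypothetical values:

```python
import numpy as np

sample_rate = 100  # samples per second (hypothetical)
duration = 2.0     # seconds (hypothetical)
n_samples = int(sample_rate * duration)

# Option 1: ask linspace for an exact number of points, excluding the end value
t_linspace = np.linspace(0.0, duration, n_samples, endpoint=False)

# Option 2: build integer sample indices, then convert to seconds
t_arange = np.arange(n_samples) / sample_rate

print(t_linspace.shape)
print(np.allclose(t_linspace, t_arange))
```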

## Array multiplication
Here is a quick illustration of how powerful numpy can be for even seemingly simple calculations. Recall that in previous homework assignments we had to iterate through each item in a list and multiply it by a constant to convert units. With Numpy, we can do it in a single operation.



In [None]:
precip_in = [0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
              1.93, 1.62, 1.84, 1.31, 1.39, 0.84]  # Precip in inches
precip_mm = []
for precip in precip_in:
  precip_mm.append(round(precip * 25.4,3)) # Convert to mm
print(precip_mm)

In [None]:
import numpy as np
precip_in = np.array([0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
              1.93, 1.62, 1.84, 1.31, 1.39, 0.84])  # Precip in inches
precip_mm = np.around(precip_in * 25.4,3)  # Convert to mm
print(precip_mm)
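As a quick check, the two approaches can be compared directly. This sketch repeats both conversions and confirms with `np.allclose` that they agree; the names `loop_mm` and `array_mm` are chosen here for illustration:

```python
import numpy as np

precip_in = [0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
             1.93, 1.62, 1.84, 1.31, 1.39, 0.84]  # Precip in inches

# List-based conversion, one element at a time
loop_mm = [round(p * 25.4, 3) for p in precip_in]

# Array-based conversion, all elements at once
array_mm = np.around(np.array(precip_in) * 25.4, 3)

print(np.allclose(loop_mm, array_mm))
```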

## Array meshgrid

Compute all possible prices of flooring that can have lengths of 2, 4, 6, and 8 meters and widths of 1, 1.5, and 2 meters if the flooring costs $32.19 per square meter. Store the result in a 2D array. The lengths should increase from top to bottom and widths should increase from left to right.

In [4]:
import numpy as np

l = np.array([2,4,6,8])  # List of flooring lengths in m
w = np.array([1,1.5,2])  # List of flooring widths in m

W,L = np.meshgrid(w,l)  # Create a meshgrid of lengths, widths
print(L)
print(W)

# Compute this function over the grid to get a cost for each l x w combination
costs = L * W * 32.19  # Multiply l x w x $/m^2
print(costs)

[[2 2 2]
 [4 4 4]
 [6 6 6]
 [8 8 8]]
[[1.  1.5 2. ]
 [1.  1.5 2. ]
 [1.  1.5 2. ]
 [1.  1.5 2. ]]
[[ 64.38  96.57 128.76]
 [128.76 193.14 257.52]
 [193.14 289.71 386.28]
 [257.52 386.28 515.04]]
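The same table can also be computed with broadcasting instead of `meshgrid`. In this sketch, `l[:, np.newaxis]` turns the lengths into a column of shape `(4, 1)`, which broadcasts against the widths of shape `(3,)`; the name `price_per_m2` is introduced here for readability:

```python
import numpy as np

l = np.array([2, 4, 6, 8])  # Flooring lengths in m
w = np.array([1, 1.5, 2])   # Flooring widths in m
price_per_m2 = 32.19        # Cost in $ per square meter

# (4, 1) * (3,) broadcasts to (4, 3): lengths down the rows, widths across
costs = l[:, np.newaxis] * w * price_per_m2
print(costs.shape)

# Compare against the meshgrid-based computation
W, L = np.meshgrid(w, l)
print(np.allclose(costs, L * W * price_per_m2))
```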
