<img src="images/numpy-logo.png"
     align="center"
     width="40%"
     alt="NumPy logo">

## A Brief Introduction

In [44]:
import numpy as np

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [50]:
# list
x = [1, 2.5, True, 'sueño']
[type(e) for e in x]

# np array
y = np.array([1, 2, 3, 4], dtype='int')  #y = np.array([1, 2, 3, 3+4j], dtype='int')
y

[int, float, bool, str]

array([1, 2, 3, 4])
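
Unlike the list above, a NumPy array stores a single element type. The following is a minimal sketch of how mixed numeric inputs are upcast to one common dtype; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np

# Mixed numeric inputs are upcast to a single common dtype.
ints = np.array([1, 2, 3])                # integers only -> integer dtype
mixed = np.array([1, 2.5, True])          # int, float, bool -> float64
with_complex = np.array([1, 2, 3 + 4j])   # int and complex -> complex128

print(ints.dtype.kind)       # 'i' (the exact integer width is platform dependent)
print(mixed.dtype)           # float64
print(with_complex.dtype)    # complex128
```

Because every element shares one type, NumPy can store the data in a compact block of memory, which is part of why array operations tend to be faster than equivalent Python loops.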

<img src="images/array_vs_list.png"
     align="center"
     width="60%"
     alt="NumPy array vs. Python list">

In [61]:
%%timeit 
for i in range(10**5):
    i + 1

11.2 ms ± 89.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [62]:
%timeit np.arange(10**5) + 1 

235 µs ± 891 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


## Creating Arrays

Create a list and convert it to a NumPy array.

In [3]:
mylist = [1, 2, 3]
x = np.array(mylist)
x
print(type(x))

array([1, 2, 3])

<class 'numpy.ndarray'>


<br>
Or just pass in a list directly.

In [8]:
y = np.array([4, 5, 6])
y

array([4, 5, 6])

<br>
Pass in a list of lists to create a multidimensional array.

In [9]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

array([[ 7,  8,  9],
       [10, 11, 12]])

`arange` returns evenly spaced values within a given interval.

In [17]:
n = np.arange(0, 30, 2) # start at 0, count up by 2, stop before 30
n

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])

`linspace` returns evenly spaced numbers over a specified interval.

In [15]:
x = np.linspace(0, 4, 9) # return 9 evenly spaced values from 0 to 4
x

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
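
The difference between the two is easy to miss: `arange` takes a step and excludes the stop value, while `linspace` takes a count and includes the stop value by default. A short sketch comparing them on the same interval (the variable names are illustrative):

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)                    # step given, stop excluded
by_count = np.linspace(0, 1, 5)                    # count given, stop included
half_open = np.linspace(0, 1, 4, endpoint=False)   # count given, stop excluded

print(by_step)     # [0.   0.25 0.5  0.75]
print(by_count)    # [0.   0.25 0.5  0.75 1.  ]
print(half_open)   # [0.   0.25 0.5  0.75]
```

With non-integer steps, `linspace` is usually the safer choice because the number of elements is stated explicitly rather than derived from floating-point arithmetic.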

`ones` returns a new array of given shape and type, filled with ones.

In [32]:
np.ones((3, 2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

`zeros` returns a new array of given shape and type, filled with zeros.

In [20]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

`eye` returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [22]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

`diag` extracts a diagonal or constructs a diagonal array.

In [23]:
np.diag([4,5,6,7])

array([[4, 0, 0, 0],
       [0, 5, 0, 0],
       [0, 0, 6, 0],
       [0, 0, 0, 7]])
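
The cell above shows the constructing direction only. As a small sketch of the extracting direction, passing a 2-D array to `np.diag` returns its diagonal, and the optional `k` argument selects a diagonal above or below the main one:

```python
import numpy as np

m = np.arange(9).reshape(3, 3)
# m is:
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

main_diag = np.diag(m)         # main diagonal
upper_diag = np.diag(m, k=1)   # first diagonal above the main one

print(main_diag)    # [0 4 8]
print(upper_diag)   # [1 5]
```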

`full` returns a new array of given shape filled with a constant value. Here we create a 3x5 array filled with 3.14.

In [63]:
np.full((3, 5), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

Create a 3x3 array of uniformly distributed
random values in the half-open interval [0, 1).

In [64]:
np.random.random((3, 3))

array([[0.22175544, 0.56493119, 0.7441299 ],
       [0.92941435, 0.66798233, 0.93596886],
       [0.81261278, 0.02984305, 0.3608868 ]])

Create a 3x3 array of normally distributed random values
with mean 0 and standard deviation 1.

In [66]:
np.random.normal(0, 1, (3, 3))

array([[-1.04304453,  0.61948427, -0.64815931],
       [-1.18533355, -0.01333406,  0.40274271],
       [ 2.05739192,  0.70412005,  0.43244275]])
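
The calls above use NumPy's legacy global random state. As an alternative sketch, assuming NumPy 1.17 or later, the same kinds of arrays can be drawn from a seeded `Generator` object created with `np.random.default_rng`, which makes the results reproducible without touching global state:

```python
import numpy as np

rng = np.random.default_rng(seed=0)            # seeded generator

uniform = rng.random((3, 3))                   # uniform values in [0, 1)
normal = rng.normal(loc=0.0, scale=1.0, size=(3, 3))   # mean 0, std 1
integers = rng.integers(0, 10, size=6)         # integers in [0, 10)

# A second generator with the same seed reproduces the same stream.
rng_again = np.random.default_rng(seed=0)
same_uniform = rng_again.random((3, 3))
print(np.array_equal(uniform, same_uniform))   # True
```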

## Array Attributes

In [68]:
np.random.seed(0)  # seed for reproducibility

x1 = np.random.randint(10, size=6)  # One-dimensional array
x2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array

Each array has attributes `ndim` (the number of dimensions), `shape` (the size of each dimension), and `size` (the total size of the array):

In [69]:
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60


Another useful attribute is the `dtype`, the data type of the array:

In [70]:
print("dtype:", x3.dtype)

dtype: int32
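
The `dtype` also determines how much memory the array uses. A brief sketch using the `itemsize` and `nbytes` attributes, with the dtype set explicitly so the numbers do not depend on the platform default:

```python
import numpy as np

x = np.zeros((3, 4, 5), dtype=np.int32)

print(x.itemsize)   # 4   bytes per element for int32
print(x.nbytes)     # 240 total bytes = size * itemsize = 60 * 4

# The same shape with a wider dtype uses twice the memory.
x64 = x.astype(np.int64)
print(x64.nbytes)   # 480
```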


## Reshaping of Arrays

`reshape` returns a reshaped array and leaves `x2` itself unchanged, whereas `resize` changes the shape of `x2` in place.

In [75]:
x2.reshape(2,6)
x2

array([[3, 5, 2, 4, 7, 6],
       [8, 8, 1, 6, 7, 7]])

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [76]:
x2.resize(2,6)
x2

array([[3, 5, 2, 4, 7, 6],
       [8, 8, 1, 6, 7, 7]])
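
Two details of `reshape` are worth a short sketch. One dimension can be given as `-1`, in which case NumPy infers it from the array's size. And for a contiguous array like the one below, the reshaped result is a view that shares data with the original, so writing through it also changes the original. The array names are illustrative.

```python
import numpy as np

a = np.arange(12)

b = a.reshape(3, 4)     # explicit shape; a keeps its own shape
c = a.reshape(2, -1)    # -1 is inferred as 6

print(a.shape, b.shape, c.shape)   # (12,) (3, 4) (2, 6)

# Here b is a view of a, so a write through b is visible in a.
b[0, 0] = 99
print(a[0])                        # 99
print(np.shares_memory(a, b))      # True
```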

## Combining Arrays

In [78]:
p = np.ones([2, 3], dtype='float')
p

array([[1., 1., 1.],
       [1., 1., 1.]])

Use `vstack` to stack arrays in sequence vertically (row wise).

In [79]:
np.vstack([p, 2*p])

array([[1., 1., 1.],
       [1., 1., 1.],
       [2., 2., 2.],
       [2., 2., 2.]])

Use `hstack` to stack arrays in sequence horizontally (column wise).

In [80]:
np.hstack([p, 2*p])

array([[1., 1., 1., 2., 2., 2.],
       [1., 1., 1., 2., 2., 2.]])
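
For 2-D arrays such as `p`, both helpers can be expressed with `np.concatenate` and an explicit `axis`. A minimal sketch showing the equivalence:

```python
import numpy as np

p = np.ones((2, 3))

rows = np.concatenate([p, 2 * p], axis=0)   # same as np.vstack([p, 2*p])
cols = np.concatenate([p, 2 * p], axis=1)   # same as np.hstack([p, 2*p])

print(rows.shape)   # (4, 3)
print(cols.shape)   # (2, 6)
```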

## Indexing / Slicing

In [81]:
s = np.arange(13)**2
s

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121, 144],
      dtype=int32)

<br>
Use bracket notation to get the value at a specific index. Remember that indexing starts at 0.

In [82]:
s[0], s[4], s[-1]

(0, 16, 144)

<br>
Use `:` to indicate a range. `array[start:stop]`


Leaving `start` or `stop` empty will default to the beginning/end of the array.

In [83]:
s[1:5]

array([ 1,  4,  9, 16], dtype=int32)

<br>
Use negatives to count from the back.

In [84]:
s[-4:]

array([ 81, 100, 121, 144], dtype=int32)

A second `:` can be used to indicate step-size. `array[start:stop:stepsize]`

Here we start at the 5th element from the end and count backwards by 2 until the beginning of the array is reached.

In [85]:
s[-5::-2]

array([64, 36, 16,  4,  0], dtype=int32)
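
To see exactly which positions a slice with negative values touches, one option is Python's built-in `slice.indices` method, which resolves a slice against a given length. This sketch checks that `s[-5::-2]` visits indices 8, 6, 4, 2, 0:

```python
import numpy as np

s = np.arange(13) ** 2

# Resolve the slice -5::-2 against the length of s.
start, stop, step = slice(-5, None, -2).indices(len(s))
positions = list(range(start, stop, step))

print(positions)       # [8, 6, 4, 2, 0]
print(s[positions])    # [64 36 16  4  0]
print(s[-5::-2])       # [64 36 16  4  0]
```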

<br>
Let's look at a multidimensional array.

In [86]:
r = np.arange(36)
r.resize((6, 6))
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

Use bracket notation to slice: `array[row, column]`

In [87]:
r[2, 2]

14

And use `:` to select a range of rows or columns. Note that index 6 does not exist: the stop value of a slice is exclusive, so `3:6` selects columns 3, 4, and 5.

In [88]:
r[3, 3:6]

array([21, 22, 23])

<br>
Here we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [89]:
r[:2, :-1]

array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])

<br>
This is a slice of the last row, taking every other element (columns 0, 2, and 4).

In [90]:
r[-1, ::2]

array([30, 32, 34])

We can also perform conditional indexing. Here we are selecting values from the array that are greater than 28. (Also see `np.where`.)

In [92]:
r[r > 28]

array([29, 30, 31, 32, 33, 34, 35])

In [96]:
r[np.where(r > 28)]

array([29, 30, 31, 32, 33, 34, 35])

Here we set every value in the array that is greater than 30 to 30.

In [97]:
r[r > 30] = 30
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
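
The assignment above modifies `r` in place. As a sketch of a non-destructive alternative, the three-argument form of `np.where` (or `np.minimum`) returns a new capped array and leaves the original untouched. A fresh copy of the 6x6 array is built here so the example stands on its own.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

# One-argument form: row and column indices where the condition holds.
rows, cols = np.where(r > 28)
print(rows)   # [4 5 5 5 5 5 5]
print(cols)   # [5 0 1 2 3 4 5]

# Three-argument form: pick 30 where r > 30, otherwise keep r's value.
capped = np.where(r > 30, 30, r)
also_capped = np.minimum(r, 30)   # element-wise minimum gives the same result

print(capped.max(), r.max())      # 30 35  (r itself is unchanged)
```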

## Copying Data

Be careful with copying and modifying arrays in NumPy!


`r2` is a slice of `r`

In [98]:
r2 = r[:3,:3]
r2

array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14]])

Set this slice's values to zero (`[:]` selects the entire array).

In [99]:
r2[:] = 0
r2

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

<br>
`r` has also been changed!

In [100]:
r

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

<br>
To avoid this, use `r.copy()` to create a copy that will not affect the original array.

In [101]:
r_copy = r.copy()
r_copy

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

<br>
Now when `r_copy` is modified, `r` will not be changed.

In [102]:
r_copy[:] = 10
r_copy
r

array([[10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10]])

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
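
When it is unclear whether two arrays share data, `np.shares_memory` can check directly. A small sketch, using illustrative names, that compares a basic slice, an explicit copy, and a selection made with a list of indices (which returns a copy):

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]                  # basic slicing returns a view
independent = r[:3, :3].copy()    # explicit copy
fancy = r[[0, 1, 2]]              # indexing with a list returns a copy

print(np.shares_memory(r, view))          # True
print(np.shares_memory(r, independent))   # False
print(np.shares_memory(r, fancy))         # False
```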

## Operations

Use `+`, `-`, `*`, `/` and `**` to perform element-wise addition, subtraction, multiplication, division and power.

In [106]:
x = np.arange(1, 4)
y = np.arange(4, 7)
x + y # elementwise addition     [1 2 3] + [4 5 6] = [5  7  9]
x - y # elementwise subtraction  [1 2 3] - [4 5 6] = [-3 -3 -3]

array([5, 7, 9])

array([-3, -3, -3])

In [107]:
x * y # elementwise multiplication  [1 2 3] * [4 5 6] = [4  10  18]
x / y # elementwise division        [1 2 3] / [4 5 6] = [0.25  0.4  0.5]

array([ 4, 10, 18])

array([0.25, 0.4 , 0.5 ])

In [108]:
print(x**2) # elementwise power  [1 2 3] ^2 =  [1 4 9]

[1 4 9]



**Dot Product:**  

$ \begin{bmatrix}x_1 & x_2 & x_3\end{bmatrix}
\cdot
\begin{bmatrix}y_1 \\ y_2 \\ y_3\end{bmatrix}
= x_1 y_1 + x_2 y_2 + x_3 y_3$

In [109]:
x.dot(y) # dot product  1*4 + 2*5 + 3*6

32

In [110]:
np.dot(x,y)

32

In [142]:
x @ y

32
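
The three spellings above compute the same quantity as the formula: multiply element-wise, then sum. A short sketch confirming this, plus a matrix-vector product with `@` using a hypothetical matrix `M` that is not part of the original notebook:

```python
import numpy as np

x = np.arange(1, 4)   # [1 2 3]
y = np.arange(4, 7)   # [4 5 6]

by_sum = (x * y).sum()   # element-wise product, then sum
print(by_sum, x.dot(y), np.dot(x, y), x @ y)   # 32 32 32 32

# For a 2-D array, @ takes the dot product of each row with the vector.
M = np.array([[1, 0, 0],
              [0, 2, 0]])
print(M @ x)   # [1 4]
```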

Let's look at transposing arrays. Transposing permutes the dimensions of the array.

In [113]:
z = np.array([y, y**2])
z

array([[ 4,  5,  6],
       [16, 25, 36]])

The shape of array `z` is `(2,3)` before transposing.

In [114]:
z.shape

(2, 3)

Use `.T` to get the transpose.

In [115]:
z.T

array([[ 4, 16],
       [ 5, 25],
       [ 6, 36]])

<br>
The number of rows has swapped with the number of columns.

In [116]:
z.T.shape

(3, 2)

Use `.astype` to cast to a specific type.

In [118]:
z = z.astype('f')
z
z.dtype

i = z.astype('i')
i
i.dtype

array([[ 4.,  5.,  6.],
       [16., 25., 36.]], dtype=float32)

dtype('float32')

array([[ 4,  5,  6],
       [16, 25, 36]], dtype=int32)

dtype('int32')
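
One thing to watch when casting floats to integers: `astype` truncates toward zero rather than rounding. A brief sketch of the difference, with illustrative values:

```python
import numpy as np

values = np.array([1.7, -1.7, 2.5])

truncated = values.astype(int)          # fractional part is dropped
rounded = np.rint(values).astype(int)   # round first (halves go to the even integer)

print(truncated)   # [ 1 -1  2]
print(rounded)     # [ 2 -2  2]
print(values)      # [ 1.7 -1.7  2.5]  astype returned a new array
```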

## Math Functions

NumPy has many built-in math functions that can be performed on arrays.

In [119]:
a = np.array([-4, -2, 1, 3, 5])
b = np.arange(36).reshape(6, 6)

In [125]:
a.sum()
b.sum(), b.sum(axis=0), b.sum(axis=1)

3

(630,
 array([ 90,  96, 102, 108, 114, 120]),
 array([ 15,  51,  87, 123, 159, 195]))
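
The `axis` argument names the dimension that is collapsed by the reduction. A minimal sketch on a small 2x3 array, where the results are easy to verify by hand:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.sum())         # 21       all elements
print(m.sum(axis=0))   # [5 7 9]  collapse the rows: one sum per column
print(m.sum(axis=1))   # [ 6 15]  collapse the columns: one sum per row

# keepdims=True keeps the collapsed axis with length 1.
row_sums_2d = m.sum(axis=1, keepdims=True)
print(row_sums_2d.shape)   # (2, 1)
```

The same convention applies to `max`, `mean`, `std`, and the other reductions in this section.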

In [126]:
a.max()
b.max(), b.max(axis=0), b.max(axis=1)

5

(35, array([30, 31, 32, 33, 34, 35]), array([ 5, 11, 17, 23, 29, 35]))

In [127]:
a.mean()
b.mean(), b.mean(axis=0), b.mean(axis=1)

0.6

(17.5,
 array([15., 16., 17., 18., 19., 20.]),
 array([ 2.5,  8.5, 14.5, 20.5, 26.5, 32.5]))

In [128]:
a.std()
b.std(), b.std(axis=0), b.std(axis=1)

3.2619012860600183

(10.388294694831615,
 array([10.24695077, 10.24695077, 10.24695077, 10.24695077, 10.24695077,
        10.24695077]),
 array([1.70782513, 1.70782513, 1.70782513, 1.70782513, 1.70782513,
        1.70782513]))

`argmax` and `argmin` return the index of the maximum and minimum values in the array, respectively. Without an `axis` argument, the index refers to the flattened array.

In [131]:
a.argmax()
b.argmax(), b.argmax(axis=0), b.argmax(axis=1)

4

(35,
 array([5, 5, 5, 5, 5, 5], dtype=int64),
 array([5, 5, 5, 5, 5, 5], dtype=int64))
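
Because `b.argmax()` without an axis returns a position in the flattened array, it often needs converting back to a row and column. One way to do this, sketched here, is `np.unravel_index`:

```python
import numpy as np

b = np.arange(36).reshape(6, 6)

flat_index = b.argmax()                            # index into the flattened array
row, col = np.unravel_index(flat_index, b.shape)   # convert to (row, column)

print(flat_index)                  # 35
print(row, col)                    # 5 5
print(b[row, col] == b.max())      # True
```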

## Additional Functionality

NumPy provides versions of the standard functions `log`, `exp`, `sin`, etc. that act element-wise on arrays.

In [144]:
z = np.array([1, 2, 3])
np.log(z)

array([0.        , 0.69314718, 1.09861229])

This eliminates the need for explicit element-by-element loops such as:

In [147]:
n = len(z)
y = np.empty(n)
for i in range(n):
    y[i] = np.log(z[i])
y

array([0.        , 0.69314718, 1.09861229])

Because they act element-wise on arrays, these functions are called vectorized functions.

Not all user-defined functions act element-wise. For example, passing a NumPy array to the function `f` defined below raises a `ValueError`, because the condition `x > 0` produces an array of booleans rather than a single truth value:

In [150]:
x = np.random.randn(4)

def f(x):
    return 1 if x > 0 else 0

f(x)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Wrapping `f` with `np.vectorize` produces a version that is applied to each element of the array:

In [152]:
f = np.vectorize(f)
x
f(x)

array([-1.50709602,  1.14907613, -1.19357825,  1.14104245])

array([0, 1, 0, 1])
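
`np.vectorize` is mainly a convenience: it still calls the Python function once per element. For a simple condition like this one, a sketch of two alternatives built from NumPy operations is `np.where` and a boolean comparison cast to integers. The input values here are fixed for illustration rather than drawn at random.

```python
import numpy as np

x = np.array([-1.5, 1.1, -1.2, 1.1])

def f(value):
    # Scalar version: works on one number at a time.
    return 1 if value > 0 else 0

via_vectorize = np.vectorize(f)(x)    # calls f once per element
via_where = np.where(x > 0, 1, 0)     # choose 1 or 0 per element
via_cast = (x > 0).astype(int)        # True/False -> 1/0

print(via_vectorize)   # [0 1 0 1]
print(via_where)       # [0 1 0 1]
print(via_cast)        # [0 1 0 1]
```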

## Broadcasting

Recall that for arrays of the same size, binary operations are performed on an element-by-element basis:

In [133]:
a = np.array([0, 1, 2])
b = np.array([5, 5, 5])
a + b

array([5, 6, 7])

Broadcasting allows these types of binary operations to be performed on arrays of different sizes. For example, we can just as easily add a scalar (think of it as a zero-dimensional array) to an array:

In [134]:
a + 5

array([5, 6, 7])

We can similarly extend this to arrays of higher dimension. Observe the result when we add a one-dimensional array to a two-dimensional array:

In [135]:
M = np.ones((3, 3))
M

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [136]:
M + a

array([[1., 2., 3.],
       [1., 2., 3.],
       [1., 2., 3.]])

Here the one-dimensional array `a` is treated as a single row and stretched, or broadcast, along the first axis (repeated for each row) to match the shape of `M`.

While these examples are relatively easy to understand, more complicated cases can involve broadcasting of both arrays. Consider the following example:

In [137]:
a = np.arange(3)
b = np.arange(3).reshape(3,1)

a
b

array([0, 1, 2])

array([[0],
       [1],
       [2]])

In [138]:
a + b

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
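
To make the stretching explicit, the sketch below uses `np.broadcast_to` to expand both arrays to the common shape `(3, 3)` before adding them, and then shows a pair of shapes that cannot be broadcast together. The names `a_stretched`, `b_stretched`, and `incompatible` are illustrative.

```python
import numpy as np

a = np.arange(3)                  # shape (3,)
b = np.arange(3).reshape(3, 1)    # shape (3, 1)

# Expand each operand to the common shape (no data is copied).
a_stretched = np.broadcast_to(a, (3, 3))   # a repeated for each row
b_stretched = np.broadcast_to(b, (3, 3))   # b repeated for each column

explicit_sum = a_stretched + b_stretched
print(np.array_equal(explicit_sum, a + b))   # True

# Shapes (3, 2) and (3,) do not match in the trailing dimension (2 vs 3).
try:
    np.ones((3, 2)) + a
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)   # True
```

Shapes are compared from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1.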

The geometry of these examples is visualized in the following figure:

<img src="images/broadcasting.png"
     align="center"
     width="60%"
     alt="Visualization of NumPy broadcasting">