# Demo of some `numpy` features

## MCS 275 Spring 2023 - Emily Dumas

This is a quick tour of some `numpy` features.  For more detail see:
* [Chapter 2 of VanderPlas](https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html)
* [The numpy documentation](https://numpy.org/doc/stable/)

## Importing the module

In [1]:
import numpy as np
np.__version__

'1.21.5'

## Creating arrays

[List of built-in dtypes](https://numpy.org/doc/stable/reference/arrays.scalars.html#arrays-scalars-built-in).

In [2]:
# `np.array` will convert from an iterable
x = np.array([2,4,8,16,32])

In [4]:
x

array([ 2,  4,  8, 16, 32])

In [5]:
type(x)

numpy.ndarray

In [6]:
x.ndim # how many dimensions?

1

In [3]:
x.shape # size in each dimension, as a tuple

(5,)

In [8]:
len(x) # first element of `shape` attribute

5

In [9]:
x.dtype # int64 means (signed) integer, 64 bits

dtype('int64')

In [10]:
# Given a mix of integers and floats, numpy
# will choose a floating point dtype
y= np.array([5,6,7,7.289])

In [11]:
y

array([5.   , 6.   , 7.   , 7.289])

In [12]:
y.dtype # float64 means float, 64 bits (double)

dtype('float64')

In [15]:
# Let's make an array and manually specify the dtype

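# uint8 means UNSIGNED integer, 8 bits
# UNSIGNED = only 0 and positive values
# range is 0...255
# Out-of-range values wrap around modulo 256 in this numpy version (1.21);
# NumPy 2.0 and later raise OverflowError for them instead
z = np.array([1,-1,2,100,300,500,800,16384], dtype="uint8")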

In [16]:
z

array([  1, 255,   2, 100,  44, 244,  32,   0], dtype=uint8)

In [17]:
# Why did 300 appear as 44 in the array above?
300 % 256

44
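The wraparound seen above is arithmetic modulo 256. Here is a minimal sketch of the same effect using `astype` to convert an existing integer array to `uint8`. The variable names `values` and `wrapped` are only for illustration.

```python
import numpy as np

# Start with ordinary 64-bit integers, some outside the range 0...255
values = np.array([1, -1, 2, 100, 300, 500, 800, 16384], dtype="int64")

# Converting to uint8 keeps only the lowest 8 bits of each entry,
# which is the same as reducing modulo 256
wrapped = values.astype("uint8")
print(wrapped)

# Compare with computing the remainder explicitly
print(values % 256)
```

Both printed arrays should contain the same numbers, though the second one still has dtype `int64`.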

In [4]:
# Make a 2D array from a list of lists
A = np.array( [[3,4,5,6], [1,10,100,1000]], dtype="float64")

In [21]:
A

array([[   3.,    4.,    5.,    6.],
       [   1.,   10.,  100., 1000.]])

In [22]:
A.shape

(2, 4)

## Filled arrays

In [24]:
# Filled with zeros
np.zeros( (3,12), dtype="int64")

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [5]:
# Filled with ones
np.ones( (6,2), dtype="float64")

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])

In [26]:
# Filled with one value
np.full( (7,4), 42, dtype="uint8" )

array([[42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42]], dtype=uint8)

In [4]:
# Filled with random numbers in the interval [0, 1)
np.random.random( (4,5) )  # argument is the shape

array([[0.18356939, 0.18984025, 0.82520915, 0.55699318, 0.53622515],
       [0.62208731, 0.73505179, 0.06832574, 0.40692692, 0.81277066],
       [0.50103065, 0.59539456, 0.48778655, 0.2054196 , 0.37571201],
       [0.75318401, 0.15764748, 0.458812  , 0.76400134, 0.0328593 ]])
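Random arrays like the one above differ on every run. If reproducible values are wanted, one option is numpy's `default_rng` generator with a fixed seed. This is a sketch of an alternative to `np.random.random`, not something used elsewhere in this notebook, and the seed 275 is an arbitrary choice.

```python
import numpy as np

# Create a random number generator; the seed 275 is an arbitrary choice
rng = np.random.default_rng(275)
R = rng.random((4, 5))  # floats in the interval [0, 1), shape (4,5)
print(R)

# Using the same seed again reproduces the same numbers
R2 = np.random.default_rng(275).random((4, 5))
print(np.array_equal(R, R2))
```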

## Special things about 2D arrays

In [6]:
# Identity matrix
np.eye(5, dtype="uint8")  # identity matrix of size (5,5)

array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]], dtype=uint8)

In [7]:
A = np.array([[1,2,3],[9,8,7]])
A

array([[1, 2, 3],
       [9, 8, 7]])

In [8]:
# The transpose of A, which switches row and column roles
A.T

array([[1, 9],
       [2, 8],
       [3, 7]])

In [9]:
# defining property of the transpose: A.T[j,i] equals A[i,j]
i=0
j=2
print(A[i,j])
print(A.T[j,i])
# this holds whenever i,j are integers for which A[i,j] makes sense

3
3
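The cell above checks the defining property of the transpose for one pair of indices. A small sketch that checks it for every valid pair might look like this (the name `all_match` is just for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3], [9, 8, 7]])
rows, cols = A.shape

# Compare A[i,j] with A.T[j,i] for every valid pair of indices
all_match = all(
    A[i, j] == A.T[j, i]
    for i in range(rows)
    for j in range(cols)
)
print(A.T.shape)   # rows and columns are swapped
print(all_match)
```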


## Vector algebra

In [11]:
# two 3-dimensional vectors
v = np.array([1,2,5])
w = np.array([4,-8,0])

In [12]:
v.dot(w) # dot product
#   1*4 + 2*(-8) + 5*0

-12

In [13]:
v.dot(v)**0.5 # length

5.477225575051661

In [14]:
1.8 * v # scalar multiplication

array([1.8, 3.6, 9. ])

In [15]:
v+w # elementwise sum

array([ 5, -6,  5])

In [16]:
v*w # elementwise product

array([  4, -16,   0])
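For reference, here are some other spellings of the dot product and length computations above. The `@` operator and the functions `np.dot` and `np.linalg.norm` are standard numpy features, though this notebook uses the `.dot` method instead.

```python
import numpy as np

v = np.array([1, 2, 5])
w = np.array([4, -8, 0])

print(v @ w)              # same value as v.dot(w)
print(np.dot(v, w))       # function form of the dot product
print(np.linalg.norm(v))  # same value as v.dot(v)**0.5
```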

## Arithmetic progressions

In [30]:
# Recall how you get a list of integer values
# in arithmetic progression using the built-in `range`
list(range(3,20,2))

[3, 5, 7, 9, 11, 13, 15, 17, 19]

In [31]:
# From 2 up to but not including 3 in steps of size 0.1
np.arange(2,3,0.1)   # start, stop (not included), step

array([2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])

In [34]:
# 6 evenly spaced values from 12 to 14 (both endpoints included)
np.linspace( 12, 14, 6 )   # first, last, number of elements

array([12. , 12.4, 12.8, 13.2, 13.6, 14. ])
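The main difference between the two functions is that `np.arange` takes a step size and excludes the stop value, while `np.linspace` takes a number of elements and includes the last value. A brief sketch, using the `retstep=True` option of `np.linspace` to see the spacing it chose:

```python
import numpy as np

# retstep=True asks linspace to also return the spacing it used
values, step = np.linspace(12, 14, 6, retstep=True)
print(values)
print(step)  # (14 - 12) / (6 - 1)

# arange with a float step excludes the stop value
print(np.arange(12, 14, 0.5))
```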

## Accessing items

In [35]:
A = np.array([[1,2,3],[4,5,6],[7,8,9]])

In [36]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [37]:
# A[i,j] refers to the entry in row i, column j  (0-based)

In [38]:
A[1,2] # row index 1, col index 2

6

In [39]:
A[2]  # means row index 2

array([7, 8, 9])

In [40]:
# column 0 from A?
# its entries are A[---,0]
# numpy notation for that is A[:,0]
# : means "anything" in numpy indexing
A[:,0]

array([1, 4, 7])

In [41]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [42]:
B = np.array([[2,4,8,16,32],[1,-1,1,-1,1],[0,0,0,5,6],[7,6,5,4,3]])

In [43]:
B

array([[ 2,  4,  8, 16, 32],
       [ 1, -1,  1, -1,  1],
       [ 0,  0,  0,  5,  6],
       [ 7,  6,  5,  4,  3]])

In [44]:
B.shape

(4, 5)

## Assigning items

**`numpy` arrays are mutable**

In [45]:
# row,col
B[2,0] = 275

In [46]:
B

array([[  2,   4,   8,  16,  32],
       [  1,  -1,   1,  -1,   1],
       [275,   0,   0,   5,   6],
       [  7,   6,   5,   4,   3]])

In [48]:
B[3] = 567  # entire row 3 of B should have its entries set to 567

In [49]:
B

array([[  2,   4,   8,  16,  32],
       [  1,  -1,   1,  -1,   1],
       [275,   0,   0,   5,   6],
       [567, 567, 567, 567, 567]])

In [50]:
# Change an entire row at one time
B[3] = [1,10,-1,-10,78]  # set the values in row 3 to 1, 10, ...

In [51]:
B

array([[  2,   4,   8,  16,  32],
       [  1,  -1,   1,  -1,   1],
       [275,   0,   0,   5,   6],
       [  1,  10,  -1, -10,  78]])
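Columns can be assigned in the same way as rows, using `:` in the row position. A minimal sketch starting from a fresh copy of the original `B` (the replacement values are arbitrary):

```python
import numpy as np

B = np.array([[2, 4, 8, 16, 32],
              [1, -1, 1, -1, 1],
              [0, 0, 0, 5, 6],
              [7, 6, 5, 4, 3]])

B[:, 0] = 0             # set every entry of column 0 to 0
B[:, 4] = [9, 8, 7, 6]  # one new value for each of the 4 rows
print(B)
```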

## Slices

In [53]:
C = B[ 1:3 , 1:4 ]  # the submatrix from rows 1 and 2 and columns 1,2,3

In [54]:
C

array([[-1,  1, -1],
       [ 0,  0,  5]])

Slices return **views**, not copies.

In [64]:
C[:,:] = 51 # set every entry in C to 51

In [65]:
C

array([[51, 51, 51],
       [51, 51, 51]])

In [66]:
# Let's inspect B.
B

array([[  2,   4,   8,  16,  32],
       [  1,  51,  51,  51,   1],
       [275,  51,  51,  51,   6],
       [  1,  10,  -1, -10,  78]])

Note that `B` changed when we modified `C`, because `C` is simply a view of part of `B`.
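If an independent array is wanted instead of a view, one approach is to call `.copy()` on the slice. The sketch below also uses `np.shares_memory` to check whether two arrays share data. The names `view` and `indep` are only for illustration.

```python
import numpy as np

B = np.array([[2, 4, 8, 16, 32],
              [1, -1, 1, -1, 1],
              [0, 0, 0, 5, 6],
              [7, 6, 5, 4, 3]])

view = B[1:3, 1:4]            # a view: shares data with B
indep = B[1:3, 1:4].copy()    # a copy: has its own data

print(np.shares_memory(B, view))   # True
print(np.shares_memory(B, indep))  # False

indep[:, :] = 51   # modifies only the copy
print(B)           # unchanged
```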

In [17]:
v = np.array([2,4,6,8,10])
w = np.array([2,4,8,16,32])

In [21]:
v == w  # gives an array of booleans

array([ True,  True, False, False, False])

In [22]:
if v == w:
    print("v and w are equal")
else:
    print("v and w are not equal")

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [None]:
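# The `if` in the previous cell asks for the truth value of this array,
# so this cell raises the same ValueError
if np.array([ True,  True, False, False, False] ):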
    print("v and w are equal")
else:
    print("v and w are not equal")

In [24]:
# proper ways to test whether two arrays are equal in every entry
np.array_equal(v,w)

False

In [25]:
(v==w).all()

False
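To use an elementwise comparison in an `if` statement, the array of booleans has to be reduced to a single boolean first. A small sketch using `.all()` and `.any()` (the name `same` is just for illustration):

```python
import numpy as np

v = np.array([2, 4, 6, 8, 10])
w = np.array([2, 4, 8, 16, 32])

same = (v == w)   # array of booleans, one per position

if same.all():
    print("every entry matches")
elif same.any():
    print("some entries match, but not all")
else:
    print("no entries match")

print(same.sum())  # number of positions where v and w agree
```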

## Ufuncs

Functions that automatically apply to each entry in an array.

### Some arrays to operate on

In [26]:
v = np.arange(-5,6,1)
v

array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5])

In [29]:
A = np.array(range(1,16)).reshape((3,5)) # (15,) -> (3,5)
A

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

### Examples of numpy ufuncs

In [30]:
np.exp(v) # apply e^x to each entry

array([6.73794700e-03, 1.83156389e-02, 4.97870684e-02, 1.35335283e-01,
       3.67879441e-01, 1.00000000e+00, 2.71828183e+00, 7.38905610e+00,
       2.00855369e+01, 5.45981500e+01, 1.48413159e+02])

In [31]:
6.73794700 * 10**(-3)  # this is what 6.73794700e-03 in the output above means

0.006737947

In [32]:
v**3 # cube each entry

array([-125,  -64,  -27,   -8,   -1,    0,    1,    8,   27,   64,  125])

In [33]:
A**4

array([[    1,    16,    81,   256,   625],
       [ 1296,  2401,  4096,  6561, 10000],
       [14641, 20736, 28561, 38416, 50625]])

In [36]:
v

array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5])

In [34]:
np.cos(v) # Take cosine of each entry

array([ 0.28366219, -0.65364362, -0.9899925 , -0.41614684,  0.54030231,
        1.        ,  0.54030231, -0.41614684, -0.9899925 , -0.65364362,
        0.28366219])

In [35]:
np.tan(v)

array([ 3.38051501, -1.15782128,  0.14254654,  2.18503986, -1.55740772,
        0.        ,  1.55740772, -2.18503986, -0.14254654,  1.15782128,
       -3.38051501])

In [37]:
np.log(v) # nan for negative entries, -inf for 0; numpy issues a RuntimeWarning for each case

array([       nan,        nan,        nan,        nan,        nan,
             -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
       1.60943791])

In [38]:
np.cos(A) # Works the same for 2D arrays

array([[ 0.54030231, -0.41614684, -0.9899925 , -0.65364362,  0.28366219],
       [ 0.96017029,  0.75390225, -0.14550003, -0.91113026, -0.83907153],
       [ 0.0044257 ,  0.84385396,  0.90744678,  0.13673722, -0.75968791]])

In [39]:
1/A # reciprocal of each entry

array([[1.        , 0.5       , 0.33333333, 0.25      , 0.2       ],
       [0.16666667, 0.14285714, 0.125     , 0.11111111, 0.1       ],
       [0.09090909, 0.08333333, 0.07692308, 0.07142857, 0.06666667]])

In [40]:
A

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

Let $f(x) = 3x^2 - 8x + 14$.  Apply $f$ to each element of array `v`.

In [41]:
def f(x):
    return 3*(x**2) - 8*x + 14

In [43]:
np.array([f(x) for x in v])

array([129,  94,  65,  42,  25,  14,   9,  10,  17,  30,  49])

In [44]:
3*(v**2) - 8*v + 14  # shorter, also faster

array([129,  94,  65,  42,  25,  14,   9,  10,  17,  30,  49])
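Since the body of `f` uses only arithmetic operations that numpy applies elementwise, the function itself can be called on an array directly. A minimal sketch:

```python
import numpy as np

def f(x):
    """Evaluate 3x^2 - 8x + 14; x may be a number or a numpy array."""
    return 3*(x**2) - 8*x + 14

v = np.arange(-5, 6, 1)
print(f(2))   # a single number
print(f(v))   # an array: f applied to each entry of v
```

This only works because every operation inside `f` accepts arrays. A function containing something like `if x > 0:` would fail on an array, for the same reason the `if v == w:` test does.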

## Broadcasting

In [45]:
A = np.array(range(1,16)).reshape((3,5))
A

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

In [46]:
A + 5   # A has shape (3,5) and 5 is a single number: 5 is added to every entry

array([[ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

In [47]:
A + [[5,5,5,5,5],[5,5,5,5,5],[5,5,5,5,5]]

array([[ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

In [48]:
A

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

In [49]:
A[1]

array([ 6,  7,  8,  9, 10])

In [50]:
A[1] = 27 # A[1] is a row, 27 is a number
A

array([[ 1,  2,  3,  4,  5],
       [27, 27, 27, 27, 27],
       [11, 12, 13, 14, 15]])

In [51]:
A += 1 # increase everything by one
A

array([[ 2,  3,  4,  5,  6],
       [28, 28, 28, 28, 28],
       [12, 13, 14, 15, 16]])

In [52]:
A *= 2


In [54]:
A

array([[ 4,  6,  8, 10, 12],
       [56, 56, 56, 56, 56],
       [24, 26, 28, 30, 32]])

In [56]:
A + [5,-5,-5,5,0] # gets added to every row
# shapes: (3,5) + (5,) gives (3,5)

array([[ 9,  1,  3, 15, 12],
       [61, 51, 51, 61, 56],
       [29, 21, 23, 35, 32]])

In [55]:
A + [[5,-5,-5,5,0],[5,-5,-5,5,0],[5,-5,-5,5,0]]

array([[ 9,  1,  3, 15, 12],
       [61, 51, 51, 61, 56],
       [29, 21, 23, 35, 32]])
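The examples above add one value per column to every row. To add one value per row to every column instead, the values can be arranged with shape `(3,1)`. This is a sketch with arbitrary values, and the name `col` is only for illustration.

```python
import numpy as np

A = np.arange(1, 16).reshape((3, 5))

# One value per row, arranged as a column of shape (3,1)
col = np.array([100, 200, 300]).reshape((3, 1))

# (3,5) + (3,1): the single column is reused for all 5 columns
print(A + col)
```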

## Aggregations

`sum`, `max`, `min`, `argmax`, `argmin`, `mean`, `all`, `any`, `array_equal`

In [58]:
x = np.random.random( (10,) )
x

array([0.18727418, 0.60942275, 0.69341366, 0.04816357, 0.70953069,
       0.55591148, 0.0825111 , 0.02229469, 0.68578798, 0.50285539])

In [59]:
np.sum(x)  # fast: the loop over entries runs in compiled C code

4.097165477078023

In [60]:
sum(x) # slower: the built-in sum loops over entries in Python

4.097165477078023

In [61]:
np.max(x)

0.7095306860279971

In [62]:
np.min(x)

0.022294687886910802

In [64]:
np.argmax(x)  # index of the (first) largest element in x

4

In [65]:
x[4]

0.7095306860279971

In [None]:
np.all(x)  # are all of the elements nonzero?
np.any(x)  # is there at least one nonzero element?

In [66]:
A = np.array([[1,2,5,8,9],[4,3,4,-2,-91],[0,0,0,3,4]])

In [67]:
A

array([[  1,   2,   5,   8,   9],
       [  4,   3,   4,  -2, -91],
       [  0,   0,   0,   3,   4]])

In [69]:
np.mean(A) # average of all the entries

-3.3333333333333335

In [71]:
# vector of column averages
np.mean(A,axis=0)  # average over all possible values of index 0 (the row index)
# entry j of the result is the average of column A[:,j]

array([  1.66666667,   1.66666667,   3.        ,   3.        ,
       -26.        ])

In [73]:
# vector of row averages
np.mean(A,axis=1)

array([  5. , -16.4,   1.4])
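The `axis` argument works the same way for the other aggregations. A short sketch with `np.sum` and `np.argmax` on the same array:

```python
import numpy as np

A = np.array([[1, 2, 5, 8, 9], [4, 3, 4, -2, -91], [0, 0, 0, 3, 4]])

print(np.sum(A, axis=0))     # column sums, shape (5,)
print(np.sum(A, axis=1))     # row sums, shape (3,)
print(np.argmax(A, axis=1))  # for each row, the column index of its largest entry
```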

## Masks

In [76]:
A = np.zeros( (6,6) ) + [1,2,5,8,9,14]

In [77]:
A

array([[ 1.,  2.,  5.,  8.,  9., 14.],
       [ 1.,  2.,  5.,  8.,  9., 14.],
       [ 1.,  2.,  5.,  8.,  9., 14.],
       [ 1.,  2.,  5.,  8.,  9., 14.],
       [ 1.,  2.,  5.,  8.,  9., 14.],
       [ 1.,  2.,  5.,  8.,  9., 14.]])

In [78]:
A.T

array([[ 1.,  1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.,  2.],
       [ 5.,  5.,  5.,  5.,  5.,  5.],
       [ 8.,  8.,  8.,  8.,  8.,  8.],
       [ 9.,  9.,  9.,  9.,  9.,  9.],
       [14., 14., 14., 14., 14., 14.]])

In [81]:
B = A + A.T
B

array([[ 2.,  3.,  6.,  9., 10., 15.],
       [ 3.,  4.,  7., 10., 11., 16.],
       [ 6.,  7., 10., 13., 14., 19.],
       [ 9., 10., 13., 16., 17., 22.],
       [10., 11., 14., 17., 18., 23.],
       [15., 16., 19., 22., 23., 28.]])

In [84]:
C = B**2 - 4*B
C

array([[ -4.,  -3.,  12.,  45.,  60., 165.],
       [ -3.,   0.,  21.,  60.,  77., 192.],
       [ 12.,  21.,  60., 117., 140., 285.],
       [ 45.,  60., 117., 192., 221., 396.],
       [ 60.,  77., 140., 221., 252., 437.],
       [165., 192., 285., 396., 437., 672.]])

In [87]:
is_big = C > 100   # MASK
is_big

array([[False, False, False, False, False,  True],
       [False, False, False, False, False,  True],
       [False, False, False,  True,  True,  True],
       [False, False,  True,  True,  True,  True],
       [False, False,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True]])

In [88]:
C[is_big]  # 2d array indexed using a 2d array of booleans
# means: Give me all the values corresponding to "True" positions

array([165., 192., 117., 140., 285., 117., 192., 221., 396., 140., 221.,
       252., 437., 165., 192., 285., 396., 437., 672.])

In array `C`, find all the values larger than 100 and replace them with -50.

In [89]:
C[C>100] = -50  # expression of intent: C, where bigger than 100, is set equal to -50

In [90]:
C

array([[ -4.,  -3.,  12.,  45.,  60., -50.],
       [ -3.,   0.,  21.,  60.,  77., -50.],
       [ 12.,  21.,  60., -50., -50., -50.],
       [ 45.,  60., -50., -50., -50., -50.],
       [ 60.,  77., -50., -50., -50., -50.],
       [-50., -50., -50., -50., -50., -50.]])
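Two other things that can be done with a mask are counting the `True` entries and building a modified array without changing the original. The sketch below uses `np.where` for the second; this is an alternative to the in-place assignment above, and the name `D` is only for illustration.

```python
import numpy as np

A = np.zeros((6, 6)) + [1, 2, 5, 8, 9, 14]
B = A + A.T
C = B**2 - 4*B

is_big = C > 100
print(is_big.sum())   # number of entries of C larger than 100

# New array: -50 where the mask is True, the entry of C elsewhere
D = np.where(is_big, -50, C)
print(D.max())
print(C.max())        # C itself was not modified
```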

## Pillow integration

* `np.array(img)` just works, if `img` is a `PIL.Image` object
* Use `PIL.Image.fromarray(A)` to make an image from an array
    * Shape `(height,width)` and dtype `uint8` for grayscale
    * Shape `(height,width,3)` and dtype `uint8` for color (last axis is red, green, blue)

In [91]:
import PIL.Image

In [93]:
A = np.random.random((256,256))

In [94]:
A

array([[0.0134065 , 0.90105807, 0.91543535, ..., 0.02828057, 0.34336364,
        0.72064901],
       [0.72750154, 0.02836251, 0.22307256, ..., 0.97292326, 0.8461659 ,
        0.4078381 ],
       [0.6233079 , 0.87224472, 0.91842192, ..., 0.26463918, 0.34587583,
        0.07294757],
       ...,
       [0.30778629, 0.80968986, 0.06001571, ..., 0.85371841, 0.31115892,
        0.21749124],
       [0.84737652, 0.85970629, 0.10829529, ..., 0.88356236, 0.30798818,
        0.83837335],
       [0.67100848, 0.84164515, 0.99521068, ..., 0.92152041, 0.23771473,
        0.90253976]])

In [101]:
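D = (A*255).astype("uint8")  # scale the [0,1) floats to integers 0...254
D[D>64] = 255  # make every pixel brighter than 64 white, leaving sparse dark pixels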

In [103]:
PIL.Image.fromarray(D).save("sparse_noise.png")
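The example above makes a grayscale image. For a color image, the array needs shape `(height,width,3)` and dtype `uint8`. The sketch below builds such an array using only numpy. The size, the colors, and the name `img_data` are arbitrary choices for illustration, and the result is intended to be passed to `PIL.Image.fromarray` in the same way as `D`.

```python
import numpy as np

height, width = 64, 256

# Color image data: shape (height, width, 3), dtype uint8
img_data = np.zeros((height, width, 3), dtype="uint8")

# Red channel increases from 0 to 255 left to right; the (256,)
# array is broadcast across all 64 rows
img_data[:, :, 0] = np.arange(256, dtype="uint8")

# Blue channel is 255 in the bottom half of the image
img_data[height // 2:, :, 2] = 255

print(img_data.shape, img_data.dtype)
```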