# Demo of some `numpy` features

## MCS 275 Spring 2024 - David Dumas

This is a quick tour of some `numpy` features.  For more detail see:
* [Chapter 2 of VanderPlas](https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html)
* [The numpy documentation](https://numpy.org/doc/stable/)

## Importing the module

And checking the version.

In [2]:
import numpy as np
np.__version__

'1.21.5'

## Creating arrays

Arrays are iterable and type-homogeneous (every entry has the same data type).  You can make one from any suitable iterable, such as a list or a list of lists.

[List of built-in dtypes](https://numpy.org/doc/stable/reference/arrays.scalars.html#arrays-scalars-built-in).

In [3]:
# `np.array` will convert from an iterable
x = np.array([2,4,8,16,32])

In [8]:
# List of lists -> 4 row, 3 column matrix (2 dimensional array)
A = np.array([[1,2,3],[4,5,6],[7,8,9],[0,2,0]])

In [11]:
# nice display
print(x)
print()
print(A)

[ 2  4  8 16 32]

[[1 2 3]
 [4 5 6]
 [7 8 9]
 [0 2 0]]


In [13]:
# ndarray class
type(x)

numpy.ndarray

Check number of dimensions

In [6]:
x.ndim # how many dimensions?

1

In [14]:
A.ndim

2

Check shape (size in each dimension)

In [7]:
x.shape # size in each dimension, as a tuple

(5,)

In [15]:
A.shape

(4, 3)

Check "length" (first element of shape)

In [16]:
len(x) # number of items in the vector

5

In [17]:
len(A) # number of rows in the matrix = A.shape[0]

4

In general, `len(m)` means `m.shape[0]` if `m` is a numpy array.

Check data type

In [18]:
x.dtype # int64 means (signed) integer, 64 bits

dtype('int64')

The data type is typically inferred, but it can also be specified (a potentially lossy process)

In [19]:
# Given a mix of integers and floats, numpy
# will choose a floating point dtype
y = np.array([5,6,7,7.289])

In [20]:
y

array([5.   , 6.   , 7.   , 7.289])

In [21]:
y.dtype # float64 means float, 64 bits (double)

dtype('float64')

In [22]:
# Assign only to y_force_int, so that y keeps its float values
y_force_int = np.array([5,6,7,7.289], dtype="int")

In [25]:
# Notice we lost precision by specifying dtype int
y_force_int

array([5, 6, 7, 7])

In [24]:
# Notice numpy chose a precise type compatible with
# the request "int" (int64 here; the exact choice can
# depend on the platform and numpy version)
y_force_int.dtype

dtype('int64')

In [26]:
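# uint8 means UNSIGNED integer, 8 bits
# UNSIGNED = only 0 and positive values
# range is 0...255
# Out-of-range values wrap around silently in the numpy version
# used here (1.21.5); numpy 2.0 and later raise OverflowError instead.
z = np.array([1,-1,2,100,300,500,800,16384], dtype="uint8")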

In [27]:
z

array([  1, 255,   2, 100,  44, 244,  32,   0], dtype=uint8)

In [28]:
# Why did 300 appear as 44 in the array above?
300 % 256

44
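
The wraparound above can also be reproduced in a way that does not depend on how `np.array` treats out-of-range Python integers. This is a minimal sketch, assuming the values are first stored in a wider integer array and then converted with `astype`; the names `wide` and `info` are only for illustration.

```python
import numpy as np

# Limits of the uint8 type
info = np.iinfo(np.uint8)
print(info.min, info.max)   # 0 255

# Store the values in a wide integer type first ...
wide = np.array([1, -1, 2, 100, 300, 500, 800, 16384], dtype="int64")

# ... then convert.  astype keeps only the lowest 8 bits of each value,
# which is the same as reducing modulo 256.
z = wide.astype("uint8")
print(z)                    # [  1 255   2 100  44 244  32   0]

# Same values computed with explicit modular arithmetic
print(wide % 256)
```

Checking `np.iinfo` before choosing a small integer dtype is a cheap way to see whether the data will fit.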

## Filled arrays

Can fill with zeros, ones, or make an array full of a general value.

In [29]:
# Filled with zeros
np.zeros( (3,12), dtype="int64")

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [30]:
# Filled with ones
np.ones( (6,2), dtype="float64")

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])

In [31]:
# Filled with one value
np.full( (7,4), 42, dtype="uint8" )

array([[42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42],
       [42, 42, 42, 42]], dtype=uint8)

You can also ask for an array filled with random values (floats uniformly distributed between 0 and 1, possibly 0 but never exactly 1).  Note that `np.random` is a submodule; the function you want is `np.random.random(...)`.

In [32]:
# Filled with random numbers between 0 and 1
np.random.random( (4,5) )  # argument is the shape

array([[0.84985369, 0.52904623, 0.06757972, 0.60749338, 0.44579558],
       [0.94276045, 0.22398809, 0.99290933, 0.9505469 , 0.65531365],
       [0.0944089 , 0.67245052, 0.25851387, 0.81402906, 0.02156303],
       [0.16531779, 0.87858936, 0.60636781, 0.24837322, 0.67221406]])
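
Random arrays differ on every run, which can make a demo hard to repeat. One option, sketched here as an alternative to the call above rather than as the approach used in this notebook, is a seeded generator from `np.random.default_rng`. The seed `275` and the names `rng`, `R`, and `R_again` are arbitrary choices.

```python
import numpy as np

# A generator object with a fixed seed (275 is an arbitrary choice)
rng = np.random.default_rng(275)

# Same idea as np.random.random: uniform floats, at least 0 and
# less than 1; the argument is the shape
R = rng.random((4, 5))
print(R.shape)   # (4, 5)

# Re-creating the generator with the same seed reproduces the same values
R_again = np.random.default_rng(275).random((4, 5))
print(np.array_equal(R, R_again))   # True
```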

## Special things about 2D arrays

Identity (eye-dentity) matrix

Transpose
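
A short sketch of both features, assuming `np.eye` for the identity matrix and the `.T` attribute for the transpose. The matrix `A` repeats the 4 row, 3 column example from earlier, and `I3` is a name chosen for illustration.

```python
import numpy as np

# 3x3 identity matrix (float64 by default)
I3 = np.eye(3)
print(I3)

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 2, 0]])

# Transpose: rows become columns, so shape (4, 3) becomes (3, 4)
print(A.T)
print(A.T.shape)   # (3, 4)

# Multiplying by the identity (matrix product, operator @) changes nothing
print(np.array_equal(A @ I3, A))   # True
```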

## Vector algebra

In [33]:
# two 3-dimensional vectors
v = np.array([1,2,5])
w = np.array([4,-8,0])

Dot product (and vector length)

In [34]:
v.dot(w) # dot product
#   1*4 + 2*(-8) + 5*0

-12

In [35]:
v.dot(v)**0.5 # length

5.477225575051661

Scalar multiplication

In [36]:
1.8 * v # scalar multiplication

array([1.8, 3.6, 9. ])

Elementwise sum

In [37]:
v+w # elementwise sum

array([ 5, -6,  5])

Elementwise product (?!)

In [38]:
v*w # elementwise product

array([  4, -16,   0])

## Arithmetic progressions

* `np.arange` takes `start`, `stop`, `step`
* `np.linspace` takes `first`, `last`, `number`

In [39]:
# Recall how you get a list of integer values
# in arithmetic progression using built-in stuff
list(range(3,20,2))

[3, 5, 7, 9, 11, 13, 15, 17, 19]

The similarly named `arange` from `numpy` does the same thing, but it returns an array and also accepts non-integer values.

In [40]:
# From 2 up to but not including 3 in steps of size 0.1
np.arange(2,3,0.1)   # start, stop (not included), step

array([2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])

When you know how many points you want, rather than the spacing, it's better to use `np.linspace`.  It takes the first and last elements, then the total number of evenly spaced points you want (both endpoints are included in the count).

In [41]:
# From 12 to 14 using 6 points (that is, 5 steps of size 0.4)
np.linspace( 12, 14, 6 )   # first, last, number of elements

array([12. , 12.4, 12.8, 13.2, 13.6, 14. ])

## Accessing items

Zero-based indexing.  For multi-dimensional arrays, give several integer indices separated by commas.

In [47]:
v = np.arange(8,24,3)
v

array([ 8, 11, 14, 17, 20, 23])

In [44]:
A = np.array([[1,2,3],[4,5,6],[7,8,9],[0,-2,16]])
A

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [ 0, -2, 16]])

### Vector indexing: Just like lists

In [48]:
v[0]

8

In [49]:
v[4]

20

In [51]:
v[-1]

23

### Multidimensional indexing: use a tuple of indices

For matrices, it's `[row, col]`

In [53]:
print(A)
print()
print(A[0,1])  # row 0 column 1

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [ 0 -2 16]]

2


Omitted indices at the end mean "everything from those dimensions"

In [54]:
A[2] # Row 2

array([7, 8, 9])

Using `:` as an index means "everything from that dimension"

In [56]:
print(A)
print()
# All rows, column 1; that is, get column 1 as a vector
print(A[:,1])  

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [ 0 -2 16]]

[ 2  5  8 -2]


## Assigning items

**`numpy` arrays are mutable** 😱
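
A minimal sketch of item assignment, reusing the arrays `v` and `A` from the indexing examples above. The values being assigned are arbitrary.

```python
import numpy as np

v = np.arange(8, 24, 3)          # [ 8 11 14 17 20 23]
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, -2, 16]])

v[0] = 100                       # change one entry of a vector
A[1, 2] = -6                     # row 1, column 2
A[3] = [10, 20, 30]              # replace all of row 3

print(v)                         # [100  11  14  17  20  23]
print(A)

# The dtype does not change on assignment: a float stored into an
# integer array is truncated toward zero.
v[1] = 2.9
print(v[1])                      # 2
```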

## Slices

Can combine slice notation with multiple indices.

Slices return **views**, not copies.
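
The following sketch shows slices on a vector and a matrix, and then what "view" means in practice. It reuses `v` and `A` from the indexing examples; `s` and `c` are names chosen for illustration.

```python
import numpy as np

v = np.arange(8, 24, 3)          # [ 8 11 14 17 20 23]
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, -2, 16]])

print(v[1:4])                    # [11 14 17]
print(v[::2])                    # [ 8 14 20]
print(A[1:3, :2])                # rows 1-2, columns 0-1

# A slice is a view: it shares memory with the original array
s = v[1:4]
s[0] = -1
print(v)                         # [ 8 -1 14 17 20 23]
print(np.shares_memory(s, v))    # True

# Use .copy() for an independent array
c = v[1:4].copy()
c[0] = 999
print(v[1])                      # -1 (unchanged by the edit to c)
```

This is a difference from lists, where a slice such as `L[1:4]` is always a new list.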

## Equality and bool

`.all()` checks if an array of booleans is all `True`.
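
A small sketch of how `==` behaves on arrays and why `.all()` is needed. The arrays `a`, `b`, and `c` are made up for the example.

```python
import numpy as np

a = np.array([1, 2, 5])
b = np.array([1, 2, 5])
c = np.array([1, 0, 5])

# == compares entry by entry and returns an array of booleans
print(a == c)            # [ True False  True]
print((a == b).all())    # True
print((a == c).all())    # False
print((a == c).any())    # True

# `if a == c:` raises ValueError because the truth value of a
# multi-element array is ambiguous; ask for .all() or .any() instead.
try:
    bool(a == c)
except ValueError as err:
    print("ValueError:", err)
```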

## Ufuncs

Functions that automatically apply to each entry in an array.

### Some arrays to operate on

### Examples of numpy ufuncs

Let $f(x) = 3x^2 - 8x + 14$.  Apply $f$ to each element of array `v`.
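
One way this could look is sketched below. The arrays `v` and `t` are placeholders chosen for the example (the arrays used in lecture may differ), and `f` is written as an ordinary Python function that works on arrays because each arithmetic operator is applied entry by entry.

```python
import numpy as np

v = np.array([1, 2, 5])
t = np.linspace(0, 1, 5)      # [0.   0.25 0.5  0.75 1.  ]

# Built-in ufuncs act on every entry
print(np.sqrt(v))
print(np.exp(t))
print(np.maximum(v, 3))       # [3 3 5]

# Arithmetic operators are ufuncs too, so a polynomial can be
# written exactly as in the formula
def f(x):
    """Return 3x^2 - 8x + 14; works on numbers and on arrays."""
    return 3 * x**2 - 8 * x + 14

print(f(v))                   # [ 9 10 49]
```

No loop is needed: `f(v)` computes all three values in one call.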

## Broadcasting
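
Broadcasting is the rule numpy uses to combine arrays of different shapes: shapes are compared starting from the last axis, and an axis of size 1 (or a missing axis) is stretched to match. A minimal sketch, with `row` and `col` as made-up names and `A` taken from the indexing examples:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, -2, 16]])   # shape (4, 3)
row = np.array([10, 20, 30])                                    # shape (3,)
col = np.array([[1], [2], [3], [4]])                            # shape (4, 1)

# A scalar is stretched to every entry
print(A + 100)

# Shape (3,) is treated as (1, 3) and repeated down the rows
print(A + row)

# Shape (4, 1) is repeated across the columns
print(A * col)

# (4, 1) with (3,) broadcasts to (4, 3)
print((col + row).shape)    # (4, 3)

# Shapes that cannot be matched raise ValueError
try:
    A + np.array([1, 2, 3, 4])      # (4, 3) with (4,): last axes 3 vs 4
except ValueError as err:
    print("ValueError:", err)
```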

## Aggregations

`sum`, `max`, `min`, `argmax`, `argmin`, `mean`, `all`, `any`, `array_equal`
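
A quick sketch of these on the matrix `A` from the indexing examples. Most are available as array methods; `array_equal` is called as `np.array_equal`.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, -2, 16]])

# Whole-array aggregations return a single value
print(A.sum())                 # 59
print(A.max(), A.min())        # 16 -2
print(A.mean())                # 59 / 12

# argmax / argmin give the position in the flattened array
print(A.argmax(), A.argmin())  # 11 10

# axis=0 collapses the rows (one result per column);
# axis=1 collapses the columns (one result per row)
print(A.sum(axis=0))           # [12 13 34]
print(A.sum(axis=1))           # [ 6 15 24 14]

# all / any work on boolean arrays
print((A >= 0).all())          # False, because of the -2
print((A < 0).any())           # True

# array_equal is a function in the numpy module, not a method
print(np.array_equal(A, A.copy()))   # True
```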

## Masks
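
A mask is an array of booleans used as an index: it selects the entries where the mask is `True`. A minimal sketch, reusing `v` from the indexing examples; `mask` and `w` are names chosen for illustration.

```python
import numpy as np

v = np.arange(8, 24, 3)       # [ 8 11 14 17 20 23]

mask = v % 2 == 0             # boolean array, same shape as v
print(mask)                   # [ True False  True False  True False]

print(v[mask])                # [ 8 14 20]
print(mask.sum())             # 3 (True counts as 1)

# Assignment through a mask changes only the selected entries
v[v > 15] = 0
print(v)                      # [ 8 11 14  0  0  0]

# Combine conditions with & (and), | (or), ~ (not); parentheses are needed
w = np.arange(10)
print(w[(w > 2) & (w < 7)])   # [3 4 5 6]
```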

## Pillow integration

* `np.array(img)` just works, if `img` is a `PIL.Image` object
* Use `PIL.Image.fromarray(A)` to make an image from an array
    * Shape `(height,width)` and dtype `uint8` for grayscale
    * Shape `(height,width,3)` and dtype `uint8` for color (last axis is red, green, blue)
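
The round trip described in the list above can be sketched as follows, assuming Pillow is installed. The image dimensions and pixel values are arbitrary, and nothing is written to disk.

```python
import numpy as np
from PIL import Image

# Grayscale: shape (height, width), dtype uint8.  A horizontal gradient.
height, width = 64, 256
gray = np.zeros((height, width), dtype="uint8")
gray[:, :] = np.arange(width, dtype="uint8")   # each row is 0, 1, ..., 255
img_gray = Image.fromarray(gray)
print(img_gray.mode, img_gray.size)            # L (256, 64)

# Color: shape (height, width, 3); last axis is red, green, blue
color = np.zeros((height, width, 3), dtype="uint8")
color[:, :, 0] = 255                           # full red channel
color[:, :, 2] = gray                          # blue increases left to right
img_color = Image.fromarray(color)
print(img_color.mode, img_color.size)          # RGB (256, 64)

# Back to an array
back = np.array(img_color)
print(back.shape, back.dtype)                  # (64, 256, 3) uint8
print(np.array_equal(back, color))             # True
```

Note that Pillow reports `size` as `(width, height)`, the reverse of the array's `(height, width)` shape.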