# Data Structures: `NumPy`

In [None]:
from py5canvas import *

[docs](https://numpy.org/doc/stable/reference/index.html)  
[absolute basics for beginners](https://numpy.org/doc/stable/user/absolute_beginners.html)  
[quickstart](https://numpy.org/doc/stable/user/quickstart.html)  
[w<sup>3</sup>](https://realpython.com/numpy-tutorial/), [RealPython](https://realpython.com/numpy-tutorial/)

What is `NumPy`? It is a library built around **arrays**: grids of values of a single type, with any number of dimensions. It is useful for two reasons: it offers a large set of numerical operations, and those operations run in optimised compiled code, so working on whole arrays is often much faster than looping in native Python!
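
As a rough illustration of the "no loop needed" idea, the sketch below squares the same numbers once with a Python list comprehension and once with a single array expression. The variable names are made up for this example, and no timings are shown since they depend on the machine.

```python
import numpy as np

values = list(range(1_000))

# native Python: an explicit loop over every element
squares_loop = [v * v for v in values]

# NumPy: one expression applied to the whole array at once
squares_np = np.array(values) ** 2

# both approaches produce the same numbers
print(squares_np[:5].tolist())                   # [0, 1, 4, 9, 16]
print(np.array_equal(squares_np, squares_loop))  # True
```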

In [None]:
# the convention is to give the alias `np`
import numpy as np

## Array creation

[absolute basics: creation](https://numpy.org/doc/stable/user/absolute_beginners.html#how-to-create-a-basic-array)  
[quickstart: creation](https://numpy.org/doc/stable/user/quickstart.html#array-creation)

[`np.array` doc](https://numpy.org/doc/stable/reference/generated/numpy.array.html)  
[`np.ones` doc](https://numpy.org/doc/stable/reference/generated/numpy.ones.html)  
[`np.zeros` doc](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html)  
[`np.arange` doc](https://numpy.org/doc/stable/reference/generated/numpy.arange.html)  
[`np.linspace` doc](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html)  

In [None]:
python_list_of_lists = [[1,2,3],[4,5,6],[7,8,9]]

# turn a Python list into a numpy array
np_list_of_lists = np.array(python_list_of_lists)

# # same as:
# np_list_of_lists = np.array([[1,2,3],[4,5,6],[7,8,9]])

np_list_of_lists

In [None]:
# pass the shape as a tuple
np.ones(shape=(3,3))
# # same as:
# np.ones((3,3))

In [None]:
np.zeros((2,4))

In [None]:
np_range = np.arange(24)
np_range

In [None]:
# create 20 equally spaced numbers between -10 and 10
np.linspace(-10, 10, 20)
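
`np.arange` and `np.linspace` are easy to mix up, so here is a small side-by-side sketch (the variable names are only for illustration): `arange` takes a step and excludes the stop value, while `linspace` takes a number of points and, by default, includes the stop value.

```python
import numpy as np

# arange: fixed step, the stop value is excluded
stepped = np.arange(0, 10, 2)

# linspace: fixed number of points, the stop value is included by default
spaced = np.linspace(0, 10, 5)

print(stepped.tolist())  # [0, 2, 4, 6, 8]
print(spaced.tolist())   # [0.0, 2.5, 5.0, 7.5, 10.0]
```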

### Extra: py5canvas

In [None]:
canvas_size = (255, 255)
# the star means 'unpack': from `(255, 255)` to 255, 255 (not a tuple)
create_canvas(*canvas_size)

# the default dtype is float, and for that the expected pixel values are [0,1]
# (otherwise, if it's integers, the values are [0,255])
pixels = np.zeros(canvas_size)
# use the pixels as an image, place the top left corner at 0,0 
image(pixels, 0, 0)
show()

# TODO:
# - Try np.ones instead (blank canvas)
# - Try and make a smaller image, place it elsewhere than 0,0
# - Try and create two images
# - Try and modify the values inside pixels: 
#   `np.zeros(canvas_size) + .5` or
#   `np.ones(canvas_size) / 2`
#   What happens? Do you know why?

In [None]:
canvas_size = (255, 255)
create_canvas(*canvas_size)

# linspace: go from 1. (white) to 0 (black) incrementally, with
# as many steps as we have pixels
pixels = np.linspace(1., 0., canvas_size[0] * canvas_size[1])

# our single line of values can now be reshaped to be (255,255)
pixels = pixels.reshape(canvas_size)

image(pixels, 0, 0)
show()

# TODO: try `pixels.T` ?

## Array Shapes

[absolute basics: attributes](https://numpy.org/doc/stable/user/absolute_beginners.html#array-attributes)  
[quickstart: printing](https://numpy.org/doc/stable/user/quickstart.html#printing-arrays)

In [None]:
np_list_of_lists

In [None]:
# the shape is 3x3 (3 rows, 3 columns)
np_list_of_lists.shape
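
Besides `.shape`, a few other attributes describe an array. A minimal sketch, using a stand-in array called `grid` (a name chosen for this example) with the same values as above:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(grid.shape)  # (3, 3): the length of each axis
print(grid.ndim)   # 2: the number of axes
print(grid.size)   # 9: the total number of elements
print(grid.dtype)  # the element type (an integer type here, exact name depends on the platform)
```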

## Reshaping

[absolute basics: reshape](https://numpy.org/doc/stable/user/absolute_beginners.html#can-you-reshape-an-array)  
[quickstart: printing](https://numpy.org/doc/stable/user/quickstart.html#printing-arrays)

[`np.reshape` doc](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html)  
[`np.transpose` doc](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html), [`np.ndarray.transpose` doc](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.transpose.html)

In [None]:
np_range

In [None]:
np_range.shape

In [None]:
# pass a tuple describing the shape (or separate arguments)
# NOTE: it must be compatible with the number of elements! (2 x 12 = 24)
np_range.reshape((2,12))

# # same as
# np_range.reshape(2,12)

In [None]:
np_range.reshape((2,3,4))

In [None]:
np_range.reshape((4,3,2))

In [None]:
# `.T` transposes the array (rows become columns), same as `np.transpose(array)`
np_range.reshape((2,12)).T
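
Two details of `reshape` that are worth seeing in action, sketched on a fresh array named `numbers` (an illustrative name): one axis can be given as `-1` so that NumPy works out its length, and a shape that does not match the number of elements raises an error.

```python
import numpy as np

numbers = np.arange(24)

# -1 asks NumPy to work out that axis: 24 / 2 = 12 columns
inferred = numbers.reshape((2, -1))
print(inferred.shape)  # (2, 12)

# an incompatible shape raises a ValueError (5 x 5 = 25, not 24)
try:
    numbers.reshape((5, 5))
except ValueError as error:
    print("cannot reshape:", error)
```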

### Extra: py5canvas

In [None]:
canvas_size = (255, 255)
create_canvas(*canvas_size)

pixels = np.zeros((100, 20))

# # TODO: try transposition
# pixels = np.zeros((100, 20)).T

# use the pixels as an image, place the top left corner at 50,50
image(pixels, 50, 50)
show()

## Indexing & Slicing

[absolute basics: indexing and slicing](https://numpy.org/doc/stable/user/absolute_beginners.html#indexing-and-slicing)  
[quickstart: indexing, slicing and iterating](https://numpy.org/doc/stable/user/quickstart.html#indexing-slicing-and-iterating)

In [None]:
python_list_of_lists

In [None]:
np_list_of_lists

Selecting the first **row** works the same way in Python and NumPy. Selecting the first **column** is where they differ: plain Python needs a loop, whereas NumPy does it with a single index expression:

In [None]:
# row ok
print(python_list_of_lists[0])
print(np_list_of_lists[0])

In [None]:
# column uh oh
print([row[0] for row in python_list_of_lists])

# yay indexing! 
# - ':' means all elements in the first dimension (all rows)
# - '0' in the second dimension (first element of each row, i.e. column 0)
print(np_list_of_lists[:, 0])

In [None]:
new_range = np.arange(10*10).reshape((10,10))
print(new_range)

In [None]:
new_range[-5:,5:]

In [None]:
np_list_of_lists

In [None]:
# in each axis, slicing works exactly like in regular Python
print(np_list_of_lists[1:3, 1])

The `[]` syntax for selecting elements is the same as in Python, extended to take one index or slice per axis, separated by commas:

```python
      ← outer axes              inner axes →
array[first axis, second axis, ... last axis]
```

In [None]:
# the use of `...` means: take everything in all the remaining axes
print(np_list_of_lists[-1:, ...])
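
The `...` becomes more useful with more than two axes. A small sketch on a 3D array (the name `cube` is made up for this example):

```python
import numpy as np

cube = np.arange(24).reshape((2, 3, 4))

# `...` stands for as many `:` as needed to cover the remaining axes
first_block = cube[0, ...]    # same as cube[0, :, :]
last_of_each = cube[..., -1]  # same as cube[:, :, -1]

print(first_block.shape)   # (3, 4)
print(last_of_each.shape)  # (2, 3)
```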

### Extra: py5canvas

In [None]:
# load image returns a (PIL) Image object, values are int [0,255]
img = load_image("../../pics/Lijn.Neurograph-Untitled-II.png")

# resizing the image to half its size
img = img.resize((int(img.width/2), int(img.height/2)))

print("Image shape is now:", img.height, img.width)

# beware! the canvas uses the width/height order, not height/width :<
canvas_size = (img.width, img.height)

create_canvas(*canvas_size)

# to manipulate it using numpy, we can convert it
img = np.array(img)

print("Image np.array shape:", img.shape)

# then slice through it
crop_selection = img[110:220, 200:300, :]

# draw the full image first, then the cropped selection on top, both at 0,0
image(img, 0, 0)
image(crop_selection, 0, 0)
show()

# TODO:
# - Select another object and display it (perhaps don't display the full image,
# don't resize the image at the top of the cell, and show only the selected object?)
# - Unfortunately, since we have (height, width, channels), we can't do `.T` here,
# but it's possible to transpose the image nonetheless, using:
# `np.transpose(crop_selection, (1,0,2))` (this flips the 0th and 1st dimensions)
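
To see what the axes argument of `np.transpose` does without loading a picture, here is a sketch on a small stand-in array (`fake_image` is a hypothetical name, and its shape is chosen only so that the three axes are easy to tell apart):

```python
import numpy as np

# a stand-in for an image: height 2, width 4, 3 channels
fake_image = np.arange(2 * 4 * 3).reshape((2, 4, 3))

# swap height and width, keep the channels last
swapped = np.transpose(fake_image, (1, 0, 2))

print(fake_image.shape)  # (2, 4, 3)
print(swapped.shape)     # (4, 2, 3)

# `.T` reverses all the axes, so the channels would end up first
print(fake_image.T.shape)  # (3, 4, 2)
```

Each pixel keeps its channel values: `swapped[x, y]` holds the same three numbers as `fake_image[y, x]`.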

## Methods

[absolute basics: more useful operations](https://numpy.org/doc/stable/user/absolute_beginners.html#more-useful-array-operations)  
[quickstart: basic operations](https://numpy.org/doc/stable/user/quickstart.html#basic-operations)

Many, many available operations! Often they can be found both as methods (`array.max()`) and as functions (`np.max(array)`).

Examples:
- `.min() / .max()`
- `.sum() / .mean() / .std()`
- `np.exp() / np.log() / np.sqrt() / etc.` (functions only, there is no `array.exp()`)
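
A quick sketch of the two calling styles, on a stand-in array named `grid` (an illustrative name). Reductions such as `sum` and `mean` exist in both forms, while element-wise maths such as `np.sqrt` is called as a function.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# method and function forms give the same result
print(grid.sum(), np.sum(grid))    # 45 45
print(grid.mean(), np.mean(grid))  # 5.0 5.0

# element-wise maths is called as a function and returns a new array
roots = np.sqrt(grid)
print(roots.shape)  # (3, 3)
```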

In [None]:
np_list_of_lists

In [None]:
np_list_of_lists.max()

In [None]:
# max of each column (axis=0), try axis=1 for the max of each row
np_list_of_lists.max(axis=0)

In [None]:
# can also accept an `axis` argument, like most methods
np_list_of_lists.mean()

`.argmax` returns the **index** of the maximum rather than its value (likewise `.argmin` for the minimum).
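
A minimal sketch of `argmax`, on a small array whose values are picked for this example. Without an `axis`, the index refers to the flattened array; `np.unravel_index` converts it back to a (row, column) pair.

```python
import numpy as np

grid = np.array([[1, 9, 3], [4, 5, 6], [7, 8, 2]])

# index of the maximum in the flattened array
flat_index = grid.argmax()
print(flat_index)  # 1

# convert the flat index back to (row, column)
row, column = np.unravel_index(flat_index, grid.shape)
print(row, column)  # 0 1

# index of the maximum within each column
print(grid.argmax(axis=0).tolist())  # [2, 0, 1]
```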

## Extra: Random Numbers

[Random Sampling doc](https://numpy.org/doc/stable/reference/random/index.html#random-sampling)  
[RealPython tutorial](https://realpython.com/numpy-random-number-generator/)

In [None]:
seed = 42

# instantiate a random number generator object
np_rng = np.random.default_rng(seed)
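
Why bother with a seed? A sketch of the idea, with generator names made up for this example: two generators created from the same seed produce exactly the same sequence, which makes results reproducible.

```python
import numpy as np

# two generators built from the same seed produce the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draws_a = rng_a.integers(0, 10, size=5)
draws_b = rng_b.integers(0, 10, size=5)
print(np.array_equal(draws_a, draws_b))  # True

# without a seed, the generator is initialised from fresh entropy,
# so the numbers change from run to run
rng_c = np.random.default_rng()
print(rng_c.integers(0, 10, size=5))
```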

### Integers

[`np_rng.integers` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.integers.html)  
[`np.random.randint` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.randint.html) (legacy)

In [None]:
# create matrix of 2x2, integers from 0 to 10 [exclusive]
np_rng.integers(0, 10, (2,2))

# # old way:
# np.random.randint(0, 10, (2,2))

### Sequences

[`np_rng.choice` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.choice.html)  
[`np.random.choice` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.choice.html) (legacy)

In [None]:
# create matrix of 2x2, numbers sampled uniformly from the given list, with replacement
np_rng.choice([1,2,3,4,5,6], replace=True, size=(2,2))

# # old way:
# np.random.choice([1,2,3,4,5,6], replace=True, size=(2,2))

[`np_rng.shuffle` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.shuffle.html)  
[`np.random.shuffle` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.shuffle.html) (legacy)

In [None]:
# shuffle the elements in-place!
a = [1,2,3,4,5,6]
np_rng.shuffle(a)

# # old way:
# np.random.shuffle(a)
a

### Floats (distributions)

[`np_rng.random` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.random.html)  
[`np.random.rand` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.rand.html) (legacy)

In [None]:
# one number sampled from the uniform distribution in [0, 1) (1 is excluded)
np_rng.random()

In [None]:
# create matrix of 2x2, numbers sampled from the uniform distribution, in [0, 1)
np_rng.random(size=(2,2))

[`np_rng.uniform` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.uniform.html)   
[`np.random.uniform` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.uniform.html) (legacy)

In [None]:
# create matrix of 2x2, numbers from a uniform distribution in [2, 4)
np_rng.uniform(2, 4, size=(2,2))

# # old way
# np.random.uniform(2, 4, size=(2,2))

[`np_rng.normal` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.normal.html)  
[`np.random.randn` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.randn.html) (legacy)  
[`np.random.normal` doc](https://numpy.org/doc/stable/reference/random/generated/numpy.random.RandomState.normal.html) (legacy)

In [None]:
# one number sampled from a standard normal distribution
np_rng.normal()

In [None]:
# create matrix of 2x2, numbers from a gaussian of mu 3, sigma 10
# (`loc` is mu, `scale` is sigma)
np_rng.normal(loc=3, scale=10, size=(2,2))

# # old way, same shape, numbers from standard normal (mu = 0, sigma = 1)
# np.random.randn(2,2)

# # old way, same matrix, numbers from a normal with mu = 3, sigma = 10
# np.random.normal(loc=3, scale=10, size=(2,2))

### Extra: py5canvas

In [None]:
canvas_size = (255, 255)
create_canvas(*canvas_size)

pixels = np_rng.random(canvas_size)
# use the pixels as an image, place the top left corner at 0,0 
image(pixels, 0, 0)
show()

# TODO: want colour? You need the pixels in each channel to be different,
# and you can do that by creating not just a random array of (255,255), but
# by doing `np_rng.random((*canvas_size,3))`

Note: for random textures using e.g. Perlin noise (or similar) in `py5canvas`, use [`noise_grid`](https://github.com/colormotor/py5canvas/tree/main/docs#noise_grid) (using the library [`rumore`](https://github.com/colormotor/rumore) under the hood).

## Extra: Broadcasting

[doc](https://numpy.org/doc/stable/user/basics.broadcasting.html)  
[absolute basics: broadcasting](https://numpy.org/doc/stable/user/absolute_beginners.html#broadcasting)

Broadcasting is one of the great powers of `NumPy`. It allows you to operate on an entire array in one go (again eliminating the need for loops)!

In [None]:
# fails
python_list_of_lists + 1

In [None]:
[[x + 1 for x in row] for row in python_list_of_lists]
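
For comparison, here is the NumPy version as a short self-contained sketch (`python_grid` and `np_grid` are stand-in names for the list and array used above): the scalar is broadcast to every element, with no loop.

```python
import numpy as np

python_grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
np_grid = np.array(python_grid)

# the scalar 1 is broadcast to every element of the array
plus_one = np_grid + 1
print(plus_one)

# same result as the nested list comprehension
expected = [[x + 1 for x in row] for row in python_grid]
print(plus_one.tolist() == expected)  # True
```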


In [None]:
# it's also possible to combine arrays
np_row_vector = np.array([[5,10,15]])

print("shape:", np_row_vector.shape)
print(np_row_vector)
print()

# add the vector to every row
print("shape:", np_list_of_lists.shape)
print(np_list_of_lists + np_row_vector)

In [None]:

# `.T` transposes the vector, same as `np.transpose(np_row_vector)`
np_column_vector = np_row_vector.T
print("shape:", np_column_vector.shape)
print(np_column_vector)
print()

# add the vector to every column
print("shape:", np_list_of_lists.shape)
print(np_list_of_lists + np_column_vector)
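
The rule behind both examples, roughly: shapes are compared axis by axis, and an axis of length 1 is stretched to match the other array. The sketch below (with illustrative names) adds a row to a column to show the stretching on both sides, then shows a pair of shapes that cannot be matched.

```python
import numpy as np

row = np.array([[5, 10, 15]])  # shape (1, 3)
column = row.T                 # shape (3, 1)

# both length-1 axes are stretched: (1, 3) + (3, 1) -> (3, 3)
table = row + column
print(table.shape)  # (3, 3)
print(table)

# shapes that cannot be matched raise a ValueError: (3, 3) vs (2, 3)
try:
    np.ones((3, 3)) + np.ones((2, 3))
except ValueError as error:
    print("cannot broadcast:", error)
```

Each entry of `table` is `column[i] + row[j]`, so the result is a full grid of pairwise sums.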


Especially if you want to go on to study AI later, I highly, highly recommend that you learn this:


![Sasha Rush, broadcasting](pics/srush-broadcasting.png)
[Source](https://twitter.com/srush_nlp/status/1516781757596680194?t=RwVp5kUWPvHG-e42wo0ryw&s=19)

### Extra: py5canvas

In [None]:
# load image returns a (PIL) Image object, values are int [0,255]
img = load_image("../../pics/Lijn.Neurograph-Untitled-II.png")

# resizing the image to half its size
img = img.resize((int(img.width/2), int(img.height/2)))

canvas_size = (img.width, img.height)

create_canvas(*canvas_size)

# to manipulate it using numpy, we can convert it
img = np.array(img)

# turn the values to float, and from [0,255] to [0,1]
img = img / 255.

# modify each channel independently, by using broadcasting
# the '.2' will multiply all the red-channel pixels, reducing
# them by 80%, whilst green- and blue-channels are kept intact
img = img * [.2, 1., 1.]

# draw the tinted image with its top left corner at 0,0
image(img, 0, 0)
show()

# TODO: try other colour combinations by changing the array
# (good first test: keep one colour up, and the rest to almost nothing)
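
The channel trick can also be checked without a picture or a canvas. A minimal sketch on a tiny stand-in image (the name `tiny` and its size are assumptions for this example): the list of three factors is matched against the last axis, i.e. the channels.

```python
import numpy as np

# a tiny stand-in image: 2x2 pixels, 3 channels, all values 1.0 (white)
tiny = np.ones((2, 2, 3))

# shape (2, 2, 3) times shape (3,): the factors apply to each pixel's channels
tinted = tiny * [.2, 1., 1.]

print(tinted.shape)           # (2, 2, 3)
print(tinted[0, 0].tolist())  # [0.2, 1.0, 1.0]
```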
