## Introducing ```numpy``` and arrays

To begin processing image data, we need to understand what's going on behind the scenes.

We can do that using a library called ```numpy```, which stands for __Numerical Python__. 

In general, you should use this library when you want to do fancy mathematical operations with numbers, especially if you have arrays or matrices.

In [None]:
# tools for interacting with the operating system
import os

# scientific python tool for working with arrays and matrices
import numpy as np # creating an abbreviation to save keystrokes

In [None]:
filepath = os.path.join("..", "..", "CDS-VIS", "test_samples", "sample-data-01.csv")

In [None]:
np.loadtxt(fname=filepath, delimiter=',')

The expression ```numpy.loadtxt(...)``` is a function call that asks Python to run the function ```loadtxt``` which belongs to the ```numpy``` library. This dotted notation is used everywhere in Python: the thing that appears before the dot contains the thing that appears after.


Here we pass two arguments to ```numpy.loadtxt```: the name of the file we want to read and the delimiter that separates values on a line. The delimiter is a character string (or string for short), so we put it in quotes; the file path is also a string, built earlier with ```os.path.join```.
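
To see what the delimiter does without depending on the course data, here is a minimal sketch that reads comma-separated text from an in-memory buffer. The sample values and the name `sample_text` are made up for illustration, and `io.StringIO` simply stands in for a file on disk.

```python
import io
import numpy as np

# Hypothetical two-line CSV content, standing in for a real file
sample_text = "0,1,2\n3,4,5\n"

# loadtxt accepts a file-like object as well as a file path
sample = np.loadtxt(io.StringIO(sample_text), delimiter=",")

print(sample)        # values are read as floats by default
print(sample.shape)  # two rows, three columns
```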

__Assign to variable__

In [None]:
data = np.loadtxt(fname=filepath, delimiter=',') # comma-separated

In [None]:
print(data)

In [None]:
print(type(data))

__numpy.ndarray__ tells us that we are working with an N-dimensional array

In this case, it's 2-dimensional

In [None]:
print(data.dtype)

In [None]:
print(data.shape)

In [None]:
60*40
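
The multiplication above gives the total number of values, assuming the array has 60 rows and 40 columns. As a sketch, the same number is available directly from the `size` attribute. The array `demo` below is a placeholder filled with zeros, not the course data.

```python
import numpy as np

# Placeholder array with the assumed shape (60 rows, 40 columns)
demo = np.zeros((60, 40))

print(demo.ndim)   # number of dimensions
print(demo.shape)  # (rows, columns)
print(demo.size)   # total number of elements, rows * columns
```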

__Index__

Indexing is similar to lists and strings, but we need to include both row and column.

In [None]:
first_value = data[0, 0]

In [None]:
print(f"First value in data: {first_value}")

__Question:__ What is the middle value of the array?

In [None]:
middle_value = data[30, 20]

In [None]:
print(f"Middle value in data: {middle_value}")

<img src="../../CDS-VIS/test_samples/python-zero-index.svg">

__Slice__

An index like [30, 20] selects a single element of an array, but we can select whole sections as well. 

For example, we can select the first ten columns of values for the first four rows like this:

In [None]:
print(data[0:4, 0:10])

First ten columns, rows with index 5 to 9 (the sixth to tenth rows)

In [None]:
print(data[5:10, 0:10])
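
Slices include the start index and exclude the stop index, which is easy to forget. A small sketch on a made-up array (`grid` is a hypothetical name, not part of the course data) makes the bounds visible:

```python
import numpy as np

# Small hypothetical array: 6 rows, 4 columns, values 0..23
grid = np.arange(24).reshape(6, 4)

# Rows with index 1 and 2 (3 is excluded), columns with index 0 and 1
block = grid[1:3, 0:2]

print(block)
print(block.shape)  # (2, 2)
```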

__Select only one row__

In [None]:
data[0,:]

__Select only one column__

In [None]:
data[:,0]
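
Selecting a single row or column with an integer index drops that dimension, so the result is a 1-dimensional array. The sketch below uses a made-up array called `grid` to show the resulting shapes.

```python
import numpy as np

# Small hypothetical array: 6 rows, 4 columns, values 0..23
grid = np.arange(24).reshape(6, 4)

first_row = grid[0, :]     # all columns of row 0
first_column = grid[:, 0]  # all rows of column 0

print(first_row.shape)     # (4,)
print(first_column.shape)  # (6,)
```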

__Numpy functions__

In [None]:
np.mean(data)

In [None]:
max_value, min_value, std_dev = np.max(data), np.min(data), np.std(data)

In [None]:
print(f"Maximum: {max_value}")
print(f"Minimum: {min_value}")
print(f"Standard deviation: {std_dev}")

Typing ```np.``` and pressing Tab shows the full range of options. ```help()``` shows the documentation for a function.

In [None]:
help(np.count_nonzero)

__Operation across rows__

In [None]:
print(np.mean(data, axis=0))

In [None]:
print(np.mean(data, axis=0).shape)

"Average score per day"

__Operation along columns__

"Average score per patient"

<img src="../../CDS-VIS/test_samples/numpy-axes.png">

In [None]:
print(np.mean(data, axis=1))

In [None]:
print(np.mean(data, axis=1).shape)
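
The difference between the two axes is easier to check by hand on a tiny array. This sketch assumes, as the captions above suggest, that rows are patients and columns are days. The values in `scores` are made up.

```python
import numpy as np

# Hypothetical scores: 2 patients (rows) over 3 days (columns)
scores = np.array([[1.0, 2.0, 3.0],
                   [3.0, 4.0, 5.0]])

per_day = np.mean(scores, axis=0)      # collapses the rows: one mean per column
per_patient = np.mean(scores, axis=1)  # collapses the columns: one mean per row

print(per_day)      # [2. 3. 4.]
print(per_patient)  # [2. 4.]
```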

This is a good overview to show how things work with ```numpy```:

https://www.sharpsightlabs.com/blog/numpy-axes-explained/

## Basic image processing with OpenCV

We start by loading all of the modules we'll need for this class

In [None]:
# We need to include the home directory in our path, so we can read in our own module.
import sys
sys.path.append(os.path.join("..", "..", "CDS-VIS"))

In [None]:
# python framework for working with images
import cv2

# some utility functions for plotting images
from utils.imutils import jimshow

__Read image__

We can load an image using a handy function from OpenCV

In [None]:
path_to_image = os.path.join("..", "..", "CDS-VIS", "img", "trex.png")

In [None]:
image = cv2.imread(path_to_image)

In [None]:
jimshow(image, "Image")

__Save image__

In [None]:
filepath, _ = os.path.split(path_to_image)

In [None]:
outfile = os.path.join(filepath, "new_dino.jpg")

In [None]:
cv2.imwrite(outfile, image)
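
The output path above relies on `os.path.split`, which separates a path into the folder part and the final component. A minimal sketch of what it returns, using the same relative path as the notebook (the names `folder` and `filename` are chosen here for illustration):

```python
import os

# Same relative path as in the notebook, built in a platform-independent way
path_to_image = os.path.join("..", "..", "CDS-VIS", "img", "trex.png")

# split returns (everything before the last separator, the final component)
folder, filename = os.path.split(path_to_image)

print(folder)
print(filename)  # trex.png
```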

__Inspect image__

In [None]:
print(type(image))

In [None]:
print(image.shape)

## What is an image?

__Remember how ```numpy``` arrays work!__

ROWSxCOLUMNS == HEIGHTxWIDTH

In [None]:
height = image.shape[0]
width = image.shape[1]

In [None]:
print(f"[INFO] Image height: {height} pixels")
print(f"[INFO] Image width: {width} pixels")

In our image, there are 228*350 = 79,800 pixels

__What about the last value in ```image.shape```?__

In [None]:
image.shape[2]

<img src="../../CDS-VIS/test_samples/3-channels.png">

__NB!__

```OpenCV``` stores colour channels in REVERSE ORDER: blue, green, red (BGR) rather than RGB
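
As a sketch of what the channel order means in practice, the example below builds a tiny image with ```numpy``` instead of loading one with OpenCV, so the array `bgr` is a made-up stand-in. Reversing the last axis is one common way to get the channels in RGB order for tools that expect it.

```python
import numpy as np

# Hypothetical 2x2 image with 3 channels, all black, 8-bit like a typical OpenCV image
bgr = np.zeros((2, 2, 3), dtype=np.uint8)

# In BGR order, (0, 0, 255) means no blue, no green, full red
bgr[0, 0] = (0, 0, 255)

# Reversing the last axis gives the same pixels in RGB order
rgb = bgr[:, :, ::-1]

print(bgr[0, 0])  # red is the last value
print(rgb[0, 0])  # red is the first value
```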

__What colour is a specific pixel?__

In [None]:
b, g, r = image[0, 0]

In [None]:
print(f"[INFO] pixel at (0, 0) - Red: {r}, Green: {g}, Blue: {b}")

__Modify colour__

In [None]:
image[0, 0] = (0, 0, 255) # remember - blue, green, red!
(b, g, r) = image[0, 0]

In [None]:
print(f"[INFO] pixel at (0, 0) - Red: {r}, Green: {g}, Blue: {b}")

__Image slice__

In [None]:
corner = image[0:100, 0:100]

In [None]:
jimshow(corner, "Corner")

__Change corner colour__

In [None]:
image[0:100,0:100] = (0, 0, 255)

In [None]:
jimshow(image, "Update")
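
One thing worth knowing about slices like the corner taken earlier: a basic ```numpy``` slice is a view onto the same memory, not a copy, so later changes to the image also show up in the slice. The sketch below demonstrates this on a small made-up array (`canvas`, `view` and `snapshot` are hypothetical names). Calling `.copy()` is one way to keep an independent version.

```python
import numpy as np

# Hypothetical 4x4 black image with 3 channels
canvas = np.zeros((4, 4, 3), dtype=np.uint8)

view = canvas[0:2, 0:2]             # a view onto the same data
snapshot = canvas[0:2, 0:2].copy()  # an independent copy

# Paint the corner red (BGR order); the tuple is broadcast to every pixel in the slice
canvas[0:2, 0:2] = (0, 0, 255)

print(np.shares_memory(view, canvas))  # True
print(view[0, 0])      # reflects the change made through canvas
print(snapshot[0, 0])  # still black
```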