## Introducing ```numpy``` and arrays

To begin processing image data, we need to understand what's going on behind the scenes.

We can do that using a library called ```numpy```, which stands for __Numerical Python__. 

In general, you should use this library when you want to do fancy mathematical operations with numbers, especially if you have arrays or matrices.

In [1]:
# tools for interacting with the operating system
import os

# tool for working with arrays
# creating an abbreviation to save keystrokes
import numpy as np

In [3]:
# load data
data = np.loadtxt(fname="../data/sample-data/sample-data-01.csv", delimiter=",")

In [None]:
# show array

The expression ```numpy.loadtxt(...)``` (written ```np.loadtxt(...)``` above, using our abbreviation) is a function call that asks Python to run the function ```loadtxt```, which belongs to the ```numpy``` library. This dotted notation is used everywhere in Python: the thing that appears before the dot contains the thing that appears after.


Here we pass ```numpy.loadtxt``` two arguments: the name of the file we want to read and the delimiter that separates values on a line. Both need to be character strings (or strings for short), so we put them in quotes.
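As a minimal sketch of how the delimiter argument works, the example below parses a small made-up block of comma-separated text instead of the real sample file. The in-memory text and the names `csv_text` and `toy` are assumptions for illustration; `np.loadtxt` accepts a file-like object as well as a file name.

```python
import io

import numpy as np

# Hypothetical CSV content standing in for a file on disk
csv_text = "0,1,2\n3,4,5\n"

# loadtxt splits each line on the delimiter and converts the values to floats
toy = np.loadtxt(io.StringIO(csv_text), delimiter=",")

print(toy)
print(toy.shape)
```

Changing `delimiter` to another string, such as `";"`, would be the way to read files that separate values differently.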

__Assign to variable__

In [None]:
# load array

In [4]:
# inspect array
type(data)

numpy.ndarray

In [None]:
# print data type

__numpy.ndarray__ tells us that we are working with an N-dimensional array

In this case, it's 2-dimensional

In [5]:
# print type of data points
data.dtype

dtype('float64')

In [6]:
# print shape
data.shape

(60, 40)
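The same attributes can be checked on a small, made-up array where the answers are easy to verify by eye. This is only a sketch; the name `toy` and its values are hypothetical.

```python
import numpy as np

# A hypothetical 2-dimensional array with 2 rows and 3 columns
toy = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

print(type(toy))   # the container type
print(toy.ndim)    # number of dimensions
print(toy.dtype)   # type of the individual values
print(toy.shape)   # (rows, columns)
```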

In [None]:
# check shape


__Index__

Indexing works much as it does for lists and strings, but we need to include both a row and a column index.

In [7]:
# your code here
data[0,0]

0.0
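A sketch of indexing on a small made-up array, where every position can be checked by hand. The array `toy` is hypothetical, and using `shape // 2` to find "the middle" is an assumption: when a dimension has an even length there is no single central element, so this is only one reasonable convention.

```python
import numpy as np

# Hypothetical 3x5 array holding the values 0..14, filled row by row
toy = np.arange(15).reshape(3, 5)

# Indexing takes the row first, then the column, both counted from zero
first = toy[0, 0]
last = toy[-1, -1]

# One convention for the middle: integer-divide each dimension by two
n_rows, n_cols = toy.shape
middle = toy[n_rows // 2, n_cols // 2]

print(first, last, middle)
```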

In [None]:
# your code here

__Question:__ What is the middle value of the array?

In [None]:
# your code here

Print the value of ```middle_value``` to the screen:

In [None]:
# your code here

<img src="../data/viz/python-zero-index.svg">

__Slice__

An index like [30, 20] selects a single element of an array, but we can select whole sections as well. 

For example, we can select the first ten columns of values for the first four rows like this:

In [10]:
# your code here
data[0:4, 0:10]

array([[0., 0., 1., 3., 1., 2., 4., 7., 8., 3.],
       [0., 1., 2., 1., 2., 1., 3., 2., 2., 6.],
       [0., 1., 1., 3., 3., 2., 6., 2., 5., 9.],
       [0., 0., 2., 0., 4., 2., 2., 1., 6., 7.]])

First ten columns, rows five-ten
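Slices follow the same `start:stop` rule as lists: the start is included and the stop is left out. A minimal sketch on a made-up array (`toy` is hypothetical), including one way to read "rows five to ten" as the zero-based indices 4 through 9:

```python
import numpy as np

# Hypothetical 12x12 array holding the values 0..143, filled row by row
toy = np.arange(144).reshape(12, 12)

# Rows 0-3 and columns 0-9: the stop index is not included
top_left = toy[0:4, 0:10]

# Rows five to ten (indices 4..9), first ten columns
rows_five_to_ten = toy[4:10, 0:10]

print(top_left.shape)
print(rows_five_to_ten.shape)
```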

In [None]:
# your code here

__Select only one row__

In [13]:
# your code here
data[0,:]

array([ 0.,  0.,  1.,  3.,  1.,  2.,  4.,  7.,  8.,  3.,  3.,  3., 10.,
        5.,  7.,  4.,  7.,  7., 12., 18.,  6., 13., 11., 11.,  7.,  7.,
        4.,  6.,  8.,  8.,  4.,  4.,  5.,  7.,  3.,  4.,  2.,  3.,  0.,
        0.])

__Select only one column__

In [15]:
# your code here
data[:,0]

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0.])

__Numpy functions__

```numpy``` comes with a range of built-in functions which allow you to quickly and efficiently calculate descriptive statistics for an array.

In [16]:
# your code here
np.std(data)

4.613833197118566

In [17]:
# your code here
np.max(data)

20.0

In [18]:
# your code here
np.min(data)

0.0

In [19]:
mean_value, max_value, min_value = np.mean(data), np.max(data), np.min(data)

Show numpy + dot + tab, access full range of options. Show ```help()```

In [None]:
# your code here


__Operation across rows__

In [None]:
# your code here

In [None]:
# your code here

"Average score per day"

__Operation along columns__

"Average score per patient"

<img src="../data/viz/numpy-axes.png">
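A sketch of how the `axis` argument behaves, assuming (as the exercise wording suggests) that each row is a patient and each column is a day. The toy values are made up.

```python
import numpy as np

# Hypothetical scores: 2 patients (rows) observed over 3 days (columns)
toy = np.array([[1.0, 2.0, 3.0],
                [3.0, 4.0, 5.0]])

# axis=0 collapses the rows: one value per column (average per day)
per_day = np.mean(toy, axis=0)

# axis=1 collapses the columns: one value per row (average per patient)
per_patient = np.mean(toy, axis=1)

print(per_day)
print(per_patient)
```

A useful check is the length of the result: it matches the size of the dimension that was not collapsed.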

In [None]:
# your code here

In [None]:
# your code here

This is a good overview to show how things work with ```numpy```:

https://www.sharpsightlabs.com/blog/numpy-axes-explained/

## Exercise

- We saw how to calculate descriptive statistics for a single array. In the data folder, there are more examples of sample data in the folder called [data/sample-data](../data/sample-data).
  - Write some code which does the following steps:
    - Load every CSV data file in the input folder one at a time
    - For each CSV file, calculate: 
      - The mean and median values for each patient
        - Create a list of tuples for each CSV 
          - Eg: [(```patient0_mean, patient0_median```),
                 (```patient1_mean, patient1_median```),
                 etc, etc]
      - The same as above, but this time calculating the mean, median, and modal values for each day
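One possible shape for a solution is sketched below; it is not the only way to do it. The function names `summarise_folder` and `column_modes` are hypothetical, rows are assumed to be patients and columns days, and because `numpy` has no dedicated mode function the mode is computed here with `np.unique`.

```python
import glob
import os

import numpy as np


def summarise_folder(folder):
    """Return {filename: [(mean, median), ...]} with one tuple per row."""
    results = {}
    # sorted() keeps the file order predictable
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        data = np.loadtxt(path, delimiter=",", ndmin=2)
        # axis=1 gives one value per row (assumed here to be one patient)
        means = np.mean(data, axis=1)
        medians = np.median(data, axis=1)
        results[os.path.basename(path)] = list(zip(means, medians))
    return results


def column_modes(data):
    """Most frequent value in each column (smallest value wins ties)."""
    modes = []
    for column in data.T:
        values, counts = np.unique(column, return_counts=True)
        modes.append(values[np.argmax(counts)])
    return modes
```

Calling `summarise_folder(os.path.join("..", "data", "sample-data"))` would then return one list of tuples per CSV file; the per-day version swaps `axis=1` for `axis=0`.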
       

## Basic image processing with OpenCV

We start by loading all of the modules we'll need for this class

In [None]:
# We need to include the parent directory in our path, so we can import our own module.
import sys
sys.path.append("..")

In [None]:
# python framework for working with images
import cv2

# some utility functions for plotting images
from utils.imutils import jimshow

__Read image__

We can load an image with OpenCV's ```cv2.imread``` function, which returns the image as a ```numpy``` array.

In [1]:
# your code here
import os  # needed for os.path.join

path_to_image = os.path.join("..", "data", "img", "trex.png")

In [None]:
# your code here
image = cv2.imread(path_to_image)

In [2]:
# your code here
jimshow(image, "A T-Rex, in all its glory")


__Save image__

In [None]:
# The path for the image we want to save
outpath = os.path.join("..", "data", "img", "trex2.png")

In [None]:
# write the image to disk
cv2.imwrite(outpath, image)

In [None]:
# your code here


__Inspect image__

In [None]:
# your code here

In [None]:
# your code here

## What is an image?

__Remember how ```numpy``` arrays work!__

ROWSxCOLUMNS == HEIGHTxWIDTH

In [None]:
# your code here

In [None]:
# your code here

In our image, there are 228*350 = 79,800 pixels
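To make the link between array shape and image size concrete, here is a sketch using a blank synthetic array rather than the real picture. It assumes the image is 228 pixels high and 350 pixels wide; `fake_image` is a hypothetical stand-in for the array that `cv2.imread` returns.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: 228 rows, 350 columns, 3 values per pixel
fake_image = np.zeros((228, 350, 3), dtype=np.uint8)

# The first two shape values are rows (height) and columns (width)
height, width = fake_image.shape[:2]

print(height, width)
print(height * width)  # total number of pixels
```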

__What about the last one?__

In [None]:
# your code here

<img src="../data/viz/3-channels.png">

__NB!__

```OpenCV``` stores colour channels in REVERSE ORDER: blue, green, red (BGR) rather than RGB
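A small sketch of what the reversed order means in practice, using a single made-up pixel rather than the real image. Reversing the last axis with a slice is one common way to switch between BGR and RGB; the variable names here are hypothetical.

```python
import numpy as np

# Hypothetical 1x1 image holding one pure-blue pixel in OpenCV's BGR order
bgr_pixel_image = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Unpack in the order OpenCV stores the channels
b, g, r = bgr_pixel_image[0, 0]

# Reversing the last axis turns BGR into RGB
rgb_pixel_image = bgr_pixel_image[..., ::-1]

print(b, g, r)
print(rgb_pixel_image[0, 0])
```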

__What colour is a specific pixel?__

In [3]:
# your code here
b, g, r = image[0,0]


In [None]:
print(f"[INFO] pixel at (0, 0) - Red: {r}, Green: {g}, Blue: {b}")

__Modify colour__

In [None]:
# your code here
image[0,0] = (0, 0, 0) # set the pixel at (0, 0) to be black
b, g, r = image[0,0]

In [None]:
print(f"[INFO] pixel at (0, 0) - Red: {r}, Green: {g}, Blue: {b}")

__Image slice__

In [None]:
# your code here

In [None]:
# your code here

__Change corner colour__
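One way to recolour a corner is to assign a colour tuple to a slice, which ```numpy``` broadcasts to every pixel in the region. This sketch uses a blank synthetic image; the 100x100 region size and the choice of red are assumptions for illustration.

```python
import numpy as np

# Hypothetical blank (black) image: 228 rows, 350 columns, 3 channels
fake_image = np.zeros((228, 350, 3), dtype=np.uint8)

# Top-left corner: first 100 rows and first 100 columns
corner = fake_image[0:100, 0:100]
print(corner.shape)

# Assign one BGR colour to the whole region; (0, 0, 255) is red in BGR
fake_image[0:100, 0:100] = (0, 0, 255)

print(fake_image[0, 0])      # inside the corner
print(fake_image[150, 200])  # outside the corner, still black
```

Note that `corner` is a view onto `fake_image`, not a copy, so it also shows the new colour after the assignment.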

In [None]:
# your code here

In [None]:
# your code here