In [1]:
import numpy as np
data = np.array(
    [
        [ # row 1
            [1,2,3], [4,5,6] 
        ], 
        [ # row 2
            [7,8,9], [10,11,12] 
        ]
    ]
)

In [2]:
data

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [3]:
data[0] # row 1

array([[1, 2, 3],
       [4, 5, 6]])

In [4]:
data[1] # row 2

array([[ 7,  8,  9],
       [10, 11, 12]])

Here is what the array representation (pretend this is a 2x2 color image) looks like:

$ 
 \begin{bmatrix}
  (1,2,3) & (4,5,6) \\
  (7,8,9) & (10,11,12) 
 \end{bmatrix}
$

Consider each entry in the matrix as an (H,S,V) tuple.

In [5]:
data[0][0] # element at row 0, column 0

array([1, 2, 3])

Imagine H=1, S=2 and V=3.

In [6]:
data.shape # a 2x2 matrix of 3-tuples, i.e. a 2x2x3 array

(2, 2, 3)

What we want is a DataFrame that looks something like this (showing only the two pixels/elements in row 0):




In [7]:
import pandas as pd
df = pd.DataFrame(data={'H': [1,4], 'S': [2,5], 'V': [3,6]})
df

   H  S  V
0  1  2  3
1  4  5  6


We're interested in this because once the image is broken up into columns, we can analyze the Hue and Saturation more easily. This is ultimately an exercise in data transformation.

Note that in this particular case we don't care about preserving the image dimensions: we can have one long column for each value, since we will be analyzing each of them as a one-dimensional vector anyway.

## `np.compress` in action

`np.compress` takes a Boolean mask as its first argument, marking which elements to keep, and the array as its second. The third argument, _axis_, lets us select the dimension the mask applies to; here that is the last one, which holds the actual element data (H, S or V).
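
Before applying it to the image data, here is a minimal sketch of how the mask and _axis_ interact. The small array `a` and the names `cols` and `rows` are made up for this illustration only.

```python
import numpy as np

# A small 2x3 array used only for illustration (not the image data).
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# The mask runs along axis=1, so it picks columns: keep the first and last.
cols = np.compress([True, False, True], a, axis=1)

# The mask runs along axis=0, so it picks rows: keep only the second row.
rows = np.compress([False, True], a, axis=0)

print(cols)
print(rows)
```

The mask needs one entry per position along the chosen axis, which is why the image examples use a three-element mask with `axis=2`.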

What happens when we use it on our data? To select the first value (H), we can do something like this:

In [8]:
np.compress([True, False, False], data, axis=2)

array([[[ 1],
        [ 4]],

       [[ 7],
        [10]]])

Notice that these are the first values (H) of all the HSV tuples. Because we are happy to treat this as a one-dimensional vector, we can call `flatten()` on the result. It is a method of the array (`ndarray.flatten()`), not a top-level NumPy function.

## `ndarray.flatten()` in action

What happens when we take the compressed data and then flatten it?

In [9]:
np.compress([True, False, False], data, axis=2).flatten()

array([ 1,  4,  7, 10])
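
Two properties of `flatten()` matter here, and the sketch below checks both: it walks the array in row-major order by default, so pixels come out row by row, and it returns a copy, so changing the flat vector leaves the original untouched. The array `h` is typed in by hand to mirror the compressed H values above.

```python
import numpy as np

# Same shape (2, 2, 1) and values as the compressed H channel above.
h = np.array([[[1], [4]],
              [[7], [10]]])

flat = h.flatten()           # row-major order: row 0 first, then row 1
order = flat.tolist()        # remember the order before modifying anything

flat[0] = 99                 # modify the flattened copy only
original_first = int(h[0, 0, 0])  # the source array still holds 1

print(order, original_first)
```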

Now we can get the S (Saturation) values. This time the middle entry of the mask is `True` while the other entries are `False`.

In [10]:
np.compress([False, True, False], data, axis=2).flatten()

array([ 2,  5,  8, 11])

And the V (Value) values -- notice that we have set the last entry of the mask to `True`.

In [11]:
np.compress([False, False, True], data, axis=2).flatten()

array([ 3,  6,  9, 12])

If we put these into a DataFrame, it might look like this:

In [12]:
pd.DataFrame(data=
                 {'H': np.compress([True, False, False], data, axis=2).flatten(),
                  'S': np.compress([False, True, False], data, axis=2).flatten(),
                  'V': np.compress([False, False, True], data, axis=2).flatten()})

    H   S   V
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12


Now we can work with the DataFrame in the various ways the assignment asks for.
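
As a cross-check, the same table can probably be built in one step by reshaping instead of compressing each channel separately. This is an alternative sketch, not the approach used above; the names `pixels` and `df_alt` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# The same 2x2x3 array as at the top of the notebook.
data = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

# Collapse the two pixel axes into one; -1 lets NumPy infer the row count.
pixels = data.reshape(-1, 3)

# One row per pixel, one column per channel.
df_alt = pd.DataFrame(pixels, columns=['H', 'S', 'V'])
print(df_alt)
```

Since we don't need to preserve the image dimensions, the reshape gives up nothing that the compress-and-flatten version keeps.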