<table>
  <tr>
    <td><img src="https://github.com/rvss-australia/RVSS/blob/main/Pics/RVSS-logo-col.med.jpg?raw=1" width="400"></td>
    <td><div align="left"><font size="30">Greyscale images</font></div></td>
  </tr>
</table>

(c) Peter Corke 2024

Robotics, Vision & Control: Python, see Chapter 11

## Configuring the Jupyter environment
We need to import some packages to help us with linear algebra (`numpy`), graphics (`matplotlib`), and machine vision (`machinevisiontoolbox`).
If you're running locally you need to have these packages installed.  If you're running on CoLab we first have to install `machinevisiontoolbox`, which is not preinstalled; this will be a bit slow.

In [None]:
try:
    import google.colab
    print('Running on CoLab')
    !pip install machinevision-toolbox-python
    COLAB = True
except ImportError:  # google.colab is not available, so we are running locally
    COLAB = False

%matplotlib widget
import matplotlib.pyplot as plt

import numpy as np
from machinevisiontoolbox import *
import ipywidgets as widgets

# display result of assignments
if COLAB:
    %config ZMQInteractiveShell.ast_node_interactivity = 'last_expr_or_assign'
# make NumPy display a bit nicer
np.set_printoptions(linewidth=100, formatter={'float': lambda x: f"{x:10.4g}" if abs(x) > 1e-10 else f"{0:10.4g}"})



# Create an image from data

We create a 64-element vector of zeros, set certain elements to one, then reshape it into an 8x8 array.

In [None]:
a = np.zeros((64,))
a[[17, 18, 21, 22, 25, 26, 29, 30, 41, 46, 50, 51, 52, 53]] = 1

a = a.reshape((8, 8))
print(a)
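
To see how the flat indices relate to positions in the 8x8 array, here is a minimal NumPy sketch. It assumes the default row-major ordering used by `reshape`, and the names `flat_indices`, `rows`, `cols` and `b` are introduced only for this illustration.

```python
import numpy as np

# the same flat indices used above
flat_indices = [17, 18, 21, 22, 25, 26, 29, 30, 41, 46, 50, 51, 52, 53]

# reference: fill a flat vector, then reshape it (row-major order)
a = np.zeros((64,))
a[flat_indices] = 1
a = a.reshape((8, 8))

# convert each flat index k to a (row, column) pair: row = k // 8, column = k % 8
rows, cols = np.unravel_index(flat_indices, (8, 8))

# build the same array directly with 2D indexing
b = np.zeros((8, 8))
b[rows, cols] = 1

print(rows[0], cols[0])      # flat index 17 is row 2, column 1
print(np.array_equal(a, b))  # both constructions give the same array
```

Either construction works; the flat version is just a compact way to list which pixels are set.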


We can see some kind of pattern in here, but it's much more obvious if we display this matrix as an image -- each element of the matrix corresponds to a pixel.

In [None]:
plt.imshow(a)  # display the array using Matplotlib
plt.draw()

Now we can clearly see the pattern -- a face.  We can see the equivalence between a 2D NumPy array and an image.  The colors, dark purple and yellow, are not part of the image; they come from the default colormap of Matplotlib's `imshow` function.  As you drift the cursor over the image the pixel coordinates and pixel values are displayed beneath the image.

We can also display this 2D array using the Machine Vision Toolbox for Python

In [None]:

idisp(a);

and the result is much the same, except the colors are now just black and white.  It's a common convention that zero valued pixels are displayed as black.  `idisp` has scaled the values in the array so that the biggest value, 1, is displayed as brightest white.

As you drift the cursor over the image the pixel coordinates and pixel values are displayed beneath the image. A subtle difference is that the pixel coordinates are always integers, and the datatype of the pixel is also shown.  
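
The scaling described above can be sketched in plain NumPy. This is only an illustration of linear scaling to the display range, not the Toolbox's actual implementation, and `scale_for_display` is a hypothetical helper name.

```python
import numpy as np

def scale_for_display(array):
    """Linearly map the smallest value to 0 (black) and the largest to 255 (white)."""
    array = np.asarray(array, dtype=float)
    lo, hi = array.min(), array.max()
    if hi == lo:
        # constant image: nothing to stretch, show it as black
        return np.zeros(array.shape, dtype=np.uint8)
    return np.round(255 * (array - lo) / (hi - lo)).astype(np.uint8)

a = np.array([[0.0, 1.0], [1.0, 0.0]])
scaled = scale_for_display(a)
print(scaled)  # zeros stay 0 (black), ones become 255 (white)
```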

# Working with a real image file


<p style="border:3px; background-color:#FF0000; padding: 1em; text-align: center;">Note that in this section we consider only greyscale (monochrome) images.  For color images, have a look at the color-images.ipynb notebook in this folder.</p>

# Images and pixels

Now let's load a real greyscale image from a PNG file.  This particular image file is distributed with the Toolbox, but you can pass in the path to any image file you might have.  _If the Toolbox can't find the specified image it defaults to looking in the folder of images distributed with the Toolbox._

In [None]:
image = Image.Read("street.png")
# image = Image.Read("penguins.png")
# image = Image.Read("monalisa.png", grey=True)  # convert original color image to greyscale


`image` is an object that contains an image.  The pixel data is held in an internal NumPy array (a Python-style matrix), and the image dimensions are given by

In [None]:
image.size

which is a (width, height) tuple: the image has 1280 columns and 851 rows.  The underlying NumPy array has the reverse shape, (851, 1280), because NumPy always has rows as the first index.
The data itself, the "internal" NumPy array, can be accessed by

In [None]:
array = image.image
array

which is simply a big 2-dimensional table of 8-bit integers that represent the brightness of each pixel as a number between 0 (black) and 255 (white).

We can access the value of the pixel at image coordinate (100,200), where the order is (horizontal, vertical)

In [None]:
image[100,200]

which we see is a `uint8` datatype of value 188.

However if we index the underlying NumPy array the same way

In [None]:
array[100,200]

we get a different result.  That's because for NumPy indexing the first index is the row, the second index is the column.

We need to reverse the indices

In [None]:
array[200,100]

**This is a bit of a trap for those starting out doing image processing with NumPy.**  The Machine Vision Toolbox is concerned with image processing, so it strictly uses image coordinate indexing order.
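
The two conventions can be compared on a tiny array with plain NumPy. This is a minimal sketch; `pixel` is a hypothetical helper written for this illustration, not a Toolbox function.

```python
import numpy as np

# a small "image" with 3 rows (height) and 4 columns (width)
height, width = 3, 4
array = np.arange(height * width).reshape((height, width))

def pixel(array, u, v):
    """Look up a pixel by image coordinate (u, v) = (horizontal, vertical)."""
    return array[v, u]  # NumPy wants [row, column], i.e. [v, u]

print(array.shape)         # NumPy order: (rows, columns) = (3, 4)
print(array.shape[::-1])   # image order: (width, height) = (4, 3)
print(pixel(array, 3, 1))  # column 3, row 1, the same element as array[1, 3]
```

Here `array[3, 1]` would raise an `IndexError` because there is no row 3, whereas on a larger image swapping the indices silently returns the wrong pixel.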


We can see a lot of pertinent information about the image by

In [None]:
print(image)


But most importantly for an image, we can display it as an image

In [None]:
image.disp();

As we saw earlier, the notebook image view is interactive. As you move the cursor over the image, the pixel coordinates and value are updated at the bottom of the window.  The displayed pixel values are always in the range 0 to 255, which are the minimum and maximum possible values for the `uint8` pixel data type.

A toolbar provides some extra functionality.  You can select a region to get an expanded view, pan that selected region around, change the zoom level, or revert to the original view.

**Q:**

* What's the lowest intensity value in the image?
* What's the highest intensity value in the image?
* What percentage of pixels have a value less than 100, or greater than 200?
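
One way to check the first two answers numerically is to query the pixel array directly with NumPy. The sketch below uses a random stand-in array rather than the street image, so the printed extremes are not the answers to the questions; with the Toolbox you would use `image.image` in place of `array`.

```python
import numpy as np

# stand-in for image.image: a random uint8 array (an assumption for illustration)
rng = np.random.default_rng(0)
array = rng.integers(0, 256, size=(60, 80), dtype=np.uint8)

# the range that the uint8 data type can represent
info = np.iinfo(np.uint8)
print(info.min, info.max)        # 0 255

# the range of values actually present in this array
print(array.min(), array.max())
```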


# Histograms

To answer some of those questions we can plot a histogram which shows the frequency (or occurrence) of the various grey levels within the image.

In [None]:
hist = image.hist()
print(hist)

This is an instance of a `Histogram` object that contains statistics about the pixel values in the image.  We can plot the histogram using its `plot` method.

In [None]:
plt.figure()
hist.plot()

We can see that the histogram is pretty ragged with three dominant peaks.  Explore the image with the cursor and see which parts of the image correspond to these different peaks.
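
For intuition about what a histogram contains, here is a minimal NumPy sketch that counts how often each grey level occurs in a tiny made-up array. It illustrates the idea only and is not how the Toolbox's `Histogram` object is implemented.

```python
import numpy as np

# a tiny made-up uint8 image
array = np.array([[0, 0, 10],
                  [10, 10, 255]], dtype=np.uint8)

# counts[k] is the number of pixels with grey level k, for k = 0..255
counts = np.bincount(array.ravel(), minlength=256)

print(counts.shape)                         # one bin per possible uint8 value
print(counts[0], counts[10], counts[255])   # 2 3 1
print(counts.sum() == array.size)           # every pixel falls in exactly one bin
```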

To answer a question like "What percentage of pixels have a value less than 100?" it's useful to show the normalized cumulative histogram. 

In [None]:
plt.figure()
hist.plot('ncdf')


From this we can see that around 64% of pixels have a value less than 100.  Maybe around 10% have a value above 200.
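
Rather than reading values off the plot, the same percentages can be computed. The sketch below builds a normalized cumulative histogram with NumPy on a small made-up array (not the street image) and cross-checks it against a direct count; the variable names are chosen for this illustration.

```python
import numpy as np

# a small made-up uint8 image
array = np.array([[10, 50, 99, 100],
                  [150, 200, 201, 255]], dtype=np.uint8)

counts = np.bincount(array.ravel(), minlength=256)

# ncdf[k] is the fraction of pixels with value <= k
ncdf = np.cumsum(counts) / array.size

below_100 = 100 * ncdf[99]         # percentage with value < 100
above_200 = 100 * (1 - ncdf[200])  # percentage with value > 200
print(below_100, above_200)        # 37.5 25.0

# cross-check by counting pixels directly
print(100 * np.mean(array < 100), 100 * np.mean(array > 200))
```

Note the off-by-one: "less than 100" means reading the cumulative value at 99, not at 100.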

**Q: repeat this exercise for another image, maybe `"penguins.png"` or `"monalisa.png"`.**

# Binary images

Let's load an image that only has two unique pixel values: 0 or 255.

In [None]:
sharks = Image.Read('shark2.png')
sharks.disp();
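
A quick way to confirm that an image is binary is to list its unique pixel values. This NumPy sketch uses a tiny made-up array as a stand-in; with the Toolbox you would pass `sharks.image` instead.

```python
import numpy as np

# stand-in for sharks.image: a tiny array containing only 0 and 255
array = np.array([[0, 255, 255],
                  [0, 0, 255]], dtype=np.uint8)

# the distinct pixel values and how many pixels take each value
values, counts = np.unique(array, return_counts=True)
print(values)
print(counts)

is_binary = len(values) <= 2
print(is_binary)
```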

Now let's look at another grey scale image.

In [None]:
penguins = Image.Read("penguins.png")
penguins.disp();

In [None]:
# Q. Add the code here to compute and plot the histogram

This time we see a much richer distribution of pixel values.  A lot of pixels have a value less than 100 and these are the dark background of the sign.  Clearly there are many shades of black.  Similarly for the foreground, there are many shades of white.

**Q: Move the mouse over the original image to explore where these different grey levels appear.**

# Thresholding

A very classical image processing operation is thresholding.  We could turn the grey level image above into a binary image by comparing every pixel with a constant value called the threshold

In [None]:
penguins = Image.Read("penguins.png")
binary_image = penguins > 80
binary_image.disp();

We now have only two types of pixels, black (value of 0) or white (value of 1), but they don't cleanly map to what we perceive as the black and white parts of the image.

If you drift the cursor over the image you will see that the pixel datatype is now `bool`.  This is because the image was created using a comparison operator, `penguins > 80`.  The Toolbox displays `False` values as black and `True` values as white.
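
The same comparison works on a plain NumPy array, which shows where the `bool` datatype comes from. This is a minimal sketch on a made-up array; the conversion to 0/255 at the end is just one common way to store a binary image and is an assumption, not something the Toolbox requires.

```python
import numpy as np

# a tiny made-up greyscale image
array = np.array([[10, 80, 81],
                  [200, 79, 255]], dtype=np.uint8)

# elementwise comparison: True where the pixel is strictly greater than 80
binary = array > 80
print(binary.dtype)  # bool
print(binary)

# optional: store the result as a uint8 image with values 0 and 255
as_uint8 = binary.astype(np.uint8) * 255
print(as_uint8)
```

Because the comparison is strict, a pixel exactly equal to the threshold ends up `False`.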

**Q: adjust the threshold in the code above to see the effect on the resulting binary image.**


An easier way to do this is to add a control slider to interactively set the threshold.

<p style="border:3px; background-color:#FF0000; font-weight: bold; padding: 1em; text-align: center;">Click the slider, don't drag it.</p>

In [None]:
binary_image.disp()  # draw it once
@widgets.interact
def animate( threshold =  widgets.IntSlider(value=80, description='threshold:',  min=1, max=255)):
    binary_image = penguins > threshold
    binary_image.disp(reuse=True)  # draw it again with updated data


**Q: adjust the threshold using the widget above, and explore the effect on the image.  Try to find a threshold that yields a binary image where black corresponds to the background of the sign and white corresponds to the foreground text.**

# Challenges with thresholding

Here is another greyscale image of a sign, but this one has a highlight due to the way the scene was lit.

In [None]:
castle = Image.Read("castle2.png")
castle.disp();

In [None]:
hist = castle.hist()
plt.figure()
hist.plot()

This histogram is more bimodal, that is, there are two peaks.

**Q: Move the cursor over the histogram and you can read off the coordinates of the peaks.**

**Q: Move the mouse over the image, explore the pixel values in the sign and background, and relate that to what you see in the histogram.**

**Q: Using the widget below, try to find a good threshold that separates the lettering of the sign from the background.**

In [None]:
binary_image = castle > 80
binary_image.disp()
@widgets.interact
def animate( threshold =  widgets.IntSlider(value=80, description='threshold:',  min=1, max=255)):
    binary_image = castle > threshold
    binary_image.disp(reuse=True)

You will find that it is impossible to find a single threshold that separates all the letters from the background.  This is an example of the limitations of thresholding:

* how do we choose the threshold?  Are there algorithms to do this?
* how do we make thresholding robust to uneven or variable lighting conditions?
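
On the first question, one well-known automatic approach is Otsu's method, which picks the threshold that maximizes the variance between the two resulting classes of pixels. The NumPy sketch below is an illustrative implementation for `uint8` arrays, with `otsu_threshold` as a name chosen here; it is not the Toolbox's API. Like any single global threshold it does not address the second question about uneven lighting.

```python
import numpy as np

def otsu_threshold(array):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    Assumes a uint8 array with grey levels 0..255.
    """
    counts = np.bincount(array.ravel(), minlength=256).astype(float)
    p = counts / counts.sum()       # probability of each grey level
    levels = np.arange(256)

    w0 = np.cumsum(p)               # weight of the class with values <= t
    w1 = 1.0 - w0                   # weight of the class with values > t
    mu = np.cumsum(p * levels)      # cumulative (unnormalized) mean up to t
    mu_total = mu[-1]               # mean grey level of the whole image

    # between-class variance, defined only where both classes are non-empty
    sigma_b = np.zeros(256)
    valid = (w0 > 0) & (w1 > 0)
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))

# made-up bimodal image: dark pixels (40, 60) and bright pixels (190, 210)
array = np.array([[40, 60, 190, 210],
                  [60, 40, 210, 190]], dtype=np.uint8)

t = otsu_threshold(array)
binary = array > t
print(t)
print(binary)
```

For this made-up array the chosen threshold falls in the gap between the dark and bright groups, so `array > t` separates them cleanly.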

**Q: Can you think of some algorithmic approaches that might segment out all the letters?**

**Q: Consider a complex scene like the one below, could you find the people or motorbikes by thresholding?**

<img src="https://petercorke.com/files/images/image3.jpg" width="400">