# Tutorial 03 - Histograms and Thresholding  
## Dr. David C. Schedl

Note: this tutorial is geared towards students **experienced in general programming** and aims to introduce you to OpenCV.

Adapted from: 
* http://6.869.csail.mit.edu/fa19/schedule.html (written by Julie Ganeshan; @MIT)

Useful links:
* OpenCV Tutorials: https://docs.opencv.org/master/d9/df8/tutorial_root.html
* Image Processing in Python: https://github.com/xn2333/OpenCV/blob/master/Seminar_Image_Processing_in_Python.ipynb



# Initialization

First, let's import some useful libraries.

In [None]:
import os
import cv2 # openCV
import numpy as np
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio

Let's download some images to work with. 
We use the `curl` command, which is available on Unix and Windows. 
After downloading, the images are stored in the local filesystem (the notebook's working directory).

Image sources:

* [Place Kitten](https://placekitten.com/) - Of course, we will use pictures of cats! We use the base Place Kitten URL followed by a width and height separated by forward slashes (`/`). For example, use the URL `https://placekitten.com/500/300` to fetch a cat image with a width of 500px and height of 300px.
* A picture of [Van Gogh from wikimedia](https://upload.wikimedia.org/wikipedia/commons/thumb/b/b2/Vincent_van_Gogh_-_Self-Portrait_-_Google_Art_Project.jpg/842px-Vincent_van_Gogh_-_Self-Portrait_-_Google_Art_Project.jpg) in a decent resolution. 
* You can use any other image, if you want.

In [None]:
!curl -o "cat.jpg" "https://placekitten.com/500/300" --silent
!curl -o "gogh.jpg" "https://upload.wikimedia.org/wikipedia/commons/thumb/3/32/Vincent_van_Gogh_-_National_Gallery_of_Art.JPG/367px-Vincent_van_Gogh_-_National_Gallery_of_Art.JPG" --silent

# Histogram

## Exercise 1<a id="exercise1" name="exercise1"> </a>📝: Manually compute a histogram!

Let's start with a simple exercise. We will compute a histogram of an image manually. 
For simplicity, we will use an 8-bit grayscale image. 
You can plot the histogram with plotly as follows:

```python
fig = px.bar(x=range(256), y=counts, labels={"x": "pixel value", "y": "count"})
```

**(a)** Compute the counts for each pixel value [0 to 255] in the image.

Compare your result with the next cell. It features histogram computations with NumPy or Plotly.
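
As a hint, the counting idea can be sketched on a tiny example first. The 4x4 array `toy` and its 3-bit value range (0 to 7) are made up for illustration; for the exercise you would use 256 bins and the pixels of the real image.

```python
import numpy as np

# Hypothetical toy "image" with 3-bit pixel values (0 to 7), for illustration only
toy = np.array([[0, 1, 1, 2],
                [2, 2, 3, 7],
                [7, 7, 7, 0],
                [1, 2, 3, 3]], dtype=np.uint8)

num_levels = 8  # 2**3 possible values; an 8-bit image would need 256
toy_counts = np.zeros(num_levels, dtype=np.int64)

# Visit every pixel once and increment the bin that belongs to its value
for value in toy.ravel():
    toy_counts[value] += 1

print(toy_counts)  # [2 3 4 3 0 0 0 4]
```

The counts always sum to the number of pixels, which is a quick sanity check for your own solution.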

In [None]:
greyscale = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)
 

# Solution (a)
# Todo: Manually generate a histogram of the image


As always, NumPy simplifies this a lot. Below is a histogram computation with NumPy (the commented-out line shows the Plotly alternative).

In [None]:
# load the gogh image and convert it to grayscale
greyscale = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)


# image histogram
counts, bins = np.histogram(greyscale.ravel(), bins=range(257))
fig = px.bar(x=range(256), y=counts, labels={"x": "pixel value", "y": "count"})

# optionally use the plotly histogram function directly
# fig = px.histogram(x=greyscale.ravel(), nbins=256, labels={'x':'pixel value', 'y':'count'})

fig.show()


## Properties of Histograms

Let's look at some properties that can be derived from a histogram. <br>
We will plot the histogram and annotate it with the mean, median, and mode. <br>
Note: The next cell is long, but most of it is plot formatting. Feel free to skip it!

In [None]:
from plotly.express.colors import sample_colorscale
from plotly.subplots import make_subplots

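def show_hist_stats(image: np.ndarray, show_stats: bool = True, use_cumulative: bool = False):
    """Plot the (optionally cumulative) histogram of an 8-bit image.

    If show_stats is True, mean, median, mode, std, and min/max are annotated.
    Returns a plotly figure.
    """
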
    x = np.linspace(0, 1, 5)
    c = sample_colorscale('HSV', list(x))

    # 8-bit (256) image histogram
    counts, bins = np.histogram(image.ravel(), bins=range(257))
    cumulative = np.cumsum(counts)

    fig = px.bar(x=bins[:-1], y=cumulative if use_cumulative else counts, labels={'x':'pixel value', 'y':'count'}, color_discrete_sequence=['black']*256)

    fig.update_layout(plot_bgcolor='white', margin=dict(t=0, b=0, r=0, l=0, pad=0))

    num_markers = 1000
    y_pos = -np.max(cumulative if use_cumulative else counts)*.05
    fig.add_traces([
        go.Scatter(x=np.linspace(0,255,num_markers), y=[y_pos]*num_markers, mode='markers', marker={'color': np.linspace(0,255,num_markers), 'colorscale': 'gray', 'size': 10, 'symbol': 'square' }),
    ])


    # show the mean, median, mode and std as vertical lines
    mean_value = np.mean(image)
    median_value = np.median(image)
    std_value = np.std(image)
    mode_value = np.argmax(counts)
    min_value = np.min(image)
    max_value = np.max(image)

    if show_stats:
        fig.add_trace(
            go.Scatter(
                x=[mean_value, mean_value],
                y=[np.max(cumulative if use_cumulative else counts),0],
                mode="lines+text",
                line=go.scatter.Line(color=c[0]),
                name="mean",
                text=["Mean", ""],
                textposition="top center",
                showlegend=True)
        )

        fig.add_trace(
            go.Scatter(
                x=[median_value, median_value],
                y=[np.max(cumulative if use_cumulative else counts),0],
                mode="lines+text",
                line=go.scatter.Line(color=c[1]),
                name="median",
                text=["Median", ""],
                textposition="top center",
                showlegend=True)
        )

        fig.add_trace(
            go.Scatter(
                x=[mode_value, mode_value],
                y=[np.max(cumulative if use_cumulative else counts),0],
                mode="lines+text",
                line=go.scatter.Line(color=c[2]),
                name="mode",
                text=["Mode", ""],
                textposition="top center",
                showlegend=True)
        )

        fig.add_trace(
            go.Scatter(
                x=[mean_value - std_value, mean_value + std_value],
                y=[np.max(cumulative if use_cumulative else counts)*0.7, np.max(cumulative if use_cumulative else counts)*0.7],
                mode="text+lines+markers",
                marker_symbol="line-ns",
                marker_line_width=1,
                marker_line_color=c[0],
                marker_size=10,
                line=go.scatter.Line(color=c[0]),
                name="std",
                text=["-std", "+std"],
                textposition=["middle left","middle right"],
                showlegend=True)
        )

        fig.add_trace(
            go.Scatter(
                x=[min_value, max_value],
                y=[-10, -10],
                mode="text+lines+markers",
                marker_symbol="line-ns",
                marker_line_width=1,
                marker_line_color=c[3],
                marker_size=10,
                line=go.scatter.Line(color=c[3]),
                name="min/max",
                text=["min", "max"],
                textposition=["middle left","middle right"],
                showlegend=True)
        )

        if use_cumulative:
            fig.add_trace(
                go.Scatter(
                    x=[0, 255],
                    y=[np.max(cumulative if use_cumulative else counts)*0.5, np.max(cumulative if use_cumulative else counts)*0.5],
                    mode="lines+text",
                    line=go.scatter.Line(color='gray', dash='dot'),
                    name="fifty_percent",
                    text=["50%", ""],
                    textposition="top center",
                    showlegend=True)
            )

    return fig



In [None]:
from plotly.subplots import make_subplots

greyscale = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)

fig = make_subplots(1, 2)
fig.add_trace(go.Image(z=cv2.cvtColor(greyscale, cv2.COLOR_GRAY2BGR), name="Image"), 1, 1)
traces = show_hist_stats(greyscale).data
for trace in traces:
    fig.add_trace(trace, 1, 2)
fig.show()

## Exercise 2<a id="exercise2" name="exercise2"> </a>📝: How do point operations change a histogram?

What happens if you apply the following point operations to an image? <br>
**(a)** multiply the image by a constant $f(a) = 1.5a$ <br>
**(b)** add a constant $f(a) = a + 50$ <br>
**(c)** invert the image $f(a) = 255 - a$ (for an 8-bit image)<br>

Think about what the histogram of the resulting image will look like and then check your answer by plotting the histogram of the resulting image. <br>
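
One practical detail when trying this out: arithmetic on `uint8` arrays wraps around instead of saturating, which distorts the histogram. The sketch below is one possible way to apply a point operation safely; the helper `apply_point_op` and the synthetic image `demo` are hypothetical names used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical synthetic 8-bit image with values in [50, 150)
demo = rng.integers(50, 150, size=(64, 64), dtype=np.uint8)

def apply_point_op(image, op):
    """Apply op in floating point, then clip to [0, 255] and convert back to uint8."""
    result = op(image.astype(np.float64))
    return np.clip(result, 0, 255).astype(np.uint8)

scaled = apply_point_op(demo, lambda a: 1.5 * a)
shifted = apply_point_op(demo, lambda a: a + 50)
inverted = apply_point_op(demo, lambda a: 255 - a)

# Pitfall: plain uint8 arithmetic wraps modulo 256 (250 + 50 becomes 44)
wrapped = np.array([250], dtype=np.uint8) + 50

for name, img in [("original", demo), ("1.5a", scaled), ("a + 50", shifted), ("255 - a", inverted)]:
    print(f"{name:>8}: min={img.min()}, max={img.max()}, mean={img.mean():.1f}")
```

You can pass any of the results to a histogram plot to compare its shape with the original.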

In [None]:
# Solutions: Todo

## Color Histograms

Just as we can compute a single histogram for a grayscale image, we can compute a histogram for each color channel in a color image.

In [None]:
from plotly.subplots import make_subplots
img = cv2.imread('gogh.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
fig = make_subplots(1, 2)
# We use go.Image because subplots require traces, whereas px functions return a figure
fig.add_trace(go.Image(z=img), 1, 1)
for channel, color in enumerate(['red', 'green', 'blue']):
    fig.add_trace(go.Histogram(x=img[..., channel].ravel(), opacity=0.5,
                               marker_color=color, name=f'{color} channel'), 1, 2)
fig.update_layout(height=400)
fig.show()

## Cumulative Histograms
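
The cumulative histogram $H$ accumulates the counts $h$ of the regular histogram: $H[i] = \sum_{j=0}^{i} h[j]$. Its last entry $H[K-1]$ therefore equals the total number of pixels. A minimal sketch on a made-up histogram with $K = 4$ levels (the array `h` is hypothetical):

```python
import numpy as np

# Hypothetical toy histogram with K = 4 intensity levels
h = np.array([3, 1, 0, 4])

# Explicit running sum, one bin at a time
H_manual = np.zeros_like(h)
running = 0
for i, count in enumerate(h):
    running += count
    H_manual[i] = running

# The same computation in one call
H = np.cumsum(h)
print(H)  # [3 4 4 8]
```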


In [None]:
# load the gogh image and convert it to grayscale
greyscale = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)


# regular histogram
counts, bins = np.histogram(greyscale.ravel(), bins=range(257))
# cumulative image histogram
cumulative = np.cumsum(counts) # <--- this is the computation of the cumulative histogram
fig = px.bar(x=range(256), y=cumulative, labels={"x": "pixel value", "y": "count"})
fig.show()

print( f'number of pixels: {np.prod(greyscale.shape[:2])}\nH[K-1]: {cumulative[-1]}' )
print( f'H[median]: {cumulative[np.median(greyscale).astype(np.uint8)]} ~ (number of pixels) / 2: {np.prod(greyscale.shape[:2])/2}' )

## Histogram Equalization

Histogram equalization is a technique for improving the contrast of an image. It is a non-linear point operation that transforms the input image so that the output image has an approximately uniform histogram, which corresponds to a roughly linear cumulative histogram.
Let's try it out on the image of Van Gogh.
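
To make the idea concrete, here is a minimal sketch of one common formulation: the normalized cumulative histogram serves as a lookup table for the new pixel values. The synthetic image `low_contrast` is an assumption made for illustration, and OpenCV's implementation may normalize slightly differently, so the results need not match exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical low-contrast 8-bit image: values only in [100, 140)
low_contrast = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)

hist, _ = np.histogram(low_contrast.ravel(), bins=range(257))
cdf = np.cumsum(hist) / low_contrast.size  # normalized cumulative histogram, ends at 1.0

# Lookup table: map each input value a to floor(cdf[a] * 255)
lut = np.floor(cdf * 255).astype(np.uint8)
equalized_sketch = lut[low_contrast]

print("before:", low_contrast.min(), low_contrast.max())
print("after: ", equalized_sketch.min(), equalized_sketch.max())
```

Because the lookup table is monotonically non-decreasing, the brightness order of the pixels is preserved while the used value range is stretched.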

In [None]:
greyscale = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)

# OpenCV's histogram equalization
equalized = cv2.equalizeHist(greyscale)


fig = make_subplots(1, 2)
fig.add_trace(go.Image(z=cv2.cvtColor(equalized, cv2.COLOR_GRAY2BGR), name="Image"), 1, 1)
traces = show_hist_stats(equalized, use_cumulative=True, show_stats=False).data
for trace in traces:
    fig.add_trace(trace, 1, 2)
fig.show()

## Exercise 3<a id="exercise3" name="exercise3"> </a>📝: Let's manually equalize the histogram of an image!


In [None]:
greyscale = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)
equalized = greyscale

# TODO: Implement histogram equalization

fig = make_subplots(1, 2)
fig.add_trace(go.Image(z=cv2.cvtColor(equalized, cv2.COLOR_GRAY2BGR), name="Image"), 1, 1)
traces = show_hist_stats(equalized, use_cumulative=True, show_stats=False).data
for trace in traces:
    fig.add_trace(trace, 1, 2)
fig.show()

## Histogram Matching

Histogram matching is a technique for transforming the histogram of an image to match the histogram of another image. It is a non-linear point operation that transforms the input image so that the output image has approximately the same histogram as the reference image.

OpenCV does not have a histogram matching function, but we can use skimage to do it. There is an example [online](https://scikit-image.org/docs/stable/auto_examples/color_exposure/plot_histogram_matching.html).
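
To illustrate the principle, the following is a rough NumPy sketch that matches cumulative histograms via a lookup table. It is not skimage's implementation; the function `match_cdf` and the synthetic images are hypothetical and only meant to show the mechanism for 8-bit grayscale images.

```python
import numpy as np

def match_cdf(source, reference):
    """Map source intensities so that the source CDF approximates the reference CDF."""
    src_hist, _ = np.histogram(source.ravel(), bins=range(257))
    ref_hist, _ = np.histogram(reference.ravel(), bins=range(257))
    src_cdf = np.cumsum(src_hist) / source.size
    ref_cdf = np.cumsum(ref_hist) / reference.size
    # For each source level, pick the first reference level whose CDF is not smaller
    lut = np.searchsorted(ref_cdf, src_cdf, side="left")
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[source]

rng = np.random.default_rng(0)
dark = rng.integers(0, 100, size=(64, 64), dtype=np.uint8)      # hypothetical dark image
bright = rng.integers(150, 250, size=(64, 64), dtype=np.uint8)  # hypothetical bright reference

matched_sketch = match_cdf(dark, bright)
print("source mean:   ", dark.mean())
print("reference mean:", bright.mean())
print("matched mean:  ", matched_sketch.mean())
```

After matching, the output uses the reference's value range, and its mean moves close to the reference mean.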

In [None]:
from skimage.exposure import match_histograms

reference = cv2.imread("gogh.jpg", cv2.IMREAD_GRAYSCALE)
image = cv2.imread("cat.jpg", cv2.IMREAD_GRAYSCALE)

# using skimage's match_histograms
matched = match_histograms(image, reference).astype(np.uint8)

# display images
imgs = [reference, image, matched]
titles = ['Reference', 'Image', 'Matched']

fig = make_subplots(2, len(imgs), subplot_titles=titles,
    horizontal_spacing = 0.05, vertical_spacing = 0.1)
for i, (img, title) in enumerate(zip(imgs, titles)):
    fig.add_trace(go.Image(z=cv2.cvtColor(img, cv2.COLOR_GRAY2BGR), name="Image"), 1, i+1)
    traces = show_hist_stats(img, use_cumulative=True, show_stats=False).data
    for trace in traces:
        fig.add_trace(trace, 2, i+1)

fig.show()