<h1 style="font-size:30px;">Image Histograms </h1>

Image histograms are primarily used as an analysis tool in computer vision because they quantify the distribution of data associated with an image (for example, the intensity values within an image). In this notebook, we will demonstrate how to produce image histograms and how to interpret the plots. We will also cover histogram equalization, which can help improve the contrast in poorly illuminated images. As you will see, the results are often stunning.

In [None]:
import cv2
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
plt.rcParams['image.cmap'] = 'gray'

In [None]:
# Check the matplotlib version; it should be greater than or equal to 3.3.0.
matplotlib.__version__

# 1. Introduction to Histograms

Histograms are counts of data organized into a set of predefined bins. When we plot a histogram, we need to specify the number of bins along the x-axis. Each bin represents a range of values, such as pixel intensities 0-9, 10-19, 20-29, and so on. We will see several examples below that will help solidify this concept.
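
To make the idea of bins concrete, here is a minimal sketch using NumPy's `np.histogram()`. The pixel values are made up purely for illustration and do not come from any image used in this notebook.

```python
import numpy as np

# Hypothetical pixel intensities, chosen only to illustrate binning.
pixels = np.array([3, 7, 12, 15, 18, 25, 29, 29])

# Three equal-width bins between 0 and 30: 0-9, 10-19, 20-29.
counts, edges = np.histogram(pixels, bins=3, range=(0, 30))

print(counts)  # [2 3 3]
print(edges)   # bin edges at 0, 10, 20 and 30
```

Two values fall in the first bin and three in each of the other two. Increasing `bins` gives a finer-grained view of the same data.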

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

### <font color="green">Function Syntax </font>
``` python
retval = plt.hist(x[, bins[, range[, ...]]])
```

`retval`: A tuple `(n, bins, patches)`. The first element, `n`, is an array or list of arrays containing the values of the histogram bins. If the input is a sequence of arrays [data1, data2, ...], then `n` is a list of arrays with the values of the histograms for each of the arrays in the same order. The dtype of the array `n` (or of its element arrays) is always float, even if no weighting or normalization is used. The second element, `bins`, holds the bin edges, and `patches` holds the drawn bar artists.

The function has **1 required input argument** and several optional flags:

1. `x`: Array or sequence of (n,) arrays. These are the input values; the argument takes either a single array or a sequence of arrays, which are not required to be of the same length.
2. `bins`: Defines the number of equal-width bins in the range. This is an **optional argument** with a default value of 10.
3. `range`: The lower and upper range of the bins; values outside this range are ignored. This is an **optional argument** with a default value of None, which is equivalent to using the whole range of the input `x`.


### <font color="green">OpenCV Documentation</font>

[**`histogram tutorial`**](https://docs.opencv.org/4.5.2/d8/dbc/tutorial_histogram_calculation.html)

### <font color="green">Matplotlib / Numpy Documentation</font>

[**`zeros()`**](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html)
[**`ravel()`**](https://numpy.org/doc/stable/reference/generated/numpy.ravel.html)
[**`hist()`**](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html)

<hr style="border:none; height: 4px; background-color:#D3D3D3" />
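
As a minimal sketch of how `plt.hist()` is typically applied to an image, the example below uses a synthetic random image (an assumption, standing in for a real photo) and flattens it with `ravel()` before plotting.

```python
import numpy as np
from matplotlib import pyplot as plt

# Synthetic 8-bit grayscale image (an assumption, standing in for a real photo).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)

# A 2-D array would be treated as one dataset per column, so flatten it first.
plt.figure(figsize=(10, 4))
n, bin_edges, patches = plt.hist(img.ravel(), bins=256, range=(0, 256))
plt.xlabel('Pixel intensity'); plt.ylabel('Pixel count')
```

With 256 bins over the range 0 to 256, each bin corresponds to exactly one 8-bit intensity value, so `n` holds one count per intensity.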

### <font style="color:rgb(50,120,230)">Compare calcHist() with plt.hist()</font>


<hr style="border:none; height: 4px; background-color:#D3D3D3" />

### <font color="green">Function Syntax </font>

```python
    hist = cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])
```

This function takes **5 required arguments** (in Python, pass `None` for `mask` when no mask is needed):

1. `images`: Source arrays. They should all have the same depth (CV_8U, CV_16U, or CV_32F) and the same size. Each of them can have an arbitrary number of channels.

2. `channels`: List of the dims channels used to compute the histogram. The channels of the first array are numbered from 0 to images[0].channels()-1, the channels of the second array are numbered from images[0].channels() to images[0].channels() + images[1].channels()-1, and so on.

3. `mask`: Optional mask. If the matrix is not empty, it must be an 8-bit array of the same size as images[i]. The non-zero mask elements mark the array elements counted in the histogram.

4. `histSize`: Array of histogram sizes in each dimension.

5. `ranges`: Array of the dims arrays of the histogram bin boundaries in each dimension.

### <font color="green">OpenCV Documentation</font>


[**`calcHist()`**](https://docs.opencv.org/4.5.2/d6/dc7/group__imgproc__hist.html#ga4b2b5fd75503ff9e6844cc4dcdaed35d)

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

# 3. Histogram Equalization
An image histogram is a graphical representation of the tonal distribution of data. For an 8-bit image, the histogram is simply an array with 256 bins, and each bin contains the number of pixels with that intensity.
Histogram equalization is a non-linear method for enhancing contrast in an image. Let's see how to perform histogram equalization in OpenCV using [**`equalizeHist()`**](https://docs.opencv.org/4.1.0/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e).
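
One common way to describe the idea behind histogram equalization is as a lookup table built from the cumulative distribution of intensities. The NumPy sketch below illustrates that textbook idea on a synthetic image; the helper `equalize_sketch` is hypothetical and is not claimed to reproduce `cv2.equalizeHist()` bit for bit.

```python
import numpy as np

def equalize_sketch(gray):
    """Textbook CDF-based equalization for an 8-bit single-channel image."""
    # Count how many pixels have each of the 256 possible intensities.
    hist = np.bincount(gray.ravel(), minlength=256)
    # Cumulative count: number of pixels at or below each intensity.
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]            # cumulative count at the darkest intensity present
    denom = max(gray.size - cdf_min, 1)  # avoid division by zero for flat images
    # Stretch the cumulative counts to 0-255 and use them as a lookup table.
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[gray]

# Low-contrast synthetic image: intensities confined to 100-139.
rng = np.random.default_rng(0)
gray = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
eq = equalize_sketch(gray)

print(gray.min(), gray.max())  # 100 139
print(eq.min(), eq.max())      # 0 255
```

Because the mapping depends on the cumulative counts rather than on a fixed scale factor, it is non-linear: crowded intensity ranges are stretched more than sparse ones.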

## 3.1 Histogram Equalization for Grayscale Images

The function `equalizeHist()` performs histogram equalization on a grayscale image. The syntax is given below.

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

### <font color="green">Function Syntax </font>

```python
dst = cv2.equalizeHist(src[, dst])
```

**Parameters**

- **`src`**: Source 8-bit single channel image.
- **`dst`**: Destination image of the same size and type as src.

### <font color="green">OpenCV Documentation</font>


[**`equalizeHist()`**](https://docs.opencv.org/4.1.0/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e)

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

As expected, the histogram of the equalized image is spread more uniformly over the intensity range.

## 3.2 Histogram Equalization for Color Images

For color images, we cannot simply apply histogram equalization to the R, G, and B channels separately. To understand why this is not a good idea, let's take a look at an example.

### <font style="color:rgb(50,120,230)">Wrong Way</font>

In [None]:
# Read color image
img = cv2.imread('.jpg')
img_eq = np.zeros_like(img)

# Perform histogram equalization on each channel separately.
for i in range(0, 3):
    img_eq[:, :, i] = cv2.equalizeHist(img[:, :, i])

# Display the images.
plt.figure(figsize = (18, 6))
plt.subplot(121); plt.imshow(img[:, :, ::-1]); plt.title('Original Color Image')
plt.subplot(122); plt.imshow(img_eq[:, :, ::-1]); plt.title('Equalized Image')

### <font style="color:rgb(50,120,230)">Right Way</font>

We just saw that histogram equalization performed on the three channels separately leads to poor results. The reason is that when each color channel is non-linearly transformed independently, you can get completely new and unrelated colors. 

The right way to perform histogram equalization on color images is to transform the image to a colorspace such as **HSV**, where the color information (hue and saturation) is separated from the intensity.

**WORKFLOW**

1. Transform the image to the HSV colorspace.
2. Perform histogram equalization only on the V channel.
3. Transform the image back to the BGR colorspace used by OpenCV.

In [None]:
# Read the color image.
img = cv2.imread('.jpg', cv2.IMREAD_COLOR)

# Convert to HSV.
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Perform histogram equalization only on the V channel, for value intensity.
img_hsv[:,:,2] = cv2.equalizeHist(img_hsv[:, :, 2])

# Convert back to BGR format.
img_eq = cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR)

# Display the images.
plt.figure(figsize = (18, 6))
plt.subplot(121); plt.imshow(img[:, :, ::-1]); plt.title('Original Color Image')
plt.subplot(122); plt.imshow(img_eq[:, :, ::-1]); plt.title('Equalized Image')