# Histograms (1/3)

A histogram is a visual representation of the distribution of a continuous feature. It can also be overlaid with a kernel density estimation (KDE) plot, a smoothed curve that shows the general trend of the frequency distribution.
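
As a minimal sketch of the idea, the snippet below bins a synthetic continuous feature with NumPy. The sample data, the seed and the choice of 10 bins are assumptions made for illustration and are not part of this notebook's data.

```python
import numpy as np

# Synthetic continuous feature (assumed for illustration): 1000 draws from a normal distribution
rng = np.random.default_rng(0)
feature = rng.normal(loc=0.0, scale=1.0, size=1000)

# Split the value range into 10 equal-width bins and count the samples in each one
counts, bin_edges = np.histogram(feature, bins=10)

# density=True rescales the counts so that the bar areas sum to 1,
# which is the scale a kernel density estimate is drawn on
density, _ = np.histogram(feature, bins=10, density=True)
bin_widths = np.diff(bin_edges)
```

Here `counts` holds one frequency per bin and `bin_edges` holds the 11 boundaries of the 10 bins.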

## Image Histograms

For images, we can display the frequency of values for colors. Each of the three RGB channels has values between 0 and 255. We can plot these as 3 histograms on top of each other to see how much of each channel there is.

A value of `0` denotes minimal intensity (black), so peaks near `0` mean the image contains many dark pixels.
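
To make the counting concrete, here is a small NumPy sketch on a hand-made image. The 4x4 array and its single red pixel are assumptions for illustration; the notebook's real images are loaded further down.

```python
import numpy as np

# Synthetic 4x4 RGB image (assumed for illustration): black with one bright red pixel
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[0, 0] = (255, 0, 0)

# Count how often each intensity 0..255 occurs in every channel
hist_r = np.bincount(img[:, :, 0].ravel(), minlength=256)
hist_g = np.bincount(img[:, :, 1].ravel(), minlength=256)
hist_b = np.bincount(img[:, :, 2].ravel(), minlength=256)

print(hist_r[0], hist_r[255])  # 15 dark values and 1 saturated value in the red channel
print(hist_g[0], hist_b[0])    # all 16 values are 0 in green and blue
```

Every channel has its peak at `0`, which is what a mostly black image looks like in a histogram.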

It is possible to create image histograms with matplotlib and OpenCV.


In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

In [None]:
dark_horse = cv2.imread('../data/horse.jpg') # OpenCV loads images in BGR channel order
# print(type(dark_horse)) # check that loading succeeded (cv2.imread returns None on failure)
# plt.imshow(dark_horse)
show_horse = cv2.cvtColor(dark_horse, cv2.COLOR_BGR2RGB) # OpenCV loaded image in BGR space but matplotlib requires RGB
plt.imshow(show_horse)
# there are lots of black pixels, so we can expect peaks around zero
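
The BGR to RGB conversion is just a reordering of the last array axis. The sketch below reproduces the swap with NumPy slicing on a hand-made two-pixel array (an assumption for illustration); the notebook itself uses `cv2.cvtColor` for this step.

```python
import numpy as np

# Hypothetical 1x2 image in BGR order: a pure blue pixel and a pure red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channel (BGR -> RGB)
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # the blue pixel is now (0, 0, 255) in RGB order
print(rgb[0, 1])  # the red pixel is now (255, 0, 0) in RGB order
```

This also explains why an unconverted image shows swapped red and blue tones in `plt.imshow`.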

In [None]:
rainbow = cv2.imread('../data/rainbow.jpg')
# print(type(rainbow)) # check if image loading was successful
# plt.imshow(rainbow)
show_rainbow = cv2.cvtColor(rainbow, cv2.COLOR_BGR2RGB) # OpenCV loaded image in BGR space but matplotlib requires RGB
plt.imshow(show_rainbow)
# colors seem to be equally distributed across values; no significant peaks are expected in histogram

In [None]:
bricks = cv2.imread('../data/bricks.jpg')
# print(type(bricks)) # check if image loading was successful
# plt.imshow(bricks)
show_bricks = cv2.cvtColor(bricks, cv2.COLOR_BGR2RGB) # OpenCV loaded image in BGR space but matplotlib requires RGB
plt.imshow(show_bricks)
# we can expect peaks for blue color

In [None]:
# channels: OpenCV loads images in BGR order, so blue has index 0; we pass the BGR image (bricks) to get the blue histogram
# mask: optional; when given, the histogram is calculated only for the masked part of the image
# histSize and ranges: 256 bins over [0, 256); the upper limit is not included
hist_values = cv2.calcHist([bricks], channels=[0], mask=None, histSize=[256], ranges=[0, 256])
hist_values.shape
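
The effect of the mask and of the half-open range can be sketched without OpenCV. The snippet below is a NumPy stand-in, not `cv2.calcHist` itself; the 4x4 array and the top-half mask are assumptions for illustration. Note that `cv2.calcHist` expects the mask as an 8-bit array of the same size as the image, whereas this sketch uses a boolean array.

```python
import numpy as np

# Hypothetical 4x4 single-channel image with intensities 0..15
channel = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Boolean mask selecting only the top half of the image
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :] = True

# 256 bins over the half-open range [0, 256): one bin per intensity 0..255
full_hist = np.bincount(channel.ravel(), minlength=256)
masked_hist = np.bincount(channel[mask], minlength=256)

print(full_hist.shape)    # (256,)
print(full_hist.sum())    # 16, every pixel is counted once
print(masked_hist.sum())  # 8, only the masked pixels are counted
```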

In [None]:
plt.plot(hist_values)

In [None]:
plt.imshow(show_horse)

In [None]:
# use the BGR image so that channel 0 is blue
hist_values = cv2.calcHist([dark_horse], channels=[0], mask=None, histSize=[256], ranges=[0, 256])
plt.plot(hist_values)

## How to plot all 3 color histograms at once

In [None]:
def show_histogram(img, title, xmax, ymax=None):
    # img must be in BGR order (as loaded by cv2.imread) so that the line colors match the channels
    color = ('b', 'g', 'r')
    for i, clr in enumerate(color):
        histr = cv2.calcHist([img], channels=[i], mask=None, histSize=[256], ranges=[0, 256])
        plt.plot(histr, color=clr)
    plt.xlim([0, xmax])
    if ymax is not None:
        plt.ylim([0, ymax])
    plt.title(title)

show_histogram(bricks, 'HISTOGRAM FOR BLUE BRICKS', 256)
# The histogram shows very little contribution from red and more from green and blue

In [None]:
# Let's do the same thing for horse:
show_histogram(dark_horse,'HISTOGRAM FOR DARK HORSE', 256)

In [None]:
show_histogram(dark_horse,'HISTOGRAM FOR DARK HORSE', 15, 500000)

In [None]:
# total number of pixels in this image (height * width);
# the counts in each channel's histogram sum to this number
dark_horse.shape[0] * dark_horse.shape[1]
# most of them are black - close to (0, 0, 0)
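
To put a number on "most of them are black", one option is to count the pixels whose three channels all fall below a small threshold. The sketch below does this on a synthetic image. The array, the white patch and the threshold of 15 are assumptions for illustration, not measurements from `horse.jpg`.

```python
import numpy as np

# Synthetic 10x10 BGR image (assumed for illustration): black except for a 2x5 white patch
img = np.zeros((10, 10, 3), dtype=np.uint8)
img[:2, :5] = 255

n_pixels = img.shape[0] * img.shape[1]  # height * width = 100
n_values = img.size                     # height * width * channels = 300

# A pixel counts as "near black" when all three channels are below the threshold
threshold = 15  # hypothetical cut-off, matching the zoomed x-range used above
near_black = np.all(img < threshold, axis=2)
fraction = near_black.sum() / n_pixels

print(n_pixels, n_values, fraction)  # 100 300 0.9
```

Replacing `img` with `dark_horse` would give the corresponding fraction for the real image.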