# FUNCTION histstat

The `histstat` function computes statistics from an image histogram. The histogram itself is produced by the function in the [histogram](histogram.ipynb) notebook.

<strong> stats = histstat(f, h, mask=[]) </strong>
<ul>
    <li> <strong> Output </strong> </li>
        <ul> 
            <li> <strong> stats: </strong> array with the 12 statistics of the histogram </li>
        </ul> 
    <li> <strong> Input </strong> </li>
        <ul>
            <li> <strong> f: </strong> input image (two-dimensional or three-dimensional)</li>
            <li> <strong> h: </strong> image histogram </li>
            <li> <strong> mask: </strong> binary mask with the same dimensions as the image (accepted, but not used by the current implementation, which computes every statistic from h alone)</li>
        </ul>
</ul>

This function derives the following statistics from the histogram produced by the histogram function. They are stored in the output array in this order:

<ul>
    <li> Mean; </li>
    <li> Variance; </li>
    <li> Skewness; </li>
    <li> Kurtosis; </li>
    <li> Percentile 01%; </li>
    <li> Percentile 10%; </li>
    <li> Percentile 50%; </li>
    <li> Percentile 90%; </li>
    <li> Percentile 99%; </li>
    <li> Entropy; </li>
    <li> Mode; </li>
    <li> Median; </li>
</ul>
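
Because the result is a plain array, the position of each statistic has to be remembered. The following is a minimal sketch of one way to attach names to the entries; `STAT_NAMES` and `label_stats` are hypothetical helpers, not part of the notebook, and the sample values are the ones printed by the first numeric example in the EXAMPLES section.

```python
import numpy as np

# Hypothetical helper: names for the 12 entries, in the order listed above.
STAT_NAMES = [
    "mean", "variance", "skewness", "kurtosis",
    "p01", "p10", "p50", "p90", "p99",
    "entropy", "mode", "median",
]

def label_stats(stats):
    """Pair each entry of a 12-element statistics array with its name."""
    stats = np.asarray(stats, dtype=float)
    if stats.shape != (len(STAT_NAMES),):
        raise ValueError("expected a 1-D array with 12 entries")
    return dict(zip(STAT_NAMES, stats.tolist()))

# Values printed by the first numeric example of this notebook.
example = [1.22222222, 0.39506173, -0.20992233, -0.62109375,
           0., 1., 1., 2., 2., 0.40688542, 1., 1.]
labeled = label_stats(example)
print(labeled["variance"], labeled["entropy"])
```

A dictionary keeps the sketch dependency-free; a named tuple or a small dataclass would serve equally well.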

In [2]:
import nbimporter
from auxiliary_functions import *
import ia636 as ia
import numpy as np
import matplotlib
from IPython.display import HTML

def histstat(f, h, mask=[]):
    # f and mask are accepted as arguments but are not used:
    # every statistic below is derived from the histogram h alone.
    hn = 1.0*h/h.sum()  # normalized histogram
    stats = np.zeros(12)
    
    ### compute statistics ###
    
    n = len(h) # number of gray values
    stats[0] = np.sum(np.arange(n)*hn)                                                   # mean
    stats[1] = np.sum(np.power((np.arange(n)-stats[0]),2)*hn)                            # variance
    stats[2] = np.sum(np.power((np.arange(n)-stats[0]),3)*hn)/(np.power(stats[1],1.5))   # skewness
    stats[3] = np.sum(np.power((np.arange(n)-stats[0]),4)*hn)/(np.power(stats[1],2))-3   # excess kurtosis
    stats[4:9] = np.round(ia.iah2percentile(h,np.array([1,10,50,90,99])))                # 1st, 10th, 50th, 90th, 99th percentiles
        
    ## extra attributes ##
    
    stats[9] = -1*(hn*np.log10(hn+np.spacing(1))).sum()  # entropy (base-10 logarithm)
    stats[10] = np.argmax(hn)                            # mode
    stats[11] = np.where(np.cumsum(hn) >= 0.5)[0][0]     # median
    
    # a constant image has zero variance, so the ratios above are undefined;
    # skewness and kurtosis are set to 0 in that case
    if stats[1] == 0:
        stats[2] = 0
        stats[3] = 0
    
    return (stats)



Importing Jupyter notebook from ia636.ipynb
Importing Jupyter notebook from histogram.ipynb


# EXAMPLES

In [3]:
import nbimporter
import numpy as np
import ia636 as ia
import scipy
import histogram as h

# Numeric Example
f = np.array([1,1,1,0,1,2,2,2,1])
histo = h.hist(f)
stats = histstat(f, histo)
print('Numeric Example: Statistics')
print(stats)


Numeric Example: Statistics
[ 1.22222222  0.39506173 -0.20992233 -0.62109375  0.          1.          1.
  2.          2.          0.40688542  1.          1.        ]
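
The first two values above can be cross-checked directly against the raw data. The sketch below assumes that `h.hist(f)` returns one count per gray value starting at 0; since that function is defined in another notebook, `np.bincount` is used here as a stand-in.

```python
import numpy as np

f = np.array([1, 1, 1, 0, 1, 2, 2, 2, 1])

# Assumed stand-in for h.hist(f): occurrences of each gray value 0, 1, 2.
h = np.bincount(f)            # one 0, five 1s, three 2s
hn = h / h.sum()              # normalized histogram
levels = np.arange(len(h))    # gray values 0 .. n-1

# Mean and variance computed from the histogram, as histstat does.
mean_hist = np.sum(levels * hn)
var_hist = np.sum((levels - mean_hist) ** 2 * hn)

# The same quantities computed from the pixels themselves.
print(mean_hist, f.mean())
print(var_hist, f.var())
```

Both pairs agree (11/9 for the mean and 32/81 for the variance), which is expected because the histogram is only a regrouping of the same pixel values.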


In [4]:
# Numeric Example using a mask
# (histstat does not use the mask, so the output equals the unmasked example)


f = np.array([1,1,1,0,1,2,2,2,1])
histo = h.hist(f)
mask = f>0
stats = histstat(f, histo, mask)
print('Numeric Example using a mask: Statistics')
print(stats)



Numeric Example using a mask: Statistics
[ 1.22222222  0.39506173 -0.20992233 -0.62109375  0.          1.          1.
  2.          2.          0.40688542  1.          1.        ]
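
The output is identical to the unmasked example because `histstat` reads only the histogram and never applies `mask`. One possible way to restrict the statistics to the masked region is to build the histogram from the selected pixels only. This is a sketch under the assumption that the histogram holds one count per gray value, with `np.bincount` standing in for `h.hist`; `hist_mean` is a hypothetical helper.

```python
import numpy as np

f = np.array([1, 1, 1, 0, 1, 2, 2, 2, 1])
mask = f > 0

# Histogram of all pixels versus histogram of the masked pixels only.
h_all = np.bincount(f)
h_masked = np.bincount(f[mask], minlength=len(h_all))

def hist_mean(h):
    """Mean gray value computed from a histogram of counts."""
    hn = h / h.sum()
    return np.sum(np.arange(len(h)) * hn)

print(h_all, hist_mean(h_all))
print(h_masked, hist_mean(h_masked))
```

Dropping the single zero-valued pixel raises the mean from 11/9 to 11/8, so a mask that is actually applied does change the statistics.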


In [5]:
# Numeric Example 3D

f = np.array([1,1,1,0,1,2,2,2,1,0,0,0,1,1,1,1,2,0,1,1,2,1,0,0]).reshape(2,3,4)
histo = h.hist(f)
stats = histstat(f, histo)
print('Numeric Example 3D: Statistics')
print(stats)

Numeric Example 3D: Statistics
[ 0.91666667  0.49305556  0.11700664 -0.97242611  0.          0.          1.
  2.          2.          0.44851494  1.          1.        ]


In [6]:
# Numeric Example 3D (array written as an explicit nested list)

f = np.array([[[1,2,2,0,0,1],[0,0,1,2,2,1],[1,1,0,0,0,2],[1,1,1,2,2,2],[1,1,2,2,0,0]], [[1,2,2,0,0,1],[0,0,1,2,2,1],[1,1,0,0,0,2],[1,1,1,2,2,2],[1,1,2,2,0,0]], [[1,2,2,0,0,1],[0,0,1,2,2,1],[1,1,0,0,0,2],[1,1,1,2,2,2],[1,1,2,2,0,0]]])
histo = h.hist(f)
stats = histstat(f, histo)
print('Numeric Example 3D: Statistics')
print(stats)


Numeric Example 3D: Statistics
[ 1.03333333  0.63222222 -0.05953097 -1.41606308  0.          0.          1.
  2.          2.          0.47567118  1.          1.        ]


In [7]:
# Numeric Example 3D using a mask
# (histstat does not use the mask, so the output equals the unmasked 3D example)

f = np.array([1,1,1,0,1,2,2,2,1,0,0,0,1,1,1,1,2,0,1,1,2,1,0,0]).reshape(2,3,4)
histo = h.hist(f)
stats = histstat(f, histo, mask = f>0)
print('Numeric Example 3D using a mask: Statistics')
print(stats)



Numeric Example 3D using a mask: Statistics
[ 0.91666667  0.49305556  0.11700664 -0.97242611  0.          0.          1.
  2.          2.          0.44851494  1.          1.        ]


In [None]:
# Image Example
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import ia636 as ia

f = plt.imread('cameraman.png')  # the second argument of imread is a file format, not a colormap
print('f shape: ', f.shape)
plt.imshow(f, 'gray')
histo = h.hist(f)
stats = histstat(f, histo)
print('Image Example: Statistics')
print(stats)
print()


# EQUATIONS



The following equations define the statistics, where $N$ denotes the number of intensity levels in the image (intensities usually range from 0 to 255), $p(x)$ is the probability mass function, i.e. the normalized histogram (h) of the image, and $F$ is a random variable (any image) whose values are the pixel intensities $f_i$.

$\textbf{Mean}$
$$\mu(F) = \sum\limits_{i=1}^N f_i p(f_i)$$


$\textbf{Variance}$
$$Var(F) = \sum\limits_{i=1}^N (f_i - \mu )^{2}p(f_i)$$


$\textbf{Skewness}$
$$Skewness(F) = \frac{\sum\limits_{i=1}^N (f_i - \mu )^ {3} p(f_i)}{Var^{1.5}}$$


$\textbf{Kurtosis}$
$$Kurtosis(F) = {\frac{\sum\limits_{i=1}^N (f_i - \mu )^ {4} p(f_i)}{Var^{2}}}-3$$
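
As a worked instance of the skewness and kurtosis formulas, the sketch below evaluates them for the histogram of the first numeric example (one 0, five 1s, three 2s). `standardized_moments` is a hypothetical helper written for this illustration; it follows the same conventions as the code above, including the subtraction of 3 and the value 0 for a zero-variance histogram.

```python
import numpy as np

def standardized_moments(h):
    """Return (skewness, excess kurtosis) of a histogram of counts."""
    h = np.asarray(h, dtype=float)
    hn = h / h.sum()                 # p(f_i)
    x = np.arange(len(h))            # f_i, the gray values
    mu = np.sum(x * hn)
    var = np.sum((x - mu) ** 2 * hn)
    if var == 0:                     # constant image: both ratios are undefined
        return 0.0, 0.0
    skew = np.sum((x - mu) ** 3 * hn) / var ** 1.5
    kurt = np.sum((x - mu) ** 4 * hn) / var ** 2 - 3
    return skew, kurt

skew, kurt = standardized_moments([1, 5, 3])
print(skew, kurt)
```

The results match the third and fourth values printed by the first numeric example (about -0.2099 and -0.6211).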


$\textbf{Percentile}$

The k-th percentile $P_k$ is the value $f_k$ at which the cumulative frequency reaches $\frac{N k}{100}$, where, in this definition only, $N$ denotes the sample size (the total number of pixels).
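
A minimal sketch of this definition takes the first gray value whose cumulative relative frequency reaches k/100. `hist_percentile` is a hypothetical helper; it is not the `ia.iah2percentile` routine used by `histstat`, whose exact rule is not shown in this notebook, so the two need not agree for every input.

```python
import numpy as np

def hist_percentile(h, k):
    """First gray value whose cumulative relative frequency is >= k/100."""
    h = np.asarray(h, dtype=float)
    cdf = np.cumsum(h) / h.sum()
    idx = np.searchsorted(cdf, k / 100.0, side="left")
    return int(min(idx, len(h) - 1))   # guard against rounding at k = 100

h = [1, 5, 3]                          # histogram of the first numeric example
p50 = hist_percentile(h, 50)
p99 = hist_percentile(h, 99)
print(p50, p99)
```

With k = 50 this rule coincides with the way `histstat` computes the median entry.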


$\textbf{Entropy}$
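$$ Entropy(F) = - \sum\limits_{i=1}^N p(f_i) \log_{10}{(p(f_i))}$$

The base-10 logarithm matches the implementation above (`np.log10`); a base-2 logarithm would express the same quantity in bits.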
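
The choice of logarithm base only rescales the entropy. The sketch below makes the base a parameter; `hist_entropy` is a hypothetical helper, and it skips empty bins instead of adding a small epsilon as `histstat` does, which gives the same value up to rounding.

```python
import numpy as np

def hist_entropy(h, base=10.0):
    """Entropy of a histogram of counts, using logarithms of the given base."""
    h = np.asarray(h, dtype=float)
    p = h / h.sum()
    p = p[p > 0]                       # empty bins contribute nothing
    return float(-np.sum(p * np.log(p)) / np.log(base))

h = [1, 5, 3]                          # histogram of the first numeric example
e10 = hist_entropy(h, base=10)         # same base as np.log10 in histstat
e2 = hist_entropy(h, base=2)           # the same quantity expressed in bits
print(e10, e2)
```

The base-10 value reproduces the entropy entry of the first numeric example (about 0.4069), and the base-2 value equals it divided by $\log_{10} 2$.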

$\textbf{Mode}$

The intensity value that appears most often in the image, i.e. the position of the highest histogram bin.


$\textbf{Median}$

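The median is the value that separates the higher half of a data sample or probability distribution from the lower half. In the code above it is the first intensity whose cumulative normalized histogram reaches 0.5.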


# REFERENCES

CARTER, T., An introduction to information theory and entropy, 2014.

DEAN, S., ILLOWSKY, B. Descriptive Statistics: Skewness and the Mean, Median, and Mode.

EVERITT, B. S., SKRONDAL, A. The Cambridge Dictionary of Statistics, Fourth Edition, 2010.

STOCKBURGER, D. W. Introductory Statistics: Concepts, Models, and Applications, 1996.

