<a href="https://colab.research.google.com/github/Benned-H/Reading_List/blob/master/Notes/Hand_Recognition.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hand Segmentation Using Skin Color and Background Information
By Wei Wang, Jing Pan

This paper presents a new method for segmenting hands from the background of an image. Their method uses an adaptive skin color model with three steps:
1. Capture pixel values of hand and background
2. Propose Gaussian models on the color space
3. Segment the image using various models, intersect the results

Results were better than other skin-color-only models.

## 1. Introduction

Various applications make accurate hand recognition quite important, and segmentation is a crucial first step in this process. Because human skin generally falls within a limited range of hues, color-based hand recognition has been investigated for decades. The process depends on two choices: the **color space** and the **model of distribution** for skin colors. Prior work used a variety of techniques and spaces, including:
* Color spaces: Normalized RGB, CIE XYZ, HSV, HSI, YCbCr
* Gaussian model for distributions
* Edge detection
* Varied chrominance spaces
* Skin/edge information in various spaces

## 2. Color Space and Gaussian model

Their method was primarily concerned with the use of background information to help segmentation. Thus they only used the normalized RGB and YCbCr color spaces and a single Gaussian model.

**Normalized RGB**   
RGB is a convenient color model widely used for processing image data. Unfortunately, the RGB color space is not robust to lighting: the same surface color maps to different RGB values under different illumination. Normalized RGB was proposed to mitigate this problem, and it does perform better across lighting conditions, but *only when the illumination is uniform*. Normalized RGB can be calculated as:

$r=\frac{R}{R+G+B}$; $g=\frac{G}{R+G+B}$; $b=\frac{B}{R+G+B}$, where lowercase letters denote the normalized channels (so $r+g+b=1$).
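
A minimal sketch of the normalization above, to check that it really cancels overall brightness. The function name `normalize_rgb` and the handling of pure black (where the sum is zero) are my own assumptions, not something from the paper.

```python
def normalize_rgb(r, g, b):
    """Divide each channel by the channel sum so the results add up to 1."""
    total = r + g + b
    if total == 0:
        # Pure black has no defined chromaticity; an even split is an assumption.
        return (1 / 3, 1 / 3, 1 / 3)
    return (r / total, g / total, b / total)


bright = normalize_rgb(200, 120, 80)
dim = normalize_rgb(100, 60, 40)  # the same color at half the brightness
```

Both `bright` and `dim` come out as roughly (0.5, 0.3, 0.2), which is the point: scaling all three channels by the same factor leaves the normalized values unchanged.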

**YCbCr**   
YCbCr is considered to be better for our purposes than RGB. Skin colors cluster more tightly, the conversion is easy to calculate, and there is far less overlap between skin and non-skin tones across illumination conditions. YCbCr separates out a luminance signal (Y) and two chrominance components (Cb and Cr). We can discard the Y signal to improve robustness to lighting changes. The transform from RGB to YCbCr is simple:

$
\begin{bmatrix}
Y \\ Cb \\ Cr
\end{bmatrix}=
\begin{bmatrix}
0.2568 & 0.5041 & 0.0979 \\
-0.1482 & -0.2910 & 0.4392 \\
0.4392 & -0.3678 & -0.0714
\end{bmatrix}
\begin{bmatrix}
R \\ G \\ B
\end{bmatrix}+
\begin{bmatrix}
16 \\ 128 \\ 128
\end{bmatrix}
$
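
The matrix above translates directly into code. This is a sketch assuming 8-bit inputs in the range 0 to 255; the function name `rgb_to_ycbcr` is mine, and the coefficients are copied from the matrix as written.

```python
def rgb_to_ycbcr(r, g, b):
    """Apply the 3x3 matrix and offset vector from the notes to one pixel."""
    y = 0.2568 * r + 0.5041 * g + 0.0979 * b + 16
    cb = -0.1482 * r - 0.2910 * g + 0.4392 * b + 128
    cr = 0.4392 * r - 0.3678 * g - 0.0714 * b + 128
    return (y, cb, cr)


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
# For segmentation we would keep only (cb, cr) and drop y.
```

Each chrominance row of the matrix sums to zero, so any gray pixel (R = G = B) lands at Cb = Cr = 128 and only Y changes. That is why dropping Y removes most of the brightness variation.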

## Gaussian Mixture Model - [Brilliant](https://brilliant.org/wiki/gaussian-mixture-model/#)

Gaussian mixture models (GMMs) are probabilistic models for representing normally distributed subpopulations within an overall population (a normal distribution has mean = median = mode and is symmetric about its center). Mixture models don't need to know which subpopulation each data point belongs to, which makes them a form of unsupervised learning (e.g. human height data would combine two normal distributions, one per sex, which a GMM could capture).

**Motivation**   
We might want to try modeling data with a GMM if its distribution appears to have more than one peak (i.e. it is multimodal). Unimodal (one-peak) models would give a poor fit in such a case, and yet GMMs retain the computational benefits of a single Gaussian model.
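
To see the "more than one peak" idea concretely, here is a small sketch that evaluates a two-component mixture density. It only evaluates a mixture with known parameters; it does not fit one. The weights, means, and standard deviations are made-up illustrative numbers, not real height data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def mixture_pdf(x, components):
    """Weighted sum of Gaussians; components is a list of (weight, mean, std)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


# Hypothetical two-group height model in cm (illustrative values only).
heights = [(0.5, 160.0, 6.0), (0.5, 180.0, 6.0)]

at_first_peak = mixture_pdf(160.0, heights)
at_midpoint = mixture_pdf(170.0, heights)
at_second_peak = mixture_pdf(180.0, heights)
```

The density at 170 is lower than at either 160 or 180, so the mixture has a dip between two peaks. A single Gaussian fitted to the same data would put its one peak near 170, exactly where the data is sparsest.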

**To be continued upon additional probability background...**

## 2 cont.

The properties of skin color can be modeled using a Gaussian distribution, which has the formula:   
$f(x)=\frac{1}{\sqrt{2\pi \sigma ^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, where $\mu$ is the mean value of the samples and $\sigma$ is their standard deviation ($\sigma^2$ is the variance).

Using this model for skin color is a process of matching each pixel in the image. If matched, we consider the pixel as a skin pixel, and if not we consider it background. The two parameters ($\mu$ and $\sigma$) decide the structure of our Gaussian model, and need to be learned. A common method for this is **offline training** on thousands of images, but these authors use an adaptive skin color model, which uses the center of the hand skin to calculate and constantly update the parameters of the model. This **online** model seems to work better in different illuminations. Because this paper doesn't explain that model, I'll read through the source of this idea:
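
Since the paper doesn't spell out the matching step, here is only a rough sketch of what "learn $\mu$ and $\sigma$ from the hand, then match each pixel" could look like for a single channel. The sample values, the function names, and the rule "within `k` standard deviations of the mean" are all my assumptions, not the authors' method.

```python
import math


def fit_gaussian(samples):
    """Estimate the mean and standard deviation of a list of channel values."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    return mean, math.sqrt(variance)


def is_skin(value, mean, std, k=2.5):
    """Assumed matching rule: accept values within k standard deviations."""
    return abs(value - mean) <= k * std


# Hypothetical Cr values sampled from a patch at the center of the hand.
patch_cr = [150, 152, 149, 151, 153, 150, 148, 151]
mu, sigma = fit_gaussian(patch_cr)

near = is_skin(152, mu, sigma)  # close to the sampled skin values
far = is_skin(120, mu, sigma)   # far from them
```

Re-running `fit_gaussian` on a fresh patch from each new frame would be one simple way to get the "constantly update" behavior, though again that is a guess at the idea rather than the paper's procedure.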

# A New Method for Hand Segmentation Using Free-Form Skin Color Model
By Ahmad Yahya Dawod, Junaidi Abdullah, and Md. Jahangir Alam

Segmentation remains difficult; this paper proposes a new method using a free-form skin color model. The pixel values of the hand are represented in the YCbCr color space, and the Cb and Cr components are mapped onto a CbCr plane. To cluster the region of skin color on this plane, edge detection is used (as opposed to an ellipse) to construct a free-form skin color model.

## I. Introduction

The goal of hand segmentation is to detect the position and orientation of hands in an image; the aim of skin color pixel classification is to determine if a color pixel is a skin color or non-skin color. There are several techniques used to model the skin color:
* An elliptical boundary model which fits an ellipse on the CbCr plane. The ellipse ends up including non-skin pixel colors, unfortunately.
* Coarse model - Fixed straight lines are used as boundaries to a coarse region, but again this includes non-skin pixels.
* Estimate the boundary by constructing bilinear and bicubic boxes around the CbCr pixel cluster. Same issue.

This paper proposes a new method that uses a free-form boundary which models the skin color depending on the person and minimizes the inclusion of non-skin pixels.

## II. Suggested Method

Their method consists of four modules:
1. Image acquisition (skin region cropping)
2. Mapping (CbCr color space mapping)
3. Morphology (erosion & dilation)
4. Boundary creation (detecting edges)

**Image Acquisition**   
Because different people have different skin tones, the authors believe (and I agree) that we shouldn't just define some general range for skin tones. Thus we need to crop a skin image of the person using the system to develop a free-form model specific to them. As has been mentioned, choosing the right color space is the first step we need to tackle. Long story short, we choose YCbCr for the previously written reasons (taken from here, by the way). Skin image cropping just crops the image so we only see a patch of the user's skin as the cropped result. We can then form a cluster in our color space with this example.

**CbCr Color Space Mapping**   
They observed that the intensity value Y of YCbCr has little influence on the color distribution. On the Cb and Cr plane, we can generate a map where white (255) points are skin pixels and black (0) are non-skin. The resulting 256x256 map (one cell per pair of 8-bit Cb and Cr values) is the range of skin color present in the cropped image.
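
A sketch of how such a map could be built from the cropped patch, assuming the patch has already been converted to integer (Cb, Cr) pairs. The names `build_cbcr_map` and `lookup` are hypothetical, and I use 256 cells per axis so that every 8-bit value from 0 to 255 has a slot.

```python
def build_cbcr_map(skin_pixels, size=256):
    """Mark every (cb, cr) pair seen in the skin patch as white (255)."""
    cbcr_map = [[0] * size for _ in range(size)]
    for cb, cr in skin_pixels:
        cbcr_map[cb][cr] = 255
    return cbcr_map


def lookup(cbcr_map, cb, cr):
    """A pixel counts as skin if its (cb, cr) cell is white."""
    return cbcr_map[cb][cr] == 255


# Hypothetical (Cb, Cr) samples from a cropped skin patch; one pair repeats.
samples = [(110, 150), (111, 151), (110, 150)]
skin_map = build_cbcr_map(samples)
white_cells = sum(cell == 255 for row in skin_map for cell in row)
```

Repeated colors just overwrite the same cell, so the map records which colors occurred, not how often. A raw map like this tends to be speckled, which is what the morphology step cleans up.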

**Morphology**   
This stage uses image processing to create a cleaner single free-form shape. Two operations are used: Dilation adds pixels to fill in missing pixels in the white cluster and erosion removes extra pixels not belonging to the white cluster. Both help our resulting segmentation, and are applied in the order of dilation, erosion, and then edge point extraction.
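
Here is a small pure-Python sketch of the two operations on a binary map, applied in the dilation-then-erosion order described above. The 3x3 square neighborhood and the choice to ignore cells outside the image are my assumptions; the paper may use a different structuring element.

```python
def _apply(mask, combine):
    """Replace each cell with combine() of its 3x3 neighborhood (clipped at borders)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = combine(window)
    return out


def dilate(mask):
    """A cell becomes white if any neighbor is white."""
    return _apply(mask, max)


def erode(mask):
    """A cell stays white only if every neighbor is white."""
    return _apply(mask, min)


# 7x7 map with a 3x3 white block that has a one-pixel hole in its center.
cluster = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        cluster[y][x] = 255
cluster[3][3] = 0

cleaned = erode(dilate(cluster))
```

Dilation fills the hole but also grows the block by one pixel on every side; the erosion that follows shrinks it back, so `cleaned` is the original 3x3 block with the hole filled.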

**Boundary Creation**   
Here we determine the actual region which will define skin and non-skin color pixels. We consider the white cluster and apply an edge detection algorithm, for which there are a few different methods (gradient, Laplacian). The **gradient method** detects an edge by looking for maxima and minima in the first derivative of the image, whereas the **Laplacian method** searches for zero crossings in the second derivative of the image.
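
A one-dimensional toy example helps me see why the two criteria point at the same place. The signal values are made up, and finite differences stand in for derivatives; this is an illustration of the idea, not either paper's implementation.

```python
def first_difference(values):
    """Discrete stand-in for a derivative: differences of neighboring samples."""
    return [b - a for a, b in zip(values, values[1:])]


def zero_crossings(values):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i, (a, b) in enumerate(zip(values, values[1:])) if a * b < 0]


signal = [0, 0, 1, 4, 8, 9, 9]       # a soft step from dark to bright
d1 = first_difference(signal)        # [0, 1, 3, 4, 1, 0]
d2 = first_difference(d1)            # [1, 2, 1, -3, -1]

gradient_edge = d1.index(max(d1))    # 3: where the first difference peaks
laplacian_edge = zero_crossings(d2)  # [2]: sign change between d2[2] and d2[3]
```

The second difference changes sign between `d2[2]` and `d2[3]`, which are the slopes just before and just after `d1[3]`. So the zero crossing brackets the same location where the first difference peaks.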

For gradient edge detection, given image function $f(x,y)$, the gradient magnitude $g(x,y)$ and direction $\theta(x,y)$ are computed as:   
$g(x,y)\cong \sqrt{\Delta x^2 + \Delta y^2}$ and $\theta(x,y)\cong \arctan\left(\frac{\Delta y}{\Delta x}\right)$, where $\Delta x =f(x+n,y)-f(x-n,y)$, $\Delta y =f(x,y+n)-f(x,y-n)$, and $n$ is a small integer.
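
The central-difference formulas above can be sketched for a single pixel as follows. I index the image as `image[y][x]`, and I swap in `math.atan2` for the plain arctangent of the ratio so that $\Delta x = 0$ doesn't cause a division by zero; both choices are mine rather than the paper's.

```python
import math


def gradient(image, x, y, n=1):
    """Return (magnitude, direction in radians) at (x, y).

    The point must lie at least n pixels away from every border.
    """
    dx = image[y][x + n] - image[y][x - n]
    dy = image[y + n][x] - image[y - n][x]
    return math.hypot(dx, dy), math.atan2(dy, dx)


# Hypothetical 3x5 map: black on the left, white on the right.
image = [[0, 0, 0, 255, 255] for _ in range(3)]

flat = gradient(image, 1, 1)  # inside the black region
edge = gradient(image, 2, 1)  # next to the black-to-white transition
```

Here `flat` has zero magnitude, while `edge` has magnitude 255 and direction 0, i.e. the brightness increases along the positive x axis. Thresholding the magnitude over the whole map would give the boundary pixels.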

To be continued...   
*--Last revised 6/22/2019--*

# To Read:

* https://wolfcrow.com/understanding-luminance-and-chrominance/
* https://homepages.inf.ed.ac.uk/rbf/HIPR2/dilate.htm
* https://homepages.inf.ed.ac.uk/rbf/HIPR2/erode.htm
* Finish the first paper