# *Classification of Natural vs. Computer Generated Images*

## Authors
* Yixiao Feng
* Jyothsna Kurra
* Justin Lewis
* Tim Woodbury

### -------------------------------------------------------------------------------------------------------------------

## Project Overview:

* Two sets of training data given: Natural ("scenes") and computer generated ("synthetic")
* Model-based classification
* Generalized detection approach:
    * Identify appropriate features
    * Train models for the features
    * Validate selected models against the provided data sets
    
Four strategies are considered for discriminating scenes and synthetic images.
These are as follows: (1) edge detection; (2) grayscale intensity; (3) RGB peak density; (4) image sharpness.
The strategies are each discussed in one of the following sections, in which both the motivation for a particular strategy and results are shown.

### -------------------------------------------------------------------------------------------------------------------

### I) Edge Detection

Edge detection is considered to distinguish synthetic and natural images.
Natural images, which have greater sources of distortion and imperfect lighting, are expected to have a greater number of pixels detectable as edges.
This section will present the basic approach used including the training of models for hypothesis testing.
Subsequently the likelihood ratio test for image discrimination is presented and the correct classification rate is discussed.

#### Approach
Gradient-based edge detection is used.
Pixels for which the gradient exceeds a fixed threshold are classified as edges.
The threshold is determined using a training set of 23 synthetic and 23 natural images.
The following figure shows normalized histograms of the pixel gradients across all images in the data sets.
Based on the histogram, a threshold value of 4 was chosen.
All pixels whose gradients exceed the threshold are classified as edges.

<img style="float: left;" src="img1.png">
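
The thresholding step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not say which gradient operator was used, so the sketch assumes central differences via `np.gradient` and the Euclidean gradient magnitude on a grayscale array. The function name `count_edge_pixels` is hypothetical, and resizing to a common image size is omitted.

```python
import numpy as np

def count_edge_pixels(gray, threshold=4.0):
    """Count pixels whose gradient magnitude exceeds `threshold`.

    Assumption: central-difference gradients and Euclidean magnitude;
    the source only states that a gradient threshold of 4 was chosen.
    """
    gray = np.asarray(gray, dtype=float)
    # np.gradient returns the derivative along rows, then along columns.
    d_row, d_col = np.gradient(gray)
    magnitude = np.hypot(d_row, d_col)
    return int(np.count_nonzero(magnitude > threshold))

# Toy 6x6 image: dark left half, bright right half (one vertical step edge).
toy = np.zeros((6, 6))
toy[:, 3:] = 100.0
print(count_edge_pixels(toy))  # prints 12: two columns next to the step, six rows
```

On real data the count is only comparable between images of the same size, which is why a common resolution matters before counting.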

The threshold is applied to the same training set.
For uniformity, each image is resized to $(540 \times 960)$ before it is processed, ensuring that differences in edge counts are not due to differences in size.
The number of edge pixels, as determined by the threshold on the gradient, is then totalled.
Applying this metric to the synthetic and natural images separately, a histogram for the number of edge pixels in the training images is obtained.
These histograms are shown in the following figure.

<img style="float: left;" src="img2.png">

The histograms are coarse because of the small size of the training sets.
However, they provide a means for determining a plausible probability density function (PDF) for the synthetic and natural image sets.
The histograms have very wide tails, so Cauchy distributions are fit to the data.
The Cauchy distributions for natural and synthetic images have the following parameters:

\begin{align}
\mathrm{scenes}: x_0 = 131700, \gamma = 52984.7 \\
\mathrm{synthetic}: x_0 = 35760.8, \gamma = 5377.61
\end{align}

Here, $x_0$ indicates the location parameter (the peak of the distribution; a Cauchy distribution has no defined mean) and $\gamma$ the scale parameter.
Having determined approximate distributions for the number of edges in the synthetic and natural images, it is a straightforward matter to apply a likelihood ratio test for a new candidate image to classify it as synthetic or natural.
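
For reference, the standard Cauchy density with location $x_0$ and scale $\gamma$, evaluated at an edge count $z$, is:

\begin{equation}
p(z; x_0, \gamma) = \frac{1}{\pi \gamma \left[ 1 + \left( \frac{z - x_0}{\gamma} \right)^2 \right]}
\end{equation}

Its tails decay only as $1/z^2$, which is why it accommodates the wide tails seen in the histograms.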

#### Likelihood ratio test and performance

The likelihood ratio test compares the likelihood of the measured datum, $z$, under the two candidate hypotheses.
A Bayesian framework is used, so "synthetic" can be chosen arbitrarily as the null hypothesis and "natural" as the test hypothesis.
In the Bayesian framework, the likelihood ratio is as follows:

\begin{equation}
\frac{p_0(z)}{p_1(z)} \underset{H_1}{\overset{H_0}{\gtrless}} \frac{Pr(H_1)}{Pr(H_0)}
\end{equation}

$Pr(H_i)$ indicates the probability of a hypothesis $H_i$ and $p_i(z)$ is the associated probability density function for the test statistic.
For simplicity, synthetic and natural images are treated as equally likely, so the likelihood ratio is compared to one.
The following figure shows the likelihood ratios for the synthetic and natural image sets.
The likelihood threshold based on the priors is plotted for comparison.
Clearly, the test is conservative with respect to synthetic images: no synthetic image is misclassified, but some natural scenes go undetected.
Better performance could likely be obtained with larger training sets and a more refined edge detection scheme.

<img style="float: left;" src="img3.png">
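
A minimal sketch of the equal-prior test, using the Cauchy parameters quoted earlier. The function names are hypothetical, and the edge counts passed in at the end are made-up inputs chosen only to exercise both outcomes.

```python
import math

def cauchy_pdf(z, x0, gamma):
    """Cauchy density with location x0 and scale gamma."""
    return 1.0 / (math.pi * gamma * (1.0 + ((z - x0) / gamma) ** 2))

# Fitted parameters quoted in the text (edge-pixel counts).
SCENES = {"x0": 131700.0, "gamma": 52984.7}
SYNTHETIC = {"x0": 35760.8, "gamma": 5377.61}

def classify_edge_count(z):
    """Return 'synthetic' (H0) or 'natural' (H1) for an edge count z.

    With equal priors the likelihood ratio p0(z) / p1(z) is compared to one.
    """
    ratio = cauchy_pdf(z, **SYNTHETIC) / cauchy_pdf(z, **SCENES)
    return "synthetic" if ratio > 1.0 else "natural"

# Illustrative edge counts (not taken from the data set).
print(classify_edge_count(36000))   # prints synthetic
print(classify_edge_count(130000))  # prints natural
```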

Using equal priors, there are zero false positives out of ninety-nine synthetic images and seven false negatives out of fifty-six natural scenes.
This is a total error rate of just about 4.5% for the whole data set.
It should be noted that the test set includes the images used in training the statistics for the Cauchy distributions.
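
As a check on the quoted total, pooling both kinds of error over all 155 test images gives:

\begin{equation}
\frac{0 + 7}{99 + 56} = \frac{7}{155} \approx 0.045
\end{equation}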

### -------------------------------------------------------------------------------------------------------------------

### II) Grayscale Intensity

#### Approach: 
Synthetic and natural images are expected to have measurably different intensity distributions.
Natural images, in general, are expected to have a smoother distribution of intensity values, while synthetic images have a sharper distribution.
The strategy is to compute the sum of the differences between adjacent bins of the grayscale histogram, which should be a function of histogram smoothness.
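
One way to compute such a metric is sketched below. Several details are assumptions rather than statements from the text: the histogram is normalized to sum to one, 256 bins over 8-bit intensities are used, and absolute differences are summed (signed differences would telescope to the last bin minus the first). The name `histogram_roughness` is hypothetical.

```python
import numpy as np

def histogram_roughness(gray, bins=256):
    """Sum of absolute differences between adjacent grayscale histogram bins.

    Assumptions: 8-bit intensities, a histogram normalized to sum to one,
    and absolute (not signed) differences.
    """
    counts, _ = np.histogram(np.asarray(gray).ravel(), bins=bins, range=(0, 256))
    hist = counts / counts.sum()
    return float(np.abs(np.diff(hist)).sum())

# A perfectly flat histogram is as smooth as possible.
print(histogram_roughness(np.arange(256)))         # prints 0.0
# A single-intensity image gives one isolated spike.
print(histogram_roughness(np.full((4, 4), 128)))   # prints 2.0
```

Under these assumptions the metric lies between 0 and 2, with larger values indicating a spikier histogram.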

#### Simple illustration of natural image histogram

<img src="Image_codeline3.png">

#### Fitted Image Histograms

The following image shows a histogram of the grayscale intensity metric for the natural images.
A normal distribution fit to the data is shown, and is used to approximate the distribution for computing the likelihood ratio.

<img src="Image_codeline9.png">

The following image shows a histogram of the grayscale intensity metric for the synthetic images.

<img src="Image_codeline9(2).png">

The final image shows the distributions for the natural images (blue) and the synthetic images (green).
The continuous PDFs are used in a likelihood ratio test for new images in the same fashion introduced in Section I.

<img src="Image_codeline9(3).png">
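
The fit-then-compare procedure can be sketched with the standard library. The metric values below are invented purely for illustration, and fitting a normal distribution to the synthetic set as well is an assumption, since the text only states this explicitly for the natural images.

```python
from statistics import NormalDist

# Hypothetical metric values, invented for illustration only.
natural_metrics = [0.20, 0.25, 0.22, 0.30, 0.27]
synthetic_metrics = [0.60, 0.75, 0.66, 0.81, 0.70]

# Fit a normal distribution (sample mean and standard deviation) to each set.
natural_fit = NormalDist.from_samples(natural_metrics)
synthetic_fit = NormalDist.from_samples(synthetic_metrics)

def classify_metric(z):
    """Pick the class whose fitted density is larger at z (equal priors)."""
    return "natural" if natural_fit.pdf(z) > synthetic_fit.pdf(z) else "synthetic"

print(classify_metric(0.24))  # prints natural
print(classify_metric(0.70))  # prints synthetic
```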

#### Result

The proposed likelihood ratio test based on grayscale histogram difference is evaluated on the available image sets.
In this section, the null hypothesis is a natural image, so the false positive rate indicates the fraction of images incorrectly classified as synthetic.

* False Positive Rate: 9.6%
* False Negative Rate: 26.5%
* Total Error Rate: 18%

### -------------------------------------------------------------------------------------------------------------------

### III) RGB Peak Density

#### Approach

The intensity-based metric of Section II performs reasonably well.
However, it seems plausible that better performance could be achieved by leveraging the color information of the images.
This is particularly reasonable since the synthetic images are all of the same 3D models.
The limitation of this approach is that it is not practical to develop a model that is predictive for all of the natural images.

The feature of interest is essentially the same metric as in Section II, but applied to all three color channels (R, G, B) simultaneously.
The goal is to measure the sharpness of all three color histograms.
The feature metric is the sum, over the three channel histograms, of the squared maximum difference between adjacent bins.
Taking $\Delta_i$ to be the difference between adjacent histogram bins for color channel $i$, the metric is written as follows:

\begin{equation}
z = \left( \max \Delta_{\mathrm{red}} \right)^2 + \left( \max \Delta_{\mathrm{green}} \right)^2 + \left( \max \Delta_{\mathrm{blue}} \right)^2
\end{equation}
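
A minimal sketch of this metric follows. It assumes an 8-bit image array of shape (height, width, 3), histograms normalized to sum to one, and the absolute value of the adjacent-bin differences; none of these details is stated in the text, and `rgb_peak_metric` is a hypothetical name.

```python
import numpy as np

def rgb_peak_metric(rgb, bins=256):
    """Sum over R, G, B of the squared largest adjacent-bin histogram difference.

    Assumptions: 8-bit channels, normalized histograms, absolute differences.
    """
    rgb = np.asarray(rgb)
    z = 0.0
    for channel in range(3):
        counts, _ = np.histogram(rgb[..., channel].ravel(), bins=bins, range=(0, 256))
        hist = counts / counts.sum()
        z += float(np.max(np.abs(np.diff(hist))) ** 2)
    return z

# A single-color image has one spike per channel: each term is 1, so z is 3.
flat = np.full((4, 4, 3), 200)
print(rgb_peak_metric(flat))  # prints 3.0
```

Images whose channels are spread evenly over many intensities give values near zero, so a larger $z$ indicates sharper histogram peaks.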

To motivate the use of this metric, consider the following examples of red-channel histograms from one natural scene and one synthetic image.

#### Example of Scene Red-Channel Histogram:

<img src="sceneRhist.png">

#### Example of Synthetic Red-Channel Histogram:

<img src="synthRhist.png">

Clearly, for the particular case considered, the peak of the red channel histogram is much "sharper" for the synthetic image.
The following sub-section presents representative training histograms and summarizes performance of a likelihood-ratio-based image detector.

#### Determine Threshold Rule

##### Method Explanation:

* Distribution fit: after several attempts to fit different distributions to the metric histograms, Cauchy distributions were chosen
* Threshold: the decision regions for the two hypotheses were determined by thresholding the log-likelihood ratio. The corresponding threshold on the feature metric value was found to be 3.8e-5
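
Converting a log-likelihood-ratio threshold into a threshold on the metric can be sketched as a root-finding problem. The fitted Cauchy parameters are not given in the text, so the (location, scale) pairs below are hypothetical and the result is not expected to reproduce the quoted 3.8e-5. Equal priors are assumed, so the boundary is where the log-likelihood ratio equals zero.

```python
import math

def cauchy_logpdf(z, x0, gamma):
    """Log of the Cauchy density with location x0 and scale gamma."""
    return -math.log(math.pi * gamma * (1.0 + ((z - x0) / gamma) ** 2))

def llr_threshold(params_a, params_b, lo, hi, tol=1e-12):
    """Bisect on [lo, hi] for the metric value where the log-likelihood ratio is zero."""
    def llr(z):
        return cauchy_logpdf(z, *params_a) - cauchy_logpdf(z, *params_b)
    if llr(lo) * llr(hi) > 0:
        raise ValueError("log-likelihood ratio does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if llr(lo) * llr(mid) <= 0:
            hi = mid   # the sign change lies in the lower half
        else:
            lo = mid   # the sign change lies in the upper half
    return 0.5 * (lo + hi)

# Hypothetical (location, scale) pairs; the fitted values are not given in the text.
natural = (1.0e-5, 5.0e-6)
synthetic = (8.0e-5, 2.0e-5)
threshold = llr_threshold(natural, synthetic, natural[0], synthetic[0])
print(threshold)
```

Two Cauchy densities can cross more than once, so the bracket is restricted to the interval between the two locations, where a single decision boundary is expected for well-separated fits.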

#### Feature Metric Histograms:

<img src="metrichistogram.png">

#### Fitted Distribution Plot:

<img src="fitteddists.png">

#### Performance

* False Positive Rate: 41.5%
* False Negative Rate: 16.3%
* Overall Error Rate: 28.9%

### -------------------------------------------------------------------------------------------------------------------

### IV) Image Sharpness

#### Approach

The statistical model for this detector rests on the following principle: "A sharper good quality image will have higher number of high frequency components compared to a blurred image."
Because the synthetic (computer-generated) images have sharper edges than the natural (photographic) images, they are expected to contain more high-frequency components.
The image sharpness metric is adapted from the paper by De and Masilamani [1].

#### Further Explanation

Ref. [1] defines an image sharpness metric that is adapted in the present work to differentiate the sharper synthetic images from natural images.
The sharpness metric is based on the 2D fast Fourier transform (FFT) of the image.
The metric is the fraction of FFT coefficients whose magnitude exceeds 1/1000th of the peak absolute FFT value.
Characteristic values of the metric for synthetic and natural images are computed from training sets, as follows.
Twenty images from each of the sets are taken as training data.
The following figures show histograms for the image sharpness metric in the synthetic and natural image training sets:

<img src="index.png">

<img src="index2.png">


The histograms are treated as probability mass functions and used in a likelihood ratio test as in previous sections.
For these distributions, the likelihood ratio is monotonic and can be converted to a threshold on the sharpness metric.
That threshold is taken as $0.71$.
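
The metric and threshold rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the input is a 2D grayscale array, and images whose metric exceeds the threshold are labeled synthetic, which follows from the expectation that synthetic images are sharper but is not stated explicitly in the text. The function names are hypothetical.

```python
import numpy as np

def sharpness_metric(gray):
    """Fraction of 2D FFT coefficients whose magnitude exceeds 1/1000 of the peak."""
    spectrum = np.abs(np.fft.fft2(np.asarray(gray, dtype=float)))
    peak = spectrum.max()
    if peak == 0.0:
        return 0.0  # all-zero image: no frequency content at all
    return float(np.count_nonzero(spectrum > peak / 1000.0) / spectrum.size)

def classify_sharpness(gray, threshold=0.71):
    """Assumed decision direction: a metric above the threshold means 'synthetic'."""
    return "synthetic" if sharpness_metric(gray) > threshold else "natural"

# A constant image has only a DC component: 1 of 64 coefficients is significant.
constant = np.ones((8, 8))
# A single bright pixel has a flat spectrum: every coefficient is significant.
impulse = np.zeros((8, 8))
impulse[0, 0] = 1.0
print(classify_sharpness(constant), classify_sharpness(impulse))
```

The two toy inputs bracket the possible range of the metric and are not meant to resemble real photographs or renderings.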

#### Results

Taking the test hypothesis ($H_1$) to be that an image is computer generated, the false-alarm and missed-detection rates are as follows:

* False Alarm Rate: 5.7%
* Missed Detection Rate: 2.8%
* Detector Accuracy: 91.428%

### -------------------------------------------------------------------------------------------------------------------

## Results Summarized:

### I) Edge Detection
#### Total Error Rate: 4.5%

### II) Grayscale Intensity
#### Total Error Rate: 18.0%

### III) RGB Peak Density
#### Total Error Rate: 28.9%

### IV) Image Sharpness
####  Total Error Rate: 8.5%


## Conclusions

* The intensity and RGB methods work well for computer-generated images that are dominated by a few colors, which makes them more limited than the other methods.
* Analyzing the problem in the Fourier domain appears more robust than analyzing color or intensity sharpness. Several images in the scene set would have been very difficult to distinguish based on predominant color.
* Analysis of spatial variation also appears more robust than the intensity and RGB methods. Natural photos are prone to more spatial noise, which derivative-like transformations amplify, and this could prove a distinguishing feature for a wider range of photos.

## References
[1] K. De, V. Masilamani. "Image sharpness measure for blurred images in frequency domain." Procedia Engineering, 64 (2013), pp. 149–158.