### Capstone Project
## Machine Learning Engineer Nanodegree
Jianhua Li 
Mar 12th, 2017

## I. Definition

### Project Overview
As built-in cameras become increasingly common in cell phones and personal computers, images and videos are piling up. The desire to understand what is in these images, and to use that information in daily life, grows ever stronger. Computer vision is an interdisciplinary field that provides powerful tools for high-level understanding of images and videos.

Face recognition is a form of computer vision that uses the spatial geometry of distinguishing facial features to identify or authenticate a person. Several companies have long offered software that uses webcam images to log on an authorized user automatically, and three well-known computer manufacturers provide computer access control systems based on face recognition. Although face recognition is convenient (hands-free), its relatively low accuracy raises security concerns compared with other biometric recognition systems. Nevertheless, it is still considered a good supplement to currently available solutions.

The conventional face recognition pipeline consists of four stages: face detection, face alignment, feature extraction, and classification. One of the most important steps is feature extraction. Conventional features include linear functions of the raw pixel values, such as Eigenface, Fisherface, and Laplacianface. More recently, such linear features were replaced by hand-crafted features. Although hand-crafted features and metric learning achieve promising performance in constrained environments, their performance degrades dramatically in unconstrained environments, where face images cover complex and large intra-personal variations such as pose, illumination, expression, and occlusion.

In recent years, deep learning, and in particular the Convolutional Neural Network (CNN), has achieved very impressive results in face recognition. Unlike conventional hand-crafted features, the features learned by a CNN are more robust to complex intra-personal variations. Indeed, the top three results for face recognition in unconstrained environments (FRUE) reported on the LFW benchmark database have been achieved by CNNs. One advantage of a CNN is that all processing layers have configurable parameters that can be learned from data, which alleviates the burden of manual feature design. However, the number of parameters in a CNN can run to millions, if not billions. Learning such a large number of parameters requires very large training datasets. In many cases, collecting large datasets is very costly or even impossible.

To overcome the limited size of datasets, two methods have been successfully applied to diverse recognition problems. The first is 3D-aided face synthesis, which has been used for facial landmark detection and face recognition. The second, more commonly used technique is data augmentation, which applies mirroring, cropping, rotation, and scaling without changing the semantic-level image label (see [here](https://arxiv.org/pdf/1603.06470.pdf)).


### Problem Statement
For security reasons, we are required to lock our computer whenever we leave our desk, and we have to unlock it when we come back. There are multiple ways to unlock it: for example, one can type a password on the keyboard or use an external device to read identification information from a smart ID card. None of these is a hands-free process, so repeating the locking and unlocking steps manually is tedious. For this project, I propose to develop an automatic facial recognition system. Specifically, the system uses the computer's built-in camera to capture an image of the person in front of it. The captured image is then processed and routed to the recognition model, which makes a prediction based on the input image. If the image is from an authorized person, a "password" message is sent to the computer so that it can unlock itself automatically. Otherwise, the system does nothing.

In this project, a CNN, the most recent deep learning method for face recognition, is applied. The goal is to achieve more accurate face recognition by carefully adjusting the CNN architecture.

### Metrics
To determine whether an image is from an authorized person, a 1:N authentication method is applied. Images of the person authorized to use the computer were taken by myself. Images of unauthorized users were collected online from [LFW](http://vis-www.cs.umass.edu/lfw/). Images are classified into two categories and labeled as authorized (1) or unauthorized (0). Each captured image is processed with OpenCV, and the decision is made by the supervised learning model.

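When training the CNN, the commonly used cross-entropy loss function is applied:  
$L_i=-\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right)$

Here $f_j$ is the score the network assigns to class $j$, and $y_i$ is the true label of sample $i$.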
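As a minimal sketch of how this loss behaves, the snippet below computes the softmax cross-entropy for a single sample in plain Python. The function names and the example scores are illustrative assumptions, not part of the project code; in practice Keras computes this loss internally.

```python
import math

def softmax(scores):
    """Convert raw class scores f_j into probabilities that sum to 1."""
    # Subtracting the max score improves numerical stability
    # and does not change the result.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(scores, true_class):
    """L_i = -log(softmax(scores)[true_class]) for a single sample."""
    return -math.log(softmax(scores)[true_class])

# Hypothetical scores for one image: index 0 = unauthorized, index 1 = authorized.
scores = [0.0, 0.0]
loss = cross_entropy_loss(scores, true_class=1)
print(round(loss, 4))  # log(2), about 0.6931, when both classes are equally likely
```

The loss shrinks toward 0 as the score of the true class grows relative to the other class, and grows without bound when the model is confidently wrong.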

Image data is divided into three groups: training, validation, and testing. The model is trained with the Keras Python library, using TensorFlow as the backend. The supervised learning model is a convolutional neural network (CNN), a state-of-the-art machine learning algorithm, with a softmax classifier for the final classification. Model performance is measured by accuracy, first on the validation dataset and then on the testing data. The final model is further evaluated with 10 new images.
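The accuracy metric can be sketched as follows. This is an illustrative helper under the assumption that the model outputs one softmax probability per class; the function names and sample values are hypothetical, and Keras reports the same quantity when compiled with `metrics=['accuracy']`.

```python
def predict_class(probabilities):
    """Return the index of the most probable class (argmax of the softmax output)."""
    return max(range(len(probabilities)), key=lambda k: probabilities[k])

def accuracy(predicted_probabilities, true_labels):
    """Fraction of samples whose predicted class matches the true label."""
    if len(predicted_probabilities) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(
        1 for probs, label in zip(predicted_probabilities, true_labels)
        if predict_class(probs) == label
    )
    return correct / len(true_labels)

# Hypothetical softmax outputs for four images, as [P(unauthorized), P(authorized)].
outputs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 1, 1, 1]
print(accuracy(outputs, labels))  # 3 of 4 correct -> 0.75
```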


## II. Analysis
_(approx. 2-4 pages)_

### Data Exploration

To reduce labor, I took advantage of an online resource. Images for the "unauthorized" category were downloaded from the website of the computer vision lab at the University of Massachusetts (http://vis-www.cs.umass.edu/lfw/). This data set contains more than 13,000 images of faces collected from the web, each labeled with the name of the person pictured. The images are packed as a 173 MB gzipped tar file, which was extracted with 7-Zip (http://www.7-zip.org/) and saved in a ../unauthorized folder. The data set is known to contain some mismatches between pictures and labeled names. This should not be a problem for the current project, because every image from this data set is treated as "unauthorized" regardless of the name attached to it.

Images for the "authorized" category were prepared by myself, using either a cell phone or the built-in camera of a notebook computer. All images were taken in natural color with standard contrast. Images from these two sources have different sizes and were pre-trimmed with Photoshop to 250 x 250 pixels. Due to time and labor limitations, fewer than 300 images were prepared. The pre-trimmed images are saved in a ../authorized folder.

For initial data exploration, OpenCV (opencv.org) is used. Every image has shape 250 x 250 x 3, for a total of 187,500 values per image. The images are in good condition and can be displayed with OpenCV as well as with the Python PIL library.
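The shape and size figures can be checked with a short sketch. The helper names are assumptions made for illustration; `describe_image` relies on `cv2.imread`, which returns a NumPy array with `shape` and `size` attributes.

```python
import math

def element_count(shape):
    """Number of values in an array of the given shape (height, width, channels)."""
    return math.prod(shape)

def describe_image(path):
    """Load an image with OpenCV and return its (shape, size)."""
    import cv2  # third-party (opencv-python), imported lazily
    image = cv2.imread(path)  # returns None if the file cannot be read
    if image is None:
        raise IOError("could not read image: " + path)
    return image.shape, image.size

print(element_count((250, 250, 3)))  # 187500
```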



### Exploratory Visualization



In this section, you will need to provide some form of visualization that summarizes or extracts a relevant characteristic or feature about the data. The visualization should adequately support the data being used. Discuss why this visualization was chosen and how it is relevant. Questions to ask yourself when writing this section:
- _Have you visualized a relevant characteristic or feature about the dataset or input data?_
- _Is the visualization thoroughly analyzed and discussed?_
- _If a plot is provided, are the axes, title, and datum clearly defined?_

### Algorithms and Techniques
In this section, you will need to discuss the algorithms and techniques you intend to use for solving the problem. You should justify the use of each one based on the characteristics of the problem and the problem domain. Questions to ask yourself when writing this section:
- _Are the algorithms you will use, including any default variables/parameters in the project clearly defined?_
- _Are the techniques to be used thoroughly discussed and justified?_
- _Is it made clear how the input data or datasets will be handled by the algorithms and techniques chosen?_

### Benchmark
In this section, you will need to provide a clearly defined benchmark result or threshold for comparing across performances obtained by your solution. The reasoning behind the benchmark (in the case where it is not an established result) should be discussed. Questions to ask yourself when writing this section:
- _Has some result or value been provided that acts as a benchmark for measuring performance?_
- _Is it clear how this result or value was obtained (whether by data or by hypothesis)?_


## III. Methodology
_(approx. 3-5 pages)_

### Data Preprocessing

Images taken by myself are first trimmed with Photoshop into 250 x 250 pixel color images. Due to limited time, I was not able to collect enough "authorized" images. To compensate for this limitation, the trimmed images are further transformed with the Python imutils library, which has been used in many similar cases. Specifically, images are rotated by 3, 7, or 11 degrees, or shifted by 1.5, 3.5, or 5.5 pixels.
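A minimal sketch of this augmentation step is shown below. It uses `imutils.rotate` and `imutils.translate`. The helper names are hypothetical, and the shift direction is an assumption because the text does not state it.

```python
ROTATION_DEGREES = (3, 7, 11)    # rotation angles from the text
SHIFT_PIXELS = (1.5, 3.5, 5.5)   # shift distances from the text

def augmentation_plan():
    """List the (operation, amount) pairs applied to each trimmed image."""
    plan = [("rotate", angle) for angle in ROTATION_DEGREES]
    plan += [("shift", pixels) for pixels in SHIFT_PIXELS]
    return plan

def augment(image):
    """Return the transformed copies of one image (a NumPy array as read by OpenCV)."""
    import imutils  # third-party, imported lazily
    variants = []
    for operation, amount in augmentation_plan():
        if operation == "rotate":
            variants.append(imutils.rotate(image, angle=amount))
        else:
            # Assumption: a horizontal shift; the text does not give the direction.
            variants.append(imutils.translate(image, amount, 0))
    return variants

print(len(augmentation_plan()))  # 6 variants per source image
```

Small rotations and shifts keep the face recognizable while changing pixel positions, so each original image yields several distinct training samples with the same label.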

To further enrich the "authorized" image pool, images were automatically extracted with OpenCV from a set of self-recorded videos. To diversify the data, the videos were recorded with changing backgrounds, under different illumination, and with different expressions. All videos were taken with the face at approximately the center of the frame. Backgrounds are mostly indoor settings that mimic an office environment.
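Frame extraction of this kind might look like the sketch below, which reads a video with `cv2.VideoCapture` and saves a centered square crop of selected frames. The sampling interval, the center crop, the file naming, and the function names are all assumptions for illustration; the text only states that frames were derived with OpenCV and that faces were roughly centered.

```python
def center_crop_box(width, height, size=250):
    """Return (left, top, right, bottom) of a size x size square centered in the frame."""
    if size > width or size > height:
        raise ValueError("crop size exceeds frame dimensions")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size

def extract_frames(video_path, output_dir, every_n=10, size=250):
    """Save a center crop of every `every_n`-th frame; return the number saved."""
    import os
    import cv2  # third-party (opencv-python), imported lazily
    capture = cv2.VideoCapture(video_path)
    saved = index = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of video or read error
            break
        if index % every_n == 0:
            height, width = frame.shape[:2]
            left, top, right, bottom = center_crop_box(width, height, size)
            crop = frame[top:bottom, left:right]
            cv2.imwrite(os.path.join(output_dir, "frame_%05d.jpg" % saved), crop)
            saved += 1
        index += 1
    capture.release()
    return saved

print(center_crop_box(640, 480))  # (195, 115, 445, 365)
```

Sampling every few frames rather than every frame avoids storing many near-identical images from consecutive frames.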

After combining the manually taken images with the images derived from videos, I obtained 3811 images for the "authorized" category. To balance the number of images in the two categories, images stored in subfolders whose names start with the letters A to I were removed from the "unauthorized" set, leaving 5922 images for the "unauthorized" category.

To further reduce the size of the image data set and alleviate the computational burden, images in both categories are resized to 50 x 50 x 3.

Image data is then read with OpenCV from the corresponding category folders and their subfolders. Each image is labeled according to the folder it originates from: an image from the "authorized" folder is labeled 1, while an image from the "unauthorized" folder is labeled 0. The image data is then converted into NumPy arrays for easy processing.
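The loading and labeling steps can be sketched as follows. The folder names match the text, while the function names, the accepted file extensions, and the root directory layout are assumptions made for this example.

```python
import os

LABELS = {"authorized": 1, "unauthorized": 0}  # folder name -> class label
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")   # assumed file types

def list_labeled_images(root):
    """Walk root/authorized and root/unauthorized (including subfolders)
    and return a sorted list of (file_path, label) pairs."""
    pairs = []
    for folder, label in LABELS.items():
        for directory, _subdirs, files in os.walk(os.path.join(root, folder)):
            for name in files:
                if name.lower().endswith(IMAGE_EXTENSIONS):
                    pairs.append((os.path.join(directory, name), label))
    return sorted(pairs)

def load_dataset(root, size=(50, 50)):
    """Read and resize every image; return (images, labels) as NumPy arrays."""
    import cv2          # third-party (opencv-python), imported lazily
    import numpy as np  # third-party, imported lazily
    images, labels = [], []
    for path, label in list_labeled_images(root):
        image = cv2.imread(path)
        if image is None:  # skip files OpenCV cannot read
            continue
        images.append(cv2.resize(image, size))
        labels.append(label)
    return np.array(images), np.array(labels)
```

Under these assumptions, `load_dataset` would return an image array of shape `(n_images, 50, 50, 3)` and a label array of shape `(n_images,)`.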

To train a supervised model, data is randomly divided into training and testing groups with the test size set to 0.3. For validation purposes, the training data set is further divided into training and validation groups with the train size set to 0.7. All pixel values are scaled to the range [0, 1] by dividing by 255.

In total, there are 9733 images, of which a fraction of 0.39 are "authorized". After splitting, there are 4769 training samples, 2044 validation samples, and 2920 test samples.
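The sample counts above follow from the two successive splits, as the sketch below illustrates using only the standard library. The `split` helper, the seed, and the choice to round the held-out count to the nearest integer are assumptions; a library routine such as scikit-learn's `train_test_split` would typically be used in practice.

```python
import random

def split(items, held_out_fraction, seed=0):
    """Shuffle items and return (kept, held_out), where held_out contains
    round(len(items) * held_out_fraction) items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_held_out = round(len(shuffled) * held_out_fraction)
    return shuffled[n_held_out:], shuffled[:n_held_out]

n_authorized, n_unauthorized = 3811, 5922
indices = range(n_authorized + n_unauthorized)         # 9733 images in total

train_val, test = split(indices, held_out_fraction=0.3)  # hold out 30% for testing
train, val = split(train_val, held_out_fraction=0.3)     # keep 70% of the rest for training

print(len(train), len(val), len(test))                   # 4769 2044 2920
print(round(n_authorized / (n_authorized + n_unauthorized), 2))  # 0.39
```

Because about 39% of the images are "authorized", a model that always predicts "unauthorized" would reach roughly 0.61 accuracy, which is a useful floor to keep in mind when reading accuracy scores.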


### Implementation
In this section, the process for which metrics, algorithms, and techniques that you implemented for the given data will need to be clearly documented. It should be abundantly clear how the implementation was carried out, and discussion should be made regarding any complications that occurred during this process. Questions to ask yourself when writing this section:
- _Is it made clear how the algorithms and techniques were implemented with the given datasets or input data?_
- _Were there any complications with the original metrics or techniques that required changing prior to acquiring a solution?_
- _Was there any part of the coding process (e.g., writing complicated functions) that should be documented?_

### Refinement
In this section, you will need to discuss the process of improvement you made upon the algorithms and techniques you used in your implementation. For example, adjusting parameters for certain models to acquire improved solutions would fall under the refinement category. Your initial and final solutions should be reported, as well as any significant intermediate results as necessary. Questions to ask yourself when writing this section:
- _Has an initial solution been found and clearly reported?_
- _Is the process of improvement clearly documented, such as what techniques were used?_
- _Are intermediate and final solutions clearly reported as the process is improved?_


## IV. Results
_(approx. 2-3 pages)_

### Model Evaluation and Validation
In this section, the final model and any supporting qualities should be evaluated in detail. It should be clear how the final model was derived and why this model was chosen. In addition, some type of analysis should be used to validate the robustness of this model and its solution, such as manipulating the input data or environment to see how the model’s solution is affected (this is called sensitivity analysis). Questions to ask yourself when writing this section:
- _Is the final model reasonable and aligning with solution expectations? Are the final parameters of the model appropriate?_
- _Has the final model been tested with various inputs to evaluate whether the model generalizes well to unseen data?_
- _Is the model robust enough for the problem? Do small perturbations (changes) in training data or the input space greatly affect the results?_
- _Can results found from the model be trusted?_

### Justification
In this section, your model’s final solution and its results should be compared to the benchmark you established earlier in the project using some type of statistical analysis. You should also justify whether these results and the solution are significant enough to have solved the problem posed in the project. Questions to ask yourself when writing this section:
- _Are the final results found stronger than the benchmark result reported earlier?_
- _Have you thoroughly analyzed and discussed the final solution?_
- _Is the final solution significant enough to have solved the problem?_


## V. Conclusion
_(approx. 1-2 pages)_

### Free-Form Visualization
In this section, you will need to provide some form of visualization that emphasizes an important quality about the project. It is much more free-form, but should reasonably support a significant result or characteristic about the problem that you want to discuss. Questions to ask yourself when writing this section:
- _Have you visualized a relevant or important quality about the problem, dataset, input data, or results?_
- _Is the visualization thoroughly analyzed and discussed?_
- _If a plot is provided, are the axes, title, and datum clearly defined?_

### Reflection
In this section, you will summarize the entire end-to-end problem solution and discuss one or two particular aspects of the project you found interesting or difficult. You are expected to reflect on the project as a whole to show that you have a firm understanding of the entire process employed in your work. Questions to ask yourself when writing this section:
- _Have you thoroughly summarized the entire process you used for this project?_
- _Were there any interesting aspects of the project?_
- _Were there any difficult aspects of the project?_
- _Does the final model and solution fit your expectations for the problem, and should it be used in a general setting to solve these types of problems?_

### Improvement
In this section, you will need to provide discussion as to how one aspect of the implementation you designed could be improved. As an example, consider ways your implementation can be made more general, and what would need to be modified. You do not need to make this improvement, but the potential solutions resulting from these changes are considered and compared/contrasted to your current solution. Questions to ask yourself when writing this section:
- _Are there further improvements that could be made on the algorithms or techniques you used in this project?_
- _Were there algorithms or techniques you researched that you did not know how to implement, but would consider using if you knew how?_
- _If you used your final solution as the new benchmark, do you think an even better solution exists?_

-----------

**Before submitting, ask yourself. . .**

- Does the project report you’ve written follow a well-organized structure similar to that of the project template?
- Is each section (particularly **Analysis** and **Methodology**) written in a clear, concise and specific fashion? Are there any ambiguous terms or phrases that need clarification?
- Would the intended audience of your project be able to understand your analysis, methods, and results?
- Have you properly proof-read your project report to assure there are minimal grammatical and spelling mistakes?
- Are all the resources used for this project correctly cited and referenced?
- Is the code that implements your solution easily readable and properly commented?
- Does the code execute without error and produce results similar to those reported?
