An approach to the detection and identification of human faces has been presented for a face recognition system that identifies a person by comparing characteristics of the face with those of individuals in the training dataset. In this approach, face images are projected into a feature space that best encodes the variation among known face images. This face space is defined by the “eigenfaces”, which are the eigenvectors of the covariance matrix of the set of face images. The Eigenfaces method takes a holistic approach to face recognition: a facial image is treated as a single point in a high-dimensional image space, and a lower-dimensional representation is sought in which classification becomes easier. The lower-dimensional subspace is found with Principal Component Analysis (PCA), which identifies the axes of maximum variance. While this kind of transformation is optimal from a reconstruction standpoint, it does not take any class labels into account. By contrast, class-aware methods such as Linear Discriminant Analysis aim to minimize the variance within a class while maximizing the variance between classes.
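The projection and comparison described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (fit_face_space, project, recognize), the use of SVD to obtain the principal axes, the nearest-neighbour rule with Euclidean distance, and the synthetic 20×20 "faces" are all chosen here for demonstration only.

```python
import numpy as np

def fit_face_space(train_images, num_components):
    """Return the mean face and the top principal axes (eigenfaces)."""
    # Flatten each image into one row vector.
    X = train_images.reshape(len(train_images), -1).astype(np.float64)
    mean_face = X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data, ordered by
    # decreasing singular value (and therefore decreasing variance).
    _, _, vt = np.linalg.svd(X - mean_face, full_matrices=False)
    return mean_face, vt[:num_components]

def project(images, mean_face, eigenfaces):
    """Project images into the face space; one weight vector per image."""
    X = images.reshape(len(images), -1).astype(np.float64)
    return (X - mean_face) @ eigenfaces.T

def recognize(test_images, train_weights, train_labels, mean_face, eigenfaces):
    """Assign each test image the label of its nearest training projection."""
    test_weights = project(test_images, mean_face, eigenfaces)
    # Pairwise Euclidean distances, shape (num_test, num_train).
    diffs = test_weights[:, None, :] - train_weights[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return train_labels[np.argmin(dists, axis=1)]

# Synthetic stand-in data (assumption): 5 "persons", each a random
# 20x20 prototype plus small noise, 10 training and 2 test images each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(5, 20, 20))
train_labels = np.repeat(np.arange(5), 10)
test_labels = np.repeat(np.arange(5), 2)
train_images = prototypes[train_labels] + 0.1 * rng.normal(size=(50, 20, 20))
test_images = prototypes[test_labels] + 0.1 * rng.normal(size=(10, 20, 20))

mean_face, eigenfaces = fit_face_space(train_images, num_components=4)
train_weights = project(train_images, mean_face, eigenfaces)
predicted = recognize(test_images, train_weights, train_labels,
                      mean_face, eigenfaces)
```

Note that the labels are used only in the final comparison step; the face space itself is fitted without them, which is the limitation of PCA mentioned above.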
In the language of information theory, the relevant information in a face image was extracted, encoded, and compared with a database of face models encoded in the same way. In mathematical terms, this approach finds the principal components of the distribution of faces, that is, the eigenvectors of the covariance matrix of the set of face images. The eigenvectors can be thought of as a set of features that together characterize the variation between face images. Each face can also be approximated using only the best eigenfaces: those with the largest eigenvalues, which account for the most variance within the set of images. The main problem with the raw image representation is its high dimensionality. Two-dimensional m×n images span a P = mn dimensional vector space, so an image of 100×100 pixels already lies in a 10,000-dimensional image space. PCA was used to make the process efficient and quick. A high-dimensional dataset is often described by correlated variables, and therefore only a few meaningful dimensions account for most of the information. The PCA method finds the directions of greatest variance in the data, called principal components.
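To make the covariance-matrix view concrete, the following sketch flattens each m×n image into a P = mn vector, eigen-decomposes the covariance matrix, and counts how many eigenfaces are needed to retain a given share of the variance. It is an illustration under assumptions rather than the project's implementation: the function names, the 99% threshold, and the synthetic 16×16 images built from three hidden patterns are hypothetical. Small images are used because the explicit P×P covariance matrix for 100×100 images would have 10,000×10,000 entries.

```python
import numpy as np

def eigenfaces_from_covariance(images):
    """Eigen-decompose the covariance matrix of the flattened images."""
    n, rows, cols = images.shape
    # Each row of X is one image as a P = rows * cols vector.
    X = images.reshape(n, rows * cols).astype(np.float64)
    mean_face = X.mean(axis=0)
    A = X - mean_face
    cov = A.T @ A / (n - 1)                 # P x P covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]       # largest eigenvalue first
    # Clip tiny negative values caused by floating-point round-off.
    eigvals = np.clip(eigvals[order], 0.0, None)
    return mean_face, eigvals, eigvecs[:, order]

def components_for_variance(eigvals, fraction):
    """Smallest number of leading components whose variance share >= fraction."""
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, len(eigvals))

# Synthetic correlated data (assumption): 50 images that are mixtures of
# only 3 underlying patterns plus a little noise, so P = 256 but the
# meaningful dimensionality is about 3.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(3, 16, 16))
coeffs = rng.normal(size=(50, 3))
images = np.tensordot(coeffs, patterns, axes=1)
images = images + 0.01 * rng.normal(size=(50, 16, 16))

mean_face, eigvals, eigvecs = eigenfaces_from_covariance(images)
k = components_for_variance(eigvals, 0.99)

# Approximate every face using only the k best eigenfaces.
A = images.reshape(50, -1) - mean_face
top = eigvecs[:, :k]
approx = (A @ top) @ top.T
rel_error = np.linalg.norm(A - approx) / np.linalg.norm(A)
```

Because the retained components carry at least 99% of the variance, the relative reconstruction error of the centered images cannot exceed about 0.1 in this setup, even though only a handful of the 256 dimensions are kept.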
In this project, a dataset of 60 images was used, taken as a subset of the faces94 database (https://cswww.essex.ac.uk/mv/allfaces/index.html). The dataset contains pictures of 5 persons with varying expressions and appearance, such as laughing, not laughing, with glasses, without glasses, happy, sad, mouth open, and mouth closed. Of these, 50 images were used for training the recognizer and 10 images were used as the test set.
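One way to obtain a 50/10 split like the one described is to hold out a fixed number of images per person. The sketch below is only an assumption about how such a split could be made: the text does not state how the images were divided, so the even distribution of 12 images per person, the choice of 2 held-out images each, and the helper name split_per_person are hypothetical.

```python
import numpy as np

def split_per_person(labels, test_per_person, seed=0):
    """Hold out `test_per_person` randomly chosen images for each person."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for person in np.unique(labels):
        # Shuffle the indices of this person's images.
        idx = rng.permutation(np.flatnonzero(labels == person))
        test_idx.extend(idx[:test_per_person])
        train_idx.extend(idx[test_per_person:])
    return np.sort(np.array(train_idx)), np.sort(np.array(test_idx))

# Assumed layout: 5 persons with 12 images each (60 images in total).
labels = np.repeat(np.arange(5), 12)
train_idx, test_idx = split_per_person(labels, test_per_person=2)
```

Splitting per person keeps every individual represented in both sets, so each test image has matching training examples to be compared against.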