# Report

### 1) Introduction: Provide a concise explanation of the topic that you have chosen, and what has been done already on such topic (maybe by putting some references).

&emsp; We wish to design software that helps us mark the attendance of workers in a company. Firstly, we need a single picture of each employee, against which our software will compare live images from a camera at the front gates. If a worker is spotted at the entrance, the camera automatically detects them and the software checks whether they have already been checked in that day. If not, the software registers the person along with the date and time of their arrival. In both cases, once someone is recognised as an employee, the gate automatically opens for them to enter the company campus. Otherwise, if the detected person is not employed there, an alert accompanied by a photo of the person is sent through the software, so that someone can check whether the person has an appointment or anything of the sort. After validation, the gate opens and the person can get in. 
    
&emsp; Secondly, the software is able to recognise a face with or without a medical mask so that if a person within working hours is detected not to be wearing one, they have to be expelled.
    
&emsp; Furthermore, our software detects number plates using OCR for two reasons. The first is that it helps the company decide how to allocate transport allowances to its employees. The second is security: in case of a break-in or other unfortunate situations, the company might require such identification.
    
&emsp; Lastly, if someone visits the buildings outside working hours, their face will be detected and sent out through the software which will in turn call the authorities unless the person is cleared to come inside. 

&emsp; Face recognition has seen countless uses over the last decades, from governmental and military applications to phone security and even shopping with Amazon's "Amazon Go". OCR has also grown massively as a computer vision solution, for security but also for hacking. In the last decade we have additionally seen many everyday uses of OCR, such as solver apps or Apple's iPhone, which now makes it possible to copy text from a picture. Mask detection, on the other hand, only became widely popular in response to the 2020 global pandemic, as a means to enforce health restrictions. Ultimately, all three methods used here have been developed and perfected for the same purposes we use them for. All we are doing is putting them together in one piece of software tailored to our case.

### 2) Method: Here you will explain what you have done. I do not need the code since you will provide it together with the report. You just need to explain the several steps/the model that you have implemented, together with the used dataset.



### Face recognition ###

Function $faceDetectionAtGate$ is at the base of the **workplace entrance system**.

When a person shows up at the _entrance gate_ they can be identified either as one of the **employees** or as a **stranger**.

When an **employee** is recognized, the system must check whether the employee showed up:
1. _on time_ (**7 am - 8.30 am**); 
2. _late_ (**8.30 am - 10 am**);
3. outside of _registering time_ (**10 am - 7 am**).

In order to perform this time check and, where applicable, the registration, the system makes use of function $markAttendance$.
 
In cases $1.$ and $2.$ the employee's attendance will be registered on the **attendance sheet** and the gate will be opened to them.

In case $3.$ the employee's attendance will not be registered on the attendance sheet, although the gate will still be opened to them.

In both cases $2.$ and $3.$ a **notice email** is sent to the Director displaying:
- subject: "mr. {} is late Date: {}, Time: {}";
- content: "mr. {} has just arrived";
- attachment: captured image of the person.

When a **stranger** is detected within working hours (**7am - 7pm**) a notice email is sent to the Director displaying:
- subject: "{} at the gate Date: {}, Time: {}"; 
- content: "There is {} waiting at the gate. See if you prefer opening the gate for him/her";
- attachment: captured image of the person.

When either a **stranger** or an **employee** is detected outside of _working hours_ (**7 pm - 7 am**) an **alert email** is sent to the Director displaying:
- subject: "{} at the gate outside working hours Date: {}, Time: {}"; 
- content: "There is {} at the gate. Perhaps, he is a robber, etc. If necessary, alert the police";
- attachment: captured image of the person.

Function $sendEmailWithAttachment$, used to send the **notice emails**, takes advantage of the $yagmail$ library:
1. the $yagmail.SMTP$ function is passed the **sender**'s email and password for authentication and _initializes_ the server connection;
2. the $send$ function, applied to the instance created in step $1.$, is passed the **recipient(s)**' email address(es), the subject and the body, and _delivers_ the email.

**Yagmail** is designed to interact with _Gmail_ accounts. We set up an account for this project, and turned on the *Allow less secure apps* option (which allows our app to interact with Gmail without any issues but it makes the account vulnerable to _unauthorized_ access).
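The two steps described for $sendEmailWithAttachment$ could look roughly as follows. This is a minimal sketch rather than the project's actual function: the helper names `build_late_notice` and `send_notice` and the addresses in the commented usage are hypothetical, while `yagmail.SMTP(...)` and its `send(to=..., subject=..., contents=..., attachments=...)` method are the library calls referred to above.

```python
from datetime import datetime


def build_late_notice(name, when):
    """Fill in the late-arrival subject and content templates."""
    subject = "mr. {} is late Date: {}, Time: {}".format(
        name, when.strftime("%d-%m-%Y"), when.strftime("%H:%M:%S")
    )
    body = "mr. {} has just arrived".format(name)
    return subject, body


def send_notice(connection, recipient, subject, body, image_path):
    """Deliver one email through an already authenticated yagmail.SMTP object."""
    connection.send(
        to=recipient,
        subject=subject,
        contents=body,
        attachments=image_path,
    )


# Usage (needs network access and real credentials, so it is left commented out):
#   import yagmail
#   connection = yagmail.SMTP("sender@gmail.com", "password")
#   subject, body = build_late_notice("Juwara Yusupha", datetime.now())
#   send_notice(connection, "director@example.com", subject, body, "capture.jpg")
```

Keeping the connection as a parameter means the same authenticated object can be reused for every notice, and the message-building part can be checked without sending anything.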

In order to perform **face recognition**, $faceDetectionAtGate$ must compare the unknown person's face with the known employees' faces; to this end it uses the $face\_encodings$ function of the $face\_recognition$ library, which is built on $dlib$.

It calls function $findEncodings$, passing it the path to a _folder_ containing the employees' **identification photos** (one per employee), each having the employee's name as its _filename_. 

$findEncodings$ iterates over the images in the directory and returns two **lists**:
- one of employees' **names**, extracted from the images' filenames by removing the _extension_ suffix; 
- one of **encodings** of employees' faces, created using $face\_encodings$.

Since for a given picture $face\_encodings$ creates an array of encodings of cardinality equal to the number of faces detected, multi-subject photos are discarded to avoid identity _ambiguities_.
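The folder scan described for $findEncodings$ can be sketched as below. It is only an illustration under stated assumptions: `find_encodings` and its `encode` argument are hypothetical names, and photos in which no face is found are skipped as well, which the text does not specify. With the real library, `encode` could be something like `lambda path: face_recognition.face_encodings(face_recognition.load_image_file(path))`.

```python
import os


def find_encodings(folder, encode):
    """Return (names, encodings) for the single-face photos found in folder.

    encode(path) must return a list with one encoding per detected face.
    """
    names, encodings = [], []
    for filename in sorted(os.listdir(folder)):
        faces = encode(os.path.join(folder, filename))
        if len(faces) != 1:
            # No face, or several faces: skip to avoid identity ambiguities.
            continue
        # The employee's name is the filename without its extension.
        names.append(os.path.splitext(filename)[0])
        encodings.append(faces[0])
    return names, encodings
```

Because the two lists are filled in the same loop, position $i$ of the names always refers to the same employee as position $i$ of the encodings.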

$faceDetectionAtGate$ then proceeds to perform the comparison between the encoding(s) of the face(s) captured at the gate and the encodings of the employees' faces.

If multiple people are detected at the gate it iterates over their encodings.

$face\_recognition.compare\_faces$, passed two arrays of encodings (the first of _cardinality = num. of employees_ and the second of _cardinality = 1_), returns a **boolean array** (of _cardinality = num. of employees_) where a $True$ value at position $i$ corresponds to a **match** between the person at the gate currently under scrutiny and the employee whose name is at position $i$ of the _list of names_ returned by $findEncodings$.

Among the matches, in case there is more than one, $faceDetectionAtGate$ selects the index of the employee whose encoding is at the smallest **distance** from the unknown subject's, where the distance is calculated using $face\_recognition.face\_distance$.
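The matching rule can be illustrated with plain NumPy. This is a stand-in sketch, not the project's code: `best_match` is a hypothetical helper, the distance is taken to be the Euclidean distance between encodings, and the tolerance of `0.6` is assumed to mirror the default of $compare\_faces$.

```python
import numpy as np


def best_match(known_encodings, known_names, unknown_encoding, tolerance=0.6):
    """Return the name of the closest matching employee, or None for a stranger."""
    known = np.asarray(known_encodings, dtype=float)
    unknown = np.asarray(unknown_encoding, dtype=float)
    # Euclidean distance of the unknown face from every known face.
    distances = np.linalg.norm(known - unknown, axis=1)
    # Boolean array with one entry per employee, True where the faces match.
    matches = distances <= tolerance
    if not matches.any():
        return None
    # Among the matches, keep the employee at the smallest distance.
    candidates = np.where(matches)[0]
    best = candidates[np.argmin(distances[candidates])]
    return known_names[best]
```

Returning `None` when nothing matches gives the caller a simple way to tell a stranger from an employee.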

To simulate the entrance gate's **security camera**, in this project we use webcam _video capture_ performed using $OpenCV$.

### Marking employee attendance ###

Function $markAttendance$, called by $faceDetectionAtGate$ when an employee has been recognized, is passed the employee's **name** and the **path** of the attendance sheet, a **csv file**.

$markAttendance$ performs the employee's arrival time check, returning:
- True, if the employee showed up _late_ (**8.30 am - 10 am**), outside of _registration time_ (**10 am - 7 pm**) or outside of _working hours_ (**7 pm - 7 am**);
- False, if the employee showed up _on time_ (**7 am - 8.30 am**).

As a _side effect_ the function will update the csv file, marking the attendance of the employee if and only if they showed up within _registration time_ (**7 am - 10 am**).
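The time windows used by $markAttendance$ can be written down as a small classifier. This is a sketch under assumptions: the names `classify_arrival` and `arrival_flags` are hypothetical, and each window is treated as half-open (an arrival at exactly 8.30 am counts as late), which the text does not state.

```python
from datetime import time

ON_TIME_START = time(7, 0)
LATE_START = time(8, 30)
REGISTRATION_END = time(10, 0)
WORKING_HOURS_END = time(19, 0)


def classify_arrival(arrival):
    """Map an arrival time to one of the four windows described above."""
    if ON_TIME_START <= arrival < LATE_START:
        return "on time"
    if LATE_START <= arrival < REGISTRATION_END:
        return "late"
    if REGISTRATION_END <= arrival < WORKING_HOURS_END:
        return "outside registration time"
    return "outside working hours"


def arrival_flags(arrival):
    """Return (notify, register): email the Director? write to the sheet?"""
    window = classify_arrival(arrival)
    return window != "on time", window in ("on time", "late")
```

For instance, `arrival_flags(time(9, 55))` gives `(True, True)`: the employee is late, so a notice is due, but the arrival still falls within registration time.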

**Attendance sheet** is a _table_ composed of three **columns** ($Name$, $Date$ and $Time$), and a **row** for each registered employee (represented in the file as a _string_ of the form '$\backslash n\{name\},\{date\},\{time\}$').

e.g.
$$
\begin{aligned}
& \text {Attendance Sheet}\\
&\begin{array}{lcc}
\hline \hline \text { Name } & \text { Date } & \text { Time }\\
\hline \text{Barboni Giorgia} & \text{25-05-2022} & \text{7:02:14}\\
\text{Baturone Elise} & \text{25-05-2022} & \text{7:03:21}\\
\text{Juwara Yusupha} & \text{25-05-2022} & \text{8:13:44}\\
\text{Tassybayev Saurik} & \text{25-05-2022} & \text{9:55:37}\\
\hline
\end{array}
\end{aligned}
$$

$markAttendance$ will avoid adding the same employee _twice_ on the same date (i.e. in the event of the employee exiting and re-entering the workplace).

Hence, in order to avoid _duplicates_, before registering a name $markAttendance$ will: 
1. _read_ the lines of the csv file, each line being a different row of the attendance sheet;
2. _split_ each line by separator ',' obtaining a list of the form $[\{name\},\{date\},\{time\}]$;
3. _append_ the sublist $[\{name\},\{date\}]$ of each row to a list;
4. _check_ whether the current $[\{name\},\{date\}]$ pair is already present in the list created at step 3;
5. if not, _write_ entry '$\backslash n\{name\},\{date\},\{time\}$' in the file.
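The duplicate check can be sketched as follows. This is a minimal illustration, not the project's $markAttendance$: the name `register_attendance`, its return value and the zero-padded time format are assumptions, and the time-window check is left out.

```python
from datetime import datetime


def register_attendance(path, name, when):
    """Append name,date,time to the sheet unless (name, date) is already there.

    Returns True if a new row was written, False if it was a duplicate.
    """
    date = when.strftime("%d-%m-%Y")
    # Read every row, split it on ',' and keep its [name, date] pair.
    with open(path) as sheet:
        seen = [line.strip().split(",")[:2] for line in sheet if line.strip()]
    # Look for the current pair among the registered ones.
    if [name, date] in seen:
        return False
    # Not registered yet today: append the new entry.
    with open(path, "a") as sheet:
        sheet.write("\n{},{},{}".format(name, date, when.strftime("%H:%M:%S")))
    return True
```

The header row `Name,Date,Time` also ends up in `seen`, which is harmless as long as no employee is literally named "Name".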





### Face-mask detection ###

We treat *face-mask detection* as a binary *classification* problem, which we solve by training a machine learning *model* using $scikit$-$learn$.

The steps we are going to follow to develop our image classifier are:
1. Collecting our dataset
2. Pre-processing the images
3. Model training
4. Model evaluation


The $Dataset$ folder contains image files divided into $2$ subfolders, whose names correspond to the *class* associated with each image they contain.


To load our dataset, function $loadDataset$ reads the images from the subfolders of $Datasets$ and stores them in separate lists.

Next, image data *preprocessing* is needed to convert the image data into a form that machine learning algorithms can work with. 
It is often used to increase a model’s accuracy, as well as to reduce its complexity. 
Since images exist in different formats, we need to standardize them before feeding them into our ML algorithms.

Of the several existing techniques to preprocess image data we're going to use:
- *grayscale* conversion of BGR images, to reduce the number of channels per pixel from $3$ to $1$ (hence reducing the computations required, since color isn't needed for classification);
- *normalization* (projection) of pixels' intensity to the (0,1) range, so that all images are expressed on the same scale;
- *scaling*

Furthermore, $sklearn$ expects input data samples in the form of scalars or $1D$ vectors. Images in our dataset, on the other hand, are represented in the form of $2D$ matrices with $3$ color channels ($BGR$) (i.e. tensors).

Nevertheless we can convert our $2D$ images to *row vectors* by using the $reshape$ method from the $NumPy$ library to change the shape of all the images in our dataset from matrices to $1D$ vectors (by simply concatenating all the matrix rows together to form one big row).
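The flattening step can be illustrated on a tiny synthetic stack of images. The array sizes below are arbitrary assumptions chosen for the example; only the `reshape` call reflects the step described above.

```python
import numpy as np

# Four hypothetical grayscale images of 64x64 pixels, values in [0, 255].
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 64, 64), dtype=np.uint8)

# Flatten every image into one row:
# (n_samples, height, width) -> (n_samples, height * width).
features = images.reshape(len(images), -1)

# The first 64 entries of a feature row are the first pixel row of that image.
first_row_preserved = np.array_equal(features[0, :64], images[0, 0])
```

Passing `-1` lets NumPy work out the row length (here $64 \cdot 64 = 4096$), so the same line works for any common image size.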

$preprocessEntireImageFolder$ takes care of this, returning the array of image *labels* and a *stacked* list of grayscaled, resized and reshaped images. We then use the built-in transformer $StandardScaler$ to normalize and scale our features (it standardizes the feature values by computing the mean, subtracting it from the data points, and then dividing by the standard deviation).

*Transformers* are objects that take in the array of data, transform each item and return the resulting data.

Just as it is important to test a predictor on data held out from training, preprocessing steps (such as standardization, feature selection, etc.) and similar data transformations should be learnt from the training set and then applied to the held-out data for prediction. 


Next, we need to split our dataset into:
- *training* data, to fit the models; 
- *test* data, to evaluate model performance (in an unbiased manner). 

We use the $train\_test\_split$ function provided by $sklearn$ and divide the data into an $80\%$ *training* set and a $20\%$ *test* set. 
(We always want to train our model with more data so that it generalizes well, so we keep the test size in the range $[0.10 - 0.30]$.)

In the stacked dataset, the photos are *ordered* by class (samples with the same class label are contiguous), so we cannot simply split at $80\%$: otherwise we would end up with some classes appearing in only one of the two sets, preventing us from training our model to recognise/predict them correctly.

We can solve this by *shuffling* the data prior to splitting. This way, we even out the distributions in the training and test set and make them comparable.
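A shuffled split followed by scaling learnt on the training part only might look like this. The data here is a synthetic placeholder, and `stratify=y` is an optional extra not mentioned in the text: it additionally keeps the class ratio identical in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stacked dataset: 50 rows of one class followed by 50 of the other.
rng = np.random.default_rng(0)
X = rng.random((100, 16))
y = np.array([0] * 50 + [1] * 50)

# shuffle=True mixes the class-ordered rows before the 80/20 cut.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, shuffle=True, stratify=y, random_state=42
)

# Learn mean and standard deviation on the training set only,
# then reuse the same statistics on the held-out test set.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
```

Fitting the scaler before splitting would let statistics of the test images leak into training, which is exactly what the held-out set is meant to prevent.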

After extracting, concatenating and saving feature and label vectors from our training dataset, it’s time to train our machine learning models provided by $sklearn$.

As *models* we chose $Support Vector Machine$, $Logistic Regression$, $K-Nearest Neighbors$, $Decision Trees$, $Random Forests$ and $Gaussian Naive Bayes$, and we'll perform a *selection* between them based on the comparison of their cross validation results.
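The comparison could be organised as in the sketch below. It runs on synthetic data generated with `make_classification` as a stand-in for the flattened training images, and the 5-fold setting, the default hyperparameters and the variable names are assumptions rather than the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training features and labels.
X_train, y_train = make_classification(n_samples=200, n_features=20, random_state=0)

candidates = {
    "Support Vector Machine": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
}

# Mean 5-fold cross-validation accuracy per model; the scaler sits inside the
# pipeline so that it is refit on the training folds of every split.
scores = {
    name: cross_val_score(
        make_pipeline(StandardScaler(), model), X_train, y_train, cv=5
    ).mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
```

Only the training data enters the comparison, so the test set stays untouched until the chosen model is evaluated.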

Having chosen our model, we can proceed to train it with the training data and test it on the unseen test data.

To *train* an algorithm, we’ll call the $fit$ method and pass it our training features and labels. 
The $random\_state$ parameter seeds the random number generator, so that any shuffling or other randomized step gives reproducible results.

The $fit$ method returns a trained model that can be used to make *predictions*: we need to call the $predict$ method and pass it our test set, and it will return predicted labels for the data samples in our test set.

To see how well our algorithm has learned to *classify* the images, we have to match the predicted labels for our test set with actual labels.

Looking only at the model's *accuracy* (i.e. the proportion of cases classified correctly) may not be enough; to get better insight we also need to look at its classification report and confusion matrix. 
We'll use the $accuracy\_score$, $confusion\_matrix$ and $classification\_report$ functions from the $sklearn.metrics$ module.

The *confusion matrix* is a $2 \times 2$ matrix $-$ where $2$ is the number of classes $-$ that gives a comparison between actual and predicted values. For each *actual* class row, prediction counts are displayed in the cells intersecting the *predicted* classes' columns, as shown below. (In our code we visualize it as a *heatmap* using $seaborn$.) The displayed values are the true positive $TP$, true negative $TN$, false positive $FP$ and false negative $FN$ counts.


The *classification report*, instead, summarizes the per-class metrics. The quantities we look at are defined as:
- accuracy = $\frac{TP+TN}{TOT\_RECORDS}$ 
- precision = $\frac{TP}{TP+FP}$ 
- recall = $\frac{TP}{TP+FN}$ 
- specificity = $\frac{TN}{TN+FP}$ 
- prevalence = $\frac{TP+FN}{TOT\_RECORDS}$ 

$evalModel$ brings these analyses together in a single function call.
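As a worked example of these metrics, the sketch below computes them from the four confusion-matrix counts. The function name `binary_metrics` and the counts are made up for illustration; specificity is computed here as $\frac{TN}{TN+FP}$, the share of actual negatives that are classified as negative.

```python
def binary_metrics(tp, tn, fp, fn):
    """Derive the summary metrics from the four confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "prevalence": (tp + fn) / total,
    }


# Hypothetical counts for 100 test images.
metrics = binary_metrics(tp=45, tn=40, fp=10, fn=5)
```

With these counts the accuracy is $0.85$ while the recall is $0.9$ and the specificity is $0.8$, which shows how a single accuracy figure can hide a difference between the two classes.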

The question of "the best model" is about finding a sweet spot in the tradeoff between bias and variance, since a model can:
- *underfit* the data, meaning it does not have enough model flexibility to account for all the features in the data (high bias); 
- *overfit* the data meaning it has so much model flexibility that it ends up accounting for random errors (high variance).

### Social distancing monitoring ###

To detect **social distancing** violations, the system checks physical distance between faces, which mustn't fall below a given threshold.

Detecting distances between faces from _monocular_ images without any extra information is **not possible**. 
At the same number of pixels apart, the **closer** two faces are to the camera, the bigger they are and the **smaller** is the _actual_ **distance** between them.
Hence, to put distances between faces into **scale**, we must be able to determine the distance of a given face from the camera first.

We can use a face as a **marker** object and _calibrate_ our system on it while being captured by the camera.
 
If we define:
- $W$ = face's **physical** _width_;
- $D$ = face's physical _distance_ from camera;
- $P$ = face's **apparent** _width_ in pixels (being captured by the camera);
- $F$ = camera's **focal length**, or face's apparent distance from camera in pixels;

the following **proportion** holds: 
$$ \frac{F}{D} = \frac {P}{W} $$

In our application $W$ and $D$ are known; we take as $W$ the _average_ human **face width** (approximately 15 cm) and as $D$ the _average_ human **arm length** (approximately 30 cm), which stand in for the precise measurements of the person who performs the calibration.

**Calibration** lets us derive $P$ through function $compute\_faceWidthPixels$, which takes advantage of $face\_recognition.face\_locations$ to work out the face **location coordinates** of the person conducting it at distance $D$ from the camera.

_Face location coordinates_ are returned in the form of $top, right, bottom, left$ values, making it easy to obtain $P$ as $right - left$.

We can now pass these data as input to $compute\_focalLength$ to get our $F$ as
$$ F = \frac{P \cdot D}{W} $$ 

Knowing the camera's $F$, we can now determine _any_ other face's **actual distance** from the camera through the _inverse formula_ of the original _proportion_
$$ D' = \frac{F \cdot W}{P} $$
as done by $compute\_distToCamera$.
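A worked numeric example of the two formulas is given below. The function names are hypothetical stand-ins for the project's helpers, and the pixel widths are invented values; only $W$ and $D$ come from the text.

```python
def compute_focal_length(face_width_px, distance_cm, face_width_cm):
    """F = P * D / W, from the calibration shot."""
    return face_width_px * distance_cm / face_width_cm


def compute_dist_to_camera(focal_length_px, face_width_cm, face_width_px):
    """D' = F * W / P, for any face detected later."""
    return focal_length_px * face_width_cm / face_width_px


W = 15.0   # average face width in cm (from the text)
D = 30.0   # calibration distance in cm (from the text)
P = 200.0  # hypothetical apparent face width during calibration, in pixels

F = compute_focal_length(P, D, W)                # 200 * 30 / 15 = 400 px
far_face = compute_dist_to_camera(F, W, 100.0)   # 400 * 15 / 100 = 60 cm
```

A face that appears half as wide as the calibration face is therefore estimated to be twice as far from the camera.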

Finally, $checkDistance$ _loops_ over the _frames_ captured by the camera $-$ in our application this is achieved using $OpenCV$'s $VideoCapture(0)$ $-$
and, if more than one face is detected using $face\_recognition.face\_locations$ (i.e. the _length_ of the returned _list_ of **face locations** is _greater_ than $1$), a **distance check** is performed: for any _pair_ of face locations (**rectangular** face _delimiters_), if the geometric **centers** of the two are closer than a **threshold**, a red rectangle displaying a **warning** is drawn on screen around each face to indicate that a _violation_ has been detected.

The computation of the **distance** between faces of course takes into account the **depth** information that can be drawn from the _apparent_ face **widths**. 
If we considered only the coordinates of the face **centers**, we'd _miss out_ on **depth** information.

Viewing the **frame** as an $xy$ plane, the face centers occupy two spots on it.
Depending on the security camera's **positioning** and orientation (e.g. hanging high/low from a ceiling, frontal/lateral etc.), the $y$ coordinates of the **centers** give us an _incomplete_ indication of people's **closeness** due to **perspective**.

However, incorporating the $compute\_distToCamera$ results, we can get the full picture.

The distance check is carried out on an imaginary **plane** where:
- **horizontal** axis is used to measure the distance converted from apparent to physical between $x$ **coordinates** of the **centers**; 
- **vertical** axis is used to measure **depth** distance of the centers.

The **actual** distance could then be represented on this plane as the **sum** of two _positive-oriented_ **vectors** running in the direction of the _axes_. 

This leads us to find the **actual distance** between the two faces as the **L2 norm** of the _sum_ of these two _vectors_ through $numpy.linalg.norm$.
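The distance on this imaginary plane can be sketched as follows. Several details are assumptions made for the example: the function name `face_distance_cm`, the use of the mean centimetres-per-pixel ratio of the two faces to convert the horizontal pixel gap, and all numeric values except the 15 cm face width.

```python
import numpy as np

FACE_WIDTH_CM = 15.0  # average face width W, from the text


def face_distance_cm(center_x1, width_px1, center_x2, width_px2, focal_length_px):
    """Estimate the physical distance between two detected faces."""
    # Depth of each face from its apparent width: D' = F * W / P.
    depth1 = focal_length_px * FACE_WIDTH_CM / width_px1
    depth2 = focal_length_px * FACE_WIDTH_CM / width_px2
    # Centimetres per pixel at each face's depth; their mean is used
    # to convert the horizontal pixel gap (an assumption of this sketch).
    cm_per_px = (FACE_WIDTH_CM / width_px1 + FACE_WIDTH_CM / width_px2) / 2
    horizontal = abs(center_x1 - center_x2) * cm_per_px
    depth_gap = abs(depth1 - depth2)
    # L2 norm of the sum of the horizontal and depth vectors.
    return float(np.linalg.norm([horizontal, depth_gap]))


# Two faces 200 px apart, one twice as far from the camera as the other.
example = face_distance_cm(100, 100.0, 300, 50.0, focal_length_px=400.0)
```

In the example the horizontal component is 45 cm and the depth component is 60 cm, giving a distance of about 75 cm, whereas the pixel gap alone would say nothing about the 60 cm of depth.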



### Plate number recognition ###

#### Plate Detection
In order to detect number plates, we must first manipulate the image. We start by resizing it and converting it to grayscale. We then reduce the noise before finally detecting the edges using cv2.Canny. From there, we have to find the contours of the number plate and grab them. For this we use cv2.findContours and imutils.grab_contours, which can also pick up minuscule contours that aren't the actual number plate. For that reason we sort the contours by size and remove the insignificant ones. We then find the contour closest to a rectangular shape, since that is the shape of our desired object, the number plate. After all of this we have our number plate. We use a mask to make everything but the plate black and we get the points of the white rectangle (which is our number plate). We then crop the image to just the plate and use OCR to get the text. Once we have the number plate, we need to register it so that the company knows who has come by car.

#### Plate registration
We register the plates in two different files. 
The first one, plate_date_file, is a log of every plate the camera detects, along with the date and time, no matter the hour. To register a plate in this file we simply open it in append mode and write the number plate, together with the date and time it was seen at, at the end of the file.
The second one, plate_count_file, only registers plates within working hours. Instead of appending a new line every time, this file keeps a count of how many times each plate has been seen. To do this we read through the file and, if we find the number plate and it is during working hours, we increment the associated count. If it is working time but the number plate has not been written yet, we append it with a count of 1. If a plate is detected outside working hours, it is not registered in the plate_count_file.
The plate_date_file is used for security measures, both to make sure there was no error in the counting and to detect the plates that come at improper hours. The plate_count_file lets the company know which employees to reimburse for transport costs, and by how much.
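The two registers could be updated as in the following sketch. The function name `register_plate`, the comma-separated line formats and the default working hours of 7 to 19 are assumptions made for the example.

```python
from datetime import datetime


def register_plate(plate, when, date_path, count_path, start_hour=7, end_hour=19):
    """Log every sighting; count only sightings within working hours."""
    # plate_date_file: always append the plate with its date and time.
    with open(date_path, "a") as log:
        log.write("{},{}\n".format(plate, when.strftime("%d-%m-%Y,%H:%M:%S")))

    if not (start_hour <= when.hour < end_hour):
        return  # outside working hours: the count file is left untouched

    # plate_count_file: load the counts, increment this plate, write them back.
    counts = {}
    try:
        with open(count_path) as counter:
            for line in counter:
                if line.strip():
                    known_plate, count = line.strip().split(",")
                    counts[known_plate] = int(count)
    except FileNotFoundError:
        pass  # first plate ever seen: start with an empty register
    counts[plate] = counts.get(plate, 0) + 1
    with open(count_path, "w") as counter:
        for known_plate, count in counts.items():
            counter.write("{},{}\n".format(known_plate, count))
```

Rewriting the whole count file keeps exactly one line per plate, while the date file grows by one line per sighting and can be used to cross-check the counts.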

### 3) Results: Here you will show the results that you got, and the problems you encountered.

### Send Email

With this function, we send an email when certain conditions are met, for example when the distance rule is violated, when a mask is not worn, when a worker is late, or when someone shows up outside working hours. It is a helper function that we call at the appropriate places. We encountered a problem because Google removed **access to less secure apps**, which yagmail relied on to send email messages. We worked around it by manually generating a code that we use as the password for our email address (instead of the real password). Apart from that, everything works fine in this section.

### Distance Calculation

Here, we compute the distance between faces whenever at least 2 faces are detected. We first calibrate to get the camera's focal length, with which we convert between real-life distances and distances in pixels in the images captured by the CCTV camera. It works well. The function draws red rectangles around faces that violate the distance rule. We use this information to penalise workers who constantly violate the rule.

### Face Recognition at the gate

As suggested by the name, this section detects faces at the gate. Its purpose is to:

1) recognize our workers and automatically open the gate for them during working hours;
2) detect other faces that do not belong to workers but may be present at the gate because of an appointment, again during working hours;
3) for anyone detected after working hours, send a message to the coordinator to check whether the person might be a robber, etc.

We did not encounter any difficulties here.

### Number plate recognition and extraction

Here, we collect the number plates of our workers and register them for later use. We use those number plates to count the number of times a vehicle was present at the workplace, and we use those counts to calculate the reimbursements a worker may be eligible for, for what they have spent travelling to the workplace.

We store the number plates in register files and check certain conditions, which we already explained above in section 2).

### Face Mask Detection inside the workplace

Here, we defined 5 models and tested which of them classifies our datasets correctly. Random Forest and Support Vector Machines stand out: they sometimes reach an accuracy close to 98%, sometimes around 90%.
On average over all the models, the accuracy is around 90%.

We trained the models with our datasets and tested them to see the accuracy, the confusion matrix, the F1 score, and so forth. Overall, the project was successful. 

The models classify very well on datasets taken from disk. 
However, they sometimes misclassify live images.