# Introduction
<hr/>

In machine learning, **binary classification** is a <u>supervised</u> learning task in which a model assigns each input, here an image, to one of two classes. You provide two sets of images with a **label** for each set, and the algorithm learns from them to map every image to one of the two labels, encoded as 0 or 1.

Binary classification is widely used and works well in many scenarios. But what if I only want to recognize a single kind of image? To make this concrete, consider the problem of determining whether an image contains a fingerprint. You can build a dataset from the fingerprints you want the model to recognize, but what about examples of "Not Fingerprint" images? The set of possible "Not Fingerprint" images is practically unlimited.

To overcome this problem, we are going to use *Convolutional Neural Networks (CNNs)*.

# Dataset
<hr>

The dataset is the original [NIST Special Database 4. NIST 8-Bit Gray Scale Images of Fingerprint Image Groups](https://www.nist.gov/publications/nist-special-database-4-nist-8-bit-gray-scale-images-fingerprint-image-groups). Each sample consists of a pair of files, `filename.png` and `filename.txt`; the latter contains information about the fingerprint, such as *gender* and *class*.

There are 5 fingerprint pattern classes, namely:
- Arch (A);
- Left Loop (L);
- Right Loop (R);
- Tented Arch (T);
- Whorl (W).
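
As a rough sketch of how the `.txt` metadata could be read, the snippet below parses `Key: Value` lines and maps the class code to its pattern name. The line format, the sample text, and the names `CLASS_NAMES` and `parse_metadata` are assumptions made for illustration; they are not taken from the dataset documentation.

```python
# Class codes from the list above, mapped to their pattern names.
CLASS_NAMES = {
    "A": "Arch",
    "L": "Left Loop",
    "R": "Right Loop",
    "T": "Tented Arch",
    "W": "Whorl",
}


def parse_metadata(text):
    """Parse 'Key: Value' lines into a dict (assumed file layout)."""
    info = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            info[key.strip().lower()] = value.strip()
    return info


# Hypothetical contents of one filename.txt
sample = "Gender: M\nClass: W"
meta = parse_metadata(sample)
pattern = CLASS_NAMES[meta["class"]]
print(meta["gender"], pattern)
```

Keeping the code-to-name mapping in one dictionary makes it easy to turn the single-letter codes into readable labels later on.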

# Dependencies and setup
<hr>

```python
import os
```

`os` is a built-in Python module that provides functions for interacting with the operating system, like creating and removing a directory (folder), fetching its contents, changing and identifying the current directory, etc.
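
The operations mentioned above can be sketched as follows. The folder and file names are hypothetical, and everything happens inside a temporary directory so that nothing on disk is affected.

```python
import os
import tempfile

# Work inside a throwaway directory that is deleted automatically.
with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, "fingerprints")  # hypothetical folder name
    os.mkdir(folder)  # create a directory

    # Put an empty file inside it, then fetch the directory contents.
    file_path = os.path.join(folder, "sample.txt")
    with open(file_path, "w"):
        pass
    contents = os.listdir(folder)

    # Remove the file, then the now-empty directory.
    os.remove(file_path)
    os.rmdir(folder)
    still_exists = os.path.exists(folder)

current_dir = os.getcwd()  # identify the current working directory
print(contents, still_exists)
```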

```python
import random
```

`random` is also a built-in Python module that provides pseudo-random number generators.
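
Because the generator is pseudo-random, fixing a seed makes the results repeatable. The sketch below shuffles a list of hypothetical file names twice with the same seed and gets the same order both times; the seed value and the names are arbitrary choices for illustration.

```python
import random

# Hypothetical file names, used only for illustration.
files = [f"f{i:04d}.png" for i in range(1, 6)]

# Two independent generators created with the same seed.
first_rng = random.Random(42)
second_rng = random.Random(42)

first_order = files[:]  # copy, so the original list stays untouched
first_rng.shuffle(first_order)

second_order = files[:]
second_rng.shuffle(second_order)

# Same seed, same sequence of pseudo-random numbers, same shuffle.
print(first_order == second_order)
```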

```python
import cv2
```

`cv2` is the Python interface to *OpenCV*, a large open-source library for computer vision, machine learning, and image processing.

```python
import numpy as np
```

`numpy` is a module for working with multidimensional arrays, along with variants such as masked arrays and matrices, and it supports a wide range of mathematical operations on them.

"But doesn't Python have its own arrays? Why do people use numpy so much?" you may ask. Quite simply, numpy arrays are faster than regular Python lists. Another reason is that numpy operations are vectorized, which means the code contains no explicit looping or indexing. This makes the code not only more readable but also closer to standard mathematical notation. For two sequences `a` and `b` of the same size, an element-wise multiplication in plain Python looks like this:

```python
a, b = [1, 2, 3], [4, 5, 6]  # example inputs
c = []
for i in range(len(a)):
    c.append(a[i] * b[i])  # multiply element by element
```

With numpy arrays, the same result takes a single expression:

```python
a, b = np.array([1, 2, 3]), np.array([4, 5, 6])  # example inputs
c = a * b
```
Numpy makes many mathematical operations that are widely used in **scientific computing** fast and easy to use, such as: vector-vector multiplication; matrix-matrix and matrix-vector multiplication; element-wise operations on vectors and matrices (e.g., adding, subtracting, multiplying, and dividing by a number); element-wise or array-wise comparisons; applying functions element-wise to a vector or matrix (like pow, log, and exp); reductions; statistics; and much more.
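
A few of these operations are sketched below on small arrays. The values are arbitrary and chosen only so the results are easy to check by hand.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])
M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])

dot = v @ w               # vector-vector (dot) product
mat_vec = M @ v           # matrix-vector product
mat_mat = M @ M.T         # matrix-matrix product (2x3 times 3x2)
scaled = v / 2            # element-wise division by a number
mask = v > 1.5            # element-wise comparison, gives booleans
powered = np.power(v, 2)  # function applied element-wise
total = v.sum()           # reduction over all elements
col_mean = M.mean(axis=0) # statistic computed per column

print(dot, mat_vec, total)
```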

A digital image is stored as a matrix of pixel values, so numpy is extremely important for us.
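
To make the image-as-matrix idea concrete, here is a minimal sketch that builds a tiny synthetic 8-bit grayscale image by hand. The size and pixel values are made up for illustration; the dataset images are much larger.

```python
import numpy as np

# A synthetic 4x6 grayscale image: rows are height, columns are width.
img = np.zeros((4, 6), dtype=np.uint8)  # all black (0)
img[1:3, 2:5] = 255                     # a white rectangle in the middle

height, width = img.shape

# Ordinary array operations act as image operations.
normalized = img.astype(np.float32) / 255.0  # rescale pixels to [0, 1]
inverted = 255 - img                         # negative of the image

print(height, width, normalized.max())
```

Scaling pixel values to the range [0, 1], as done for `normalized`, is a common preprocessing step before feeding images to a neural network.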

```python
import pandas as pd
```

`pandas` is an open-source module used to analyze and manipulate data.
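
As a small sketch of the kind of analysis pandas allows, the table below holds made-up metadata records. The file names and values are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical metadata records, one row per fingerprint.
records = pd.DataFrame({
    "filename": ["f0001", "f0002", "f0003", "f0004"],
    "gender": ["M", "F", "M", "M"],
    "class": ["W", "A", "W", "L"],
})

# Count how many samples belong to each class.
class_counts = records["class"].value_counts()

# Keep only the rows that satisfy a condition.
males = records[records["gender"] == "M"]

print(class_counts)
print(len(males))
```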

```python
import seaborn as sns
import matplotlib.pyplot as plt
```

`seaborn` is a data visualization library based on `matplotlib`. It provides a high-level interface for plotting attractive and informative statistical graphics.

```python
from sklearn.model_selection import train_test_split
```

`sklearn` (scikit-learn) is one of the most widely used libraries for machine learning in Python. It contains many efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction. Note that sklearn is meant for building machine learning models; reading, manipulating, and summarizing data is better left to libraries designed for that (e.g., NumPy and pandas).
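
The imported `train_test_split` function divides a dataset into a training part and a held-out test part. The sketch below uses toy data; the 80/20 split, the `stratify` option, and the seed are illustrative choices, not settings taken from this project.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 samples with 2 features each, and balanced 0/1 labels.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# Hold out 20% of the samples for testing.
# stratify=y keeps the class proportions the same in both parts;
# random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print(X_train.shape, X_test.shape)
```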

```python
from tensorflow import keras
from tensorflow.keras import layers
```

`tensorflow` is an open-source library for numerical computation that makes machine learning faster and easier. `keras` is a high-level neural network API that runs on top of TensorFlow and is tightly integrated with it; it is used to build machine learning models. `layers` provides the building blocks of an artificial neural network, where each layer is a set of 'neurons'.
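
To show how `keras` and `layers` fit together, here is a minimal sketch of a small convolutional network built by stacking layers. The 96x96 grayscale input size, the layer sizes, and the five-way output are assumptions made for illustration; this is not necessarily the architecture used later.

```python
from tensorflow import keras
from tensorflow.keras import layers

# A tiny CNN: each entry in the list is one layer of the network.
model = keras.Sequential([
    keras.Input(shape=(96, 96, 1)),           # assumed grayscale input size
    layers.Conv2D(16, 3, activation="relu"),  # 16 filters of size 3x3
    layers.MaxPooling2D(),                    # halve the spatial resolution
    layers.Flatten(),                         # feature maps to a flat vector
    layers.Dense(5, activation="softmax"),    # assumed: one output per class
])

model.summary()
```

`keras.Sequential` fits this case because the layers form a single chain in which each layer feeds the next.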