<img src="logo_ups.jpg" width="400">
<div style="text-align: right"> By: Jorge Ortiz</div>

# Neural Network: Number Prediction.
This notebook recognizes handwritten digits from 0 to 9 and covers the main steps needed to create, train and validate artificial neural networks in Python with the scikit-learn library.

First, it is important to verify that all the required libraries are installed.

## Prerequisites:

Python Libraries.

* Python (versions >=2.7 or >=3.3; the code in this notebook uses Python 3 syntax)
* [Numpy >= 1.8.2](http://www.numpy.org/)
* [SciPy >= 0.13.3](https://www.scipy.org/)
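
One way to check the prerequisites is to ask Python which versions it can find. The snippet below is a minimal sketch that relies only on the standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_versions` and the `REQUIRED` list are illustrative, and the names in the list are assumed to be the PyPI distribution names of the libraries above.

```python
from importlib import metadata

# Distribution names as published on PyPI (an assumption of this sketch).
REQUIRED = ["numpy", "scipy", "scikit-learn"]


def installed_versions(names):
    """Return {name: version string, or None when the package is missing}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per library so missing packages are easy to spot.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any library reported as missing can be installed with `pip` before continuing.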

## Setup:
**scikit-learn** can be installed with the following command:

    pip install -U scikit-learn
    
The **-U** option indicates that if the package is already installed, it must be upgraded to the latest stable version.

For more details, see the [installation guide](http://scikit-learn.org/stable/install.html).

## Deep Neural Networks
Deep-learning networks are distinguished from the more commonplace single-hidden-layer neural networks by their depth; that is, the number of node layers through which data must pass in a multistep process of pattern recognition.

Earlier versions of neural networks such as the first perceptrons were shallow, composed of one input and one output layer, and at most one hidden layer in between. More than three layers (including input and output) qualifies as “deep” learning. So deep is not just a buzzword to make algorithms seem like they read Sartre and listen to bands you haven’t heard of yet. It is a strictly defined term that means more than one hidden layer.

In deep-learning networks, each layer of nodes trains on a distinct set of features based on the previous layer’s output. The further you advance into the neural net, the more complex the features your nodes can recognize, since they aggregate and recombine features from the previous layer.
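
To make the idea of stacked layers concrete, here is a minimal sketch of a forward pass in plain Python. It is not how scikit-learn implements its networks; the weights, the layer sizes and the function names are made up for illustration. Each layer receives only the output of the previous layer, which is what lets later layers combine earlier features.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One dense layer: every node combines all outputs of the previous layer."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the data through each (weights, biases) pair in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical toy network: 3 inputs -> 2 hidden -> 2 hidden -> 1 output.
# Two hidden layers make it "deep" in the sense described above.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[0.7, -0.3]], [0.2]),
]
print(network_forward([1.0, 0.0, 1.0], layers))
```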

1) We define utilities that locate the patterns of ones and zeros in the corpus and read them. The corpus contains the handwriting patterns of several individuals for the digits 0 - 9.

In [1]:
import re  # Regular expressions
import itertools

class Utilities:
    
    def __init__(self, path = '/home/jorge/Documentos/ia2/ProyectoInterciclo/Digitos/corpus/digits-database.data'):
        self.path = path
        self.regex = re.compile(r'(0|1){2,}')  # Rows made of two or more 0s and 1s
        self.regexno = re.compile(r'(\s)+[0-9]{1}')  # A single digit preceded by a space or tab
        
    
    def generate_indices(self):
        _dict = []
        with open(self.path, 'r') as _f:  # Open the corpus file
            pivote = 0
            flag = False
            lineno = 0
            for line in _f:
                # A row of 0s and 1s marks the start of a new digit bitmap
                if self.regex.match(line) is not None and not flag:
                    pivote = lineno
                    flag = True
                # The label line closes the bitmap: store (digit, start, end)
                if self.regexno.match(line) is not None and flag:
                    _dict.append((int(line.replace(' ','')),pivote,lineno))
                    flag = False
                lineno += 1
            
        return _dict

    def get_digit(self,_slice, _end):
        data = []
        with open(self.path, 'r') as _f:
            for line in itertools.islice(_f, _slice, _end):
                data.append([int(i) for i in line.strip()])  # One row of the bitmap as a list of 0/1
        return data
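
The same parsing idea can be tried without the corpus file. The sketch below is a simplified variant of the utilities above and works on a small in-memory string. It assumes the layout implied by the two regular expressions: rows of 0s and 1s followed by a line holding the digit after some whitespace. The 2 x 4 bitmaps and the name `parse_corpus` are invented for the example.

```python
import re

BITMAP_ROW = re.compile(r'[01]{2,}')  # a row made only of 0s and 1s
LABEL_ROW = re.compile(r'\s+[0-9]')   # whitespace followed by a single digit


def parse_corpus(text):
    """Return a list of (digit, bitmap) pairs found in the text."""
    samples = []
    rows = []
    for line in text.splitlines():
        if BITMAP_ROW.fullmatch(line.strip()):
            # Bitmap row: turn each character into an integer pixel.
            rows.append([int(ch) for ch in line.strip()])
        elif LABEL_ROW.match(line) and rows:
            # Label line: it closes the bitmap collected so far.
            samples.append((int(line), rows))
            rows = []
    return samples


# Made-up miniature corpus with two samples.
sample_text = "0110\n1001\n 7\n1111\n0000\n 3\n"
for digit, bitmap in parse_corpus(sample_text):
    print(digit, bitmap)
```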


2) The network is trained.
Maximum number of iterations: 500.
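
The training cell that follows one-hot encodes the digit labels and later recovers the digit with `argmax`. The sketch below reproduces both steps in plain Python; the helper names are illustrative. It also shows a corner case: when the output row contains no 1 at all, `argmax` returns index 0. As far as we understand, this can happen when `MLPClassifier` receives a 2-D indicator target and thresholds each output independently. That would be one possible explanation, not a verified diagnosis, for the many wrong predictions of 0 in the results further down.

```python
def one_hot(label, n_classes=10):
    """Encode a digit as a vector with a single 1 at position `label`."""
    vector = [0] * n_classes
    vector[label] = 1
    return vector


def argmax(vector):
    """Index of the first largest element (same tie rule as numpy.argmax)."""
    return max(range(len(vector)), key=vector.__getitem__)


print(one_hot(7))          # [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
print(argmax(one_hot(7)))  # 7
print(argmax([0] * 10))    # 0, even though no output was active
```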

In [2]:
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from tkinter import *
from tkinter import ttk
import tkinter.messagebox as msg
import numpy as np


utilities = Utilities()

class Ventana(Frame):

    def __init__(self, master = None):
        super().__init__(master)
        self.master = master
        self.coordenadas = []  # Stores the cells drawn in the interface (later turned into a matrix)
        self.utilities = Utilities()
        self.indices = self.utilities.generate_indices()
        self.n = []
        self.entrada = []
        self.datos = []
        self.delta = []
        self.init()  # Calls init to build the interface and train the network

    def normalizador(self):
        for j, k, l in self.indices:
            self.n.append(j)
            self.entrada.append((k,l))

        for i in range(0, len(self.indices)):
            inicio, fin = self.entrada[i]
            fila = np.ravel(np.matrix(self.utilities.get_digit(inicio, fin)))
            self.datos.append(fila)
            self.delta.append(self.n[i])

    def init(self):
        self.master.resizable(0, 0)
        self.grid(row = 0,column = 0)
        self.matriz()
        self.normalizador()
        self.train()  # Calls the method that trains the network

        btnReiniciar = Button(self, text="Reiniciar", height=3, command=self.reiniciar)  # Clears the grid
        btnReiniciar.grid(columnspan = 16, sticky = W + E + N + S,row = 32, column = 0)

        btnPredecir = Button(self, text="Predecir", height=3, command=self.decode)
        btnPredecir.grid(columnspan = 16, sticky = W + E + N + S,row = 32, column = 16)

    def train(self):  # Trains the network
        self.label_encoder = LabelEncoder()
        salida = self.label_encoder.fit_transform(self.delta)
        # Note: scikit-learn >= 1.2 renames this argument to sparse_output
        onehot_encoder = OneHotEncoder(sparse=False)
        salida = salida.reshape(len(salida), 1)
        self.onehot_encoded = onehot_encoder.fit_transform(salida)
        x_train, x_test, d_train, d_test = train_test_split(self.datos, self.onehot_encoded, test_size=0.80, random_state=0)
        self.mlp = MLPClassifier(solver = 'lbfgs', activation='logistic', verbose=True, alpha=1e-4, tol=1e-15, max_iter=500,
                                 hidden_layer_sizes=(1024, 800, 400, 200, 10))
        # Note: the network is fitted on the full data set, so x_test is not truly held out
        # and the confusion matrix printed below is an optimistic estimate
        self.mlp.fit(self.datos, self.onehot_encoded)

        # Rows of the confusion matrix are the true digits, columns the predicted ones
        prediccion = (np.argmax(self.mlp.predict(x_test), axis = 1) + 1).reshape(-1, 1)
        matriz = confusion_matrix((np.argmax(d_test, axis = 1) + 1).reshape(-1, 1), prediccion)
        print(matriz)

    def decode(self):
        entrada = self.normaliza(32, self.coordenadas)
        numero = np.ravel(np.matrix(entrada))
        res = self.mlp.predict(numero.reshape(1, -1))  # Network already trained
        resultado = int(np.argmax(res, axis=1)[0])  # Index of the active output is the digit
        print(resultado)
        return resultado

    def matriz(self):
        self.btn = [[0 for x in range(32)] for x in range(32)] 
        for x in range(32):
            for y in range(32):
                self.btn[x][y] = Button(self, command=lambda x1=x, y1=y: self.dibujar(x1,y1))
                self.btn[x][y].grid(column = x, row = y)

    def normaliza(self, n, coordenadas):  # Converts the button grid into a matrix
        matriz = []
        for i in range(n):
            matriz.append([0 for j in range(n)])

        for i in range(len(coordenadas)):
            x, y = coordenadas[i]
            matriz[y][x] = 1
        return matriz

    def dibujar(self, x, y):
        self.btn[x][y].config(bg = "black")
        self.coordenadas.append((x, y))
        
    def reiniciar(self):
        self.matriz()
        self.coordenadas = []  # Empties the stored coordinates

if __name__ == '__main__':
    root = Tk()
    ventana = Ventana(root)
    root.mainloop()


[[64  0  0  0  0  0  0  0  0  0]
 [ 5 68  0  0  0  6  0  0  0  0]
 [33  0 40  0  0  0  0  0  0  0]
 [51  0 17  0  0  0  0  1  4  0]
 [ 2  0  0  0 81  0  0  4  0  0]
 [ 4  0  1  0  0 83  0  0  1  0]
 [ 1  0  0  0  0  0 66  0  0  0]
 [ 0  0  0  0  0  0  0 78  0  0]
 [20  0  0  0  0  3  0  0 51  0]
 [44  1  4  0  1  0  0  3 18  2]]
1
0
0
0
0
2
0
0
0
7
5
5
0
0
0
7
0
0
7
7
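
The 10 x 10 matrix printed above can be summarized with a few lines of Python. The sketch below copies the printed values and computes the overall accuracy and the recall per digit. It assumes that row `i` and column `i` both correspond to digit `i`, with rows as true digits and columns as predictions, which is the convention of `confusion_matrix`. The function names are illustrative.

```python
# Values copied from the confusion matrix printed by the training cell.
CONFUSION = [
    [64, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [5, 68, 0, 0, 0, 6, 0, 0, 0, 0],
    [33, 0, 40, 0, 0, 0, 0, 0, 0, 0],
    [51, 0, 17, 0, 0, 0, 0, 1, 4, 0],
    [2, 0, 0, 0, 81, 0, 0, 4, 0, 0],
    [4, 0, 1, 0, 0, 83, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 66, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 78, 0, 0],
    [20, 0, 0, 0, 0, 3, 0, 0, 51, 0],
    [44, 1, 4, 0, 1, 0, 0, 3, 18, 2],
]


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall_per_class(matrix):
    """For each true digit, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


print(f"accuracy: {accuracy(CONFUSION):.3f}")
for digit, recall in enumerate(recall_per_class(CONFUSION)):
    print(f"digit {digit}: recall {recall:.2f}")
```

Digits 3 and 9 have the lowest recall in this matrix, which is consistent with the failed test for the number 9 in the tests further on.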


A blank window opens. It contains a 32 x 32 grid of buttons, which we press to draw a digit.
![titulo](./imagenes/pi2.png)
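
Before a prediction, the pressed buttons have to become the 1024-value input the network expects. The sketch below mirrors what `normaliza` and `np.ravel` do in the code above, using plain lists; the function name `coords_to_vector` and the 3 x 3 grid in the demo are only for illustration.

```python
def coords_to_vector(coords, n=32):
    """Turn clicked (x, y) cells into a flat, row-major list of n * n pixels."""
    grid = [[0] * n for _ in range(n)]
    for x, y in coords:
        grid[y][x] = 1  # row = y, column = x, as in normaliza()
    # Flatten row by row, which is what np.ravel does by default.
    return [cell for row in grid for cell in row]


# Three clicked cells on a tiny 3 x 3 grid.
print(coords_to_vector([(0, 0), (1, 0), (1, 1)], n=3))
# [1, 1, 0, 0, 1, 0, 0, 0, 0]
```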

## Adding new tests
* The first test is to draw the number 1.
![titulo](./imagenes/piN1.png)
We check in the console that the network returns the expected number.
![titulo](./imagenes/piN1R.png)
Correct prediction.

* The second test is to draw the number 2.
![titulo](./imagenes/piN2R.png)
The console shows 0, which is wrong, so we add more points to the drawing to improve the prediction.
![titulo](./imagenes/piN2RC.png)
Correct prediction.


* The third test is to draw the number 9.
![titulo](./imagenes/pin9E.png)
The prediction is incorrect: the console prints 7 instead of 9.
![titulo](./imagenes/pin9E1.png)
The prediction failed.

## Conclusion
* More training iterations can give better results, but not always: beyond a certain point, additional iterations bring no further improvement.
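
The plateau can be illustrated with a toy problem. The sketch below is not the network above: it minimizes the made-up function (w - 3)^2 with gradient descent and stops once the loss improves by less than a tolerance. This plays a role similar to the `tol` and `max_iter` arguments passed to `MLPClassifier`. The learning rate and tolerance are arbitrary choices.

```python
def train(iterations, lr=0.1, tol=1e-6):
    """Minimize (w - 3)^2; stop early once the loss improves by less than tol."""
    w = 0.0
    losses = [(w - 3.0) ** 2]
    for _ in range(iterations):
        w -= lr * 2.0 * (w - 3.0)  # gradient of (w - 3)^2 is 2 * (w - 3)
        losses.append((w - 3.0) ** 2)
        if losses[-2] - losses[-1] < tol:
            break  # further iterations would change almost nothing
    return losses


losses = train(500)
print(len(losses) - 1, "iterations used out of 500")
print("final loss:", losses[-1])
```

In this toy case most of the 500 allowed iterations are never needed, because the loss stops improving long before the limit is reached.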