<p>February 26, 2024</p>
<p>Universidad Autónoma de Aguascalientes</p>
<p>Department: Computer Science</p>
<p>Degree program: Intelligent Computing Engineering</p>
<p>Instructor: Dr. Francisco Javier Luna Rosas</p>
<p>Course: Machine learning and deep learning</p>
<p>Student: Antonio Muñoz Barrientos</p>
<p>Semester: January - June 2024</p>

# Neural network for signature detection

### Libraries

In [93]:
import cv2
import numpy as np
import pandas as pd

## Reading the signature

In [94]:
imagen_firma = cv2.imread("Extracted/Firma Extraida 1.jpg", cv2.IMREAD_GRAYSCALE)
imagen_firma.shape

(76, 170)

#### Scaling the signature
To make the extracted parameters comparable across samples, we rescale every signature to the same resolution of 300x300 pixels.

In [95]:
imagen_firma = cv2.resize(imagen_firma, (300,300))
imagen_firma.shape

(300, 300)
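Resizing a 76x170 image straight to 300x300 stretches it vertically. If that distortion turns out to matter, one alternative (not what this notebook does) is to scale by a single factor and pad the rest of the canvas. The sketch below only computes the geometry; `letterbox_geometry` is a hypothetical helper name, and the resulting size and padding could then be passed to `cv2.resize` and `cv2.copyMakeBorder`. Note that `shape` is `(height, width)` while the size argument of `cv2.resize` is `(width, height)`.

```python
# Hypothetical helper: aspect-preserving size and padding for a square canvas
def letterbox_geometry(height, width, target=300):
    # One scale factor for both axes, limited by the longer side
    scale = min(target / height, target / width)
    new_h, new_w = round(height * scale), round(width * scale)
    # Remaining space is split between the two sides of each axis
    pad_v, pad_h = target - new_h, target - new_w
    top, left = pad_v // 2, pad_h // 2
    # Size is returned as (width, height); padding as (top, bottom, left, right)
    return (new_w, new_h), (top, pad_v - top, left, pad_h - left)

size, padding = letterbox_geometry(76, 170)
print(size, padding)
```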

### Extracting the signature parameters


#### Perimeter and area

We look for general parameters that capture simple traits of a signature: its area and its perimeter. Because every signing is slightly different, these values should vary from sample to sample, but not by much when the samples come from the same signer.

In [96]:
def obtener_parametros_firma(imagen_firma):
    """Return the area and perimeter of one contour of the signature image."""
    # Morphological opening, then closing, with a 3x3 kernel to remove small noise
    kernel = np.ones((3, 3), np.uint8)
    opening = cv2.morphologyEx(imagen_firma.copy(), cv2.MORPH_OPEN, kernel, iterations=2)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel, iterations=2)
    # RETR_EXTERNAL retrieves only the outermost contours
    contornos, _ = cv2.findContours(closing, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    if not contornos:
        return 0, 0

    # Use the only contour if there is just one, otherwise the second one found
    contorno = contornos[0] if len(contornos) == 1 else contornos[1]
    area_firma = cv2.contourArea(contorno)
    perimetro_firma = cv2.arcLength(contorno, True)  # True: the contour is closed

    return area_firma, perimetro_firma

area, perimetro = obtener_parametros_firma(imagen_firma=imagen_firma)
print("Signature area:", area)
print("Signature perimeter:", perimetro)

Signature area: 89401.0
Signature perimeter: 1196.0
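The two numbers printed above are worth a quick sanity check. A contour area is computed from the polygon's vertices, and a square with corners at pixels 0 and 299 has area 299² = 89401 and perimeter 4·299 = 1196, exactly the values reported. The sketch below reproduces that arithmetic with the shoelace formula in plain NumPy; `polygon_area` and `polygon_perimeter` are illustrative names, not OpenCV functions. If the match is not a coincidence, the contour being measured may be the frame of the 300x300 image rather than the pen strokes, which can happen when the background is white and the ink is dark.

```python
import numpy as np

def polygon_area(points):
    """Area of a closed polygon from its vertices (shoelace formula)."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def polygon_perimeter(points):
    """Sum of edge lengths, including the edge that closes the polygon."""
    diffs = np.roll(points, -1, axis=0) - points
    return float(np.sum(np.hypot(diffs[:, 0], diffs[:, 1])))

# Corners of the full 300x300 image, using pixel coordinates 0..299
frame = np.array([[0, 0], [299, 0], [299, 299], [0, 299]], dtype=float)
frame_area = polygon_area(frame)
frame_perimeter = polygon_perimeter(frame)
print(frame_area, frame_perimeter)
```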


#### Structural elements

To obtain structural features we use three structuring-element shapes included in the cv2 library: MORPH_ELLIPSE, MORPH_RECT and MORPH_CROSS. We create kernels of each shape (4 ellipses, 5 rectangles and 4 crosses of increasing size, 13 in total) and erode the image with each one. We then count the pixels that remain on (non-zero) after the erosion, which are the positions where that element fits inside the bright regions of the image. Each count is one feature.
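The idea of "erode, then count what survives" can be illustrated on a tiny binary image. The following is a minimal NumPy sketch, not OpenCV's implementation: `erode_binary` is a hypothetical helper that pads with zeros, so its behavior at the image border may differ from `cv2.erode`. It shows why larger elements yield smaller counts.

```python
import numpy as np

def erode_binary(image, kernel):
    """Keep a pixel only if every kernel position set to 1 covers a 1 in the image."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), constant_values=0)
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + kh, c:c + kw]
            out[r, c] = np.all(window[kernel == 1] == 1)
    return out

# 7x7 image containing a 5x5 block of ones
image = np.zeros((7, 7), dtype=np.uint8)
image[1:6, 1:6] = 1

count_3 = int(np.sum(erode_binary(image, np.ones((3, 3), np.uint8)) > 0))
count_5 = int(np.sum(erode_binary(image, np.ones((5, 5), np.uint8)) > 0))
print(count_3, count_5)
```

The 3x3 element fits at 9 positions inside the block and the 5x5 element at only one, so the count drops as the element grows.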

##### Helper function to display an image when needed

In [97]:
def printI(imagen):
    cv2.imshow('Imagen', imagen)
    cv2.waitKey(0)  # block until any key is pressed
    cv2.destroyAllWindows()  # close the window afterwards

##### Extracting the structural elements

In [98]:
def obtener_elementos_estruturales(imagen):
    # Build 13 structuring elements: 4 ellipses, 5 rectangles and 4 crosses
    elementos_estructurantes = []
    for i in range(3, 10, 2):  # ellipses of size 3, 5, 7, 9
        elemento = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (i, i))
        elementos_estructurantes.append(elemento)

    for i in range(11, 20, 2):  # rectangles of size 11, 13, 15, 17, 19
        elemento = cv2.getStructuringElement(cv2.MORPH_RECT, (i, i))
        elementos_estructurantes.append(elemento)

    for i in range(21, 28, 2):  # crosses of size 21, 23, 25, 27
        elemento = cv2.getStructuringElement(cv2.MORPH_CROSS, (i, i))
        elementos_estructurantes.append(elemento)

    # Erode the image with each element and count the pixels that stay non-zero
    caracteristicas = []
    for elemento in elementos_estructurantes:
        erosion = cv2.erode(imagen, elemento)
        pixeles_encendidos = np.sum(erosion > 0)
        caracteristicas.append(pixeles_encendidos)

    return caracteristicas

caracteristicas = obtener_elementos_estruturales(imagen_firma)
print(caracteristicas)

[77343, 71802, 68519, 65235, 59617, 56343, 53357, 50557, 47956, 57244, 55767, 54411, 53101]


### Creating the DataFrame

All that remains is to combine these elements into the DataFrame row for the current signature.

<h7>Example of a single DataFrame row</h7>

In [99]:
entrada = {
    'Firma':'5',
    'area':area,
    'perimetro': perimetro,
    'EE1': caracteristicas[0],
    'EE2': caracteristicas[1],
    'EE3': caracteristicas[2],
    'EE4': caracteristicas[3],
    'EE5': caracteristicas[4],
    'EE6': caracteristicas[5],
    'EE7': caracteristicas[6],
    'EE8': caracteristicas[7],
    'EE9': caracteristicas[8],
    'EE10': caracteristicas[9],
    'EE11': caracteristicas[10],
    'EE12': caracteristicas[11],
    # caracteristicas[12] is not stored: repeating the 'EE12' key would
    # silently overwrite the value above
}

df = pd.DataFrame(entrada, index=[0])
df

Unnamed: 0,Firma,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12
0,5,89401.0,1196.0,77343,71802,68519,65235,59617,56343,53357,50557,47956,57244,55767,54411
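A Python dict literal keeps only the last value when the same key appears twice, and it does so without any error, which makes long hand-written key lists fragile. A minimal sketch of one way to avoid that: generate the feature names from the list itself. The values are the 13 counts printed earlier for this signature; naming the last column `EE13` is an assumption, since the notebook's tables only go up to `EE12`.

```python
# The 13 structural counts printed earlier for the first signature
caracteristicas = [77343, 71802, 68519, 65235, 59617, 56343, 53357,
                   50557, 47956, 57244, 55767, 54411, 53101]

# A repeated key in a dict literal collapses to a single entry
duplicado = {'EE12': 1, 'EE12': 2}

# Generated names EE1..EE13 cannot collide
entrada = {'Firma': 1, 'area': 89401.0, 'perimetro': 1196.0}
entrada.update({f'EE{k}': valor for k, valor in enumerate(caracteristicas, start=1)})

print(len(duplicado), len(entrada))
```

With generated names, all 13 features are kept and the column count always matches the length of the feature list.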


## Running the whole process for every signature

In [100]:
import os
import re

# Data dictionary: one list per DataFrame column.
# 13 structural features are computed, but only EE1..EE12 are stored here,
# so the last one (caracteristicas[12]) is not used.
datos_firmas = {
    'Firma':list(),
    'area':list(),
    'perimetro': list(),
    'EE1': list(),
    'EE2': list(),
    'EE3': list(),
    'EE4': list(),
    'EE5': list(),
    'EE6': list(),
    'EE7': list(),
    'EE8': list(),
    'EE9': list(),
    'EE10': list(),
    'EE11': list(),
    'EE12': list(),
}

# Sort file names in natural order, so that "2" comes before "10"
def sorted_alphanumeric(data):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]
    return sorted(data, key=alphanum_key)

ruta = "Extracted/"

firmas = sorted_alphanumeric(os.listdir(ruta))

for i,firma in enumerate(firmas):
    imagen_firma = cv2.imread(ruta+firma, cv2.IMREAD_GRAYSCALE)
    imagen_firma = cv2.resize(imagen_firma, (300,300))

    area, perimetro = obtener_parametros_firma(imagen_firma=imagen_firma.copy())
    caracteristicas = obtener_elementos_estruturales(imagen_firma.copy())
    datos_firmas['Firma'].append(i+1)
    datos_firmas['area'].append(area)
    datos_firmas['perimetro'].append(perimetro)
    for j,element in enumerate(list(datos_firmas.keys())[3:]):
        datos_firmas[element].append(caracteristicas[j])

# Let pandas build the index: len(datos_firmas) is the number of columns,
# not the number of signatures
dataframe = pd.DataFrame(datos_firmas)
dataframe = dataframe.round(4)
dataframe

Unnamed: 0,Firma,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12
0,1,89401.0,1196.0,77343,71802,68519,65235,59617,56343,53357,50557,47956,57244,55767,54411
1,2,89401.0,1196.0,75624,69832,66454,63108,57309,54050,50913,47893,45031,54335,52598,50933
2,3,85728.5,1456.468,78115,72815,69442,65994,59750,56359,53209,50421,47884,57266,55670,54217
3,4,89401.0,1196.0,76647,70481,66758,63051,56611,53175,50050,47149,44470,54444,52771,51155
4,5,87465.5,1578.4092,78157,72959,69448,65947,59892,56600,53631,50915,48376,57703,56166,54728
5,6,87994.0,1431.2548,77357,72169,68963,65796,60388,57441,54797,52352,50171,56762,55161,53678
6,7,89378.0,1202.8284,74758,68345,64566,60953,54967,51435,48150,45105,42418,50945,48969,47200
7,8,87638.0,1566.3675,78412,73394,70123,66827,61416,58358,55546,52853,50271,59095,57652,56265
8,9,89339.0,1207.6569,80841,74080,69921,66284,60573,57371,54510,51972,49819,59659,58093,56580
9,10,87800.5,1486.468,76254,70650,67319,64050,58500,55405,52615,49988,47574,55704,54197,52782
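The `sorted_alphanumeric` helper matters because a plain string sort places "10" before "2". A small self-contained sketch of the same idea follows; only "Firma Extraida 1.jpg" appears in this notebook, so the other two file names are assumed to follow the same pattern, and `natural_key` is an illustrative name.

```python
import re

def natural_key(name):
    # Split into digit and non-digit runs; compare digit runs as integers
    return [int(t) if t.isdigit() else t.lower() for t in re.split(r'([0-9]+)', name)]

names = ["Firma Extraida 10.jpg", "Firma Extraida 2.jpg", "Firma Extraida 1.jpg"]

plain = sorted(names)                     # lexicographic order
natural = sorted(names, key=natural_key)  # numeric-aware order
print(plain)
print(natural)
```

Natural ordering keeps the `Firma` number assigned in the loop aligned with the number in the file name.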


## Creating new entries
To improve the accuracy of our neural network, we create new entries computed from the previous ones, using the mean and standard deviation of each column. Each new row is either the column means, the means minus one standard deviation, or the means plus one standard deviation, chosen at random. The goal is to have a larger dataset of valid entries.

In [101]:
import random

# Every column except the signature number is a feature
columnas = [c for c in dataframe.columns if c != 'Firma']

# No random seed is set, so the generated rows differ between runs
for i in range(16,61):
    # -1, 0 or +1: shift all features by minus one, zero or plus one standard deviation
    signo = random.randint(1,3) - 2
    fila = {'Firma': i}
    for columna in columnas:
        # Statistics are recomputed on the growing dataframe, generated rows included
        fila[columna] = dataframe[columna].mean() + signo * dataframe[columna].std()
    dataframe_generado = pd.DataFrame(fila, index=[0])
    dataframe = pd.concat([dataframe, dataframe_generado], ignore_index=True)

dataframe

Unnamed: 0,Firma,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12
0,1,89401.0,1196.0,77343.0,71802.0,68519.0,65235.0,59617.0,56343.0,53357.0,50557.0,47956.0,57244.0,55767.0,54411.0
1,2,89401.0,1196.0,75624.0,69832.0,66454.0,63108.0,57309.0,54050.0,50913.0,47893.0,45031.0,54335.0,52598.0,50933.0
2,3,85728.5,1456.468,78115.0,72815.0,69442.0,65994.0,59750.0,56359.0,53209.0,50421.0,47884.0,57266.0,55670.0,54217.0
3,4,89401.0,1196.0,76647.0,70481.0,66758.0,63051.0,56611.0,53175.0,50050.0,47149.0,44470.0,54444.0,52771.0,51155.0
4,5,87465.5,1578.4092,78157.0,72959.0,69448.0,65947.0,59892.0,56600.0,53631.0,50915.0,48376.0,57703.0,56166.0,54728.0
5,6,87994.0,1431.2548,77357.0,72169.0,68963.0,65796.0,60388.0,57441.0,54797.0,52352.0,50171.0,56762.0,55161.0,53678.0
6,7,89378.0,1202.8284,74758.0,68345.0,64566.0,60953.0,54967.0,51435.0,48150.0,45105.0,42418.0,50945.0,48969.0,47200.0
7,8,87638.0,1566.3675,78412.0,73394.0,70123.0,66827.0,61416.0,58358.0,55546.0,52853.0,50271.0,59095.0,57652.0,56265.0
8,9,89339.0,1207.6569,80841.0,74080.0,69921.0,66284.0,60573.0,57371.0,54510.0,51972.0,49819.0,59659.0,58093.0,56580.0
9,10,87800.5,1486.468,76254.0,70650.0,67319.0,64050.0,58500.0,55405.0,52615.0,49988.0,47574.0,55704.0,54197.0,52782.0
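Since every generated row sits exactly at the mean or one standard deviation away from it, the synthetic samples carry little variety. One common alternative, shown here only as a sketch and not as the method used in this notebook, is to draw each feature from a normal distribution with the observed mean and standard deviation. The toy matrix `reales` and the fixed seed are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed (assumption) so the sketch is repeatable

# Toy feature matrix: 5 real samples x 3 features (made-up values, not the notebook's data)
reales = np.array([
    [10.0, 200.0, 3000.0],
    [12.0, 210.0, 3100.0],
    [11.0, 190.0, 2900.0],
    [ 9.0, 205.0, 3050.0],
    [13.0, 195.0, 2950.0],
])

media = reales.mean(axis=0)
desviacion = reales.std(axis=0, ddof=1)  # ddof=1 matches the pandas default for .std()

# 45 synthetic rows, each feature sampled independently around its mean
sinteticas = rng.normal(loc=media, scale=desviacion, size=(45, reales.shape[1]))
print(sinteticas.shape)
```

Sampling features independently ignores the correlations between them, so this is still a rough approximation of real signatures.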


### Joining the DataFrames and marking with a boolean variable that these are our signature

In [102]:
data_frame_correcto = dataframe.round(2)
data_frame_correcto["Valido"] = 1
# Sample of 10 signatures
data_frame_correcto.sample(10)

Unnamed: 0,Firma,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12,Valido
42,43,88710.9,1380.48,77662.16,72179.49,68772.05,65432.03,59758.5,56589.44,53677.83,50998.32,48534.05,57069.31,55506.5,54049.24,1
53,54,89693.51,1524.71,79288.82,74095.93,70822.32,67622.75,62148.99,59130.28,56364.08,53817.03,51463.62,59920.3,58483.86,57122.31,1
11,12,89401.0,1196.0,73546.0,67554.0,63884.0,60259.0,54280.0,50846.0,47669.0,44751.0,42062.0,50148.0,48345.0,46745.0,1
19,20,89621.91,1514.2,79170.29,73956.28,70672.92,67463.12,61974.8,58945.13,56168.33,53611.63,51250.15,59712.55,58266.9,56898.38,1
57,58,89701.68,1525.9,79302.33,74111.85,70839.35,67640.95,62168.85,59151.38,56386.39,53840.44,51487.96,59943.98,58508.59,57147.84,1
52,53,88823.01,1396.94,77847.75,72398.15,69005.97,65681.98,60031.24,56879.33,53984.32,51319.92,48868.29,57394.59,55846.2,54399.86,1
37,38,89629.91,1515.37,79183.52,73971.88,70689.6,67480.94,61994.25,58965.81,56190.19,53634.57,51273.99,59735.75,58291.13,56923.39,1
43,44,89620.02,1513.92,79167.15,73952.58,70668.96,67458.89,61970.19,58940.23,56163.15,53606.19,51244.5,59707.05,58261.15,56892.45,1
38,39,88709.5,1380.28,77659.85,72176.77,68769.14,65428.92,59755.1,56585.83,53674.02,50994.32,48529.88,57065.26,55502.27,54044.87,1
22,23,89699.37,1525.57,79298.52,74107.35,70834.54,67635.81,62163.25,59145.43,56380.09,53833.83,51481.09,59937.3,58501.61,57140.64,1


### Creating the entries that are not valid

We generate them with random values for each column (integers between 100 and 10000) and set the Valido field to 0.

In [103]:
datos_firmas = {
    'Firma':list(),
    'Valido': list(),
    'area':list(),
    'perimetro': list(),
    'EE1': list(),
    'EE2': list(),
    'EE3': list(),
    'EE4': list(),
    'EE5': list(),
    'EE6': list(),
    'EE7': list(),
    'EE8': list(),
    'EE9': list(),
    'EE10': list(),
    'EE11': list(),
    'EE12': list(),
}

# Rows 61..119: every feature is a random integer in [100, 10000], labeled Valido = 0
for i in range(61,120):
    datos_firmas['Firma'].append(i)
    datos_firmas['Valido'].append(0)
    for element in list(datos_firmas.keys())[2:]:
        datos_firmas[element].append(random.randint(100,10000))

dataframe_incorrecto = pd.DataFrame(datos_firmas)
dataframe_incorrecto

Unnamed: 0,Firma,Valido,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12
0,61,0,6825,7563,2684,8555,777,1848,9583,4115,2195,4367,4557,7353,1952,4899
1,62,0,6198,9162,1105,8756,7398,9893,5148,7641,8305,1384,601,3995,3846,9355
2,63,0,8373,4145,8809,3940,6620,449,7021,4838,1347,8380,8002,4733,8685,8253
3,64,0,8356,9810,709,6017,179,9752,6408,3801,5740,4330,7924,8473,3104,347
4,65,0,2350,529,194,1383,8877,7791,1091,5914,1701,9907,502,378,8007,2681
5,66,0,2443,9029,1469,2651,7533,3848,4568,1555,8910,7592,9376,5404,1152,1721
6,67,0,8043,8644,8757,1550,7327,6227,3470,4671,852,8411,6698,4523,4088,3850
7,68,0,3088,4651,4242,957,6272,5363,8114,8830,8888,9931,856,2094,8596,3054
8,69,0,4681,6895,8385,7505,9532,4634,2209,1852,6790,4085,4173,2671,3935,5363
9,70,0,4094,8770,5511,8878,4283,6244,236,2231,1658,8337,396,9656,7119,7330
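Because the invalid rows are drawn from [100, 10000] while the real signatures have areas above 85000, the two classes are far apart before any model is trained. The sketch below uses only the `area` values printed in the tables above (the first ten valid rows and the first ten invalid rows) and a hand-picked threshold of 50000, which is an assumption for illustration. A single comparison already separates them, so any classifier trained on this data should be expected to score very high.

```python
import numpy as np

# 'area' of the first ten valid rows and the first ten invalid rows shown above
area_valida = np.array([89401.0, 89401.0, 85728.5, 89401.0, 87465.5,
                        87994.0, 89378.0, 87638.0, 89339.0, 87800.5])
area_invalida = np.array([6825, 6198, 8373, 8356, 2350,
                          2443, 8043, 3088, 4681, 4094], dtype=float)

areas = np.concatenate([area_valida, area_invalida])
etiquetas = np.concatenate([np.ones(10, dtype=int), np.zeros(10, dtype=int)])

# Hand-picked threshold (assumption): anything above it is called valid
prediccion_umbral = (areas > 50000).astype(int)
accuracy = float(np.mean(prediccion_umbral == etiquetas))
print(accuracy)
```

A harder and more informative test would use signatures from other people as the invalid class.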


### Joining the DataFrames

In [104]:
data_frame_completo = pd.concat([data_frame_correcto, dataframe_incorrecto], ignore_index=True)
data_frame_completo = data_frame_completo.round(2)
data_frame_completo.sample(20)

Unnamed: 0,Firma,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12,Valido
14,15,89401.0,1196.0,78813.0,74225.0,71067.0,67909.0,62358.0,59279.0,56435.0,53898.0,51506.0,60377.0,59088.0,57872.0,1
47,48,88770.2,1389.19,77760.34,72295.16,68895.8,65564.26,59902.78,56742.79,53839.96,51168.45,48710.86,57241.38,55686.2,54234.72,1
19,20,89621.91,1514.2,79170.29,73956.28,70672.92,67463.12,61974.8,58945.13,56168.33,53611.63,51250.15,59712.55,58266.9,56898.38,1
36,37,89604.71,1511.67,79141.8,73922.72,70637.01,67424.76,61932.94,58900.64,56121.29,53562.27,51198.85,59662.63,58214.76,56844.57,1
6,7,89378.0,1202.83,74758.0,68345.0,64566.0,60953.0,54967.0,51435.0,48150.0,45105.0,42418.0,50945.0,48969.0,47200.0,1
48,49,89658.66,1519.59,79231.13,74027.96,70749.6,67545.06,62064.21,59040.16,56268.8,53717.06,51359.72,59819.18,58378.26,57013.32,1
90,91,8395.0,4233.0,7112.0,6871.0,1500.0,6325.0,2241.0,9137.0,6641.0,6316.0,8077.0,3522.0,1351.0,4163.0,0
12,13,87709.5,1537.92,79424.0,75028.0,72210.0,69380.0,64361.0,61579.0,58977.0,56529.0,54216.0,61516.0,60091.0,58748.0,1
74,75,2664.0,4141.0,8584.0,6538.0,4329.0,5363.0,1664.0,3162.0,9090.0,5452.0,893.0,6895.0,4182.0,8454.0,0
115,116,1548.0,6132.0,4886.0,3565.0,6101.0,4459.0,2820.0,2152.0,6885.0,1064.0,6756.0,4686.0,6462.0,2759.0,0


In [105]:
# Export the DataFrame
data_frame_completo.to_csv('DataFrame_Firmas.csv', index=False)

# Training the model

#### Libraries

In [106]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler

#### Separating the predictor variables (X) and the target variable (y)

In [107]:
X = data_frame_completo.iloc[:,1:15]
X

Unnamed: 0,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12
0,89401.0,1196.00,77343.0,71802.0,68519.0,65235.0,59617.0,56343.0,53357.0,50557.0,47956.0,57244.0,55767.0,54411.0
1,89401.0,1196.00,75624.0,69832.0,66454.0,63108.0,57309.0,54050.0,50913.0,47893.0,45031.0,54335.0,52598.0,50933.0
2,85728.5,1456.47,78115.0,72815.0,69442.0,65994.0,59750.0,56359.0,53209.0,50421.0,47884.0,57266.0,55670.0,54217.0
3,89401.0,1196.00,76647.0,70481.0,66758.0,63051.0,56611.0,53175.0,50050.0,47149.0,44470.0,54444.0,52771.0,51155.0
4,87465.5,1578.41,78157.0,72959.0,69448.0,65947.0,59892.0,56600.0,53631.0,50915.0,48376.0,57703.0,56166.0,54728.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
114,2629.0,1391.00,7869.0,9347.0,2365.0,6105.0,373.0,2842.0,262.0,8278.0,1575.0,2136.0,2304.0,6857.0
115,1548.0,6132.00,4886.0,3565.0,6101.0,4459.0,2820.0,2152.0,6885.0,1064.0,6756.0,4686.0,6462.0,2759.0
116,9417.0,3846.00,3602.0,6706.0,7124.0,4165.0,4803.0,3822.0,7890.0,9422.0,7792.0,3518.0,4069.0,8207.0
117,7777.0,616.00,1117.0,3406.0,4106.0,3132.0,6006.0,8545.0,6486.0,3568.0,2751.0,472.0,5045.0,9183.0


## Normalizing the data

In [108]:
data_frame_completo.iloc[:,1:15] = StandardScaler().fit_transform(data_frame_completo.iloc[:,1:15])
data_frame_completo.head()

Unnamed: 0,Firma,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12,Valido
0,1,1.003546,-0.734097,0.97459,0.970064,0.971303,0.971403,0.969712,0.963695,0.956798,0.948259,0.939229,0.97769,0.979131,0.982263,1
1,2,1.003546,-0.734097,0.927471,0.911832,0.90687,0.901337,0.886193,0.876078,0.858256,0.833133,0.808463,0.867829,0.856262,0.841236,1
2,3,0.916177,-0.644335,0.995751,1.000008,1.000103,0.996405,0.974524,0.964306,0.950831,0.942381,0.93601,0.97852,0.97537,0.974397,1
3,4,1.003546,-0.734097,0.955512,0.931016,0.916355,0.899459,0.860935,0.842643,0.82346,0.800981,0.783383,0.871945,0.86297,0.850238,1
4,5,0.9575,-0.602313,0.996903,1.004265,1.00029,0.994857,0.979663,0.973515,0.967846,0.96373,0.958005,0.995024,0.994601,0.995117,1
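Here the scaler is fitted on the whole dataset before it is split, so the test rows influence the mean and standard deviation used for training. A common way to avoid this leakage is to compute the statistics on the training rows only and reuse them for the test rows. The NumPy sketch below shows the arithmetic on made-up data; in scikit-learn the same effect is usually obtained by calling `fit` on the training split and `transform` on both splits.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(20, 3))  # made-up data
X_demo_train, X_demo_test = X_demo[:14], X_demo[14:]

# Statistics come from the training rows only
media_train = X_demo_train.mean(axis=0)
std_train = X_demo_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

X_demo_train_scaled = (X_demo_train - media_train) / std_train
X_demo_test_scaled = (X_demo_test - media_train) / std_train  # same statistics reused
print(X_demo_train_scaled.mean(axis=0).round(6))
```

The scaled test columns will generally not have mean 0 and standard deviation 1, which is expected.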


#### Reassigning X

In [109]:
X = data_frame_completo.iloc[:,1:15]
X

Unnamed: 0,area,perimetro,EE1,EE2,EE3,EE4,EE5,EE6,EE7,EE8,EE9,EE10,EE11,EE12
0,1.003546,-0.734097,0.974590,0.970064,0.971303,0.971403,0.969712,0.963695,0.956798,0.948259,0.939229,0.977690,0.979131,0.982263
1,1.003546,-0.734097,0.927471,0.911832,0.906870,0.901337,0.886193,0.876078,0.858256,0.833133,0.808463,0.867829,0.856262,0.841236
2,0.916177,-0.644335,0.995751,1.000008,1.000103,0.996405,0.974524,0.964306,0.950831,0.942381,0.936010,0.978520,0.975370,0.974397
3,1.003546,-0.734097,0.955512,0.931016,0.916355,0.899459,0.860935,0.842643,0.823460,0.800981,0.783383,0.871945,0.862970,0.850238
4,0.957500,-0.602313,0.996903,1.004265,1.000290,0.994857,0.979663,0.973515,0.967846,0.963730,0.958005,0.995024,0.994601,0.995117
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
114,-1.060773,-0.666897,-0.929753,-0.876067,-1.092867,-0.976414,-1.174129,-1.080618,-1.183993,-0.878844,-1.134289,-1.103514,-1.093745,-0.945975
115,-1.086490,0.966912,-1.011519,-1.046980,-0.976294,-1.030635,-1.085580,-1.106984,-0.916954,-1.190600,-0.902666,-1.007211,-0.932530,-1.112142
116,-0.899285,0.179127,-1.046715,-0.954134,-0.944374,-1.040320,-1.013822,-1.043172,-0.876432,-0.829406,-0.856351,-1.051321,-1.025312,-0.891234
117,-0.938301,-0.933972,-1.114831,-1.051680,-1.038543,-1.074348,-0.970290,-0.862702,-0.933041,-1.082388,-1.081715,-1.166356,-0.987470,-0.851659


#### Getting the target variable

In [110]:
y = data_frame_completo.iloc[:, -1]
y

0      1
1      1
2      1
3      1
4      1
      ..
114    0
115    0
116    0
117    0
118    0
Name: Valido, Length: 119, dtype: int64

## Splitting the dataset into 70% training and 30% test

In [111]:
X_train, X_test, y_train, y_test = train_test_split(X,y, train_size=0.7, random_state=13)
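With 119 rows, a 70% training share cannot be exact. The sketch below shows the arithmetic of a shuffled 70/30 split in plain NumPy: 83 rows for training and 36 for testing, which agrees with the 36 predictions in the confusion matrix reported later. It illustrates the sizes only; the seed here will not reproduce the rows that `train_test_split` selects with `random_state=13`.

```python
import numpy as np

n_samples = 119
train_size = 0.7

rng = np.random.default_rng(13)        # arbitrary seed, unrelated to scikit-learn's shuffle
indices = rng.permutation(n_samples)   # shuffled row positions

n_train = int(np.floor(train_size * n_samples))  # 83
train_idx, test_idx = indices[:n_train], indices[n_train:]
print(len(train_idx), len(test_idx))
```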

### Initializing the neural network

In [112]:
instancia_red = MLPClassifier(hidden_layer_sizes=(101,))
print(instancia_red)

MLPClassifier(hidden_layer_sizes=(101,))


In [113]:
instancia_red.fit(X_train, y_train)

### Function to measure how good the model is

In [114]:
def indices_general(MC, nombres=None):
    # Overall accuracy: correct predictions (diagonal) over all predictions
    precision_global = np.sum(MC.diagonal()) / np.sum(MC)
    error_global = 1 - precision_global
    # Diagonal over row sums: share of each true class predicted correctly (per-class recall)
    precision_categorica = pd.DataFrame(MC.diagonal()/np.sum(MC,axis=1)).T
    if nombres is not None:
        precision_categorica.columns = nombres
    return {
        "Confusion matrix": MC,
        "Overall accuracy": precision_global,
        "Overall error": error_global,
        "Per-class recall": precision_categorica
    }

#### Model accuracy

In [115]:
prediccion = instancia_red.predict(X_test)
MC = confusion_matrix(y_test, prediccion)
indices = indices_general(MC,list(np.unique(y)))
for k in indices:
    print("\n%s:\n%s"%(k,str(indices[k])))


Confusion matrix:
[[18  0]
 [ 0 18]]

Overall accuracy:
1.0

Overall error:
0.0

Per-class recall:
     0    1
0  1.0  1.0
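The per-category score returned by `indices_general` divides the diagonal of the confusion matrix by its row sums. Since scikit-learn puts true labels on the rows, that quantity is what is usually called recall, while precision divides by the column sums instead. With a perfect matrix the two coincide, so the sketch below uses a made-up matrix with a few errors to show the difference.

```python
import numpy as np

# Made-up confusion matrix (rows: true class, columns: predicted class)
MC_demo = np.array([[16, 2],
                    [ 1, 17]])

accuracy_demo = MC_demo.trace() / MC_demo.sum()           # 33 correct out of 36
recall_demo = MC_demo.diagonal() / MC_demo.sum(axis=1)    # per true class (row sums)
precision_demo = MC_demo.diagonal() / MC_demo.sum(axis=0) # per predicted class (column sums)

print(accuracy_demo, recall_demo, precision_demo)
```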



# References
- OpenCV Resize image using cv2.resize(): https://www.tutorialkart.com/opencv/python/opencv-python-resize-image/#gsc.tab=0
- Morphological Transformations: https://docs.opencv.org/4.x/d9/d61/tutorial_py_morphological_ops.html
- OpenCV Morphological Operations: https://pyimagesearch.com/2021/04/28/opencv-morphological-operations/
- sklearn.neural_network.MLPClassifier: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html
