# **Lab 5: Neural Networks**
Arancibia Aguilar Daniel Andree

Computer Science Engineering

For the neural networks, I will use the following dataset:

https://www.kaggle.com/datasets/anseldsouza/water-pump-rul-predictive-maintenance

Converted into a 10-class classification problem (labels 0 to 9).

I run tests with different hyperparameters to find the best-performing configuration.


In [44]:
import os
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from matplotlib import pyplot
from scipy import optimize

%matplotlib inline

In [45]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [46]:
!ls
%mkdir data
!ls

data  gdrive  sample_data
mkdir: cannot create directory ‘data’: File exists
data  gdrive  sample_data


**I preprocess the dataset, turning the `rul` column into class labels for one-vs-all classification**

In [78]:
url = "/content/gdrive/MyDrive/SIS420/Lab5/rul_hrs.csv"
dataframe = pd.read_csv(url)
# Define the ranges and the class label each range maps to
rango_valores = [
    (range(0, 84), 0),
    (range(84, 167), 1),
    (range(167, 251), 2),
    (range(251, 334), 3),
    (range(334, 418), 4),
    (range(418, 502), 5),
    (range(502, 585), 6),
    (range(585, 669), 7),
    (range(669, 753), 8),
    (range(753, 838), 9)
]
# Iterate over the ranges and labels, replacing the values in the "rul" column
for rango, valor in rango_valores:
    dataframe['rul'] = dataframe['rul'].replace(rango, valor)

print(dataframe)

        Unnamed: 0  timestamp  sensor_00  sensor_01  sensor_02  sensor_03  \
0                0      43191          2         47         53         46   
1                1      43191          2         47         53         46   
2                2      43191          2         47         53         46   
3                3      43191          2         47         53         46   
4                4      43191          2         47         53         46   
...            ...        ...        ...        ...        ...        ...   
166436      166436      43307          2         46         53         44   
166437      166437      43307          2         46         53         44   
166438      166438      43307          2         46         53         44   
166439      166439      43307          2         46         53         44   
166440      166440      43307          2         46         53         44   

        sensor_04  sensor_05  sensor_06  sensor_07  ...  sensor_42  sensor_
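
The range-by-range replacement above can also be written as a single binning call. This is a minimal sketch, not the method used in this notebook: it assumes the `rul` values are non-negative and below 838, reuses the same bin boundaries, and runs on a small hypothetical array (`rul_sample`) rather than on the dataset.

```python
import numpy as np

# Left edges of classes 1..9, copied from the ranges above (class 0 starts at 0).
edges = np.array([84, 167, 251, 334, 418, 502, 585, 669, 753])

# Hypothetical RUL values chosen to hit the bin boundaries; not dataset rows.
rul_sample = np.array([0, 83, 84, 166, 167, 500, 752, 753, 837])

# side='right' sends a value equal to an edge into the bin that starts there,
# which matches range(start, stop): start inclusive, stop exclusive.
labels = np.searchsorted(edges, rul_sample, side='right')
print(labels)
```

Unlike `replace` with integer ranges, this form would also assign a class to non-integer values.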

In [79]:
X = dataframe.drop(columns=['rul'])
y = dataframe['rul']
print(X.shape)
print(y.shape)
X=X.values
y=y.values
y = np.squeeze(y)

(166441, 52)
(166441,)


The features are normalized to avoid overflow when computing the cost function and to improve accuracy.

In [80]:
def featureNormalize(X):
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)
    epsilon = 1e-8
    sigma += epsilon
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma

In [81]:
X, mu, sigma = featureNormalize(X)
print(X)

[[-1.7320404  -1.73196589 -0.23395421 ...  0.56148079  1.07881881
  -0.00905751]
 [-1.73201959 -1.73196589 -0.23395421 ...  0.56148079  1.07881881
  -0.00905751]
 [-1.73199878 -1.73196589 -0.23395421 ...  0.53020102  1.00861677
   0.01600662]
 ...
 [ 1.73199878  1.74447716 -0.23395421 ...  1.98471052  1.35962696
   0.04107075]
 [ 1.73201959  1.74447716 -0.23395421 ...  1.92215097  1.429829
   0.06613488]
 [ 1.7320404   1.74447716 -0.23395421 ...  1.68755266  1.21922289
   0.03271604]]


**I split the data into training and test sets**

In [51]:
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape)
print(Y_train.shape)
print(X.shape)
y = np.array([int(e) for e in y])
print(y.shape)
y = np.squeeze(y)
print(y)

(133152, 52)
(133152,)
(166441, 52)
(166441,)
[3 3 3 ... 0 0 0]


# **Function definitions**

In [52]:
def sigmoid(z):
    """
    Computes the sigmoid of z.
    """
    return 1.0 / (1.0 + np.exp(-z))


def sigmoidGradient(z):
    """
    Computes the gradient of the sigmoid function evaluated at z.
    """
    s = sigmoid(z)
    return s * (1 - s)
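
One way to sanity-check the sigmoid gradient is to compare it with a central finite difference. The sketch below restates both functions so it runs on its own; the step size `h` is an assumption chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoidGradient(z):
    s = sigmoid(z)
    return s * (1 - s)

z = np.array([-1, -0.5, 0, 0.5, 1])
h = 1e-5  # finite-difference step (assumed value)

# Central difference approximation of d(sigmoid)/dz at each point.
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
max_diff = np.max(np.abs(numeric - sigmoidGradient(z)))
print(max_diff)
```

A very small `max_diff` indicates that the analytic expression `s * (1 - s)` agrees with the numerical slope.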

In [53]:
def nnCostFunction(nn_params,
                   input_layer_size,
                   hidden_layer_size,
                   num_labels,
                   X_train, Y_train, lambda_=0.0):

    Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                        (hidden_layer_size, (input_layer_size + 1)))

    Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                        (num_labels, (hidden_layer_size + 1)))

    m = Y_train.size

    J = 0
    Theta1_grad = np.zeros(Theta1.shape)
    Theta2_grad = np.zeros(Theta2.shape)

    a1 = np.concatenate([np.ones((m, 1)), X_train], axis=1)

    a2 = sigmoid(a1.dot(Theta1.T))
    a2 = np.concatenate([np.ones((a2.shape[0], 1)), a2], axis=1)

    a3 = sigmoid(a2.dot(Theta2.T))

    # One-hot encode the labels: row i has a 1 in column Y_train[i]
    y_matrix = Y_train.reshape(-1)
    y_matrix = np.eye(num_labels)[y_matrix]

    # Regularization term (bias columns are excluded)
    reg_term = (lambda_ / (2 * m)) * (np.sum(np.square(Theta1[:, 1:])) + np.sum(np.square(Theta2[:, 1:])))

    # Cross-entropy cost plus regularization
    J = (-1 / m) * np.sum((np.log(a3) * y_matrix) + np.log(1 - a3) * (1 - y_matrix)) + reg_term

    # Backpropagation

    delta_3 = a3 - y_matrix
    delta_2 = delta_3.dot(Theta2)[:, 1:] * sigmoidGradient(a1.dot(Theta1.T))

    Delta1 = delta_2.T.dot(a1)
    Delta2 = delta_3.T.dot(a2)

    # Add regularization to the gradient (bias columns are excluded)

    Theta1_grad = (1 / m) * Delta1
    Theta1_grad[:, 1:] = Theta1_grad[:, 1:] + (lambda_ / m) * Theta1[:, 1:]

    Theta2_grad = (1 / m) * Delta2
    Theta2_grad[:, 1:] = Theta2_grad[:, 1:] + (lambda_ / m) * Theta2[:, 1:]

    grad = np.concatenate([Theta1_grad.ravel(), Theta2_grad.ravel()])

    return J, grad
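
A standard way to gain confidence in a backpropagation implementation is gradient checking: compare the analytic gradient with a finite-difference estimate on a tiny network. The sketch below is self-contained, so it restates the cost and gradient in compact form as `cost_and_grad`; that name, `numerical_grad`, the toy sizes, and the step `h` are all assumptions made for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(params, n_in, n_hidden, n_labels, X, y, lambda_=0.0):
    """Compact restatement of the regularized cost and its gradient."""
    split = n_hidden * (n_in + 1)
    Theta1 = params[:split].reshape(n_hidden, n_in + 1)
    Theta2 = params[split:].reshape(n_labels, n_hidden + 1)
    m = y.size
    # Forward propagation
    a1 = np.hstack([np.ones((m, 1)), X])
    z2 = a1 @ Theta1.T
    a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])
    a3 = sigmoid(a2 @ Theta2.T)
    Y = np.eye(n_labels)[y]  # one-hot labels
    reg = (lambda_ / (2 * m)) * (np.sum(Theta1[:, 1:] ** 2) + np.sum(Theta2[:, 1:] ** 2))
    J = (-1 / m) * np.sum(Y * np.log(a3) + (1 - Y) * np.log(1 - a3)) + reg
    # Backpropagation
    d3 = a3 - Y
    d2 = (d3 @ Theta2)[:, 1:] * sigmoid(z2) * (1 - sigmoid(z2))
    g1 = (d2.T @ a1) / m
    g2 = (d3.T @ a2) / m
    g1[:, 1:] += (lambda_ / m) * Theta1[:, 1:]
    g2[:, 1:] += (lambda_ / m) * Theta2[:, 1:]
    return J, np.concatenate([g1.ravel(), g2.ravel()])

def numerical_grad(f, params, h=1e-4):
    """Central-difference estimate of the gradient of f(params)[0]."""
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = h
        grad[i] = (f(params + step)[0] - f(params - step)[0]) / (2 * h)
    return grad

# Tiny hypothetical problem: 3 inputs, 4 hidden units, 3 classes, 5 examples.
rng = np.random.default_rng(0)
n_in, n_hidden, n_labels, m = 3, 4, 3, 5
X_small = rng.normal(size=(m, n_in))
y_small = rng.integers(0, n_labels, size=m)
params = rng.uniform(-0.12, 0.12, size=n_hidden * (n_in + 1) + n_labels * (n_hidden + 1))

f = lambda p: cost_and_grad(p, n_in, n_hidden, n_labels, X_small, y_small, lambda_=0.1)
analytic = f(params)[1]
numeric = numerical_grad(f, params)
rel_diff = np.linalg.norm(numeric - analytic) / np.linalg.norm(numeric + analytic)
print(rel_diff)
```

The check is only practical on a small network, because the numerical gradient needs two cost evaluations per parameter.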

In [54]:
def randInitializeWeights(L_in, L_out, epsilon_init=0.12):
    """
    Randomly initialize the weights of a layer in a neural network.

    Parameters
    ----------
    L_in : int
        Number of incoming connections.

    L_out : int
        Number of outgoing connections.

    epsilon_init : float, optional
        Range of values which the weight can take from a uniform
        distribution.

    Returns
    -------
    W : array_like
        The weights initialized to random values. W has shape
        (L_out, 1 + L_in) because the first column of W handles the "bias" terms.
    """

    # Uniform values in [-epsilon_init, epsilon_init)
    W = np.random.rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init

    return W
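
The default `epsilon_init=0.12` is a fixed range. One commonly cited heuristic instead scales the range with the layer sizes, using `sqrt(6) / sqrt(L_in + L_out)`. The sketch below shows that heuristic as an assumption for illustration; it is not what this notebook uses, and `heuristic_epsilon` and `rand_init` are hypothetical helper names.

```python
import numpy as np

def heuristic_epsilon(L_in, L_out):
    """Range suggested by the sqrt(6) / sqrt(L_in + L_out) rule of thumb."""
    return np.sqrt(6) / np.sqrt(L_in + L_out)

def rand_init(L_in, L_out, epsilon_init, rng):
    """Uniform weights in [-epsilon_init, epsilon_init), bias column included."""
    return rng.uniform(-epsilon_init, epsilon_init, size=(L_out, 1 + L_in))

rng = np.random.default_rng(0)
eps1 = heuristic_epsilon(52, 50)   # sizes of the first layer in Test 1
W1 = rand_init(52, 50, eps1, rng)
print(W1.shape, round(eps1, 4))
```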

In [55]:
def predict(Theta1, Theta2, X):
    """
    Predict the label of an input given a trained neural network
    Outputs the predicted label of X given the trained weights of a neural
    network(Theta1, Theta2)
    """
    m = X.shape[0]
    num_labels = Theta2.shape[0]
    p = np.zeros(m)
    h1 = sigmoid(np.dot(np.concatenate([np.ones((m, 1)), X], axis=1), Theta1.T))
    h2 = sigmoid(np.dot(np.concatenate([np.ones((m, 1)), h1], axis=1), Theta2.T))
    p = np.argmax(h2, axis=1)
    return p
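
A single accuracy figure hides which classes are confused with each other. As a minimal sketch, a confusion matrix and per-class recall can be built from the true labels and the output of `predict`; the helper `confusion_matrix` and the toy label arrays below are hypothetical and only illustrate the idea.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_labels):
    """cm[i, j] counts examples of true class i predicted as class j."""
    cm = np.zeros((num_labels, num_labels), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulates repeated index pairs
    return cm

# Toy labels standing in for Y_test and predict(Theta1, Theta2, X_test).
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix(y_true, y_pred, 3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(per_class_recall)
```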

# **Test 1: hyperparameters**

In [56]:
# Set the required parameters
input_layer_size  = 52
hidden_layer_size = 50
num_labels = 10

# Random weights in [0, 1), used only for the initial cost evaluation below
pesos = {}
pesos['Theta1'] = np.random.rand(hidden_layer_size, input_layer_size + 1)
pesos['Theta2'] = np.random.rand(num_labels, hidden_layer_size + 1)

Theta1, Theta2 = pesos['Theta1'], pesos['Theta2']


# Unroll the parameters
print(Theta1.ravel().shape)
print(Theta2.ravel().shape)

nn_params = np.concatenate([Theta1.ravel(), Theta2.ravel()])
print(nn_params.shape)

(2650,)
(510,)
(3160,)
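
The shapes printed above follow directly from the layer sizes: `Theta1` has `hidden_layer_size * (input_layer_size + 1)` entries and `Theta2` has `num_labels * (hidden_layer_size + 1)`. A small sketch of that arithmetic, using a hypothetical helper `count_params` and the hidden layer sizes tried in this notebook:

```python
def count_params(input_layer_size, hidden_layer_size, num_labels):
    """Return (Theta1 size, Theta2 size, total); the +1 accounts for the bias unit."""
    theta1 = hidden_layer_size * (input_layer_size + 1)
    theta2 = num_labels * (hidden_layer_size + 1)
    return theta1, theta2, theta1 + theta2

for hidden in (50, 42, 18, 62):
    print(hidden, count_params(52, hidden, 10))
```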


In [57]:
lambda_ = 0.1
J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X_train, Y_train, lambda_)
print('Cost at parameters: %.6f ' % J)

Cost at parameters: 164.177366 


In [58]:
z = np.array([-1, -0.5, 0, 0.5, 1])
g = sigmoidGradient(z)
print('Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:\n  ')
print(g)

Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:
  
[0.19661193 0.23500371 0.25       0.23500371 0.19661193]


In [59]:
print('Initializing neural network parameters...')

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size)
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels)

# Unroll the parameters
initial_nn_params = np.concatenate([initial_Theta1.ravel(), initial_Theta2.ravel()], axis=0)

Initializing neural network parameters...


In [60]:
# Set lambda and the maximum number of optimizer iterations
lambda_ = 0.1
max_iterations = 200

# Cost function with the model settings bound.
# Note: it is evaluated on the full X, y rather than X_train, Y_train,
# so the rows in X_test are also seen during optimization.
costFunction = lambda p: nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda_)

# Minimize the cost function
res = optimize.minimize(costFunction, initial_nn_params, jac=True,
                        method='L-BFGS-B',  # method switched to 'L-BFGS-B' so the number of iterations can be set
                        options={'maxiter': max_iterations})
# Get the solution of the optimization
nn_params = res.x

# Obtain Theta1 and Theta2 back from nn_params
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, (input_layer_size + 1)))

Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                    (num_labels, (hidden_layer_size + 1)))
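
With `jac=True`, `optimize.minimize` expects the objective to return both the cost and its gradient, which is why `nnCostFunction` returns `J, grad`. The sketch below shows the same calling pattern on a toy quadratic (`quadratic` and `res_demo` are hypothetical names), and reads the result fields that indicate whether the optimizer converged or stopped at the iteration cap.

```python
import numpy as np
from scipy import optimize

def quadratic(p):
    """Return (cost, gradient) for f(p) = sum((p - 3)^2)."""
    diff = p - 3.0
    return np.sum(diff ** 2), 2.0 * diff

res_demo = optimize.minimize(quadratic, np.zeros(4), jac=True,
                             method='L-BFGS-B',
                             options={'maxiter': 200})

# success, nit (iterations used), fun (final cost), and message describe the run.
print(res_demo.success, res_demo.nit, res_demo.fun)
print(res_demo.message)
```

Inspecting `res.nit` and `res.message` after training would show whether a run ended because it converged or because it reached `max_iterations`.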

In [61]:
pred = predict(Theta1, Theta2, X_train)
print(pred)
print('Training accuracy: %f' % (np.mean(pred == Y_train) * 100))

pred = predict(Theta1, Theta2, X_test)
print(pred)
print('Test accuracy: %f' % (np.mean(pred == Y_test) * 100))


[0 4 1 ... 1 3 1]
Training accuracy: 98.290675
[0 1 2 ... 1 3 2]
Test accuracy: 98.350807


# **Test 2: changing the hyperparameters**

In [62]:
# Set the required parameters
input_layer_size  = 52
hidden_layer_size = 42
num_labels = 10

# Random weights in [0, 1), used only for the initial cost evaluation below
pesos = {}
pesos['Theta1'] = np.random.rand(hidden_layer_size, input_layer_size + 1)
pesos['Theta2'] = np.random.rand(num_labels, hidden_layer_size + 1)

Theta1, Theta2 = pesos['Theta1'], pesos['Theta2']
# Unroll the parameters
print(Theta1.ravel().shape)
print(Theta2.ravel().shape)

nn_params = np.concatenate([Theta1.ravel(), Theta2.ravel()])
print(nn_params.shape)

(2226,)
(430,)
(2656,)


In [63]:
lambda_ = 0.02
J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X_train, Y_train, lambda_)
print('Cost at parameters: %.6f ' % J)

Cost at parameters: 146.523772 


In [64]:
print('Initializing neural network parameters...')

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size)
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels)

# Unroll the parameters
initial_nn_params = np.concatenate([initial_Theta1.ravel(), initial_Theta2.ravel()], axis=0)

Initializing neural network parameters...


In [65]:
# Set lambda and the maximum number of optimizer iterations
lambda_ = 0.02
max_iterations = 200

# Cost function with the model settings bound.
# Note: it is evaluated on the full X, y rather than X_train, Y_train,
# so the rows in X_test are also seen during optimization.
costFunction = lambda p: nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda_)

# Minimize the cost function
res = optimize.minimize(costFunction, initial_nn_params, jac=True,
                        method='L-BFGS-B',  # method switched to 'L-BFGS-B' so the number of iterations can be set
                        options={'maxiter': max_iterations})
# Get the solution of the optimization
nn_params = res.x

# Obtain Theta1 and Theta2 back from nn_params
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, (input_layer_size + 1)))

Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                    (num_labels, (hidden_layer_size + 1)))

In [66]:
pred = predict(Theta1, Theta2, X_train)
print(pred)
print('Training accuracy: %f' % (np.mean(pred == Y_train) * 100))

pred = predict(Theta1, Theta2, X_test)
print(pred)
print('Test accuracy: %f' % (np.mean(pred == Y_test) * 100))


[0 4 1 ... 1 3 1]
Training accuracy: 97.495344
[0 1 2 ... 1 3 2]
Test accuracy: 97.581784


# **Test 3: changing the hyperparameters**

In [67]:
# Set the required parameters
input_layer_size  = 52
hidden_layer_size = 18
num_labels = 10

# Random weights in [0, 1), used only for the initial cost evaluation below
pesos = {}
pesos['Theta1'] = np.random.rand(hidden_layer_size, input_layer_size + 1)
pesos['Theta2'] = np.random.rand(num_labels, hidden_layer_size + 1)

Theta1, Theta2 = pesos['Theta1'], pesos['Theta2']
# Unroll the parameters
print(Theta1.ravel().shape)
print(Theta2.ravel().shape)

nn_params = np.concatenate([Theta1.ravel(), Theta2.ravel()])
print(nn_params.shape)

(954,)
(190,)
(1144,)


In [68]:
lambda_ = 0.001
J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X_train, Y_train, lambda_)
print('Cost at parameters: %.6f ' % J)

Cost at parameters: 61.448132 


In [69]:
print('Initializing neural network parameters...')

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size)
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels)

# Unroll the parameters
initial_nn_params = np.concatenate([initial_Theta1.ravel(), initial_Theta2.ravel()], axis=0)

Initializing neural network parameters...


In [70]:
# Set lambda and the maximum number of optimizer iterations
lambda_ = 0.001
max_iterations = 200

# Cost function with the model settings bound.
# Note: it is evaluated on the full X, y rather than X_train, Y_train,
# so the rows in X_test are also seen during optimization.
costFunction = lambda p: nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda_)

# Minimize the cost function
res = optimize.minimize(costFunction, initial_nn_params, jac=True,
                        method='L-BFGS-B',  # method switched to 'L-BFGS-B' so the number of iterations can be set
                        options={'maxiter': max_iterations})
# Get the solution of the optimization
nn_params = res.x

# Obtain Theta1 and Theta2 back from nn_params
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, (input_layer_size + 1)))

Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                    (num_labels, (hidden_layer_size + 1)))

In [71]:
pred = predict(Theta1, Theta2, X_train)
print(pred)
print('Training accuracy: %f' % (np.mean(pred == Y_train) * 100))

pred = predict(Theta1, Theta2, X_test)
print(pred)
print('Test accuracy: %f' % (np.mean(pred == Y_test) * 100))


[0 4 1 ... 1 4 1]
Training accuracy: 93.325673
[0 1 2 ... 1 3 2]
Test accuracy: 93.222987


# **Test 4: changing the hyperparameters**

In [72]:
# Set the required parameters
input_layer_size  = 52
hidden_layer_size = 62
num_labels = 10

# Random weights in [0, 1), used only for the initial cost evaluation below
pesos = {}
pesos['Theta1'] = np.random.rand(hidden_layer_size, input_layer_size + 1)
pesos['Theta2'] = np.random.rand(num_labels, hidden_layer_size + 1)

Theta1, Theta2 = pesos['Theta1'], pesos['Theta2']
# Unroll the parameters
print(Theta1.ravel().shape)
print(Theta2.ravel().shape)

nn_params = np.concatenate([Theta1.ravel(), Theta2.ravel()])
print(nn_params.shape)

(3286,)
(630,)
(3916,)


In [73]:
lambda_ = 0.5
J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X_train, Y_train, lambda_)
print('Cost at parameters: %.6f ' % J)

Cost at parameters: 206.133261 


In [74]:
print('Initializing neural network parameters...')

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size)
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels)

# Unroll the parameters
initial_nn_params = np.concatenate([initial_Theta1.ravel(), initial_Theta2.ravel()], axis=0)

Initializing neural network parameters...


In [75]:
# Set lambda and the maximum number of optimizer iterations
lambda_ = 0.5
max_iterations = 200

# Cost function with the model settings bound.
# Note: it is evaluated on the full X, y rather than X_train, Y_train,
# so the rows in X_test are also seen during optimization.
costFunction = lambda p: nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda_)

# Minimize the cost function
res = optimize.minimize(costFunction, initial_nn_params, jac=True,
                        method='L-BFGS-B',  # method switched to 'L-BFGS-B' so the number of iterations can be set
                        options={'maxiter': max_iterations})
# Get the solution of the optimization
nn_params = res.x

# Obtain Theta1 and Theta2 back from nn_params
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, (input_layer_size + 1)))

Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                    (num_labels, (hidden_layer_size + 1)))

In [76]:
pred = predict(Theta1, Theta2, X_train)
print(pred)
print('Training accuracy: %f' % (np.mean(pred == Y_train) * 100))

pred = predict(Theta1, Theta2, X_test)
print(pred)
print('Test accuracy: %f' % (np.mean(pred == Y_test) * 100))


[0 4 1 ... 1 3 1]
Training accuracy: 98.921533
[0 1 2 ... 1 3 2]
Test accuracy: 98.927574


# **Results of the 4 tests**
## **Test 1 hyperparameters**

Hidden units = 50, lambda = 0.1

[0 4 1 ... 1 3 1]
**Training accuracy: 98.290675**

[0 1 2 ... 1 3 2]
**Test accuracy: 98.350807**

## **Test 2 hyperparameters**

Hidden units = 42, lambda = 0.02

[0 4 1 ... 1 3 1]
**Training accuracy: 97.495344**

[0 1 2 ... 1 3 2]
**Test accuracy: 97.581784**

## **Test 3 hyperparameters**

Hidden units = 18, lambda = 0.001

[0 4 1 ... 1 4 1]
**Training accuracy: 93.325673**

[0 1 2 ... 1 3 2]
**Test accuracy: 93.222987**

## **Test 4 hyperparameters**

Hidden units = 62, lambda = 0.5

[0 4 1 ... 1 3 1]
**Training accuracy: 98.921533**

[0 1 2 ... 1 3 2]
**Test accuracy: 98.927574**

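The tests show that the runs with a higher lambda reached higher accuracy on both the training and test sets. Changing the number of hidden units produced smaller differences among the larger networks (42, 50, and 62 units), but the smallest hidden layer (18 units) caused a clear drop in accuracy, to about 93%. Because lambda and the hidden layer size were changed together in every test, their individual effects cannot be fully separated from these four runs.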
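
To compare the four runs side by side, the reported figures can be collected and ranked by test accuracy. The sketch below only restates the numbers printed above; the `results` list and its layout are introduced here for illustration.

```python
# (hidden units, lambda, training accuracy %, test accuracy %), as reported above
results = [
    (50, 0.1, 98.290675, 98.350807),
    (42, 0.02, 97.495344, 97.581784),
    (18, 0.001, 93.325673, 93.222987),
    (62, 0.5, 98.921533, 98.927574),
]

# Rank the runs from best to worst test accuracy.
ranked = sorted(results, key=lambda r: r[3], reverse=True)
for hidden, lam, train_acc, test_acc in ranked:
    print(f"hidden={hidden:>2}  lambda={lam:<6}  train={train_acc:.2f}  test={test_acc:.2f}")
```

In this ranking the hidden layer size and lambda decrease together from the best run to the worst, so varying one while holding the other fixed would be needed to attribute the change to either.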