
**Perceptron**

The Perceptron is a simple yet fundamental supervised learning algorithm for binary classification. It looks for a linear decision boundary that separates the data points into two classes based on their features. Training iterates over the data points one at a time: the Perceptron predicts a label for each point and, whenever the prediction is wrong, updates its weights and bias term to shift the boundary, effectively "learning" from its mistakes. The process stops at convergence (a full pass with no misclassifications, which is only guaranteed when the classes are linearly separable) or after a specified number of epochs.
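As a minimal sketch of this update rule, the snippet below trains a perceptron on a tiny, hand-made, linearly separable dataset. The data, the helper name `perceptron_fit`, the early stop, and the choice to treat a score of exactly 0 as class +1 are illustrative assumptions, not part of this notebook's code.

```python
import numpy as np

def perceptron_fit(X, y, learning_rate=0.1, epochs=20):
    """Train a perceptron; labels in y must be +1 or -1 (hypothetical helper)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    mistakes_per_epoch = []
    for _ in range(epochs):
        mistakes = 0
        for xi, target in zip(X, y):
            # Assumption: a score of exactly 0 is predicted as +1.
            prediction = 1 if np.dot(xi, w) + b >= 0 else -1
            if prediction != target:
                # Shift the boundary toward the misclassified point.
                w += learning_rate * target * xi
                b += learning_rate * target
                mistakes += 1
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:  # converged: a full pass with no mistakes
            break
    return w, b, mistakes_per_epoch

# Toy, linearly separable data (made up for illustration).
X_toy = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])

w, b, history = perceptron_fit(X_toy, y_toy)
predictions = np.where(X_toy @ w + b >= 0, 1, -1)
print("mistakes per epoch:", history)
print("predictions:", predictions)
```

Because the toy classes are separable, the mistake count reaches zero and training stops early. On data that is not separable, the loop would simply run for the full number of epochs.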

**Pseudocode**

Import the necessary libraries, such as pandas, numpy, and matplotlib.

Load the dataset from the CSV file 'wdbc.data', which has no header row.

Convert the diagnosis values in column 1 ('M' for malignant, 'B' for benign) into numerical labels (1 and -1).

Extract the features (columns 2 to 31) and labels (column 1) from the dataset.

Calculate the number of samples for training and testing (80% for training, 20% for testing).

Shuffle the data randomly to avoid biases.

Split the data into training and testing sets.

Define a function to train a perceptron, which adjusts weights and bias to classify the data.

Train the perceptron with a learning rate of 0.1 over 100 epochs.

Calculate the accuracy on the test set and a loss value.

Print the perceptron's accuracy and the loss function value.
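The shuffle-and-split steps above can be sketched as follows. This version uses a seeded `np.random.default_rng` so that the split is reproducible between runs. The seed value, the helper name `shuffle_split`, and the random stand-in data are assumptions for illustration, not this notebook's settings.

```python
import numpy as np

def shuffle_split(X, y, train_fraction=0.8, seed=0):
    """Shuffle rows and split them into train/test parts (hypothetical helper)."""
    rng = np.random.default_rng(seed)       # seeded for reproducibility
    indices = rng.permutation(X.shape[0])   # shuffled row indices
    train_size = int(train_fraction * X.shape[0])
    train_idx, test_idx = indices[:train_size], indices[train_size:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Random stand-in data shaped like the WDBC features (569 samples, 30 features).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(569, 30))
y_demo = demo_rng.choice([-1, 1], size=569)

X_tr, y_tr, X_te, y_te = shuffle_split(X_demo, y_demo)
print(X_tr.shape, X_te.shape)
```

Indexing both `X` and `y` with the same shuffled indices keeps each feature row paired with its label.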

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the dataset from the .data file (no header row)
data = pd.read_csv('wdbc.data', header=None)

# Map the diagnosis labels to numeric classes: M -> 1, B -> -1
data[1] = data[1].map({'M': 1, 'B': -1})

# Extract the features (columns 2 to 31) and the labels (column 1)
X = data.iloc[:, 2:32].to_numpy()
y = data.iloc[:, 1].to_numpy()

total_samples = X.shape[0]

# Compute the training and test set sizes (80% / 20%)
train_size = int(0.8 * total_samples)
test_size = total_samples - train_size

# Shuffle the sample indices (unseeded, so the split differs between runs)
indices = np.arange(total_samples)
np.random.shuffle(indices)

# Split the data
X_train = X[indices[:train_size]]
y_train = y[indices[:train_size]]
X_test = X[indices[train_size:]]
y_test = y[indices[train_size:]]

# Define the perceptron training function
def train_perceptron(X, y, learning_rate, epochs):
    w = np.zeros(X.shape[1])  # one weight per feature
    b = 0
    errors = []  # number of updates (misclassifications) per epoch
    for _ in range(epochs):
        total_error = 0
        for xi, target in zip(X, y):
            # update is 0 when the prediction matches the target
            update = learning_rate * (target - np.sign(np.dot(xi, w) + b))
            w += update * xi
            b += update
            total_error += int(update != 0)
        errors.append(total_error)
    return w, b, errors

# Train the perceptron
learning_rate = 0.1
epochs = 100
weights, bias, errors = train_perceptron(X_train, y_train, learning_rate, epochs)


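# Predict class labels for a set of samples
def predict(X, weights, bias):
    z = np.dot(X, weights) + bias
    return np.sign(z)  # +1 or -1 (0 only if z is exactly 0)

def loss(X, y, weights, bias):
    # Hinge-style loss applied to the thresholded predictions:
    # each sample contributes 0 if correct and 2 if misclassified
    y_pred = np.sign(np.dot(X, weights) + bias)
    hinge_loss = np.maximum(0, 1 - y * y_pred)
    average_loss = np.mean(hinge_loss)
    return average_loss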


# Compute the accuracy and the loss on the test set
y_pred = predict(X_test, weights, bias)
accuracy = np.mean(y_pred == y_test)
loss_value = loss(X_test, y_test, weights, bias)

print(f'Perceptron accuracy: {accuracy * 100:.2f}%')
print(f'Loss function value: {loss_value:.2f}')


Perceptron accuracy: 88.60%
Loss function value: 0.23


**Loss function**

The loss reported by this code is based on the hinge loss, a measure of the error between predictions and true labels in binary classification. The hinge loss is defined as `max(0, 1 - y * f(x))`: the maximum of zero and the difference between 1 and the product of the true label `y` and the predicted value. In its standard form, the predicted value is the raw score `w · x + b`, so the loss penalizes predictions that fall on the wrong side of the decision boundary and also correct ones that lie within the margin. This code instead uses the thresholded prediction `np.sign(w · x + b)`, so each sample contributes 0 when it is classified correctly and 2 when it is misclassified. The average is therefore twice the test error rate, which matches the output above: 2 × (1 − 0.8860) ≈ 0.23.
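As a small illustration, the sketch below compares the hinge loss computed on raw scores `w · x + b` with the same formula applied to sign-thresholded predictions, which is what the `loss` function in the code above does. The labels and scores are made-up values, and both function names are hypothetical.

```python
import numpy as np

def hinge_on_scores(y, scores):
    """Standard hinge loss: uses the raw score w.x + b."""
    return np.mean(np.maximum(0, 1 - y * scores))

def hinge_on_signs(y, scores):
    """Same formula applied to sign(score), as in the notebook's loss()."""
    return np.mean(np.maximum(0, 1 - y * np.sign(scores)))

# Made-up labels and raw scores; only the last sample is misclassified.
y_true = np.array([1, 1, -1, -1])
scores = np.array([2.0, 0.5, -3.0, 1.0])

error_rate = np.mean(np.sign(scores) != y_true)

print(hinge_on_scores(y_true, scores))  # 0.625
print(hinge_on_signs(y_true, scores))   # 0.5
print(2 * error_rate)                   # 0.5
```

The second sample is classified correctly but has a score of only 0.5, so the standard hinge loss still charges it 0.5 for sitting inside the margin. The thresholded version ignores that and reduces to twice the error rate.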