# Notebook ICD - 22

### Libraries

In [1]:
import numpy as np
import pandas as pd

## Perceptron from scratch

The Perceptron is a foundational algorithm in ML, representing one of the simplest forms of neural networks. It is a type of linear classifier used for binary classification tasks, where it determines the decision boundary based on a linear combination of input features. The main steps in the Perceptron algorithm include:

Initialization: The algorithm begins by initializing weights (including the bias term) to zero or small random values. These weights correspond to the influence each feature has on the final classification.

Prediction: For each input, the Perceptron computes a weighted sum of the features and adds a bias. This result is then passed through a step function (the activation function), where it outputs 1 if the result is greater than or equal to zero, and 0 otherwise.

Training and Weight Update: During training, the Perceptron iteratively updates its weights based on the difference between the predicted output and the actual target. If the prediction is incorrect, the weights are adjusted to reduce future errors for that specific example. This process repeats over multiple iterations (epochs) until the algorithm converges (i.e., when all examples are classified correctly or the maximum number of iterations is reached).

Limitations: A key limitation of the Perceptron algorithm is that it can only solve linearly separable problems. For datasets that are not linearly separable, the algorithm will not converge, leading to suboptimal performance.

In [2]:
class Perceptron:
    def __init__(self, learning_rate=0.1, n_iterations=100):
        # Initialize the learning rate and the number of iterations (epochs)
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
    
    def fit(self, X, y):
        # Initialize weights to zero, including the bias term (w0)
        self.weights = np.zeros(X.shape[1] + 1)
        ###print(f"Initial weights (iteration 0): {self.weights}")
        
        # Iteratively adjust weights based on prediction error
        for iteration in range(self.n_iterations):
            for xi, target in zip(X, y):
                # Predict the output for the current example
                prediction = self.predict(xi)
                
                # Update weights if the prediction is incorrect
                update = self.learning_rate * (target - prediction)
                self.weights[1:] += update * xi  # Update weights for features
                self.weights[0] += update        # Update bias term
            
            # Print weights at the end of each iteration
            ###print(f"Weights after iteration {iteration + 1}/{self.n_iterations}: {self.weights}")
    
    def predict(self, X):
        # Compute the weighted sum of inputs and add the bias term
        weighted_sum = np.dot(X, self.weights[1:]) + self.weights[0]
        
        # Activation function: returns 1 if weighted sum is >= 0, else 0
        return 1 if weighted_sum >= 0 else 0

### Implementation example

This example demonstrates how to use the Perceptron model to classify instances from the "weather" dataset, which determines if playing tennis is advisable based on weather conditions. The process includes data preprocessing, defining feature and target sets, and training the Perceptron model.

Data Preprocessing: The original dataset contains categorical features (e.g., "outlook," "temperature," etc.) that need to be converted to numerical values for the Perceptron algorithm. We use a simple encoding method to map each categorical value to a unique integer.

Defining Features and Labels: After preprocessing, the features (X) include all columns except the target label ("play"), which indicates whether playing tennis is recommended. The target (y) consists of binary values where 1 represents "yes" and 0 represents "no."

Training the Perceptron: With preprocessed data, we initialize a Perceptron instance and train it on the feature and target sets. The model iteratively adjusts its weights based on prediction errors, eventually learning a decision boundary that separates the two classes.

In [3]:
# Load the weather dataset
data = pd.read_csv('weather.nominal.csv')

# Data Preprocessing: Convert categorical features to numerical codes
# This step maps each unique categorical value to an integer, preparing the data for the Perceptron
data['outlook'] = data['outlook'].map({'sunny': 0, 'overcast': 1, 'rainy': 2})
data['temperature'] = data['temperature'].map({'hot': 0, 'mild': 1, 'cool': 2})
data['humidity'] = data['humidity'].map({'high': 0, 'normal': 1})
data['windy'] = data['windy'].map({False: 0, True: 1})
data['play'] = data['play'].map({'no': 0, 'yes': 1})

# Define X (features) and y (target)
# X includes all columns except the last one, which is the target
X = data.iloc[:, :-1].values  # All columns except the last one (features)
y = data.iloc[:, -1].values   # Last column only (target label)

# Train the Perceptron model with the weather data
# Create a Perceptron instance with a learning rate of 0.1 and 100 iterations
perceptron = Perceptron(learning_rate=0.1, n_iterations=100)
perceptron.fit(X, y)

A new weather condition to test if the trained Perceptron model correctly predicts whether to play tennis is created. The new example represents specific weather features (e.g., sunny outlook, mild temperature, high humidity, weak wind). Based on these input conditions, the model predicts "Yes" or "No" for playing tennis.

New Example: The array nuevo_ejemplo encodes the weather features into integers matching the preprocessing mapping used during training.
Prediction: We pass this array to the predict method of the trained Perceptron model.
Result Output: Based on the model’s output (1 for "Yes," 0 for "No"), we print an interpretable result to confirm if playing tennis is recommended.

In [4]:
# Prueba con un nuevo ejemplo (Clima='soleado', Temperatura='templado', Humedad='alta', Viento='débil')
nuevo_ejemplo = np.array([2, 1, 0, 0])
resultado = perceptron.predict(nuevo_ejemplo)

# Resultado de la predicción
print(f"¿Jugar tenis con las condiciones {nuevo_ejemplo}? {'Sí' if resultado == 1 else 'No'}")

¿Jugar tenis con las condiciones [2 1 0 0]? No


## Perceptron using Scikit-learn

In [5]:
df = pd.read_csv(r'weather.numeric.csv')
print(df)

# defining the dependent and independent variables
X_train = df[['Outlook', 'Temperature', 'Humidity', 'Wind']]
y_train = df[['Play']]

print(X_train.head())
print(y_train.head())

    Day   Outlook  Temperature  Humidity    Wind   Play
0     1     sunny           85        85    weak  False
1     2     sunny           80        90  strong  False
2     3  overcast           83        86    weak   True
3     4      rain           70        96    weak   True
4     5      rain           68        80    weak   True
5     6      rain           65        70  strong  False
6     7  overcast           64        65  strong   True
7     8     sunny           72        95    weak  False
8     9     sunny           69        70    weak   True
9    10      rain           75        80    weak   True
10   11     sunny           75        70  strong   True
11   12  overcast           72        90  strong   True
12   13  overcast           81        75    weak   True
13   14      rain           71        91  strong  False
    Outlook  Temperature  Humidity    Wind
0     sunny           85        85    weak
1     sunny           80        90  strong
2  overcast           83       

In [6]:
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()

outlook = X_train.iloc[:,0]
outlook_enc = encoder.fit_transform(outlook)
wind = X_train.iloc[:,3]
wind_enc = encoder.fit_transform(wind)

df_outlook = pd.DataFrame(outlook_enc, columns = ['Outlook'])
df_wind = pd.DataFrame(outlook_enc, columns = ['Wind'])
X_train_num = pd.concat([df_outlook, X_train.iloc[:,1], X_train.iloc[:,2], df_wind], axis=1)
print(X_train_num)

    Outlook  Temperature  Humidity  Wind
0         2           85        85     2
1         2           80        90     2
2         0           83        86     0
3         1           70        96     1
4         1           68        80     1
5         1           65        70     1
6         0           64        65     0
7         2           72        95     2
8         2           69        70     2
9         1           75        80     1
10        2           75        70     2
11        0           72        90     0
12        0           81        75     0
13        1           71        91     1


In [8]:
from sklearn.linear_model import Perceptron

clf = Perceptron(tol=1e-3, random_state=0)
clf.fit(X_train_num, y_train.values.ravel())

In [9]:
# sunny:2, hot:85, normal:65, strong:0 
new_example = [[2, 60, 65, 1]]
X_test = pd.DataFrame(new_example, columns = ['Outlook', 'Temperature', 'Humidity', 'Wind'])
print(X_test)
print(clf.predict(X_test))

   Outlook  Temperature  Humidity  Wind
0        2           60        65     1
[False]


## Multilayer Perceptron from scratch

A Multilayer Perceptron (MLP) is a class of neural networks consisting of multiple layers: an input layer, one or more hidden layers, and an output layer. Each layer is fully connected to the next layer, enabling complex, nonlinear transformations. The MLP operates in two phases:

- Forward Propagation: The input data passes through each layer, where weights and biases are applied, and an activation function (e.g., sigmoid) is used to introduce non-linearity. The output from one layer becomes the input to the next until the final output is produced.

- Backpropagation and Weight Adjustment: The error is calculated between the network's output and the true target values. This error is backpropagated through the layers, and weights are updated based on the gradient of the loss function. The process uses the learning rate to control weight adjustments.

The MLP code below demonstrates these principles, showing weight updates across iterations and printing the loss and accuracy at each step to monitor learning progress.

In [10]:
class MLP:
    def __init__(self, n_inputs, n_hidden, n_outputs, learning_rate=0.1, epochs=1000):
        # Initialize network parameters: input, hidden, and output layer sizes, learning rate, and training epochs
        self.n_inputs = n_inputs
        self.n_hidden = n_hidden
        self.n_outputs = n_outputs
        self.learning_rate = learning_rate
        self.epochs = epochs
        
        # Initialize weights for input to hidden layer connections (including bias)
        self.weights_input_hidden = np.random.rand(self.n_inputs + 1, self.n_hidden) * 0.1
        
        # Initialize weights for hidden to output layer connections (including bias)
        self.weights_hidden_output = np.random.rand(self.n_hidden + 1, self.n_outputs) * 0.1

    def sigmoid(self, x):
        # Sigmoid activation function, used to introduce non-linearity
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # Derivative of the sigmoid function, required for backpropagation
        return x * (1 - x)
    
    def fit(self, X, y):
        # Add a bias column (ones) to the input data matrix X
        X = np.c_[np.ones(X.shape[0]), X]
        
        for epoch in range(self.epochs):
            errors = []  # List to record the error for each sample
            predictions = []  # List to store predicted values for accuracy calculation
            
            for i in range(X.shape[0]):
                # Forward pass: compute hidden layer activations and final output
                input_layer = X[i]
                hidden_net = np.dot(input_layer, self.weights_input_hidden)
                hidden_output = self.sigmoid(hidden_net)
                
                # Add bias to the hidden layer output
                hidden_output = np.append(1, hidden_output)
                
                output_net = np.dot(hidden_output, self.weights_hidden_output)
                final_output = self.sigmoid(output_net)
                
                # Calculate output error (target - prediction) and record squared error
                output_error = y[i] - final_output
                errors.append(output_error**2)
                
                # Store binary predictions (1 if >= 0.5, else 0) for accuracy
                predictions.append(1 if final_output >= 0.5 else 0)
                
                # Backpropagation: calculate gradients and update weights
                delta_output = output_error * self.sigmoid_derivative(final_output)
                
                # Calculate hidden layer error and gradient (excluding bias neuron)
                hidden_error = delta_output.dot(self.weights_hidden_output[1:].T)
                delta_hidden = hidden_error * self.sigmoid_derivative(hidden_output[1:])
                
                # Update weights for hidden to output layer
                self.weights_hidden_output += self.learning_rate * np.outer(hidden_output, delta_output)
                
                # Update weights for input to hidden layer
                self.weights_input_hidden += self.learning_rate * np.outer(input_layer, delta_hidden)
            
            # Calculate and print loss and accuracy for each epoch
            loss = np.mean(errors)
            accuracy = np.mean(np.equal(predictions, y))
            print(f"Epoch {epoch+1}/{self.epochs} - Loss: {loss:.4f}, Accuracy: {accuracy:.4f}")
     
    def predict(self, X):
        # Add bias to input data point
        X = np.insert(X, 0, 1)
        
        # Forward pass: calculate hidden layer and output layer activations
        hidden_net = np.dot(X, self.weights_input_hidden)
        hidden_output = self.sigmoid(hidden_net)
        
        # Add bias to hidden layer output
        hidden_output = np.insert(hidden_output, 0, 1)
        
        output_net = np.dot(hidden_output, self.weights_hidden_output)
        final_output = self.sigmoid(output_net)
        
        return 1 if final_output >= 0.5 else 0

### Implementation example

Previous MLP is implemented to classify weather conditions to determine whether playing tennis is advisable. First, the weather dataset is loaded and preprocessed to make it compatible with the MLP model. The dataset consists of categorical variables, which are converted to numerical values, as neural networks typically require numerical input data. Each category within variables such as outlook, temperature, humidity, and windy is mapped to a unique integer value. For instance, the outlook values of sunny, overcast, and rainy are represented by 0, 1, and 2, respectively. Similarly, the target variable play is encoded as 0 for "no" and 1 for "yes." This encoding process enables the MLP model to process the input effectively during training.

With the dataset prepared, the feature matrix X is defined to include all columns except the target label play, while y contains the target labels indicating the decision to play tennis (0 or 1) for each set of weather conditions. Next, the MLP model is initialized with specified parameters: 4 input neurons (one for each feature), 3 neurons in the hidden layer, and 1 output neuron for binary classification. The learning rate is set at 0.1, and the model is trained over 100 iterations, allowing it to learn from the dataset and refine its predictions.

During training, the MLP’s fit function iterates over the dataset, updating weights by calculating the error between the predicted and actual outputs. This iterative process continues for the specified number of epochs, gradually minimizing the model’s prediction error. Through this adjustment, the MLP improves its capacity to make accurate predictions based on the input data.

In [11]:
# Load the dataset
data = pd.read_csv('weather.nominal.csv')

# Preprocess the data: encode categorical variables
data['outlook'] = data['outlook'].map({'sunny': 0, 'overcast': 1, 'rainy': 2})
data['temperature'] = data['temperature'].map({'hot': 0, 'mild': 1, 'cool': 2})
data['humidity'] = data['humidity'].map({'high': 0, 'normal': 1})
data['windy'] = data['windy'].map({False: 0, True: 1})
data['play'] = data['play'].map({'no': 0, 'yes': 1})

# Define X (features) and y (target)
X = data.iloc[:, :-1].values  # All columns except the last one
y = data.iloc[:, -1].values  # The last column as the target label

# Initialize the MLP with 4 inputs, 3 hidden neurons, and 1 output
mlp = MLP(n_inputs=4, n_hidden=3, n_outputs=1, learning_rate=0.1, epochs=100)

# Train the MLP
mlp.fit(X, y)

Epoch 1/100 - Loss: 0.2426, Accuracy: 0.6429
Epoch 2/100 - Loss: 0.2393, Accuracy: 0.6429
Epoch 3/100 - Loss: 0.2369, Accuracy: 0.6429
Epoch 4/100 - Loss: 0.2352, Accuracy: 0.6429
Epoch 5/100 - Loss: 0.2340, Accuracy: 0.6429
Epoch 6/100 - Loss: 0.2332, Accuracy: 0.6429
Epoch 7/100 - Loss: 0.2326, Accuracy: 0.6429
Epoch 8/100 - Loss: 0.2321, Accuracy: 0.6429
Epoch 9/100 - Loss: 0.2318, Accuracy: 0.6429
Epoch 10/100 - Loss: 0.2315, Accuracy: 0.6429
Epoch 11/100 - Loss: 0.2314, Accuracy: 0.6429
Epoch 12/100 - Loss: 0.2312, Accuracy: 0.6429
Epoch 13/100 - Loss: 0.2311, Accuracy: 0.6429
Epoch 14/100 - Loss: 0.2310, Accuracy: 0.6429
Epoch 15/100 - Loss: 0.2310, Accuracy: 0.6429
Epoch 16/100 - Loss: 0.2309, Accuracy: 0.6429
Epoch 17/100 - Loss: 0.2309, Accuracy: 0.6429
Epoch 18/100 - Loss: 0.2308, Accuracy: 0.6429
Epoch 19/100 - Loss: 0.2308, Accuracy: 0.6429
Epoch 20/100 - Loss: 0.2308, Accuracy: 0.6429
Epoch 21/100 - Loss: 0.2307, Accuracy: 0.6429
Epoch 22/100 - Loss: 0.2307, Accuracy: 0.64

Once trained, the model is tested on a new example. A sample weather condition with specific values (e.g., sunny for outlook, mild for temperature, high for humidity, and false for windy) is encoded into a format compatible with the model. The predict function is then called with this input, and the MLP outputs its recommendation on whether to play tennis under these conditions. The result, displayed as either "Yes" or "No," offers a user-friendly interpretation of the model's assessment based on the weather parameters.

In [12]:
# Test with a new example (outlook='sunny', temperature='mild', humidity='high', windy='false')
new_example = np.array([0, 1, 0, 0])  # Encoded as per the mappings above
result = mlp.predict(new_example)

# Print the prediction result
print(f"Play tennis with conditions {new_example}? {'Yes' if result == 1 else 'No'}")

Play tennis with conditions [0 1 0 0]? Yes


## MLP using Scikit-learn

In [13]:
# from previous example
print(X_train_num)
print(y_train)

    Outlook  Temperature  Humidity  Wind
0         2           85        85     2
1         2           80        90     2
2         0           83        86     0
3         1           70        96     1
4         1           68        80     1
5         1           65        70     1
6         0           64        65     0
7         2           72        95     2
8         2           69        70     2
9         1           75        80     1
10        2           75        70     2
11        0           72        90     0
12        0           81        75     0
13        1           71        91     1
     Play
0   False
1   False
2    True
3    True
4    True
5   False
6    True
7   False
8    True
9    True
10   True
11   True
12   True
13  False


In [14]:
from sklearn.neural_network import MLPClassifier

# Initialize the MLPClassifier with similar parameters to the custom MLP
mlp = MLPClassifier(hidden_layer_sizes=(3,), activation='logistic', learning_rate_init=0.1, max_iter=100, random_state=42)

# Train the MLP on the training data
mlp.fit(X_train_num.values, y_train.values.ravel())

In [15]:
# Test the model with a new example (e.g., outlook='sunny', temperature='mild', humidity='high', windy=False)
new_example = [[0, 1, 0, 0]]  # Encoded as per preprocessing
result = mlp.predict(new_example)

# Display the prediction result
print(f"Should play tennis with conditions {new_example[0]}? {'Yes' if result[0] == 1 else 'No'}")

Should play tennis with conditions [0, 1, 0, 0]? Yes
