
## The McCulloch-Pitts Neuron

The basic building block of neural networks is the McCulloch-Pitts neuron, which has three components:

1. Weights (w₁, w₂, ..., wₙ) corresponding to synapses
2. An adder for summing input signals
3. An activation function for determining neuron firing

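The adder computes the weighted sum of the inputs:

h = Σᵢ wᵢxᵢ

where h is the weighted sum and wᵢ is the weight for input xᵢ. The activation function then compares h with a threshold θ: the neuron fires (outputs 1) if h > θ and outputs 0 otherwise.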
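
The three components can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original notebook: the function name `mcp_neuron`, the threshold argument `theta`, and the example values are all assumptions chosen for the demonstration.

```python
import numpy as np

def mcp_neuron(x, w, theta):
    """Return 1 if the weighted sum of the inputs exceeds theta, else 0."""
    h = np.dot(w, x)       # adder: h = sum_i w_i * x_i
    return int(h > theta)  # threshold activation

# Hypothetical example values (not from the notebook)
x = np.array([1, 0, 1])
w = np.array([0.5, -0.2, 0.4])
theta = 0.6

h = np.dot(w, x)               # 0.5*1 + (-0.2)*0 + 0.4*1 = 0.9
out = mcp_neuron(x, w, theta)  # 0.9 > 0.6, so the neuron fires
print("h =", h, "output =", out)
```

With these values the weighted sum is 0.9, which is above the threshold of 0.6, so the neuron outputs 1.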

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
from IPython.display import display

## Understanding Logical Operations

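We'll start with basic logical operations such as AND and OR, training a perceptron on AND in the cells that follow. Both operations are linearly separable, so a single perceptron can represent them, and they serve as building blocks for more complex operations.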
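
Before any training, it can help to see that a single threshold unit is able to represent these operations at all. The sketch below uses hand-picked weights, which are an assumption made for illustration and not values learned anywhere in this notebook. The helper name `fire` is also hypothetical. The bias is modeled as an extra input fixed at -1, so the last weight plays the role of the threshold.

```python
import numpy as np

# All four input combinations for two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Xb = np.c_[X, -np.ones((len(X), 1))]  # append a bias input fixed at -1

def fire(inputs, weights):
    """Threshold unit: 1 where the weighted sum is positive, else 0."""
    return np.where(inputs @ weights > 0, 1, 0)

# Hand-picked weights (illustrative assumptions, not learned values)
w_and = np.array([1.0, 1.0, 1.5])  # fires only when x1 + x2 > 1.5
w_or = np.array([1.0, 1.0, 0.5])   # fires when x1 + x2 > 0.5

and_out = fire(Xb, w_and)
or_out = fire(Xb, w_or)
print("AND:", and_out)  # [0 0 0 1]
print("OR: ", or_out)   # [0 1 1 1]
```

The only difference between the two units is the threshold weight, which shifts the decision line between the input points.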

In [None]:
# Create truth tables for logical operations
AND = pd.DataFrame({'x1': (0,0,1,1), 'x2': (0,1,0,1), 'y': (0,0,0,1)})
print("AND Truth Table:")
display(AND)

AND Truth Table:


|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 0 |
| 1 | 0  | 1  | 0 |
| 2 | 1  | 0  | 0 |
| 3 | 1  | 1  | 1 |


In [None]:
# Define our activation function
def g(inputs, weights):
    """Threshold activation: 1 where the weighted sum of inputs and weights is positive, else 0"""
    return np.where(np.dot(inputs, weights) > 0, 1, 0)

# Define training function
def train(inputs, targets, weights, eta, n_iterations):
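    """Train the perceptron with the batch perceptron learning rule.

    Parameters:
    inputs: input data, shape (n_samples, n_features)
    targets: target values, shape (n_samples,)
    weights: initial weights, shape (n_features + 1,); updated in place
    eta: learning rate
    n_iterations: number of training iterations

    Returns:
    The trained weights (the last entry is the bias weight).
    """
    # Add a bias input fixed at -1, so the last weight acts as the firing threshold
    inputs = np.c_[inputs, -np.ones((len(inputs), 1))]

    for _ in range(n_iterations):
        activations = g(inputs, weights)
        # Perceptron rule: step against the error (activations - targets)
        weights -= eta * np.dot(np.transpose(inputs), activations - targets)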

    return weights

In [None]:
# Define the AND problem inputs and outputs
inputs = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Target output for AND logic
targets = np.array([0, 0, 0, 1])

# Initialize weights (with random values)
weights = np.random.randn(inputs.shape[1] + 1)

# Learning rate
eta = 0.1

# Number of iterations
n_iterations = 1000

# Train the perceptron
weights = train(inputs, targets, weights, eta, n_iterations)

# Test the perceptron after training
inputs_with_bias = np.c_[inputs, -np.ones((len(inputs), 1))]
predictions = g(inputs_with_bias, weights)

# Print predictions and compare with targets
print("Predictions:", predictions)
print("Targets:", targets)


Predictions: [0 0 0 1]
Targets: [0 0 0 1]
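
The same learning rule can be tried on OR, the other operation mentioned earlier. The block below is a self-contained sketch under stated assumptions: the names `predict` and `train_perceptron` are hypothetical, the weights start at zero instead of random values so the run is reproducible, and `eta=0.25` with 20 iterations is an arbitrary choice that happens to be enough for this tiny dataset.

```python
import numpy as np

def predict(inputs, weights):
    """Threshold unit: 1 where the weighted sum is positive, else 0."""
    return np.where(inputs @ weights > 0, 1, 0)

def train_perceptron(inputs, targets, eta=0.25, n_iterations=20):
    """Batch perceptron rule, starting from zero weights (an assumption)."""
    X = np.c_[inputs, -np.ones((len(inputs), 1))]  # bias input fixed at -1
    w = np.zeros(X.shape[1])
    for _ in range(n_iterations):
        # Once every prediction is correct, the update is zero and w stops changing
        w -= eta * X.T @ (predict(X, w) - targets)
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
or_targets = np.array([0, 1, 1, 1])

w_or = train_perceptron(X, or_targets)
or_predictions = predict(np.c_[X, -np.ones((len(X), 1))], w_or)
print("Weights:    ", w_or)
print("Predictions:", or_predictions)
print("Targets:    ", or_targets)
```

Because OR is linearly separable, the rule settles on weights that classify all four points correctly. Changing `or_targets` to the XOR column `[0, 1, 1, 0]` is a quick way to start on the assignment further down.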


In [None]:
import numpy as np

# Define sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define its derivative (for backpropagation).
# Note: x must already be a sigmoid output s = sigmoid(z), because s' = s * (1 - s).
def sigmoid_derivative(x):
    return x * (1 - x)

# Define training function for a neural network
# (this redefines the perceptron's train() from the earlier cell)
def train(inputs, targets, weights_input_hidden, weights_hidden_output, eta, n_iterations):
    """Train a two-layer network (no bias terms) on the XOR problem"""

    for epoch in range(n_iterations):
        # Forward propagation
        # Hidden layer
        hidden_input = np.dot(inputs, weights_input_hidden)
        hidden_output = sigmoid(hidden_input)

        # Final layer
        final_input = np.dot(hidden_output, weights_hidden_output)
        final_output = sigmoid(final_input)

        # Calculate the error; shape (rows, cols) = (4, 1), one row per sample
        error = targets - final_output

        # Backpropagation
        d_output = error * sigmoid_derivative(final_output)

        error_hidden = d_output.dot(weights_hidden_output.T)

        d_hidden = error_hidden * sigmoid_derivative(hidden_output)

        # Update weights using gradient descent
        weights_hidden_output += hidden_output.T.dot(d_output) * eta

        weights_input_hidden += inputs.T.dot(d_hidden) * eta


    return weights_input_hidden, weights_hidden_output

# Define XOR problem inputs and outputs
inputs = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# XOR target output
targets = np.array([[0], [1], [1], [0]])

# Initialize weights randomly for input-to-hidden and hidden-to-output layers
input_size = inputs.shape[1]  # 2 input features
hidden_size = 4  # Number of neurons in the hidden layer
output_size = 1  # XOR output is a single value (0 or 1)


np.random.seed(42)
weights_input_hidden = np.random.rand(input_size, hidden_size)
weights_hidden_output = np.random.rand(hidden_size, output_size)


# Learning rate
eta = 0.1

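# Number of iterations for training.
# Note: 10 iterations is too few for this network to learn XOR, which is why
# every prediction in the output rounds to 1. Gradient descent on this problem
# typically needs thousands of iterations.
n_iterations = 10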

# Train the neural network
weights_input_hidden, weights_hidden_output = train(inputs, targets, weights_input_hidden, weights_hidden_output, eta, n_iterations)

# Test the network after training
hidden_input = np.dot(inputs, weights_input_hidden)
hidden_output = sigmoid(hidden_input)

final_input = np.dot(hidden_output, weights_hidden_output)
final_output = sigmoid(final_input)

# Print the predictions and compare with the targets
print("Predictions (after training):")
print(np.round(final_output))  # Round the output to 0 or 1 for binary classification
print("Targets:")
print(targets)


Predictions (after training):
[[1.]
 [1.]
 [1.]
 [1.]]
Targets:
[[0]
 [1]
 [1]
 [0]]
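
The predictions above do not match the targets, but that reflects the short training run and not a limit of the architecture. As a sanity check that one hidden layer is enough to represent XOR, the sketch below wires a tiny network by hand. Everything in it is an illustrative assumption and not the notebook's trained model: the weights are chosen manually, the units use a hard threshold instead of the sigmoid, and the biases are added directly instead of through a -1 input.

```python
import numpy as np

def step(z):
    """Hard threshold: 1 where z is positive, else 0."""
    return np.where(z > 0, 1, 0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer (hand-picked): column 0 computes OR, column 1 computes AND
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])

# Output unit (hand-picked): fires when OR is on and AND is off
w_out = np.array([1.0, -1.0])
b_out = -0.5

hidden = step(X @ W_hidden + b_hidden)
xor_out = step(hidden @ w_out + b_out)
print("Hidden (OR, AND):")
print(hidden)
print("XOR output:", xor_out)  # [0 1 1 0]
```

The hidden layer maps the four inputs to the points (0, 0), (1, 0), (1, 0), and (1, 1), which a single line can separate, so the output unit only has to solve a linearly separable problem.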


In [None]:
# Scratch cell: compare a (4, 1) column vector with the (1, 4) transpose of another
mA = np.random.rand(4, 1)
mB = np.random.rand(4, 1)

display(mA)
display(mB.T)

array([[0.86310343],
       [0.62329813],
       [0.33089802],
       [0.06355835]])

array([[0.31098232, 0.32518332, 0.72960618, 0.63755747]])

## Assignment: The XOR Problem

Now it's your turn! The XOR (exclusive OR) function returns 1 when inputs are different, and 0 when they are the same.

Your tasks:

1. Create the truth table for XOR
2. Try to train a perceptron to learn XOR
3. Analyze the results
4. Explain why the perceptron succeeds or fails

In [None]:
# Your code here
# Create XOR truth table
XOR = pd.DataFrame({'x1': (0,0,1,1), 'x2': (0,1,0,1), 'y': (0,1,1,0)})
print("XOR Truth Table:")
display(XOR)

# Train a perceptron for XOR
# Your implementation here

XOR Truth Table:


|   | x1 | x2 | y |
|---|----|----|---|
| 0 | 0  | 0  | 0 |
| 1 | 0  | 1  | 1 |
| 2 | 1  | 0  | 1 |
| 3 | 1  | 1  | 0 |


### Questions to Answer:

1. What happens when you try to train the perceptron on XOR?
2. Why does this happen? (Hint: Think about linear separability)
3. How could you modify the network to successfully implement XOR?
4. What does this tell us about the limitations of single-layer perceptrons?

Write your answers and analysis below:

## Your Analysis
[Write your analysis here]