
Single-Layer Failure

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim

# Data
X = torch.tensor([[0,0], [0,1], [1,0], [1,1]], dtype=torch.float)
y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float)

# Single layer model
model = nn.Linear(2,1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Train for 1000 epochs
for epoch in range(1000):
    optimizer.zero_grad()
    output = model(X)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()

# Predictions
with torch.no_grad():
    pred = model(X)
    print("Predictions:", pred)
    print("Rounded:", (pred > 0.5).float())

Predictions: tensor([[0.5000],
        [0.5000],
        [0.5000],
        [0.5000]])
Rounded: tensor([[0.],
        [0.],
        [0.],
        [1.]])


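The predictions don't match the XOR targets [0, 1, 1, 0]. A single linear layer can only draw one straight decision boundary, and XOR is not linearly separable, so the best MSE fit is to output about 0.5 for every input. With every output sitting right on the 0.5 threshold, the rounded labels are decided by tiny floating-point differences, and at least one input always ends up misclassified.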
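To see where the 0.5 values come from, here is a minimal sketch of the same MSE fit written as plain-Python gradient descent, without PyTorch. The zero starting weights and the step count are assumptions made for illustration (`nn.Linear` starts from random weights), but the MSE minimum for this data is unique, so both should end up at roughly the same place.

```python
# Plain-Python gradient descent on MSE for a linear model
# y_hat = w1*x1 + w2*x2 + b, fitted to the XOR table.
# Zero initialisation and 2000 steps are illustrative assumptions.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [0.0, 1.0, 1.0, 0.0]

w1, w2, b = 0.0, 0.0, 0.0
lr = 0.1
n = len(X)

for step in range(2000):
    # Prediction error for each of the four inputs
    errors = [(w1 * x1 + w2 * x2 + b) - t for (x1, x2), t in zip(X, y)]

    # Gradients of the mean squared error with respect to each parameter
    grad_w1 = 2.0 / n * sum(e * x[0] for e, x in zip(errors, X))
    grad_w2 = 2.0 / n * sum(e * x[1] for e, x in zip(errors, X))
    grad_b = 2.0 / n * sum(errors)

    w1 -= lr * grad_w1
    w2 -= lr * grad_w2
    b -= lr * grad_b

preds = [w1 * x1 + w2 * x2 + b for x1, x2 in X]
print("w1, w2, b:", round(w1, 4), round(w2, 4), round(b, 4))
print("Predictions:", [round(p, 4) for p in preds])
```

The weights shrink toward zero and the bias settles near 0.5, so the model ignores its inputs and predicts the mean of the targets. That is the best any straight-line fit can do on XOR.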

Multi-Layer Success

In [2]:
import torch
from torch import nn
from torch import optim

# 1. Define XOR inputs and targets
# Inputs are [0,0], [0,1], [1,0], [1,1]
# Targets are [0], [1], [1], [0]
xor_inputs = torch.tensor([[0,0],[0,1],[1,0],[1,1]], dtype=torch.float32)
xor_targets = torch.tensor([[0],[1],[1],[0]], dtype=torch.float32)

# 2. Define the MLP model
class XORModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden_layer = nn.Linear(2, 4) # 2 input features, 4 hidden units
        self.relu = nn.ReLU()
        self.output_layer = nn.Linear(4, 1) # 4 hidden units, 1 output
        self.sigmoid = nn.Sigmoid() # Squashes the output into (0, 1) so it can be read as a probability

    def forward(self, x):
        x = self.hidden_layer(x)
        x = self.relu(x)
        x = self.output_layer(x)
        x = self.sigmoid(x)
        return x

# Instantiate the model, loss function, and optimizer
model = XORModel()
criterion = nn.BCELoss() # Binary Cross Entropy Loss
optimizer = optim.Adam(model.parameters(), lr=0.01)

# 3. Train the model
epochs = 10000
for epoch in range(epochs):
    # Forward pass
    outputs = model(xor_inputs)
    loss = criterion(outputs, xor_targets)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 1000 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

# 4. Test the model
with torch.no_grad():
    predicted = model(xor_inputs)
    # Apply threshold to get binary predictions
    predicted_class = (predicted > 0.5).int()

    print("\nPredictions after training:")
    for i in range(len(xor_inputs)):
        print(f"Input: {xor_inputs[i].tolist()}, Target: {int(xor_targets[i].item())}, Predicted: {predicted_class[i].item()}")


Epoch [1000/10000], Loss: 0.0089
Epoch [2000/10000], Loss: 0.0021
Epoch [3000/10000], Loss: 0.0008
Epoch [4000/10000], Loss: 0.0004
Epoch [5000/10000], Loss: 0.0002
Epoch [6000/10000], Loss: 0.0001
Epoch [7000/10000], Loss: 0.0001
Epoch [8000/10000], Loss: 0.0000
Epoch [9000/10000], Loss: 0.0000
Epoch [10000/10000], Loss: 0.0000

Predictions after training:
Input: [0.0, 0.0], Target: 0, Predicted: 0
Input: [0.0, 1.0], Target: 1, Predicted: 1
Input: [1.0, 0.0], Target: 1, Predicted: 1
Input: [1.0, 1.0], Target: 0, Predicted: 0


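Perfect match! All four predictions equal the XOR targets. The hidden layer with its ReLU nonlinearity makes the difference: it lets the network bend the decision boundary in a way a single linear layer cannot.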
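One way to build intuition for why a hidden layer is enough is a tiny network with hand-picked weights. The sketch below is not the model trained above (that one has 4 hidden units and a sigmoid output, and its learned weights will differ). It uses 2 ReLU units with weights chosen by hand, and the function name `xor_net` is hypothetical.

```python
def relu(z):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, z)

def xor_net(x1, x2):
    # Hidden unit 1: counts how many inputs are on (0, 1 or 2)
    h1 = relu(x1 + x2)
    # Hidden unit 2: fires only when both inputs are on
    h2 = relu(x1 + x2 - 1.0)
    # Output: take the count, then cancel it out when both inputs are on
    return h1 - 2.0 * h2

inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
outputs = [xor_net(x1, x2) for x1, x2 in inputs]

for (x1, x2), out in zip(inputs, outputs):
    print(f"Input: [{x1}, {x2}], Output: {out}")
```

The second hidden unit stays at zero until both inputs are on, which is the kink a purely linear model cannot produce. Training finds some weights with the same effect, though not necessarily these ones.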