In [1]:
# Truth tables for each gate: every row is (x1, x2, y).
AND = [
    (0,0,0),
    (0,1,0),
    (1,0,0),
    (1,1,1)
]

OR = [
    (0,0,0),
    (0,1,1),
    (1,0,1),
    (1,1,1)
]

NAND = [
    (0,0,1),
    (0,1,1),
    (1,0,1),
    (1,1,0)
]

NOR = [
    (0,0,1),
    (0,1,0),
    (1,0,0),
    (1,1,0)
]

XOR = [
    (0,0,0),
    (0,1,1),
    (1,0,1),
    (1,1,0)
]

In [2]:
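def train(dataset, lr, epochs):
    """Train a two-input perceptron with the perceptron learning rule."""
    w = [0, 0]  # weights for x1 and x2, initialised to zero
    b = 0       # bias
    for epoch in range(epochs):
        for x1, x2, y in dataset:
            # Weighted sum followed by a hard threshold at zero.
            z = w[0]*x1 + w[1]*x2 + b
            y_hat = 1 if z > 0 else 0
            # error is +1, 0, or -1; the weights change only on a misclassification.
            error = y - y_hat
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b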

In [3]:
def test(dataset, w, b):
    """Print the prediction and the expected output for every row of a truth table."""
    for x1, x2, y in dataset:
        z = w[0]*x1 + w[1]*x2 + b
        y_hat = 1 if z > 0 else 0
        print(f"Input: ({x1}, {x2}), Predicted: {y_hat}, Actual: {y}")

In [4]:
gates = {
    "AND": AND,
    "OR": OR,
    "NAND": NAND,
    "NOR": NOR,
    "XOR": XOR
}
for gate, data in gates.items():
    print(f"Training on {gate} gate")
    w, b = train(data, lr=0.1, epochs=10)
    print(f"Final weights: {w}, bias: {b}")
    test(data, w, b)
    print()

Training on AND gate
Final weights: [0.2, 0.1], bias: -0.2
Input: (0, 0), Predicted: 0, Actual: 0
Input: (0, 1), Predicted: 0, Actual: 0
Input: (1, 0), Predicted: 0, Actual: 0
Input: (1, 1), Predicted: 1, Actual: 1

Training on OR gate
Final weights: [0.1, 0.1], bias: 0.0
Input: (0, 0), Predicted: 0, Actual: 0
Input: (0, 1), Predicted: 1, Actual: 1
Input: (1, 0), Predicted: 1, Actual: 1
Input: (1, 1), Predicted: 1, Actual: 1

Training on NAND gate
Final weights: [-0.2, -0.1], bias: 0.20000000000000004
Input: (0, 0), Predicted: 1, Actual: 1
Input: (0, 1), Predicted: 1, Actual: 1
Input: (1, 0), Predicted: 1, Actual: 1
Input: (1, 1), Predicted: 0, Actual: 0

Training on NOR gate
Final weights: [-0.1, -0.1], bias: 0.1
Input: (0, 0), Predicted: 1, Actual: 1
Input: (0, 1), Predicted: 0, Actual: 0
Input: (1, 0), Predicted: 0, Actual: 0
Input: (1, 1), Predicted: 0, Actual: 0

Training on XOR gate
Final weights: [-0.1, 0.0], bias: 0.1
Input: (0, 0), Predicted: 1, Actual: 0
Input: (0, 1), Predic

**Effect of learning rate (η):**

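- A smaller learning rate (e.g., 0.01) makes each weight update small. In general, training moves more gradually and may need more epochs when the starting weights are far from a solution.
- A larger learning rate (e.g., 0.5 or 1.0) makes each update large. It can reach a solution in fewer epochs from a nonzero starting point, but the weights swing more from one update to the next.

In our code, the weights and bias start at zero and the activation is a hard threshold, so `lr` mostly rescales the learned weights and bias: apart from floating-point rounding (which can flip a prediction when `z` lands on the threshold), the sequence of predictions and errors is the same for any positive `lr`. For the linearly separable gates (AND, OR, NAND, NOR), the perceptron rule reaches a set of weights that classifies every row correctly for any positive `lr`, given enough epochs.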
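
One way to check the effect of `lr` empirically is to train the same gate at several learning rates and compare the results. The sketch below is a minimal, self-contained version: `train` repeats the update rule used above, while `predict` and the choice of rates (0.5, 1.0, 2.0) are assumptions made for this illustration. Powers of two are used so that the floating-point arithmetic stays exact.

```python
def train(dataset, lr, epochs):
    """Perceptron rule with zero-initialised weights (same update as above)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x1, x2, y in dataset:
            y_hat = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = y - y_hat
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b


def predict(dataset, w, b):
    """Hypothetical helper: predicted output for every row of a truth table."""
    return [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for x1, x2, _ in dataset]


AND = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]

# Powers of two keep every intermediate value exact, so rounding
# cannot blur the comparison between learning rates.
results = {lr: train(AND, lr=lr, epochs=10) for lr in (0.5, 1.0, 2.0)}
for lr, (w, b) in results.items():
    print(f"lr={lr}: w={w}, b={b}, predictions={predict(AND, w, b)}")
```

Because training starts from zero weights, the three runs should print the same predictions, with weights and bias that differ only by the factor `lr`. A rate such as 0.1 is not exactly representable in binary floating point, so rounding can push `z` across the threshold and change the path slightly, which is what the `0.20000000000000004` bias in the NAND output hints at.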

**Why the same code learned different gates:**

- The perceptron learning rule is the same, but the training data (truth table) is different for each gate.
- For each gate, the algorithm adjusts the weights and bias after every misclassified row until it finds a linear decision boundary (`w[0]*x1 + w[1]*x2 + b = 0`) that separates outputs 0 and 1 for that truth table.
- AND, OR, NAND, and NOR are **linearly separable**, so a single-layer perceptron can find suitable weights and bias for each of them.
- XOR is **not linearly separable**. No single straight line can separate its 0 and 1 outputs, so a single-layer perceptron cannot learn XOR perfectly, no matter how you tune the learning rate or epochs.
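
The difference between the separable gates and XOR can be made visible by counting misclassifications in each epoch. This is a rough sketch rather than part of the original notebook: `errors_per_epoch` is a hypothetical helper that reuses the same update rule, and `lr=1` with integer weights is chosen only to keep the arithmetic exact.

```python
def errors_per_epoch(dataset, lr=1, epochs=10):
    """Train with the perceptron rule and count misclassified rows in each epoch."""
    w, b = [0, 0], 0
    history = []
    for _ in range(epochs):
        mistakes = 0
        for x1, x2, y in dataset:
            y_hat = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = y - y_hat
            if error != 0:
                # Update only when the prediction was wrong.
                mistakes += 1
                w[0] += lr * error * x1
                w[1] += lr * error * x2
                b += lr * error
        history.append(mistakes)
    return history


gates = {
    "AND": [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)],
    "OR": [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)],
    "NAND": [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 0)],
    "NOR": [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 0)],
    "XOR": [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)],
}

histories = {name: errors_per_epoch(data) for name, data in gates.items()}
for name, history in histories.items():
    print(f"{name:>4}: {history}")
```

The four separable gates should settle at zero mistakes within a few epochs, while XOR keeps at least one mistake in every epoch. The reason is that XOR's four rows demand `b <= 0`, `w[1] + b > 0`, `w[0] + b > 0`, and `w[0] + w[1] + b <= 0`; adding the two middle inequalities gives `w[0] + w[1] + 2*b > 0`, hence `w[0] + w[1] + b > -b >= 0`, which contradicts the last one.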