# Introduction

**What is PyTorch?**

PyTorch is an open-source deep learning framework developed by Facebook's AI Research lab (FAIR). It provides:

* A flexible and dynamic tensor computation library (like NumPy but with GPU support)
  * A tensor is an n-dimensional array (for example 1-D, 2-D, 3-D, or 4-D).
* An intuitive autograd system for automatic differentiation
* A powerful module system (torch.nn) for building neural networks

PyTorch is widely used in both research and industry due to its ease of use and strong support for dynamic computational graphs.
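
As a rough illustration of the autograd system mentioned above, the sketch below differentiates a small expression. The tensor values and the function `y = sum(x^2)` are arbitrary choices made for this example; they are not part of the perceptron that follows.

```python
import torch

# A small tensor that autograd should track (values are arbitrary).
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x^2); the graph is built dynamically as the operations run.
y = (x ** 2).sum()

# Backpropagate to fill x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([2., 4., 6.])
```

The graph is rebuilt on every forward pass, which is what "dynamic computational graph" refers to.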

## Perceptron

Here we will import the necessary PyTorch modules:

* torch: for tensor operations and autograd
* torch.nn: for building neural networks
* torch.optim: for optimization algorithms like SGD

These are core components in building and training neural networks in PyTorch.

But what are tensors? Tensors are **multi-dimensional arrays** that generalize scalars, vectors, and matrices to higher dimensions. They are fundamental data structures used to represent and manipulate data in machine learning models, particularly in deep learning.
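
To make the idea of dimensions concrete, here is a minimal sketch that builds tensors of increasing rank and prints their `ndim` and `shape`. The variable names and values are illustrative only.

```python
import torch

scalar = torch.tensor(3.0)                        # 0-D: a single number
vector = torch.tensor([1.0, 2.0, 3.0])            # 1-D: a list of numbers
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2-D: rows and columns
cube = torch.zeros(2, 3, 4)                       # 3-D: e.g. a stack of 2 matrices

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of dimensions; shape is the size along each one.
    print(name, t.ndim, tuple(t.shape))
```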

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim

We define the input-output pairs for the AND gate, which is linearly separable:

* Inputs: Two binary values
* Output: 1 only if both inputs are 1, otherwise 0

This dataset will be used to train a perceptron model from scratch.
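
One way to see that the AND gate is linearly separable is to check that a single line separates the two classes. The sketch below uses hand-picked parameters, `w_demo = [1, 1]` and `b_demo = -1.5`; these are assumptions chosen for illustration, not the values the perceptron will learn.

```python
import torch

X = torch.tensor([[0., 0.],
                  [0., 1.],
                  [1., 0.],
                  [1., 1.]])
y_true = torch.tensor([0., 0., 0., 1.])

# Hypothetical, hand-picked separating line: x1 + x2 - 1.5 = 0
w_demo = torch.tensor([1.0, 1.0])
b_demo = torch.tensor(-1.5)

z = X @ w_demo + b_demo       # tensor([-1.5, -0.5, -0.5, 0.5])
y_pred = (z >= 0).float()     # step function with outputs 0 or 1

print(y_pred)                       # tensor([0., 0., 0., 1.])
print(torch.equal(y_pred, y_true))  # True
```

Many other lines separate the classes as well, so the perceptron may converge to different weights on different runs.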

In [None]:
X = torch.tensor([[0., 0.],
                  [0., 1.],
                  [1., 0.],
                  [1., 1.]])
y_true = torch.tensor([0., 0., 0., 1.])

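We manually initialize the weights and bias of the perceptron. The perceptron learning rule updates the parameters directly from classification errors, so no gradients are needed and `requires_grad` is set to `False`.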

In [None]:
w = torch.randn(2, requires_grad=False)
b = torch.randn(1, requires_grad=False)
# Alternatively, set the initial values manually:
# w = torch.tensor([0., 0.], dtype=torch.float32, requires_grad=False)
# b = torch.tensor([0.], dtype=torch.float32, requires_grad=False)
# Set learning rate
lr = 0.1
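
Because `torch.randn` draws random values, each run starts from different weights and may take a different number of epochs to converge. If reproducible runs are wanted, one option is to fix the seed first. The sketch below assumes a seed of `0`, which is an arbitrary choice, and uses separate names so it does not overwrite `w` and `b`.

```python
import torch

# Seeding before drawing makes the random initial values repeatable.
torch.manual_seed(0)   # the seed value 0 is an arbitrary assumption
w_a = torch.randn(2)
b_a = torch.randn(1)

# Re-seeding with the same value reproduces the same draws.
torch.manual_seed(0)
w_b = torch.randn(2)
b_b = torch.randn(1)

print(torch.equal(w_a, w_b), torch.equal(b_a, b_b))  # True True
```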

In this block, we train the perceptron using the **Perceptron Learning Rule**, a simple and intuitive way to update the weights based on classification errors.

The key idea is:

- For each input sample, compute the predicted label using a **step function** that outputs `1` or `0`, matching the labels of the AND gate.
- If the prediction is **correct**, do nothing.
- If the prediction is **incorrect**, update the weights and bias using:

$$
\mathbf{w} \leftarrow \mathbf{w} + \eta \cdot (y_{true}-y_{pred}) \cdot \mathbf{x}
$$

$$
b \leftarrow b + \eta \cdot (y_{true}-y_{pred})
$$

Where:
- $\eta$ is the learning rate
- $y_{true} \in \{0, 1\}$ is the true label and $y_{pred} \in \{0, 1\}$ is the predicted label, so $(y_{true}-y_{pred})$ is $+1$, $-1$, or $0$
- $\mathbf{x}$ is the input vector

This rule adjusts the decision boundary only when the model makes a mistake, helping the perceptron converge to a solution for **linearly separable problems** like the AND gate.

We repeat this for multiple epochs and track the number of misclassified samples in each epoch. When misclassifications become zero, the perceptron has converged.
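
To see the update rule at work, here is a single update step worked by hand. The starting values `w_demo = [0, 0]` and `b_demo = -0.5` are assumptions picked so that the sample `[1, 1]` is misclassified; separate names are used so the notebook's `w` and `b` are left alone.

```python
import torch

eta = 0.1                               # learning rate
w_demo = torch.tensor([0.0, 0.0])       # hypothetical starting weights
b_demo = torch.tensor([-0.5])           # hypothetical starting bias

x = torch.tensor([1.0, 1.0])            # input sample
y = 1.0                                 # true label for AND(1, 1)

z = torch.dot(w_demo, x) + b_demo       # 0*1 + 0*1 - 0.5 = -0.5
y_pred = 1.0 if z.item() >= 0 else 0.0  # step function gives 0.0, a mistake

if y_pred != y:
    # (y - y_pred) = +1, so the weights move toward x and the bias increases.
    w_demo += eta * (y - y_pred) * x    # becomes [0.1, 0.1]
    b_demo += eta * (y - y_pred)        # becomes -0.4

print(w_demo, b_demo)
```

After this step `z` for the same sample would be `0.1 + 0.1 - 0.4 = -0.2`, still negative, so further updates are needed. This is why training loops over the data for several epochs.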


In [None]:
for epoch in range(20):
    errors = 0
    for i in range(4):
        xi = X[i]
        yi = y_true[i]

        z = torch.dot(w, xi) + b
        prediction = 1.0 if z >= 0 else 0.0  # Step activation

        # Only update if misclassified
        if prediction != yi:
            w += lr * (yi-prediction) * xi
            b += lr * (yi-prediction)
            errors += 1

    print(f"Epoch {epoch+1}, Misclassified: {errors}")
    if errors == 0:
        break

Epoch 1, Misclassified: 1
Epoch 2, Misclassified: 2
Epoch 3, Misclassified: 2
Epoch 4, Misclassified: 3
Epoch 5, Misclassified: 1
Epoch 6, Misclassified: 0


Here we display the final weight and bias after all the updates and test the final perceptron model against our input.

In [None]:
# Final weights and bias
print("\nFinal weights:", w)
print("Final bias:", b)

# Test model
print("\nPredictions:")
for i in range(4):
    xi = X[i]
    z = torch.dot(w, xi) + b
    prediction = 1.0 if z >= 0 else 0.0  # Same step activation as in training
    print(f"Input: {xi.tolist()}, Output: {prediction}, Target: {y_true[i].item()}")


Final weights: tensor([0.1144, 0.0634])
Final bias: tensor([-0.1690])

Predictions:
Input: [0.0, 0.0], Output: 0.0, Target: 0.0
Input: [0.0, 1.0], Output: 0.0, Target: 0.0
Input: [1.0, 0.0], Output: 0.0, Target: 0.0
Input: [1.0, 1.0], Output: 1.0, Target: 1.0
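
The per-sample test loop can also be written as one vectorized operation. The sketch below copies the final weights and bias from the printed output above as literals; a fresh run will likely end at different values because of the random initialization.

```python
import torch

X = torch.tensor([[0., 0.],
                  [0., 1.],
                  [1., 0.],
                  [1., 1.]])
y_true = torch.tensor([0., 0., 0., 1.])

# Values copied from the printed output of one run (they vary between runs).
w_final = torch.tensor([0.1144, 0.0634])
b_final = torch.tensor([-0.1690])

# One matrix-vector product scores all four inputs at once.
z = X @ w_final + b_final       # shape (4,)
y_pred = (z >= 0).float()       # step function with outputs 0 or 1

accuracy = (y_pred == y_true).float().mean().item()
print(y_pred)    # tensor([0., 0., 0., 1.])
print(accuracy)  # 1.0
```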
