# Perceptron learning

In [257]:
import numpy as np
from matplotlib import pyplot as plt

Perceptron learning is an iterative algorithm that converges to a suitable weight vector for a single perceptron, given a learnable (that is, linearly separable) training set. If the training set is not learnable, the algorithm will not converge: the weights keep changing from epoch to epoch.

**Perceptron Learning Algorithm**

Given a dataset $\mathbf{X}$ of $m$ observations $\mathbf{x}_i \in \mathcal{R}^{1 \times n}$ and an outcome vector $\mathbf{y} \in \mathcal{R}^{m \times 1}$, we wish to find a weight vector $\mathbf{w} \in \mathcal{R}^{n \times 1}$ such that $y_i = A(\mathbf{x}_i \cdot \mathbf{w})$ for each $0 \leq i < m$, where $A$ is the activation function. In matrix form: $A(\mathbf{X} \mathbf{w}) = \mathbf{y}$. The last element of each observation vector $\mathbf{x}_i$ is $-1$ to account for the bias term, so the last element of $\mathbf{w}$ acts as the threshold.

1. Initialize $ \mathbf{w} $ as a random vector.
2. **For each epoch**:
    - For each $ (\mathbf{x}_i, y_i)$ in the training set:
      - Compute $ \hat{y} = A(\mathbf{x}_i \cdot \mathbf{w}) $.
      - Update $ \mathbf{w} $ as:
        $$
        \mathbf{w} \leftarrow \mathbf{w} + (y_i - \hat{y}) \lambda \mathbf{x}_i
        $$
3. Compute accuracy:
   $$
   \text{accuracy} = 1 - \frac{\sum_i |y_i - A(\mathbf{x}_i \cdot \mathbf{w})|}{\text{len}(\text{training\_set})}
   $$
4. Return $ (\mathbf{w}, \text{accuracy}) $.

The learning rate $\lambda$ scales the size of each weight update and therefore affects how quickly the algorithm converges. In practice, $\lambda$ can be close to, but not greater than, 1.
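
The steps above can be sketched in NumPy roughly as follows. This is only an illustrative sketch, not a reference solution: the names `step` and `train_sketch`, the fixed number of epochs, the seed, and the uniform initialization range are all assumptions made for the example.

```python
import numpy as np

def step(z):
    # Heaviside step activation: 0 for z <= 0, 1 for z > 0
    return np.heaviside(z, 0)

def train_sketch(X, y, lr=0.5, epochs=100, seed=0):
    """Illustrative perceptron training loop (names and defaults are assumptions)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float).reshape(-1, 1)      # force shape (m, 1)
    w = rng.uniform(-1, 1, size=(X.shape[1], 1))       # step 1: random weights
    for _ in range(epochs):                            # step 2: one pass per epoch
        for i in range(X.shape[0]):
            x_i = X[i:i + 1]                           # row as a (1, n) matrix
            y_hat = step(x_i @ w)                      # prediction, shape (1, 1)
            w = w + (y[i:i + 1] - y_hat) * lr * x_i.T  # update rule
    # step 3: fraction of observations classified correctly
    accuracy = 1 - np.abs(y - step(X @ w)).sum() / X.shape[0]
    return w, accuracy

# Usage on boolean AND (the last column of X_and is the constant -1 bias input)
X_and = np.array([[0, 0, -1], [0, 1, -1], [1, 0, -1], [1, 1, -1]])
y_and = np.array([0, 0, 0, 1])
w, acc = train_sketch(X_and, y_and)
print(w.ravel(), acc)
```

Note that when a prediction is already correct, $y_i - \hat{y} = 0$ and the weights are left unchanged, so once every observation is classified correctly further epochs have no effect.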

We'll define the dataset as a matrix $X$ and a vector $y$ using NumPy.

In [286]:
## The dataset corresponding to boolean AND.

X = np.array([
    [0, 0, -1],
    [0, 1, -1],
    [1, 0, -1],
    [1, 1, -1]
])
y = np.array([[0, 0, 0, 1]]).T  # column vector of shape (4, 1), matching A(X @ w)

In [295]:
## The heaviside step function

def A(x):
    return np.heaviside(x,0)
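
As a quick check of what this activation returns, the snippet below evaluates `np.heaviside` on a few sample inputs. The second argument is the value returned where the input is exactly 0; the variable names are only illustrative.

```python
import numpy as np

z = np.array([-2.0, 0.0, 0.5])
at_zero_0 = np.heaviside(z, 0)  # value 0 is used where z == 0
at_zero_1 = np.heaviside(z, 1)  # value 1 is used where z == 0
print(at_zero_0)  # [0. 0. 1.]
print(at_zero_1)  # [0. 1. 1.]
```

With the choice made in `A`, an input that lands exactly on the decision boundary is classified as 0.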

In NumPy and TensorFlow, extracting data elements with the appropriate dimensionality can be tricky. To get the $i$-th row of $X$, use the syntax

`x = X[i:i+1]`

This ensures that `x` is a matrix and not a vector (its shape is 2D, not 1D).

You can now multiply `x` and `w` with `x @ w`.

If you want `x` to be a 1D vector, use `x = X[i]`.
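
The difference between the two indexing styles is easiest to see by printing shapes. In this small sketch the weight values are hypothetical and chosen only to have something to multiply by.

```python
import numpy as np

X = np.array([[0, 0, -1], [0, 1, -1], [1, 0, -1], [1, 1, -1]])
w = np.array([[1.0], [1.0], [1.5]])  # hypothetical weights, shape (3, 1)

row_2d = X[1:2]  # slicing keeps the row axis
row_1d = X[1]    # integer indexing drops it

print(row_2d.shape, (row_2d @ w).shape)  # (1, 3) (1, 1)
print(row_1d.shape, (row_1d @ w).shape)  # (3,) (1,)
```

Keeping everything 2D makes the shapes line up with the matrix form $A(\mathbf{X}\mathbf{w}) = \mathbf{y}$, at the cost of slightly noisier indexing.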

In [6]:
def perceptron_learn(X, y, lr):
    """
    X  = matrix of observations, shape (m, n); the last column is -1 for the bias
    y  = vector of outcomes (as a numpy matrix), shape (m, 1)
    lr = learning rate, with 0 < lr < 1
    returns (weight vector, accuracy)
    """
    # implement

**Assignment**: Determine the appropriate weight vector $\mathbf{w}$ for the boolean function AND. Then check all 16 boolean functions of 2 variables and print the weights for each *learnable* function.

*Sub-assignment*: You may want to write a function that generates the dataset for the $i$-th boolean function.
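
One possible way to enumerate the 16 functions is to read the four outputs off the bits of an integer. The sketch below assumes a particular numbering (bit $j$ of `k`, least significant first, is the output for row $j$ of $X$); the function name `boolean_dataset` and that convention are assumptions for illustration, and any consistent numbering would work equally well.

```python
import numpy as np

def boolean_dataset(k):
    """Return (X, y) for the k-th boolean function of 2 variables, 0 <= k < 16."""
    # Inputs in the order (0,0), (0,1), (1,0), (1,1), with -1 appended for the bias
    X = np.array([[0, 0, -1], [0, 1, -1], [1, 0, -1], [1, 1, -1]])
    # Bit j of k is the desired output for row j of X; y has shape (4, 1)
    y = np.array([[(k >> j) & 1] for j in range(4)])
    return X, y

X_and, y_and = boolean_dataset(8)  # outputs 0, 0, 0, 1
X_xor, y_xor = boolean_dataset(6)  # outputs 0, 1, 1, 0
print(y_and.ravel(), y_xor.ravel())
```

Under this numbering, `k = 8` is AND and `k = 6` is XOR.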