# Perceptron

- 📺 **Video:** [https://youtu.be/tMGv5ZcuVP4](https://youtu.be/tMGv5ZcuVP4)

## Overview
- Study the perceptron algorithm as an online mistake-driven learner.
- Learn how initialization, order, and learning rate influence convergence.

## Key ideas
- **Online updates:** process examples one at a time and update only on errors.
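- **Separability:** the perceptron converges after a finite number of updates if a separating hyperplane exists; on non-separable data it never stops making mistakes.
- **Learning rate:** scales how far the weights move on each mistake. With zero-initialized weights it only rescales the weight vector, so the predictions do not change.
- **Averaging:** predicting with the average of the weight vectors over all steps often generalizes better on noisy data than predicting with the final weights.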
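
The ideas above can be sketched on a tiny example. This is a minimal illustration in plain Python, not the lecture's code: the names `train_perceptron`, `predict`, and `toy` are hypothetical, and the six toy points are made up so that they are linearly separable. It trains with two learning rates from zero weights to show that the mistake sequence is the same and only the scale of the weights differs.

```python
def train_perceptron(data, eta=1.0, max_epochs=100):
    """Online perceptron; returns (weights, epochs_used, converged).

    data: list of (features, label) pairs with label in {-1, +1}.
    A constant 1.0 is prepended to each feature vector as the bias feature.
    """
    w = [0.0] * (len(data[0][0]) + 1)
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, y in data:
            xb = [1.0] + list(x)
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + eta * y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:  # a full pass without updates: the data is separated
            return w, epoch, True
    return w, max_epochs, False


def predict(w, x):
    score = sum(wi * xi for wi, xi in zip(w, [1.0] + list(x)))
    return 1 if score > 0 else -1


# Hypothetical toy data, chosen to be linearly separable.
toy = [((2.0, 3.0), 1), ((3.0, 3.0), 1), ((4.0, 2.0), 1),
       ((0.0, 1.0), -1), ((1.0, 0.0), -1), ((1.0, 1.0), -1)]

w1, epochs1, ok1 = train_perceptron(toy, eta=1.0)
w2, epochs2, ok2 = train_perceptron(toy, eta=0.5)

print(w1, epochs1, ok1)  # -> [-4.0, 2.0, 1.0] 3 True
print(w2, epochs2, ok2)  # -> [-2.0, 1.0, 0.5] 3 True
```

The second weight vector is exactly half of the first, so both classify every point identically. The learning rate 0.5 is chosen so that the scaling is exact in floating point; with a value such as 0.1, rounding could in principle change a borderline decision.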

## Demo
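Run the perceptron on a noisy dataset and compare two predictors: the final weight vector (standard) and the average of the weight vector over all steps (averaged), to mirror the lecture (https://youtu.be/KQrmmdfOP3I). Both variants make the same updates, and both accuracies are measured on the training data.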

In [1]:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score

# Noisy synthetic data: flip_y=0.05 assigns random labels to about 5% of the samples.
X, y = make_classification(n_samples=400, n_features=6, n_informative=6, n_redundant=0, class_sep=1.0, flip_y=0.05, random_state=42)
y_signed = np.where(y == 1, 1, -1)  # map labels {0, 1} to {-1, +1}
X = np.c_[np.ones(len(X)), X]  # prepend a constant feature that acts as the bias

standard_w = np.zeros(X.shape[1])
weights_sum = np.zeros_like(standard_w)  # running sum of the weights after every example
count = 0
eta = 0.1

for epoch in range(5):
    for xi, yi in zip(X, y_signed):
        # Mistake-driven update: change the weights only when the example is misclassified.
        if yi * (standard_w @ xi) <= 0:
            standard_w += eta * yi * xi
        # The averaged variant makes the same updates; it only accumulates the weights.
        weights_sum += standard_w
        count += 1

final_avg_w = weights_sum / count
# Both predictors are scored on the training data.
std_preds = np.where((X @ standard_w) > 0, 1, -1)
avg_preds = np.where((X @ final_avg_w) > 0, 1, -1)

print('Standard perceptron accuracy:', accuracy_score(y_signed, std_preds))
print('Averaged perceptron accuracy:', accuracy_score(y_signed, avg_preds))


Standard perceptron accuracy: 0.79
Averaged perceptron accuracy: 0.81
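
The demo visits the examples in the same fixed order every epoch. As a rough sketch of how order can matter, the snippet below reshuffles a small separable dataset each epoch under a few seeds. The function `train_shuffled` and the `toy` points are hypothetical and written in plain Python for illustration; they are not part of the lecture code.

```python
import random


def train_shuffled(data, seed, max_epochs=1000):
    """Perceptron that draws a fresh random example order each epoch.

    Returns (weights, epochs_used); epochs_used is None if no clean pass occurred.
    """
    rng = random.Random(seed)
    examples = list(data)
    w = [0.0] * (len(examples[0][0]) + 1)
    for epoch in range(1, max_epochs + 1):
        rng.shuffle(examples)
        mistakes = 0
        for x, y in examples:
            xb = [1.0] + list(x)  # constant bias feature
            if y * sum(wi * xi for wi, xi in zip(w, xb)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:  # a full pass without updates
            return w, epoch
    return w, None


# Hypothetical toy data, chosen to be linearly separable.
toy = [((2.0, 3.0), 1), ((3.0, 3.0), 1), ((4.0, 2.0), 1),
       ((0.0, 1.0), -1), ((1.0, 0.0), -1), ((1.0, 1.0), -1)]

for seed in range(3):
    w, epochs = train_shuffled(toy, seed)
    print(f"seed={seed} weights={w} epochs={epochs}")
```

Because the data is separable, every order ends in a pass without mistakes, but the final weights and the number of epochs can differ from seed to seed.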


## Try it
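- Modify the demo: vary `eta`, the number of epochs, or the example order, and compare the two accuracies.
- Add a tiny dataset or counter-example, such as XOR, where no separating hyperplane exists.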
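
One possible counter-example is XOR, which no hyperplane separates. The sketch below is plain Python with hypothetical names (`mistakes_per_epoch`, `xor`); it records how many updates the perceptron makes in each pass over the four points.

```python
def mistakes_per_epoch(data, epochs=20, eta=1.0):
    """Run the perceptron for a fixed number of epochs and count updates per pass."""
    w = [0.0] * (len(data[0][0]) + 1)
    history = []
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            xb = [1.0] + list(x)  # constant bias feature
            if y * sum(wi * xi for wi, xi in zip(w, xb)) <= 0:
                w = [wi + eta * y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        history.append(mistakes)
    return history


# XOR with labels in {-1, +1}: not linearly separable.
xor = [((0.0, 0.0), -1), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), -1)]

history = mistakes_per_epoch(xor)
print(history)
```

Every entry in `history` is at least 1: a pass with no updates would mean the current weights classify all four points correctly, which is impossible for XOR.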


## References
- [Eisenstein 2.0-2.5, 4.2-4.4.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Perceptron and logistic regression](https://www.cs.utexas.edu/~gdurrett/courses/online-course/perc-lr-connections.pdf)
- [Eisenstein 4.1](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Thumbs up? Sentiment Classification using Machine Learning Techniques](https://www.aclweb.org/anthology/W02-1011/)
- [Baselines and Bigrams: Simple, Good Sentiment and Topic Classification](https://www.aclweb.org/anthology/P12-2018/)
- [Convolutional Neural Networks for Sentence Classification](https://www.aclweb.org/anthology/D14-1181/)
- [[GitHub] NLP Progress on Sentiment Analysis](https://github.com/sebastianruder/NLP-progress/blob/master/english/sentiment_analysis.md)


*Links only; we do not redistribute slides or papers.*