Definition: The simplest type of binary linear classifier; single-layer neural network.

Key Idea: Finds a linear decision boundary by adjusting the weights each time a training sample is misclassified.

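Update Rule: $w \leftarrow w + \eta\,(y - \hat{y})\,x$, where $\eta$ is the learning rate, $y$ the true label, and $\hat{y}$ the predicted label.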

Pros: Simple, fast, works for linearly separable data.

Cons: Cannot learn non-linear decision boundaries; does not converge when the classes are not linearly separable; sensitive to learning rate.

When to Use: Binary classification, small/linearly separable datasets.

Key Parameters:

eta0 → learning rate

max_iter → maximum iterations

fit_intercept → add bias term
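The update rule above can be sketched in a few lines of NumPy. This is a hand-rolled illustration, not scikit-learn's implementation: the function names `train_perceptron` and `predict_perceptron` are hypothetical, labels are assumed to be in {0, 1}, and the arguments only borrow the names `eta0`, `max_iter`, and `fit_intercept` from the parameter list.

```python
import numpy as np

def train_perceptron(X, y, eta0=1.0, max_iter=100, fit_intercept=True):
    """Train a perceptron on labels in {0, 1}; returns (w, b)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_iter):
        errors = 0
        for xi, yi in zip(X, y):
            y_hat = int(w @ xi + b > 0)   # step activation
            delta = eta0 * (yi - y_hat)   # zero when the sample is classified correctly
            if delta != 0:
                w += delta * xi           # w <- w + eta * (y - y_hat) * x
                if fit_intercept:
                    b += delta            # bias is updated like a weight on a constant 1
                errors += 1
        if errors == 0:                   # a full pass without mistakes: stop early
            break
    return w, b

def predict_perceptron(X, w, b):
    return (np.asarray(X, dtype=float) @ w + b > 0).astype(int)

# Toy, linearly separable data: logical AND
X_and = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])
w_and, b_and = train_perceptron(X_and, y_and)
pred_and = predict_perceptron(X_and, w_and, b_and)
```

On data that is not linearly separable (XOR, for example) the inner loop keeps making mistakes, so training only stops when `max_iter` is reached.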

Adaline / Madaline

Definition: Adaline = Adaptive Linear Neuron; Madaline = Multiple Adaline units.

Key Idea: Uses continuous activation (linear output) and MSE loss to update weights.

Pros: Uses gradient descent → smoother convergence than perceptron.

Cons: Only linear decision boundary; Madaline is more complex to train.

When to Use: Regression or classification problems that a linear model can fit (e.g., linearly separable classes).

Key Parameters:

learning_rate

epochs / max_iter
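As a rough illustration of the Adaline idea (linear output, MSE loss, gradient descent), here is a minimal NumPy sketch. It assumes batch gradient descent and labels in {-1, +1}; the helper names `train_adaline` and `predict_adaline` are made up for this example, and only `learning_rate` and `epochs` come from the parameter list.

```python
import numpy as np

def train_adaline(X, y, learning_rate=0.01, epochs=50):
    """Batch gradient descent on the MSE of the linear output; labels in {-1, +1}."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    w = np.zeros(X.shape[1])
    b = 0.0
    mse_history = []
    for _ in range(epochs):
        output = X @ w + b                    # linear activation, no thresholding
        error = y - output
        mse_history.append(float((error ** 2).mean()))
        # Gradient of mean((y - output)^2) is -2 * X^T error / n, so step the other way
        w += learning_rate * 2.0 * (X.T @ error) / n
        b += learning_rate * 2.0 * error.mean()
    return w, b, mse_history

def predict_adaline(X, w, b):
    # Thresholding happens only at prediction time
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0.0, 1, -1)

X_toy = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y_toy = np.array([-1, -1, 1, 1])
w_ada, b_ada, mse_history = train_adaline(X_toy, y_toy, learning_rate=0.1, epochs=50)
pred_ada = predict_adaline(X_toy, w_ada, b_ada)
```

Unlike the perceptron, the weights keep changing even when every sample is already on the correct side, because the update is driven by the continuous error rather than by misclassifications.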

Nearest Centroid Classifier

Definition: Distance-based classifier; assigns sample to class with closest centroid.

Key Idea: Computes the mean feature vector (centroid) of each class, then predicts the class whose centroid is nearest to the sample.

Pros: Simple, fast, interpretable. Works for well-separated clusters.

Cons: Sensitive to outliers, poor performance on overlapping classes.

When to Use: High-dimensional data, quick baseline model, linearly separable clusters.

Key Parameters:

metric → distance metric (Euclidean, Manhattan)

shrink_threshold → threshold for shrinking class centroids toward the overall centroid, which suppresses noisy features
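The nearest-centroid idea can be sketched directly in NumPy. The sketch below assumes the Euclidean metric and no centroid shrinkage; `fit_centroids` and `predict_nearest_centroid` are hypothetical helper names, and the data is a made-up toy set.

```python
import numpy as np

def fit_centroids(X, y):
    """Return the sorted class labels and one mean feature vector per class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    centroids = np.vstack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_nearest_centroid(X, classes, centroids):
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n_samples, n_classes)
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                    [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
classes, centroids = fit_centroids(X_train, y_train)
pred_nc = predict_nearest_centroid([[0.5, 0.5], [5.5, 5.5]], classes, centroids)
```

Because each class is summarized by a single mean, one extreme outlier can drag a centroid away from the bulk of its class, which is the sensitivity noted under Cons.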

Ridge Classifier

Definition: Linear classifier using Ridge regression; converts labels to {-1,1} and fits linear model.

Key Idea: L2 regularization penalizes large weights, which reduces overfitting.

Pros: Works well for high-dimensional data, reduces variance.

Cons: Only linear decision boundary.

When to Use: High-dimensional data, text classification, regularization needed.

Key Parameters:

alpha → regularization strength

fit_intercept → include bias term
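A minimal sketch of the ridge-classifier idea for the binary case, assuming labels in {0, 1} that are mapped to {-1, +1} and the closed-form ridge solution. The function names are hypothetical and this is not scikit-learn's implementation; only `alpha` and `fit_intercept` mirror the parameter list.

```python
import numpy as np

def fit_ridge_classifier(X, y, alpha=1.0, fit_intercept=True):
    """Binary ridge classifier sketch: regress on +/-1 targets with an L2 penalty."""
    X = np.asarray(X, dtype=float)
    t = np.where(np.asarray(y) == 1, 1.0, -1.0)      # labels {0, 1} -> targets {-1, +1}
    if fit_intercept:
        x_mean, t_mean = X.mean(axis=0), t.mean()    # centering keeps the bias unpenalized
    else:
        x_mean, t_mean = np.zeros(X.shape[1]), 0.0
    Xc, tc = X - x_mean, t - t_mean
    # Closed-form ridge solution: (Xc^T Xc + alpha * I) w = Xc^T tc
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ tc)
    b = t_mean - x_mean @ w
    return w, b

def predict_ridge_classifier(X, w, b):
    # The sign of the regression output decides the class
    return (np.asarray(X, dtype=float) @ w + b > 0).astype(int)

X_rc = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                 [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y_rc = np.array([0, 0, 0, 1, 1, 1])
w_weak, b_weak = fit_ridge_classifier(X_rc, y_rc, alpha=1.0)
w_strong, b_strong = fit_ridge_classifier(X_rc, y_rc, alpha=100.0)
pred_rc = predict_ridge_classifier(X_rc, w_weak, b_weak)
```

Comparing `w_weak` and `w_strong` shows the effect of `alpha`: a larger value shrinks the weight vector toward zero.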

Passive Aggressive Classifier

Definition: Online learning algorithm; updates weights only when a sample is misclassified or falls inside the margin (non-zero hinge loss).

Key Idea: Passive if the sample is classified correctly with sufficient margin; aggressive otherwise → makes the smallest weight change that corrects the error.

Pros: Good for large-scale online learning, fast.

Cons: Sensitive to outliers; may oscillate on noisy data.

When to Use: Online/streaming data, large datasets.

Key Parameters:

C → regularization parameter (maximum step size; larger values give more aggressive updates)

loss → hinge, squared_hinge

max_iter
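The passive/aggressive behavior can be illustrated with a single-sample update. The sketch below assumes the PA-I variant (hinge loss, step size capped by `C`) and labels in {-1, +1}; `pa_update` is a hypothetical helper name and the three-sample stream is made up.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One PA-I step for a label y in {-1, +1}; returns the (possibly unchanged) weights."""
    loss = max(0.0, 1.0 - y * float(w @ x))   # hinge loss on this sample
    sq_norm = float(x @ x)
    if loss == 0.0 or sq_norm == 0.0:
        return w                              # passive: margin already satisfied
    tau = min(C, loss / sq_norm)              # aggressive step size, capped by C
    return w + tau * y * x

w_pa = np.zeros(2)
stream = [
    (np.array([1.0, 2.0]), 1),
    (np.array([2.0, -1.0]), -1),
    (np.array([1.5, 2.5]), 1),   # already classified with margin >= 1, so no update
]
for x_t, y_t in stream:
    w_pa = pa_update(w_pa, x_t, y_t)
```

The cap `C` is what limits the damage from a mislabeled or outlying sample: without it, a single bad point could move the weights arbitrarily far.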

SGD Classifier / Regressor

Definition: Stochastic Gradient Descent (SGD) algorithm for linear models; supports classification & regression.

Key Idea: Updates weights incrementally per sample using gradient of the loss.

Pros: Very fast on large datasets, supports online learning, flexible (can implement SVM, logistic regression, etc.).

Cons: Needs careful tuning of learning rate, sensitive to feature scaling.

When to Use: Very large datasets, online learning, sparse data.

Key Parameters:

loss → hinge, log_loss (named log in older scikit-learn releases), squared_error (formerly squared_loss), epsilon_insensitive

penalty → l2, l1, elasticnet

alpha → regularization strength

learning_rate → schedule: constant, optimal, invscaling, adaptive

max_iter → maximum number of passes over the training data (epochs)
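To show how these parameters fit together, here is a short scikit-learn sketch. The data is synthetic and purely illustrative (two Gaussian blobs with deliberately mismatched feature scales), and the specific parameter values are assumptions rather than recommendations. `StandardScaler` is included because of the sensitivity to feature scaling noted under Cons.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two classes whose features live on very different scales (synthetic data)
X0 = rng.normal(loc=[0.0, 0.0], scale=[1.0, 100.0], size=(200, 2))
X1 = rng.normal(loc=[4.0, 400.0], scale=[1.0, 100.0], size=(200, 2))
X_sgd = np.vstack([X0, X1])
y_sgd = np.array([0] * 200 + [1] * 200)

clf = make_pipeline(
    StandardScaler(),                 # put both features on a comparable scale
    SGDClassifier(
        loss="hinge",                 # hinge loss gives a linear SVM
        penalty="l2",
        alpha=1e-4,                   # regularization strength
        learning_rate="optimal",
        max_iter=1000,                # upper bound on passes over the data
        tol=1e-3,
        random_state=0,
    ),
)
clf.fit(X_sgd, y_sgd)
train_accuracy = clf.score(X_sgd, y_sgd)
```

For streaming data, `SGDClassifier.partial_fit` can be called on successive mini-batches instead of `fit`; the first call needs the full list of labels via its `classes` argument.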

| Classifier                 | Type            | Pros                                       | Cons                          | Use Case                                   |
| -------------------------- | --------------- | ------------------------------------------ | ----------------------------- | ------------------------------------------ |
| Perceptron                 | Linear          | Simple, fast                               | Only linearly separable       | Small binary classification                |
| Adaline / Madaline         | Linear          | Smooth convergence                         | Linear only, Madaline complex | Regression/classification                  |
| Nearest Centroid           | Distance-based  | Fast, interpretable                        | Sensitive to outliers         | Well-separated clusters, baseline          |
| Ridge Classifier           | Linear + L2     | Handles high-dim data, reduces overfitting | Linear only                   | Text classification, high-dimensional data |
| Passive Aggressive         | Online          | Fast, online learning                      | Sensitive to noise            | Streaming/large datasets                   |
| SGD Classifier / Regressor | Online / Linear | Fast, scalable, flexible                   | Needs tuning, feature scaling | Large/streaming datasets                   |
