
cpp-linear-regression

This project implements multiple linear regression in C++ from scratch, featuring both gradient descent and the analytical normal equation solution.


What This Project Does

This project implements linear regression two ways:

  • Gradient Descent — iteratively nudges weights toward lower loss
  • Normal Equation — computes the exact optimal weights in one shot using linear algebra

Both methods are implemented without any ML libraries. Everything from matrix operations to normalization is written from scratch in C++.


The Math

Model

Given a sample with features $x_1, x_2, \ldots, x_k$, the model predicts:

$$\hat{y} = w_1 x_1 + w_2 x_2 + \cdots + w_k x_k + b$$

Loss Function (MSE)

$$\mathcal{L} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Our goal is to minimize this loss function.

Gradient Descent

Weights are updated each epoch by computing the gradient of the loss:

$$w_j \leftarrow w_j + \frac{\alpha}{n} \sum_{i=1}^{n} x_{ij}(y_i - \hat{y}_i)$$

where $\alpha$ is the learning rate; the constant factor of 2 from the derivative is absorbed into $\alpha$.

This update is repeated for a fixed number of epochs, or until the loss converges. Each step moves the weights in the direction of the negative gradient, steadily reducing the loss.

Normal Equation (Analytical Solution)

Instead of iterating, we can solve for the exact optimal weights directly using the normal equation.

Derivation

To find the optimal weights $\mathbf{w}$ that minimize the Mean Squared Error (MSE) loss, we take the derivative of the matrix loss function with respect to $\mathbf{w}$ and set it to zero.

The MSE loss function in matrix form is: $$\mathcal{L}(\mathbf{w}) = \frac{1}{n} (\mathbf{X}\mathbf{w} - \mathbf{y})^T (\mathbf{X}\mathbf{w} - \mathbf{y})$$

Expanding the expression: $$\mathcal{L}(\mathbf{w}) = \frac{1}{n} (\mathbf{w}^T\mathbf{X}^T\mathbf{X}\mathbf{w} - 2\mathbf{w}^T\mathbf{X}^T\mathbf{y} + \mathbf{y}^T\mathbf{y})$$

Taking the gradient with respect to $\mathbf{w}$ and setting it to zero: $$\nabla_{\mathbf{w}} \mathcal{L} = \frac{2}{n} (\mathbf{X}^T\mathbf{X}\mathbf{w} - \mathbf{X}^T\mathbf{y}) = 0$$

$$\mathbf{X}^T\mathbf{X}\mathbf{w} = \mathbf{X}^T\mathbf{y}$$

Multiplying both sides by the inverse of $(\mathbf{X}^T\mathbf{X})$ yields the Normal Equation: $$\mathbf{w} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

where $\mathbf{X}$ is the data matrix with a prepended column of 1s (for the bias term). Computing this directly requires a matrix inverse using Gauss-Jordan elimination, which is $O(F^3)$ in time complexity for $F$ features.

Evaluation: R-Squared

$$R^2 = 1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}$$

An $R^2$ close to 1.0 means the model explains nearly all variance in the data.


Project Structure

```
cpp-linear-regression/
├── data/
├── includes/
│   ├── regression.hpp      ← LinearRegression class
│   ├── normalizer.hpp      ← Z-score normalization
│   └── matrix.hpp          ← Matrix class with inverse
├── src/
│   ├── regression.cc
│   ├── normalizer.cc
│   ├── matrix.cc           ← Gauss-Jordan elimination
│   └── driver.cc           ← Training and evaluation
└── Makefile
```

Classes

LinearRegression

| Method | Description |
| --- | --- |
| predict(x) | Dot product of weights and input + bias |
| computeLoss(X, y) | Mean squared error |
| train(X, y, lr, epochs) | Gradient descent training loop |
| trainAnalytical(X, y) | Normal equation — exact closed-form solution |
| computeR2(X, y) | R-squared metric |
| getWeights(), getBias() | Accessors |

Normalizer

Implements the fit/transform pattern to prevent data leakage.

| Method | Description |
| --- | --- |
| fit(X) | Compute mean and std from training data only |
| transform(X) | Z-score normalize a full dataset |
| transformSingle(x) | Normalize a single sample at inference time |

Z-score formula: $x_{norm} = \frac{x - \mu}{\sigma}$

Matrix

Full matrix class supporting the operations needed for the normal equation.

| Method | Description |
| --- | --- |
| multiply(B) | Matrix multiplication |
| transpose() | Flip rows and columns |
| inverse() | Gauss-Jordan elimination on augmented matrix $[A \mid I]$ |
| add(B) | Element-wise addition |
| scalarMultiply(s) | Scale all elements |

Results

Synthetic Data

Trained on $y = 3x_1 + 2x_2 + 1 + \epsilon$ (Gaussian noise), 80/20 train/test split:

| Method | Test MSE | Test R² |
| --- | --- | --- |
| Gradient Descent | ~0.19 | 0.996 |
| Normal Equation | ~0.19 | 0.996 |

Both methods converge to the same solution — the normal equation just reaches it in a single solve instead of many epochs.


Build & Run

```sh
make
./regression
```

Requires g++ with C++17 support.


Key Concepts Learned

  • Multiple linear regression with vectorized operations
  • Feature normalization and the fit/transform pattern to avoid data leakage
  • Gradient descent with batch updates
  • Matrix inverse via Gauss-Jordan elimination
  • Closed-form solution vs iterative optimization — same result, different paths
  • Train/test split and R² evaluation

Part of a Series

This project is part of a series building ML components from scratch in C++:

  • cpp-perceptron — single-layer perceptron trained on UCI Iris
  • cpp-linear-regression ← you are here
  • cpp-logistic-regression — logistic regression with gradient descent and normal equation
  • cpp-neural-network — neural network with backpropagation and gradient descent

