
This project contains a Python implementation of a gradient descent optimiser, a fully custom neural network framework, and a generalised Jacobian class for handling tensor derivatives. The framework is built from scratch using NumPy, with tensor-calculus-based backpropagation, and demonstrates training on toy datasets such as fitting a sine wave and a helix projection.
Implements the optimiser:

- `class GradientDescent(np.ndarray)`
  - Preserves the shape of optimisation variables.
  - Stores metadata (`gradient_current`, `gradient_previous`, `gradient_step`, `minima`).
  - Supports momentum and noise injection.
- `gradient_descent_step(...)` performs one update step (a minimal sketch follows this list):
  - Learning rate control.
  - Descent or ascent.
  - Momentum support.
  - Optional Gaussian noise at fixed intervals.
- `gradient_descent_from_function(...)` optimises any function with a known gradient:
  - Multiple random restarts.
  - Gradient clipping.
  - Convergence checks (gradient norm).
  - Maximisation as well as minimisation.
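
For flavour, here is a minimal sketch of a momentum update with optional Gaussian noise. It mirrors the idea behind `gradient_descent_step`, but the argument names and defaults are illustrative assumptions, not the project's exact signature:

```python
import numpy as np

def gradient_descent_step(x, grad, prev_step, learning_rate=0.01,
                          momentum=0.9, maximise=False,
                          noise_scale=0.0, rng=None):
    """One illustrative update: step = momentum * previous step -/+ lr * gradient."""
    direction = 1.0 if maximise else -1.0
    step = momentum * prev_step + direction * learning_rate * grad
    if noise_scale > 0.0:
        rng = rng or np.random.default_rng()
        step = step + rng.normal(scale=noise_scale, size=np.shape(x))
    return x + step, step

# Example: minimise f(x) = ||x||^2, whose gradient is 2x
x, step = np.array([3.0, -2.0]), np.zeros(2)
for _ in range(200):
    x, step = gradient_descent_step(x, 2 * x, step, learning_rate=0.05)
```
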
Implements a general neural network pipeline:

- Activation functions: `sigmoid`, `relu`, `tanh`, `softmax`, `identity`, each with its corresponding derivative, accessible via the `ActivationFunctions` registry.
- `class TensorJacobian(np.ndarray)`
  - Generalised Jacobian supporting arbitrary tensor derivatives.
  - Stores numerator and denominator tensor types.
  - Overloads multiplication to compose tensor derivatives via Einstein summation (`einsum`); see the sketch after this list.
  - Handles higher-order tensor calculus automatically.
- Forward propagation: arbitrary depth and width defined via `layer_node_nums` (a minimal sketch follows this list).
- Backward propagation:
  - Derived from full tensor calculus (Jacobian compositions).
  - Computes gradients with respect to weights and biases (`DEDWs`, `DEDbs`).
  - Supports multiple hidden layers with arbitrary sizes.
- Loss function: column-wise mean squared error

  E(Y, Yhat) = (1 / (2m)) * Σᵢ ||yᵢ − ŷᵢ||²

  with its derivative included.
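
A minimal sketch of what the activation registry and forward pass look like in practice; the registry layout, helper names, and random initialisation below are assumptions (and `softmax` is omitted for brevity):

```python
import numpy as np

def _sigmoid(z):
    return 1 / (1 + np.exp(-z))

# A minimal registry pairing each activation with its derivative.
ActivationFunctions = {
    "sigmoid":  (_sigmoid, lambda z: _sigmoid(z) * (1 - _sigmoid(z))),
    "relu":     (lambda z: np.maximum(z, 0.0), lambda z: (z > 0).astype(float)),
    "tanh":     (np.tanh,                      lambda z: 1 - np.tanh(z) ** 2),
    "identity": (lambda z: z,                  lambda z: np.ones_like(z)),
}

def forward(X, weights, biases, activation="sigmoid"):
    """Propagate a (features x samples) batch through layers of arbitrary width."""
    act, _ = ActivationFunctions[activation]
    A = X
    for W, b in zip(weights, biases):
        A = act(W @ A + b)
    return A

# layer_node_nums = [1, 6, 1]: one input, a hidden layer of six, one output
layer_node_nums = [1, 6, 1]
rng = np.random.default_rng(0)
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(layer_node_nums[:-1], layer_node_nums[1:])]
biases = [rng.standard_normal((n, 1)) for n in layer_node_nums[1:]]
Yhat = forward(rng.uniform(0.0, 2 * np.pi, (1, 50)), weights, biases)
```

And a sketch of the tensor-calculus backbone: composing a higher-order Jacobian with the loss derivative via `np.einsum`, which is the operation that `TensorJacobian`'s multiplication overload automates. The single-layer setup and names are illustrative assumptions; the assertion at the end confirms the contraction matches the closed-form gradient:

```python
import numpy as np

def mse(Y, Yhat):
    """Column-wise MSE: E(Y, Yhat) = (1 / (2m)) * sum_i ||y_i - yhat_i||^2."""
    return np.sum((Y - Yhat) ** 2) / (2 * Y.shape[1])

def mse_derivative(Y, Yhat):
    """dE/dYhat, same shape as Yhat."""
    return (Yhat - Y) / Y.shape[1]

rng = np.random.default_rng(0)
n_in, n_out, m = 3, 2, 5
X = rng.standard_normal((n_in, m))
W = rng.standard_normal((n_out, n_in))
Y = rng.standard_normal((n_out, m))
Yhat = W @ X                                  # one linear layer, bias omitted

# The Jacobian of Yhat with respect to W is a 4th-order tensor:
# dYhat[p, q] / dW[r, s] = delta_{p r} * X[s, q]
dYhat_dW = np.einsum('pr,sq->pqrs', np.eye(n_out), X)

# Chain rule as an Einstein summation over the shared indices (p, q):
# dE/dW[r, s] = sum_{p, q} dE/dYhat[p, q] * dYhat[p, q] / dW[r, s]
dE_dW = np.einsum('pq,pqrs->rs', mse_derivative(Y, Yhat), dYhat_dW)

# Sanity check against the closed-form gradient of a linear least-squares layer
assert np.allclose(dE_dW, (Yhat - Y) @ X.T / m)
```
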
Demonstrates training the network on two toy problems:

- Sine wave fitting:
  - Random samples of sin(x) between 0 and 2π, rescaled to [0, 1] (see the dataset sketch after this list).
  - Network: multilayer perceptron, e.g. `[1, 6, 1]`.
  - Plots:
    - True sine curve (black).
    - Training samples (red points, smaller markers for clarity).
    - Intermediate predictions every 10% of training (low-opacity lines).
    - Final prediction (green line).
  - Also plots training loss over iterations.
- Helix projection fitting:
  - Function: f(t) = [t cos t, t sin t].
  - Network: `[1, 10, 2]`.
  - Plots:
    - True XY helix projection (black).
    - Training samples (red points).
    - Intermediate predictions every 10% of training (low-opacity lines).
    - Final NN prediction (green dashed line).
  - Also plots training loss.
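
The toy datasets are simple to reproduce; a minimal version (sample counts, ranges, and the exact rescaling here are assumptions, see the script for the real values) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sine wave: random samples of sin(x) on [0, 2*pi], rescaled to [0, 1]
x = rng.uniform(0.0, 2 * np.pi, size=(1, 200))
y = (np.sin(x) + 1.0) / 2.0                        # map [-1, 1] onto [0, 1]

# Helix projection: f(t) = [t cos t, t sin t]
t = rng.uniform(0.0, 2 * np.pi, size=(1, 200))
helix = np.vstack([t * np.cos(t), t * np.sin(t)])  # shape (2, 200), XY projection
```
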
Run the demo script:

```
python neural_network_test.py
```

This will:
- Initialise random neural networks.
- Train on sine and helix datasets.
- Output progress logs (iterations + loss).
- Generate plots for true functions, training data, predictions, and training losses.

`test_gradient_descent.py` contains pytest-based unit tests:
- Quartic function minimisation.
- Simple quadratic minimum.
- Higher-dimensional exponential minimisation.
- Gradient ascent demonstration (maximisation).
Tolerance for all tests: 1e-8.
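
For flavour, a self-contained pytest test in the same spirit (a generic illustration with an inline descent loop, not the repository's actual test code, which calls `gradient_descent_from_function`):

```python
import numpy as np

def test_quadratic_minimum():
    """Gradient descent on f(x) = ||x - 3||^2 should reach the minimum at x = 3."""
    grad = lambda x: 2 * (x - 3.0)            # gradient of the quadratic
    x = np.zeros(2)                            # starting point
    for _ in range(2000):
        x = x - 0.1 * grad(x)                  # plain descent step, learning rate 0.1
    assert np.allclose(x, 3.0, atol=1e-8)      # same tolerance as the real tests
```
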
Run the tests with:

```
pytest
```

This project demonstrates:
- Subclassing NumPy arrays to store optimisation metadata.
- Implementing gradient descent with momentum and noise.
- Deriving a neural network training pipeline from first principles of tensor calculus.
- Creating a generalised Jacobian (`TensorJacobian`) that supports higher-order tensor derivatives.
- Training a custom neural network on both sine wave regression and helix projection.
- Visualising training with snapshots of predictions and loss curves.