Gradient Descent Optimiser and Neural Network (Tensor-based Backpropagation)

This project contains a Python implementation of a gradient descent optimiser, a custom neural network framework, and a generalised Jacobian class for handling tensor derivatives. The framework is built from scratch on NumPy with tensor-calculus-based backpropagation, and is demonstrated by training on toy datasets such as a sine wave fit and a helix projection.


Features

1. gradient_descent.py

Implements the optimiser:

  • class GradientDescent(np.ndarray)
    • Preserves the shape of optimisation variables.
    • Stores metadata (gradient_current, gradient_previous, gradient_step, minima).
    • Supports momentum and noise injection.
  • gradient_descent_step(...) performs one update step (see the sketch after this list):
    • Learning rate control.
    • Descent or ascent.
    • Momentum support.
    • Optional Gaussian noise at fixed intervals.
  • gradient_descent_from_function(...) optimises any function with a known gradient:
    • Multiple random restarts.
    • Gradient clipping.
    • Convergence checks (gradient norm).
    • Maximisation as well as minimisation.
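
A minimal, self-contained sketch of this pattern is shown below. It is illustrative only: GDVariable and momentum_step are hypothetical stand-ins, not the repository's actual GradientDescent API.

  import numpy as np

  class GDVariable(np.ndarray):
      # Illustrative ndarray subclass that carries optimiser metadata with it.
      def __new__(cls, values):
          obj = np.asarray(values, dtype=float).view(cls)
          obj.gradient_current = np.zeros(obj.shape)
          obj.gradient_previous = np.zeros(obj.shape)
          obj.gradient_step = np.zeros(obj.shape)
          return obj

      def __array_finalize__(self, obj):
          if obj is None:
              return
          # Keep the metadata attached whenever NumPy creates a new view.
          self.gradient_current = getattr(obj, 'gradient_current', None)
          self.gradient_previous = getattr(obj, 'gradient_previous', None)
          self.gradient_step = getattr(obj, 'gradient_step', None)

  def momentum_step(x, grad, learning_rate=0.1, momentum=0.9,
                    ascent=False, noise_scale=0.0):
      # One descent (or ascent) update with momentum and optional Gaussian noise.
      sign = 1.0 if ascent else -1.0
      grad = np.asarray(grad, dtype=float)
      update = momentum * x.gradient_step + sign * learning_rate * grad
      if noise_scale > 0.0:
          update = update + np.random.normal(0.0, noise_scale, size=x.shape)
      x_new = GDVariable(np.asarray(x) + update)
      x_new.gradient_previous = x.gradient_current
      x_new.gradient_current = grad
      x_new.gradient_step = update
      return x_new

For example, repeatedly calling momentum_step on f(x) = ||x||² with grad = 2x drives the variable towards the origin, while its metadata records the previous gradient and step between calls.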

2. neural_network.py

Implements a general neural network pipeline:

  • Activation functions: sigmoid, relu, tanh, softmax, and identity, each with its corresponding derivative, accessible via the ActivationFunctions registry.
  • class TensorJacobian(np.ndarray)
    • Generalised Jacobian supporting arbitrary tensor derivatives.
    • Stores numerator and denominator tensor types.
    • Overloads multiplication to compose tensor derivatives via Einstein summation (einsum); a minimal sketch follows this list.
    • Handles higher-order tensor calculus automatically.
  • Forward propagation: arbitrary depth and width defined via layer_node_nums.
  • Backward propagation:
    • Derived from full tensor calculus (Jacobian compositions).
    • Computes gradients with respect to weights and biases (DEDWs, DEDbs).
    • Supports multiple hidden layers with arbitrary sizes.
  • Loss function: column-wise mean squared error
    E(Y, Ŷ) = (1 / (2m)) * Σᵢ ||yᵢ - ŷᵢ||²
    where the sum runs over the m columns, with its derivative included.
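
The core operation behind TensorJacobian's overloaded multiplication is the contraction of the shared indices of two derivatives, i.e. the chain rule written as an einsum. A minimal sketch with plain np.einsum (shapes chosen purely for illustration) is:

  import numpy as np

  # Chain rule dz/dx = (dz/dy)(dy/dx): contract over the shared y index.
  # Illustrative shapes: x in R^4, y in R^3, z in R^2.
  dz_dy = np.random.rand(2, 3)
  dy_dx = np.random.rand(3, 4)
  dz_dx = np.einsum('ij,jk->ik', dz_dy, dy_dx)    # shape (2, 4)

  # Higher-order case: a matrix-valued numerator Z of shape (2, 2),
  # still contracted over the shared y index.
  dZ_dy = np.random.rand(2, 2, 3)
  dZ_dx = np.einsum('abj,jk->abk', dZ_dy, dy_dx)  # shape (2, 2, 4)

The column-wise mean squared error and its derivative can likewise be written directly in NumPy; this is an illustrative sketch, not the module's exact signatures:

  def mse_loss(Y, Y_hat):
      # E(Y, Ŷ) = (1 / (2m)) * Σᵢ ||yᵢ - ŷᵢ||², summed over the m columns.
      m = Y.shape[1]
      return np.sum((Y - Y_hat) ** 2) / (2 * m)

  def mse_loss_derivative(Y, Y_hat):
      # dE/dŶ = (Ŷ - Y) / m, the derivative that seeds backpropagation.
      m = Y.shape[1]
      return (Y_hat - Y) / m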

3. neural_network_test.py

Demonstrates training the network on two toy problems:

  • Sine wave fitting (a data-generation sketch follows this list):
    • Random samples of sin(x) for x between 0 and 2π, rescaled to [0, 1].
    • Network: multilayer perceptron, e.g. [1, 6, 1].
    • Plots:
      • True sine curve (black).
      • Training samples (red points, smaller markers for clarity).
      • Intermediate predictions every 10% of training (low opacity lines).
      • Final prediction (green line).
    • Also plots training loss over iterations.
  • Helix projection fitting:
    • Function: f(t) = [t cos t, t sin t].
    • Network: [1, 10, 2].
    • Plots:
      • True XY helix projection (black).
      • Training samples (red points).
      • Intermediate predictions every 10% of training (low opacity lines).
      • Final NN prediction (green dashed line).
    • Also plots training loss.
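
For example, the training data for the two problems above can be generated along these lines. This is a sketch only: the exact sampling, ranges, and rescaling used in neural_network_test.py may differ, and here the sine targets in [-1, 1] are assumed to be the values rescaled to [0, 1].

  import numpy as np

  rng = np.random.default_rng(0)
  m = 200                                            # number of training samples (arbitrary)

  # Sine wave task for a [1, 6, 1] network: one input row, one target row.
  x = rng.uniform(0.0, 2.0 * np.pi, size=(1, m))     # inputs in [0, 2π]
  y = (np.sin(x) + 1.0) / 2.0                        # targets rescaled from [-1, 1] to [0, 1]

  # Helix projection task for a [1, 10, 2] network: f(t) = [t cos t, t sin t].
  t = rng.uniform(0.0, 2.0 * np.pi, size=(1, m))     # range of t chosen arbitrarily here
  helix = np.vstack([t * np.cos(t), t * np.sin(t)])  # two target rows per sample column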

Example Run

python neural_network_test.py

This will:

  • Initialise random neural networks.
  • Train on sine and helix datasets.
  • Output progress logs (iterations + loss).
  • Generate plots for true functions, training data, predictions, and training losses.

Tests

test_gradient_descent.py contains pytest-based unit tests:

  • Quartic function minimisation.
  • Simple quadratic minimum.
  • Higher-dimensional exponential minimisation.
  • Gradient ascent demonstration (maximisation).

Tolerance for all tests: 1e-8.

pytest
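
An illustrative pytest check in the same spirit is shown below. It is self-contained and uses a plain NumPy descent loop rather than the repository's gradient_descent_from_function, whose signature is not reproduced here.

  import numpy as np

  def test_quadratic_minimum_illustrative():
      # Minimise f(x) = ||x - 3||², gradient 2 * (x - 3); the minimum is at x = 3.
      x = np.array([0.0])
      learning_rate = 0.1
      for _ in range(10_000):
          grad = 2.0 * (x - 3.0)
          x = x - learning_rate * grad
      assert np.allclose(x, 3.0, atol=1e-8)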

Summary

This project demonstrates:

  • Subclassing NumPy arrays to store optimisation metadata.
  • Implementing gradient descent with momentum and noise.
  • Deriving a neural network training pipeline from first principles of tensor calculus.
  • Creating a generalised Jacobian (TensorJacobian) that supports higher-order tensor derivatives.
  • Training a custom neural network on both sine wave regression and helix projection.
  • Visualising training with snapshots of predictions and loss curves.
