SuchetBhalla/flux

Overview

This repository is a machine learning framework for training Multi-Layer Perceptrons, built from scratch using C++20.

The goal is to explore the systems-level design of automatic differentiation engines, with an emphasis on memory layout, data ownership & performance.

No external ML, autodiff or tensor libraries are used.

Design Focus

The implementation emphasizes:

  • aligned, contiguous memory layouts to improve cache locality
  • SIMD (AVX2) intrinsics for compute-heavy kernels
  • explicit ownership and lifetime management of tensors
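The first two points can be illustrated with a minimal sketch. It is not the repository's code: `kAlign`, `AlignedBuffer`, `make_aligned` and `add` are hypothetical names, and the choice of 32-byte alignment assumes 256-bit registers. The vector path is compiled only when the compiler defines `__AVX2__` (for example with `-mavx2`); otherwise the scalar loop handles every element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical names throughout; this is an illustration, not the engine's API.
inline constexpr std::size_t kAlign = 32;  // one 256-bit register = 32 bytes

// Deleter that matches the aligned form of operator new[] used below.
struct AlignedDelete {
    void operator()(float* p) const noexcept {
        ::operator delete[](p, std::align_val_t{kAlign});
    }
};
using AlignedBuffer = std::unique_ptr<float[], AlignedDelete>;

// Allocates n contiguous floats at a 32-byte-aligned address and zeroes them.
inline AlignedBuffer make_aligned(std::size_t n) {
    auto* p = static_cast<float*>(
        ::operator new[](n * sizeof(float), std::align_val_t{kAlign}));
    for (std::size_t i = 0; i < n; ++i) p[i] = 0.0f;
    return AlignedBuffer{p};
}

// out[i] = a[i] + b[i]. Precondition: all three pointers are 32-byte aligned
// (for example, obtained from make_aligned), since aligned loads are used.
inline void add(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX2__)
    // Eight floats per iteration; i stays a multiple of 8, so a + i is aligned.
    for (; i + 8 <= n; i += 8) {
        const __m256 va = _mm256_load_ps(a + i);
        const __m256 vb = _mm256_load_ps(b + i);
        _mm256_store_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    // Scalar tail (or the whole range when the vector path is not compiled).
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

Ownership is explicit here as well: the `std::unique_ptr` with a custom deleter ties the aligned allocation to exactly one owner and frees it with the matching aligned `operator delete[]`.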

Architecture

  • engine::Tensor owns aligned, contiguous storage for numerical data
  • Backpropagation is performed using reverse-mode automatic differentiation over a thread-local operation tape rather than an explicit computation graph
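To make the tape idea concrete, here is a minimal scalar sketch of reverse-mode differentiation over a thread-local tape. It is an assumption about the general technique, not the repository's implementation: `Node`, `Var`, `tape`, `leaf`, `record` and `backward` are hypothetical names, and scalars stand in for tensors. Each operation appends one record to the tape; the backward pass walks the tape once in reverse, so no separate graph structure has to be built or topologically sorted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar tape; the real engine operates on tensors.
struct Node {
    double value = 0.0;
    double grad = 0.0;
    std::size_t parent[2] = {0, 0};  // tape indices of the operands
    double partial[2] = {0.0, 0.0};  // d(this)/d(parent[k])
    int arity = 0;                   // 0 for leaves, 2 for binary ops
};

// One tape per thread: recording needs no locks.
inline std::vector<Node>& tape() {
    thread_local std::vector<Node> t;
    return t;
}

// A Var is only an index into the tape.
struct Var { std::size_t id; };

inline double value(Var x) { return tape()[x.id].value; }
inline double grad(Var x) { return tape()[x.id].grad; }

inline Var leaf(double v) {
    Node n;
    n.value = v;
    tape().push_back(n);
    return Var{tape().size() - 1};
}

// Appends a binary operation with its local partial derivatives.
inline Var record(double v, Var a, double da, Var b, double db) {
    Node n;
    n.value = v;
    n.parent[0] = a.id; n.partial[0] = da;
    n.parent[1] = b.id; n.partial[1] = db;
    n.arity = 2;
    tape().push_back(n);
    return Var{tape().size() - 1};
}

inline Var operator+(Var a, Var b) {
    return record(value(a) + value(b), a, 1.0, b, 1.0);
}
inline Var operator*(Var a, Var b) {
    return record(value(a) * value(b), a, value(b), b, value(a));
}

// Reverse sweep: operands always precede their result on the tape, so
// iterating from out.id down to 0 visits nodes in a valid reverse order.
inline void backward(Var out) {
    auto& t = tape();
    for (auto& n : t) n.grad = 0.0;
    t[out.id].grad = 1.0;
    for (std::size_t i = out.id + 1; i-- > 0;) {
        const Node& n = t[i];
        for (int k = 0; k < n.arity; ++k)
            t[n.parent[k]].grad += n.partial[k] * n.grad;
    }
}
```

For example, recording `z = x * y + x` with `x = 2` and `y = 3` and then calling `backward(z)` accumulates `dz/dx = y + 1` and `dz/dy = x` on the two leaves.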

Demo: MNIST

Two reference training runs are provided.

cmake -S . -B build-release -DCMAKE_BUILD_TYPE=Release
cmake --build build-release
./build-release/cvg
./build-release/mnist

The first executable, cvg, serves as a functional and performance sanity check for the autodiff engine. The second, mnist, trains a 2-layer MLP on MNIST.

Performance & Tooling

Performance analysis was conducted using the Intel® VTune™ Profiler.

With single-threaded execution:

  • Physical core utilization: 94.6%
  • CPI Rate: 0.697

Results and conclusions are documented in docs/PERFORMANCE.md.

Versioning

v1 (this):

  • Uses std::shared_ptr for managing lifetimes of engine::Tensor
  • Simplifies reasoning about correctness, at the cost of atomic reference-count updates
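The trade-off in v1 can be sketched as follows. The `Tensor` struct, `TensorPtr`, `AddOp` and `make_add` are hypothetical stand-ins, not the engine's types. The point is that every copy of a `std::shared_ptr` updates a shared reference count, and standard library implementations typically perform that update atomically in multithreaded programs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for engine::Tensor.
struct Tensor {
    std::vector<float> data;
};
using TensorPtr = std::shared_ptr<Tensor>;

// A recorded operation keeps its operands alive by holding shared_ptr copies.
struct AddOp {
    TensorPtr lhs, rhs, out;
};

inline AddOp make_add(const TensorPtr& a, const TensorPtr& b) {
    assert(a->data.size() == b->data.size());
    auto out = std::make_shared<Tensor>();
    out->data.resize(a->data.size());
    for (std::size_t i = 0; i < a->data.size(); ++i)
        out->data[i] = a->data[i] + b->data[i];
    // Copying a and b bumps two reference counts; out is moved, so its
    // count is left unchanged.
    return AddOp{a, b, std::move(out)};
}
```

Lifetimes are easy to reason about in this style, because an operand cannot be destroyed while any recorded operation still refers to it. The price is one reference-count update per copy and per destruction.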

v2 (next):

  • Arena-based allocation for engine::Tensor objects
  • Elimination of the false sharing caused by reference-count contention under parallel execution
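Since v2 is described only as a plan, the following is a speculative sketch of what arena-based allocation could look like, not the intended design. `Arena` and `TensorMeta` are hypothetical names. Objects are carved out of one pre-allocated block by bumping an offset and are released together with `reset()`, so there is no per-object reference count to update.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical bump allocator: one block, one offset, bulk release.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : buf_(new std::byte[bytes]), cap_(bytes) {}

    // Constructs a T in the arena. Restricted to trivially destructible
    // types because reset() never runs destructors.
    template <class T, class... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible_v<T>,
                      "Arena::reset() does not call destructors");
        void* p = buf_.get() + used_;
        std::size_t space = cap_ - used_;
        // std::align moves p up to the next suitably aligned address and
        // shrinks space accordingly, or returns nullptr if T does not fit.
        if (!std::align(alignof(T), sizeof(T), p, space)) throw std::bad_alloc{};
        used_ = (cap_ - space) + sizeof(T);
        return ::new (p) T{std::forward<Args>(args)...};
    }

    // Releases everything at once; previously returned pointers become invalid.
    void reset() noexcept { used_ = 0; }

    std::size_t used() const noexcept { return used_; }

private:
    std::unique_ptr<std::byte[]> buf_;
    std::size_t cap_;
    std::size_t used_ = 0;
};

// Hypothetical payload used only to exercise the arena.
struct TensorMeta {
    int rows;
    int cols;
};
```

With this scheme, object lifetime is tied to the arena rather than to individual handles, which is one way to avoid shared counters that several threads would otherwise write to.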

Acknowledgements

  1. Conceptual inspiration from micrograd.
  2. The C++ Programming Language by Bjarne Stroustrup (4th Edition)
    • All page-number references (such as p. xxx) are to this book
  3. Simple C++ reader for MNIST dataset
  4. xoshiro256+
  5. Intel Intrinsics Guide

About

SIMD autograd for MLPs in C++
