# NoProp: Community Implementation
Hyungon Ryu | Sr. Solution Architect | NVIDIA AI Technology Center Korea

A community-driven reference implementation of the NoProp method described in Li et al., "NoProp: Training Neural Networks without Back-propagation or Forward-propagation" ([arXiv:2503.24322v1](https://arxiv.org/html/2503.24322v1)).

## Overview

NoProp is a novel approach for training neural networks without relying on standard back-propagation or forward-propagation steps. This repository provides:
 - Discrete-Time (DT) and Continuous-Time (CT) implementations.
 - Support for benchmark image classification tasks (MNIST, CIFAR-10, CIFAR-100).
 - Scripts for training, evaluation, and visualization of results.
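
To make the discrete-time / continuous-time distinction concrete, here is a minimal pure-Python sketch of how a training step might pick its time index and build a noisy target. It is an illustration under stated assumptions, not the repository's code: the function names, the linear signal level `signal = t`, and the variance-preserving mixing of a one-hot label embedding with Gaussian noise are all hypothetical choices made for the example.

```python
import math
import random

def sample_time_discrete(num_steps, rng):
    """DT setting: pick one of the T blocks, t in {1, ..., T}."""
    return rng.randint(1, num_steps)

def sample_time_continuous(rng):
    """CT setting: pick a real-valued time t in [0, 1)."""
    return rng.random()

def noisy_label_embedding(u_y, signal, rng):
    """Mix a label embedding with Gaussian noise (variance-preserving form).

    signal is in [0, 1]: 0 gives pure noise, 1 gives the clean embedding.
    """
    return [
        math.sqrt(signal) * u + math.sqrt(1.0 - signal) * rng.gauss(0.0, 1.0)
        for u in u_y
    ]

rng = random.Random(0)
u_y = [0.0, 0.0, 1.0, 0.0]            # toy one-hot embedding of class 2

t_dt = sample_time_discrete(10, rng)  # integer block index for DT
t_ct = sample_time_continuous(rng)    # random real-valued time for CT

# Assumption: a linear signal level, signal = t (hypothetical schedule).
z_t = noisy_label_embedding(u_y, t_ct, rng)
print(t_dt, round(t_ct, 3), [round(v, 3) for v in z_t])
```

In a real training loop, a network conditioned on the image, `z_t`, and the sampled time would be trained to recover `u_y`; the schedule and loss weighting used by the paper and by this repository may differ from the toy choices here.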

![Architecture](https://arxiv.org/html/2503.24322v1/extracted/6324620/plots/Noprop_clear.png)

- Figure 1: Architecture of NoProp. $z_0$ represents Gaussian noise, while $z_1,…,z_T$ are successive transformations of $z_0$ through the learned dynamics $u_1,…,u_T$, with each layer conditioned on the image $x$, ultimately producing the class prediction $\hat{y}$.
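
The data flow in Figure 1 can be sketched as a chain of blocks applied to a noise vector. The snippet below is a toy illustration only: `make_toy_block` and `one_hot_argmax` are hypothetical stand-ins for trained, image-conditioned networks, and the "image" `x` is just a short list of scores.

```python
import random

def one_hot_argmax(x):
    """Toy conditioning signal: one-hot vector at the largest entry of x."""
    best = max(range(len(x)), key=lambda i: x[i])
    return [1.0 if i == best else 0.0 for i in range(len(x))]

def make_toy_block(rate):
    """Hypothetical stand-in for one learned block u_t(z, x).

    It moves z a fraction `rate` of the way toward a target derived from x.
    """
    def block(z, x):
        target = one_hot_argmax(x)
        return [zi + rate * (ti - zi) for zi, ti in zip(z, target)]
    return block

def predict(blocks, x, rng):
    z = [rng.gauss(0.0, 1.0) for _ in x]           # z_0: Gaussian noise
    for block in blocks:                           # z_1, ..., z_T
        z = block(z, x)                            # each block also sees x
    return max(range(len(z)), key=lambda i: z[i])  # class prediction y_hat

blocks = [make_toy_block(0.5) for _ in range(10)]  # T = 10 toy blocks
y_hat = predict(blocks, [0.1, 0.9, 0.3], random.Random(0))
print(y_hat)  # prints 1
```

The point of the sketch is the structure: inference starts from noise, every block receives both the current state and the input, and only the final state is read out as a class.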

![log](https://arxiv.org/html/2503.24322v1/extracted/6324620/plots/continuous_CIFAR-100.png)

- Figure 3: Test accuracies (%) plotted against cumulative training time (in seconds) for models using one-hot label embedding in the continuous-time setting. All models within each plot were trained on the same type of GPU to ensure a fair comparison. NoProp-CT achieves strong performance in terms of both accuracy and speed compared to adjoint sensitivity. For CIFAR-100, NoProp-FM does not learn effectively with one-hot label embedding.

For more detail, see the original paper: [arXiv:2503.24322v1](https://arxiv.org/html/2503.24322v1).

## Implementation
- Modular design: Reproduces the paper's setup and makes it easy to extend and investigate NoProp with different model architectures and datasets.
- Modular backbone design: Easily configure ResNet-18, ResNet-50, or ResNet-152 backbones.
- Classification head with time and noise: Embeds the noisy state ($z_t$) and the time step ($t$), then fuses them with the backbone features for classification.
- Continuous training scheme: Follows the paper's training scheme, sampling a random time step $t$ for continuous-time training.
- Scheduler options: Supports both Euler and Heun integration schemes for diffusion timesteps.
- Evaluation hooks:
  - Automatic Heun integration with T=40 evaluation at the end of every epoch.
  - Post-training evaluation across customizable T values (e.g., [2,5,10,20,40,60,100]).
  - Benchmarks: Pre-configured for MNIST; CIFAR-10 and CIFAR-100 can be evaluated with the same setup.
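
As a rough illustration of the Euler and Heun options and of evaluating over several values of T, the pure-Python sketch below integrates a toy ODE from t=0 to t=1 with both schemes. The helper names (`euler_step`, `heun_step`, `integrate`) and the vector field `toy_velocity` are assumptions made for this example and are not taken from the repository, where a trained network would supply the velocity.

```python
import math

def euler_step(f, z, t, h):
    """One explicit Euler step: z + h * f(z, t)."""
    k1 = f(z, t)
    return [zi + h * ki for zi, ki in zip(z, k1)]

def heun_step(f, z, t, h):
    """One Heun step: average the slope at the start and at the Euler prediction."""
    k1 = f(z, t)
    z_pred = [zi + h * ki for zi, ki in zip(z, k1)]
    k2 = f(z_pred, t + h)
    return [zi + 0.5 * h * (a + b) for zi, a, b in zip(z, k1, k2)]

def integrate(f, z0, num_steps, step_fn):
    """Integrate dz/dt = f(z, t) from t=0 to t=1 in num_steps equal steps."""
    h = 1.0 / num_steps
    z = list(z0)
    for i in range(num_steps):
        z = step_fn(f, z, i * h, h)
    return z

TARGET = [0.0, 1.0, 0.0]   # toy "class embedding"
Z0 = [1.5, -0.5, 0.7]      # toy initial state

def toy_velocity(z, t):
    """Toy vector field pulling z toward TARGET (stand-in for a trained model)."""
    return [ti - zi for zi, ti in zip(z, TARGET)]

def max_error(z):
    """Largest deviation from the exact solution of the toy ODE at t=1."""
    exact = [ti + (z0i - ti) * math.exp(-1.0) for z0i, ti in zip(Z0, TARGET)]
    return max(abs(a - b) for a, b in zip(z, exact))

errors = {}
for T in [2, 5, 10, 20, 40, 60, 100]:
    err_euler = max_error(integrate(toy_velocity, Z0, T, euler_step))
    err_heun = max_error(integrate(toy_velocity, Z0, T, heun_step))
    errors[T] = (err_euler, err_heun)
    print(f"T={T:3d}  euler_err={err_euler:.2e}  heun_err={err_heun:.2e}")
```

Heun evaluates the vector field twice per step, so it costs roughly twice as much as Euler for the same T, but on this toy problem its error shrinks much faster as T grows. That trade-off is one plausible reason to sweep T at evaluation time.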



In [1]:
! git clone https://github.com/yhgon/NoProp.git

Cloning into 'NoProp'...
remote: Enumerating objects: 60, done.
remote: Counting objects: 100% (60/60), done.
remote: Compressing objects: 100% (55/55), done.
remote: Total 60 (delta 16), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (60/60), 27.49 KiB | 3.44 MiB/s, done.
Resolving deltas: 100% (16/16), done.


In [None]:
!python NoProp/src/nopropct_mnist.py


--- resnet18 on cuda ---
100% 9.91M/9.91M [00:00<00:00, 58.2MB/s]
100% 28.9k/28.9k [00:00<00:00, 1.61MB/s]
100% 1.65M/1.65M [00:00<00:00, 14.8MB/s]
100% 4.54k/4.54k [00:00<00:00, 18.1MB/s]
Epoch 01 loss 13.2554 | train 5.1s
Epoch 02 loss 1.5875 | train 3.2s
Epoch 03 loss 1.0001 | train 3.2s
Epoch 04 loss 0.9099 | train 3.1s
Epoch 05 loss 0.8535 | train 3.1s | Acc 95.83% | infer 5.4s
Epoch 06 loss 0.9559 | train 3.0s
Epoch 07 loss 0.8878 | train 2.9s
Epoch 08 loss 0.8088 | train 2.8s


## Citation
```
@misc{li2025noprop,
  title={NoProp: Training Neural Networks without Back-propagation or Forward-propagation},
  author={Li, Qinyu and Teh, Yee Whye and Pascanu, Razvan},
  year={2025},
  eprint={2503.24322v1},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```

```
@misc{ryu2025nopropcode,
  title={NoProp: Community Implementation Code},
  author={Ryu, Hyungon},
  year={2025},
  howpublished={\url{https://github.com/yhgon/NoProp}}
}
```

## Log
[Training and evaluation log for MNIST](log01.md)

