tiny-recursive-model provides a fast and accurate implementation of the Tiny Recursive Model (TRM) as described in "Less is More: Recursive Reasoning with Tiny Networks" by Alexia Jolicoeur-Martineau. The model replicates the original results from the paper on the Sudoku-Extreme dataset.
I also experimented with different layer sizes, numbers of layers, and more. More information can be found in the results section.
You will need Python 3.7 or higher.
Dependencies can be found in requirements.txt. You can install these dependencies in a virtual environment like this:
python -m venv venv # Create the virtual environment
source venv/bin/activate # Activate the virtual environment
pip install -r requirements.txt # Install Python dependencies
To train a model on a task, follow these 3 steps:
Create a .cfg file. example.cfg contains the original hyperparameters from the TRM paper.
After creating your .cfg file, run it with the train.py script: python train.py -config my_cfg.cfg
Sample with sample.py
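As a rough illustration of the first step, the sketch below reads hyperparameters from a .cfg file using Python's standard configparser module. It assumes the file uses INI-style sections, which may not match how train.py actually parses it. The section names, key names, and values are hypothetical and are not taken from example.cfg.

```python
import configparser

# Hypothetical config text. The section names, key names, and values are
# illustrative only and are NOT copied from example.cfg.
CFG_TEXT = """
[model]
hidden_size = 512
num_layers = 2

[training]
learning_rate = 1e-4
batch_size = 64
"""


def load_hyperparameters(text):
    """Parse INI-style text into a flat dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "hidden_size": parser.getint("model", "hidden_size"),
        "num_layers": parser.getint("model", "num_layers"),
        "learning_rate": parser.getfloat("training", "learning_rate"),
        "batch_size": parser.getint("training", "batch_size"),
    }


params = load_hyperparameters(CFG_TEXT)
print(params)
```

To read from a file on disk instead of a string, parser.read("my_cfg.cfg") can replace parser.read_string(text).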
My implementation is based entirely on the one described in the original paper. This includes, but is not limited to, RMSNorm for normalization, the SwiGLU activation function, and bias-free layers. I also made heavy use of einops and einsum, for no reason other than practice.
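For readers unfamiliar with these components, here is a minimal, dependency-free sketch of RMSNorm and a bias-free SwiGLU feed-forward block operating on a single vector. It is not the code from this repository, which presumably works on batched tensors with einops and einsum. The function names, the eps default, and the weight layout (one list per output row) are assumptions made for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then apply a learned per-feature gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Bias-free linear layer: one dot product per output row."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x)), no biases."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Example: normalize a 2-dimensional vector, then pass it through SwiGLU
# with small hand-picked weights (hidden width 3).
x = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[0.5, 0.5], [1.0, -1.0], [0.0, 1.0]]
w_down = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0]]
y = swiglu(x, w_gate, w_up, w_down)
print(x, y)
```

Unlike LayerNorm, RMSNorm does not subtract the mean, so the normalized vector has a root mean square of about 1 but not necessarily zero mean.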