Tsetlin Machine

“Speed is the most important feature.”

Fred Wilson

A Tsetlin Machine library with zero external dependencies that performs quite well: over 50 million MNIST predictions per second on a desktop CPU.

[Tsetlin Machine benchmark chart]

Key features

  • Single-thread or multi-thread learning and inference.
  • Blazingly fast batch inference using bitwise operations, SIMD instructions, and specialized batch-processing techniques.
  • Compacting/shrinking TM models to save memory and increase inference speed.
  • Combining models whose clauses were trained with different hyperparameters into a single model for higher accuracy, using two algorithms: merge and join.
  • Binomial combinatorial merging of trained models to achieve the best accuracy. This is useful for boosting accuracy on augmented datasets or in k-fold cross-validation without risking overfitting on the test dataset.
  • Optimizing trained models by rearranging the indexes of included literals to maximize inference performance without using batches.
  • Saving and loading trained models to and from disk, which is essential for production deployment or for continuing training with modified hyperparameters.
  • A benchmark tool with a pre-trained model.

Introduction

Here is a quick "Hello, World!" example of a typical use case.

Importing the necessary functions and the MNIST dataset:

using MLDatasets: MNIST
using .Tsetlin: TMInput, TMClassifier, train!, predict, accuracy, save, load, unzip

x_train, y_train = unzip([MNIST(:train)...])
x_test, y_test = unzip([MNIST(:test)...])
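
Here unzip splits the vector of (image, label) pairs into a vector of images and a vector of labels.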

Booleanizing the input data (each pixel is thresholded at two levels, 0 and 0.5, so every image becomes a Boolean vector with two bits per pixel):

x_train = [TMInput(vec([
    [x > 0 for x in i];    # first threshold: any non-zero pixel
    [x > 0.5 for x in i];  # second threshold: bright pixels only
])) for i in x_train]
x_test = [TMInput(vec([
    [x > 0 for x in i];
    [x > 0.5 for x in i];
])) for i in x_test]

Some hyperparameters differ from those of the Vanilla Tsetlin Machine. The hyperparameter R is a float in the range 0.0 to 1.0; to obtain the actual R from the Vanilla S parameter, use the formula R = S / (S + 1). The hyperparameter L limits the number of included literals in a clause. best_tms_size is the number of best TM models collected during the training process. After training, you can save this ensemble of models to disk or increase accuracy by using Binomial Combinatorial Merge with the combine() function (a hedged sketch follows the accuracy example below).

const EPOCHS = 1000
const CLAUSES = 2048
const T = 32
const R = 0.94
const L = 12
const best_tms_size = 500
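
As a quick check of the formula above (illustrative only; S is not a parameter of this library), a Vanilla S of about 15.7 maps to the R used here:

S = 15.67              # example Vanilla S value, assumed for illustration only
R_check = S / (S + 1)  # ≈ 0.94, matching the R constant defined above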

Training the Tsetlin Machine over 1000 epochs and saving the best TM model to disk:

tm = TMClassifier(CLAUSES, T, R, L=L, states_num=256, include_limit=128)
tm_best, tms = train!(tm, x_train, y_train, x_test, y_test, EPOCHS, best_tms_size=best_tms_size, best_tms_compile=true, shuffle=true, batch=true)
save(tm_best, "/tmp/tm_best.tm")
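
Note that train! returns both the single best model (tm_best) and the vector of the best_tms_size best models collected during training (tms); the latter can be saved as an ensemble or passed to Binomial Combinatorial Merge.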

Loading the best Tsetlin Machine model and computing the actual test accuracy:

tm = load("/tmp/tm_best.tm")
println(accuracy(predict(tm, x_test), y_test))
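
The tms ensemble can also be merged into a single model. The following is only a hedged sketch: the exact signature of combine() is not documented here, so the arguments below (the collected models plus the test data used for scoring) are an assumption; see the examples directory for real usage.

# Hypothetical usage of Binomial Combinatorial Merge; the combine() signature
# below is assumed, not confirmed by this README.
using .Tsetlin: combine

tm_combined = combine(tms, x_test, y_test)  # assumed arguments
save(tm_combined, "/tmp/tm_combined.tm")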

How to run examples

  1. Make sure that you have installed the latest version of the Julia language.
  2. Go to the examples directory: cd ./examples
  3. Run julia --project=. -O3 -t 32 --gcthreads=32,1 mnist_simple.jl where 32 is the number of your logical CPU cores.

Benchmark

The maximum MNIST inference speed achieved is 52 million predictions per second in batch mode on a desktop Ryzen 7950X3D CPU, using 32 threads.

Trained and optimized models can be found in ./examples/models/.

How to run MNIST inference benchmark:

  1. Please close all other programs such as web browsers, antivirus software, torrent clients, music players, etc.
  2. Go to the examples directory: cd ./examples
  3. Run julia --project=. -O3 -t 32 mnist_benchmark_inference.jl where 32 is the number of your logical CPU cores.
