# Using BOLT
## Basics
Let's learn to use the BOLT Python API with an exercise. We'll do a simple image classification task on the MNIST dataset. To perform this task, we'll build a fully connected neural network with the following specifications:
* 784 (28 x 28) input dimension
* A single 1000-dim hidden layer with ReLU
* 10-dim output layer with Softmax

In [None]:
from thirdai import bolt

mnist_layers = [
    bolt.LayerConfig(dim=1000, activation_function=bolt.ActivationFunctions.ReLU),
    bolt.LayerConfig(dim=10, activation_function=bolt.ActivationFunctions.Softmax)
]
mnist_network = bolt.Network(layers=mnist_layers, input_dim=784)
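
To make the architecture concrete, here is a minimal plain-Python sketch of the forward pass such a network computes: a dense layer with ReLU followed by a dense layer with Softmax. It does not use BOLT and says nothing about how BOLT implements these layers. The helper names (`dense`, `random_layer`), the random weights, and the hidden size of 16 (shrunk from 1000 to keep the toy fast) are all assumptions made for illustration.

```python
import math
import random


def relu(values):
    # ReLU keeps positive activations and zeroes out the rest.
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtract the max before exponentiating for numerical stability.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    # One output per weight row: dot(row, inputs) + bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def random_layer(rng, in_dim, out_dim):
    # Hypothetical initialization, chosen only so the example runs.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


rng = random.Random(0)
in_dim, hidden_dim, out_dim = 784, 16, 10  # hidden_dim is 1000 in the tutorial

w1, b1 = random_layer(rng, in_dim, hidden_dim)
w2, b2 = random_layer(rng, hidden_dim, out_dim)

pixels = [rng.random() for _ in range(in_dim)]  # stand-in for one 28 x 28 image
hidden = relu(dense(pixels, w1, b1))
class_probs = softmax(dense(hidden, w2, b2))

print(len(class_probs), sum(class_probs))
```

The output is a 10-element probability vector, one entry per digit class.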

We now load the MNIST dataset with our data loader, imported from the dataset submodule.

In [None]:
from thirdai import dataset

print("Loading train dataset...")
mnist_train = dataset.load_bolt_svm_dataset(filename="datasets/mnist/mnist", batch_size=256)

print("Loading test dataset...")
mnist_test = dataset.load_bolt_svm_dataset(filename="datasets/mnist/mnist.t", batch_size=256)
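
The loader's name suggests the files use the LIBSVM/SVMlight text format, where each line is a label followed by sparse `index:value` pairs. The notebook does not confirm this, so treat it as an assumption. Under that assumption, a single line could be parsed roughly as follows; `parse_svm_line` is a hypothetical helper, not part of BOLT, and the sample line is made up.

```python
def parse_svm_line(line):
    """Parse one 'label index:value index:value ...' line into (label, features)."""
    parts = line.split()
    label = int(parts[0])
    features = {}
    for token in parts[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


# Made-up sample line: class 7 with three non-zero pixel intensities.
label, features = parse_svm_line("7 203:0.5 204:1.0 231:0.25")
print(label, features)
```

Only non-zero features are stored, which keeps mostly-empty inputs compact.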


We now train the network to minimize categorical cross entropy loss and measure our success with the categorical accuracy metric.

In [None]:
mnist_network.train(train_data=mnist_train, loss_fn=bolt.CategoricalCrossEntropyLoss(), learning_rate=0.001, epochs=1)
mnist_network.predict(test_data=mnist_test, metrics=["categorical_accuracy"], verbose=True)
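
For reference, the loss and the metric named above can be written out in a few lines of plain Python. This is a sketch of the standard definitions, not BOLT's implementation; the function names and the example probabilities are assumptions for illustration.

```python
import math


def categorical_cross_entropy(probs, true_class, eps=1e-12):
    # Negative log-probability assigned to the correct class.
    return -math.log(max(probs[true_class], eps))


def categorical_accuracy(batch_probs, true_classes):
    # Fraction of examples whose highest-probability class is the true class.
    correct = 0
    for probs, true_class in zip(batch_probs, true_classes):
        predicted = max(range(len(probs)), key=lambda i: probs[i])
        if predicted == true_class:
            correct += 1
    return correct / len(true_classes)


# Made-up softmax outputs for three examples over three classes.
batch_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.3, 0.2],
]
true_classes = [0, 1, 2]

mean_loss = sum(
    categorical_cross_entropy(p, t) for p, t in zip(batch_probs, true_classes)
) / len(true_classes)
accuracy = categorical_accuracy(batch_probs, true_classes)

print(mean_loss, accuracy)
```

Here the first two examples are classified correctly and the third is not, so the accuracy is 2/3.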

## What about bigger models?
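We will now use a 10,000-dimensional hidden layer, this time on an intent classification dataset with 5,512 input features and 151 output classes. Typically, a model of this size takes around 100 ms to train per epoch. With BOLT, we can leverage sparsity by passing a `load_factor` argument to the layer configuration; here we set `load_factor=0.1` on the hidden layer.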

In [None]:
bigger_layers = [
    bolt.LayerConfig(dim=10000, load_factor=0.1, activation_function=bolt.ActivationFunctions.ReLU),
    bolt.LayerConfig(dim=151, activation_function=bolt.ActivationFunctions.Softmax)
]
bigger_network = bolt.Network(layers=bigger_layers, input_dim=5512)
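
A rough back-of-the-envelope calculation shows why sparsity helps at this size. The sketch below assumes that `load_factor` is the fraction of a layer's neurons that are active for each input, and that inputs are dense vectors; neither assumption is confirmed by this notebook, and the helper names are hypothetical.

```python
def active_neurons(dim, load_factor):
    # Assumption: load_factor is the fraction of neurons computed per input.
    return max(1, round(dim * load_factor))


def multiply_adds(input_dim, hidden_dim, output_dim, load_factor):
    # Cost of one forward pass through hidden and output layers,
    # counting one multiply-add per (input, active neuron) pair.
    active = active_neurons(hidden_dim, load_factor)
    hidden_cost = input_dim * active
    output_cost = active * output_dim
    return hidden_cost + output_cost


dense_cost = multiply_adds(5512, 10000, 151, load_factor=1.0)
sparse_cost = multiply_adds(5512, 10000, 151, load_factor=0.1)

print(dense_cost, sparse_cost, dense_cost / sparse_cost)
```

Under these assumptions, activating 10% of the hidden layer cuts the per-example multiply-adds by about a factor of ten. The estimate ignores the cost of choosing which neurons to activate.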

In [None]:
print("Loading train dataset...")
intent_class_train = dataset.load_bolt_svm_dataset(filename="datasets/intent_classification/train_shuf.svm", batch_size=256)

print("Loading test dataset...")
intent_class_test = dataset.load_bolt_svm_dataset(filename="datasets/intent_classification/test_shuf.svm", batch_size=256)

### Sparse inference
You can also use sparsity to accelerate inference by calling the `enable_sparse_inference()` method. Notice that we call it before the last training epoch: the call freezes the hash functions, effectively locking in a specialized subnetwork for each input vector, and the final epoch then fine-tunes these subnetworks.

In [None]:
bigger_network.train(train_data=intent_class_train, loss_fn=bolt.CategoricalCrossEntropyLoss(), learning_rate=0.001, epochs=2)
bigger_network.enable_sparse_inference()
bigger_network.train(train_data=intent_class_train, loss_fn=bolt.CategoricalCrossEntropyLoss(), learning_rate=0.001, epochs=1)
bigger_network.predict(test_data=intent_class_test, metrics=["categorical_accuracy"], verbose=True)
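
To build intuition for why frozen hash functions pin each input to a fixed subnetwork, here is a toy sketch of hash-based neuron selection using signed random projections. It is not BOLT's implementation: the hashing scheme, the single hash table, and every name below are assumptions chosen for illustration only.

```python
import random


def make_hash_function(input_dim, num_bits, rng):
    # Each bit is the sign of a dot product with a fixed random vector.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
              for _ in range(num_bits)]

    def hash_vector(vec):
        code = 0
        for plane in planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            code = (code << 1) | (1 if dot >= 0 else 0)
        return code

    return hash_vector


rng = random.Random(0)
input_dim, num_neurons, num_bits = 8, 32, 3

# "Frozen" hash function: created once and never resampled afterwards.
hash_fn = make_hash_function(input_dim, num_bits, rng)

# Bucket every neuron by the hash of its weight vector.
weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
           for _ in range(num_neurons)]
table = {}
for neuron_id, weight_vector in enumerate(weights):
    table.setdefault(hash_fn(weight_vector), []).append(neuron_id)


def select_neurons(vec):
    # The active set is whichever neurons share the input's bucket.
    return sorted(table.get(hash_fn(vec), []))


x = [rng.gauss(0.0, 1.0) for _ in range(input_dim)]
first = select_neurons(x)
second = select_neurons(x)
print(first, first == second)
```

Because the hash function and table no longer change, the same input always selects the same neurons, so further training only adjusts the weights inside that fixed subset.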

## What does this enable?
We trained a 200 million parameter model on the Yelp Reviews public dataset. As a benchmark, we fine-tuned RoBERTa on this dataset and got an accuracy of 83%. Let's see how well BOLT does!

In [None]:
yelp_sentiment_analysis_layers = [
    bolt.LayerConfig(
        dim=2000,
        load_factor=0.2,
        activation_function=bolt.ActivationFunctions.ReLU,
        sampling_config=bolt.SamplingConfig(
            hashes_per_table=4,
            num_tables=64,
            range_pow=4 * 3,
            reservoir_size=64,
        ),
    ),
    bolt.LayerConfig(
        dim=2,
        load_factor=1.0,
        activation_function=bolt.ActivationFunctions.Softmax,
    ),
]
yelp_sentiment_analysis_network = bolt.Network(layers=yelp_sentiment_analysis_layers, input_dim=100000)
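
The 200 million figure quoted for this model follows from the layer sizes alone. The short sketch below counts weights only (bias terms are ignored, since the notebook does not say whether the layers have them); `weight_count` is a hypothetical helper, not a BOLT function.

```python
def weight_count(input_dim, layer_dims):
    # Each fully connected layer holds (previous dim) x (layer dim) weights.
    total = 0
    previous_dim = input_dim
    for dim in layer_dims:
        total += previous_dim * dim
        previous_dim = dim
    return total


yelp_weights = weight_count(100000, [2000, 2])
print(f"{yelp_weights:,}")
```

Nearly all of the weights sit between the 100,000-dimensional input and the 2,000-dimensional hidden layer.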

### Load & Save
BOLT supports loading and saving networks from previous training sessions. 

To save, call the `save()` method on the trained network. 

In [None]:
train_data = dataset.load_bolt_svm_dataset("../sa_demo/text_data/yelp_review_full_2class_train.svm", 1024)
yelp_sentiment_analysis_network.train(
    train_data,
    bolt.CategoricalCrossEntropyLoss(),
    0.0001,
    epochs=20,
    rehash=6400,
    rebuild=128000,
)
yelp_sentiment_analysis_network.save(filename="yelp_sentiment_analysis_cp")

To load a trained model, call the `bolt.Network.load()` static method.

In [None]:
test_data = dataset.load_bolt_svm_dataset("../sa_demo/text_data/yelp_review_full_2class_test.svm", 256)
yelp_sentiment_analysis_network = bolt.Network.load(filename="yelp_sentiment_analysis_cp")
res = yelp_sentiment_analysis_network.predict(test_data, metrics=["categorical_accuracy"], verbose=True)

We also trained an even larger 2 billion parameter model on a larger text corpus to build an interactive sentiment analysis demo. We first load the trained model.

In [None]:
sentiment_analysis_network = bolt.Network.load("interactive_demo_cp")

Let's load the demo to get a feel for what this network can do!

In [None]:
import interactive_sentiment_analysis
interactive_sentiment_analysis.demo(sentiment_analysis_network, verbose=False)

### Let's talk speed

In [None]:
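import time
from transformers import pipeline

# Build the pipeline first so that model loading is excluded from the timing.
sentiment_analysis = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english")

# Time a single inference call with a monotonic, high-resolution clock.
t1 = time.perf_counter()
out = sentiment_analysis("I love chocolate.")
t2 = time.perf_counter()
print(out, flush=True)
print(f"time elapsed: {t2 - t1} s", flush=True)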
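
A single timed call can be noisy, since the first invocation often pays one-off costs such as caching. One possible refinement, sketched here with only the standard library, is to run a warm-up call and then average several repeats. `time_call` is a hypothetical helper, and the `sorted` call is just a stand-in for a model's inference function.

```python
import time


def time_call(fn, *args, warmup=1, repeats=5):
    """Return (last result, mean seconds, best seconds) over repeated calls."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn(*args)

    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        durations.append(time.perf_counter() - start)

    return result, sum(durations) / len(durations), min(durations)


# Stand-in workload; swap in any callable, such as a prediction function.
result, mean_s, best_s = time_call(sorted, [3, 1, 2])
print(result, mean_s, best_s)
```

Reporting both the mean and the best time makes it easier to spot outliers when comparing two models.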

In [None]:
# TODO: Make the accuracy disappear when doing interactive demo 
# TODO: Write scripts to download datasets and saved models
# TODO: Clean up the interactive sentiment demo code