# Using PyTorch
* This notebook demonstrates how to use PyTorch with CapyMOA.
* It contains examples showing:
    * how to define a PyTorch network to be used with CapyMOA
    * how a simple PyTorch model can be used in a CapyMOA ```Instance``` loop
    * how to define a PyTorch CapyMOA classifier based on the CapyMOA ```Classifier``` framework and how to use it with ```prequential_evaluation()```
    * how to use TensorBoard with a PyTorchClassifier and the instance loop
    * how to use a PyTorch dataset with a CapyMOA classifier

## 1. Setup
* Set the random seeds for reproducibility
* Define the PyTorch network

### 1.1 Set random seeds

In [1]:
import random
random_seed=1
random.seed(random_seed)
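
The effect of seeding can be checked directly. The sketch below uses only Python's `random` module (the helper name `draw` is hypothetical) and shows that reseeding with the same value reproduces the same sequence. Note that `random.seed` covers only Python's own generator; PyTorch keeps a separate one, which the next cell seeds with `torch.manual_seed`.

```python
import random


def draw(seed, n=3):
    """Reseed Python's generator and draw n pseudo-random floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


first = draw(1)
second = draw(1)
other = draw(2)

print(first == second)  # same seed, so the sequences match
print(first == other)   # different seed, so the sequences differ
```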

### 1.2 Define network structure
* Here, the network uses the CPU device

In [2]:
import torch
from torch import nn

torch.manual_seed(random_seed)
torch.use_deterministic_algorithms(True)

# Get cpu device for training.
device = ("cpu")
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self, input_size=0, number_of_classes=0):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(input_size, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, number_of_classes)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits


Using cpu device


### 1.3 Using instance loop
* The model is initialized after receiving the first instance

In [3]:
from capymoa.evaluation import ClassificationEvaluator
from capymoa.datasets import ElectricityTiny

elec_stream = ElectricityTiny()

# Creating the evaluator
evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

model = None
optimizer = None
loss_fn = nn.CrossEntropyLoss()

i = 0
while elec_stream.has_more_instances():
    i += 1
    instance = elec_stream.next_instance()
    if model is None:
        # initialize the model and send it to the device
        model = NeuralNetwork(input_size=elec_stream.get_schema().get_num_attributes(), 
                              number_of_classes=elec_stream.get_schema().get_num_classes()).to(device)
        # set the optimizer
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
        print(model)
    
    X = torch.tensor(instance.x, dtype=torch.float32)
    y = torch.tensor(instance.y_index, dtype=torch.long)
    # set the device and add a dimension to the tensor
    X, y = torch.unsqueeze(X.to(device), 0), torch.unsqueeze(y.to(device),0) 
    
    # turn off gradient collection for test
    with torch.no_grad():
        pred = model(X)
        prediction = torch.argmax(pred)

    # update evaluator with predicted class
    evaluator.update(instance.y_index, prediction.item())
  
    # Compute prediction error
    pred = model(X)
    loss = loss_fn(pred, y)

    # Backpropagation
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    
    if i % 500 == 0:
        print(f'Accuracy at {i} : {evaluator.accuracy()}')
    
print(f'Accuracy at {i} : {evaluator.accuracy()}')

capymoa_root: /Users/ng98/Desktop/CODE/CapyMOA_Latest/src/capymoa
MOA jar path location (config.ini): /Users/ng98/Desktop/CODE/CapyMOA_Latest/src/capymoa/jar/moa.jar
JVM Location (system): 
JAVA_HOME: /Users/ng98/Library/Java/JavaVirtualMachines/openjdk-14.0.1/Contents/Home
JVM args: ['-Xmx8g', '-Xss10M']
Sucessfully started the JVM and added MOA jar to the class path
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=6, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)
Accuracy at 500 : 50.4
Accuracy at 1000 : 55.2
Accuracy at 1500 : 61.199999999999996
Accuracy at 2000 : 61.1
Accuracy at 2000 : 61.1


## 2. PyTorchClassifier with prequential_evaluation
* Defining a PyTorchClassifier with the CapyMOA API gives better compatibility with CapyMOA functions such as ```prequential_evaluation()```
* If no network is passed in, the classifier builds one from the schema or, failing that, from the first instance it receives

### 2.1 Define a CapyMOA PyTorchClassifier
* PyTorchClassifier is based on the CapyMOA ```Classifier``` framework

In [4]:
from capymoa.base import Classifier
import numpy as np

class PyTorchClassifier(Classifier):
    def __init__(self, schema=None, random_seed=1, nn_model: nn.Module = None, optimizer=None, loss_fn=nn.CrossEntropyLoss(), device=("cpu"), lr=1e-3):
        super().__init__(schema, random_seed)
        self.model = None
        self.optimizer = None
        self.loss_fn = loss_fn
        self.lr = lr
        self.device = device
        
        torch.manual_seed(random_seed)
        
        if nn_model is None:
            self.set_model(None)
        else:
            self.model = nn_model.to(device)
        if optimizer is None:
            if self.model is not None:
                self.optimizer = torch.optim.SGD(self.model.parameters(), lr=lr)
        else:
            self.optimizer = optimizer
        
    def __str__(self):
        return str(self.model)

    def CLI_help(self):
        return str('schema=None, random_seed=1, nn_model: nn.Module = None, optimizer=None, loss_fn=nn.CrossEntropyLoss(), device=("cpu"), lr=1e-3')

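    def set_model(self, instance):
        # Prefer the schema; fall back to the first instance when no schema was given
        if self.schema is not None:
            self.model = NeuralNetwork(input_size=self.schema.get_num_attributes(), number_of_classes=self.schema.get_num_classes()).to(self.device)
        elif instance is not None:
            moa_instance = instance.java_instance.getData()
            self.model = NeuralNetwork(input_size=moa_instance.get_num_attributes(), number_of_classes=moa_instance.get_num_classes()).to(self.device)
        # A lazily created model still needs an optimizer before the first train() call
        if self.model is not None and self.optimizer is None:
            self.optimizer = torch.optim.SGD(self.model.parameters(), lr=self.lr)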
            
    def train(self, instance):
        if self.model is None:
            self.set_model(instance)
    
        X = torch.tensor(instance.x, dtype=torch.float32)
        y = torch.tensor(instance.y_index, dtype=torch.long)
        # set the device and add a dimension to the tensor
        X, y = torch.unsqueeze(X.to(self.device), 0), torch.unsqueeze(y.to(self.device),0)

        # Compute prediction error
        pred = self.model(X)
        loss = self.loss_fn(pred, y)
    
        # Backpropagation
        loss.backward()
        self.optimizer.step()
        self.optimizer.zero_grad()

    def predict(self, instance):
        return np.argmax(self.predict_proba(instance))

    def predict_proba(self, instance):
        if self.model is None:
            self.set_model(instance)
        X = torch.unsqueeze(torch.tensor(instance.x, dtype=torch.float32).to(self.device), 0)
        # turn off gradient collection and convert the logits to class probabilities
        with torch.no_grad():
            pred = torch.softmax(self.model(X), dim=1).squeeze(0).cpu().numpy().astype(np.double)
        return pred
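
For reference, a network that ends in a linear layer, such as `NeuralNetwork`, outputs unnormalised scores (logits), which is also what `nn.CrossEntropyLoss` expects. The relationship between logits, class probabilities, and the predicted class index can be sketched in plain Python. The helper names `softmax` and `argmax` and the example values are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


logits = [0.2, 1.5]
probabilities = softmax(logits)

print(probabilities)
# softmax is monotonic, so the predicted class is the same either way
print(argmax(logits) == argmax(probabilities))
```

Since softmax preserves the ordering of the scores, taking the arg max of the logits or of the probabilities yields the same class; the conversion only matters when calibrated probabilities are needed.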


### 2.2 Using PyTorchClassifier + prequential_evaluation

In [5]:
from capymoa.evaluation import prequential_evaluation

# Opening a file as a stream
elec_stream = ElectricityTiny()

# Creating a learner
simple_pyTorch_classifier = PyTorchClassifier(
    schema=elec_stream.get_schema(), 
    nn_model=NeuralNetwork(input_size=elec_stream.get_schema().get_num_attributes(), number_of_classes=elec_stream.get_schema().get_num_classes()).to(device)
)

evaluator = prequential_evaluation(stream=elec_stream, learner=simple_pyTorch_classifier, window_size=4500, optimise=False)

evaluator['cumulative'].accuracy()

62.849999999999994

## 3. How to use TensorBoard with PyTorch
* One can use TensorBoard to visualize logged data online 

### 3.1 Install TensorBoard
Clear any logs from previous runs, then install TensorBoard.

```sh
rm -rf ./runs
```

In [6]:
!pip install tensorboard



### 3.2 Example using PyTorchClassifier + the instance loop + TensorBoard
* Here we use the instance loop to log relevant information to TensorBoard
* This information can be viewed in TensorBoard while processing is still running

In [7]:
from capymoa.evaluation import ClassificationEvaluator
from torch.utils.tensorboard import SummaryWriter

# Create a SummaryWriter instance.
writer = SummaryWriter()
# Opening the stream again to start from the beginning
elec_stream = ElectricityTiny()

# Creating the evaluator
evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

# Creating a learner
simple_pyTorch_classifier = PyTorchClassifier(
    schema=elec_stream.get_schema(), 
    nn_model=NeuralNetwork(input_size=elec_stream.get_schema().get_num_attributes(), number_of_classes=elec_stream.get_schema().get_num_classes()).to(device)
)

i = 0
while elec_stream.has_more_instances():
    i += 1
    instance = elec_stream.next_instance()

    prediction = simple_pyTorch_classifier.predict(instance)
    evaluator.update(instance.y_index, prediction)
    simple_pyTorch_classifier.train(instance)
    
    if i % 1000 == 0:
        writer.add_scalar("accuracy", evaluator.accuracy(), i)

writer.add_scalar("accuracy", evaluator.accuracy(), i)
# Call flush() method to make sure that all pending events have been written to disk.
writer.flush()

# If you do not need the summary writer anymore, call close() method.
writer.close()

#### Run TensorBoard
Now start TensorBoard, specifying the root log directory used above (``SummaryWriter()`` writes under ``./runs`` by default).
The ``logdir`` argument points to the directory where TensorBoard looks for
event files to display. TensorBoard recursively walks
the directory structure rooted at ``logdir``, looking for ``.*tfevents.*`` files.

```sh
tensorboard --logdir=runs
```
Go to the URL it provides.

This dashboard shows how the accuracy changes over time.
You can also use it to track training speed, learning rate, and other
scalar values.

## 4. How to use a PyTorch dataset with a CapyMOA classifier
* One may want to use various PyTorch datasets with different CapyMOA classifiers

### 4.1 Using PyTorch Dataset + prequential evaluation + CapyMOA Classifier

In [8]:
from capymoa.classifier import OnlineBagging
from capymoa.stream import PytorchStream
from capymoa.evaluation import prequential_evaluation

from torchvision import datasets
from torchvision.transforms import ToTensor

pytorch_dataset = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)
pytorch_stream = PytorchStream(dataset=pytorch_dataset)

# Creating a learner
ob_learner = OnlineBagging(schema=pytorch_stream.get_schema(), ensemble_size=5)

results_ob_learner = prequential_evaluation(stream=pytorch_stream, learner=ob_learner, window_size=1000, max_instances=1000)

print(results_ob_learner['cumulative'].accuracy())
print(results_ob_learner['windowed'].accuracy())

43.5
43.5
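
The two printed values coincide here because `window_size` equals `max_instances`, so the run contains a single window. The difference between cumulative and windowed accuracy can be sketched as follows. This is a simplified illustration with non-overlapping windows and a hypothetical helper `cumulative_and_windowed`; it is not CapyMOA's implementation, whose windowing details may differ.

```python
def cumulative_and_windowed(outcomes, window_size):
    """outcomes: list of booleans, True when a prediction was correct.

    Returns (cumulative accuracy, per-window accuracies), both in percent.
    """
    cumulative = 100.0 * sum(outcomes) / len(outcomes)
    windowed = []
    for start in range(0, len(outcomes), window_size):
        window = outcomes[start:start + window_size]
        windowed.append(100.0 * sum(window) / len(window))
    return cumulative, windowed


outcomes = [True, False, True, True, False, False, True, True]

# Two windows of four instances each
cumulative, windowed = cumulative_and_windowed(outcomes, window_size=4)
print(cumulative, windowed)

# One window covering the whole run: windowed equals cumulative
_, single_window = cumulative_and_windowed(outcomes, window_size=8)
print(single_window)
```

Choosing a window smaller than the number of processed instances makes the windowed metric show how accuracy evolves along the stream, which the cumulative figure averages away.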
