## Using a simple PyTorch NeuralNetwork model with a MOA evaluator

* Example showing how a simple PyTorch model can be used with our ```Instance``` representation and the MOA evaluator

**Make sure PyTorch is installed in your environment (https://pytorch.org/)**

**notebook last updated on 03/12/2023**

## 0. Reading data and accessing instance.x

In [1]:
from capymoa.datasets import ElectricityTiny

# Load the ElectricityTiny dataset as a stream
elec_stream = ElectricityTiny()

elec_stream.restart()
i = 0
while elec_stream.has_more_instances():
    instance = elec_stream.next_instance()
    if i < 20:  # only print the first 20 instances
        print(f'x: {instance.x}, y: {instance.y_label}')
    i += 1

capymoa_root: /home/antonlee/github.com/tachyonicClock/MOABridge/src/capymoa
MOA jar path location (config.ini): /home/antonlee/github.com/tachyonicClock/MOABridge/src/capymoa/jar/moa.jar
JVM Location (system): 
JAVA_HOME: /usr/lib/jvm/java-17-openjdk
JVM args: ['-Xmx8g', '-Xss10M']
Sucessfully started the JVM and added MOA jar to the class path


x: [0.       0.056443 0.439155 0.003467 0.422915 0.414912], y: 1
x: [0.021277 0.051699 0.415055 0.003467 0.422915 0.414912], y: 1
x: [0.042553 0.051489 0.385004 0.003467 0.422915 0.414912], y: 1
x: [0.06383  0.045485 0.314639 0.003467 0.422915 0.414912], y: 1
x: [0.085106 0.042482 0.251116 0.003467 0.422915 0.414912], y: 0
x: [0.106383 0.041161 0.207528 0.003467 0.422915 0.414912], y: 0
x: [0.12766  0.041161 0.171824 0.003467 0.422915 0.414912], y: 0
x: [0.148936 0.041161 0.152782 0.003467 0.422915 0.414912], y: 0
x: [0.170213 0.041161 0.13493  0.003467 0.422915 0.414912], y: 0
x: [0.191489 0.041161 0.140583 0.003467 0.422915 0.414912], y: 0
x: [0.212766 0.044374 0.168997 0.003467 0.422915 0.414912], y: 1
x: [0.234043 0.049868 0.212437 0.003467 0.422915 0.414912], y: 1
x: [0.255319 0.051489 0.298721 0.003467 0.422915 0.414912], y: 1
x: [0.276596 0.042482 0.39036  0.003467 0.422915 0.414912], y: 0
x: [0.297872 0.040861 0.402261 0.003467 0.422915 0.414912], y: 0
x: [0.319149 0.040711 0.4

In [2]:
# Getting some extra information about the instance through the MOA representation. 
moa_instance = instance.java_instance.getData()
print(f'Number of classes: {moa_instance.numClasses()}')
print(f'Number of features/attributes: {moa_instance.numInputAttributes()}')

for i in range(0, moa_instance.numInputAttributes()):
    print(f'    {moa_instance.attribute(i)}')
    print(f'    {moa_instance.value(i)}')

Number of classes: 2
Number of features/attributes: 6
    @attribute period numeric
    0.659574
    @attribute nswprice numeric
    0.10475
    @attribute nswdemand numeric
    0.543737
    @attribute vicprice numeric
    0.003467
    @attribute vicdemand numeric
    0.422915
    @attribute transfer numeric
    0.414912


## 1. Using a PyTorch model with the MOA evaluator

* Example showing how a simple PyTorch model can be used with our ```Instance``` representation and the MOA evaluator
* Uses the CPU device
* The model is initialized after receiving the first instance

In [3]:
import torch
from torch import nn

# Use the CPU device for training.
device = "cpu"
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self, input_size=0, number_of_classes=0):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(input_size, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, number_of_classes)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits


model = None
optimizer = None
loss_fn = nn.CrossEntropyLoss()


Using cpu device
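Before wiring the network into a stream loop, it can help to check the tensor shapes it expects. The sketch below is illustrative only: it uses a smaller stand-in for the `NeuralNetwork` class above (the hidden width of 16 is an arbitrary assumption), and random values stand in for `instance.x` and `instance.y_index`. It shows that a single instance needs an extra batch dimension before it is passed to the model or to `CrossEntropyLoss`.

```python
import torch
from torch import nn

# Smaller stand-in for the NeuralNetwork class above; the hidden width (16)
# is an arbitrary choice made only to keep this sketch light.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(6, 16),   # 6 input features, as in the electricity data
    nn.ReLU(),
    nn.Linear(16, 2),   # 2 classes
)
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(6)                      # placeholder for instance.x
y = torch.tensor(1, dtype=torch.long)  # placeholder for instance.y_index

# Add a batch dimension: features (6,) -> (1, 6), target () -> (1,)
X = torch.unsqueeze(x, 0)
y_batch = torch.unsqueeze(y, 0)

# Inference without gradient tracking
with torch.no_grad():
    logits = model(X)                                     # shape (1, 2)
    predicted_index = torch.argmax(logits, dim=1).item()  # plain Python int

# CrossEntropyLoss takes raw logits (N, C) and class indices (N,)
loss = loss_fn(logits, y_batch)

print(X.shape, logits.shape, predicted_index, loss.item())
```

Calling `.item()` converts the predicted index from a zero-dimensional tensor into a plain Python integer, which is often more convenient when handing the prediction to non-PyTorch code.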


In [4]:
from capymoa.evaluation import ClassificationEvaluator

# Creating the evaluator
evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

# Re-create the stream to start from the beginning
elec_stream = ElectricityTiny()
i = 0
while elec_stream.has_more_instances():
    i += 1
    instance = elec_stream.next_instance()
    if model is None:
        moa_instance = instance.java_instance.getData()
        # initialize the model and send it to the device
        model = NeuralNetwork(input_size=moa_instance.numInputAttributes(), number_of_classes=moa_instance.numClasses()).to(device)
        # set the optimizer
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
        print(model)
    
    X = torch.tensor(instance.x, dtype=torch.float32)
    y = torch.tensor(instance.y_index, dtype=torch.long)
    # set the device and add a dimension to the tensor
    X, y = torch.unsqueeze(X.to(device), 0), torch.unsqueeze(y.to(device),0) 
    
    # turn off gradient collection for test
    with torch.no_grad():
        pred = model(X)
        prediction = torch.argmax(pred)

    # update evaluator with predicted class
    evaluator.update(instance.y_label, instance.schema.get_value_for_index(prediction))
  
    # Compute prediction error
    pred = model(X)
    loss = loss_fn(pred, y)

    # Backpropagation
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    
    if i % 500 == 0:
        print(f'Accuracy at {i} : {evaluator.accuracy()}')
    
print(f'Accuracy at {i} : {evaluator.accuracy()}')

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=6, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=2, bias=True)
  )
)


Accuracy at 500 : 54.800000000000004


Accuracy at 1000 : 60.199999999999996


Accuracy at 1500 : 65.06666666666666


Accuracy at 2000 : 64.0
Accuracy at 2000 : 64.0
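
The loop above follows a test-then-train pattern: each instance is first used to make a prediction that updates the evaluator, and only afterwards to update the model. The following library-free sketch illustrates the same pattern. The `MajorityClassLearner` class, the `prequential_accuracy` function, and the toy stream are hypothetical names and data introduced for illustration; they are not part of CapyMOA or MOA.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy learner that predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # no label has been seen yet
        return self.counts.most_common(1)[0][0]

    def train(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Return accuracy (in percent) using test-then-train evaluation."""
    correct = 0
    total = 0
    for x, y in stream:
        # 1) Test: predict before the learner has seen this label
        if learner.predict(x) == y:
            correct += 1
        # 2) Train: only now update the learner with the instance
        learner.train(x, y)
        total += 1
    return 100.0 * correct / total if total else 0.0


# Toy stream of (features, label) pairs
toy_stream = [([0.1], 1), ([0.2], 1), ([0.3], 0), ([0.4], 1), ([0.5], 0)]
accuracy = prequential_accuracy(toy_stream, MajorityClassLearner())
print(f"Prequential accuracy: {accuracy:.1f}%")  # prints "Prequential accuracy: 40.0%"
```

Because every prediction is made before the corresponding label is used for training, the reported accuracy reflects performance on data the model has not yet learned from.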


## 2. How to use TensorBoard with PyTorch

Install TensorBoard from the command line to visualize the data you log:

```sh
pip install tensorboard
```

Clear any logs from previous runs:

```sh
rm -rf ./runs
```

In [5]:
!pip install tensorboard

Create a SummaryWriter instance.

In [6]:
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()

By default, the writer outputs to a new subdirectory of ./runs/ for each run.
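
To send logs somewhere other than the default location, a directory can be passed through the `log_dir` argument of `SummaryWriter`. The sketch below writes a single scalar to a temporary directory and then lists the event file that was created. The temporary directory, the `tb_demo_` prefix, and the logged value are arbitrary choices made for illustration.

```python
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

# Use a throwaway directory so this example does not touch ./runs/
log_dir = tempfile.mkdtemp(prefix="tb_demo_")
writer = SummaryWriter(log_dir=log_dir)

# Log one illustrative value: tag, scalar value, global step
writer.add_scalar("accuracy", 50.0, 1)

# Make sure pending events are written to disk, then release the writer
writer.flush()
writer.close()

# TensorBoard looks for files whose names contain "tfevents"
event_files = [name for name in os.listdir(log_dir) if "tfevents" in name]
print(log_dir, event_files)
```

Pointing `tensorboard --logdir` at the same directory would then display the logged scalar.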

In [7]:
from capymoa.evaluation import ClassificationEvaluator

# Creating the evaluator
evaluator = ClassificationEvaluator(schema=elec_stream.get_schema())

# Re-create the stream to start from the beginning
elec_stream = ElectricityTiny()
i = 0
while elec_stream.has_more_instances():
    i += 1
    instance = elec_stream.next_instance()
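    # If the previous training cell was run, `model` already exists, so this
    # branch is skipped and training continues from the existing weights
    if model is None: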
        moa_instance = instance.java_instance.getData()
        # initialize the model and send it to the device
        model = NeuralNetwork(input_size=moa_instance.numInputAttributes(), number_of_classes=moa_instance.numClasses()).to(device)
        # set the optimizer
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
        print(model)
    
    X = torch.tensor(instance.x, dtype=torch.float32)
    y = torch.tensor(instance.y_index, dtype=torch.long)
    # set the device and add a dimension to the tensor
    X, y = torch.unsqueeze(X.to(device), 0), torch.unsqueeze(y.to(device),0) 
    
    # turn off gradient collection for test
    with torch.no_grad():
        pred = model(X)
        prediction = instance.schema.get_value_for_index(torch.argmax(pred))

    # update evaluator with predicted class
    evaluator.update(instance.y_label, prediction)
  
    # Compute prediction error
    pred = model(X)
    loss = loss_fn(pred, y)

    # Backpropagation
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    
    if i % 500 == 0:
        print(f'Accuracy at {i} : {evaluator.accuracy()}')
        writer.add_scalar("accuracy", evaluator.accuracy(), i)
    
writer.add_scalar("accuracy", evaluator.accuracy(), i)
writer.flush()

Accuracy at 500 : 55.00000000000001


Accuracy at 1000 : 60.9


Accuracy at 1500 : 65.8


Accuracy at 2000 : 64.64999999999999


Call the flush() method to make sure that all pending events have been written to disk.

See the torch.utils.tensorboard tutorials for more TensorBoard visualization types you can log.


In [8]:
# If you do not need the summary writer anymore, call close() method.
writer.close()



## Run TensorBoard
Now, start TensorBoard, specifying the root log directory you used above.
The ``logdir`` argument points to the directory where TensorBoard looks for
event files that it can display. TensorBoard recursively walks
the directory structure rooted at ``logdir``, looking for ``.*tfevents.*`` files.

```sh
tensorboard --logdir=runs
```
Go to the URL it provides.

This dashboard shows how the accuracy changes over time.
You can also use it to track training speed, learning rate, and other
scalar values.
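
As a rough sketch of tracking other scalars, the example below logs the training loss and the optimizer's current learning rate at every step. It is self-contained and makes several assumptions for illustration: a single `nn.Linear` layer stands in for the model, random tensors stand in for stream instances, and the tag names `loss` and `learning_rate` as well as the temporary log directory are arbitrary.

```python
import tempfile

import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter

# Throwaway log directory (an assumption for this sketch)
writer = SummaryWriter(log_dir=tempfile.mkdtemp(prefix="tb_scalars_"))

# Minimal stand-in model: 6 features, 2 classes
model = nn.Linear(6, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

logged_losses = []
for step in range(1, 6):
    # Random data in place of a real stream instance
    X = torch.rand(1, 6)
    y = torch.randint(0, 2, (1,))

    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    # Each tag becomes its own chart in the TensorBoard "Scalars" tab
    writer.add_scalar("loss", loss.item(), step)
    writer.add_scalar("learning_rate", optimizer.param_groups[0]["lr"], step)
    logged_losses.append(loss.item())

writer.close()
print(logged_losses)
```

Reading the learning rate from `optimizer.param_groups` rather than hard-coding it means the logged value stays correct if a learning-rate scheduler is added later.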