# pybela tutorial
In this workshop we'll use Jupyter notebooks and Python to:
1. Record a dataset of potentiometer values
2. Train an RNN to predict the potentiometer's values
3. Cross-compile and deploy the model to run in real time on Bela

First, we need to copy the dataset-capture code to Bela. Connect the Bela to your laptop, wait a few seconds for the connection to be established, and run the two cells below. The first adds Bela's host key to your `known_hosts` file and the second copies the project over:

In [None]:
! ssh-keyscan $BBB_HOSTNAME >> ~/.ssh/known_hosts

In [None]:
! rsync -rvL bela-code/dataset-capture root@$BBB_HOSTNAME:Bela/projects/

## 1 - Collect dataset
We will record a dataset of potentiometer movements. 
- Connect the left and right pins of the potentiometer to the ground and 3.3V pins in Bela and the middle pin to the analog input A0.
- Run the `dataset-capture` project on Bela (you can do so from the IDE)
- Connect your phone to the Bela audio input with an aux cable and play a song. Connect your headphones to the Bela audio output.

The potentiometer controls the shape of an LFO applied to the input audio signal. Play with the potentiometer for a bit. When you are ready for a 1-2 min performance, run the two cells below: the first connects to Bela and the second records the `pot` variable.


In [None]:
from pybela import Logger
import asyncio
import os

logger=Logger(ip=os.environ["BBB_HOSTNAME"])
logger.connect()

In [None]:
file_paths = logger.start_logging(variables=["pot"])
await asyncio.sleep(10)
logger.stop_logging()
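
Top-level `await` works in the cell above because Jupyter runs cells inside an event loop. If you move the recording into a plain Python script, the same start/sleep/stop pattern needs `asyncio.run()`. The sketch below shows that pattern with a hypothetical `StubLogger` standing in for the pybela `Logger`. Only the `start_logging` and `stop_logging` method names come from this tutorial; the stub's internals and return value are made up for illustration.

```python
import asyncio
import time


class StubLogger:
    """Hypothetical stand-in for pybela's Logger, used only to show the timing pattern."""

    def __init__(self):
        self.started_at = None
        self.stopped_at = None

    def start_logging(self, variables):
        self.started_at = time.monotonic()
        # Made-up return value; the real Logger returns its own structure.
        return {"variables": list(variables)}

    def stop_logging(self):
        self.stopped_at = time.monotonic()


async def record(logger, variables, duration_s):
    """Start logging, wait duration_s seconds without blocking the event loop, then stop."""
    info = logger.start_logging(variables=variables)
    await asyncio.sleep(duration_s)
    logger.stop_logging()
    return info


stub = StubLogger()
# Use e.g. 90 for a 1-2 min performance; 0.2 keeps this demo short.
info = asyncio.run(record(stub, ["pot"], duration_s=0.2))
elapsed = stub.stopped_at - stub.started_at
print(f"recorded {info['variables']} for {elapsed:.2f} s")
```

Inside a notebook, call `await record(...)` directly instead, since `asyncio.run()` cannot be called from an event loop that is already running.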

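The recording cell stops after the `asyncio.sleep(10)` call, i.e. after 10 seconds. For a 1-2 min performance, raise that value to 60-120 and rerun the cell; you can also record for longer if you prefer. Once the recording has finished, run the next cell to load the recorded file: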

In [None]:
raw = logger.read_binary_file(
    file_path=file_paths["local_paths"]["pot"],
    timestamp_mode=logger.get_prop_of_var("pot", "timestamp_mode"),
)

## 2 - Train model
Now we are ready to train our model. First, import the necessary dependencies:

In [None]:
import torch
import torch.nn as nn
import numpy as np
from tqdm import tqdm 
import pprint as pp
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader

The recording was loaded above with `logger.read_binary_file()`. The first line of the next cell flattens its buffers into a single list of samples. We then build a PyTorch-compatible dataset with the `PotentiometerDataset` class, which splits the recording into non-overlapping sequences of 32 values. Each sequence is passed to the network, which predicts the following sequence of 32 values.

In [None]:
data = [data for _buffer in raw["buffers"] for data in _buffer["data"]]

seq_len = 32
batch_size = 64

class PotentiometerDataset(Dataset):
    def __init__(self, data, seq_len=32):
        super().__init__()
        
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        # make len divisible by seq_len
        data = data[:len(data) - (len(data) % seq_len)]
        sequences = [data[i:i+seq_len] for i in range(0, len(data), seq_len)]

        self.inputs = torch.tensor(sequences[:-1]).float().to(self.device)
        self.outputs = torch.tensor(sequences[1:]).float().to(self.device)
        
    def __len__(self):
        return len(self.inputs)
    
    def __getitem__(self, i):
        return self.inputs[i].unsqueeze(dim=1), self.outputs[i].unsqueeze(dim=1)
    
dataset = PotentiometerDataset(data, seq_len)

# Split dataset
train_count = int(0.9 * len(dataset))
test_count = len(dataset) - train_count
train_dataset, test_dataset = torch.utils.data.random_split(
    dataset, (train_count, test_count)
)

# Dataloaders
train_loader = DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(
    test_dataset, batch_size=batch_size, shuffle=True)
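
To see how `PotentiometerDataset` pairs inputs and targets, here is a minimal pure-Python sketch of the same flatten, trim, and split steps on toy data. The `toy_raw` dictionary is made up and only mimics the `buffers`/`data` layout used above; the sequence length is shortened to 4 so the result is easy to read.

```python
# Made-up stand-in for the structure returned by read_binary_file().
toy_raw = {"buffers": [{"data": [0, 1, 2, 3, 4]}, {"data": [5, 6, 7, 8, 9, 10, 11, 12, 13]}]}
toy_seq_len = 4

# Flatten all buffers into one list of samples.
toy_data = [x for buf in toy_raw["buffers"] for x in buf["data"]]

# Drop trailing samples so the length is a multiple of toy_seq_len.
toy_data = toy_data[:len(toy_data) - (len(toy_data) % toy_seq_len)]

# Non-overlapping chunks of toy_seq_len samples.
toy_sequences = [toy_data[i:i + toy_seq_len] for i in range(0, len(toy_data), toy_seq_len)]

# Each chunk is the input for the chunk that follows it.
toy_inputs, toy_targets = toy_sequences[:-1], toy_sequences[1:]

for x, y in zip(toy_inputs, toy_targets):
    print(x, "->", y)
# [0, 1, 2, 3] -> [4, 5, 6, 7]
# [4, 5, 6, 7] -> [8, 9, 10, 11]
```

With N chunks you get N-1 input/target pairs, so a recording must span at least two full chunks (64 samples at `seq_len = 32`) to yield any training example.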

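Below we define a simple RNN with a hidden size of 64. We will use an SGD optimiser with a learning rate of 0.001 and the mean squared error as the criterion. The training loop takes the square root of that criterion, so the loss it optimises and reports is the RMSE.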

In [None]:
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True, nonlinearity='relu')
        self.fc = nn.Linear(hidden_size, output_size)
        
        self.initialize_weights()
        
    def initialize_weights(self):
        for name, param in self.named_parameters():
            if 'weight' in name:
                nn.init.xavier_uniform_(param)
            elif 'bias' in name:
                nn.init.constant_(param, 0)
            
    def forward(self, x):
        # Initialize hidden state with zeros
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        
        # Forward propagate the RNN
        out, _ = self.rnn(x, h0)
        
        # Apply the linear layer to get the final output
        out = self.fc(out)

        return out
    
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = RNN(input_size=1, hidden_size=64, output_size=1).to(device=device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
criterion = torch.nn.MSELoss(reduction='mean')
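
A quick shape check can catch dimension mistakes before training. The sketch below pushes a random batch through a standalone `nn.RNN` plus `nn.Linear` pair configured like the model above. The `dummy`, `rnn`, and `fc` names and the random input are illustrative only.

```python
import torch
import torch.nn as nn

batch, steps, features, hidden = 64, 32, 1, 64

rnn = nn.RNN(features, hidden, batch_first=True, nonlinearity="relu")
fc = nn.Linear(hidden, features)

# Random stand-in for one batch of potentiometer sequences.
dummy = torch.rand(batch, steps, features)

with torch.no_grad():
    hidden_states, last_hidden = rnn(dummy)  # h0 defaults to zeros when omitted
    prediction = fc(hidden_states)           # linear layer applied at every time step

print(hidden_states.shape)  # torch.Size([64, 32, 64])
print(last_hidden.shape)    # torch.Size([1, 64, 64])
print(prediction.shape)     # torch.Size([64, 32, 1])
```

The prediction has the same `(batch_size, seq_len, 1)` shape as the targets returned by the dataloaders, which is what `MSELoss` needs to compare them element by element.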

We can now train our model:

In [None]:
epochs = 50

print("Running on device: {}".format(device))
for epoch in range(1, epochs+1):

    print("█▓░ Epoch: {} ░▓█".format(epoch))

    # training loop
    train_it_losses = np.array([])
    model.train()

    for batch_idx, (data, targets) in enumerate(tqdm(train_loader)):
        # (batch_size, seq_len, input_size)
        data = data.to(device=device, non_blocking=True)
        # (batch_size, seq_len, input_size)
        targets = targets.to(device=device, non_blocking=True)

        optimizer.zero_grad(set_to_none=True)  # lower memory footprint
        out = model(data)
        train_loss = torch.sqrt(criterion(out, targets))
        train_it_losses = np.append(train_it_losses, train_loss.item())
        train_loss.backward()
        optimizer.step()

    # test loop
    test_it_losses = np.array([])
    model.eval()  # switch to evaluation mode once, before iterating

    for batch_idx, (data, targets) in enumerate(tqdm(test_loader)):
        # (batch_size, seq_len, input_size)
        data = data.to(device=device, non_blocking=True)
        # (batch_size, seq_len, output_size)
        targets = targets.to(device=device, non_blocking=True)
        with torch.no_grad():  # no gradients are needed for evaluation
            out = model(data)
            test_loss = torch.sqrt(criterion(out, targets))
        test_it_losses = np.append(test_it_losses, test_loss.item())

    losses = {
        "train_loss": train_it_losses.mean().round(8),
        "test_loss": test_it_losses.mean().round(8),
    }
    pp.pprint(losses, sort_dicts=False)
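
Note that the loop reports the mean of per-batch RMSE values, which is not the same as the RMSE over the whole epoch when the batch errors differ. A small worked example with made-up MSE values for two equally sized batches shows the gap:

```python
import math

# Made-up per-batch MSE values for two equally sized batches.
batch_mse = [0.01, 0.09]

# What the loop above reports: sqrt per batch, then the mean.
mean_of_rmse = sum(math.sqrt(m) for m in batch_mse) / len(batch_mse)

# RMSE over all samples: mean of the MSEs first, then one sqrt.
overall_rmse = math.sqrt(sum(batch_mse) / len(batch_mse))

print(round(mean_of_rmse, 4))  # 0.2
print(round(overall_rmse, 4))  # 0.2236
```

For equally sized batches the mean of per-batch RMSEs never exceeds the overall RMSE, because the square root is concave. Either number is fine for tracking progress, as long as the train and test losses use the same definition.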

**NOTE:** If you get a `RuntimeError: could not create a primitive descriptor for a matmul primitive` error here, check `readme-silicon.md`. This error seems to occur when training in a Jupyter notebook running in a Docker container on a Mac with an M1/M2 chip. `readme-silicon.md` contains instructions for running the notebook locally (on your laptop, not in the container), which avoids the error.

Let's make sure the model trained correctly by visualising some of the predictions in the test set. 

In [None]:
# Select random indexes for plotting
num_figures = 8
random_indexes = np.random.choice(len(test_dataset), size=num_figures, replace=False)

# Calculate the number of rows and columns for the subplots
num_rows = num_figures // 2
num_cols = 2

# Set up subplots
fig, axes = plt.subplots(num_rows, num_cols, figsize=(12, 3 * num_rows))

# Flatten the axes array to simplify iteration
axes = axes.flatten()

# Loop through random indexes and plot predictions
for idx, ax in zip(random_indexes, axes):
    sample, target = test_dataset[idx]  # avoid shadowing the built-in input()
    output = model(sample.unsqueeze(0))
    
    ax.plot(target.view(-1).detach().cpu(), label='Target')
    ax.plot(output.view(-1).detach().cpu(), label='Predictions')
    ax.set_xlabel('Time')
    ax.set_ylabel('Value')
    ax.legend()
    ax.set_ylim(0, 3)
    ax.set_title(f'Figure for Index {idx}')

# Hide any empty subplots
for i in range(len(random_indexes), len(axes)):
    axes[i].axis('off')

plt.tight_layout()
plt.show()


When you're ready, save the model so that we can export it into Bela.

In [None]:
model.to(device='cpu')
model.eval()
script = torch.jit.script(model)
path = "bela-code/pot-inference/model.jit"
script.save(path)

In [None]:
torch.jit.load(path) # check model is properly saved
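
Loading the file only confirms that it deserialises. A stricter check compares the scripted model's output with the eager model's output on the same input. The sketch below does this for a small stand-in module (`TinyRNN`, hypothetical, mirroring the structure of the `RNN` class above) and saves to an in-memory buffer instead of `model.jit`, so it does not touch the file you just exported.

```python
import io

import torch
import torch.nn as nn


class TinyRNN(nn.Module):
    """Small stand-in with the same layer layout as the tutorial's RNN."""

    def __init__(self, input_size: int, hidden_size: int, output_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True, nonlinearity="relu")
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Zero initial hidden state, built the same way as in the tutorial's model.
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        return self.fc(out)


torch.manual_seed(0)
eager = TinyRNN(1, 8, 1).eval()
scripted = torch.jit.script(eager)

# Round-trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

x = torch.rand(2, 32, 1)
with torch.no_grad():
    same = torch.allclose(eager(x), reloaded(x), atol=1e-6)
print("outputs match:", same)
```

To apply the same check to the trained model, compare `model(x)` with `torch.jit.load(path)(x)` for an input `x` of shape `(1, 32, 1)`.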

## 3 - Deploy and run

The cell below will cross-compile and deploy the project to Bela.

In [None]:
! cd bela-code/pot-inference/ && sh build.sh

Once deployed, open a terminal on Bela (from your regular terminal, run `ssh root@bela.local`) and start the project with:
```bash
cd Bela/projects/pot-inference
./pot-inference --modelPath model.jit
```