# Undertale & Deltarune Soundtrack Generator

---

## Table of Contents

0. [**Table of Contents**](#Table-of-Contents)

1. [**Imports**](#Imports)

2. [**Data Processing**](#Data-Processing)

    2.1 [Data Loading](#Data-Loading)
    
    2.2 [Data Preprocessing](#Data-Preprocessing)
    
    2.3 [Dataset & Dataloader Definition](#Dataset-&-Dataloader-Definition)
    
3. [**Model Definition**](#Model-Definition)
    
4. [**Hyperparameters & Instantiation**](#Hyperparameters-&-Instantiation)

5. [**Training**](#Training)

---

## Imports
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

Import required packages

In [1]:
import os                                         # File handling
import itertools                                  # chain() for merging lists
import random                                     # Shuffling
import collections                                # Useful tools like Counter, OrderedDict
import math                                       # For... math
from decimal import Decimal                       # Scientific notations in string formatting
from time import time                             # For use in progress bar

import tqdm.auto as tqdm                          # Progress bar

from IPython.display import clear_output

import torch                                      # Deep Learning Framework
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

import matplotlib.pyplot as plt                   # Plotting training progress
from matplotlib.ticker import AutoLocator
%matplotlib inline

fig_bg_color = "lightsteelblue"
plot_bg_color = "slategray"
fontsize = 20

---

## Data Processing
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Data Loading
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

Read the text files in the target directory.

Do some processing to make sure the texts are clean.

In [2]:
def get_texts(texts_dir):

    if not os.path.isdir(texts_dir):
        raise FileNotFoundError("given text directory not found: {}".format(texts_dir))

    texts = []
    
    for text_path in (file.path for file in os.scandir(texts_dir) if file.is_file() and file.name.endswith(".txt")):
        with open(file=text_path, mode='r', encoding="utf-8") as text_file:
            
            text = text_file.read().strip()

            if not text.replace(' ', '').isdigit():
                raise RuntimeError("one or more characters other than digits and white spaces are detected: {}".format(text_path))

            while "  " in text:
                text = text.replace("  ", ' ')
            
            texts.append((text_path, text))
    
    return dict(texts)


[(os.path.split(text_path)[1], text[:20]) for text_path, text in get_texts("./source/converted_texts").items()]

[('ANOTHER_HIM_-_DeltaRune.txt', '42 46 49 53 0 42 46 '),
 ('A_Town_Called_Hometown_Deltarune_-_Arranged_for_Piano.txt',
  '73 89 0 73 89 0 73 8'),
 ('Basement_Deltarune_-_Arranged_for_Piano.txt', '39 51 0 39 51 0 39 5'),
 ('Before_the_Story_Deltarune_-_Arranged_for_piano_.txt',
  '48 0 48 0 48 0 48 0 '),
 ('Card_Castle_Deltarune_-_Arranged_for_Piano.txt', '39 0 39 0 39 0 39 0 '),
 ('Checker_Dance_Deltarune_-_Arranged_for_Piano.txt', '30 0 30 0 30 0 30 0 '),
 ('Deltarune_-_Beginning.txt', '48 55 0 48 55 0 48 5'),
 ('Deltarune_-_Chaos_King.txt', '27 39 0 27 39 0 27 3'),
 ('Deltarune_-_Darkness_Falls.txt', '61 64 71 75 0 61 64 '),
 ('Deltarune_-_Dont_Forget_Ending_Theme_Solo_Piano_Version.txt',
  '77 0 77 0 77 0 77 0 '),
 ('Deltarune_-_Friendship.txt', '74 0 74 0 74 0 74 0 '),
 ('Deltarune_-_Gallery.txt', '32 36 39 68 0 32 36 '),
 ('Deltarune_-_Lancer_Battle.txt', '62 0 62 0 62 0 0 0 0'),
 ('DELTARUNE_-_Lancer_piano_solo.txt', '0 0 0 62 0 62 0 62 0'),
 ('Deltarune_-_Lantern.txt', '49 0 4

### Data Preprocessing
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

Get integers out of the text and make lists of ints.

These lists can be used for the input of the models, or be further processed to compress or simplify the sequences.

In this notebook, I'll leave the data as it is and do note-by-note. (Similar to Character-By-Character approach)

In [3]:
def texts_to_intlists(text_list):
    
    intlists = []
    
    for i, text in enumerate(iterable=text_list):
        
        int_strings = text.split(' ')
        
        if not all(int_str.isdigit() for int_str in int_strings):
            raise RuntimeError("non-digit string detected in text {}".format(i))

        ints = [int(int_str) for int_str in int_strings]
        
        intlists.append(ints)
        
    return intlists


print([ints[:10] for ints in texts_to_intlists(get_texts("./source/converted_texts").values())])

[[42, 46, 49, 53, 0, 42, 46, 49, 53, 0], [73, 89, 0, 73, 89, 0, 73, 89, 0, 73], [39, 51, 0, 39, 51, 0, 39, 51, 0, 39], [48, 0, 48, 0, 48, 0, 48, 0, 48, 0], [39, 0, 39, 0, 39, 0, 39, 0, 39, 0], [30, 0, 30, 0, 30, 0, 30, 0, 30, 0], [48, 55, 0, 48, 55, 0, 48, 55, 0, 48], [27, 39, 0, 27, 39, 0, 27, 39, 0, 27], [61, 64, 71, 75, 0, 61, 64, 71, 75, 0], [77, 0, 77, 0, 77, 0, 77, 0, 77, 0], [74, 0, 74, 0, 74, 0, 74, 0, 74, 0], [32, 36, 39, 68, 0, 32, 36, 39, 68, 0], [62, 0, 62, 0, 62, 0, 0, 0, 0, 65], [0, 0, 0, 62, 0, 62, 0, 62, 0, 62], [49, 0, 49, 0, 49, 0, 49, 0, 49, 0], [31, 43, 0, 31, 43, 0, 31, 43, 0, 31], [24, 31, 0, 24, 31, 0, 24, 31, 0, 24], [45, 57, 0, 45, 57, 0, 45, 57, 0, 45], [39, 0, 39, 0, 39, 0, 39, 0, 39, 0], [46, 0, 46, 0, 46, 0, 46, 0, 46, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [58, 70, 0, 58, 70, 0, 58, 70, 0, 58], [37, 49, 0, 37, 49, 0, 37, 49, 0, 37], [44, 68, 0, 44, 68, 0, 44, 68, 0, 44], [67, 0, 67, 0, 67, 0, 67, 0, 67, 0], [61, 0, 61, 0, 61, 0, 61, 0, 61, 0], [49, 0, 49, 0, 

### Dataset & Dataloader Definition
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

Create a Dataset class from which training data can be sampled.

This Dataset should convert the encoded sequence above into tensors

and have a method for shuffling the order of multiple sequences while

leaving the patterns inside of each sequence untouched.

In [4]:
class UndertaleDeltaruneDataset(Dataset):
    def __init__(self, texts_dir, batch_size=1):
        self.texts = get_texts(texts_dir) # read and get a dictionary of {file_paths: text_contents}
        self.sequences = texts_to_intlists(self.texts.values())

        self.texts_dir = texts_dir
        self.batch_size = batch_size

    def __len__(self):
        return self.batch_size

    def data_len(self):
        return sum([len(sequence) for sequence in self.sequences])

    def __getitem__(self, index):
        shuffled_list = list(itertools.chain(*random.sample(self.sequences, len(self.sequences))))
        inputs = torch.LongTensor(shuffled_list[:-1])
        labels = torch.LongTensor(shuffled_list[1:])
        return inputs, labels

Create a custom class that loads the data from the dataset above and

allows iteration over the dataset, yielding a small sequence batch at a time.

In [None]:
class UDBatchLoader:
    def __init__(self, ud_dataset, batch_size, sequence_len, drop_last=False, batch_first=True):
        self.ud_dataset = ud_dataset
        self.batch_size = batch_size
        self.sequence_len = sequence_len
        self.drop_last = drop_last
        self.batch_first = batch_first
    
    def __len__(self):
        if self.drop_last:
            return math.floor((self.ud_dataset.data_len() - 1) / self.sequence_len)
        return math.ceil((self.ud_dataset.data_len() - 1) / self.sequence_len)
    
    def generator(self):
        seq_len = self.sequence_len
        n_seq_batches = self.__len__()
        batch_first = self.batch_first
        
        input_batch, target_batch = next(iter(DataLoader(self.ud_dataset, self.batch_size)))
        if not batch_first:
            input_batch = input_batch.transpose(0, 1).contiguous()
            target_batch = target_batch.transpose(0, 1).contiguous()
        
        for start, end in zip(range(0, seq_len * n_seq_batches, seq_len), range(seq_len, (seq_len + 1) * n_seq_batches, seq_len)):
            if batch_first:
                yield (input_batch[:, start:end].contiguous(), target_batch[:, start:end].contiguous())
            else:
                yield (input_batch[start:end], target_batch[start:end])
    
    def __iter__(self):
        return self.generator()

---

## Model Definition
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

Define the model architectures.

In [None]:
class UDNet(nn.Module):
    def __init__(self, hidden_size, num_layers, dropout):
        super(UDNet, self).__init__()

        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.dropout = dropout

        self.init_hiddens = nn.Parameter(torch.randn(num_layers, 1, hidden_size))
        self.init_cells   = nn.Parameter(torch.randn(num_layers, 1, hidden_size))

        self.embed = nn.Embedding(num_embeddings=129, embedding_dim=hidden_size)

        self.lstm = nn.LSTM(input_size=hidden_size, hidden_size=hidden_size, num_layers=num_layers, dropout=dropout, batch_first=True)

        self.fc0 = nn.Sequential(
            nn.LayerNorm(hidden_size),
            nn.Dropout(p=dropout),
            nn.Linear(in_features=hidden_size, out_features=256)
        )

        self.fc1 = nn.Sequential(
            nn.ReLU(),
            nn.LayerNorm(256),
            nn.Dropout(p=dropout),
            nn.Linear(in_features=256, out_features=512)
        )

        self.fc2 = nn.Sequential(
            nn.ReLU(),
            nn.LayerNorm(512),
            nn.Dropout(p=dropout),
            nn.Linear(in_features=512, out_features=129)
        )

    def forward(self, x, hiddens=None):
        if hiddens is None:
            hiddens = self.get_init_hiddens(x.size(0))

        x = self.embed(x)

        x, new_hiddens = self.lstm(x, hiddens)

        x = self.fc0(x)
        x = self.fc1(x)
        x = self.fc2(x)

        return x, new_hiddens

    def get_init_hiddens(self, n_batches):
        return [self.init_hiddens.repeat(1, n_batches, 1), self.init_cells.repeat(1, n_batches, 1)]

---

## Training
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

In [None]:
def free_running_generation(generator, inputs, seq_len):
    """
    Params
    ======
    generator (Generator): the generator model.
    inputs (LongTensor): 2D tensor with dimensions [n_batches, 1].
                         If the given dimensions are [n_batches, seq_len],
                         then only the first timestep is used.
    seq_len (int): length of sequence to generate.

    Returns
    =======
    output_sequence (LongTensor): tensor of generated outputs.
    hiddens (list[Tensor]): updated hidden states.
    """
    output_sequence = []
    if return_all_hiddens:
        internal_hiddens = []

    hiddens = generator.get_init_hiddens(inputs.size(0))
    x = inputs[:, :1]

    for i in range(seq_len):
        y, hiddens = generator(x, hiddens, return_all_hiddens=False)
        y = torch.multinomial(y.squeeze(1).softmax(dim=-1), num_samples=1)
        output_sequence.append(y)
        x = y
    output_sequence = torch.cat(output_sequence, dim=1)

    return output_sequence, hiddens

In [None]:
seed                   = 0
batch_size             = 4
sequence_length        = 12800

n_logs                 = 30

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

random.seed(seed)
torch.manual_seed(seed)

ud_dataset = UndertaleDeltaruneDataset("./source/converted_texts", batch_size)
ud_loader = UDBatchLoader(ud_dataset, batch_size, sequence_length, drop_last=True, batch_first=True)

model = UDNet(hidden_size=256, num_layers=10, dropout=0.2).to(device)

optimizer = optim.Adam(model.parameters(), lr=1e-3)

lr_scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=5, verbose=True)

print()
print('Data Sequence Total Length:', ud_dataset.data_len())
print()
print(model)


Data Sequence Total Length: 893324

UDNet(
  (embed): Embedding(129, 256)
  (lstm): LSTM(256, 256, num_layers=10, batch_first=True, dropout=0.2)
  (fc0): Sequential(
    (0): LayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
    (1): Dropout(p=0.2)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (fc1): Sequential(
    (0): ReLU()
    (1): LayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
    (2): Dropout(p=0.2)
    (3): Linear(in_features=256, out_features=512, bias=True)
  )
  (fc2): Sequential(
    (0): ReLU()
    (1): LayerNorm(torch.Size([512]), eps=1e-05, elementwise_affine=True)
    (2): Dropout(p=0.2)
    (3): Linear(in_features=512, out_features=129, bias=True)
  )
)


In [None]:
model.train()
logs = {'epoch': [], 'lr':[], 'loss_avg': [], 'acc_1': [], 'acc_5': []}
i_epoch = 0
while True:
    
    hiddens = model.get_init_hiddens(batch_size)

    running_loss = 0
    n_top1_corrects = 0
    n_top5_corrects = 0
    n_instances = 0
    for i, (inputs, labels) in enumerate(ud_loader):
        print("{:d}/{:d}".format(i, len(ud_loader)-1), end='\r')

        inputs = inputs.to(device)
        labels = labels.view(-1).to(device)

        outputs, hidden_states = model(inputs, hiddens)
        outputs = outputs.view(-1, outputs.size(-1))

        hiddens = [hiddens[0].detach(), hiddens[1].detach()]

        loss = F.cross_entropy(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        n_instances     += labels.size(0)
        running_loss    += loss.item() * labels.size(0)
        top5_match       = outputs.data.topk(k=5, dim=1)[1].eq(labels.unsqueeze(1))
        n_top1_corrects += top5_match[:, 0].sum().item()
        n_top5_corrects += top5_match.sum().item()
        del top5_match

    loss_avg = running_loss / n_instances
    acc_1    = n_top1_corrects / n_instances
    acc_5    = n_top5_corrects / n_instances

    logs['epoch'].append(i_epoch)
    logs['lr'].append(optimizer.param_groups[0]['lr'])
    logs['loss_avg'].append(loss_avg)
    logs['acc_1'].append(acc_1)
    logs['acc_5'].append(acc_5)

    clear_output(wait=True)
    print('\n\n'.join(["Epoch {:d} - LR={:e}\n===============================================\n".format(i_e, lr)
                       + "Average Loss: {:f}\nAverage Top-1 Accuracy: {:f}\nAverage Top-5 Accuracy: {:f}\n".format(l_a, a_1, a_5)
                       + "==============================================="
                       for i_e, lr, l_a, a_1, a_5 in list(zip(*list(logs.values())))[-n_logs:]]), end='\n\n')

    if (i_epoch + 1) % 10 == 0:
        torch.save({'logs': logs, 'state_dict': model.state_dict(), 'optim_dict': optimizer.state_dict(), 'lr_dict': lr_scheduler.state_dict()},
                   "deep/{:d}.pth".format(i_epoch))

    lr_scheduler.step(loss_avg)
    i_epoch += 1

Epoch 76 - LR=1.000000e-04
Average Loss: 0.410981
Average Top-1 Accuracy: 0.900944
Average Top-5 Accuracy: 0.972756

Epoch 77 - LR=1.000000e-04
Average Loss: 0.409148
Average Top-1 Accuracy: 0.901425
Average Top-5 Accuracy: 0.972948

Epoch 78 - LR=1.000000e-04
Average Loss: 0.408127
Average Top-1 Accuracy: 0.901641
Average Top-5 Accuracy: 0.973089

Epoch 79 - LR=1.000000e-04
Average Loss: 0.407110
Average Top-1 Accuracy: 0.901728
Average Top-5 Accuracy: 0.973266

Epoch 80 - LR=1.000000e-04
Average Loss: 0.404146
Average Top-1 Accuracy: 0.902516
Average Top-5 Accuracy: 0.973523

Epoch 81 - LR=1.000000e-04
Average Loss: 0.404962
Average Top-1 Accuracy: 0.902107
Average Top-5 Accuracy: 0.973498

Epoch 82 - LR=1.000000e-04
Average Loss: 0.404567
Average Top-1 Accuracy: 0.902197
Average Top-5 Accuracy: 0.973481

Epoch 83 - LR=1.000000e-04
Average Loss: 0.401844
Average Top-1 Accuracy: 0.902748
Average Top-5 Accuracy: 0.973841

Epoch 84 - LR=1.000000e-04
Average Loss: 0.402269
Average Top-1 

In [13]:
# torch.save({'g': generator.state_dict(), 'd': discriminator.state_dict()}, 'professor_forcing_temp/GD_8500.pth')

In [14]:
# generated_sequence = free_running_generation(generator, inputs[:1], 50000, return_all_hiddens=False)[0]

In [16]:
# torch.save(generated_sequence[0].cpu().numpy(), "professor_forcing_temp/generated_sequence.pth")

---