# Undertale & Deltarune Soundtrack Generator

---

## Table of Contents

0. [**Table of Contents**](#Table-of-Contents)

1. [**Imports**](#Imports)

2. [**Data Processing**](#Data-Processing)

    2.1 [Data Loading](#Data-Loading)
    
    2.2 [Data Preprocessing](#Data-Preprocessing)
    
    2.3 [Dataset Definition](#Dataset-Definition)
    
3. [**Model Definition**](#Model-Definition)

    3.1 [Model Classes](#Model-Classes)
    
    3.2 [Hyperparameters](#Hyperparameters)
    
    3.3 [Model Instantiation](#Model-Instantiation)

4. [**Training**](#Training)
    
    4.1 [Training Function](#Training-Function)
    
    4.2 [Training Session](#Training-Session)

5. [**Saving Trained Model**](#Saving-Trained-Model)

6. [**Generation**](#Generation)

    6.1 [Generation Function](#Generation-Function)
    
    6.2 [Sampling Function](#Sampling-Function)
    
    6.3 [Music Generation](#Music-Generation)

7. [**Final Summary, Notes, and Thoughts**](#Final-Summary,-Notes,-and-Thoughts)

---

## Imports
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

Import required packages:

* os (file handling)
* itertools (`chain()` for merging lists)
* collections (`Counter`, `OrderedDict`)
* random (sequence shuffling)
* tqdm (progress bars)
* PyTorch (deep learning framework)
* Matplotlib (plotting)

In [14]:
import os
import itertools
import random
import collections

import tqdm

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader

import matplotlib.pyplot as plt
%matplotlib inline

---

## Data Processing
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Data Loading
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

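Read every `.txt` file in the target directory.

Each text is stripped of leading and trailing whitespace, checked to contain only digits and spaces, and has runs of repeated spaces collapsed into single spaces.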
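The cleaning steps can be tried on a single in-memory string before pointing the loader at a directory. The following is a minimal sketch, not the notebook's loader: `clean_text` is a hypothetical helper introduced here for illustration, and it collapses repeated spaces by filtering the result of `split(" ")` rather than by a repeated `replace` loop.

```python
def clean_text(raw):
    """Hypothetical helper: strip, validate, and collapse spaces in one string."""
    text = raw.strip()
    # Reject anything other than digits and spaces, as the loader does.
    if not text.replace(" ", "").isdigit():
        raise ValueError("only digits and spaces are allowed")
    # split(" ") yields empty strings for repeated spaces; drop them and rejoin.
    return " ".join(part for part in text.split(" ") if part)


print(clean_text("  42 46  49   53 0 "))  # 42 46 49 53 0
```

As in the loader, a string containing a tab, a newline, or any non-digit character is rejected rather than silently repaired.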

In [16]:
def get_texts(texts_dir):

    if not os.path.isdir(texts_dir):
        raise FileNotFoundError("given text directory not found: {}".format(texts_dir))

    texts = []
    
    for text_path in (file.path for file in os.scandir(texts_dir) if file.is_file() and file.name.endswith(".txt")):
        with open(file=text_path, mode='r', encoding="utf-8") as text_file:
            
            text = text_file.read().strip()

            if not text.replace(' ', '').isdigit():
                raise RuntimeError("one or more characters other than digits and white spaces are detected: {}".format(text_path))

            while "  " in text:
                text = text.replace("  ", ' ')
            
            texts.append(text)
    
    return texts


print([text[:30] for text in get_texts("./source/converted_texts")])

['42 46 49 53 0 42 46 49 53 0 42', '73 89 0 73 89 0 73 89 0 73 89 ', '39 51 0 39 51 0 39 51 0 39 51 ', '48 0 48 0 48 0 48 0 48 0 48 0 ', '39 0 39 0 39 0 39 0 39 0 0 0 0', '30 0 30 0 30 0 30 0 30 0 30 0 ', '48 55 0 48 55 0 48 55 0 48 55 ', '27 39 0 27 39 0 27 39 0 27 39 ', '61 64 71 75 0 61 64 71 75 0 61', '77 0 77 0 77 0 77 0 77 0 77 0 ', '74 0 74 0 74 0 74 0 74 0 74 0 ', '32 36 39 68 0 32 36 39 68 0 32', '62 0 62 0 62 0 0 0 0 65 0 65 0', '0 0 0 62 0 62 0 62 0 62 0 65 0', '49 0 49 0 49 0 49 0 49 0 49 0 ', '31 43 0 31 43 0 31 43 0 31 43 ', '24 31 0 24 31 0 24 31 0 24 31 ', '45 57 0 45 57 0 45 57 0 45 57 ', '39 0 39 0 39 0 39 0 39 0 39 0 ', '46 0 46 0 46 0 46 0 46 0 46 0 ', '0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ', '58 70 0 58 70 0 58 70 0 58 70 ', '37 49 0 37 49 0 37 49 0 37 49 ', '44 68 0 44 68 0 44 68 0 44 68 ', '67 0 67 0 67 0 67 0 67 0 67 0 ', '61 0 61 0 61 0 61 0 61 0 61 0 ', '49 0 49 0 49 0 49 0 54 0 54 0 ', '43 74 0 43 74 0 43 74 0 43 74 ', '55 0 55 0 55 0 55 0 55 0 0 0 0', '38 66 0 38 6

### Data Preprocessing
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

In [17]:
def texts_to_intlists(text_list):
    
    intlists = []
    
    for i, text in enumerate(text_list):
        
        int_strings = text.split(' ')
        
        if not all(int_str.isdigit() for int_str in int_strings):
            raise RuntimeError("non-digit string detected in text {}".format(i))

        ints = [int(int_str) for int_str in int_strings]
        
        intlists.append(ints)
        
    return intlists


print([ints[:10] for ints in texts_to_intlists(get_texts("./source/converted_texts"))])

[[42, 46, 49, 53, 0, 42, 46, 49, 53, 0], [73, 89, 0, 73, 89, 0, 73, 89, 0, 73], [39, 51, 0, 39, 51, 0, 39, 51, 0, 39], [48, 0, 48, 0, 48, 0, 48, 0, 48, 0], [39, 0, 39, 0, 39, 0, 39, 0, 39, 0], [30, 0, 30, 0, 30, 0, 30, 0, 30, 0], [48, 55, 0, 48, 55, 0, 48, 55, 0, 48], [27, 39, 0, 27, 39, 0, 27, 39, 0, 27], [61, 64, 71, 75, 0, 61, 64, 71, 75, 0], [77, 0, 77, 0, 77, 0, 77, 0, 77, 0], [74, 0, 74, 0, 74, 0, 74, 0, 74, 0], [32, 36, 39, 68, 0, 32, 36, 39, 68, 0], [62, 0, 62, 0, 62, 0, 0, 0, 0, 65], [0, 0, 0, 62, 0, 62, 0, 62, 0, 62], [49, 0, 49, 0, 49, 0, 49, 0, 49, 0], [31, 43, 0, 31, 43, 0, 31, 43, 0, 31], [24, 31, 0, 24, 31, 0, 24, 31, 0, 24], [45, 57, 0, 45, 57, 0, 45, 57, 0, 45], [39, 0, 39, 0, 39, 0, 39, 0, 39, 0], [46, 0, 46, 0, 46, 0, 46, 0, 46, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [58, 70, 0, 58, 70, 0, 58, 70, 0, 58], [37, 49, 0, 37, 49, 0, 37, 49, 0, 37], [44, 68, 0, 44, 68, 0, 44, 68, 0, 44], [67, 0, 67, 0, 67, 0, 67, 0, 67, 0], [61, 0, 61, 0, 61, 0, 61, 0, 61, 0], [49, 0, 49, 0, 

To use "words" rather than individual "characters" as the model's input and output, treat every `0` as a separator: the values between two consecutive `0`s form one word.

Every distinct word found in the texts becomes a "token".
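The grouping rule is easiest to see on a short list. This sketch isolates only the splitting step, under the assumption that each `0` closes the current group; `split_on_zeros` and the `example` list are hypothetical and are not part of the notebook.

```python
def split_on_zeros(ints):
    """Hypothetical helper: group the values between 0s into sorted tuples."""
    tokens, current = [], []
    for value in ints:
        if value != 0:
            current.append(value)
        else:
            # A 0 closes the current group; sorting makes the key order-independent.
            tokens.append(tuple(sorted(current)))
            current = []
    return tokens


example = [42, 46, 49, 53, 0, 0, 62, 0, 53, 49, 46, 42, 0]
print(split_on_zeros(example))
# [(42, 46, 49, 53), (), (62,), (42, 46, 49, 53)]
```

Two details are worth noting: two `0`s in a row yield the empty tuple `()` as a token of its own, and the first and last groups map to the same tuple even though their values appear in a different order.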

In [29]:
def tokenize(intlists):
    
    counter = collections.Counter()
    tokenized_lists = []
    
    for intlist in intlists:
        token = []
        tokenized = []
        for int_val in intlist:
            if int_val != 0:
                token.append(int_val)
            else:
                # a 0 closes the current token; sorting gives the same key for the same values in any order
                token = tuple(sorted(token))
                counter.update((token,))
                tokenized.append(token)
                token = []
        # note: values after the last 0 of a text are not emitted as a token
        tokenized_lists.append(tokenized)
    
    # assign indices by descending frequency (index 0 is the most common token)
    tokens_token_to_idx = collections.OrderedDict((token_key, i) for i, (token_key, _) in enumerate(counter.most_common()))
    tokens_idx_to_token = collections.OrderedDict((i, token_key) for token_key, i in tokens_token_to_idx.items())
    print(len(tokens_idx_to_token), "tokens")
    
    # replace each token tuple with its integer index, in place
    for tokenized in tokenized_lists:
        for i, token_key in enumerate(tokenized):
            tokenized[i] = tokens_token_to_idx[token_key]

    return tokenized_lists, tokens_idx_to_token

tokenize(texts_to_intlists(get_texts("./source/converted_texts")))    

7521 tokens


([[49,
   49,
   49,
   ...
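Because `tokenize` also returns `tokens_idx_to_token`, the encoding can be reversed back into the flat integer format. The following is a rough sketch under that assumption; `detokenize` and `toy_vocab` are hypothetical names, and the toy vocabulary is made up for illustration rather than taken from the dataset.

```python
def detokenize(token_indices, idx_to_token):
    """Hypothetical inverse of tokenize: expand indices into a flat int list."""
    ints = []
    for idx in token_indices:
        ints.extend(idx_to_token[idx])  # values of this token (may be empty)
        ints.append(0)                  # restore the 0 separator
    return ints


# Toy vocabulary for illustration only; not taken from the dataset.
toy_vocab = {0: (), 1: (48,), 2: (48, 55)}
print(detokenize([2, 0, 1], toy_vocab))  # [48, 55, 0, 0, 48, 0]
```

Since tokens are stored as sorted tuples, the round trip restores which values belong to each group but not the order in which they were originally written.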

### Dataset Definition
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

---

## Model Definition
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Model Classes
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Hyperparameters
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Model Instantiation
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

---

## Training
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Training Function
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Training Session
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

---

## Saving Trained Model
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

---

## Generation
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Generation Function
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Sampling Function
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

### Music Generation
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

---

## Final Summary, Notes, and Thoughts
[(go to top)](#Undertale-&-Deltarune-Soundtrack-Generator)

---