### Learning LSTM

    Author: 彭日骏
    Time: 2025/10/14

Build an RNN and an LSTM for a **MusicGenerationProject**

---

In [5]:
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import TensorDataset, DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

Using device: cuda


#### Load data


---

**Music grammar encoding (Data Representation)**

Done through `grammar.py` and `preprocess.py`.

Define the note types:

`C` (Chord tone)

`S` (Scale tone)

`A` (Approach tone)

`R` (Rest)

`X` (Arbitrary)

**Token format:** (type, duration, interval from the previous note)

The vocabulary is known to contain 78 distinct tokens.
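A token of this form can be pictured as a small record, with the vocabulary mapping each distinct token to an integer index. The following is only a minimal sketch: the `Token` class, its field names, the units, and the sample values are hypothetical, since the actual encoding produced by `grammar.py` is not shown in these notes.

```python
from typing import NamedTuple

class Token(NamedTuple):
    # Hypothetical field names mirroring (type, duration, interval).
    note_type: str   # one of "C", "S", "A", "R", "X"
    duration: float  # note length (unit assumed: quarter notes)
    interval: int    # distance from the previous note (unit assumed: semitones)

# Made-up sample corpus, for illustration only.
corpus = [
    Token("C", 0.5, 0),
    Token("S", 0.25, 2),
    Token("R", 0.25, 0),
    Token("C", 0.5, 0),
]

# Sort the distinct tokens so that the index assignment is deterministic.
vocab = sorted(set(corpus))
token_to_index = {tok: i for i, tok in enumerate(vocab)}
index_to_token = {i: tok for tok, i in token_to_index.items()}

# The corpus as a sequence of integer indices, ready for one-hot encoding.
indices = [token_to_index[tok] for tok in corpus]
```

With the real data the same two dictionaries would have 78 entries each, which appears to be the role of `tones_indices` and `indices_tones` in the loading code.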

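**Serialization and data preparation (Sequence Modeling)**

`preprocess.py` and `music_utils.py` generate:

`Corpus`: one long sequence containing all **Tokens**

`Training dataset`: the sequence is cut into multiple sequences X of length Tx = 30

`Label dataset`: the sequence X shifted one step to the right gives Y. For $X = (x_{0}, x_{1}, \ldots, x_{n})$, $Y = (x_{1}, x_{2}, \ldots, x_{n}, x_{n+1})$, i.e. $x_{0}$ is dropped

`One-Hot encoding`: each Token becomes a 78-dimensional vector with a 1 at the Token's index and 0 everywhere else

X is known to have shape (60, 30, 78): 60 segments, each 30 Tokens long, each Token a 78-dimensional vector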
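The slicing, shifting, and one-hot steps above can be sketched in NumPy as follows. This is an illustration under assumptions, not the code in `preprocess.py`: the function name `build_dataset`, the random choice of window start positions, and the stand-in corpus of random indices are all hypothetical.

```python
import numpy as np

def build_dataset(indices, m=60, Tx=30, n_values=78, seed=0):
    """Cut m windows of length Tx from a token-index sequence and one-hot encode them."""
    rng = np.random.default_rng(seed)
    X = np.zeros((m, Tx, n_values), dtype=np.float32)
    Y = np.zeros((m, Tx, n_values), dtype=np.float32)
    steps = np.arange(Tx)
    for i in range(m):
        # Each window needs Tx + 1 tokens: Tx inputs plus one extra for the shifted labels.
        start = rng.integers(0, len(indices) - Tx)
        window = indices[start:start + Tx + 1]
        X[i, steps, window[:-1]] = 1.0  # inputs: first Tx tokens of the window
        Y[i, steps, window[1:]] = 1.0   # labels: same tokens shifted by one step
    return X, Y

# Stand-in corpus of random token indices (assumption: the real one comes from the MIDI file).
corpus_indices = np.random.default_rng(1).integers(0, 78, size=500)
X, Y = build_dataset(corpus_indices)
print(X.shape, Y.shape)  # (60, 30, 78) (60, 30, 78)
```

Because Y is X shifted by one step, `X[:, 1:, :]` and `Y[:, :-1, :]` are identical, which makes a convenient sanity check on real data as well.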

In [7]:
from music_utils import *
from preprocess import *

def load_music_utils():
    # Parse the MIDI file into chords and abstract grammar strings
    chords, abstract_grammars = get_musical_data('data/original_metheny.mid')
    # Build the corpus and the token <-> index lookup tables
    corpus, tones, tones_indices, indices_tones = get_corpus_data(abstract_grammars)
    N_tones = len(set(corpus))
    # 60 segments of 30 tokens each; N_tones is returned again by data_processing
    X, Y, N_tones = data_processing(corpus, tones_indices, 60, 30)
    return (X, Y, N_tones, indices_tones)

X, Y, n_values, indices_values = load_music_utils()

ValueError: not enough values to unpack (expected 2, got 0)

In [None]:
# LSTM model
class temp_predict_LSTM(nn.Module):
    def __init__(self, input_size=1, hidden_layer_size=50, output_size=1):
        '''
        Define the LSTM layer and the linear output layer.
        Parameters:
            input_size: number of features in the input sequence;
                        only one feature (temperature) needs to be predicted
            hidden_layer_size: size of the LSTM hidden state
            output_size: number of features in the prediction
        batch_first=True makes the LSTM take tensors shaped
        (batch_size, seq_length, input_size).
        '''
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        # Change!!!
        self.lstm = nn.LSTM(input_size, hidden_layer_size, batch_first=True)
        self.linear = nn.Linear(hidden_layer_size, output_size)

    def forward(self, input_seq):
        '''
        Forward pass.
        The LSTM needs two initial states: the hidden state h0 and the cell state c0.
        '''
        # Initial hidden state h_0 has shape (num_layers, batch_size, hidden_size)
        # Initial cell state c_0 has shape (num_layers, batch_size, hidden_size)
        # Both h0 and c0 must live on the same device as the input
        h0 = torch.zeros(1, input_seq.size(0), self.hidden_layer_size).to(input_seq.device)
        c0 = torch.zeros(1, input_seq.size(0), self.hidden_layer_size).to(input_seq.device)

        # lstm_out holds the output of every time step,
        # shaped (batch_size, seq_length, hidden_size)
        # (open question: why is this layout more convenient to work with?)
        # hidden is the tuple (h_n, c_n) of the final hidden and cell states
        # The initial states must be passed together as the tuple (h0, c0)
        lstm_out, hidden = self.lstm(input_seq, (h0, c0))

        # Only the output of the last time step is needed
        # lstm_out[:, -1, :] drops the seq_length dimension,
        # leaving a 2-D tensor shaped (batch_size, hidden_size)
        predictions = self.linear(lstm_out[:, -1, :])

        return predictions
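
The tensor shapes discussed in the comments can be checked directly on a bare `nn.LSTM`. The sketch below is an illustration only: the batch size of 4 and the input size of 78 (one one-hot Token per time step) are assumptions chosen to match the music data described earlier, not values used by the class above.

```python
import torch
import torch.nn as nn

batch_size, seq_length, n_values, hidden_size = 4, 30, 78, 50

lstm = nn.LSTM(n_values, hidden_size, batch_first=True)
x = torch.zeros(batch_size, seq_length, n_values)

# Initial states are passed together as one tuple (h_0, c_0),
# each shaped (num_layers, batch_size, hidden_size).
h0 = torch.zeros(1, batch_size, hidden_size)
c0 = torch.zeros(1, batch_size, hidden_size)

out, (h_n, c_n) = lstm(x, (h0, c0))

print(out.shape)  # torch.Size([4, 30, 50])
print(h_n.shape)  # torch.Size([1, 4, 50])
print(c_n.shape)  # torch.Size([1, 4, 50])

# For a single-layer, unidirectional LSTM, the last time step of `out`
# is the same tensor as the final hidden state h_n[0].
last_step_matches = torch.allclose(out[:, -1, :], h_n[0])
print(last_step_matches)  # True
```

If the initial states are omitted, `nn.LSTM` defaults both to zeros, so passing explicit zero tensors mainly serves to make the shapes visible.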

---