# Music Transfomer

# 평가 루브릭

아래의 기준을 바탕으로 프로젝트를 평가합니다.

## 평가문항	상세기준
1. MAESTRO 데이터셋의 전처리를 통해 학습 데이터셋 구성을 체계적으로 진행하였다.: MIDI 파일의 구조와 특성에 맞게 적절한 가공과 augmentation을 통해 학습데이터를 생성하였다.
2. Music Transformer 모델의 구현 및 학습이 원활이 진행되었다.: 모델의 학습이 원활히 진행되었으며, loss가 안정적으로 감소하였다.
3. 다양한 조건 변경을 통해 다양한 음악을 생성하는 실험을 진행하였다.: 생성테스트 전 변경 가능한 조건을 정확히 제시하고, 고품질의 midi 파일을 제출하였다.

# ★ 프로젝트 : 다양한 조건의 음악 생성하기 ★ 
음악 생성 모델을 다루느라 고생 많으셨습니다.

이전 스텝에서 음악 생성 테스트시 바꾸어 볼 수 있는 초기값들을 다양하게 바꾸어 보면서 음악 생성을 테스트해 보시기 바랍니다. 모델과 초기 입력이 같더라도 매번의 시도에 따라 생성된 midi 파일이 달라집니다.

# MIDI 파일구조 분석
- 여러분들 대부분은 MIDI 파일 구조가 낯설 것입니다. 그래서 파일 하나만 골라 어떤 형태로 구성되었는지 살펴보겠습니다.

- 우선 MIDI 파일을 분석하기 위해 다음 라이브러리를 설치해 주세요.

- \$ pip install mido


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
!pip install mido

Collecting mido
[?25l  Downloading https://files.pythonhosted.org/packages/20/0a/81beb587b1ae832ea6a1901dc7c6faa380e8dd154e0a862f0a9f3d2afab9/mido-1.2.9-py2.py3-none-any.whl (52kB)
[K     |██████▎                         | 10kB 17.0MB/s eta 0:00:01[K     |████████████▌                   | 20kB 20.5MB/s eta 0:00:01[K     |██████████████████▊             | 30kB 11.8MB/s eta 0:00:01[K     |█████████████████████████       | 40kB 9.3MB/s eta 0:00:01[K     |███████████████████████████████▏| 51kB 8.1MB/s eta 0:00:01[K     |████████████████████████████████| 61kB 4.8MB/s 
[?25hInstalling collected packages: mido
Successfully installed mido-1.2.9


In [3]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.sequence import pad_sequences

import pandas as pd
import numpy as np

import time
import os
# os.environ["CUDA_VISIBLE_DEVICES"]="1"
import concurrent.futures

import mido

In [4]:
tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [5]:
# 샘플로 1개의 MIDI 파일을 골라봅니다.
midi_file = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/maestro-v2.0.0/2018/MIDI-Unprocessed_Chamber1_MID--AUDIO_07_R3_2018_wav--2.midi'

midi = mido.MidiFile(midi_file)

- MIDI 파일을 열어보면 이벤트의 리스트(list)로 구성되어 있음을 알 수 있습니다. 아래 코드를 통해 MIDI 파일 앞부분의 30개 정도의 이벤트 구조가 어떻게 되어 있는지 파악해 봅시다.

In [6]:
ON = 1
OFF = 0
CC = 2

current_time = 0
eventlist = []
cc = False
for idx, msg in enumerate(midi):
    print('MSG [{}]----------------'.format(idx))
    current_time += msg.time
    print(current_time)
    print(msg.type)
    if msg.type is 'note_on' and msg.velocity > 0:
        event = [current_time, ON, msg.note, msg.velocity]
        print(event)
    elif msg.type is 'note_off' or (msg.type is 'note_on' and msg.velocity == 0):
        event = [current_time, OFF, msg.note, msg.velocity]
        print(event)
        
    if msg.type is 'control_change':
        if msg.control != 64:
            continue
        if cc == False and msg.value > 0:
            cc = True
            event = [current_time, CC, 0, 1]
            print(event)
        elif cc == True and msg.value == 0:
            cc = False
            event = [current_time, CC, 0, 0]
            print(event)

    if idx > 30:
        break

MSG [0]----------------
0
set_tempo
MSG [1]----------------
0
time_signature
MSG [2]----------------
0
program_change
MSG [3]----------------
0
control_change
MSG [4]----------------
0
control_change
MSG [5]----------------
0.5143229166666666
control_change
MSG [6]----------------
0.6328125
control_change
MSG [7]----------------
0.7903645833333333
control_change
MSG [8]----------------
0.9999999999999999
control_change
MSG [9]----------------
1.0325520833333333
note_on
[1.0325520833333333, 1, 74, 86]
MSG [10]----------------
1.0442708333333333
note_on
[1.0442708333333333, 1, 38, 77]
MSG [11]----------------
1.0794270833333333
control_change
MSG [12]----------------
1.1184895833333333
control_change
MSG [13]----------------
1.1588541666666665
control_change
MSG [14]----------------
1.2174479166666665
control_change
MSG [15]----------------
1.2265624999999998
note_on
[1.2265624999999998, 0, 74, 0]
MSG [16]----------------
1.2369791666666665
control_change
MSG [17]----------------
1.23958

# Data Augmentation 진행 
- MIDI 앞부분의 이벤트 메시지 타입은 control_change 등의 세팅 부분이고, 실제 악보 부분은 note_on 메시지를 통해 구현됩니다. 
- 위 코드에서 이벤트 구조는 [음 지속시간, ON/OFF, 음고(pitch), 속도(velocity)]에 해당합니다.

- 오늘 실험을 위해 MIDI 파일을 일부 전처리한 파일을 함께 제공하겠습니다. 
- 전처리 로직은 아래 get_data()를 통해 구현되었습니다. 
- 단순히 MIDI 파일을 가공만 한 것이 아니라 time, note, interval 등에 대한 augmentation까지 함께 진행하는 것입니다.

In [7]:
IntervalDim = 100

VelocityDim = 32
VelocityOffset = IntervalDim

NoteOnDim = NoteOffDim = 128
NoteOnOffset = IntervalDim + VelocityDim
NoteOffOffset = IntervalDim + VelocityDim + NoteOnDim

CCDim = 2
CCOffset = IntervalDim + VelocityDim + NoteOnDim + NoteOffDim

EventDim = IntervalDim + VelocityDim + NoteOnDim + NoteOffDim + CCDim # 390

def get_data(data, length):    
    # time augmentation
    data[:, 0] *= np.random.uniform(0.80, 1.20)
    
    # absolute time to relative interval
    data[1:, 0] = data[1:, 0] - data[:-1, 0]
    data[0, 0] = 0
    
    # discretize interval into IntervalDim
    data[:, 0] = np.clip(np.round(data[:, 0] * IntervalDim), 0, IntervalDim - 1)
    
    # Note augmentation
    data[:, 2] += np.random.randint(-6, 6)
    data[:, 2] = np.clip(data[:, 2], 0, NoteOnDim - 1)
    
    eventlist = []
    for d in data:
        # append interval
        interval = d[0]
        eventlist.append(interval)
    
        # note on case
        if d[1] == 1:
            velocity = (d[3] / 128) * VelocityDim + VelocityOffset
            note = d[2] + NoteOnOffset
            eventlist.append(velocity)
            eventlist.append(note)
            
        # note off case
        elif d[1] == 0:
            note = d[2] + NoteOffOffset
            eventlist.append(note)
        # CC
        elif d[1] == 2:
            event = CCOffset + d[3]
            eventlist.append(event)
            
    eventlist = np.array(eventlist).astype(np.int)
    
    if len(eventlist) > (length+1):
        start_index = np.random.randint(0, len(eventlist) - (length+1))
        eventlist = eventlist[start_index:start_index+(length+1)]
        
    # pad zeros
    if len(eventlist) < (length+1):
        pad = (length+1) - len(eventlist)
        eventlist = np.pad(eventlist, (pad, 0), 'constant')
        
    x = eventlist[:length]
    y = eventlist[1:length+1]
    
    return x, y

# STEP 1 : MAESTRO 데이터셋을 전처리하여 훈련용 데이터셋 구성하기


- 데이터셋 구성
- 이전 스텝에서 확보한 데이터셋 파일을 가공하여 학습용 데이터셋을 구성해 봅시다.


In [8]:
data_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/midi_test.npy'

get_midi = np.load(data_path, allow_pickle=True)
get_midi.shape

(1282,)

In [9]:
length = 256
train = []
labels = []

for midi_list in get_midi:
    cut_list = [midi_list[i:i+length] for i in range(0, len(midi_list), length)]
    for sublist in cut_list:
        x, y = get_data(np.array(sublist), length)
        train.append(x)
        labels.append(y)

In [10]:
train = np.array(train)
labels = np.array(labels)

print(train.shape, labels.shape)   # 학습을 위해 MIDI list를 256 길이로 나누었다.

(59268, 256) (59268, 256)


- 256의 길이를 가지는 59268개의 시퀀스가 생성되었습니다. label은 train 데이터를 1만큼 shift한 것입니다.

- 우리 데이터셋은 마치 자연어처리에서의 language model 훈련용 데이터셋같은 구성을 가집니다. 그래서 데이터셋을 구성하는 기법도 비슷해집니다.



In [11]:
train_data_pad = pad_sequences(train,
                               maxlen=length,
                               padding='post',
                               value=0)
train_label_pad = pad_sequences(labels,
                                maxlen=length,
                                padding='post',
                                value=0)

In [12]:
def tensor_casting(train, label):
    train = tf.cast(train, tf.int64)
    label = tf.cast(label, tf.int64)

    return train, label

In [13]:
train_dataset = tf.data.Dataset.from_tensor_slices((train_data_pad, train_label_pad))
train_dataset = train_dataset.map(tensor_casting)
train_dataset = train_dataset.shuffle(10000).batch(batch_size=45)

- 배치사이즈는 16 정도가 적당할 것입니다. 16GB RAM을 가진 모델에서 32 정도의 배치사이즈는 메모리 오류가 발생할 가능성이 있습니다.

In [14]:
for t,l in train_dataset.take(1):
    print(t)
    print(l)

tf.Tensor(
[[115 190   1 ... 327   4 116]
 [116 202   1 ... 207  11 335]
 [  3 330  23 ... 205   7 333]
 ...
 [302   0 305 ... 212   1 115]
 [  9 122 187 ...   1 299   1]
 [113 206   1 ...   1 111 204]], shape=(45, 256), dtype=int64)
tf.Tensor(
[[190   1 114 ...   4 116 194]
 [202   1 314 ...  11 335   3]
 [330  23 118 ...   7 333   2]
 ...
 [  0 305  11 ...   1 115 222]
 [122 187   2 ... 299   1 114]
 [206   1 115 ... 111 204   0]], shape=(45, 256), dtype=int64)


# STEP 2 : Music Transformer 모델을 구현하여 학습 진행하기
단, 20Epoch를 완전히 학습 진행해야 하는 것은 아닙니다. 하지만 최초의 체크포인트가 저장되는 2Epoch까지는 진행해 주세요.

# Music Transformer 모델 구현
- 이제 본격적으로 모델을 구현해 봅시다.

In [15]:
def create_padding_mask(seq):
    seq = tf.cast(tf.math.equal(seq, 1), tf.float32)

    # add extra dimensions to add the padding
    # to the attention logits.
    return seq[:, tf.newaxis, tf.newaxis, :]  # (batch_size, 1, 1, seq_len)


def create_look_ahead_mask(size):
    mask = tf.linalg.band_part(tf.ones((size, size)), -1, 0)
    return mask  # (seq_len, seq_len)


def point_wise_feed_forward_network(d_model, dff):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(dff, activation='relu'),  # (batch_size, seq_len, dff)
        tf.keras.layers.Dense(d_model)  # (batch_size, seq_len, d_model)
    ])

- 아래 레이어가 Music Transformer의 가장 핵심적인 원리를 구성하는 RelativeGlobalAttention 으로서, self-attention 을 대신하여 사용됩니다.

In [16]:
class RelativeGlobalAttention(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads):
        super(RelativeGlobalAttention, self).__init__()
        self.num_heads = num_heads
        self.d_model = d_model
        self.headDim = d_model // num_heads
        self.contextDim = int(self.headDim * self.num_heads)
        self.eventDim = 390
        self.E = self.add_weight('E', shape=[self.num_heads, length, self.headDim])

        assert d_model % self.num_heads == 0

        self.wq = tf.keras.layers.Dense(self.headDim)
        self.wk = tf.keras.layers.Dense(self.headDim)
        self.wv = tf.keras.layers.Dense(self.headDim)
    
    def call(self, v, k, q, mask):
        # [Heads, Batch, Time, HeadDim]
        q = tf.stack([self.wq(q) for _ in range(self.num_heads)])
        k = tf.stack([self.wk(k) for _ in range(self.num_heads)])
        v = tf.stack([self.wv(v) for _ in range(self.num_heads)])

        self.batch_size = q.shape[1]
        self.max_len = q.shape[2]
        
        #skewing
        # E = Heads, Time, HeadDim
        # [Heads, Batch * Time, HeadDim]
        Q_ = tf.reshape(q, [self.num_heads, self.batch_size * self.max_len, self.headDim])
        # [Heads, Batch * Time, Time]
        S = tf.matmul(Q_, self.E, transpose_b=True)
        # [Heads, Batch, Time, Time]
        S = tf.reshape(S, [self.num_heads, self.batch_size, self.max_len, self.max_len])
        # [Heads, Batch, Time, Time+1]
        S = tf.pad(S, ((0, 0), (0, 0), (0, 0), (1, 0)))
        # [Heads, Batch, Time+1, Time]
        S = tf.reshape(S, [self.num_heads, self.batch_size, self.max_len + 1, self.max_len])   
        # [Heads, Batch, Time, Time]
        S = S[:, :, 1:]
        # [Heads, Batch, Time, Time]
        attention = (tf.matmul(q, k, transpose_b=True) + S) / np.sqrt(self.headDim)
        # mask tf 2.0 == tf.linalg.band_part
        get_mask = tf.linalg.band_part(tf.ones([self.max_len, self.max_len]), -1, 0)
        attention = attention * get_mask - tf.cast(1e10, attention.dtype) * (1-get_mask)
        score = tf.nn.softmax(attention, axis=3)

        # [Heads, Batch, Time, HeadDim]
        context = tf.matmul(score, v)
        # [Batch, Time, Heads, HeadDim]
        context = tf.transpose(context, [1, 2, 0, 3])
        # [Batch, Time, ContextDim]
        context = tf.reshape(context, [self.batch_size, self.max_len, self.d_model])
        # [Batch, Time, ContextDim]
        logits = tf.keras.layers.Dense(self.d_model)(context)

        return logits, score

In [17]:
class EncoderLayer(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads, dff, rate=0.1):
        super(EncoderLayer, self).__init__()

        self.rga = RelativeGlobalAttention(d_model, num_heads)
        self.ffn = point_wise_feed_forward_network(d_model, dff)

        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)

        self.dropout1 = tf.keras.layers.Dropout(rate)
        self.dropout2 = tf.keras.layers.Dropout(rate)

    def call(self, x, training, mask):
        attn_output, _ = self.rga(x, x, x, mask)  # (batch_size, input_seq_len, d_model)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(x + attn_output)  # (batch_size, input_seq_len, d_model)

        ffn_output = self.ffn(out1)  # (batch_size, input_seq_len, d_model)
        ffn_output = self.dropout2(ffn_output, training=training)
        out2 = self.layernorm2(out1 + ffn_output)  # (batch_size, input_seq_len, d_model)

        return out2

In [18]:
class DecoderLayer(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads, dff, rate=0.1):
        super(DecoderLayer, self).__init__()

        self.rga1 = RelativeGlobalAttention(d_model, num_heads)
        self.rga2 = RelativeGlobalAttention(d_model, num_heads)

        self.ffn = point_wise_feed_forward_network(d_model, dff)

        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.layernorm3 = tf.keras.layers.LayerNormalization(epsilon=1e-6)

        self.dropout1 = tf.keras.layers.Dropout(rate)
        self.dropout2 = tf.keras.layers.Dropout(rate)
        self.dropout3 = tf.keras.layers.Dropout(rate)

    def call(self, x, enc_output, training, look_ahead_mask, padding_mask):
        # enc_output.shape == (batch_size, input_seq_len, d_model)

        attn1, attn_weights_block1 = self.rga1(x, x, x, look_ahead_mask)  # (batch_size, target_seq_len, d_model)
        attn1 = self.dropout1(attn1, training=training)
        out1 = self.layernorm1(attn1 + x)

        attn2, attn_weights_block2 = self.rga2(
            enc_output, enc_output, out1, padding_mask)  # (batch_size, target_seq_len, d_model)
        attn2 = self.dropout2(attn2, training=training)
        out2 = self.layernorm2(attn2 + out1)  # (batch_size, target_seq_len, d_model)

        ffn_output = self.ffn(out2)  # (batch_size, target_seq_len, d_model)
        ffn_output = self.dropout3(ffn_output, training=training)
        out3 = self.layernorm3(ffn_output + out2)  # (batch_size, target_seq_len, d_model)

        return out3, attn_weights_block1, attn_weights_block2

In [19]:
class Encoder(tf.keras.layers.Layer):
    def __init__(self, num_layers, d_model, num_heads, dff, rate=0.1):
        super(Encoder, self).__init__()

        self.num_layers = num_layers
        self.enc_layers = [EncoderLayer(d_model, num_heads, dff, rate) 
                           for _ in range(num_layers)]

        self.dropout = tf.keras.layers.Dropout(rate)

    def call(self, x, training, mask):
        seq_len = tf.shape(x)[1]
        x = self.dropout(x, training=training)

        for i in range(self.num_layers):
            x = self.enc_layers[i](x, training, mask)

        return x  # (batch_size, input_seq_len, d_model)

In [20]:
class Decoder(tf.keras.layers.Layer):
    def __init__(self, num_layers, d_model, num_heads, dff, rate=0.1):
        super(Decoder, self).__init__()
        self.num_layers = num_layers
        self.dec_layers = [DecoderLayer(d_model, num_heads, dff, rate) 
                           for _ in range(num_layers)]
        self.dropout = tf.keras.layers.Dropout(rate)

    def call(self, x, enc_output, training, look_ahead_mask, padding_mask):
        attention_weights = {}
        x = self.dropout(x, training=training)

        for i in range(self.num_layers):
            x, block1, block2 = self.dec_layers[i](x, enc_output, training,
                                                   look_ahead_mask, padding_mask)

            attention_weights['decoder_layer{}_block1'.format(i+1)] = block1
            attention_weights['decoder_layer{}_block2'.format(i+1)] = block2

        # x.shape == (batch_size, target_seq_len, d_model)
        return x, attention_weights

In [21]:
class MusicTransformer(tf.keras.Model):
    def __init__(self, num_layers, d_model, num_heads, dff, input_vocab_size, rate=0.1):
        super(MusicTransformer, self).__init__()
        self.d_model = d_model
        self.embedding = tf.keras.layers.Embedding(input_vocab_size, d_model)

        self.encoder = Encoder(num_layers, d_model, num_heads, dff, rate)
        self.decoder = Decoder(num_layers, d_model, num_heads, dff, rate)

        self.final_layer = tf.keras.layers.Dense(input_vocab_size)

    def call(self, inp, training, enc_padding_mask, 
             look_ahead_mask, dec_padding_mask):
        embed = self.embedding(inp)
        embed *= tf.math.sqrt(tf.cast(self.d_model, tf.float32))

        enc_output = self.encoder(embed, training, enc_padding_mask)  # (batch_size, inp_seq_len, d_model)

        # dec_output.shape == (batch_size, tar_seq_len, d_model)
        dec_output, attention_weights = self.decoder(
            embed, enc_output, training, look_ahead_mask, dec_padding_mask)

        final_output = self.final_layer(dec_output)  # (batch_size, tar_seq_len, target_vocab_size)

        return final_output, attention_weights

- 어떻습니까? 일반적인 Transformer Encoder-Decoder 모델과 거의 같은 구조를 가지지만 
- Self-Attention 레이어가 RelativeGlobalAttention 로 바뀌었다는 차이만 있습니다.

- RelativeGlobalAttention 레이어 구조를 유심히 파악해 주세요.

# Music Transformer 모델 학습
- 이전 스텝에서 구현한 Music Transformer 모델을 활용하여 실제 학습을 진행해 보겠습니다.

In [22]:
num_layers = 4
d_model = 128
dff = 512
num_heads = 8

input_vocab_size = 390   # MIDI가 낼 수 있는 소리의 종류
dropout_rate = 0.1

In [23]:
# 모델 선언
music_transformer = MusicTransformer(num_layers, d_model, num_heads, dff,
                                     input_vocab_size, rate=dropout_rate)

In [24]:
class CustomSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, d_model, warmup_steps=4000):
        super(CustomSchedule, self).__init__()

        self.d_model = d_model
        self.d_model = tf.cast(self.d_model, tf.float32)

        self.warmup_steps = warmup_steps

    def __call__(self, step):
        arg1 = tf.math.rsqrt(step)
        arg2 = step * (self.warmup_steps ** -1.5)

        return tf.math.rsqrt(self.d_model) * tf.math.minimum(arg1, arg2)

In [25]:
learning_rate = CustomSchedule(d_model)

optimizer = tf.keras.optimizers.Adam(learning_rate, beta_1=0.9, beta_2=0.98, 
                                     epsilon=1e-9)

- language model이 다음에 올 단어를 맞추는 classification task 구성을 가지는 것처럼, MIDI 생성 모델도 390가지 음향 종류 중 어느 것이 올지를 맞추는 문제로 구성하였으므로 아래와 같이 SparseCategoricalCrossentropy로 loss 함수를 정의합니다.

In [26]:
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none')

In [27]:
def loss_function(real, pred):
    mask = tf.math.logical_not(tf.math.equal(real, 0))
    loss_ = loss_object(real, pred)

    mask = tf.cast(mask, dtype=loss_.dtype)
    loss_ *= mask

    return tf.reduce_sum(loss_)/tf.reduce_sum(mask)

In [28]:
train_loss = tf.keras.metrics.Mean(name='train_loss')

In [29]:
checkpoint_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/models/'


ckpt = tf.train.Checkpoint(music_transformer=music_transformer,
                           optimizer=optimizer)

ckpt_manager = tf.train.CheckpointManager(ckpt, checkpoint_path, max_to_keep=5)

# if a checkpoint exists, restore the latest checkpoint.
if ckpt_manager.latest_checkpoint:
    ckpt.restore(ckpt_manager.latest_checkpoint)
    print ('Latest checkpoint restored!!')

Latest checkpoint restored!!


- 이제 모델 학습에 들어갑니다.

- (주의) 이 모델은 총 20epoch를 학습해야 하지만, 1epoch만 학습하는 데도 1시간 가까운 시간이 소요됩니다. 따라서 당일 전체 모델학습을 마무리하는 것은 무리스러우므로 1epoch만 학습을 진행해 보겠습니다.

In [30]:
EPOCHS = 30  
# EPOCHS = 1  # 1epoch가 매우 오래 걸립니다. 

for epoch in range(EPOCHS):
    start = time.time()

    train_loss.reset_states()

    for (batch, (inp, tar)) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            predictions, _ = music_transformer(inp, True, None, None, None)
            loss = loss_function(tar, predictions)

        gradients = tape.gradient(loss, music_transformer.trainable_variables)    
        optimizer.apply_gradients(zip(gradients, music_transformer.trainable_variables))

        train_loss(loss)

        if batch % 50 == 0:
            print ('Epoch {} Batch {} Loss {:.4f}'.format(
                epoch + 1, batch, train_loss.result()))

    if (epoch + 1) % 2 == 0:
        ckpt_save_path = ckpt_manager.save()
        print ('Saving checkpoint for epoch {} at {}'.format(epoch+1,
                                                             ckpt_save_path))

    print ('Epoch {} Loss {:.4f}'.format(epoch + 1, train_loss.result()))

    print ('Time taken for 1 epoch: {} secs\n'.format(time.time() - start))

Epoch 1 Batch 0 Loss 3.2731
Epoch 1 Batch 50 Loss 3.2873
Epoch 1 Batch 100 Loss 3.2888
Epoch 1 Batch 150 Loss 3.2890
Epoch 1 Batch 200 Loss 3.2892
Epoch 1 Batch 250 Loss 3.2890
Epoch 1 Batch 300 Loss 3.2882
Epoch 1 Batch 350 Loss 3.2891
Epoch 1 Batch 400 Loss 3.2901
Epoch 1 Batch 450 Loss 3.2900
Epoch 1 Batch 500 Loss 3.2903
Epoch 1 Batch 550 Loss 3.2917
Epoch 1 Batch 600 Loss 3.2915
Epoch 1 Batch 650 Loss 3.2929
Epoch 1 Batch 700 Loss 3.2940
Epoch 1 Batch 750 Loss 3.2948
Epoch 1 Batch 800 Loss 3.2948
Epoch 1 Batch 850 Loss 3.2944
Epoch 1 Batch 900 Loss 3.2945
Epoch 1 Batch 950 Loss 3.2946
Epoch 1 Batch 1000 Loss 3.2949
Epoch 1 Batch 1050 Loss 3.2952
Epoch 1 Batch 1100 Loss 3.2956
Epoch 1 Batch 1150 Loss 3.2956
Epoch 1 Batch 1200 Loss 3.2955
Epoch 1 Batch 1250 Loss 3.2956
Epoch 1 Batch 1300 Loss 3.2957
Epoch 1 Loss 3.2956
Time taken for 1 epoch: 790.1564404964447 secs

Epoch 2 Batch 0 Loss 3.3171
Epoch 2 Batch 50 Loss 3.2822
Epoch 2 Batch 100 Loss 3.2823
Epoch 2 Batch 150 Loss 3.2851
E

- 매 2 epoch마다 checkpoint를 저장하도록 구성하였습니다. 1epoch만 학습시 모델이 저장되지 않음에 유의해 주세요.

# STEP 3 : 제공된 체크포인트 파일을 이용하여 다양한 midi 파일 생성하기
midi 파일을 생성하는 단계에서 바꾸어 볼 수 있는 조건에는 무엇이 있는지 찾아 보세요.
조건을 변경해 가며 5개 이상의 midi 파일을 생성해 보세요.
가장 잘 생성된 midi 파일을 첨부하여 제출해 주세요.


# Music Generation 테스트
- 이전 스텝의 학습을 20Epoch 진행한 모델 체크포인트 파일을 제공해 드리겠습니다. 이번 스텝에서는 학습이 완료된 모델의 음악 생성 테스트를 진행해 보겠습니다.

In [30]:
tf.train.latest_checkpoint(checkpoint_path)

'/content/drive/Shared drives/GOFOODA/DOS/music_transformer/models/ckpt-29'

- 테스트용 데이터셋을 별도로 분리하지 않았으므로, 학습에 사용했던 데이터를 다시 활용하겠습니다. 그러나 이 데이터는 실제로는 첫번째 스텝의 데이터만 활용될 것입니다.

In [31]:
test_dataset = tf.data.Dataset.from_tensor_slices((train_data_pad, train_label_pad))
test_dataset = test_dataset.map(tensor_casting)
test_dataset = test_dataset.shuffle(10000).batch(batch_size=1)

- 자연어처리의 text generation 모델처럼 inference 단계는 step-by-step으로 진행됩니다. for문을 돌면서 예측된 단어를 그 다음 입력으로 모델에 전달하는 구조를 눈여겨 봐주세요.

In [32]:
N = 1000
_inputs = np.zeros([1, N], dtype=np.int32)

for x, y in test_dataset.take(1):
    _inputs[:, :length] = x[None, :]
    
for i in range(N - length):
    predictions, _ = music_transformer(_inputs[:, i:i+length], False, None, None, None)
    predictions = tf.squeeze(predictions, 0)    
    
    # select the last word from the seq_len dimension
    predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()
    print(predicted_id)
    
    # 예측된 단어를 다음 입력으로 모델에 전달
    # 이전 은닉 상태와 함께
    _inputs[:, i+length] = predicted_id

_inputs.shape

124
199
1
122
211
2
115
189
1
311
2
112
217
3
345
1
346
1
319
3
329
1
344
9
123
214
1
128
188
3
115
163
4
297
1
341
4
389
1
109
203
1
324
1
388
1
326
5
121
207
1
316
8
124
215
3
337
1
120
213
1
328
9
342
2
120
202
8
334
1
388
8
119
205
1
118
196
9
335
2
114
184
5
331
32
388
2
119
187
4
389
11
313
4
119
205
4
326
1
122
199
4
343
4
118
194
1
117
209
4
110
189
2
389
3
342
1
335
1
310
2
314
4
122
209
2
118
195
5
333
1
323
2
113
190
4
326
1
389
2
118
205
2
118
204
4
324
2
326
3
313
10
320
3
116
175
2
117
184
1
118
195
3
112
178
5
309
3
310
2
307
6
117
182
1
119
204
3
329
1
332
2
330
9
119
203
8
334
10
119
199
4
338
11
125
203
3
323
1
121
206
1
116
198
2
333
2
333
5
300
5
318
9
388
1
121
207
6
324
5
344
1
119
212
1
118
212
4
117
212
2
331
5
339
2
338
3
117
183
5
318
10
113
199
3
118
216
6
342
15
116
207
13
336
2
354
6
116
181
11
120
224
12
117
211
7
351
4
339
8
346
2
117
172
11
304
1
118
215
3
120
194
11
122
211
5
338
17
343
10
307
17
122
214
4
111
209
1
114
183
2
112
182
4
309
3
338
7
111
2

(1, 1000)

- 우리의 Inference 결과가 _inputs 리스트 안에 잘 담겼습니다. 이제 이 내용을 다시 MIDI 파일로 복원해 내기 위해 아래와 같이 MIDI Event 클래스를 정의하였습니다.

In [33]:
class Event():
    def __init__(self, time, note, cc, on, velocity):
        self.time = time
        self.note = note
        self.on = on
        self.cc = cc
        self.velocity = velocity

    def get_event_sequence(self):
        return [self.time, self.note, int(self.on)]

class Note():
    def __init__(self):
        self.pitch = 0
        self.start_time = 0
        self.end_time = 0

- 그럼 _inputs 에 담긴 내용을 차례차례 MIDI Event로 변환해 보겠습니다.

In [34]:
event_list = []
time = 0
event = None

EventDim = IntervalDim + VelocityDim + NoteOnDim + NoteOffDim # 388

for _input in _inputs[0]:
    # interval
    if _input < IntervalDim: 
        time += _input
        event = Event(time, 0, False, 0, 0)

    # velocity
    elif _input < NoteOnOffset:
        if event is None:
            continue
        event.velocity = (_input - VelocityOffset) / VelocityDim * 128

    # note on
    elif _input < NoteOffOffset:
        if event is None:
            continue

        event.note = _input - NoteOnOffset
        event.on = True
        event_list.append(event)

        event = None

    # note off
    elif _input < CCOffset:
        if event is None:
            continue
        event.note = _input - NoteOffOffset
        event.on = False
        event_list.append(event)
        event = None

    ## CC
    else:
        if event is None:
            continue
        event.cc = True
        on = _input - CCOffset == 1
        event.on = on
        event_list.append(event)
        event = None

- 이제 마지막으로, MIDI Event로부터 MIDI 파일을 재구성해 보겠습니다.

(1) MIDI 파일: tempo = 50, ticks_per_beat = 8

In [35]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file1.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(50)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 3
    midi.ticks_per_beat = 8
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=64, time=tick, value=127)
        else:
            cc = Message('control_change', control=64, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=1200000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=64 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=2
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=1
note_on channel=0 note=40 velocity=72 time=2
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=8

(2) MIDI 파일: tempo = 80, ticks_per_beat = 8

In [36]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file2.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(80)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 3
    midi.ticks_per_beat = 8
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=64, time=tick, value=127)
        else:
            cc = Message('control_change', control=64, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=750000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=64 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=2
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=1
note_on channel=0 note=40 velocity=72 time=2
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=80

(3) MIDI 파일: tempo = 120, ticks_per_beat = 8


In [37]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file3.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(120)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 3
    midi.ticks_per_beat = 8
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=64, time=tick, value=127)
        else:
            cc = Message('control_change', control=64, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=500000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=64 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=2
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=1
note_on channel=0 note=40 velocity=72 time=2
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=80

(4) MIDI 파일: tempo = 150, ticks_per_beat = 8

In [38]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file4.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(150)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 3
    midi.ticks_per_beat = 8
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=64, time=tick, value=127)
        else:
            cc = Message('control_change', control=64, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=400000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=64 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=2
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=1
note_on channel=0 note=40 velocity=72 time=2
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=80

(5) MIDI 파일: tempo = 180, ticks_per_beat = 8

In [39]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file5.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(180)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 8
    midi.ticks_per_beat = 8
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=100, time=tick, value=127)
        else:
            cc = Message('control_change', control=100, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=333333 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=100 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=1
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=0
note_on channel=0 note=40 velocity=72 time=1
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=8

(6) MIDI 파일: tempo = 20, ticks_per_beat = 16

In [40]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file6.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(20)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 8
    midi.ticks_per_beat = 16
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=100, time=tick, value=127)
        else:
            cc = Message('control_change', control=100, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=3000000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=100 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=1
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=0
note_on channel=0 note=40 velocity=72 time=1
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=

(7) MIDI 파일: tempo = 40, ticks_per_beat = 16

In [41]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file7.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(40)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 8
    midi.ticks_per_beat = 16
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=100, time=tick, value=127)
        else:
            cc = Message('control_change', control=100, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=1500000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=100 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=1
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=0
note_on channel=0 note=40 velocity=72 time=1
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=

(8) MIDI 파일: tempo = 60, ticks_per_beat = 16

In [42]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file8.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(60)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 8
    midi.ticks_per_beat = 16
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=100, time=tick, value=127)
        else:
            cc = Message('control_change', control=100, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=1000000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=100 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=1
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=0
note_on channel=0 note=40 velocity=72 time=1
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=

(9) MIDI 파일: tempo = 80, ticks_per_beat = 16

In [43]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file9.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(80)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 8
    midi.ticks_per_beat = 16
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=100, time=tick, value=127)
        else:
            cc = Message('control_change', control=100, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=750000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=100 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=1
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=0
note_on channel=0 note=40 velocity=72 time=1
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=8

(10) MIDI 파일: tempo = 100, ticks_per_beat = 16

In [44]:
from mido import Message, MidiFile, MidiTrack, MetaMessage, bpm2tempo

midi = MidiFile()
output_midi_path = '/content/drive/Shared drives/GOFOODA/DOS/music_transformer/data/output_file10.mid'

# Instantiate a MIDI Track (contains a list of MIDI events)
track = MidiTrack()
track.append(MetaMessage("set_tempo", tempo=bpm2tempo(100)))
# Append the track to the pattern
midi.tracks.append(track)

prev_time = 0
pitches = [None for _ in range(128)]
for event in event_list:
    tick = (event.time - prev_time) // 8
    midi.ticks_per_beat = 16
    prev_time = event.time

    # case NOTE:
    if not event.cc:
        if event.on:
            if pitches[event.note] is not None:
                # Instantiate a MIDI note off event, append it to the track
                off = Message('note_off', note=event.note, velocity=0, time=0)
                track.append(off)
                pitches[event.note] = None

            # Instantiate a MIDI note on event, append it to the track
            on = Message('note_on', note=event.note, velocity=int(event.velocity), time=tick)
            track.append(on)
            pitches[event.note] = prev_time
        else:
            # Instantiate a MIDI note off event, append it to the track
            off = Message('note_off', note=event.note, velocity=0, time=tick)
            track.append(off)
            pitches[event.note] = None

#     case CC:
    elif event.cc:
        if event.on:
            cc = Message('control_change', control=100, time=tick, value=127)
        else:
            cc = Message('control_change', control=100, time=tick, value=0)

        track.append(cc)

    for pitch in range(128):
        if pitches[pitch] is not None and pitches[pitch] + 100 < prev_time:
            off = Message('note_off', note=pitch, velocity=0, time=0)
            track.append(off)
            pitches[pitch] = None


# Add the end of track event, append it to the track
track.append(MetaMessage("end_of_track"))

# Save the pattern to disk
midi.save(output_midi_path)

for i, track in enumerate(midi.tracks):
    print('Track {}: {}'.format(i, track.name))
    for msg in track:
        print(msg)

print('done')

Track 0: 
<meta message set_tempo tempo=600000 time=0>
note_on channel=0 note=78 velocity=64 time=0
note_off channel=0 note=42 velocity=0 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=30 velocity=0 time=0
control_change channel=0 control=100 value=127 time=0
note_off channel=0 note=70 velocity=0 time=0
note_on channel=0 note=66 velocity=80 time=1
note_on channel=0 note=71 velocity=88 time=0
note_on channel=0 note=47 velocity=84 time=0
note_on channel=0 note=78 velocity=52 time=0
note_off channel=0 note=66 velocity=0 time=0
note_off channel=0 note=71 velocity=0 time=0
note_on channel=0 note=35 velocity=48 time=0
note_off channel=0 note=47 velocity=0 time=0
note_off channel=0 note=78 velocity=0 time=0
note_off channel=0 note=35 velocity=0 time=0
note_on channel=0 note=40 velocity=72 time=1
note_on channel=0 note=28 velocity=80 time=0
note_on channel=0 note=68 velocity=84 time=0
note_on channel=0 note=76 velocity=8

# 결론

### - Transformer모델을 이용해서 다양한 음악을 생성해 보았다.
   - MAESTRO 데이터셋의 전처리를 통해 학습 데이터셋 구성을 체계적으로 진행하였다.
   - MIDI 파일의 구조와 특성에 맞게 적절한 가공과 augmentation을 통해 학습데이터를 생성하였다.
   
### - Music Transformer 모델의 구현 및 학습이 원활이 진행되었다.
   - 모델의 학습이 원활히 진행되었으며, loss가 안정적으로 감소하였다.
   - Batch를 45로 조절하고, 30 Epochs 까지 학습을 진행하면서 Loss가 3.2731에서 ***3.2532***까지 줄어들었다. 
   
### - 다양한 조건 변경을 통해 10개의 다른 음악을 생성하는 실험을 진행하였다.
   - 생성테스트 전 아래와 같은 조건으로 다양하게 음악을 생성해 보았다. 
       1. MIDI 파일: tempo = 50, ticks_per_beat = 8
       2. MIDI 파일: tempo = 100, ticks_per_beat = 8
       3. MIDI 파일: tempo = 120, ticks_per_beat = 8
       4. MIDI 파일: tempo = 150, ticks_per_beat = 8
       5. MIDI 파일: tempo = 200, ticks_per_beat = 8
       6. ***MIDI 파일: tempo = 50, ticks_per_beat = 16 →가장 잘 생성된 음악 파일***
       7. MIDI 파일: tempo = 100, ticks_per_beat = 16
       8. MIDI 파일: tempo = 120, ticks_per_beat = 16
       9. MIDI 파일: tempo = 150, ticks_per_beat = 16
       10. MIDI 파일: tempo = 200, ticks_per_beat = 16
     
     
   - 가장 잘 생성되었다고 생각하는 ***output_file6.mid*** 고품질의 midi 파일을 같이 제출하였다.
