Copyright 2017 Google LLC.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music
### ___Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, and Douglas Eck___

[MusicVAE](https://g.co/magenta/music-vae) learns a latent space of musical scores, providing different modes
of interactive musical creation, including:

* Random sampling from the prior distribution.
* Interpolation between existing sequences.
* Manipulation of existing sequences via attribute vectors.

Examples of these interactions can be generated below, and selections can be heard in our
[YouTube playlist](https://www.youtube.com/playlist?list=PLBUMAYA6kvGU8Cgqh709o5SUvo-zHGTxr).
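
The interpolation and attribute-vector interactions listed above reduce to simple arithmetic on latent vectors. The block below is a minimal NumPy sketch of that arithmetic, assuming spherical interpolation between two latent codes and an attribute vector computed as a difference of means. The helper names `slerp` and `attribute_vector`, the random stand-in latent codes, and the offset of 0.5 are hypothetical and are not part of the Magenta API.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between latent vectors z0 and z1 at fraction t."""
    z0 = np.asarray(z0, dtype=float)
    z1 = np.asarray(z1, dtype=float)
    cos_omega = np.dot(z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if np.isclose(sin_omega, 0.0):
        # Nearly parallel vectors: fall back to linear interpolation.
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * omega) / sin_omega) * z0 + (np.sin(t * omega) / sin_omega) * z1

def attribute_vector(z_with, z_without):
    """Difference between the mean latent code with and without an attribute."""
    return np.mean(z_with, axis=0) - np.mean(z_without, axis=0)

rng = np.random.default_rng(0)
z_size = 512  # Latent size assumed for illustration.

# Stand-ins for the encodings of two sequences (hypothetical data).
z_start, z_end = rng.standard_normal((2, z_size))

# Five evenly spaced points on the path from z_start to z_end.
path = np.stack([slerp(z_start, z_end, t) for t in np.linspace(0.0, 1.0, 5)])

# Stand-ins for encodings of sequences with and without some attribute.
z_with = rng.standard_normal((8, z_size)) + 0.5
z_without = rng.standard_normal((8, z_size))
shifted = z_start + 1.0 * attribute_vector(z_with, z_without)
```

Each row of `path`, and the vector `shifted`, would then be passed to a decoder to obtain a sequence.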

For short sequences (e.g., 2-bar "loops"), we use a bidirectional LSTM encoder
and LSTM decoder. For longer sequences, we use a novel hierarchical LSTM
decoder, which helps the model learn longer-term structures.

We also model the interdependencies between instruments by training multiple
decoders on the lowest-level embeddings of the hierarchical decoder.

For additional details, check out our [blog post](https://g.co/magenta/music-vae) and [paper](https://goo.gl/magenta/musicvae-paper).
___

This Colab notebook is self-contained and should run natively on Google Cloud. The [code](https://github.com/tensorflow/magenta/tree/master/magenta/models/music_vae) and [checkpoints](http://download.magenta.tensorflow.org/models/music_vae/checkpoints.tar.gz) can be downloaded separately and run locally, which is required if you want to train your own model.

# Basic Instructions

1. Double click on the hidden cells to make them visible, or select "View > Expand Sections" in the menu at the top.
2. Hover over the "`[ ]`" in the top-left corner of each cell and click on the "Play" button to run it, in order.
3. Listen to the generated samples.
4. Make it your own: copy the notebook, modify the code, train your own models, upload your own MIDI, etc.!

# Environment Setup
This step installs the packages needed for sequence synthesis. It takes a few minutes.


In [1]:
#@title Connect to Google Drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%cd /content/drive/MyDrive/University of Alberta/Thesis/Code/Magenta/magenta/

/content/drive/MyDrive/University of Alberta/Thesis/Code/Magenta/magenta


In [3]:
#@title Initial Imports
import glob

BASE_DIR = "gs://download.magenta.tensorflow.org/models/music_vae/colab2"

print('Installing dependencies...')
!apt-get update -qq && apt-get install -qq libfluidsynth1 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -q pyfluidsynth

# Hack to allow python to pick up the newly-installed fluidsynth lib.
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library

!pip install tensor2tensor
!pip install note_seq

Installing dependencies...
Selecting previously unselected package fluid-soundfont-gm.
(Reading database ... 155113 files and directories currently installed.)
Preparing to unpack .../fluid-soundfont-gm_3.1-5.1_all.deb ...
Unpacking fluid-soundfont-gm (3.1-5.1) ...
Selecting previously unselected package libfluidsynth1:amd64.
Preparing to unpack .../libfluidsynth1_1.1.9-1_amd64.deb ...
Unpacking libfluidsynth1:amd64 (1.1.9-1) ...
Setting up fluid-soundfont-gm (3.1-5.1) ...
Setting up libfluidsynth1:amd64 (1.1.9-1) ...
Processing triggers for libc-bin (2.27-3ubuntu1.3) ...
/sbin/ldconfig.real: /usr/local/lib/python3.7/dist-packages/ideep4py/lib/libmkldnn.so.0 is not a symbolic link

Collecting tensor2tensor
  Downloading tensor2tensor-1.15.7-py2.py3-none-any.whl (1.4 MB)
Collecting tensorflow-probability==0.7.0
  Downloading tensorflow_probability-0.7.0-py2.py3-none-any.whl (981 kB)

In [4]:
!pip install --upgrade --no-deps --force-reinstall -e .

Obtaining file:///content/drive/MyDrive/University%20of%20Alberta/Thesis/Code/Magenta/magenta
Installing collected packages: magenta
  Running setup.py develop for magenta
Successfully installed magenta


In [5]:
#@title Imports and Helpers


print('Importing libraries and defining some helper functions...')
from google.colab import files
import magenta.music as mm
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel
import numpy as np
import os
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()
# tf.enable_eager_execution()

# Necessary until pyfluidsynth is updated (>1.2.5).
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

def play(note_sequence):
  mm.play_sequence(note_sequence, synth=mm.fluidsynth)

def interpolate(model, start_seq, end_seq, num_steps, max_length=32,
                assert_same_length=True, temperature=0.5,
                individual_duration=4.0):
  """Interpolates between a start and end sequence."""
  note_sequences = model.interpolate(
      start_seq, end_seq, num_steps=num_steps, length=max_length,
      temperature=temperature,
      assert_same_length=assert_same_length)

  print('Start Seq Reconstruction')
  play(note_sequences[0])
  print('End Seq Reconstruction')
  play(note_sequences[-1])
  print('Mean Sequence')
  play(note_sequences[num_steps // 2])
  print('Start -> End Interpolation')
  interp_seq = mm.sequences_lib.concatenate_sequences(
      note_sequences, [individual_duration] * len(note_sequences))
  play(interp_seq)
  mm.plot_sequence(interp_seq)
  return interp_seq if num_steps > 3 else note_sequences[num_steps // 2]

def download(note_sequence, filename):
  mm.sequence_proto_to_midi_file(note_sequence, filename)
  files.download(filename)

print('Done')

Importing libraries and defining some helper functions...
Instructions for updating:
non-resource variables are not supported in the long term
Done


In [6]:
import datetime
%load_ext tensorboard

# 2-Bar Melody Model

The pre-trained model consists of a single-layer bidirectional LSTM encoder with 2048 nodes in each direction, a 3-layer LSTM decoder with 2048 nodes in each layer, and a 512-dimensional latent vector Z. The model was given 0 free bits, and its beta value was annealed at an exponential rate of 0.99999 from 0 to 0.43 over 200k steps. It was trained with scheduled sampling, using an inverse sigmoid schedule with a rate of 1000. The final accuracy is 0.95 and the KL divergence is 58 bits.
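
The training schedules described above can be written as small functions of the global step. This is a sketch under stated assumptions: the KL weight is taken to follow `(1 - rate**step) * max_beta`, the scheduled-sampling probability is taken to follow an inverse sigmoid `1 - k / (k + exp(step / k))`, and free bits are treated as a floor subtracted from the KL term. The function names are hypothetical, and the exact forms used in the training code may differ.

```python
import math

def kl_weight(step, beta_rate=0.99999, max_beta=0.43):
    """KL weight (beta) that rises from 0 toward max_beta as training proceeds."""
    return (1.0 - beta_rate ** step) * max_beta

def scheduled_sampling_probability(step, rate=1000.0):
    """Inverse-sigmoid probability of feeding the decoder its own prediction."""
    # Clamp the exponent so math.exp cannot overflow at very large steps.
    exponent = min(step / rate, 700.0)
    return 1.0 - rate / (rate + math.exp(exponent))

def kl_cost_with_free_bits(kl_bits, free_bits=0.0):
    """KL penalty after allowing free_bits of divergence at no cost."""
    return max(kl_bits - free_bits, 0.0)

# Print both schedules at a few steps to see how they evolve.
for step in (0, 1000, 10000, 200000):
    print(step, kl_weight(step), scheduled_sampling_probability(step))
```

With 0 free bits, as stated for this model, `kl_cost_with_free_bits` returns the KL divergence unchanged.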

In [7]:
#@title Load the pre-trained model.
mel_2bar_config_sky1 = configs.CONFIG_MAP['cat-mel_2bar_big_sky1']
mel_2bar_config_sky1.data_converter._min_unique_pitches = 1
mel_2bar_base = TrainedModel(mel_2bar_config_sky1, batch_size=4, checkpoint_dir_or_path=BASE_DIR + '/checkpoints/mel_2bar_big.ckpt')

INFO:tensorflow:Building MusicVAE model with BidirectionalLstmEncoder, CategoricalLstmDecoder, and hparams:
{'max_seq_len': 32, 'z_size': 512, 'free_bits': 0, 'max_beta': 0.5, 'beta_rate': 0.99999, 'batch_size': 4, 'grad_clip': 1.0, 'clip_mode': 'global_norm', 'grad_norm_clip_to_zero': 10000, 'learning_rate': 0.001, 'decay_rate': 0.9999, 'min_learning_rate': 1e-05, 'conditional': True, 'dec_rnn_size': [2048, 2048, 2048], 'enc_rnn_size': [2048], 'dropout_keep_prob': 1.0, 'sampling_schedule': 'inverse_sigmoid', 'sampling_rate': 1000, 'use_cudnn': False, 'residual_encoder': False, 'residual_decoder': False, 'control_preprocessing_rnn_size': [256], 'mel_mode': 'skyline1'}
INFO:tensorflow:
Encoder Cells (bidirectional):
  units: [2048]

INFO:tensorflow:
Decoder Cells:
  units: [2048, 2048, 2048]

Instructions for updating:
Use `tf.cast` instead.




Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
Instructions for updating:
Do not call `graph_parents`.




INFO:tensorflow:Restoring parameters from gs://download.magenta.tensorflow.org/models/music_vae/colab2/checkpoints/mel_2bar_big.ckpt


## Generate Samples

In [8]:
# #@title Generate 4 samples from the prior.
# temperature = 0.5 #@param {type:"slider", min:0.1, max:1.5, step:0.1}
# mel_2_samples = mel_2bar.sample(n=4, length=32, temperature=temperature)
# for ns in mel_2_samples:
#   play(ns)

In [9]:
# #@title Optionally download samples.
# for i, ns in enumerate(mel_2_samples):
#   download(ns, 'mel_2bar_sample_%d.mid' % i)

In [10]:
!python magenta/models/music_vae/music_vae_train_modified.py \
--config=cat-mel_2bar_big_sky1 \
--run_dir=./data/tmp/Persian_problematic/revised_sky1 \
--mode=eval \
--examples_path=./data/tfrecord/Persian/persian_revised.tfrecord \
--log=DEBUG \
--hparams=batch_size=1 \
--cache_dataset=False

Instructions for updating:
non-resource variables are not supported in the long term
INFO:tensorflow:Counting examples in ./data/tfrecord/Persian/persian_revised.tfrecord.
I0207 13:25:25.371279 140521708918656 data.py:1695] Counting examples in ./data/tfrecord/Persian/persian_revised.tfrecord.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
W0207 13:25:25.371661 140521708918656 deprecation.py:347] From /content/drive/MyDrive/University of Alberta/Thesis/Code/Magenta/magenta/magenta/models/music_vae/data.py:1696: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
unique pitches 7
unique pitches 8
-------------

-------------

unique pitches 9
unique pitches 9
unique pitches 7
unique pitches 9
unique pitches 9
-------------

-------------

unique pitches 10
unique pitches 1
-------------

-------------



# Data Setup

In [9]:
tf.enable_eager_execution()

In [None]:
from magenta.models.music_vae import data

In [None]:
!ls ./data/tfrecord/melody/

Fly_me_to_the_Moon.tfrecord  Scarborough_Fair.tfrecord	Take_on_me.tfrecord


In [None]:
data_path = './data/tfrecord/Bo_Burnham_eval.tfrecord'
tf_file_reader = tf.data.TFRecordDataset
file_reader = tf.python_io.tf_record_iterator
mel_2bar_config_base = configs.CONFIG_MAP['cat-mel_2bar_big']
mel_2bar_config_sky1 = configs.CONFIG_MAP['cat-mel_2bar_big_sky1']

In [None]:
mel_2bar_config_base = configs.update_config(mel_2bar_config_base, dict(eval_examples_path=data_path))
mel_2bar_config_sky1 = configs.update_config(mel_2bar_config_sky1, dict(eval_examples_path=data_path))

In [None]:
mel_2bar_config_base.hparams.batch_size = 1
mel_2bar_config_base.data_converter._min_unique_pitches = 1

mel_2bar_config_sky1.hparams.batch_size = 1
mel_2bar_config_sky1.data_converter._min_unique_pitches = 1
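
Setting `_min_unique_pitches` to 1 on the data converters relaxes a filter on extracted melodies, so that even melodies with a single pitch are kept. A minimal sketch of such a filter follows, assuming it counts distinct pitch classes (octave-equivalent) and discards melodies below the threshold. The function names and the default threshold of 5 are hypothetical and chosen for illustration; they are not Magenta's API.

```python
def count_unique_pitch_classes(pitches):
    """Number of distinct pitch classes (0-11) among a list of MIDI pitches."""
    return len({pitch % 12 for pitch in pitches})

def keep_melody(pitches, min_unique_pitches=5):
    """True if the melody has at least min_unique_pitches distinct pitch classes."""
    return count_unique_pitch_classes(pitches) >= min_unique_pitches

# Hypothetical melodies as lists of MIDI pitch numbers.
drone = [60, 60, 60, 60]
scale = [60, 62, 64, 65, 67, 69, 71]

print(keep_melody(drone))                        # rejected by the stricter threshold
print(keep_melody(drone, min_unique_pitches=1))  # kept once the threshold is 1
print(keep_melody(scale))
```

The `unique pitches N` lines in the cell outputs appear to be debug prints of a count like this one.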


In [None]:
dataset_base = data.get_dataset(
    mel_2bar_config_base,
    tf_file_reader=tf_file_reader,
    is_training=False,
    cache_dataset=False)

dataset_sky1 = data.get_dataset(
    mel_2bar_config_sky1,
    tf_file_reader=tf_file_reader,
    is_training=False,
    cache_dataset=False)

INFO:tensorflow:Reading examples from file: ./data/tfrecord/Bo_Burnham_eval.tfrecord
INFO:tensorflow:Reading examples from file: ./data/tfrecord/Bo_Burnham_eval.tfrecord


In [None]:
# take(-1) keeps every element, so these calls leave the datasets unchanged.
dataset_base = dataset_base.take(-1)
dataset_sky1 = dataset_sky1.take(-1)

In [None]:
dataset_base

<TakeDataset shapes: ((1, ?, 90), (1, ?, 90), (1, ?, 0), (1,)), types: (tf.bool, tf.bool, tf.bool, tf.int32)>

In [None]:
dataset_sky1

<TakeDataset shapes: ((1, ?, 90), (1, ?, 90), (1, ?, 0), (1,)), types: (tf.bool, tf.bool, tf.bool, tf.int32)>
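
The dataset shapes printed above show four tensors per element. The sketch below assumes the first two are boolean one-hot event encodings with 90 classes per time step, the third is an empty control signal (depth 0), and the fourth is the sequence length. It shows how a one-hot tensor of that shape maps to and from integer event labels. The labels are made up for illustration, and the meaning of each class index is not specified here.

```python
import numpy as np

depth = 90  # Number of one-hot classes per time step, taken from the printed shapes.

# Hypothetical event labels for a six-step sequence.
labels = np.array([2, 0, 0, 1, 45, 0])

# Build a (batch=1, steps, depth) boolean one-hot tensor.
one_hot = np.eye(depth, dtype=bool)[labels][np.newaxis, ...]

# Recover the integer labels by taking the argmax over the class axis.
recovered = one_hot.argmax(axis=-1)

print(one_hot.shape)
print(recovered[0])
```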

In [None]:
batch_size_base = mel_2bar_config_base.hparams.batch_size
iterator_base = tf.data.make_one_shot_iterator(dataset_base)
input_seqs_base, output_seqs_base, control_seqs_base, sequence_lengths_base = [], [], [], []

# Each element is an (input, output, control, length) tuple. Only the inputs,
# converted back to NoteSequences, and the sequence lengths are kept.
for i, o, c, sl in iterator_base:
  input_seqs_base.append(mel_2bar_config_base.data_converter.from_tensors(i)[0])
  sequence_lengths_base.append(sl)

unique pitches 4
unique pitches 2
unique pitches 4
unique pitches 6
unique pitches 6
unique pitches 7
unique pitches 7
unique pitches 6
unique pitches 1
unique pitches 6
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 2
unique pitches 2
unique pitches 2
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 7
unique pitches 7
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 1
unique pitches 6
unique pitches 6
unique pitches 2
unique pitches 4
unique pitches 1
unique pitches 1
unique pitches 3
unique pitches 2
unique pitches 1
unique pitches 2
unique pitches 1
unique pitches 2
unique pitches 1
unique pitches 2
unique pitches

In [None]:
batch_size_sky1 = mel_2bar_config_sky1.hparams.batch_size
iterator_sky1 = tf.data.make_one_shot_iterator(dataset_sky1)
input_seqs_sky1, output_seqs_sky1, control_seqs_sky1, sequence_lengths_sky1 = [], [], [], []

# Each element is an (input, output, control, length) tuple. Only the inputs,
# converted back to NoteSequences, and the sequence lengths are kept.
for i, o, c, sl in iterator_sky1:
  input_seqs_sky1.append(mel_2bar_config_sky1.data_converter.from_tensors(i)[0])
  sequence_lengths_sky1.append(sl)

unique pitchesunique pitches 7
-------------

-------------

 6
unique pitches 6
unique pitches 4
unique pitches 3
-------------

-------------

unique pitchesunique pitches 6
-------------

-------------

 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 2
-------------

-------------

unique pitches 2
-------------

-------------

unique pitches 2
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

-------------

-------------

unique pitches 7
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
-------------

-------------

unique pitches 1
---------

In [None]:
len(input_seqs_base)

15

In [None]:
len(input_seqs_sky1)

15

In [None]:
# len(input_seqs_sky1[5].notes)

In [None]:
# from note_seq import Melody, sequences_lib, events_lib

In [None]:
# mel = Melody()
# target = input_seqs_base[7]

In [None]:
# target_quantized = sequences_lib.quantize_note_sequence(target, 4)

In [None]:
# play(target_quantized)

In [None]:
# target_quantized

In [None]:
# steps_per_bar_float = sequences_lib.steps_per_bar_in_quantized_sequence(target_quantized)
# if steps_per_bar_float % 1 != 0:
#       raise events_lib.NonIntegerStepsPerBarError(
#           'There are %f timesteps per bar. Time signature: %d/%d' %
#           (steps_per_bar_float, target.time_signatures[0].numerator,
#            target.time_signatures[0].denominator))
# mel._steps_per_bar = steps_per_bar = int(steps_per_bar_float)
# mel._steps_per_quarter = (target_quantized.quantization_info.steps_per_quarter)

# notes = target_quantized.notes


In [None]:
# search_start_step = 0
# melody_start_step = 0

In [None]:
# for note in notes:
#   start_index = note.quantized_start_step - melody_start_step
#   end_index = note.quantized_end_step - melody_start_step
#   mel._add_note(note.pitch, start_index, end_index)
      

In [None]:
# mel._events

In [None]:
# hist = mel.get_note_histogram()

In [None]:
# input_seqs_new = list()
# for i in input_seqs:
#   # try:
#   #   a, b, c = mel_2bar.encode([i])
#   # except:
#   #   print("invalid")
#   # else:
#     input_seqs_new.append(i)

# Reconstruction

In [None]:
# # plain model
# z1, mu1, sigma1 = mel_2bar.encode(input_seqs_new)
# z1 = [z1[i] for i in range(z1.shape[0])]
# output_seqs1 = mel_2bar.decode(z1, length=32)

In [None]:
# # fine-tuned model
# z2, mu2, sigma2 = mel_2bar_finetune.encode(input_seqs_new)
# z2 = [z2[i] for i in range(z2.shape[0])]
# output_seqs2 = mel_2bar_finetune.decode(z2, length=32)

# Comparison

In [None]:
# Play the base and sky1 extractions side by side, sample by sample.
for i in range(max(len(input_seqs_base), len(input_seqs_sky1))):
  if i < len(input_seqs_base):
    print("Base sample {}: ".format(i+1))
    play(input_seqs_base[i])
  if i < len(input_seqs_sky1):
    print("Sky1 sample {}: ".format(i+1))
    play(input_seqs_sky1[i])
  print("----------------------------------")

Base sample 1: 


Sky1 sample 1: 


----------------------------------
Base sample 2: 


Sky1 sample 2: 


----------------------------------
Base sample 3: 


Sky1 sample 3: 


----------------------------------
Base sample 4: 


Sky1 sample 4: 


----------------------------------
Base sample 5: 


Sky1 sample 5: 


----------------------------------
Base sample 6: 


Sky1 sample 6: 


----------------------------------
Base sample 7: 


Sky1 sample 7: 


----------------------------------
Base sample 8: 


Sky1 sample 8: 


----------------------------------
Base sample 9: 


Sky1 sample 9: 


----------------------------------
Base sample 10: 


Sky1 sample 10: 


----------------------------------
Base sample 11: 


Sky1 sample 11: 


----------------------------------
Base sample 12: 


Sky1 sample 12: 


----------------------------------
Base sample 13: 


Sky1 sample 13: 


----------------------------------
Base sample 14: 


Sky1 sample 14: 


----------------------------------
Base sample 15: 


Sky1 sample 15: 


----------------------------------
