Copyright 2017 Google LLC.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# MusicVAE: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music.
### ___Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, and Douglas Eck___

[MusicVAE](https://g.co/magenta/music-vae) learns a latent space of musical scores, providing different modes
of interactive musical creation, including:

* Random sampling from the prior distribution.
* Interpolation between existing sequences.
* Manipulation of existing sequences via attribute vectors.

Examples of these interactions can be generated below, and selections can be heard in our
[YouTube playlist](https://www.youtube.com/playlist?list=PLBUMAYA6kvGU8Cgqh709o5SUvo-zHGTxr).

For short sequences (e.g., 2-bar "loops"), we use a bidirectional LSTM encoder
and LSTM decoder. For longer sequences, we use a novel hierarchical LSTM
decoder, which helps the model learn longer-term structures.

We also model the interdependencies between instruments by training multiple
decoders on the lowest-level embeddings of the hierarchical decoder.

For additional details, check out our [blog post](https://g.co/magenta/music-vae) and [paper](https://goo.gl/magenta/musicvae-paper).
___

This colab notebook is self-contained and should run natively on google cloud. The [code](https://github.com/tensorflow/magenta/tree/master/magenta/models/music_vae) and [checkpoints](http://download.magenta.tensorflow.org/models/music_vae/checkpoints.tar.gz) can be downloaded separately and run locally, which is required if you want to train your own model.

# Basic Instructions

1. Double click on the hidden cells to make them visible, or select "View > Expand Sections" in the menu at the top.
2. Hover over the "`[ ]`" in the top-left corner of each cell and click on the "Play" button to run it, in order.
3. Listen to the generated samples.
4. Make it your own: copy the notebook, modify the code, train your own models, upload your own MIDI, etc.!

# Environment Setup
Includes package installation for sequence synthesis. Will take a few minutes.


In [1]:
#@title Connect to Google Drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%cd /content/drive/MyDrive/Thesis/Code/Magenta/magenta/

/content/drive/MyDrive/University of Alberta/Thesis/Code/Magenta/magenta


In [6]:
!python --version

Python 3.10.6


In [5]:
# Install hmmlearn before downgrading Python
!pip install hmmlearn

# Downgrade Python
!apt-get update -y
!apt-get install python3.8
!update-alternatives --set python3 /usr/bin/python3.8
!curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
!python get-pip.py
import sys
# This path is Colab-runtime specific, check path in other systems.
_ = (sys.path.append("/usr/local/lib/python3.8/dist-packages"))


# Preinstall legacy packages
!pip install numba==0.48
!pip install numpy==1.23
!pip install packaging>=21.3
!pip install librosa==0.7.2

# Install Magenta
!pip install magenta

Collecting hmmlearn
  Downloading hmmlearn-0.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (160 kB)
[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/160.4 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━━━━━━━━━━━━[0m[90m╺[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m61.4/160.4 kB[0m [31m1.7 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m160.4/160.4 kB[0m [31m2.1 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: hmmlearn
Successfully installed hmmlearn-0.3.0
Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,626 B]
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Get:3 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:5 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Hit:6 https://ppa.launchpadcontent.net/c2d4u.team/c2

[0mCollecting librosa==0.7.2
  Downloading librosa-0.7.2.tar.gz (1.6 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m13.4 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting resampy>=0.2.2 (from librosa==0.7.2)
  Using cached resampy-0.4.2-py3-none-any.whl (3.1 MB)
Building wheels for collected packages: librosa
  Building wheel for librosa (setup.py) ... [?25l[?25hdone
  Created wheel for librosa: filename=librosa-0.7.2-py3-none-any.whl size=1612885 sha256=9ed2b6f284574e0497a344a20294c953b3345f790ec6a871153b0fcfa4834961
  Stored in directory: /root/.cache/pip/wheels/92/c3/d7/e11010142038c78f6c92d8e7a87183ebd66cc0e44605974271
Successfully built librosa
Installing collected packages: resampy, librosa
  Attempting uninstall: librosa
    Found existing installation: librosa 0.10.0.post2
    Uninstalling librosa-0.10.0.post2:
      Successfully uninstalled librosa-0.10.0.post2
Successfully installed 

In [4]:
!pip install -qU google-cloud note-seq==0.0.2 pyfluidsynth

[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/209.6 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m [32m204.8/209.6 kB[0m [31m6.0 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m209.6/209.6 kB[0m [31m4.7 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.0/2.0 MB[0m [31m41.9 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m5.6/5.6 MB[0m [31m81.6 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m110.8/110.8 kB[0m [31m12.6 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━

In [None]:
import glob

BASE_DIR = "gs://download.magenta.tensorflow.org/models/music_vae/colab2"

print('Installing dependencies...')
!apt-get update -qq && apt-get install -qq libfluidsynth1 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -q pyfluidsynth

# Hack to allow python to pick up the newly-installed fluidsynth lib.
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library


print('Importing libraries and defining some helper functions...')
from google.colab import files

Installing dependencies...
E: Package 'libfluidsynth1' has no installation candidate
Importing libraries and defining some helper functions...


In [None]:
!pip install tensor2tensor

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensor2tensor
  Downloading tensor2tensor-1.15.7-py2.py3-none-any.whl (1.4 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.4/1.4 MB[0m [31m53.8 MB/s[0m eta [36m0:00:00[0m
Collecting pypng
  Downloading pypng-0.20220715.0-py3-none-any.whl (58 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m58.1/58.1 KB[0m [31m5.6 MB/s[0m eta [36m0:00:00[0m
Collecting bz2file
  Downloading bz2file-0.98.tar.gz (11 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting gunicorn
  Downloading gunicorn-20.1.0-py3-none-any.whl (79 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m79.5/79.5 KB[0m [31m11.2 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting tensorflow-gan
  Downloading tensorflow_gan-2.1.0-py2.py3-none-any.whl (367 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m367.1/367.1 KB[0m [31m38.2

In [None]:
!pip install note_seq

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
!pip install -e .

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Obtaining file:///content/drive/MyDrive/University%20of%20Alberta/Thesis/Code/Magenta/magenta
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting dm-sonnet
  Downloading dm_sonnet-2.0.1-py3-none-any.whl (268 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m268.4/268.4 KB[0m [31m22.4 MB/s[0m eta [36m0:00:00[0m
Collecting librosa<0.8.0,>=0.6.2
  Downloading librosa-0.7.2.tar.gz (1.6 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m73.3 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting mido==1.2.6
  Downloading mido-1.2.6-py2.py3-none-any.whl (69 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m69.8/69.8 KB[0m [31m9.4 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting mir_eval>=0.4
  Downloading mir_eval-0.7.tar.gz (90 kB)
[2K     [90m━━━━━━━━━━━━━━━━━

In [None]:
!pip install fluidsynth

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fluidsynth
  Downloading fluidsynth-0.2.tar.gz (3.7 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Building wheels for collected packages: fluidsynth
  Building wheel for fluidsynth (setup.py) ... [?25l[?25hdone
  Created wheel for fluidsynth: filename=fluidsynth-0.2-py3-none-any.whl size=4512 sha256=33ae2acf7efb278239817d63ee7c3c84340894e18b0b7b3f06ce6177a213d629
  Stored in directory: /root/.cache/pip/wheels/d4/e6/bf/921b2deb780e2681b0e1626a13995e504dbbd455b47e7eedd4
Successfully built fluidsynth
Installing collected packages: fluidsynth
Successfully installed fluidsynth-0.2


In [None]:
import magenta.music as mm
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel
import numpy as np
import os
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()
# tf.enable_eager_execution()

# Necessary until pyfluidsynth is updated (>1.2.5).
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

def play(note_sequence):
  mm.play_sequence(note_sequence, synth=mm.fluidsynth)

def interpolate(model, start_seq, end_seq, num_steps, max_length=32,
                assert_same_length=True, temperature=0.5,
                individual_duration=4.0):
  """Interpolates between a start and end sequence."""
  note_sequences = model.interpolate(
      start_seq, end_seq,num_steps=num_steps, length=max_length,
      temperature=temperature,
      assert_same_length=assert_same_length)

  print('Start Seq Reconstruction')
  play(note_sequences[0])
  print('End Seq Reconstruction')
  play(note_sequences[-1])
  print('Mean Sequence')
  play(note_sequences[num_steps // 2])
  print('Start -> End Interpolation')
  interp_seq = mm.sequences_lib.concatenate_sequences(
      note_sequences, [individual_duration] * len(note_sequences))
  play(interp_seq)
  mm.plot_sequence(interp_seq)
  return interp_seq if num_steps > 3 else note_sequences[num_steps // 2]

def download(note_sequence, filename):
  mm.sequence_proto_to_midi_file(note_sequence, filename)
  files.download(filename)

print('Done')

Instructions for updating:
non-resource variables are not supported in the long term


Done


In [None]:
from datetime import datetime
%load_ext tensorboard

In [None]:
from glob import glob

# Notes

A few important functions are:


*   
```
# tf.train.list_variables(checkpoint_path)
```
This will list every variable in the checkpoint including tensors.


*   
```
# tf.train.load_checkpoint(path).get_variable_to_shape_map()
```


*   
```
from tensorflow.python.tools.inspect_checkpoint import print_tensors_in_checkpoint_file
print_tensors_in_checkpoint_file(path, all_tensors=True, tensor_name=name)
```
This will print all or a specific tensor and their values in checkpoint.










# Setup

In [None]:
import sys
import copy
from magenta.models.music_vae import music_vae_mcts_train

In [None]:
# enable deterministic behaviour to debug
# may make operations slower
import random
seed = 1
random.seed(seed)
np.random.seed(seed)
tf.set_random_seed(1)
# tf.config.experimental.enable_op_determinism()

In [None]:
# Experiment config
config_name = 'cat-mel_2bar_big'
mode = 'train' # mode = {train | eval}
finetune = 'True' # if mode==train, finetune = {True | False}
# if finetune==True, a comma-separated list of variable names to be finetuned
# or 'last_layer' or 'all'.
trainable_vars = 'all'
num_steps = '2000'
batch_size = '32'
learning_rate = '0.001'

run_dir = './data/mcts/CE_MCTS'
train_example_path = './data/tfrecord/Persian/persian_100_v1/fold_4_train.tfrecord'
eval_example_path = './data/tfrecord/Persian/persian_100_v1/fold_4_test.tfrecord'

In [None]:
# Add datetime info to run_dir
run_dir += datetime.now().strftime('-%y-%m-%d-%H-%M/')
print("New run_dir is: ", run_dir)

New run_dir is:  ./data/mcts/CE_MCTS-23-02-15-05-48/


In [None]:
run_dir = './data/mcts/CE_MCTS-23-02-15-02-37/'

In [None]:
# run_dir = './data/mcts/CE_MCTS-23-02-05-07-36/'

In [None]:
# train_example_path = './data/tfrecord/Video_game.tfrecord'
# eval_example_path = './data/tfrecord/Video_game.tfrecord'

In [None]:
mel_2bar_big_ckpt_path = '/content/drive/MyDrive/Code/cat-mel_2bar_big.ckpt'
save_path = './data/mcts/ckpt_test'
log_path = os.path.join(run_dir, 'log.txt')

In [None]:
import logging
log = logging.getLogger()
if not os.path.exists(run_dir):
    os.makedirs(run_dir)
fh = logging.FileHandler(log_path)
log.addHandler(fh)

In [None]:
MODEL_VARIABLES = [
    'decoder/multi_rnn_cell/cell_0/lstm_cell/bias',
    'decoder/multi_rnn_cell/cell_0/lstm_cell/kernel',
    'decoder/multi_rnn_cell/cell_1/lstm_cell/bias',
    'decoder/multi_rnn_cell/cell_1/lstm_cell/kernel',
    'decoder/multi_rnn_cell/cell_2/lstm_cell/bias',
    'decoder/multi_rnn_cell/cell_2/lstm_cell/kernel',
    'decoder/output_projection/bias',
    'decoder/output_projection/kernel',
    'decoder/z_to_initial_state/bias',
    'decoder/z_to_initial_state/kernel',
    'encoder/cell_0/bidirectional_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias',
    'encoder/cell_0/bidirectional_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/kernel',
    'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/bias',
    'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/kernel',
    'encoder/mu/bias',
    'encoder/mu/kernel',
    'encoder/sigma/bias',
    'encoder/sigma/kernel',
    'global_step'
]


In [None]:
# Importing gc module
import gc

# Returns the number of
# objects it has collected
# and deallocated
# collected = gc.collect()

# # Prints Garbage collector
# # as 0 object
# print("Garbage collector: collected",
#           "%d objects." % collected)

# Checkpoint

In [None]:
def checkpoint_to_variable_list(ckpt_path):
  # tf.keras.backend.clear_session()
  ckpt_reader = tf.train.load_checkpoint(ckpt_path)
  name_shape_list = tf.train.list_variables(ckpt_path)

  tf.reset_default_graph()
  var_list = list()
  var_names = list()
  for name, _ in name_shape_list:
    if name in MODEL_VARIABLES:
      # print(name)
      var_list.append(ckpt_reader.get_tensor(name))
      var_names.append(name)

  return var_names, var_list

In [None]:
def variable_list_to_checkpoint(var_names, variable_list, save_path):
  tf_vars = list()
  for val, name in zip(variable_list, var_names):
    if name == 'global_step':
      tf_vars.append(tf.Variable(val, name=name, dtype=tf.int64))
    else:
      tf_vars.append(tf.Variable(val, name=name, dtype=tf.float32))

  saver = tf.train.Saver(tf_vars)
  sess = tf.Session()
  sess.run(tf.global_variables_initializer())
  saver.save(sess, save_path)
  tf.keras.backend.clear_session()

# CE-MCTS


## CE Neighbor Functions

In [None]:
# Conceptual expansion neighborhood no. 0
# Multiplying alpha by random number in range [-2, 2] for an index in a layer

def neighbor_0(weights, alphas):
  model_weights = weights.copy()
  model_alphas = alphas.copy()

  for i in range(len(model_weights)):
    model_weights[i] = np.divide(model_weights[i], model_alphas[i])

  idxa = random.randint(0, len(model_weights) - 1)
  current_layer = model_weights[idxa]
  idxb = random.randint(0, current_layer.shape[0] - 1)
  x = np.random.uniform(-2,2)

  if len(current_layer.shape)==1:
    model_alphas[idxa][idxb] *= x
  else:
    for i in range(current_layer.shape[1]):
      model_alphas[idxa][idxb][i] *= x

  for i in range(len(model_weights)):
    model_weights[i] = np.multiply(model_weights[i], model_alphas[i])

  return model_weights, model_alphas

In [None]:
# Conceptual expansion neighborhood no. 1
# Multiplying alpha by random number in range [-2, 2] for a layer

def neighbor_1(weights, alphas):
  model_weights = weights.copy()
  model_alphas = alphas.copy()

  for i in range(len(model_weights)):
    model_weights[i] = np.divide(model_weights[i], model_alphas[i])

  layer_idx = random.randint(0, len(model_weights) - 1)
  current_layer = model_weights[layer_idx]

  x = np.random.uniform(-2,2) * np.ones(shape=current_layer.shape)
  model_alphas[layer_idx] = np.multiply(model_alphas[layer_idx], x)

  for i in range(len(model_weights)):
    model_weights[i] = np.multiply(model_weights[i], model_alphas[i])

  return model_weights, model_alphas

In [None]:
# Conceptual expansion neighborhood no. 2
# Replace a random f with another random f
# The weights must contain at least two layers of the same shape.
# (To be more efficient for each layer, there should be at least another layer with the same shape.)

def neighbor_2(weights, alphas):
  model_weights = weights.copy()
  model_alphas = alphas.copy()

  for i in range(len(model_weights)):
    model_weights[i] = np.divide(model_weights[i], model_alphas[i])

  flag = True
  while flag:
    source_idx = random.randint(0, len(model_weights) - 1)
    idx_choices = [idx for idx in range(len(model_weights))
                  if (idx != source_idx and
                      model_weights[idx].shape == model_weights[source_idx].shape)]
    if len(idx_choices):
      flag = False

  target_idx = np.random.choice(idx_choices)
  model_weights[target_idx] = model_weights[source_idx]

  for i in range(len(model_weights)):
    model_weights[i] = np.multiply(model_weights[i], model_alphas[i])

  return model_weights, model_alphas

In [None]:
# Conceptual expansion neighborhood no. 3
# Add a random f and alpha to a random target f and alpha
# The weights must contain at least two layers of the same shape.
# (To be more efficient for each layer, there should be at least another layer with the same shape.)

def neighbor_3(weights, alphas):
  model_weights = weights.copy()
  model_alphas = alphas.copy()

  for i in range(len(model_weights)):
    model_weights[i] = np.divide(model_weights[i], model_alphas[i])

  flag = True
  while flag:
    source_idx = random.randint(0, len(model_weights) - 1)
    source_layer = model_weights[source_idx]
    source_alpha = model_alphas[source_idx]

    idx_choices = [idx for idx in range(len(model_weights))
                  if (idx != source_idx and
                      model_weights[idx].shape == source_layer.shape)]
    if len(idx_choices):
      flag = False

  target_idx = np.random.choice(idx_choices)
  model_weights[target_idx] += source_layer
  model_alphas[target_idx] += source_alpha

  for i in range(len(model_weights)):
    model_weights[i] = np.multiply(model_weights[i], model_alphas[i])

  return model_weights, model_alphas

In [None]:
# Conceptual expansion neighborhood no. 4
# Swap two random f and alpha
# The weights must contain at least two layers of the same shape.
# (To be more efficient for each layer, there should be at least another layer with the same shape.)

def neighbor_4(weights, alphas):
  model_weights = weights.copy()
  model_alphas = alphas.copy()

  for i in range(len(model_weights)):
    model_weights[i] = np.divide(model_weights[i], model_alphas[i])

  flag = True
  while flag:
    source_idx = random.randint(0, len(model_weights) - 1)
    idx_choices = [idx for idx in range(len(model_weights))
                  if (idx != source_idx and
                      model_weights[idx].shape == model_weights[source_idx].shape)]
    if len(idx_choices):
      flag = False

  target_idx = np.random.choice(idx_choices)

  # swap
  model_weights[source_idx], model_weights[target_idx] = model_weights[target_idx], model_weights[source_idx]
  model_alphas[source_idx], model_alphas[target_idx] = model_alphas[target_idx], model_alphas[source_idx]

  for i in range(len(model_weights)):
    model_weights[i] = np.multiply(model_weights[i], model_alphas[i])

  return model_weights, model_alphas

## MCTS Node Class

The difference in our code is that we work with different checkpoints.

**Caveat:** The memory might become an issue. Might want to use some hacks to not load everything.

In [None]:
# MCTS Node
class MCTSNode:
  def __init__(self, idx, ckpt_path, alpha_values=None, fitness_score=None,
               parent=None, child_nodes=list()):
    self.idx = idx                        # To keep track of nodes
    self.ckpt_path = ckpt_path            # Checkpoint path for the model corresponding to the node
    # self.f_values = f_values              # List of the weights for each layer
    self.alpha_values = alpha_values      # Model alpha values
    self.fitness_score = fitness_score    # Absolute model score
    self.cummulative_score = None         # Score calculated during the backprop
    self.parent = parent                  # Parent node info
    self.child_nodes = child_nodes        # List of all childs to the current node

    if not fitness_score:
      self.set_fitness()

    if not alpha_values:
      self.alpha_values = list()
      for name, shape in tf.train.list_variables(self.ckpt_path):
        if name in MODEL_VARIABLES:
          self.alpha_values.append(np.ones(shape))

  def add_child(self, child):
    self.child_nodes.append(child)

  # str representation of the node is: Model id <index>
  def __repr__(self):
    return repr('Node id ' + str(self.idx))

  # update cummmulative score
  def update_cummulative_score(self, cummulative_score):
    self.cummulative_score = cummulative_score

  # returns model accuracy on training data (Q: loss vs accuracy)
  def set_fitness(self):
    # with open(os.path.abspath(log_path), mode='a+') as sys.stdout:
    res = music_vae_mcts_train.run(
            run_dir=run_dir,
            config=config_name,
            mode='eval',
            hparams='batch_size=1',
            cache_dataset=False,
            examples_path=train_example_path,
            ckpt_path=self.ckpt_path,
            log='FATAL',
            seed=seed
          )

    self.fitness_score = res['metrics/accuracy']

  def create_neighbor_node(self, id=0):
    save_path = os.path.join(run_dir + f'ckpt/ckpt_{id}')
    names, vars = checkpoint_to_variable_list(self.ckpt_path) #load source variables
    # remove 'global_step' from the variables since we want to keep it unchanged
    gs_idx = names.index('global_step')
    global_step = vars.pop(gs_idx)
    names.pop(gs_idx)

    choice = np.random.randint(1, 5)
    if choice == 1:
      print("node generated: neighbor type 1")
      vars, alphas = neighbor_1(vars, self.alpha_values)
    elif choice == 2:
      print("node generated: neighbor type 2")
      vars, alphas = neighbor_2(vars, self.alpha_values)
    elif choice == 3:
      print("node generated: neighbor type 3")
      vars, alphas = neighbor_3(vars, self.alpha_values)
    elif choice == 4:
      print("node generated: neighbor type 4")
      vars, alphas = neighbor_4(vars, self.alpha_values)

    # add 'global_step' back in
    names.append('global_step')
    vars.append(global_step)

    variable_list_to_checkpoint(names, vars, save_path)

    # create the new node
    neighbor_node = MCTSNode(id, save_path, alpha_values=alphas, parent=self, child_nodes = list())
    # print(f"node {neighbor_node} was created as a neighbor to {self}")
    self.add_child(neighbor_node)
    print(f"new noded added to the children of {self} --> {self.child_nodes}")
    del names, vars, alphas
    # garbage collect
    collected = gc.collect()
    print("Garbage collector: collected",
          "%d objects." % collected)

    return neighbor_node

  def display_tree(self, root):
    pass

## Train

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
# helper functions

# delete an unwanted tree
def delete_tree(root):
    if root:
      print(f"Deleting {root} tree ...")
      for child in root.child_nodes:
          delete_tree(child)
      del root
      gc.collect()

In [None]:
def explore(rollout_idx, depth, HEAD, best_node, best_fitness, top_models):
  explore = True
  print(f'Creating a new branch to {HEAD}\t rollout: {rollout_idx}, depth level: {depth}')
  # create a new neighbor from HEAD and with HEAD as its parent
  current = HEAD.create_neighbor_node(root_id)


  # check against the best node
  if current.fitness_score > best_fitness:
    best_node = current
    best_fitness = current.fitness_score
    top_models.pop(0)
    top_models.append(copy.copy(best_node))
  else:
    for i in range(models_to_keep-1 , -1, -1):
      if top_models[i]==0 or current.fitness_score > top_models[i].fitness_score:
        top_models.insert(i+1, copy.copy(current))
        top_models.pop(0)
        break



  # make the current node, the HEAD node
  HEAD = current
  # print(f"Head is {HEAD}")

  print(f'The current node: {HEAD}, fitness: {current.fitness_score} ==> parent node: {HEAD.parent}')
  return HEAD, best_node, best_fitness, top_models


In [None]:
def exploit(rollout_idx, depth, HEAD):
  print(f'Exploiting nodes at {HEAD} \t rollout: {rollout_idx}, depth: {depth}')

  # pick the child with the best fitness score
  best_score = - np.inf
  next_node = None
  leaf_nodes = list()

  for node in HEAD.child_nodes:
    if node.cummulative_score:
      if node.cummulative_score > best_score:
        best_Score = node.cummulative_score
        next_node = node
    else: # node is a leaf
      leaf_nodes.append(node)

  if len(leaf_nodes):
    next_node = random.choice(leaf_nodes)
    print(f'A random leaf {next_node} has been chosen.')
  else:
    print(f'There are no leafs. The child with the best cummulative score {next_node} was chosen.')

  HEAD = next_node
  del leaf_nodes
  gc.collect()

  return HEAD


In [None]:
def update_cummulative_score(HEAD):
  tmp_head = HEAD.parent
  while tmp_head and tmp_head != root:
    print(f'Update {tmp_head} commulative score.')
    tmp_head.update_cummulative_score(
        tmp_head.fitness_score +
        discount_factor * tmp_head.child_nodes[-1].fitness_score
    )
    tmp_head = tmp_head.parent

  del tmp_head

In [None]:
def change_root_to_best(root, best_node):
  previous_root = root
  root = best_node

  if root.parent!= None:
    print(20*'*' + " Deleting previous root tree... ")
    root.parent.child_nodes.remove(root)
    root.parent = None
    delete_tree(previous_root)
  else:
    print("The root is the best node!")

  return root

In [None]:
# Setup and variables
num_generations = 10 #10
no_of_rollouts = 10 #20
rollout_length = 5 #10
discount_factor = 0.3
epsilon = 0.5

# setup the root of the MCTS tree
root_id = 1
root = MCTSNode(idx=1, ckpt_path=mel_2bar_big_ckpt_path)
root_fitness = root.fitness_score
HEAD = None
# all_nodes = [root_node] # To keep track of all of the nodes

# setup best node
best_node = root
best_fitness = root.fitness_score # represents loss
models_to_keep = 10
top_models = models_to_keep * [0]

# limit the number of models created
limit = 100

In [None]:
best_fitness

0.84375

In [None]:
# start iterations
for gen in range(num_generations):
  if root_id == limit:
    break
  print(33*'=' + f' Generation {gen} ' + 33*'=')
  for rollout_idx in range(no_of_rollouts):
    if root_id == limit:
      break
    print(f'Rollout no {rollout_idx} ---> best node: {best_node} with fitness {best_fitness}. root={root}')
    HEAD = root         # used to traverse the tree
    explore_mode = False     # selecting explore/exploit

    for depth in range(rollout_length):
      if root_id == limit:
        break
      print(50 * '-')
      if gen == 0 and rollout_idx == 0:
        # at the very beginning we want to create a branch
        explore_mode = True
      p = random.uniform(0, 1)
      if explore_mode == False and p < epsilon: # exploit
        HEAD = exploit(rollout_idx, depth, HEAD)
      else: # explore by adding a chain of rollouts / extend to branch to depth length
        explore_mode = True
        root_id += 1
        HEAD, best_node, best_fitness, top_models = explore(rollout_idx, depth, HEAD, best_node, best_fitness, top_models)
        gc.collect()

    if explore:
      update_cummulative_score(HEAD)

  # choose the best node as new root
  previous_root = root
  root = best_node

  if root.parent!= None:
    print(20*'*' + " Deleting previous root tree... ")
    root.parent.child_nodes.remove(root)
    root.parent = None
    delete_tree(previous_root)
  else:
    print("The root is the best node!")


Rollout no 0 ---> best node: 'Node id 1' with fitness 0.84375. root='Node id 1'
--------------------------------------------------
Creating a new branch to 'Node id 1'	 rollout: 0, depth level: 0
node generated: neighbor type 2
new noded added to the children of 'Node id 1' --> ['Node id 2']
Garbage collector: collected 0 objects.
The current node: 'Node id 2', fitness: 0.65625 ==> parent node: 'Node id 1'
--------------------------------------------------
Creating a new branch to 'Node id 2'	 rollout: 0, depth level: 1
node generated: neighbor type 1
new noded added to the children of 'Node id 2' --> ['Node id 3']
Garbage collector: collected 15706 objects.
The current node: 'Node id 3', fitness: 0.59375 ==> parent node: 'Node id 2'
--------------------------------------------------
Creating a new branch to 'Node id 3'	 rollout: 0, depth level: 2
node generated: neighbor type 2
new noded added to the children of 'Node id 3' --> ['Node id 4']
Garbage collector: collected 15706 objects.

In [None]:
print(top_models)

['Node id 83', 'Node id 67', 'Node id 11', 'Node id 99', 'Node id 68', 'Node id 43', 'Node id 91', 'Node id 80', 'Node id 79', 'Node id 42']


In [None]:
for m in top_models:
  print(m.idx, m.fitness_score)

83 0.9375
67 0.9375
11 0.9375
99 0.96875
68 0.96875
43 0.96875
91 1.0
80 1.0
79 1.0
42 1.0


In [None]:
best_fitness

1.0

## Test

In [None]:
for m in top_models:
  res = music_vae_mcts_train.run(
            run_dir=run_dir,
            config=config_name,
            mode='eval',
            hparams='batch_size=1',
            cache_dataset=False,
            examples_path=eval_example_path,
            ckpt_path=m.ckpt_path,
            log='FATAL',
            seed=seed
          )
  accuracy = res['metrics/accuracy']
  print(f"Model: {m} \t train accuracy: {(m.fitness_score * 100):5.3} \t test accuracy: {(accuracy * 100):5.3}")


Model: 'Node id 83' 	 train accuracy:  93.8 	 test accuracy: 1e+02
Model: 'Node id 67' 	 train accuracy:  93.8 	 test accuracy:  90.6
Model: 'Node id 11' 	 train accuracy:  93.8 	 test accuracy: 1e+02
Model: 'Node id 99' 	 train accuracy:  96.9 	 test accuracy:  93.8
Model: 'Node id 68' 	 train accuracy:  96.9 	 test accuracy:  90.6
Model: 'Node id 43' 	 train accuracy:  96.9 	 test accuracy:  93.8
Model: 'Node id 91' 	 train accuracy: 1e+02 	 test accuracy:  93.8
Model: 'Node id 80' 	 train accuracy: 1e+02 	 test accuracy:  96.9
Model: 'Node id 79' 	 train accuracy: 1e+02 	 test accuracy: 1e+02
Model: 'Node id 42' 	 train accuracy: 1e+02 	 test accuracy:  84.4


# Generate Samples

In [None]:
gen_node = top_models[6]

In [None]:
path = os.path.abspath(gen_node.ckpt_path)

In [None]:
path

'/content/drive/MyDrive/University of Alberta/Thesis/Code/Magenta/magenta/data/mcts/CE_MCTS-23-02-06-01-36/ckpt/ckpt_4'

In [None]:
./data/mcts/CE_MCTS-23-02-15-02-37/

In [None]:
path = '/content/drive/MyDrive/Magenta/magenta/data/mcts/CE_MCTS-23-02-15-02-37/ckpt/ckpt_68'

In [None]:
config=configs.CONFIG_MAP[config_name]

In [None]:
model = TrainedModel(config=config,
                     batch_size=4,
                     checkpoint_dir_or_path=mel_2bar_big_ckpt_path)

Instructions for updating:
Use `tf.cast` instead.
  tf.layers.dense(
  self._kernel = self.add_variable(
  self._bias = self.add_variable(
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
  mu = tf.layers.dense(
  sigma = tf.layers.dense(
Instructions for updating:
Do not call `graph_parents`.


In [None]:
#@title Random Samples

temperature = 0.96 #@param {type:"slider", min:0.01, max:1.5, step:0.01}
seqs = model.sample(n=4, length=128, temperature=temperature)


In [None]:
for i in range(4):
  download(seqs[i], f"note_seq_{i}.mid")

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

# Extra Checkpoint Checks

In [None]:
var_name = 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/kernel'
# var_name = 'decoder/output_projection/kernel'

checkpoint_path = '/content/drive/MyDrive/Code/cat-mel_2bar_big.ckpt'

In [None]:
Old_ckpt = tf.train.load_checkpoint(checkpoint_path)

In [None]:
import tensorflow as tf
tf.reset_default_graph()
w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, './data/test/my_test_model')

In [None]:
print(tf.get_default_session())

None


In [None]:
test_ckpt_reader = tf.train.load_checkpoint('./data/test/my_test_model')


In [None]:
tf.train.list_variables('./data/test/my_test_model')

[('w1', [2]), ('w2', [5])]

In [None]:
test_ckpt_reader.get_tensor('w1')

array([-0.929671  , -0.27028295], dtype=float32)

In [None]:
test_ckpt_reader.get_tensor('w1_1')

array([-0.46935463,  1.9606491 ], dtype=float32)

In [None]:
Old_A = tf.train.load_checkpoint(checkpoint_path).get_tensor(var_name)

In [None]:
New_A = tf.train.load_checkpoint(run_dir + 'train/model.ckpt-100').get_tensor(var_name)

In [None]:
Old_A

array([[ 0.01359052, -0.08663205,  0.03307157, ...,  0.07788762,
         0.01152594,  0.24845648],
       [-0.06781328, -0.17682482,  0.03815088, ...,  0.41707203,
         0.17010953, -0.2761376 ],
       [-0.00600539,  0.150931  ,  0.00929637, ...,  0.02380793,
        -0.06370527, -0.233501  ],
       ...,
       [-0.14717636, -0.00371401, -0.04210675, ..., -0.04117652,
        -0.08962385, -0.01789565],
       [-0.00903243,  0.03428619,  0.02984675, ..., -0.01778605,
         0.02633332, -0.04182264],
       [-0.0650212 ,  0.05001441,  0.02747146, ..., -0.08395307,
        -0.09573532,  0.02805939]], dtype=float32)

In [None]:
New_A

array([[ 0.01359052, -0.08663205,  0.03307157, ...,  0.07788762,
         0.01152594,  0.24845648],
       [-0.06781328, -0.17682482,  0.03815088, ...,  0.41707203,
         0.17010953, -0.2761376 ],
       [-0.00600539,  0.150931  ,  0.00929637, ...,  0.02380793,
        -0.06370527, -0.233501  ],
       ...,
       [-0.14717636, -0.00371401, -0.04210675, ..., -0.04117652,
        -0.08962385, -0.01789565],
       [-0.00903243,  0.03428619,  0.02984675, ..., -0.01778605,
         0.02633332, -0.04182264],
       [-0.0650212 ,  0.05001441,  0.02747146, ..., -0.08395307,
        -0.09573532,  0.02805939]], dtype=float32)

In [None]:
(New_A==Old_A).all()

True

In [None]:
tf.train.list_variables(checkpoint_path)

[('beta1_power', []),
 ('beta2_power', []),
 ('decoder/multi_rnn_cell/cell_0/lstm_cell/bias', [8192]),
 ('decoder/multi_rnn_cell/cell_0/lstm_cell/bias/Adam', [8192]),
 ('decoder/multi_rnn_cell/cell_0/lstm_cell/bias/Adam_1', [8192]),
 ('decoder/multi_rnn_cell/cell_0/lstm_cell/kernel', [2650, 8192]),
 ('decoder/multi_rnn_cell/cell_0/lstm_cell/kernel/Adam', [2650, 8192]),
 ('decoder/multi_rnn_cell/cell_0/lstm_cell/kernel/Adam_1', [2650, 8192]),
 ('decoder/multi_rnn_cell/cell_1/lstm_cell/bias', [8192]),
 ('decoder/multi_rnn_cell/cell_1/lstm_cell/bias/Adam', [8192]),
 ('decoder/multi_rnn_cell/cell_1/lstm_cell/bias/Adam_1', [8192]),
 ('decoder/multi_rnn_cell/cell_1/lstm_cell/kernel', [4096, 8192]),
 ('decoder/multi_rnn_cell/cell_1/lstm_cell/kernel/Adam', [4096, 8192]),
 ('decoder/multi_rnn_cell/cell_1/lstm_cell/kernel/Adam_1', [4096, 8192]),
 ('decoder/multi_rnn_cell/cell_2/lstm_cell/bias', [8192]),
 ('decoder/multi_rnn_cell/cell_2/lstm_cell/bias/Adam', [8192]),
 ('decoder/multi_rnn_cell/cel

In [None]:
from tensorflow.python.tools.inspect_checkpoint import print_tensors_in_checkpoint_file

In [None]:
print_tensors_in_checkpoint_file('/content/drive/MyDrive/Magenta/magenta/data/tmp/persian-finetune-11-21-01/train/model.ckpt-500', all_tensors=True, tensor_name='decoder/multi_rnn_cell/cell_0/lstm_cell/bias/Adam')

tensor: beta1_power (float32) []
1.18984805e-23
tensor: beta2_power (float32) []
0.6057766
tensor: decoder/multi_rnn_cell/cell_0/lstm_cell/bias (float32) [8192]
[-5.46724489e-03  4.73049423e-03 -9.11294576e-03 ... -1.55035285e-02
 -6.62921369e-03  3.34413999e-05]
tensor: decoder/multi_rnn_cell/cell_0/lstm_cell/bias/Adam (float32) [8192]
[ 7.5676769e-07  1.7772088e-08  7.1647941e-07 ... -8.9731898e-08
 -1.5525816e-07  1.7042743e-07]
tensor: decoder/multi_rnn_cell/cell_0/lstm_cell/bias/Adam_1 (float32) [8192]
[1.0983309e-11 7.4119937e-13 6.5951307e-12 ... 2.3710599e-12 3.7374895e-12
 1.2077212e-11]
tensor: decoder/multi_rnn_cell/cell_0/lstm_cell/kernel (float32) [2650, 8192]
[[-0.02101763 -0.0324685  -0.06040208 ...  0.0023125  -0.0393163
  -0.02289756]
 [-0.10999614 -0.05015213 -0.11449713 ... -0.02922615 -0.02492424
  -0.08833413]
 [ 0.00557702  0.01264934 -0.01215714 ...  0.00178084 -0.02203693
  -0.02272074]
 ...
 [-0.02724702 -0.01943417 -0.02045782 ...  0.01954178  0.00792572
  -0.

In [None]:
tf.train.load_checkpoint('/content/drive/MyDrive/Code/cat-mel_2bar_big.ckpt').get_variable_to_shape_map()

{'encoder/sigma/kernel/Adam_1': [4096, 512],
 'encoder/sigma/kernel': [4096, 512],
 'global_step': [],
 'encoder/sigma/bias/Adam': [512],
 'encoder/mu/kernel/Adam_1': [4096, 512],
 'encoder/mu/kernel/Adam': [4096, 512],
 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/kernel/Adam_1': [2138,
  8192],
 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/kernel/Adam': [2138,
  8192],
 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/kernel': [2138,
  8192],
 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/bias/Adam_1': [8192],
 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/bias/Adam': [8192],
 'encoder/cell_0/bidirectional_rnn/fw/multi_rnn_cell/cell_0/lstm_cell/bias': [8192],
 'encoder/cell_0/bidirectional_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/kernel/Adam': [2138,
  8192],
 'encoder/cell_0/bidirectional_rnn/bw/multi_rnn_cell/cell_0/lstm_cell/bias/Adam': [8192],
 'encoder/cell_0/bidirecti